** Summary **

'''`[[cmdSplit]`''', by [dgp], parses a [script] into its constituent commands while properly handling semicolon-delimited commands and the "semicolon in a comment" problem.  It was written to support parsing of class bodies in an itcl-like, pure Tcl, OO framework into Tcl commands.


** See Also **

   [cmdStream]:   

   [Config file using slave interp], by [AMG]:   more-or-less the same thing, implemented using a slave [interp]reter


** Description **


`[[cmdSplit]` returns a list of the commands in a script.  The original post is
''[http://groups.google.com/group/comp.lang.tcl/msg/cfe2d00fc7b291be%|%How to
split a string into elements exactly as eval would do ,comp.lang.tcl
,1998-09-07]''.
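
The "semicolon in a comment" problem can be seen directly with `[[eval]`.  The following is a minimal sketch, not part of the original post; the variable name `a` and the sample script are made up for illustration.  Splitting naively on semicolons would treat the text after the second semicolon as a command, whereas Tcl treats everything from `#` to the end of the line as a comment:

======
# A comment runs to the end of the line, so the second semicolon
# does not start a new command.
set script {set a 1; # note; set a 2}

# Naive splitting on semicolons yields three pieces ...
set pieces [split $script \;]
puts [llength $pieces]

# ... but evaluating the script runs only the first command.
eval $script
puts $a
======

Here `[[split]` produces three pieces, while `a` is `1` after evaluation because `set a 2` is part of the comment.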

[PYK] 2013-04-14:  I've modified `[[cmdSplit]` so that it no longer filters out
comments, and provided a simple helper, `[[nocomments]`, that removes them when
desired:

======
nocomments [cmdSplit $script]
======

code:

======
proc cmdSplit {script} {
    set commands {}
    set chunk {} 
    foreach line [split $script \n] {
        append chunk $line
        if {[info complete $chunk\n]} {
            # $chunk ends in a complete Tcl command, and none of the
            # newlines within it end a complete Tcl command.  If there
            # are multiple Tcl commands in $chunk, they must be
            # separated by semi-colons.
            set cmd {} 
            foreach part [split $chunk \;] {
                append cmd $part
                if {[info complete $cmd\n]} {
                    set cmd [string trimleft $cmd]
                    #drop empty commands
                    if {$cmd eq {}} {
                        continue
                    }
                    if {[string match \#* $cmd]} {
                        #the semi-colon was part of a comment.  Add it back
                        append cmd \;
                    } else {
                        lappend commands $cmd
                        set cmd {}
                    }
                } else {
                    # No complete command yet.
                    # Replace semicolon and continue
                    append cmd \;
                }
            }
            #if there was an "inline" comment, it will be in cmd, with an
            #additional semicolon at the end
            if {$cmd ne {}} {
                lappend commands [string replace $cmd[set cmd {}] end end]
            }
            set chunk {} 
        } else {
            # No end of command yet.  Put the newline back and continue
            append chunk \n
        }
    }
    if {[string trimright $chunk] ne {}} {
        return -code error "Can't parse script into a\
                sequence of commands.\n\tIncomplete\
                command:\n-----\n$chunk\n-----"
    }
    return $commands
}

proc nocomments {commands} {
    set res [list]
    foreach command $commands {
        if {![string match \#* $command]} {
            lappend res $command
        }
    }
    return $res
}
======
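
The outer loop of `[[cmdSplit]` leans on `[[[info complete]]` to find command boundaries.  As a rough illustration of just that step (semicolon handling left out; the sample script and the variable name `chunks` are made up here), lines are accumulated until the accumulated text forms a complete command:

======
# Sample input: a multi-line [set] followed by a one-line [puts].
set script "set x {\n    a b\n}\nputs \$x"

set chunks {}
set chunk {}
foreach line [split $script \n] {
    append chunk $line
    if {[info complete $chunk\n]} {
        # The text so far is a complete command, so a boundary falls here.
        lappend chunks $chunk
        set chunk {}
    } else {
        # Still inside an open brace: restore the newline and keep going.
        append chunk \n
    }
}
puts [llength $chunks]
======

With this input, `chunks` ends up with two elements: the three-line `[[set]` command and the `[[puts]` command.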


** wordSplit **

[Sarnold]: `[[wordSplit]` takes a command and returns its arguments as a list.

======
proc wordSplit {command} {
    if {![info complete $command]} {error "incomplete command"}
    set res ""; # the list of words
    set chunk ""
    foreach word [split $command " \t"] {
        # testing each word until the word being tested makes the
        # command up to it complete
        # example:
        # set "a b"
        # set -> complete, 1 word
        # set "a -> not complete
        # set "a b" -> complete, 2 words
        append chunk $word
        if {[info complete "$res $chunk"]} {
            lappend res $chunk
            set chunk ""
        } else {
            append chunk " "
        }
    }
    return $res
}
======

----

[aspect]: forgive my foolishness, but what is `[[wordSplit]` for?  From the
description it sounds like `[[wordSplit $command]] == [[lrange $command 1
end]]` but it seems to do something different.  If you want the elements of
`$command` as a list, just use `$command`!
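
To illustrate the point with a made-up command (assuming, as this sketch does, that the command string is also a well-formed list, which is not true of every Tcl command):

======
set command {set "a b" c}
puts [lindex $command 0]      ;# the command name
puts [lrange $command 1 end]  ;# the arguments
puts [llength $command]
======

Here the list commands already see three words, with `a b` as a single element.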

[AMG]: `[[wordSplit]` splits an arbitrary string by whitespace, then attempts
to join the pieces according to the result of `[[[info complete]]`.  This results
in a list in which each element embeds its original quote characters.  Since an
odd number of trailing backslashes doesn't cause `[[[info complete]]` to return
false, `[[wordSplit]` doesn't correctly recognize backslashes used to quote
spaces.

I agree that `[[wordSplit]` doesn't appear to serve a useful purpose.  Its
input should already be a valid, directly usable list.

[aspect]: it also does strange things if there are consecutive spaces in the
input.  "each element embeds its original quote characters" seems to be the
important characteristic, but I can't think of a use-case where this would be
desirable.  Hopefully [Sarnold] can elaborate on his original intention so
the example can be focussed (and corrected?).
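
The behaviour with consecutive spaces can be traced to the `[[split]` call: each space is a separator, so a run of spaces produces empty strings.  A small sketch with a made-up input:

======
# Two spaces between the words give an empty element in between.
set pieces [split "set  a" " \t"]
puts [llength $pieces]
======

`[[split]` returns three elements here, the middle one empty, and in `[[wordSplit]` an empty `$word` is still tested with `[[[info complete]]` and appended to the result as an empty word.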

<<categories>> Parsing | Object Orientation