**More Quoting Hell - Regular Expressions 102**

!!!!!!
'''[Tcl Tutorial Lesson 20a%|%Previous lesson%|%]''' | '''[Tcl Tutorial Index%|%Index%|%]''' | '''[Tcl Tutorial Lesson 22%|%Next lesson%|%]'''
!!!!!!

   `regexp ?switches? exp string ?matchVar? ?subMatch1 ... subMatchN?`:   Searches `string` for the regular expression `exp`. If a parameter `matchVar` is given, then the substring that matches the regular expression is copied to `matchVar`. If `subMatchN` variables are given, then the parenthesized parts of the matching string are copied to the `subMatch` variables, working from left to right.

   `regsub ?switches? exp string subSpec varName`:   Searches `string` for substrings that match the regular expression `exp` and replaces them with `subSpec`. The resulting string is copied into `varName`.

The regular expression (`exp`) in the two regular expression parsing commands is evaluated by the Tcl parser during the Tcl substitution phase, before the regular expression parser sees it. This can provide a great deal of power, and also requires a great deal of care.

These examples show some of the trickier aspects of regular expression evaluation. The fields in each example are discussed in painful detail in the most verbose level.

The points to remember as you read the examples are:

   * A left square bracket ([[) has meaning to the substitution phase, and to the regular expression parser.
   * A set of parentheses, a plus sign, and a star have meaning to the regular expression parser, but not the Tcl substitution phase.
   * A backslash sequence (\n, \t, etc) has meaning to the Tcl substitution phase, but not to the regular expression parser.
   * A backslash escaped character (\[) has no special meaning to either the Tcl substitution phase or the regular expression parser.

The phase at which a character has meaning affects how many escapes are necessary to match the character you wish to match. An escape can be either enclosing the phrase in braces, or placing a backslash before the escaped character.
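
As a minimal sketch of the two escape styles just described, the snippet below passes the same character-class expression to `regexp` twice, once protected by braces and once by a backslash inside double quotes. The sample string and the variable names `viaBraces` and `viaBackslash` are made up for illustration only.

```tcl
# Hypothetical sample string, chosen only for this illustration.
set sample {item[42] costs 7}

# Escape style 1: braces. The Tcl parser performs no substitution inside
# braces, so the bracket reaches the regular expression parser untouched.
regexp {[0-9]+} $sample viaBraces

# Escape style 2: backslash. Inside double quotes the backslash stops the
# Tcl parser from treating the bracket as the start of a command substitution.
regexp "\[0-9]+" $sample viaBackslash

puts "braces: $viaBraces, backslash: $viaBackslash"
```

Both variables should end up holding `42`, since in each case the regular expression parser receives the identical expression and reports the first run of digits in the string.
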
To pass a left bracket to the regular expression parser to evaluate as a range of characters takes 1 escape. To have the regular expression parser match a literal left bracket takes 2 escapes (one to escape the bracket in the Tcl substitution phase, and one to escape the bracket in the regular expression parsing). If you have the string placed within quotes, then a backslash that you wish passed to the regular expression parser must also be escaped with a backslash.

Note: You can copy the code and run it in tclsh or wish to see the effects.

----

****Example****

======
#
# Examine an overview of UNIX/Linux disks
#
set list1 [list \
    {/dev/wd0a 17086 10958 5272 68% /} \
    {/dev/wd0f 179824 127798 48428 73% /news} \
    {/dev/wd0h 1249244 967818 218962 82% /usr} \
    {/dev/wd0g 98190 32836 60444 35% /var}]

foreach line $list1 {
    # First number after the device name is the size; the trailing
    # path (starting at the last field's slash) is the mount point.
    regexp {[^ ]* *([0-9]+)[^/]*(/[a-z]*)} $line match size mounted
    puts "$mounted is $size blocks"
}

#
# Extracting a hexadecimal value ...
#
# The prompt and the bracketed default are separated by a tab; it is
# written as \t here so that it survives copying and pasting.
set line "Interrupt Vector?\t\[32(0x20)]"
regexp "\[^\t]+\t\\\[\[0-9]+\\(0x(\[0-9a-fA-F]+)\\)]" $line match hexval
puts "Hex Default is: 0x$hexval"

#
# Matching the special characters as if they were ordinary
#
set str2 "abc^def"
regexp "\[^a-f]*def" $str2 match
puts "using \[^a-f] the match is: $match"

regexp "\[a-f^]*def" $str2 match
puts "using \[a-f^] the match is: $match"

regsub {\^} $str2 " is followed by: " str3
puts "$str2 with the ^ substituted is: \"$str3\""

regsub "(\[a-f]+)\\^(\[a-f]+)" $str2 "\\2 follows \\1" str3
puts "$str2 is converted to \"$str3\""
======

<> Resulting output

======none
/ is 17086 blocks
/news is 179824 blocks
/usr is 1249244 blocks
/var is 98190 blocks
Hex Default is: 0x20
using [^a-f] the match is: ^def
using [a-f^] the match is: abc^def
abc^def with the ^ substituted is: "abc is followed by: def"
abc^def is converted to "def follows abc"
======

<>

!!!!!!
'''[Tcl Tutorial Lesson 20a%|%Previous lesson%|%]''' | '''[Tcl Tutorial Index%|%Index%|%]''' | '''[Tcl Tutorial Lesson 22%|%Next lesson%|%]''' !!!!!!