Scan and modify text files

[Arjen Markus] (21 February 2006) I am facing the task of modifying a lot of text files in a rather mechanical way. 
I used to do this kind of thing with [AWK], 
but Tcl lends itself to this too. 
It is just a matter of the right "little language". 

The task I am facing is not really interesting for anyone else, 
but the characteristics are fairly common:

   * Certain modifications are required for a particular part of the file
   * Some modifications apply to particular lines
   * Defining regular expressions to capture exactly the lines you need can be tricky. <<br>>So it is probably easier to do it in steps.

The script below allows you to delimit sections of a file by a start and a stop pattern. 
If the lines fall within a section, the associated script is run. 
To apply default processing (just copying to the output for instance), 
there is a fallback pattern - "otherwise". 
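
The start/stop idea can be sketched in isolation, without any file handling. The snippet below is only an illustration under stated assumptions: the helper name `transformSection`, the sample lines and the `^BEGIN`/`^END` patterns are hypothetical and not part of the script that follows, and `apply` requires Tcl 8.5 or later. Like the full script, it treats the lines matching the start and stop patterns as part of the section.

```tcl
# Hypothetical helper (not part of modify.tcl): run a script on every line
# inside a section delimited by a start and a stop pattern, and copy all
# other lines unchanged (the "otherwise" case).
proc transformSection {lines start stop script} {
    set inside 0      ;# state carried over from one line to the next
    set result {}
    foreach line $lines {
        set active $inside
        if { $active } {
            # The stop line still belongs to the section, so only the
            # state for the following lines is switched off
            if { [regexp -- $stop $line] } {
                set inside 0
            }
        } elseif { [regexp -- $start $line] } {
            set inside 1
            set active 1
        }

        if { $active } {
            # Section action: the script sees the line as $line
            lappend result [apply [list line $script] $line]
        } else {
            # Fallback: copy the line as is
            lappend result $line
        }
    }
    return $result
}

set input [list "keep this" "BEGIN data" "a = 1" "END data" "keep this too"]
set output [transformSection $input {^BEGIN} {^END} {string toupper $line}]
puts [join $output \n]
```

Only the three lines from "BEGIN data" up to and including "END data" are passed to the script; the other two are copied untouched.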

 # modify.tcl --
 #     Yet another AWK-like utility. This one reads a file line by line
 #     and decides on the basis of patterns marking the beginning and
 #     end of a block of lines (section) what actions to take.
 #     Note:
 #     - sections may overlap
 #     - what they do is up to you
 #     - special sections are: begin, end and otherwise
 #     - the command "nextline" causes the actions for any subsequent
 #       sections to be cancelled.

 namespace eval ::Sections {
     variable section_number 0
     variable section_data   {}
     variable section_active {}
     variable nextline       0

     namespace export section begin end otherwise nextline scanfile

     proc _begin     {} {}
     proc _end       {} {}
     proc _otherwise line {}
 }

 # section --
 #     Define the beginning and end of a section and the actions to take
 # Arguments:
 #     begin       The regexp pattern marking the start
 #     end         The regexp pattern marking the end
 #     actions     The script to be run
 # Result:
 #     None
 proc ::Sections::section {begin end actions} {
     variable section_number
     variable section_active
     variable section_data

     lappend section_data   $begin $end
     lappend section_active 0
     proc ::Sections::$section_number line $actions
     incr section_number
 }

 # begin --
 #     Define the actions for the beginning of a file
 # Arguments:
 #     actions     The script to be run
 # Result:
 #     None
 proc ::Sections::begin {actions} {
     proc ::Sections::_begin {} $actions
 }

 # end --
 #     Define the actions for the end of a file
 # Arguments:
 #     actions     The script to be run
 # Result:
 #     None
 proc ::Sections::end {actions} {
     proc ::Sections::_end {} $actions
 }

 # otherwise --
 #     Define the actions for lines not falling in any section
 # Arguments:
 #     actions     The script to be run
 # Result:
 #     None
 proc ::Sections::otherwise {actions} {
     proc ::Sections::_otherwise line $actions
 }

 # nextline --
 #     Instruct the scanning procedure to skip all remaining sections
 # Arguments:
 #     None
 # Result:
 #     None
 proc ::Sections::nextline {} {
     variable nextline
     set nextline 1
 }

 # scanfile --
 #     Scan the file, taking actions appropriate for the
 #     sections the line is part of
 # Arguments:
 #     filename    Name of the file to scan
 # Result:
 #     None
 proc ::Sections::scanfile {filename} {
     variable section_number
     variable section_data
     variable section_active
     variable nextline

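     set infile [open $filename r]

     # Run the actions registered via "begin"
     _begin

     while { [gets $infile line] >= 0 } {
         set nextline 0

         set id -1
         set insection 0
         foreach {start stop} $section_data active $section_active {
             incr id
             if { $active } {
                 # The stop line itself is still handled by the section
                 if { [regexp $stop $line] } {
                     lset section_active $id 0
                 }
             } else {
                 if { [regexp $start $line] } {
                     lset section_active $id 1
                     set active 1
                 }
             }

             if { $active } {
                 set insection 1
                 $id $line
                 # "nextline" cancels the actions of the remaining sections
                 if { $nextline } {
                     break
                 }
             }
         }
         if { ! $insection } {
             _otherwise $line
         }
     }

     # Run the actions registered via "end"
     _end

     close $infile
 }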

 # main --
 #     Simple test case and demo
 namespace import ::Sections::*

 begin {
     puts "List of procedures:"
     set ::count 0
 }

 section "^#.*--" "^ *proc" {
     puts "| $line"
     if { [regexp "#.*--" $line] } {
         set ::count 0
     }
 }

 section "{" "^#.*--" {
     incr ::count

     if { $line == "\}" } {
         # Naive criterion for the end of a procedure
         puts "(Number of lines: $::count)"
     }
 }

 scanfile $argv0
Very useful indeed! I fixed a small bug: the "if {$insection} ..." test is better placed outside the foreach loop.

<<categories>> File | String Processing