SCGI

'''SCGI''' is a protocol by which Web applications talk to a Web server.  It is simpler than its competitor [FastCGI], but does not support serving multiple requests over one [TCP] connection.  See the Description section below for more information.



** Implementations **

*** HTTP servers that implement SCGI ***

   * [Apache]
   * [Lighttpd]
   * [nginx]
   * [IIS] with the [ISAPI SCGI extension for IIS%|%ISAPI SCGI extension]

*** Tcl SCGI servers ***

   * `httpd::server.scgi` in [httpd (Tcllib)]
   * [Tanzer]
   * [tcl-scgi]
   * [Wapp]
   * [Woof!]



** Description **

[MJ] - SCGI (Simple Common Gateway Interface) [http://en.wikipedia.org/wiki/SCGI] is a replacement for [CGI] which has the benefit that all requests can be handled by a single instance of the SCGI server (the Tcl script in this case) eliminating the overhead of starting a new process for every request. Its goals are similar to [FastCGI] but the protocol between client (the webserver) and server (the script) is much simpler.

An advantage over Tcl modules embedded in Apache ([websh], [rivet], [mod_tcl]) is that it separates the Tcl part from the webserver, allowing a restart of the Tcl script without restarting the webserver or vice-versa.
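
To make the "much simpler" claim concrete: an SCGI request is a netstring (`<length>:<bytes>,`) holding NUL-terminated header names and values, followed by the raw body. The sketch below builds such a request by hand, which can be handy for poking at a server without a web server in front of it. The proc name `scgi_encode` is made up for this illustration, and the body is assumed to be a byte string already.

======
# Hypothetical helper, not part of any package: encode one SCGI request.
# headers is a flat name/value list; CONTENT_LENGTH and SCGI are added here.
proc scgi_encode {headers body} {
    # CONTENT_LENGTH has to be the first header and SCGI=1 is mandatory.
    set all [list CONTENT_LENGTH [string length $body] SCGI 1 {*}$headers]
    set block ""
    foreach {name value} $all {
        # every name and every value is terminated by a NUL byte
        append block $name \0 $value \0
    }
    # netstring framing "<byte count>:<bytes>," followed by the body
    return "[string length $block]:$block,$body"
}
======

The result could be written with `puts -nonewline` to a client socket configured with `-translation binary`.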

The code below implements a simple SCGI server in Tcl that echoes the request information back as the response. It can easily be extended to fit your own purpose by overriding [[scgi::handle_request sock headers body]]. The code has some 8.5-isms, but it should not be too difficult to make it 8.4 compatible.
I am not completely happy with the redefinition of the fileevent handlers that's going on (I am not sure if it's very elegant or a terrible hack), but I can't see another way to avoid [global] variables holding the data already read. This might actually be a good case for [coroutine]s. Comments are welcome.



** Example implementation **

======
package require html

namespace eval scgi {
    proc listen {host port} {
        socket -server [namespace code connect] -myaddr $host $port
    }

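    proc connect {sock ip port} {
        # input: raw bytes (netstring and body); output: CRLF line endings
        fconfigure $sock -blocking 0 -translation {binary crlf}
        # the data read so far travels along as an argument of the handler
        fileevent $sock readable [namespace code [list read_length $sock {}]]
    }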

    proc read_length {sock data} {
        append data [read $sock]
        if {[eof $sock]} {
            close $sock
            return
        }
        set colonIdx [string first : $data]
        if {$colonIdx == -1} {
            # we don't have the headers length yet
            fileevent $sock readable [namespace code [list read_length $sock $data]]
            return
        } else {
            set length [string range $data 0 $colonIdx-1]
            set data [string range $data $colonIdx+1 end]
            read_headers $sock $length $data
        }
    }

    proc read_headers {sock length data} {
        append data [read $sock]

        if {[string length $data] < $length+1} {
            # we don't have the complete headers yet, wait for more
            fileevent $sock readable [namespace code [list read_headers $sock $length $data]]
            return
        } else {
            set headers [string range $data 0 $length-1]
            set headers [lrange [split $headers \0] 0 end-1]
            set body [string range $data $length+1 end]
            set content_length [dict get $headers CONTENT_LENGTH]
            read_body $sock $headers $content_length $body
        }
    }

    proc read_body {sock headers content_length body} {
        append body [read $sock]

        if {[string length $body] < $content_length} {
            # we don't have the complete body yet, wait for more
            fileevent $sock readable [namespace code [list read_body $sock $headers $content_length $body]]
            return
        } else {
            handle_request $sock $headers $body
        }
    }
}

proc handle_request {sock headers body} {
    array set Headers $headers

    parray Headers
    puts $sock "Status: 200 OK"
    puts $sock "Content-Type: text/html"
    puts $sock ""
    puts $sock "<HTML>"
    puts $sock "<BODY>"
    puts $sock [::html::tableFromArray Headers]
    puts $sock "<H3>Body</H3>"
    # escape the body so that posted markup is shown, not interpreted
    puts $sock "<PRE>[::html::html_entities $body]</PRE>"
    if {$Headers(REQUEST_METHOD) eq "GET"} {
        puts $sock {<FORM METHOD="post" ACTION="/scgi">}
        foreach pair [split $Headers(QUERY_STRING) &] {
            lassign [split $pair =] key val
            puts $sock "$key: [::html::textInput $key $val]<BR>"
        }
        puts $sock "<BR>"
        puts $sock {<INPUT TYPE="submit" VALUE="Try POST">}
    } else {
        puts $sock {<FORM METHOD="get" ACTION="/scgi">}
        foreach pair [split $body &] {
            lassign [split $pair =] key val
            puts $sock "$key: [::html::textInput $key $val]<BR>"
        }
        puts $sock "<BR>"
        puts $sock {<INPUT TYPE="submit" VALUE="Try GET">}
    }
    puts $sock "</FORM>"
    # close BODY only after the whole page content has been written
    puts $sock "</BODY>"
    puts $sock "</HTML>"
    close $sock
}

scgi::listen localhost 9999
vwait forever
======
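
The Description wonders whether [coroutine]s would avoid re-registering the fileevent handlers. The following is one possible sketch of that idea, assuming Tcl 8.6: each connection gets its own coroutine, every readable event just resumes it, and the data read so far lives in ordinary local variables. The namespace `scgi_coro` and the helper `more` are names invented for this illustration; `handle_request` is the proc from the example above.

======
package require Tcl 8.6

namespace eval scgi_coro {
    proc listen {host port} {
        socket -server [namespace code connect] -myaddr $host $port
    }

    proc connect {sock ip port} {
        fconfigure $sock -blocking 0 -translation {binary crlf}
        # One coroutine per connection; it runs until its first yield.
        set coro [namespace current]::reader_$sock
        coroutine $coro [namespace current]::serve $sock
        # Every readable event simply resumes the coroutine.
        fileevent $sock readable $coro
    }

    # Suspend until the socket is readable, then append what arrived to the
    # caller's buffer. Returns 0 once the peer has closed the connection.
    proc more {sock bufVar} {
        upvar 1 $bufVar buf
        yield
        append buf [read $sock]
        expr {![eof $sock]}
    }

    proc serve {sock} {
        set buf ""
        # 1. netstring length prefix, terminated by ":"
        while {[set colon [string first : $buf]] < 0} {
            if {![more $sock buf]} { close $sock; return }
        }
        set length [string range $buf 0 $colon-1]
        set buf [string range $buf $colon+1 end]
        # 2. header block plus the trailing ","
        while {[string length $buf] < $length + 1} {
            if {![more $sock buf]} { close $sock; return }
        }
        set headers [lrange [split [string range $buf 0 $length-1] \0] 0 end-1]
        set body [string range $buf $length+1 end]
        # 3. body of CONTENT_LENGTH bytes
        set content_length [dict get $headers CONTENT_LENGTH]
        while {[string length $body] < $content_length} {
            if {![more $sock body]} { close $sock; return }
        }
        # handle_request writes the response and closes the socket
        handle_request $sock $headers $body
    }
}
======

With this variant, `scgi_coro::listen localhost 9999` followed by `vwait forever` would take the place of the last two lines of the example. Like ''read_length'' above, the sketch treats an EOF before the request is complete as an aborted request.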



** Discussion **

[MJ] 20071220 - Instead of reading the length one byte at a time, I changed the code to read as much as possible. This may or may not perform better, but at least it fixes a DoS attack when the part before the first : is sent very slowly. This would result in a very tight while loop being executed, pegging the CPU at 100%.
[MJ] 2023-06-08: I actually don't think this is an issue, because the webserver will only forward the request to the SCGI server after it's complete. Otherwise the CONTENT_LENGTH is unknown.

[MJ] - For file uploads the current implementation is not ideal. Here you should really override the ''read_body'' proc to fcopy the socket to a local file. Generally the code above can be expanded a bit to make integration into your app easier.
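
A minimal sketch of such an override, assuming Tcl 8.6 for [[file tempfile]]. The procs `upload_done` and `handle_upload` are hypothetical names introduced here, and a real application would also want to check that the number of bytes copied matches what was announced.

======
package require Tcl 8.6   ;# for [file tempfile]

namespace eval scgi {
    # Replacement for read_body: stream the body to a temporary file.
    proc read_body {sock headers content_length body} {
        # fcopy must own the socket, so drop the readable handler first.
        fileevent $sock readable {}
        set out [file tempfile path]
        fconfigure $out -translation binary
        # Bytes that arrived together with the headers are already in $body.
        puts -nonewline $out $body
        flush $out
        set remaining [expr {$content_length - [string length $body]}]
        set done [namespace code [list upload_done $sock $headers $out $path]]
        if {$remaining > 0} {
            # fcopy appends the byte count (and an error, if any) to $done
            fcopy $sock $out -size $remaining -command $done
        } else {
            {*}$done 0
        }
    }

    proc upload_done {sock headers out path bytes {error ""}} {
        close $out
        if {$error ne ""} {
            file delete $path
            close $sock
            return
        }
        handle_upload $sock $headers $path
    }

    # Hypothetical application hook: report the stored size and clean up.
    proc handle_upload {sock headers path} {
        puts $sock "Status: 200 OK"
        puts $sock "Content-Type: text/plain"
        puts $sock ""
        puts $sock "stored [file size $path] bytes"
        file delete $path
        close $sock
    }
}
======

Because ''read_headers'' resolves ''read_body'' at call time, evaluating this after the example code is enough to switch every request over to the file-based variant.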

[sdw] - 2007-04-02 - A "standard" template for your apps can be found at [Tcl Web Object Standards]

[APN] - [Woof!] uses a descendant of the above code for its SCGI support. Note the above code does not protect against malformed (and malicious) protocol input. Will update here once I fix Woof.

[MJ] - Usually a webserver forms the SCGI requests and I think it's a fair assumption that those requests are valid. But because it has been a while since I looked at this, what would malicious protocol input be?

[APN] I overlooked that the requests come from your own webserver, so you are right. By malicious, I meant input that would cause DoS attacks, e.g. sending a header length greater than the actual data would cause the above code to spike to 100% CPU, which I think is easily fixed by an EOF check. Thanks for this code BTW, as it is likely to be [Woof!]'s preferred web server interface mechanism as it supports Apache (with mod_scgi), nginx (mod_scgi), lighttpd (built-in) and IIS (with isapi_scgi).

[MJ] - The request length is also determined by the server, so if that forms the requests correctly, that's not really a problem either. Of course an [[eof]] check never hurts. Also you are very welcome, I am glad this is useful to someone. For me it was just a nice small project.
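
For completeness, a sketch of what the [[eof]] check discussed above might look like in ''read_headers'', together with a sanity check on the announced length. The proc `valid_length` and the 65536-byte limit are assumptions made for this illustration; ''read_body'' is the proc from the example and would need the same [[eof]] test.

======
namespace eval scgi {
    # Hypothetical limit on the size of the header block, in bytes.
    variable max_header_length 65536

    # Accept only a plain decimal length within the limit.
    proc valid_length {text} {
        variable max_header_length
        if {![regexp {^[0-9]{1,9}$} $text]} {
            return 0
        }
        # scan avoids the octal interpretation of a leading 0
        expr {[scan $text %d] <= $max_header_length}
    }

    # read_headers from the example, with the two checks added.
    proc read_headers {sock length data} {
        if {![valid_length $length]} { close $sock; return }
        set length [scan $length %d]
        append data [read $sock]

        if {[string length $data] < $length+1} {
            # peer closed before sending everything: stop waiting
            if {[eof $sock]} { close $sock; return }
            fileevent $sock readable [namespace code [list read_headers $sock $length $data]]
            return
        }
        set headers [lrange [split [string range $data 0 $length-1] \0] 0 end-1]
        set body [string range $data $length+1 end]
        read_body $sock $headers [dict get $headers CONTENT_LENGTH] $body
    }
}
======

Without the [[eof]] test, a closed socket stays readable forever and the handler keeps re-arming itself, which is the 100% CPU case described above.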

[MS] - I do wonder what webservers do with a POST request having a form variable ''CONTENT_LENGTH=987654321''. IIUC, the SCGI protocol [http://python.ca/scgi/protocol.txt] forbids form variables named ''CONTENT_LENGTH'' and ''SCGI'', as it forbids duplicate header names and those two are obligatory. Also ''REQUEST_METHOD'' and ''REQUEST_URI'' are likely to get you in trouble; any others?

[APN] Form variables are not sent as HTTP headers. They are part of the content. The SCGI restriction refers to HTTP headers only.

[GJW] I added a ''host'' argument to the server socket, because you typically do not want the SCGI service to listen on all interfaces.  In the most common cases, the SCGI service will be on the same host as the web server, and listening on all interfaces would expose it to malicious input.  If the SCGI service is on a different host, behind a firewall so that only the web service can talk to it, then one may pass 0 as the ''host'' to get the previous behavior.

<<categories>> Protocol | Internet | Web