Version 24 of WebSocket

Updated 2011-11-26 22:25:56 by jbr

WebSockets are a nice alternative to XMLHttpRequest for bi-directional communication between a web browser and a server application. They need a browser that supports the WebSocket API (e.g. Chrome) and a server that supports the WebSocket protocol.

jbr 2011-05-1 If someone is adding attributions to the wiki, I wish that they would get it right. If you are not absolutely sure who added a comment then please leave it be.

agb Dec. 2010. Chrome now supports an updated version of the websocket protocol, so the wibble example previously here no longer works (to see it, have a look in this page's history). The changes to get the current version working are non-trivial.

jbr 2010-12-20 - Here is code that will allow wibble to handshake with the new spec. The version of Chrome that I have (8.0.552.231) has the new handshake but sends the old data framing. I can send data from the client, but, I haven't gotten it to accept data messages from the server. Wibble.tcl needs to be patched to add a way for it to release the socket that will be the websocket channel without responding and closing it:

agb 2010-12-21 - I made a small change to ::wibble::ws-handle to check that chan read actually reads a byte. With this change I have successful, bi-directional messages over the web socket with chrome 8.0.552.224. Thanks for updating wibble.

 # Abort processing on this client.
 proc wibble::abortclient {} {
    return -code 7
 }

AMG: I take it that this command is to be called by a zone handler in order to get Wibble to terminate the coroutine without closing the socket. Correct?

Also, see my comments on Wibble wish list ([L1 ]) for an alternative, less invasive approach.

jbr: Set keepalive 0 at the top of ::wibble::process and then change the exceptional return handling like this:

     } on 7 outcome {
        set keepalive 1
     } finally {
        if { !$keepalive } {
            catch {chan close $socket}
        }
     }

AMG: Some time after you wrote the above, I have changed Wibble to have customizable cleanup handlers. With the latest version of Wibble, instead of modifying the finally block, change the initialization of the cleanup list (top of [process]) to the following:

    set cleanup {
        {chan close $file}
        {if {!$keepalive} {chan close $socket}}
        {dict unset ::wibble::icc::feeds $coro}
    }

jbr: Add this to the zone handlers:

  wibble::handle /ws websocket handler ws-demo

This is your server side callback:

 proc ::ws-demo { event sock { data {} } } {
    switch $event {
        connect {}
        message {
            puts "WS-Demo: $event $sock $data"
        }
    }
  
  ::wibble::ws-send $sock "Hello"
 }

AMG: Are connect and message the only two events that can happen?

jbr: Connect and message are the only two events. WebSockets is a very low level thing (data packets) with the application specific messaging completely undefined.

jbr: Utility to help the server send data frames, doesn't work yet!!

 proc ::wibble::ws-send { sock message } {
    # New data framing?
    #puts -nonewline $sock [binary format cc 4 [string length $message]]$message

    # Old data framing?
    puts  -nonewline $sock "\x00"
    puts  -nonewline $sock $message
    puts  -nonewline $sock "\xFF"

    flush $sock
 }

Handler to accept data from browser. Uses old data framing.

 proc ::wibble::ws-handle { handler sock } {
    if { [chan eof $sock] } {
        puts "Closed $sock"
        close $sock
    } else {
        set code [read $sock 1]
        # binary scan returns 0 when no byte was read, so this check also
        # guards against an empty read.
        if {[binary scan $code c code]} {
          switch $code {
            0 {
                set message {}

                # Collect bytes up to the 0xFF terminator; give up at EOF.
                while { [set c [read $sock 1]] ne "\xFF" && ![chan eof $sock] } {
                    append message $c
                }
                $handler message $sock $message
            }
            default {
                puts "Bad Blocking: $code"
            }
          }
        }
    }
 }

The Zone Handler

 package require md5
  
 proc ::wibble::websocket { state } {
    set upgrade    {}
    set connection {}
    dict with state request header {}

    if { $connection ne "Upgrade" || $upgrade ne "WebSocket" } {
        return
    }

    set sock [dict get $state request socket]

    puts "WebSocket Connect: $sock"

    set key1 [regsub -all {[^0-9]} ${sec-websocket-key1} {}]
    set spc1 [string length [regsub -all {[^ ]}   ${sec-websocket-key1} {}]]
    set key2 [regsub -all {[^0-9]}   ${sec-websocket-key2} {}]
    set spc2 [string length [regsub -all {[^ ]} ${sec-websocket-key2} {}]]

    set key3 [read $sock 8]

    set handler [dict get $state options handler]
    chan event $sock readable [list ::wibble::ws-handle $handler $sock]

    set key1 [expr {$key1 / $spc1}]
    set key2 [expr {$key2 / $spc2}]

    set challenge [binary format II $key1 $key2]$key3
    set response  [md5 $challenge]

    puts $sock "HTTP/1.1 101 WebSocket Protocol Handshake"
    puts $sock "Connection: Upgrade"
    puts $sock "Upgrade: WebSocket"
    puts $sock "Sec-WebSocket-Origin: http://localhost:8080"                ; # This shouldn't be hard coded!!
    puts $sock "Sec-WebSocket-Location: ws://localhost:8080/ws/demo"
    puts $sock ""

    chan configure $sock -translation binary
    puts $sock $response
    chan flush $sock

    $handler connect $sock  ; # There should be an option to pass a session Id here.

    abortclient
 }

AMG: Thanks for the code, guys. I will need to ponder some more before integrating this into Wibble, but I do think I want this feature. However, I think it would benefit from tighter integration. As far as I can tell, it leverages Wibble for establishing the connection but then takes over all I/O. This concept is quite similar to something JCW shared with me the other day, namely an implementation of Server-Sent Events [L2 ] [L3 ]. Whatever I do, I would like it to support both protocols, or at least their common requirements.

If you're wondering why I haven't integrated all this sooner, it's because AJAX was my priority. It may be terribly clumsy compared to WebSockets and Server-Sent Events, but it also has the most browser support.

jcw Neat... jbr's return 7 and keepalive idea look like a very useful tweak:

jbr 2011-05-01 Andy has offered a better way to handle this by removing the socket from the coroutines list and returning an uncaught error. No need to hack Wibble's main zone handler body.

     } on 7 outcome {
        set keepalive 1
     } finally {
        if { !$keepalive } {
            catch {chan close $socket}
        }
     }

Better than what I'm doing right now, which is to do an "icc get" to grab control over the socket by suspending the co-routine indefinitely. The problem with that is that I always get an error on socket close, as wibble tries to resume and send a response to the (now closed) socket. What's not clear to me is whether the "return 7" also causes the request's co-routine to be cleaned up right away (seems like a good idea).

AMG: The coroutine will always be cleaned up, thanks to the "finally" clause inside [process]. The only way to avoid the "finally" clause is to delete the current coroutine command (rename [info coroutine] "") then yield.

A few days ago I came up with another approach that I prefer to any presented on this page or the Wibble wish list: define a new key in the response dict that defines a custom I/O handler that [process] will execute instead of doing its normal post-[getresponse] activities. This way, more of the Wibble infrastructure is available to the custom code: error handling, automatic cleanup, and the ability to loop again and get another HTTP request from the same socket.


2011.1013 jbr Here we are almost a year later with an update.

WebSocket has again moved to a new handshake & framing. Here is a zone handler for Andy's newest wibble and chrome 14 websockets.

 package require sha1
 package require base64

 # Utility proc to frame and send short strings of at most 125 bytes
 # (a length byte of 126 or 127 would select an extended length field)
 #
 proc ::wibble::ws-send { sock message } {
    puts -nonewline $sock [binary format cc 0x81 [string length $message]]$message
    flush $sock
 } 
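
The length limit above comes from the framing: a length byte below 126 is the payload length itself, 126 announces a 16-bit length, and 127 announces a 64-bit length. Here is a minimal sketch of a frame builder that covers all three cases. The name ws-frame is made up for illustration and is not part of Wibble; it assumes the payload is already a byte string and builds a single unfragmented, unmasked frame.

```tcl
# Hypothetical helper (not part of Wibble): build one unfragmented,
# unmasked server-to-client frame. $payload must already be a byte string.
proc ws-frame {payload {opcode 0x1}} {
    set len [string length $payload]
    # FIN bit set, plus the 4-bit opcode (0x1 = text, 0x2 = binary).
    set head [binary format c [expr {0x80 | ($opcode & 0x0F)}]]
    if {$len < 126} {
        # Short payload: the length fits in the second header byte.
        append head [binary format c $len]
    } elseif {$len < 65536} {
        # 126 announces a 16-bit big-endian length.
        append head [binary format cS 126 $len]
    } else {
        # 127 announces a 64-bit big-endian length.
        append head [binary format cW 127 $len]
    }
    return $head$payload
}
```

It could be used as puts -nonewline $sock [ws-frame [encoding convertto utf-8 $message]] followed by flush $sock, so that the length counts bytes rather than characters.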

 # WebSocket handler proc to receive short (at most 125 bytes) text format frames
 #
 proc ::wibble::ws-handle { handler sock } {

    if { [chan eof $sock] } {
        close $sock
    } else {
        binary scan [read $sock 1] c opcode
        binary scan [read $sock 1] c length

        set opcode [expr {$opcode & 0x0F}]
        set length [expr {$length & 0x7F}]

        binary scan [read $sock 4]       c* mask
        binary scan [read $sock $length] c* data

        set msg {}
        set i    0
        foreach char $data {
            append msg [binary format c [expr { $char^[lindex $mask [expr { $i%4 }]] }]]
            incr i
        }       
            
        $handler message $sock $msg
    }
 }

 # Zone handler
 #
 proc ::wibble::websocket { state } {
    set upgrade    {}
    set connection {}

    dict with state request header {}

    if { $connection ne "Upgrade" || $upgrade ne "websocket" } {
        return
    }

    set sock [dict get $state request socket]

    puts "WebSocket Connect: $sock"

    set response [base64::encode [sha1::sha1 -bin ${sec-websocket-key}258EAFA5-E914-47DA-95CA-C5AB0DC85B11]]
    set handler  [dict get $state options handler]

    puts $sock "HTTP/1.1 101 WebSocket Protocol Handshake"
    puts $sock "Upgrade:    websocket"
    puts $sock "Connection: Upgrade"
    puts $sock "Sec-WebSocket-Accept: $response"
    puts $sock ""

    chan configure $sock -translation binary
    chan event     $sock readable [list ::wibble::ws-handle $handler $sock]

    $handler connect $sock

    return -code 7
 }
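
The Sec-WebSocket-Accept computation in the zone handler can be checked on its own. This sketch wraps the same sha1 and base64 calls in a proc (the name ws-accept is made up here) and feeds it the sample key from the protocol specification.

```tcl
package require sha1
package require base64

# Hypothetical helper name. The GUID is the fixed string defined by the
# protocol; it is appended to the client's Sec-WebSocket-Key header value.
proc ws-accept {key} {
    set guid 258EAFA5-E914-47DA-95CA-C5AB0DC85B11
    return [base64::encode [sha1::sha1 -bin $key$guid]]
}

# Sample key from the protocol specification.
puts [ws-accept dGhlIHNhbXBsZSBub25jZQ==]
# prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```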

At the top of the ::wibble::process proc I initialized the keepsock variable and modified the cleanup procs like this:

    set keepsock 0
    set cleanup {
        {chan close $file}
        { if { !$keepsock } { chan close $socket } }
        {dict unset ::wibble::icc::feeds $coro}
    }

Then I added a clause in the try structure to catch the return -code 7 from the zone handler:

    } on 7 outcome {
            set keepsock 1
    } finally {

Works for me. Thanks Andy.


AMG: I made an update of the above for version 2011-11-24 [L4 ]. Thanks to some spiffy new features in Wibble, it doesn't require any core modifications. However, I'm not quite ready to post it here, since I need to test it. I grabbed a copy of Chrome, but I don't know how to get it to talk to WebSocket anything. If you could please post an example HTML/JavaScript file that makes use of the above, I'll debug what I've got and post the combined demo here.

A more flexible alternative to the WebSocket handler proc receiving the event (connect or message) as its first argument is to support separate connect and message handler command prefixes in the options dict. For compatibility with existing handlers, the connecthandler and messagehandler could name the same proc, but with an extra connect or message argument. Does this sound like a worthwhile change? Also, how about a disconnect handler? Would that ever have any value? The ICC feed mechanism already has a lapse system which is designed to detect application-level user disconnects, as opposed to protocol-level TCP socket disconnects.

In my development code, I got rid of the socket arguments since the socket name is always [namespace tail [info coroutine]].

In addition to the short-string protocol (payloads under 126 bytes), is there a way to send and receive binary data? Should we support it?


AMG: Disregard the above, it's overcome by events. I redid my update; it's totally new code now. It still doesn't require any core modifications, which is good, and I'm still not ready to post it here, which is bad. However, I did test it, and it works so far. I ran into some fundamental design issues, which I'll discuss below. I decided against separate connecthandler and messagehandler; a single handler works well enough, and it can dispatch to separate procs if needed. I kept the disconnect event, on the theory that a handler might want an opportunity to clean up after itself. The socket arguments are still gone. I added support for binary data, frames longer than 126 bytes, ping/pong, close frames, fragmentation and continuation frames, basically everything I could find in the WebSocket protocol draft document [L5 ].

The big design problem derives from the fact that WebSocket doesn't require the client and server to take turns the way HTTP does. Wibble successfully models HTTP with this main loop, running inside a coroutine:

 while {1} {
     set request [getrequest $port $socket $peerhost $peerport]
     set response [getresponse $request]
     if {![{*}$sendcommand $socket $request $response]} {
         chan close $socket
         break
     }
 }

With WebSocket, both sides can talk simultaneously; yet this is done with only a single TCP connection. This does not map well to Wibble's one-coroutine-per-socket model, shown above. The Wibble coroutine cycles between three states: get the client data, generate the response, send the response, then repeat or terminate.

Let's take a simple example: a calculator with a clock. Implement these two features as separate connections, and there's little challenge. But multiplex them together, and you've got problems. The client should be able to send a new math expression at any time, to which the server should rapidly respond with the result. Every second, the server should also send the new time, without the client asking.

My first inclination is to use [icc] to wait for either readability or timeout. ([icc] can also be used to wait for events coming from other coroutines, but that's not important here.) That would work, if not for the problem of incomplete reads. Reading the WebSocket data is nontrivial. You have to read the first two bytes for the basic header, then read zero, two, or eight more for the extended length (depending on the length given in the basic header), plus four more for the mask, and then you have to read the actual data. For extra joy, consider that a message can be fragmented across multiple frames, and that control frames can be interleaved with the fragmented data frames. Also consider that each frame can be up to 14+2**64 bytes in size. Socket readability certainly does not guarantee that the entire frame is immediately available for reading. This isn't UDP. ;^)
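
To make the incomplete-read problem concrete, here is a rough sketch of a parser that works on a byte buffer instead of reading the socket directly. It is not the code discussed above, and the name ws-take-frame is made up. It returns an empty result until a whole frame has accumulated, so the caller can go back to the event loop and try again after the next read. It does not reassemble fragmented messages or treat control frames specially.

```tcl
# Hypothetical helper: try to take one complete frame from the front of the
# byte buffer held in the caller's variable $bufVar. Returns {} if more bytes
# are needed; otherwise removes the frame from the buffer and returns a dict
# with the keys fin, opcode, and payload (already unmasked).
proc ws-take-frame {bufVar} {
    upvar 1 $bufVar buf
    set have [string length $buf]
    if {$have < 2} {return {}}
    # Basic header: FIN + opcode, then mask flag + 7-bit length.
    binary scan $buf cucu b0 b1
    set fin    [expr {($b0 >> 7) & 1}]
    set opcode [expr {$b0 & 0x0F}]
    set masked [expr {($b1 >> 7) & 1}]
    set len    [expr {$b1 & 0x7F}]
    set pos 2
    # Extended length: two bytes after 126, eight bytes after 127.
    if {$len == 126} {
        if {$have < 4} {return {}}
        binary scan $buf @2Su len
        set pos 4
    } elseif {$len == 127} {
        if {$have < 10} {return {}}
        binary scan $buf @2Wu len
        set pos 10
    }
    # Four mask bytes follow when the mask flag is set.
    set mask {}
    if {$masked} {
        if {$have < $pos + 4} {return {}}
        binary scan $buf @${pos}cu4 mask
        incr pos 4
    }
    if {$have < $pos + $len} {return {}}
    set payload [string range $buf $pos [expr {$pos + $len - 1}]]
    set buf [string range $buf [expr {$pos + $len}] end]
    if {$masked} {
        # XOR each payload byte with the mask byte at index i mod 4.
        binary scan $payload cu* bytes
        set i 0
        set out {}
        foreach byte $bytes {
            lappend out [expr {$byte ^ [lindex $mask [expr {$i % 4}]]}]
            incr i
        }
        set payload [binary format c* $out]
    }
    return [dict create fin $fin opcode $opcode payload $payload]
}
```

A readable handler would append whatever [chan read] returns to the buffer and then call ws-take-frame in a loop until it returns an empty result.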

Since reading the data isn't atomic, it's a stateful process, with a potential wait state everywhere a read happens. At every wait state the system needs to return to the event loop so that a different socket can be serviced. Sounds like a job for coroutines. ;^) But... what if the other operation waiting to be processed exists in the same coroutine as the read? That doesn't work.

Here's what I'm considering. Create a second coroutine for each WebSocket connection which doesn't directly fit the HTTP turn-taking model. Name the coroutine the same as the socket, but put it in the ::wibble::ws namespace, such that it's separate from the one in the ::wibble namespace yet [namespace tail [info coroutine]] is still the socket name. The original Wibble coroutine is in charge of reading the client data, and whenever a message is received, it sends it to the new coroutine via [icc]. The new coroutine also roughly follows the HTTP model shown above, but with [getrequest] replaced with [icc get]. This way it aggregates messages from multiple sources and gets them atomically. When it wants to send, it writes to the socket. Meanwhile, the first coroutine only writes to the socket when initiating and tearing down the connection.
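
The shape of that second coroutine can be sketched with plain Tcl 8.6 coroutines, using the calculator-with-clock example. This is only an illustration and does not use [icc]: each resume of the coroutine stands in for one [icc get], and the names ws-worker and record are made up. The send command prefix would normally write a frame to the socket; here it just records what would be sent.

```tcl
# Requires Tcl 8.6 for coroutines.
# Hypothetical per-connection worker. Each resume delivers one event as a
# two-element list {source payload}; $send is a command prefix that delivers
# a reply to the client.
proc ws-worker {send} {
    while {1} {
        lassign [yield] source payload
        switch -- $source {
            client {
                # Only evaluate plain arithmetic; anything else is rejected.
                if {[regexp {^[-+*/(). 0-9]+$} $payload]
                        && ![catch {expr $payload} value]} {
                    {*}$send "result $value"
                } else {
                    {*}$send "error"
                }
            }
            timer {
                {*}$send "time $payload"
            }
            close {
                return
            }
        }
    }
}

# Stand-in for writing to the socket: remember what would have been sent.
set sent {}
proc record {msg} {lappend ::sent $msg}

coroutine worker ws-worker record
worker {client "6 * 7"}    ;# message from the read coroutine
worker {timer 12:00:01}    ;# tick from a timer
worker {close {}}          ;# the connection went away
```

Because the worker handles one event per resume, a client message and a timer tick can never interleave halfway through a reply.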

To make this efficient, I would have to modify [icc] to not concatenate the event identification and data together into a single string. Instead, the payload would need to be a separate Tcl_Obj.

Now I'm gonna digress from WebSocket in the interest of searching for a unified architecture, since I'm always looking for ideas. :^)

This two-coroutine model isn't necessary for any of the pure-HTTP concepts, since the communication direction strictly alternates. Imagine if I went ahead and made two coroutines anyway. They'd always be taking turns. Each one would do its thing, notify the other that it's done, then wait for the other to finish. Therefore they might as well just be a single coroutine, which is exactly what I have.

Actually, there is one major exception, one time when the HTTP client can do something that's of interest to the server even though it's the server's turn to talk. That one thing is: disconnect. In an AJAX long polling situation, the server can hang for minutes at a time waiting for there to be something worthwhile to report. During that time, the client can disconnect, but the server wouldn't be able to tell until it tries to read from the client. Of course, the server can't try to read until after it sends something. My current solution is to have the server periodically send a no-op, just as an excuse to check if the client is still alive. If I had a two-coroutine model for AJAX long polling, the client read coroutine could always be on the lookout for client disconnect; when it happens, it can use [icc] to let the other coroutine know it's quitting time. (More likely, it'll use [icc] to let all the other coroutines know that the one client died, and it'll just directly terminate the coroutines associated with the socket.)

However, there's still a problem: unclean disconnects. If the client makes a request, it's now the server's turn to respond. While the server is hanging out waiting for something worthwhile to happen, the client sends some more data. The only way the read coroutine can tell the difference between the client sending data and the client disconnecting is to try to read from the client, so that's what it does. It can't do this without bound, so eventually it'll have to stop reading. At this point, the read coroutine again loses its ability to know when the client disconnects. One possible solution is to read without bound, but that opens up a DoS attack. Another possible solution is to forcibly disconnect the client if it sends too much data out-of-turn, but this interferes with HTTP pipelining. Everything comes down to the fact that the portable C I/O API doesn't have an out-of-band notification of client disconnects. So, I can't think of a solution solid enough to justify the complexity of adding another coroutine, and I strongly lean against adopting a two-coroutine model for the Wibble core. But I'm definitely open to suggestions!


jbr 2011-11-26 Andy, my thinking here is that once you've gotten an accept via the websocket zone handler then you just have a socket. You can use icc if you like or you can just register a fileevent. There is no reason to tie it to any other wibble infrastructure or HTTP concepts; it's just a socket connected to some JavaScript in a web browser. The wibble co-routine should just return without closing the socket.
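
The "just register a fileevent" route might look roughly like this. It is a sketch only: a [chan pipe] (Tcl 8.6) stands in for the socket handed over by the zone handler, and the names on-readable, ::inbuf, and ::done are made up. The handler never blocks; it appends whatever has arrived to a per-channel buffer and closes the channel when the peer goes away.

```tcl
# Requires Tcl 8.6 for [chan pipe].
# Hypothetical readable handler: drain whatever has arrived into a
# per-channel buffer, and close the channel once the peer is gone.
proc on-readable {chan} {
    append ::inbuf($chan) [chan read $chan]
    if {[chan eof $chan]} {
        chan event $chan readable {}
        chan close $chan
        set ::done 1
    }
}

# A pipe stands in for the socket handed over by the zone handler.
lassign [chan pipe] rd wr
chan configure $rd -blocking 0 -translation binary
chan event $rd readable [list on-readable $rd]

puts -nonewline $wr "abc"
chan close $wr                     ;# flushes, then the reader sees EOF
after 2000 {set ::done timeout}    ;# safety net so the sketch cannot hang
vwait ::done
```

With a real WebSocket connection the buffer would then be handed to a frame parser rather than used directly.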