Every Word is a Constructor

'''Every word is a constructor''' ('''EWIAC''') conveys the idea that while the
arguments to a [routine] are derived from the [word%|%words] in a [command],
those words are not in themselves the values for those arguments.



** See Also **

   [Tcl Chatroom] 2017-06-02:   A discussion on the topic. 



** Description **


A value has some type that is determined in the context where it is utilized.
Currently, Tcl commands create and use this type information and it is even
stored internally, but Tcl discards this type information whenever the value is
used in a context that requires a different interpretation of the value.  The
string representation of the value, and the string representation alone,
conveys all the meaning for the value, and it is up to the consumer of the
value to decide how to interpret it.  This insulates each command from the
treatment values received at the hands of other commands.  A command cannot
rely on any previous interpretation of the string representation for that
value.  Logically, the arguments each routine receives have no prior
interpretation, and any interpretation assigned within a routine is valid only
within that routine, and as a cached interpretation for the next command to use
if it so decides.

Groups of federated commands
cooperate with each other to use the same interpretation of a value.
This allows the commands to be more performant since they can use the
preexisting internal representation.  This illustrates
that Tcl is in fact capable of carrying around type information for a given
value and that the type information survives transport from one command to
another.  Currently, this internal type information is used strictly as cached
information:  If the cached information conflicts with the current
interpretation of the string representation, it is discarded.  This allows the
programmer to freely use the value produced by a substitution in contexts where
different types of values are required.  This may sound like a convenient
thing, but in practice it is limiting.  Programmers think of the values as typed
values and construct their programs with these types in mind.  Type-consistent
usage is far more common than type-variable usage.  If Tcl made the types of
values observable at the script level, they could be used to great effect.
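
The caching behaviour described above can be observed with the unsupported
debugging command `::tcl::unsupported::representation`.  The following is a
minimal sketch, assuming Tcl 8.6 or later.  The exact wording of the report is
not guaranteed to be stable, and the variable name `x` is only illustrative:

======
set x {a 1 b 2}

# Using the value as a list caches a list internal representation.
llength $x
puts [::tcl::unsupported::representation $x]

# Using the same value as a dictionary discards the list representation
# and caches a dict representation in its place.
dict size $x
puts [::tcl::unsupported::representation $x]
======

On a typical build the first report begins with `value is a list` and the
second with `value is a dict`.  The string representation is the same both
times.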

The existence of Rule 5 illustrates the flexibility that the interpreter has in
this regard.  That flexibility could be better articulated by making a small
change to the wording of [dodekalogue%|%Rule 2].  Instead of

        :   ''..., then all of the words of the command are passed to the command procedure.''

Rule 2 could have been modified to say,

        :   ''..., then the resulting values are passed as arguments to the command procedure.''

Eliminating that second use of "word" to describe the result of the
substitutions makes it clear that words are processed into arguments to the
routine, and that once they are processed, they are outside of the scope of the
Tcl rules.

If values rather than words are the arguments to a routine, commands can treat
different types of values differently.  `[puts]` could take as its optional
argument either a channel or a file name.  A new variant of `[append]` could
work for [string%|%strings], [list%|%lists], and [dict%|%dictionaries].  The
first word of a command might actually be a routine rather than the name of a
routine.



** Example:  `[{*}]` **


In [Changes in Tcl/Tk 8.5%|%Tcl 8.5], a new processing directive was
admitted at the script level and it was implemented as a change to the syntax:
A literal `[{*}]` in a script at the beginning of a word tells the interpreter
to unpack that word into multiple words, which has the effect of expanding the
number of arguments handed to a routine when it is called.  This is a welcome
bit of functionality, but did it have to be a syntactic change to Tcl?  One
thing that seems to be conspicuously absent from the discussions that led up to
the implementation of `[{*}]` is any substantive proposal to accomplish the
task by inspecting the internal representation of the value.  Rather than
introducing new syntax, a command which returned an "expand me" value could
have been introduced:

======
set [expand {name Bob}]
======
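
For comparison, the existing syntax achieves the same effect today.  This is a
minimal sketch, and the variable name `pair` is only illustrative:

======
set pair {name Bob}

# {*} splits the list into two words, so this is equivalent to: set name Bob
set {*}$pair
puts $name
======

Here the caller decides that the word is expanded.  With the hypothetical
`expand` command, the value itself would carry that decision.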

[APN] 2018-04-26 Seems to me the easiest way to do that would be to
introduce a new return code TCL_EXPAND (to go with TCL_ERROR, TCL_BREAK etc.)
which will cause the command interpreter to expand the result. Given we already
have {*}, not sure how useful it would be in practice but it does give the
command itself (as opposed to the caller) control over whether its return
value should be expanded or not. Can't think offhand of any use cases 
(other than expand itself) that demand this functionality but that might
just be a lack of imagination on my part.

[dbohdan] 2018-04-28: APN, I like your idea of TCL_EXPAND. As an alternative to
`{*}` it could be useful for 8.4-ish Tcl implementations like [Eagle] and
[JTcl] (easier to implement than `{*}`) as well as new Tcl derivatives
trying to cut down on syntax. (What is the minimal practical
[N-logue]? ''N'' can't be greater than 10.)



** Example:  `is` **

'''`string is`''' is currently available to determine whether a string
representation of a value conforms to a certain format.  Once words and values
are no longer being conflated, a similar command, perhaps named '''`is`''',
could be introduced to provide a system for inspecting the value itself:

   '''`is`''' ''`value1`'' ''`value2`'' :   Returns `true` if ''value1'' is the same type of value as ''value2'', and `false` otherwise.

[extension%|%extensions] that provide their own value types could use some
implementation-level mechanism to register a function for `is` to use when it
encounters values of that type.  A [proc%|%procedure] could use `is` to
condition its operation on these comparisons. For example, an enhanced `[puts]`
procedure might look like this: 

======
proc newputs args {
    set opened 0
    if {[llength $args] == 1} {
        lassign $args string
        set target [chan lookup stdout]
    } elseif {[llength $args] == 2} {
        lassign $args target string
        if {![is $target [chan type]]} {
            # Not a channel value, so treat it as a file name.
            set target [open $target w]
            set opened 1
        }
    } else {
        error {wrong # args: should be "newputs ?target? string"}
    }
    try {
        ::puts $target $string
    } finally {
        if {$opened} {
            close $target
        }
    }
}
======

`chan type` returns a value to be used only by `is`, and not for any real
channel operations.  For other values such as lists, `[list]` could be used:

======
if {[is $somevariable [list]]} {
    puts {found a list}
}
======


Also needed in the example was `[chan lookup stdout]` which returns the
corresponding channel value based on a name.
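
A rough approximation of `is` can be sketched in current Tcl by comparing the
cached internal representations of two values.  The names `typeof` and
`sametype` are hypothetical, and the sketch assumes that the report from
`::tcl::unsupported::representation` begins with `value is a TYPE with a
refcount`, which is not guaranteed:

======
# Extract the type name from the representation report.
proc typeof value {
    set report [::tcl::unsupported::representation $value]
    if {![regexp {^value is a (.+?) with a refcount} $report -> type]} {
        error "unrecognised report: $report"
    }
    return $type
}

# Compare the cached types of two values.
proc sametype {value1 value2} {
    expr {[typeof $value1] eq [typeof $value2]}
}

puts [sametype [list a b] [list c]]
puts [sametype [list a b] [dict create k v]]
======

Because the internal representation is only a cache, the answer depends on how
each value was last used and not on what the programmer intended, which is the
limitation that a real `is` would have to remove.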



** Example: Routine **

Currently, a routine that takes as an argument a command prefix must do
something like this to make sure it can later call the command in the proper
context:

======
set cmd [list ::apply [list {cmd args} {
    ::tailcall {*}$cmd {*}$args
} [uplevel 1 {namespace current}]] $cmd]
======
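
To show the wrapper in use, the following sketch places it in a procedure
that is called from another namespace.  The names `later`, `::demo`, `greet`
and `callback` are hypothetical and exist only for this illustration:

======
# Wrap a command prefix so that it resolves in the caller's namespace.
proc later cmd {
    return [list ::apply [list {cmd args} {
        ::tailcall {*}$cmd {*}$args
    } [uplevel 1 {namespace current}]] $cmd]
}

namespace eval ::demo {
    proc greet name {
        return "hello, $name"
    }
    # "greet" is given as a relative name. The wrapper remembers ::demo.
    variable callback [later greet]
}

# Invoked from the global namespace, the prefix still finds ::demo::greet.
puts [{*}$::demo::callback World]
======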

Under EWIAC, a command could capture the current namespace into an internal
type:

======
set cmd [list [uplevel 1 ::command [lindex $cmd 0]] {*}[lrange $cmd 1 end]]
======


** Example: Recursive Data Structures **

Currently, it's problematic to use a [dict%|%dictionary] as a general
recursive data structure because there is nothing in a value to indicate
whether it should be interpreted as a nested dictionary.  An operation on a
dictionary could use `[::tcl::unsupported::representation]` to determine
whether the value is a nested dictionary, but because Tcl can gratuitously swap
out the internal representation, the approach is problematic.
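
The ambiguity can be seen in a short sketch.  The variable names and data are
only illustrative:

======
# The inner value is intended as a nested dictionary.
set person [dict create name {first Bob}]
puts [dict get $person name first]

# The inner value is intended as a plain string, but it also happens to
# parse as a dictionary, so the same kind of lookup succeeds.
set memo [dict create note {hello world}]
puts [dict get $memo note hello]
======

Both lookups succeed, so code that walks the structure cannot tell from the
value alone where the nesting was meant to stop.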



** Caching Considered Troublesome **

The type information cached in a `Tcl_Obj` is quite useful, and even necessary
for the operation of modern Tcl.  What isn't necessary is the caching aspect.
It can complicate already-complicated routines.  On the `[Tcl_Obj]` page there
are descriptions of circular references among `Tcl_Obj`.  The solutions to
these problems would be more straightforward without the caching behaviour.










** Page Authors **

   [PYK]:   




<<categories>>  concept