This page describes several implementations for in-memory Tcl [array%|%arrays] that are synchronized with persistent storage. They should not be confused with [Persistent data structure%|%persistent data structures].
** See also **
* [tie] - a framework for the creation of persistent Tcl array variables
* [Tequila] - a Tcl server which implements persistent shared arrays
** Plain text files **
[Richard Suchenwirth] 2002-05-30 - Large data (e.g. [a simple database])
are in Tcl often stored in [array]s. You can load array contents from
files, and you can store them to files, so the data persists between
sessions. Here is some code that "ties" an array to a file, so initially
the array is "loaded" from the file, and changes to the array are
reflected in the file too.
This is done by opening it for appending, writing the changed data,
and closing it - possibly slow, but very robust.
If you don't give a filename, it is made up of the array name
and the extension ".txt".
A special application for this is event logging: register a global
persistent array once and assign to it, for instance with a telling key
like ''[clock] seconds'' for each event:
======
persistentArray log mylogfile
...
set log([clock seconds]) "$this happened."
...
set log([clock seconds]) "$that happened."
======
When later examining the logfile, you can reconstruct the date and time
of the events with ''[clock] format''. The disadvantage against pure
file logging is that all event messages remain in the array.
======
proc persistentArray {arrName {filename {}}} {
upvar 1 $arrName arr
array set arr {} ;# to make sure it exists, and is an array
if {$filename==""} {set filename $arrName.txt}
set filename [file join [pwd] $filename]
if [file exists $filename] {
set fp [open $filename]
array set arr [read $fp]
close $fp
}
uplevel 1 [list trace add variable $arrName {write unset} [list persist'save $filename]]
}
proc persist'save {filename arrName el op} {
upvar 1 $arrName arr
switch -- $op { write {set value $arr($el)}
unset {set value {}}
}
set fp [open $filename a]
puts $fp [list $el $value]
close $fp
}
======
** BerkeleyDB **
Here's another approach to persistent/tied arrays that is similar to what perl does with dbmopen and whatnot. It uses [Berkeley DB%|%BerkeleyDB].
======
# package require Db_tcl
load /usr/lib/libdb_tcl-3.2.so ;# already installed on some linux systems
proc tieArray {aname file db} {
upvar $aname a
array set a {}
set fname [file nativename $file]
set dbh [berkdb open -hash -create $fname $db]
set tie_db::afiles($aname) $dbh
trace add variable a read [list tie_db::read $aname]
trace add variable a write [list tie_db::write $aname]
trace add variable a unset [list tie_db::del $aname]
}
proc syncArray {aname} {
$tie_db::afiles($aname) sync
}
proc syncAll {} {
foreach a [array names tie_db::afiles] {
syncArray $a
}
}
proc untieArray {aname} {
tie_db::afiles($aname) close
trace remove variable read [list tie_db::read $aname]
trace remove variable a write [list tie_db::write $aname]
trace remove variable a unset [list tie_db::del $aname]
}
namespace eval ::tie_db {
variable afiles
proc read {rname aname ename op} {
variable afiles
upvar $aname a
foreach {k v} [lindex [$afiles($rname) get $ename] 0] {null} {null} {}
if {$k == ""} {
error "no such element in db"
}
set a($ename) $v
}
proc write {rname aname ename op} {
variable afiles
upvar $aname a
$afiles($rname) put $ename $a($ename)
# $afiles($rname) sync
}
proc del {rname aname ename op} {
variable afiles
upvar $aname a
$afiles($rname) del $ename
# $afiles($rname) sync
}
proc array {cmd aname args} {
variable afiles
if {[catch {set afiles($aname)}]} {
uplevel 2 _array $cmd $aname $args
} else {
switch $cmd {
names -
get -
names/get {
set cur [$afiles($aname) cursor]
_array set tmp_array {}
while 1 {
set kvp [lindex [$cur get -next] 0]
if {$kvp == {}} {break}
puts -nonewline [lindex $kvp 0]..
set tmp_array([lindex $kvp 0]) [lindex $kvp 1]
}
$cur close
if {$cmd == "names"} {
_array names tmp_array
} elseif {$cmd == "get"} {
_array get tmp_array
}
}
default {error "$cmd not implemented for tied arrays"}
}
}
}
}
rename array _array
proc array {args} {uplevel ::tie_db::array $args}
======
The write and del procedures can be changed to sync after each operation at a cost in performance.
Some datasets seem to make this upset (I wrote it trying to make the Bayesan spam filtering code
more elegant) but I suspect its some bugs in the version of db that I have.
Along with this, I noticed an anomaly with trace and upvar. You can use upvar to create an alias
for a variable, even is the real variable is an element of an array.
======none
% set c(x) 0
% upvar #0 c(x) x
% set x 1
% set c(x)
=> 1
======
However, that kind of aliased variable will '''not''' trigger traces set on the entire array,
although it will trigger a trace on that element of the array.
This seems like a bug; if a variable is traced then any way of accessing it
should fire the same traces, right?
** Tgdbm **
[Stefan Vogel] 8-Mar-2004: You can also use the new version of [Tgdbm]
(A Tcl-Wrapper for gdbm-(GNU-dbm)-API). Version 0.5 allows to "attach" an array
to a gdbm-file (which stores the equivalent to the tcl-array, namely hash-key/value-pairs).
It is possible to do this:
======
gdbm_open -writer -sync -array airports test.gdbm
set airports(PAR) Paris ;# will store or update the key/value to test.gdbm
# add/update some more data
array set airports {
ADD "Addis Abeba"
FFM "Frankfurt"
}
# print value (gdbm-file and array is synchronized)
puts "FFM: [airports fetch FFM] / $airports(FFM)"
unset airports ;# this will close test.gdbm
;# this could have been done with unset airportArray
======
See http://www.vogel-nest.de (go to Tcl/Tgdbm) for more details on Tgdbm.
** LMDB **
[dbohdan] 2017-11-23: The following module tracks changes to an array using [trace%|%traces] and persists them to an [LMDB] database.
*** Code ***
======
# Array persistence through LMDB.
# Copyright (c) 2017-2018 D. Bohdan
# License: MIT
package require Tcl 8.6
package require lmdb 0.3.5
namespace eval ::lmdbarr {
variable version 0.2.1
}
::oo::class create ::lmdbarr::Persistent {
variable _array
variable _db
variable _debug
variable _env
variable _path
# If $_temporary is true, Persistent will delete the database under $_path
# when destroyed.
variable _temporary
constructor {array {path {}} {temporary {}} {debug 0}} {
if {[array size $array] > 0} {
error "array must be empty when Persistent starts;\
$array has [array size $array] elements"
}
set _array $array
set _path $path
set _temporary $temporary
set _debug $debug
if {$_path eq {}} {
close [file tempfile _path]
file delete $_path
file mkdir $_path
if {$_temporary eq {}} {
set _temporary 1
}
}
set _env [lmdb env]
$_env open -path $_path
# Choose the map size: 1 TiB on 64-bit platforms and 2 GiB - 1 byte
# otherwise.
$_env set_mapsize [expr {
$::tcl_platform(pointerSize) == 8 ? 1 << 40 : (1 << 31) - 1
}]
set _db [lmdb open -env $_env]
my load
my traces add
}
destructor {
# Do nothing if the object wasn't initialized properly.
if {![info exists _array]} {
return
}
$_db close -env $_env
$_env close
if {$_temporary} {
foreach file [glob -dir $_path *.mdb] {
file delete $file
}
file delete $_path
}
my traces remove
}
method get-path {} {
return $_path
}
method handle-trace {name1 name2 op} {
if {$name1 ne $_array} return
set txn [$_env txn]
if {$_debug} {
puts stderr [list handle-trace $name1 $name2 $op]
}
switch -exact -- $op {
write {
$_db put $name2 [set $_array\($name2\)] -txn $txn
}
unset {
if {$name2 eq {}} {
# If the array has been unset, recreate the traces and empty
# the database.
my traces add
$_db drop 0 -txn $txn
} else {
$_db del $name2 {} -txn $txn
}
}
}
$txn commit
$txn close
}
method load {} {
set txn [$_env txn]
set cursor [$_db cursor -txn $txn]
set initialValues {}
try {
while 1 {
lappend initialValues {*}[$cursor get -next]
}
} on error {_ x} {
# Ignore MDB_NOTFOUND errors.
if {![string match {*ERROR: MDB_NOTFOUND*} \
[dict get $x -errorinfo]]} {
return {*}$x
}
}
array set $_array $initialValues
$cursor close
$txn commit
$txn close
}
method set dict {
my traces remove
set txn [$_env txn]
array set $_array $dict
foreach {key value} $dict {
$_db put $key $value -txn $txn
}
$txn commit
$txn close
my traces add
}
method traces op {
uplevel [list trace $op variable $_array {write unset} \
[list [self namespace]::my handle-trace]]
}
method unset list {
my traces remove
set txn [$_env txn]
foreach key $list {
unset $_array\($key\)
$_db del $key {} -txn $txn
}
$txn commit
$txn close
my traces add
}
unexport handle-trace load traces
}
proc ::lmdbarr::assert-equal {actual expected} {
if {$actual ne $expected} {
error "expected \"$expected\", but got \"$actual\""
}
}
proc ::lmdbarr::test {} {
# Tests.
set sorted {varName {
lsort -stride 2 -index 1 -integer [array get ::lmdbarr::$varName]
}}
set psr1 [Persistent new ::lmdbarr::arr {} 0]
set path [$psr1 get-path]
set ::lmdbarr::arr(false-start) -1
unset ::lmdbarr::arr
set ::lmdbarr::arr(foo) 2
set ::lmdbarr::arr(bar) 3
set ::lmdbarr::arr(baz) 5
set ::lmdbarr::arr(qux) 7
set ::lmdbarr::arr(quux) 11
unset ::lmdbarr::arr(qux)
array set ::lmdbarr::arr {baz 199}
assert-equal [apply $sorted arr] {foo 2 bar 3 quux 11 baz 199}
$psr1 destroy
unset ::lmdbarr::arr
set psr2 [Persistent new ::lmdbarr::arr $path 0]
assert-equal [apply $sorted arr] {foo 2 bar 3 quux 11 baz 199}
array unset ::lmdbarr::arr *
array set ::lmdbarr::arr {x 0 y 1}
assert-equal [apply $sorted arr] {x 0 y 1}
$psr2 destroy
unset ::lmdbarr::arr
set psr3 [Persistent new ::lmdbarr::arr2 $path 1]
assert-equal [apply $sorted arr2] {x 0 y 1}
# Benchmark.
set timex {{key script} {
upvar time time
dict lappend time $key [lindex [uplevel [list time $script]] 0]
}}
for {set run 0} {$run < 3} {incr run} {
apply $timex set {
for {set i 0} {$i < 1000} {incr i} {
set ::lmdbarr::arr2(k$i) $i
}
}
apply $timex get {
for {set i 0} {$i < 1000} {incr i} {
set ::lmdbarr::arr2(k$i)
}
}
apply $timex unset {
for {set i 0} {$i < 1000} {incr i} {
unset ::lmdbarr::arr2(k$i)
}
}
apply $timex mass-set {
set d {}
for {set i 0} {$i < 1000} {incr i} {
dict set d k$i $i
}
$psr3 set $d
}
apply $timex mass-unset {
set keys {}
for {set i 0} {$i < 1000} {incr i} {
lappend keys k$i
}
$psr3 unset $keys
}
}
foreach {key value} $time {
set sum 0
foreach el $value {
incr sum $el
}
puts [format "%-10s %6.1f ms" \
$key \
[expr {(1.0 * $sum) / [llength $value] / 1000}]]
}
$psr3 destroy
}
if {[info exists argv0] && ([file tail [info script]] eq [file tail $argv0])} {
::lmdbarr::test
}
======
*** Benchmark results ***
======
set 5035.5 ms
get 1.3 ms
unset 5078.7 ms
mass-set 11.6 ms
mass-unset 11.1 ms
======
*** Use example ***
======
# ex.tcl
source lmdbarr.tcl
set path /tmp/lmdbarr-example
file mkdir $path
set pst [::lmdbarr::Persistent new ::persist $path]
foreach {key value} $argv {
set ::persist($key) $value
}
parray ::persist
======
======
$ tclsh ex.tcl
$ tclsh ex.tcl k v
::persist(k) = v
$ tclsh ex.tcl
::persist(k) = v
$ tclsh ex.tcl k2 v2
::persist(k) = v
::persist(k2) = v2
======
<<categories>> Embedded Database | Example | File