Random Integers

Difference between version 1 and 2 - Previous - Next
Created by [CecilWesterhof].

'''Functions to Get Random Integers on a *NIX System'''

To get better random integers you can use /dev/urandom. I wrote a function getRandomBytesNix to get a series of random bytes see [Get Random Bytes on *NIX]. This is used for generating random numbers.

The first proc is a proc to get a a random integer from type c, s, i, or w.

======
proc getRandomIntNix {{type i}} {    switch ${type} {
        c {
            set bytes  1
            set format cu
        }
        s {
            set bytes  2
            set format su
        }
        i {
            set bytes  4
            set format iu
        }
        w {
            set bytes  8
            set format wu
        }
        default {            error "type can only be c, s, i, or w (${type})"
        }
    }    binary scan [getRandomBytesNix ${bytes}] ${format} random
    return ${random}
}
======

Often you need another range (for example [-5, 12]), for this I created getRandomIntInRangeNix.

A naive version is something like the following:
======
proc getRandomIntInRangeNixBiased {min max {type i}} {    if {${max} <= ${min}} {
        error "Error: [info level 0] min should be lower as max"
    }    switch ${type} {
        c {
            set maxMod [expr 2 **  8]
        }
        s {
            set maxMod [expr 2 ** 16]
        }
        i {
            set maxMod [expr 2 ** 32]
        }
        w {
            set maxMod [expr 2 ** 64]
        }
        default {            error "type can only be c, s, i, or w (${type})"
        }
    }    set  modulo [expr {${max} - ${min} + 1}]
    if {${modulo} > ${maxMod}} {
        error "Modulo should be <= ${maxMod} (${modulo})"
    }    set  random [getRandomIntNix ${type}]
    expr {${random} % ${modulo} + ${min}}
}
======

The problem with this is that this can result in biased results.

Say we have a generator that generates the values between 0 and 15. If we ask for a range between 0 and 10, then the values 0 to 4 are two times more likely to be generated as the values 5 to 10, because the values 11 to 15 are mapped to 0 to 4.

To show this I created the following script:

======
set range          99
set repeatCalc 250000
set repeatGen       5
for {set i 0} {${0} <= ${repeatGen}} {incr i} {
    array set count {}    for {set j 0} {${j} < ${repeatCalc}} {incr j} {
        set index [expr {[getRandomIntInRangeNix 0 ${range} c] / 20}]
        incr count(${index})
    }
    parray count
    unset  count
    puts   ""
}
======

An example output when this is run:

======
count(0) = 58899
count(1) = 58324
count(2) = 54793
count(3) = 39192
count(4) = 38792

count(0) = 58808
count(1) = 58317
count(2) = 54746
count(3) = 39129
count(4) = 39000

count(0) = 58295
count(1) = 58439
count(2) = 55095
count(3) = 39156
count(4) = 39015

count(0) = 58454
count(1) = 58674
count(2) = 54598
count(3) = 39202
count(4) = 39072

count(0) = 58825
count(1) = 58447
count(2) = 54700
count(3) = 39170
count(4) = 38858
======

This can be solved with the following code:

======
proc getRandomIntInRangeNix {min max {type i}} {    if {${max} <= ${min}} {
        error "Error: [info level 0] min should be lower as max"
    }
    # maxMod should not be bigger as half of the range
    # otherwise calculating can take to long
    # in this way the chance that you need 10 random values before returning
    # is less as one promille    switch ${type} {
        c {
            set range [expr 2 **  8]
        }
        s {
            set range [expr 2 ** 16]
        }
        i {
            set range [expr 2 ** 32]
        }
        w {
            set range [expr 2 ** 64]
        }
        default {            error "type can only be c, s, i, or w (${type})"
        }
    }    set maxMod [expr {${range} / 2}]
    set modulo [expr {${max} - ${min} + 1}]
    if {${modulo} > ${maxMod}} {
        error "Modulo should be <= ${maxMod} (${modulo})"
    }
    # Values above maxVal would result in bias and should be rejected    set maxVal [expr {${modulo} * (${range}  / ${modulo}) - 1}]
    while True {        set  random [getRandomIntNix ${type}]
        if {${random} <= ${maxVal}} {
            break
        }
    }    expr {${random} % ${modulo} + ${min}}
}
======

Now a run could give something like:
======
count(0) = 50347
count(1) = 49748
count(2) = 49856
count(3) = 50155
count(4) = 49894

count(0) = 50381
count(1) = 49524
count(2) = 50101
count(3) = 49979
count(4) = 50015

count(0) = 50005
count(1) = 50035
count(2) = 49845
count(3) = 49950
count(4) = 50165

count(0) = 49911
count(1) = 50110
count(2) = 50171
count(3) = 50349
count(4) = 49459

count(0) = 49681
count(1) = 50095
count(2) = 49802
count(3) = 49995
count(4) = 50427
======

----

As always: comments, tips and questions are appreciated.

<<categories>>Linux | Numerics | Utilities