Differences

This shows you the differences between two versions of the page.

notes_20on_20the_20use_20of_20rnd [2018/04/17 17:50]
tbest3112 Added syntax highlighting
notes_20on_20the_20use_20of_20rnd [2020/07/27 13:49] (current)
richardrussell Avoid seed being zero
Line 1: Line 1:
 =====Notes on the use of RND===== =====Notes on the use of RND=====
  
-//by Richard Russell, February 2010, revised May 2013//\\ \\  ​This article expands on the description of [[http://​www.bbcbasic.co.uk/​bbcwin/​manual/​bbcwin7.html#​rnd|RND]] in the main documentation,​ discusses some of the limitations and pitfalls in its use, and provides some code examples.\\ \\ +//by Richard Russell, February 2010, revised May 2013// 
 + 
 +This article expands on the description of [[http://www.bbcbasic.co.uk/bbcwin/manual/bbcwin7.html#rnd|RND]] in the main documentation, discusses some of the limitations and pitfalls in its use, and provides some code examples. 
 ===== Variants of RND ===== ===== Variants of RND =====
-\\  ​The **RND** function comes in five variants, as follows:\\ \\ + 
 +The **RND** function comes in five variants, as follows: 
 ==== RND ==== ==== RND ====
-\\ **RND** on its own returns a signed 32-bit pseudo-random integer value, i.e. a value in the range -2147483648 to +2147483647. The sequence length is 2^33-1 (8589934591),​ which means that after this number of values have been returned the sequence repeats.\\ \\  ​If you call RND precisely 8589934591 times, each of the non-zero integer values (&​00000001 to &​FFFFFFFF) will be returned **exactly twice** and the value zero (&​00000000) will be returned **exactly once**. Therefore, the statistical likelihood of RND returning zero is half that of it returning any specified non-zero value.\\ \\ + 
 +**RND** on its own returns a signed 32-bit pseudo-random integer value, i.e. a value in the range -2147483648 to +2147483647. The sequence length is 2^33-1 (8589934591), which means that after this number of values have been returned the sequence repeats. 
 + 
 +If you call RND precisely 8589934591 times, each of the non-zero integer values (&00000001 to &FFFFFFFF) will be returned **exactly twice** and the value zero (&00000000) will be returned **exactly once**. Therefore, the statistical likelihood of RND returning zero is half that of it returning any specified non-zero value. 
 ==== RND(n) ==== ==== RND(n) ====
-\\ **RND(n)**, where n is a positive integer greater than 1, returns a pseudo-random integer value in the range 1 to n. RND(n) is equivalent to the following function:\\ + 
 +**RND(n)**, where n is a positive integer greater than 1, returns a pseudo-random integer value in the range 1 to n. RND(n) is equivalent to the following function: 
 <code bb4w> <code bb4w>
         DEF FNrnd(n%)         DEF FNrnd(n%)
Line 15: Line 26:
         = R - n% * INT(R / n%) + 1         = R - n% * INT(R / n%) + 1
 </code> </code>
-Note that the statistical distribution of the returned values depends on the value of **n**. For small values of **n** the probabilities of each of the possible returned values (1 to n) are very nearly equal. For example if n=7 the likelihoods of the different values are as follows:\\ \\  1: 1227133513 / 8589934591 = 0.142857142857\\  2: 1227133514 / 8589934591 = 0.142857142974\\  3: 1227133514 / 8589934591 = 0.142857142974\\  4: 1227133514 / 8589934591 = 0.142857142974\\  5: 1227133512 / 8589934591 = 0.142857142741\\  6: 1227133512 / 8589934591 = 0.142857142741\\  7: 1227133512 / 8589934591 = 0.142857142741\\ \\  ​So returned values **5**, **6** and **7** are very slightly less probable than values **2**, **3** and **4**, with returned value **1** between these two probabilities. Although this variation is unlikely to be significant,​ it should be borne in mind for critical applications.\\ \\ + 
 +Note that the statistical distribution of the returned values depends on the value of **n**. For small values of **n** the probabilities of each of the possible returned values (1 to n) are very nearly equal. For example if n=7 the likelihoods of the different values are as follows: 
 + 
 +  1: 1227133513 / 8589934591 = 0.142857142857 
 +  2: 1227133514 / 8589934591 = 0.142857142974 
 +  3: 1227133514 / 8589934591 = 0.142857142974 
 +  4: 1227133514 / 8589934591 = 0.142857142974 
 +  5: 1227133512 / 8589934591 = 0.142857142741 
 +  6: 1227133512 / 8589934591 = 0.142857142741 
 +  7: 1227133512 / 8589934591 = 0.142857142741 
 + 
 +So returned values **5**, **6** and **7** are very slightly less probable than values **2**, **3** and **4**, with returned value **1** between these two probabilities. Although this variation is unlikely to be significant, it should be borne in mind for critical applications. 
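
The pattern in these figures can be reproduced with a short calculation. The following is a minimal sketch, assuming (as the figures above imply) that **R** in the function is the 32-bit value from RND treated as unsigned, so that the returned value is (R MOD n) + 1. Over one full sequence each returned value then occurs 2 x INT(2^32/n) times plus a small excess, and it is this excess that the code prints. The variable names are illustrative only:
<code bb4w>
        n% = 7
        REM Remainder of 2^32 divided by n%, found without exceeding 32-bit integers
        r% = (((&7FFFFFFF MOD n%) + 1) * 2) MOD n%
        FOR k% = 0 TO n% - 1
          REM Comparisons give -1 (TRUE) or 0 (FALSE): two extra occurrences when
          REM k% < r%, less one for k% = 0 because RND returns zero only once
          excess% = -2 * (k% < r%) + (k% = 0)
          PRINT k% + 1, excess%
        NEXT k%
</code>
For n% = 7 the excess is 1 for the returned value 1, 2 for the values 2 to 4 and 0 for the values 5 to 7, which agrees with the counts listed above (the base count being 1227133512).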
 ==== RND(1) ==== ==== RND(1) ====
-\\ **RND(1)** returns a floating-point value in the range 0 <= R < 1, that is the returned value is **greater than or equal to** zero but **less than** one. RND(1) is equivalent to the following function:\\ \\ + 
 +**RND(1)** returns a floating-point value in the range 0 <= R < 1, that is the returned value is **greater than or equal to** zero but **less than** one. RND(1) is equivalent to the following function: 
 <code bb4w> <code bb4w>
         DEF FNrnd1         DEF FNrnd1
Line 26: Line 51:
         = R/2^32         = R/2^32
 </code> </code>
-Note that the distribution of returned values reflects the distribution of the values returned by **RND**, so the likelihood of the value zero being returned is less than other values. Irrespective of the ***FLOAT** mode in use, RND(1) returns a 40-bit floating point number (32-bit mantissa).\\ \\ + 
 +Note that the distribution of returned values reflects the distribution of the values returned by **RND**, so the value zero is only half as likely to be returned as any other possible value. Irrespective of the ***FLOAT** mode in use, RND(1) returns a 40-bit floating point number (32-bit mantissa). 
 ==== RND(0) ==== ==== RND(0) ====
-\\ **RND(0)** returns the previous random number in RND(1) format. In other words, the pseudo-random-number generator does not advance to the next number in the sequence, but repeats the previous result.\\ \\ + 
 +**RND(0)** returns the previous random number in RND(1) format. In other words, the pseudo-random-number generator does not advance to the next number in the sequence, but repeats the previous result. 
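
As an illustration, the following short sketch (the variable names are arbitrary) compares a value obtained from RND(1) with the value then returned by RND(0). Going by the description above, the two should be identical:
<code bb4w>
        first = RND(1)
        REM RND(0) should not advance the generator
        again = RND(0)
        IF first = again THEN PRINT "RND(0) repeated the previous value"
</code>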
 ==== RND(-n) ==== ==== RND(-n) ====
-\\  ​Setting the parameter of **RND** to a negative integer (i.e. a value in the range -2147483648 to -1) **seeds** the generator (see below) and returns that same value.\\ \\ + 
 +Setting the parameter of **RND** to a negative integer (i.e. a value in the range -2147483648 to -1) **seeds** the generator (see below) and returns that same value. 
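
Because a given seed selects a given starting point, seeding with a fixed value should give a repeatable sequence, which can be useful when testing. A minimal sketch, in which the seed value -12345 and the variable names are arbitrary choices:
<code bb4w>
        REM Seed, draw three values, then re-seed with the same value
        dummy% = RND(-12345)
        a% = RND(100) : b% = RND(100) : c% = RND(100)
        dummy% = RND(-12345)
        REM AND does not short-circuit, so all three calls are made in order
        IF a% = RND(100) AND b% = RND(100) AND c% = RND(100) THEN
          PRINT "Same sequence"
        ELSE
          PRINT "Different sequence"
        ENDIF
</code>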
 ===== Seeding the random-number generator ===== ===== Seeding the random-number generator =====
-\\  ​As mentioned above, the sequence-length of BBC BASIC'​s pseudo-random number generator is 2^33-1 (8589934591),​ so unless your program calls RND at least that number of times (unlikely!) only part of the sequence will be utilised. Determining the starting point in the sequence is called **seeding** the generator.\\ \\  ​When //BBC BASIC for Windows// is executed the random number generator is seeded from the value of [[http://​www.bbcbasic.co.uk/​bbcwin/​manual/​bbcwin7.html#​time|TIME]],​ but since this counts the number of centiseconds since the PC was last restarted it is likely to be quite small compared with the sequence length. Suppose for example the PC has been running for 24 hours, so TIME is approximately 8640000, this corresponds to only about one-thousandth of the overall sequence length! So relying on the automatic seeding will not make good use of the potential performance of RND.\\ \\  ​To improve the performance you should seed the generator yourself, using the **RND(-n)** option. To do better than the automatic seeding you need to choose a value which is highly variable, not predictable and can range over most or all of the available range (-2147483648 to -1). On a Windows PC a suitable source of such a seed is the Performance Counter: a 64-bit integer value which counts at a rate up to the clock speed of the CPU. The following code may be used:\\ + 
 +As mentioned above, the sequence-length of BBC BASIC's pseudo-random number generator is 2^33-1 (8589934591), so unless your program calls RND at least that number of times (unlikely!) only part of the sequence will be utilised. Determining the starting point in the sequence is called **seeding** the generator. 
 + 
 +When //BBC BASIC for Windows// is executed the random number generator is seeded from the value of [[http://www.bbcbasic.co.uk/bbcwin/manual/bbcwin7.html#time|TIME]], but since this counts the number of centiseconds since the PC was last restarted it is likely to be quite small compared with the sequence length. Suppose, for example, that the PC has been running for 24 hours, so that TIME is approximately 8640000 (24 x 60 x 60 x 100); this corresponds to only about one-thousandth of the overall sequence length! So relying on the automatic seeding will not make good use of the potential performance of RND. 
 + 
 +To improve the performance you should seed the generator yourself, using the **RND(-n)** option. To do better than the automatic seeding you need to choose a value which is highly variable, not predictable and can range over most or all of the available range (-2147483648 to -1). On a Windows PC a suitable source of such a seed is the Performance Counter: a 64-bit integer value which counts at a rate up to the clock speed of the CPU. The following code may be used: 
 <code bb4w> <code bb4w>
         DIM pc{l%,h%}         DIM pc{l%,h%}
         SYS "QueryPerformanceCounter", pc{}         SYS "QueryPerformanceCounter", pc{}
-        seed% = RND(-ABS(pc.l%))+        seed% = RND(-ABS(pc.l%)-1)
 </code> </code>
-It should be noted, however, that the rate at which the Performance Counter increments is extremely variable between systems, and indeed one isn't guaranteed to exist at all. Therefore if you are lucky enough to have a better source of seed available you should use that instead.\\ \\  ​However good your source of seed, since RND(-n) can only take a maximum of 2^31 different values (compared with the total sequence length of 2^33-1) only about **one quarter** of the overall sequence is likely to be exploited. If this is important to you, the pseudo-random number generator can alternatively be seeded by writing values directly to its memory locations, see [[/​Interpreter%20internal%20variables|Interpreter internal variables]]:​\\ + 
 +It should be noted, however, that the rate at which the Performance Counter increments is extremely variable between systems, and indeed one isn't guaranteed to exist at all. Therefore if you are lucky enough to have a better source of seed available you should use that instead. 
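
One way to allow for a missing Performance Counter is to test the value returned by the API call, which is zero on failure. The following is only a sketch: falling back to TIME is an assumption made for the sake of the example (with the weakness already described), and **ok%** is an illustrative name:
<code bb4w>
        DIM pc{l%,h%}
        SYS "QueryPerformanceCounter", pc{} TO ok%
        IF ok% THEN
          REM Counter available: seed from its low 32 bits
          seed% = RND(-ABS(pc.l%)-1)
        ELSE
          REM No counter: fall back to the centisecond timer (assumption)
          seed% = RND(-ABS(TIME)-1)
        ENDIF
</code>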
 + 
 +However good your source of seed, since RND(-n) can only take a maximum of 2^31 different values (compared with the total sequence length of 2^33-1) only about **one quarter** of the overall sequence is likely to be exploited. If this is important to you, the pseudo-random number generator can alternatively be seeded by writing values directly to its memory locations, see [[/Interpreter%20internal%20variables|Interpreter internal variables]]: 
 <code bb4w> <code bb4w>
         !412 = seedl%         !412 = seedl%
         ?416 = seedh%         ?416 = seedh%
 </code> </code>
-Here **seedl%** is a 32-bit value (&​00000000 to &​FFFFFFFF) and **seedh%** is a 1-bit value (0 or 1). Note that they **must not both be zero**. Using this technique you can initialise RND to any point in its sequence.\\ \\ + 
 +Here **seedl%** is a 32-bit value (&00000000 to &FFFFFFFF) and **seedh%** is a 1-bit value (0 or 1). Note that they **must not both be zero**. Using this technique you can initialise RND to any point in its sequence. 
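
Since both values being zero must be avoided, it may be convenient to wrap the two assignments in a procedure that checks for this. A minimal sketch: the procedure name is hypothetical, the addresses are those given above, and substituting 1 for an all-zero seed is an arbitrary choice:
<code bb4w>
        DEF PROCseedrnd(seedl%, seedh%)
        REM Keep only the single high-order bit
        seedh% = seedh% AND 1
        REM The all-zero state is not allowed, so substitute a non-zero value
        IF seedl% = 0 AND seedh% = 0 THEN seedl% = 1
        !412 = seedl%
        ?416 = seedh%
        ENDPROC
</code>
It would be called as, for example, PROCseedrnd(seedl%, seedh%), with the definition placed after the end of the main program.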
 ===== Is RND good enough? ===== ===== Is RND good enough? =====
-\\  ​The built-in RND function is likely to be good enough for most requirements,​ but it is very important to appreciate that it is **not** good enough for some applications. For example, suppose one is using RND to select six lottery numbers, each in the range 1 to 49. The maximum number of possible combinations is (49*48*47*46*45*44)/​(6*5*4*3*2*1) which is approximately 14 million and therefore significantly fewer than the sequence length of RND (n.b. the order of the numbers is not important). Therefore RND should be suitable for this kind of task.\\ \\  ​However suppose that instead we want to shuffle a pack of 52 playing cards. Now the number of possible permutations is 52x51x50x....x2 or about 8*10^67! This is a massively larger number than the length of the RND sequence, so only a minuscule fraction of all the possible shuffles can be achieved using RND. There are no '​workarounds'​ to this; RND just isn't suitable for shuffling a pack of cards in critical applications (such as an electronic gaming machine or online casino).\\ \\ + 
 +The built-in RND function is likely to be good enough for most requirements, but it is very important to appreciate that it is **not** good enough for some applications. For example, suppose one is using RND to select six lottery numbers, each in the range 1 to 49. The maximum number of possible combinations is (49*48*47*46*45*44)/(6*5*4*3*2*1), which is 13983816 (approximately 14 million) and therefore significantly fewer than the sequence length of RND (n.b. the order of the numbers is not important). Therefore RND should be suitable for this kind of task. 
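
The number of combinations can be checked with a few lines of code. This sketch (the variable names are illustrative) multiplies and divides alternately so that every intermediate result is a whole number, then compares the total with the sequence length:
<code bb4w>
        combos = 1
        FOR i% = 1 TO 6
          REM After each pass combos holds the number of ways of choosing i% from 43+i%
          combos = combos * (49 - 6 + i%) / i%
        NEXT i%
        PRINT "Combinations: "; combos
        PRINT "Fraction of the RND sequence length: "; combos / (2^33 - 1)
</code>
The total comes to 13983816, which is roughly 0.0016 of the sequence length.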
 + 
 +However suppose that instead we want to shuffle a pack of 52 playing cards. Now the number of possible permutations is 52 factorial (52x51x50x...x2), or about 8*10^67. This is a massively larger number than the length of the RND sequence, so only a minuscule fraction of all the possible shuffles can be achieved using RND. There are no 'workarounds' to this; RND just isn't suitable for shuffling a pack of cards in critical applications (such as an electronic gaming machine or online casino). 
 ===== How to do better than RND ===== ===== How to do better than RND =====
-\\  ​There is no shortage of pseudo-random number generator algorithms with a longer sequence length than RND. For example the assembler-code algorithm [[/​Alternative%20pseudo-random%20numbers|here]] has a sequence length of 2^64-1 (1.8E19), which although far longer than RND is still hugely less than the number of permutations of a pack of 52 cards. The [[http://​groups.yahoo.com/​group/​bb4w/​files/​Mathematics/​mersenne.bbc|Mersenne Twister]] has a mind-boggling sequence length of 2^19937−1 (about 10^6000), although the BBC BASIC code at that link can only take a 32-bit seed which somewhat defeats the object. Two modern, high-performance algorithms are listed at [[/​High%20Quality%20Random%20Number%20Generation|High Quality Random Number Generation]].\\ \\  ​One thing not to attempt is to write your own algorithm, unless you're a top-notch mathematician! An algorithm with an unintentional bias is only too easy to create and the consequences could be serious.\\ \\  ​Using an improved algorithm is only part of the solution. Even if the sequence length is adequate for the task in hand, you must be able to seed it so that a big enough portion of this length is exploited. For example, in the card shuffling case you need to be able to generate a seed with around 230 bits to ensure all possible permutations of cards could, in principle, be generated. Finding an external source of '​randomness'​ with that number of bits is by no means easy. For example the Performance Counter, as suggested above, is woefully inadequate with only 64 bits of precision.\\ \\  ​The Windows API function [[http://​en.wikipedia.org/​wiki/​CryptGenRandom|CryptGenRandom]] (available on Windows 2000 and later) provides a //​cryptographically secure// random number of a specified length, in bytes. The function below returns a random 32-bit integer using this method:\\ + 
 +There is no shortage of pseudo-random number generator algorithms with a longer sequence length than RND. For example the assembler-code algorithm [[/Alternative%20pseudo-random%20numbers|here]] has a sequence length of 2^64-1 (1.8E19), which although far longer than RND is still hugely less than the number of permutations of a pack of 52 cards. The [[http://groups.yahoo.com/group/bb4w/files/Mathematics/mersenne.bbc|Mersenne Twister]] has a mind-boggling sequence length of 2^19937−1 (about 10^6000), although the BBC BASIC code at that link can only take a 32-bit seed which somewhat defeats the object. Two modern, high-performance algorithms are listed at [[/High%20Quality%20Random%20Number%20Generation|High Quality Random Number Generation]]. 
 + 
 +One thing not to attempt is to write your own algorithm, unless you're a top-notch mathematician! An algorithm with an unintentional bias is only too easy to create and the consequences could be serious. 
 + 
 +Using an improved algorithm is only part of the solution. Even if the sequence length is adequate for the task in hand, you must be able to seed it so that a big enough portion of this length is exploited. For example, in the card shuffling case you need to be able to generate a seed of at least 226 bits (52 factorial is roughly 2^225.6) to ensure all possible permutations of cards could, in principle, be generated. Finding an external source of 'randomness' with that number of bits is by no means easy. For example the Performance Counter, as suggested above, is woefully inadequate with only 64 bits of precision. 
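
The number of seed bits needed can be estimated by summing base-2 logarithms, which avoids having to evaluate the factorial itself. A minimal sketch, with illustrative variable names:
<code bb4w>
        bits = 0
        FOR i% = 2 TO 52
          REM LN is the natural logarithm, so divide by LN(2) to get log base 2
          bits = bits + LN(i%) / LN(2)
        NEXT i%
        PRINT "Bits needed to index every shuffle of 52 cards: "; bits
        PRINT "Bits in one full RND sequence: "; LN(2^33 - 1) / LN(2)
</code>
The first figure comes to about 225.6, compared with about 33 for RND.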
 + 
 +The Windows API function [[http://en.wikipedia.org/wiki/CryptGenRandom|CryptGenRandom]] (available on Windows 2000 and later) provides a //cryptographically secure// random number of a specified length, in bytes. The function below returns a random 32-bit integer using this method: 
 <code bb4w> <code bb4w>
         DEF FN_RndSecure         DEF FN_RndSecure
Line 56: Line 111:
         = R%         = R%
 </code> </code>
-CryptGenRandom uses various sources of //entropy// to seed the generator, for example the tick-count since boot time, the current clock time and the Performance Counter.\\ \\ + 
 +CryptGenRandom uses various sources of //entropy// to seed the generator, for example the tick-count since boot time, the current clock time and the Performance Counter. 
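
When a 32-bit value of this kind has to be reduced to a smaller range, such as a card position from 1 to 52, taking the remainder directly would reintroduce the kind of small bias described for RND(n). One common remedy is rejection sampling, sketched below. Note the assumptions: **FNunbiased** and **FNraw32** are hypothetical names, and FNraw32 simply calls RND so that the example runs on its own; in practice its body would be replaced by a call to FN_RndSecure:
<code bb4w>
        PRINT FNunbiased(52)
        END

        DEF FNunbiased(n%)
        LOCAL r%, limit%
        REM Largest multiple of n% that does not exceed 2^31-1
        limit% = (&7FFFFFFF DIV n%) * n%
        REPEAT
          REM Discard the sign bit, leaving a value from 0 to 2^31-1
          r% = FNraw32 AND &7FFFFFFF
        UNTIL r% < limit%
        REM Every remainder is now equally represented among the accepted values
        = r% MOD n% + 1

        REM Stand-in source of 32 random bits (an assumption for this sketch)
        DEF FNraw32 = RND
</code>
Values at or above the limit are simply discarded and a fresh value drawn, so the loop occasionally runs more than once.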
 + 
 ===== Code examples ===== ===== Code examples =====
-\\ + 
 ==== Lottery numbers ==== ==== Lottery numbers ====
-\\  ​The following routine selects and prints six lottery numbers, each in the range 1 to 49:\\ + 
 +The following routine selects and prints six lottery numbers, each in the range 1 to 49: 
 <code bb4w> <code bb4w>
         max = 49         max = 49
Line 75: Line 134:
         NEXT choice         NEXT choice
 </code> </code>
-If you want to display the selection in ascending sequence, you can put the numbers in a second array and sort it:\\ + 
 +If you want to display the selection in ascending sequence, you can put the numbers in a second array and sort it: 
 <code bb4w> <code bb4w>
         INSTALL @lib$+"SORTLIB"         INSTALL @lib$+"SORTLIB"
Line 97: Line 158:
         NEXT         NEXT
 </code> </code>
-\\ +
 ==== Shuffling a pack of cards ==== ==== Shuffling a pack of cards ====
-\\  ​Notwithstanding the comments made above, this routine uses RND for the convenience of the example. In practice a better random number generator ought to be used:\\ + 
 +Notwithstanding the comments made above, this routine uses RND for the convenience of the example. In practice a better random number generator ought to be used: 
 <code bb4w> <code bb4w>
         cards = 52         cards = 52
Line 114: Line 177:
         PRINT         PRINT
 </code> </code>
 +
 This uses Durstenfeld's algorithm for the [[http://en.wikipedia.org/wiki/Fisher-Yates_shuffle|Fisher–Yates shuffle]]. This uses Durstenfeld's algorithm for the [[http://en.wikipedia.org/wiki/Fisher-Yates_shuffle|Fisher–Yates shuffle]].
notes_20on_20the_20use_20of_20rnd.txt · Last modified: 2020/07/27 13:49 by richardrussell