SHX-RAND - a random-number generator project for 8086/386

commodorejohn · May 25, 2012

Being as I'm interested in dabbling with game development for MS-DOS, I knew I'd need a random-number generator eventually. While researching the subject, I found this algorithm, which combines a number of attractive features: no multiply/divide (shift/XOR-based instead,) very little storage required, and a long period. (I'm no statistics expert, so I can't speak for the quality of the random numbers, but they certainly look random to me...) So I took a stab at implementing it in assembler, and here it is:

http://www.commodorejohn.com/shx-rand.zip (assembler source is NASM format)

It's really quite a nice and simple algorithm to implement in assembler; I did both an 8086 version and a 386 version (uses 32-bit operations and long immediate shifts.) Most of the work went into the 8086 version, which I tried to optimize to minimize memory access; I'm not going to claim the result is optimal (I'd be happy to hear suggestions for improvement,) but I think it's pretty good. A reference implementation in C is also included, along with test programs for all three which generate the first 65536 random numbers from a specified seed, and have been verified to give the same results.

So yeah, this was an interesting little project. I intend to add a 68k implementation to the mix, shouldn't be too hard. Questions? Comments? Smart remarks?

Code:

getrand:	; Destroys all main registers, returns AXBX = 32-bit random number
			; Also returns EBX = 32-bit random number on 80386
	%ifdef CPU_80386
		; 80386 version (32-bit, has barrel-shifter, fast!)
		mov	eax, [randx]	; Ahh, 32-bit registers...
		mov	ebx, eax		; Does this save us a memory access on the XOR?
		shl	eax, 11			; Ahh, barrel-shifter with large immediate shifts...
		
		xor	eax, ebx
		
		mov	ebx, [randw]	; I'm not going to try to make the list-shuffle look
		mov	ecx, [randy]	; like it corresponds, important thing is that the
		mov	[randx], ecx	; values wind up in the right place.
		mov	ecx, [randz]
		mov	[randy], ecx
		mov	[randz], ebx
		
		mov	ecx, ebx
		shr	ecx, 19
		
		xor	ebx, ecx
		
		mov	ecx, eax
		shr	ecx, 8
		
		xor	eax, ecx
		xor ebx, eax
		
		mov	[randw], ebx
		; Now load the high word of EBX into AX, for compatibility
		mov	eax, ebx		; I need to get a better 386 assembler reference,
		shr	eax, 16			; but I don't believe there's a way to access the
							; high word of a 32-bit register directly...
	%else
		; 8086 version (much slower, but more explanatorily commented!)
		; Many steps are done out of order here in order to minimize re-loading
		; of values from memory. I *think* I've tracked everything correctly so
		; that the results are correct.
		mov	cx, [randx+2]	; CXDX = [randx]
		mov	dx, [randx]
		
		mov	ah, cl			; AXBX = [randx] << 8
		mov	al, dh
		mov	bh, dl
		xor	bl, bl
		
		shl	bx, 1			; AXBX = AXBX << 3
		rcl	ax, 1
		shl	bx, 1
		rcl	ax, 1
		shl	bx, 1
		rcl	ax, 1
		
		xor	ax, cx			; First part of the algorithm:
		xor	bx, dx			; t = (x XOR (x << 11))
		
		mov	dl, bh			; CXDX = (t >> 8)
		mov	dh, al
		mov	cl, ah
		xor	ch, ch
		
		xor	ax, cx			; Piece for the second part of the algorithm:
		xor	bx, dx			; (t XOR (t >> 8))
		
		mov	cx, [randy+2]	; Move the stored values one space along in the list
		mov	dx, [randy]		; X = Y, Y = Z, Z = W
		mov	[randx+2], cx
		mov	[randx], dx		; Is this the fastest way to do it? Seems like it
		mov	cx, [randz+2]	; would be, I can't think of another method that
		mov	dx, [randz]		; requires fewer memory accesses and no overhead.
		mov	[randy+2], cx
		mov	[randy], dx
		mov	dx, [randw+2]	; DX = [randw] >> 16 (for later)
		mov	cx, [randw]
		mov	[randz+2], dx
		mov	[randz], cx
		
		xor	ax, dx			; Cuts the final XOR in half.
		
		shr dx, 1			; DX = DX >> 3
		shr dx, 1
		shr dx, 1
		
		xor	dx, cx			; Piece for the second part of the algorithm:
							; (w XOR (w >> 19))
		
		xor	bx, dx			; Second part:
							; w = (w XOR (w >> 19)) XOR (t XOR (t >> 8))
							; return w
		
		mov	[randw], bx		; Final history-list operation, store new value in W
		mov	[randw+2], ax
	%endif
	ret
	
	; Put these anywhere, but have 'em aligned!
	alignb	4
randx:	resd	1
randy:	resd	1
randz:	resd	1
randw:	resd	1

Krille · May 8, 2016

It is possible to optimize this slightly by replacing this;

Code:

		xor	ch, ch
		
		xor	ax, cx			; Piece for the second part of the algorithm:

with this;

Code:

		xor	al, cl			; Piece for the second part of the algorithm:

I've actually been looking for a lightweight PRNG suitable for 8088 machines. I haven't tried this yet (and it may take a while before I'll get to it) but it looks simple enough to implement even for a noob like me. Sadly, the link to the algorithm is now dead and searching for SHX-RAND doesn't give any useful results.

Anyway, thanks for doing this John!

commodorejohn · May 8, 2016

Hah, hadn't expected to see this thread revived four years on :D Nice point about the optimization. I haven't tried it on an 8088 myself, but I did a PDP-11 implementation and it was pretty zippy even on my 11/23. The original article is still available on archive.org.

Chuck(G) · May 8, 2016

I didn't want to bring this up when the thread first posted, but what the heck, I feel good today...

The title should read "A Pseudo-random integer generator project..."

That distinction, as you know, is important. A true random-number generator would not generate the same sequence when given the same initial conditions.

Trixter · May 9, 2016

Obligatory demoscene references: http://www.pouet.net/topic.php?which=10565

There's always the xorshift family. http://www.arklyffe.com/main/2010/08/29/xorshift-pseudorandom-number-generator/ for code.

Chuck(G) · May 9, 2016

Isn't the "xorshift" just another LFSR?

Most simple pseudorandom number generators are pretty poor. Monte Carlo tests often expose the weakness. Books have been written on the subject. Simple ones probably work well for games, but not so much for statistical or crypto work.

To make a better one a diode, an op-amp and a counter can be made to work as a very simple one. Shot noise or thermodynamic noise is random, although the "pinkness" of the noise has to be taken into account. Some MCUs such as the Raspberry Pi already have a TRNG on-chip.

commodorejohn · May 9, 2016

Well, yeah, that's true, but the way I look at it is that if you're doing statistical or crypto work, it's your business to know about this stuff, and if you're not, it probably doesn't matter that much. (At least, past the point of "don't use RANDU/the stdlib rand() function" which is pretty common knowledge.)

(Also, I had no idea the Pi had a true noise source in it - heh!)

kyodai · May 12, 2016

Well if you do Crypto stuff then you prolly wouldn't want to rely on a few lines of assembler. But yeah on a simple 8086 you usually don't have many good sources of noise. For a game I'd think it might be sufficient to incorporate weak noise factors like the current timestamp and maybe the mouse position (if you have a mouse that is) to help making the seed look halfway random. On modern Systems like unix or Windows the current technology already offers random number generators that already look pretty promising in a Monte Carlo test.

Chuck(G) · May 12, 2016

It depends. If you want "random-looking" patterns of bits with a uniform distribution, then a LFSR is probably the fastest way to go; just a shift and a couple of XORs. But that's far from producing random integers--an LFSR will fail a Monte Carlo test.

SHX-RAND - a random-number generator project for 8086/386

commodorejohn

Veteran Member

Krille

Veteran Member

commodorejohn

Veteran Member

Chuck(G)

25k Member

Trixter

Veteran Member

Chuck(G)

25k Member

commodorejohn

Veteran Member

kyodai

Experienced Member

Chuck(G)

25k Member