
C's itoa() in Assembler with mini-sprintf bonus

neilobremski

I've been poking around for a quick way to convert 16-bit words into digit characters a la the itoa() function in C. [SUP][1][/SUP] I thought and searched a lot around dividing by constants and clever ways of converting numbers to strings. In the end, I chose the K&R method as the base for my algorithm [SUP][2][/SUP] which goes like this:

  1. Divide number by base (10).
  2. Write remainder as next digit (converting to ASCII by adding 30h).
  3. Use quotient as number and repeat loop if non-zero.
  4. Reverse the resulting string so digits are ordered high to low.
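
For reference, the four steps above can be sketched in C roughly as follows. This is only an illustration, not the K&R listing itself: the name u16_to_dec is made up for this post, and it handles unsigned 16-bit values in base 10 only.

```c
#include <stddef.h>

/* Hypothetical helper (name invented for illustration).
   s must have room for 5 digits plus the terminating NUL. */
char *u16_to_dec(unsigned short n, char *s)
{
    size_t len = 0;
    size_t i, j;

    /* Steps 1-3: peel off digits, lowest first.
       do/while guarantees a "0" is written when n == 0. */
    do {
        s[len++] = (char)('0' + n % 10);  /* remainder to ASCII (add 30h) */
        n /= 10;                           /* quotient becomes the number  */
    } while (n != 0);
    s[len] = '\0';

    /* Step 4: reverse in place so the digits read high to low. */
    for (i = 0, j = len - 1; i < j; i++, j--) {
        char t = s[i];
        s[i] = s[j];
        s[j] = t;
    }
    return s;
}
```

Every digit gets written once and then most of them get touched again by the reversal loop.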

This is what I had come up with before searching for anything so I was both warmed and alarmed that my solution was the standard one. Usually what I come up with in a vacuum is workable but slow; in this case it is the way to do it. Before I had even checked the web, I debugged MSC 5.1's itoa() and found it to be doing this. [SUP][3][/SUP]

Now my main problem with the approach is the fact that it accesses the string memory a lot: first to write the digits and then again to reverse the string. The only ways I could see around this were to use a fixed buffer size and work from right to left, or to discover some math that goes from the highest digit to the lowest. Neither worked out for me so I scrapped them both.
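
To show what the fixed-buffer idea looks like, here is a rough C sketch. The function name and the 6-byte buffer size are assumptions made for this example (5 digits is the most a 16-bit value needs, plus a NUL).

```c
/* Hypothetical helper: fills a caller-supplied 6-byte buffer from the
   right and returns a pointer to the first digit, so no reversal pass
   is needed. */
char *u16_to_dec_rtl(unsigned short n, char buf[6])
{
    char *p = buf + 5;

    *p = '\0';
    do {
        *--p = (char)('0' + n % 10);  /* store digits right to left */
        n /= 10;
    } while (n != 0);
    return p;                          /* usually NOT equal to buf */
}
```

In this sketch the string does not start at buf[0] unless the number has all five digits, so the caller has to use the returned pointer rather than the buffer address.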

And then dividing by 10 isn't exactly a problem but it's a constant divide so I figured there must be a way to do this without using the DIV instruction which is slow as molasses. Nope, not on the 8088 at least. There are solutions in Hacker's Delight [SUP][4][/SUP] but they use a lot of shifts. The cycles used to read all the shift instructions and perform them on the 8088 would be equivalent to simply dividing; not to mention the fact that DIV outputs both the quotient and remainder in one go.
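
To give an idea of how many shifts are involved, here is a C sketch of a shift-and-add divide by 10 in the style of the one in Hacker's Delight. It is written with 32-bit arithmetic as I understand that routine, so treat it as an illustration rather than something tuned for a 16-bit CPU.

```c
#include <stdint.h>

/* Unsigned divide by 10 using only shifts, adds and one compare.
   The first four lines build an estimate of n * 0.8, which is then
   shifted down to roughly n / 10 and corrected by at most one. */
uint32_t divu10(uint32_t n)
{
    uint32_t q, r;

    q = (n >> 1) + (n >> 2);          /* q = n * 0.75            */
    q = q + (q >> 4);                 /* refine toward n * 0.8   */
    q = q + (q >> 8);
    q = q + (q >> 16);
    q = q >> 3;                       /* q is n / 10, maybe 1 low */
    r = n - (((q << 2) + q) << 1);    /* r = n - q * 10          */
    return q + (r > 9);               /* fix up the estimate     */
}
```

Note that this only produces the quotient; the remainder needs the extra multiply-and-subtract shown on the second-to-last line.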

Alright, so I decided that the algorithm is fine the way it is and I set out to write a function which I called Num2Str() ...

Code:
a 440
; ----------------------------------------------------------------------------
; Num2Str()							:NUM2STR
;
; Convert unsigned integer AX into a decimal (base 10) string at ES:DI.
;
;	[input]		[output]
; AX	ushort		(trashed)
; BX			(trashed)
; CX	CH=pad,CL=len	(trashed)
; DX			(trashed)
; SI			(trashed)
; DI	pString+offs	pString+len
;
CLD		; 440
MOV	BX, 0A	; 441 base 10
MOV	SI, DI	; 444
		;
XOR	DX, DX	; 446 DX:AX = unsigned AX			:NUM2STR_LOOP
DIV	BX	; 448 AX = quotient, DX = remainder
XCHG	AX, DX	; 44A
ADD	AL, 30	; 44B digit to ASCII
STOSB		; 44D
DEC	CL	; 44E
JZ	0468	; 450 >NUM2STR_REV
MOV	AX, DX	; 452
OR	AX, AX	; 454
JNZ	0446	; 456 >NUM2STR_LOOP
		;
STOSB		; 458 AL is 0					:NUM2STR_NULL
DEC	DI	; 459
		;
OR	CL, CL	; 45A						:NUM2STR_PAD
JLE	0468	; 45C >NUM2STR_REV
ADD	AL, CH	; 45E
JNZ	0464	; 460 >NUM2STR_REPS
MOV	AL, 30	; 462
XOR	CH, CH	; 464						:NUM2STR_REPS
REP	STOSB	; 466
		;
MOV	DX, DI	; 468						:NUM2STR_REV
DEC	DI	; 46A						:NUM2STR_REVL
LODSB		; 46B read DS:SI byte
XCHG	AL, [DI]; 46C swap with DS:DI byte (assumes DS=ES)
MOV	[SI-01], AL	; 46E write back at DS:SI-1
LEA	AX, [SI+01]	; 471 equivalent to MOV AX, SI then INC AX
CMP	AX, DI	; 474
JB	046A	; 476 if (SI+1 < DI) >NUM2STR_REVL
		;
MOV	DI, DX	; 478
RET		; 47A						:NUM2STR_RET

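This function can handle both variable-length and fixed-length strings. With CX set to zero the length is variable and a NULL is appended after the digits. With CL set to a field width the length is fixed: the digits are right-justified, padded on the left with the character in CH (or '0' if CH is zero), and no NULL is appended. In either case, DI is set to the position after the last digit when the function returns. By doing that I could use it as part of a "stream" a la sprintf!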
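
If the register juggling is hard to follow, this C model shows the behavior I am going for. It is a sketch only: the function name and parameter names are invented here, with len standing in for CL, pad for CH, and the return value for DI.

```c
#include <stddef.h>

/* Hypothetical C model of Num2Str().
   len == 0: variable length, NUL appended.
   len  > 0: exactly len characters, left-padded with pad
             (or '0' when pad is 0), no NUL appended; digits that
             do not fit in the field are dropped from the high end.
   Returns the position just after the last digit. */
char *num2str_model(unsigned short n, char *dst, size_t len, char pad)
{
    size_t count = 0;
    size_t i, j;

    /* Write digits lowest first until the number or the field runs out. */
    do {
        dst[count++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0 && count != len);

    if (len == 0) {
        dst[count] = '\0';                 /* variable length: terminate */
    } else {
        while (count < len)                /* fixed length: pad the rest */
            dst[count++] = pad ? pad : '0';
    }

    /* Reverse digits (and padding) so the string reads left to right. */
    for (i = 0, j = count - 1; i < j; i++, j--) {
        char t = dst[i];
        dst[i] = dst[j];
        dst[j] = t;
    }
    return dst + count;
}
```

For example, 123 with a width of 5 and a space as pad gives "  123", and the same number with a width of 0 gives "123" followed by a NUL.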

Code:
a 480
; ----------------------------------------------------------------------------
; sprintf()							:SPRINTF
;
; Copies string DS:SI to ES:DI using 16-bit substitutions from [BP]
; (addressed via SS by default; same as DS:BP when SS=DS).
;
; NOTE: Currently only supports %u via NUM2STR
;
;	[input]		[output]
; AX			(trashed)
; BX			(trashed)
; CX			(trashed)
; DX			(trashed)
; SI	pFormat+Offs	pFormat+Len
; DI	pOutput+Offs	pOutput+Len
; BP	ushort[]	ushort[]+Len
;
CLD		; 480
JMP	0488	; 481 >SPRINTF_LODS
STOSB		; 483						:SPRINTF_STOS
OR	AL, AL	; 484
JZ	04A0	; 486 >SPRINTF_RET
LODSB		; 488						:SPRINTF_LODS
CMP	AL, 25	; 489 '%'
JNE	0483	; 48B >SPRINTF_STOS
LODSB		; 48D
CMP	AL, 75	; 48E 'u'
JNE	0483	; 490 >SPRINTF_STOS
MOV	AX, [BP]; 492
INC	BP	; 495
INC	BP	; 496
XOR	CX, CX	; 497
PUSH	SI	; 499
CALL	0440	; 49A >NUM2STR
POP	SI	; 49D
JMP	0488	; 49E >SPRINTF_LODS
RET		; 4A0						:SPRINTF_RET

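My "mini" version of sprintf() only handles unsigned integers substituted at "%u". It reads these in order from the words at [BP]. The 8088 addresses [BP] through SS by default, so this is the same memory as DS:BP only when SS equals DS, as it does in a .COM program or a fresh DEBUG session. Any other character following a "%" is copied through as-is and the "%" itself is dropped. I decided on the ol' printf style because it's familiar and I can add functionality for other formatting later while easily supporting what I need now.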
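
Here is the same loop modeled in C, as a sketch under a few assumptions: the name mini_sprintf and its parameter order are invented for this example, args plays the role of the word array at BP, and the return value plays the role of DI on return (just past the NUL).

```c
#include <stddef.h>

/* Hypothetical C model of the mini sprintf(): copies fmt to out,
   replacing each "%u" with the next 16-bit value from args. */
char *mini_sprintf(char *out, const char *fmt, const unsigned short *args)
{
    for (;;) {
        char c = *fmt++;

        if (c == '%') {
            c = *fmt++;                    /* look at the format letter */
            if (c == 'u') {
                unsigned short n = *args++;
                char tmp[5];               /* at most 5 digits in 16 bits */
                size_t k = 0;

                do {                       /* digits, lowest first */
                    tmp[k++] = (char)('0' + n % 10);
                    n /= 10;
                } while (n != 0);
                while (k != 0)             /* emit highest first */
                    *out++ = tmp[--k];
                continue;
            }
            /* Not 'u': fall through and copy the character itself. */
        }
        *out++ = c;
        if (c == '\0')
            return out;                    /* position after the NUL */
    }
}
```

With the format "%uK of %u%%" and the values 640 and 65535, this produces "640K of 65535%".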

Footnotes:

[SUP][1][/SUP]. This is usually defined as char *itoa (int value, char *str, int base); but sometimes the base parameter is called radix and the return value is void. At the time of writing this, I haven't bothered to check what's standard.

[SUP][2][/SUP]. The function itoa appeared in the first edition of Kernighan and Ritchie's The C Programming Language, on page 60.

[SUP][3][/SUP]. And in stepping through the code, I found LEA to be a valid 8088 instruction whereas I had previously thought it appeared with the 286.

[SUP][4][/SUP]. http://www.hackersdelight.org/
 