neilobremski
Experienced Member
I've been poking around for a quick way to convert 16-bit words into digit characters a la the itoa() function in C. [SUP][1][/SUP] I thought and searched a lot around dividing by constants and clever ways of converting numbers to strings. In the end, I chose the K&R method as the base for my algorithm [SUP][2][/SUP] which goes like this:
- Divide number by base (10).
- Write remainder as next digit (converting to ASCII by adding 30h).
- Use quotient as number and repeat loop if non-zero.
- Reverse the resulting string so digits are ordered high to low.
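For readers who think in C, the four steps above can be sketched roughly as follows. This is a minimal illustration for unsigned 16-bit values, not the K&R listing itself; the names `u16_to_dec` and `reverse_range` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the characters in [first, last) in place. */
static void reverse_range(char *first, char *last)
{
    while (last - first > 1) {
        char tmp = *first;
        --last;
        *first++ = *last;
        *last = tmp;
    }
}

/* Convert n to decimal text in s (needs room for 5 digits plus a NUL). */
char *u16_to_dec(uint16_t n, char *s)
{
    char *p = s;

    assert(s != NULL);
    do {
        *p++ = (char)('0' + n % 10);  /* remainder becomes the next digit (+30h) */
        n /= 10;                      /* quotient becomes the new number */
    } while (n != 0);                 /* repeat while non-zero */
    *p = '\0';
    reverse_range(s, p);              /* digits were written low to high */
    return s;
}
```

Note that every digit is stored once and then touched again by the reversal, which is the memory traffic discussed in the post.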
This is what I had come up with before searching for anything so I was both warmed and alarmed that my solution was the standard one. Usually what I come up with in a vacuum is workable but slow; in this case it is the way to do it. Before I had even checked the web, I debugged MSC 5.1's itoa() and found it to be doing this. [SUP][3][/SUP]
Now my main problem with the approach is that it accesses the string memory a lot: first to write the digits and then again to reverse the string. The only ways I could see around this were to use a fixed buffer size and start from the right working left, or to discover some math to go from the highest digit to the lowest. Neither worked out for me so I scrapped them both.
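To make the "fixed buffer, right to left" idea concrete, here is a small C sketch of it. The names `u16_to_dec_rtl` and `U16_DEC_SIZE` are invented for illustration, and this is only one way the idea could look, not something taken from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define U16_DEC_SIZE 6  /* 5 digits for 65535 plus a NUL terminator */

/* Fill buf from the right; return a pointer to the first digit. */
char *u16_to_dec_rtl(uint16_t n, char buf[U16_DEC_SIZE])
{
    char *p = buf + U16_DEC_SIZE - 1;

    assert(buf != NULL);
    *p = '\0';
    do {
        *--p = (char)('0' + n % 10);  /* lowest digit lands rightmost */
        n /= 10;
    } while (n != 0);
    return p;                         /* text starts here, not at buf[0] */
}
```

No reversal is needed, but the text begins wherever the digits happen to start rather than at the front of the buffer. For a routine that is supposed to write at a given output position, that probably means an extra copy or a known field width, which may be part of why the idea was dropped.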
And then dividing by 10 isn't exactly a problem, but it's a divide by a constant, so I figured there must be a way to do this without using the DIV instruction, which is slow as molasses. Nope, not on the 8088 at least. There are solutions in Hacker's Delight [SUP][4][/SUP] but they use a lot of shifts. The cycles used to fetch and execute all of those shift instructions on the 8088 would be about equivalent to simply dividing, not to mention the fact that DIV outputs both the quotient and remainder in one go.
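For reference, the shift-and-add style of divide-by-10 looks roughly like this in C. It is a sketch in the spirit of the Hacker's Delight routines, narrowed to 16 bits; the function names are made up here, and it says nothing about actual 8088 cycle counts.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate n * 0.8 with shifts and adds, shift down to n / 10, then fix up. */
uint16_t divu10_u16(uint16_t n)
{
    uint16_t q, r;

    q = (uint16_t)((n >> 1) + (n >> 2));  /* n * 0.75 */
    q = (uint16_t)(q + (q >> 4));         /* n * 0.796875 */
    q = (uint16_t)(q + (q >> 8));         /* close to n * 0.8 */
    q = (uint16_t)(q >> 3);               /* estimate of n / 10, may be 1 too small */
    r = (uint16_t)(n - q * 10);           /* remainder against the estimate */
    return (uint16_t)(q + (r > 9));       /* correct the estimate if needed */
}

/* Exhaustive check against the built-in division for every 16-bit input. */
void divu10_selfcheck(void)
{
    uint32_t n;

    for (n = 0; n <= 0xFFFFu; n++)
        assert(divu10_u16((uint16_t)n) == (uint16_t)(n / 10));
}
```

Counting the operations gives a feel for the trade-off: several multi-bit shifts, a few adds, a multiply by 10 (itself shifts and adds on an 8088) and a compare, versus a single DIV that already leaves the remainder in DX.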
Alright, so I decided that the algorithm is fine the way it is and I set out to write a function which I called Num2Str() ...
Code:
a 440
; ----------------------------------------------------------------------------
; Num2Str() :NUM2STR
;
; Convert unsigned integer AX into a decimal (base 10) string at ES:DI.
;
; [input] [output]
; AX ushort (trashed)
; BX (trashed)
; CX CH=pad,CL=len (trashed)
; DX (trashed)
; SI (trashed)
; DI pString+offs pString+len
;
CLD ; 440
MOV BX, 0A ; 441 base 10
MOV SI, DI ; 444
; NOTE: CWD sign-extends AX, so DX is 0 only when AX < 8000h; larger values overflow the first DIV.
CWD ; 446 :NUM2STR_LOOP
DIV BX ; 447
XCHG AX, DX ; 449
ADD AL, 30 ; 44A
STOSB ; 44C
DEC CL ; 44D
JZ 0467 ; 44F >NUM2STR_REV
MOV AX, DX ; 451
OR AX, AX ; 453
JNZ 0446 ; 455 >NUM2STR_LOOP
;
STOSB ; 457 AL is 0 :NUM2STR_NULL
DEC DI ; 458
;
OR CL, CL ; 459 :NUM2STR_PAD
JLE 0467 ; 45B >NUM2STR_REV
ADD AL, CH ; 45D
JNZ 0463 ; 45F >NUM2STR_REPS
MOV AL, 30 ; 461
XOR CH, CH ; 463 :NUM2STR_REPS
REP STOSB ; 465
;
MOV DX, DI ; 467 :NUM2STR_REV
DEC DI ; 469 :NUM2STR_REVL
LODSB ; 46A read DS:SI byte
XCHG AL, [DI] ; 46B read/write DS:DI byte (assumes DS = ES)
MOV [SI-01], AL ; 46D write DS:SI byte
LEA AX, [SI+01] ; 470 equivalent to MOV AX, SI then INC AX
CMP AX, DI ; 473
JB 0469 ; 475 if (SI+1 < DI) >NUM2STR_REVL
;
MOV DI, DX ; 477
RET ; 479 :NUM2STR_RET
This function can handle both variable length and fixed length strings. If a fixed length is used, by setting CX (pad character in CH, length in CL), then a terminating NUL is not appended to the end. In either case, DI is set to the position after the last digit when the function returns. By doing that I could use it as part of a "stream" a la sprintf!
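The variable/fixed length behaviour is easier to see in a higher-level form. Below is a rough C model of the routine as I read the listing; `num2str_model` and its parameter names are hypothetical, `pad` and `len` stand in for CH and CL, and the return value plays the role of DI. It only models lengths below 128, since the listing's pad check treats CL as a signed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * len == 0: variable length, NUL-terminated.
 * len  > 0: exactly len characters, padded on the left with pad
 *           ('0' when pad is 0), and no NUL is stored.
 * Returns the position just past the last character written.
 */
char *num2str_model(uint16_t value, char *dst, char pad, uint8_t len)
{
    char *p = dst;
    char *lo, *hi;
    uint8_t left = len;

    assert(dst != NULL);
    assert(len < 128);                     /* assumption of this model */
    for (;;) {
        *p++ = (char)('0' + value % 10);   /* DIV, ADD AL,30, STOSB */
        value /= 10;
        if (len != 0 && --left == 0)
            break;                         /* fixed field is full */
        if (value == 0) {
            if (len == 0) {
                *p = '\0';                 /* terminator; p stays on it */
            } else {
                char fill = pad ? pad : '0';
                while (left-- > 0)
                    *p++ = fill;           /* REP STOSB */
            }
            break;
        }
    }
    /* Reverse [dst, p): padding was appended, so it ends up on the left. */
    for (lo = dst, hi = p; hi - lo > 1; ) {
        char tmp = *lo;
        --hi;
        *lo++ = *hi;
        *hi = tmp;
    }
    return p;
}
```

In this model, a value with more digits than the fixed length keeps only its lowest `len` digits, because the digit loop stops as soon as the field is full.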
Code:
a 480
; ----------------------------------------------------------------------------
; sprintf() :SPRINTF
;
; Copies string DS:SI to ES:DI using 16-bit substitutions from DS:BP.
;
; NOTE: Currently only supports %u via NUM2STR
;
; [input] [output]
; AX (trashed)
; BX (trashed)
; CX (trashed)
; DX (trashed)
; SI pFormat+Offs pFormat+Len
; DI pOutput+Offs pString+Len
; BP ushort[] ushort[]+Len
;
CLD ; 480
JMP 0488 ; 481 >SPRINTF_LODS
STOSB ; 483 :SPRINTF_STOS
OR AL, AL ; 484
JZ 04A0 ; 486 >SPRINTF_RET
LODSB ; 488 :SPRINTF_LODS
CMP AL, 25 ; 489 '%'
JNE 0483 ; 48B >SPRINTF_STOS
LODSB ; 48D
CMP AL, 75 ; 48E 'u'
JNE 0483 ; 490 >SPRINTF_STOS
MOV AX, [BP] ; 492 [BP] defaults to SS (assumes SS = DS)
INC BP ; 495
INC BP ; 496
XOR CX, CX ; 497
PUSH SI ; 499
CALL 0440 ; 49A >NUM2STR
POP SI ; 49D
JMP 0488 ; 49E >SPRINTF_LODS
RET ; 4A0 :SPRINTF_RET
My "mini" version of sprintf() only handles unsigned integers substituted at "%u". It reads these in order from the memory at DS:BP (the code uses [BP], which addresses SS by default, so this assumes SS = DS, as in a .COM program). I decided on the ol' printf style because it's familiar and I can add functionality for other formatting later while easily supporting what I need now.
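Here is a rough C model of what the mini sprintf() does, based on my reading of the listing. The names `mini_sprintf` and `put_u16` are hypothetical, `args` stands in for the words at BP, and `put_u16` is a simplified stand-in for Num2Str with CX = 0 (it uses a small temporary buffer instead of reversing in place).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Num2Str with CX = 0: variable-length decimal. */
static char *put_u16(uint16_t value, char *dst)
{
    char tmp[5];
    int n = 0;

    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    while (n > 0)
        *dst++ = tmp[--n];
    return dst;
}

/*
 * Copy fmt to out, replacing each "%u" with the next value from args.
 * Any other character after '%' is copied as-is (the '%' is dropped).
 * Returns the position just past the terminating NUL.
 */
char *mini_sprintf(char *out, const char *fmt, const uint16_t *args)
{
    assert(out != NULL && fmt != NULL);
    for (;;) {
        char c = *fmt++;
        if (c == '%') {
            c = *fmt++;
            if (c == 'u') {
                out = put_u16(*args++, out);  /* MOV AX,[BP]; INC BP; INC BP */
                continue;
            }
        }
        *out++ = c;
        if (c == '\0')
            return out;
    }
}
```

With arguments {640, 480} and the format "%ux%u!", this model produces "640x480!". One side effect of the simple parsing is that "%%" comes out as a single "%", which happens to match printf.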
Footnotes:
[SUP][1][/SUP]. This is usually defined as char *itoa (int value, char *str, int base); but sometimes the base parameter is called radix and the return value is void. At the time of writing this, I haven't bothered to check what's standard.
[SUP][2][/SUP]. The function itoa appeared in the first edition of Kernighan and Ritchie's The C Programming Language, on page 60.
[SUP][3][/SUP]. And in stepping through the code, I found LEA to be a valid 8088 instruction whereas I had previously thought it appeared with the 286.
[SUP][4][/SUP]. http://www.hackersdelight.org/