#### Ruud

##### Veteran Member

For a very long time I'm working on three software projects between other things:

- My own OS meant to run on 8088 PC compatibles. My primary targets: C= PC10 series. Main hurdle at this moment: directories.

- My own bootable BASIC, same targets. Shares a lot of routines with the OS.

- My own Pascal, written in Turbo Pascal, which outputs macros. My own assembler should turn the macros into binaries for the various target machines. For the moment only the PC and the Commodore 64.

The last two projects have one thing in common: they need Floating Point (FP) routines for the mathematical part. For the C64 I can use the onboard FP routines so that problem already has been solved. But FP routines for the 8086.....

So far for the 8086 I only found the sources of the original and a hacked Microsoft's GW-BASIC and the so called URC library. The problem: the code of both is very hard to understand. And I'm not the only one with this opinion. And what doesn't help either, I'm a NASM man and the last time I really worked with MASM/TASM was thirty years ago.

That raised the idea to use the to 8086 converted FP routines of the C64 for my Pascal and BASIC as well. OK, it is "only" 40 bits but that is good enough for me. If I understand things better, I always can extend the number of bits. The main advantage: to check if things work OK, the idea is just running a small BASIC/ML program and to see what happens with the Floating Point registers FAC1 and FAC2. And then the "only" thing I have to do is to achieve the same result with my Pascal/8086 routines.

I don't want to copy everything from the C64 so I decided to write the various routines in Pascal first in a way that more or less mimics the use of registers. I decided to start with converting a string to FP. I have to check for several things. I assume that a number can look like "-1234.5678 e -123", with or without spaces around the 'e' and without one or both '-' signs or maybe with a '+' sign instead.

After parsing the string I will end up with two booleans/bytes representing the two signs, one word for the extension, four words for the fractional part and four words for the integer part. I know, massive overkill, but I'm already thinking about supporting the extended format which is eighty bits. And scaling down afterwards is easier than scaling up.

Generating the signs values, the extension word and integer part were relatively easy compared to the fractional part. While parsing the number for creating the integer part, I had to multiply the previous number with ten and I used a combination of addition and "shift left". For creating the fraction part I thought it was just a matter of subtraction and "shift right" but unfortunately that wasn't the case. And to make matters worse, I could not make use of the assembly instruction DIV and IDIV if more than sixteen bits were involved.

I had to find a way to divide multiple words by ten and in the end I came up with four possible solutions:

- found on internet:

unsigned divu10(unsigned n)

{

unsigned q, r;

q = (n >> 1) + (n >> 2);

q = q + (q >> 4);

q = q + (q >> 8);

q = q + (q >> 16);

q = q >> 3;

r = n - (((q << 2) + q) << 1);

return q + (r > 9);

}

It works for the biggest LONGINT I can produce in Pascal, $7FFFFFFF, but it looks a bit complicated (read: I'm lazy) and I have no idea what the error is at the end so at this moment I'm not really convinced.

- found on internet:

Have a look at this:

result = (input * 205) >> 11

This can be seen as "r = (i * 205) div 2048". the error will be less than one per mille.

When using 16 bits, "r = (i * 6554) >> 16", the error will be 4/65536. I even found the 6502 version for the calculation on internet.

When using 32 bits the error will be rough1y 1/1.000.000.000.

Thinking things over I'm leaning over to the idea of using 19 bits. The error will be 2/524.288 and, more important, the multiplier will be 52429: this means I can still use the 8088's own multiply instruction MUL.

I have no idea if this idea will be more complicated to convert into code than the first idea but its biggest advantage is that you know what the error will be.

- using tables:

The main idea is to generate a table with the values for 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001, etc., etc. The question is: how many records do we need and how may bits will each record contain? FYI: using just six records, thus including the one for 0.000001, will give already a smaller error than the above 16-bit multiplication/shift idea. Another advantage of using records: speed.

- using the C64 routines as base:

That was the original idea but then I found out that the C64 did the division by using FP routines. And that means that I ran more or less into the "chicken and egg" problem: I cannot produce an egg because I haven't a chicken yet.

Side note: the C64 used a table to look up the FP value for ten and that triggered the idea of using tables.

Once my own FP routines seems to work, I will see if I keep the original idea of division-by-ten or replace it with a FP division. The last will mean less code but speed can be a decisive factor

To be continued.....

Feel free to add comment, critics, ideas etc. Everything is welcome!

- My own OS meant to run on 8088 PC compatibles. My primary targets: C= PC10 series. Main hurdle at this moment: directories.

- My own bootable BASIC, same targets. Shares a lot of routines with the OS.

- My own Pascal, written in Turbo Pascal, which outputs macros. My own assembler should turn the macros into binaries for the various target machines. For the moment only the PC and the Commodore 64.

The last two projects have one thing in common: they need Floating Point (FP) routines for the mathematical part. For the C64 I can use the onboard FP routines so that problem already has been solved. But FP routines for the 8086.....

So far for the 8086 I only found the sources of the original and a hacked Microsoft's GW-BASIC and the so called URC library. The problem: the code of both is very hard to understand. And I'm not the only one with this opinion. And what doesn't help either, I'm a NASM man and the last time I really worked with MASM/TASM was thirty years ago.

That raised the idea to use the to 8086 converted FP routines of the C64 for my Pascal and BASIC as well. OK, it is "only" 40 bits but that is good enough for me. If I understand things better, I always can extend the number of bits. The main advantage: to check if things work OK, the idea is just running a small BASIC/ML program and to see what happens with the Floating Point registers FAC1 and FAC2. And then the "only" thing I have to do is to achieve the same result with my Pascal/8086 routines.

I don't want to copy everything from the C64 so I decided to write the various routines in Pascal first in a way that more or less mimics the use of registers. I decided to start with converting a string to FP. I have to check for several things. I assume that a number can look like "-1234.5678 e -123", with or without spaces around the 'e' and without one or both '-' signs or maybe with a '+' sign instead.

After parsing the string I will end up with two booleans/bytes representing the two signs, one word for the extension, four words for the fractional part and four words for the integer part. I know, massive overkill, but I'm already thinking about supporting the extended format which is eighty bits. And scaling down afterwards is easier than scaling up.

Generating the signs values, the extension word and integer part were relatively easy compared to the fractional part. While parsing the number for creating the integer part, I had to multiply the previous number with ten and I used a combination of addition and "shift left". For creating the fraction part I thought it was just a matter of subtraction and "shift right" but unfortunately that wasn't the case. And to make matters worse, I could not make use of the assembly instruction DIV and IDIV if more than sixteen bits were involved.

I had to find a way to divide multiple words by ten and in the end I came up with four possible solutions:

- found on internet:

unsigned divu10(unsigned n)

{

unsigned q, r;

q = (n >> 1) + (n >> 2);

q = q + (q >> 4);

q = q + (q >> 8);

q = q + (q >> 16);

q = q >> 3;

r = n - (((q << 2) + q) << 1);

return q + (r > 9);

}

It works for the biggest LONGINT I can produce in Pascal, $7FFFFFFF, but it looks a bit complicated (read: I'm lazy) and I have no idea what the error is at the end so at this moment I'm not really convinced.

- found on internet:

Have a look at this:

result = (input * 205) >> 11

This can be seen as "r = (i * 205) div 2048". the error will be less than one per mille.

When using 16 bits, "r = (i * 6554) >> 16", the error will be 4/65536. I even found the 6502 version for the calculation on internet.

When using 32 bits the error will be rough1y 1/1.000.000.000.

Thinking things over I'm leaning over to the idea of using 19 bits. The error will be 2/524.288 and, more important, the multiplier will be 52429: this means I can still use the 8088's own multiply instruction MUL.

I have no idea if this idea will be more complicated to convert into code than the first idea but its biggest advantage is that you know what the error will be.

- using tables:

The main idea is to generate a table with the values for 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001, etc., etc. The question is: how many records do we need and how may bits will each record contain? FYI: using just six records, thus including the one for 0.000001, will give already a smaller error than the above 16-bit multiplication/shift idea. Another advantage of using records: speed.

- using the C64 routines as base:

That was the original idea but then I found out that the C64 did the division by using FP routines. And that means that I ran more or less into the "chicken and egg" problem: I cannot produce an egg because I haven't a chicken yet.

Side note: the C64 used a table to look up the FP value for ten and that triggered the idea of using tables.

Once my own FP routines seems to work, I will see if I keep the original idea of division-by-ten or replace it with a FP division. The last will mean less code but speed can be a decisive factor

To be continued.....

Feel free to add comment, critics, ideas etc. Everything is welcome!

Last edited: