I believe that a significant difference between the H-200 and the IBM 1401 was that the H-200 was able to do both BCD and binary arithmetic. Being a character machine with unlimited word length, that meant that one could do either to any number of significant digits as well, which was fun. Having binary arithmetic also meant that one could do address modification. In the most basic model of H-200, which didn't have indexed addressing, one could index a programme loop through an array simply by adding values to the appropriate parts of the instructions, which was okay until the addresses overflowed by mistake and the carries changed the operation codes.
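To illustrate the overflow hazard, here is a toy model in Python. It is not the real H-200 instruction format: the 12-bit address field, the packing of the operation code directly above it, and the names `make_instruction`, `split_instruction` and `step_address` are all assumptions made up for this sketch.

```python
# Toy model only: an "instruction" is an integer with the operation code
# sitting directly above an address field. The field width is an assumption.
ADDR_BITS = 12
ADDR_MASK = (1 << ADDR_BITS) - 1


def make_instruction(opcode, address):
    """Pack a (hypothetical) opcode and address into one word."""
    return (opcode << ADDR_BITS) | address


def split_instruction(word):
    """Return (opcode, address) from a packed word."""
    return word >> ADDR_BITS, word & ADDR_MASK


def step_address(word, stride):
    """Index a loop by plain binary addition on the instruction itself.

    Nothing masks the result, so a carry out of the address field
    lands in the operation code.
    """
    return word + stride


word = make_instruction(0o15, ADDR_MASK - 1)
word = step_address(word, 1)       # still fine: address is now at its maximum
safe = split_instruction(word)
word = step_address(word, 1)       # address wraps, carry bumps the opcode
broken = split_instruction(word)
```

After the second step the address has wrapped to zero and the operation code has silently become a different one, which is the failure described above.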
There is no consolidated multiplication or division in my algorithm because it only calculates one decimal place at a time. The only multiplications required are by integer constants so they are done by the time-honoured process of doubling and adding. Divisions just provide a single decimal digit as quotient with a remainder, so are just done by repeated subtraction. In order to calculate the digits progressively all the calculations involved have to be done simultaneously one place at a time. As binary values take less space than BCD I use purely binary numbers but scale them by ten each time around the loop, so they effectively convert into decimal as they are used, a sort of extended BCD if you like. Only the final accumulator is a real BCD field. This field is necessary because I have to allow a delay of a few digits before printing the top one so that carries can propagate through the answer. This final carrying is the only part of the calculation of Pi that demands right-to-left operations; everything else can be done left-to-right even though it runs contrary to our usual view of arithmetic. One of my reasons for choosing the Feynman point as my target was that passing it successfully proves that the propagation of carries is working where it is most likely to fail.
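The three building blocks mentioned above can be sketched in Python as follows. This is only an illustration of the techniques, not the H-200 code and not the full Pi algorithm; the names `times_const`, `digit_divide` and `DelayedDigits`, and the way the delay window is organised, are assumptions for the example.

```python
def times_const(value, k):
    """Multiply by a small non-negative integer constant using only
    doubling and adding."""
    result, addend = 0, value
    while k:
        if k & 1:
            result += addend
        addend += addend          # doubling
        k >>= 1
    return result


def digit_divide(dividend, divisor):
    """Divide by repeated subtraction; returns (quotient, remainder).
    Cheap when the quotient is a single decimal digit."""
    quotient = 0
    while dividend >= divisor:
        dividend -= divisor
        quotient += 1
    return quotient, dividend


class DelayedDigits:
    """Hold back the last few digits so that a late carry can ripple
    right-to-left through them before anything is printed."""

    def __init__(self, delay):
        self.delay = delay
        self.held = []            # digits still open to carries
        self.printed = []         # digits that are final

    def push(self, value):
        # value may exceed 9; the excess is a carry into earlier digits
        self.held.append(value)
        i = len(self.held) - 1
        while self.held[i] > 9:
            if i == 0:
                raise OverflowError("carry ran past the held digits")
            carry, self.held[i] = digit_divide(self.held[i], 10)
            self.held[i - 1] += carry
            i -= 1
        if len(self.held) > self.delay:
            self.printed.append(self.held.pop(0))


buf = DelayedDigits(delay=3)
for v in (1, 9, 9, 10):           # the final 10 carries through both nines
    buf.push(v)
```

Here the carry from the last value ripples through the two nines and turns the leading 1 into a 2 before it is released. A run of nines is where a carry travels furthest, so the delay has to be at least as long as the longest run it may meet.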
I do use variable length words in the FIFO queue but that alone wasn't enough to hit my target, so I have added half-byte compaction to eliminate half-byte zeroes. The memory used by the compaction and expansion routines is less than the savings in the queue size, so it was worth doing ... but I still didn't hit the target then. The H-200 has an additional bit, the item mark, on each character and that tends to sit around doing nothing most of the time, so I use it to mark bytes where a half-byte zero has been suppressed. Making the values variable length isn't an enormous saving though as I only ever store remainders from divisions, which are all small numbers by definition. I have put so much memory optimisation into the H-200 version of the programme that it is now difficult to see how the actual calculation of Pi works in amongst all the optimisation code.
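One way such a scheme could work is sketched below in Python. This is a guess at the idea, not the actual routine: it assumes six data bits per character, so a half is three bits, and it assumes the item mark means "a zero half was dropped just before the two halves stored here". The names `pack` and `unpack` are invented for the sketch.

```python
def pack(halves):
    """Pack 3-bit halves into (item_mark, six_bit_char) pairs.

    Hypothetical rule: a zero half followed by a full pair is not stored;
    the item mark on that pair records that it was suppressed.
    """
    out = []
    i, n = 0, len(halves)
    while i < n:
        mark = False
        if halves[i] == 0 and i + 2 < n:
            mark = True           # suppress the zero, flag the next character
            i += 1
        hi = halves[i]
        lo = halves[i + 1] if i + 1 < n else 0   # pad a trailing odd half
        out.append((mark, (hi << 3) | lo))
        i += 2
    return out


def unpack(chars, count):
    """Expand the pairs back to halves; count trims any padding."""
    halves = []
    for mark, ch in chars:
        if mark:
            halves.append(0)      # restore the suppressed zero
        halves.append(ch >> 3)
        halves.append(ch & 7)
    return halves[:count]


original = [0, 5, 0, 0, 3, 1, 0]
packed = pack(original)
restored = unpack(packed, len(original))
```

In this example seven halves fit into three characters instead of the four that plain pairing would need, and they expand back to the original sequence. The saving only pays off if zero halves are common, which is plausible when the stored values are small remainders.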
Having achieved sufficient compaction in my software I now have to be equally adept at compacting the necessary logic into my hardware, as I only have sufficient backplane space for 200 logic boards even though I have almost a thousand boards to hand. It would be very nice to find another H-200 backplane somewhere, even a small one out of a Honeywell disk drive or elsewhere. Even compatible individual edge connectors (single-sided, 0.125 inch pitch, 40 pin) don't seem that easy to find now. Well, it wouldn't be fun if it was too easy.