Microsoft BASIC disassembly or listing?

Trixter

There are sites and posts in various corners of the internet that have source code listings or code reviews of things that are of historical interest: DOS 1.0, the Doom engine, etc. It just dawned on me that I've never seen anything for the original Microsoft Cassette BASIC (i.e., the BASIC in ROM of the 5150 and later). Does anyone know if the source has been published, or whether a commented disassembly has been done?

(If not, I might take a stab at it, but was hoping someone had already done the work.)
 
I have part of it done in IDA, but it's far from complete. I tried attaching my IDA project file in a zip folder, but it exceeds the file limit. Honestly, I'd say go for it and do it yourself, even if you have to learn IDA; it'll never harm you to have spent time playing with it ;).

And besides, if I can semi-disassemble an 8 KB file in a week (the COMPAQ Portable BIOS), I'm sure you can do a 32 KB file faster, assuming you know what you're looking for.
 
With the BIOS, I know what I'm looking for since I have the source and it's 8K. But the BASIC is 32K and much more convoluted, with a lot of space-saving tricks, open hooks and such. Thank goodness there can't be any self-modifying code by the sheer fact that it's in ROM, but still, it's spaghetti. That's why I was hoping someone else had already tackled it.
 
Source code for the IBM Cassette Basic would be very useful, especially for making some tweaks to the I/O. I have started to view the ROM Basic on the IBM as having the potential to become an operating system of sorts that does not require DOS. That is one of the reasons I am currently working to remove the DOS calls in Version 1.1 of the Palo Alto Tiny Basic (<3K of code!). I even tried using the original BASCOM compiler to see if it would create code without DOS calls, but the generated code was full of INT 21H calls.
 
If you know the entry point to BASIC (I don't offhand but had to in order to get started when disassembling it), you can tell IDA to interpret that address as code, and it will disassemble instructions until it stops based on heuristics ("is this next instruction likely to be code or data?"). It will also automatically find all code that can be jumped to from the current code block that was disassembled based on relative and absolute jumps.

I personally would then look for any indirect jumps that combine a base address with BX or BP, check whether the base and BX/BP point to a jump table, and if so tell IDA to interpret the jump table as data. Obviously, every address within the jump table would lead to another code block which more than likely references other code blocks :D.
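As a rough illustration of the jump-table step, here is a minimal Python sketch that decodes a suspected table outside of IDA. It assumes the table holds 16-bit little-endian near offsets measured from the start of the ROM image, which may not match how BASIC actually lays out its tables; `read_jump_table`, `looks_like_jump_table` and the demo bytes are made up for the example.

```python
import struct

def read_jump_table(rom, table_offset, entry_count):
    """Decode entry_count 16-bit little-endian words starting at table_offset."""
    end = table_offset + 2 * entry_count
    if table_offset < 0 or end > len(rom):
        raise ValueError("table runs outside the ROM image")
    return list(struct.unpack("<%dH" % entry_count, rom[table_offset:end]))

def looks_like_jump_table(rom, targets):
    """Crude sanity check: every target must land inside the image."""
    return all(t < len(rom) for t in targets)

# Synthetic 32-byte "ROM" for demonstration only, not real BASIC data.
demo_rom = bytearray(32)
demo_rom[16:22] = struct.pack("<3H", 0x0004, 0x0008, 0x001F)
targets = read_jump_table(bytes(demo_rom), 16, 3)
print([hex(t) for t in targets])  # ['0x4', '0x8', '0x1f']
```

Each decoded offset is then a candidate entry point to mark as code. The range check only weeds out obvious non-tables; it cannot prove that a run of words really is a table.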

If you open up a hex editor and see 'PSQRV' or some similar combination of those letters, that is more than likely code: the ASCII codes of 'PSQRV' and 'TU' (bytes in the range 0x50 to 0x57) are the single-byte opcodes for the register PUSH instructions on the 8088.
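The same hex-editor trick can be automated. The sketch below scans a byte string for runs of bytes in the 0x50 to 0x57 range (PUSH AX through PUSH DI, i.e. the letters P through W). The function name and the minimum run length of 3 are arbitrary choices for the example, and a hit is only a hint, since uppercase text in a message table would match too.

```python
def find_push_runs(data, min_len=3):
    """Return (offset, text) pairs for runs of at least min_len bytes in 0x50..0x57."""
    runs = []
    start = None
    for i, b in enumerate(data):
        if 0x50 <= b <= 0x57:
            if start is None:
                start = i          # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, data[start:i].decode("ascii")))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, data[start:].decode("ascii")))
    return runs

# Made-up sample bytes, not taken from the BASIC ROM.
sample = b"\x00\x00PSQRV\xc3TU\x90"
print(find_push_runs(sample))             # [(2, 'PSQRV')]
print(find_push_runs(sample, min_len=2))  # [(2, 'PSQRV'), (8, 'TU')]
```

To try it on a real dump, read the file with `open(path, "rb").read()` and pass the result in; the offsets it reports are good places to start telling the disassembler "this is code".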
 
Helpful hints (that I already knew, but are very helpful to others following the thread). What I meant about the difficulty with entry points and such is that there is a mixture of tables and code throughout the segment, lots of space-saving measures (like the fall-through examples you saw in the Compaq BIOS), and other interesting choices like using custom INTs instead of near CALLs... it's very hack-y, which makes it more difficult to follow than more traditional sources. But I'm not surprised, as MS was trying to make the most of the space it had.

I realize I don't get to pick the difficulty level of what I want disassembled :)

I need more practice with IDA and jump tables, as I always seem to make a mess of them.
 