
"Fun with BIOS" Programming Thread

In order of importance:

  1. Invoke on keypress - implies break/step anywhere under any circumstances
  2. Step through/over subroutines (I think that's the exact reason int 0x01 and 0x03 exist)
  3. Allow loop to finish, then break (a.k.a. 'break when CX = 0')
  4. Peek and Poke
  5. Break 'x' instructions from current IP. Gets around breakpoint problem on a small scale :p.
  6. (Optionally) create a new stack frame and jump to arbitrary code (akin to g=c800:0005, for example).

I only have 8kB to work with of course... so it's in order of importance.

If I keep using software interrupts, it's a given that some minimal number of BIOS routines will need to be implemented. I can always fake some of them while developing (for example, I could return a conservative value for the amount of memory).

That said, except for screwing with video routines, I do not see how breaking within an ISR will require me to perform my own custom I/O... of course, I could be missing something important, and it's nearly midnight where I am. By design, the debugger will destroy the screen contents (on invocation!).

For games, this might not be a big deal since the screen will likely be updated again after debugging is done (and yes, the current video mode will have to be saved too). For text user interfaces... probably not so much. It might be possible to circumvent this by using a serial port, but I can't be sure.

Or, I could just use the glyphs within the ROM BIOS itself at all times and say 'screw using BIOS interrupts for output' :p.
 
Not trying to be pedantic, but I'd like to know exactly what capabilities you plan on implementing.

Pedantic is good. I genuinely appreciate your interest in this delusion of mine and/or guidance to prevent me from spectacularly crashing and burning.
 
Since you're not doing any file I/O, 8K should be quite doable. Take a look at the source for DDT86, for example. Leave out the file-oriented stuff and it comes pretty darned close.
 

File I/O can be handled by the OS- I can include a stub program which triggers INT3 after the OS loads the file. As an analogy, the debugger is not meant to run under control of the OS like MSDOS debug, nor should the OS run inside the debugger like SoftICE. Rather, the debugger should be completely detached from the OS and not give a damn about its existence.
 
Also remember that DDT86 in its 14K footprint not only includes file I/O, but also an assembler and disassembler--and both are written in PL/M. 8K should seem like a walk in the park if you don't need the assembler and file operations. The disassembler, however, might be very useful.
 
http://www.ctyme.com/intr/rb-2939.htm#Table1595

Currently, I'm making two programs for the debugger - one being the debugger proper, and the other a DOS wrapper which loads (but does not execute) a program, patches the first instruction with INT 0x03, invokes the debugger, and then executes the first instruction of the program. When the program ends, the wrapper will (somehow - haven't worked out the details yet) make sure the debugger is stopped. This provides a means to use the ROM debugger within the context of an OS process. If I had the time, I'd also love to see such a wrapper work on MINIX or XENIX.

The link above shows that I use INT 0x21, Function 0x4B to start a program. I already have the command tail, so assuming a correctly parsed command line, I already have the filename to execute. However, I'm not quite sure what the DOS parameter block pointed to by ES:BX means. Can anyone give me some insight on how to set up such a data structure in the absence of a book on DOS internals? I don't know much about File Control Blocks and the like, other than that CP/M used them and they are related to the Disk Transfer Area (?).

Additionally, is starting a loaded program as simple as jumping to its first instruction after the load? Or is it my responsibility to set all the registers to the values they would take at program start before execution? It seems odd that all DOS versions besides 2.0 preserve the registers, since in order to load and not execute, one would need to make sure the registers have the correct values before deciding it's time to execute the loaded program.
 
I found this website resource this morning, with many development tools for embedded applications. Of special interest is the zip file XTOOLS, which includes monitors/debuggers for many different processors. Here is the link:

http://www.classiccmp.org/dunfield/dos/index.htm

From the readme file in XTOOLS:

MONITORS:
- 68HC08, 6809, 68HC11, 68HC12, 8051/52, 8080/85, 8086, 8096, Z80 & AVR.
- Well documented ASM source code.
- Completely stand-alone, runs on the bare hardware.
- Very compact, occupies less than 8K of ROM.
- Built in disassembler (except AVR)
- Edit/Dump Memory, Processor registers and Interrupt vectors.
- Multiple breakpoints, which are completely transparent to the user
program, and remain effective until removed.
- Software Single-Step works even when tracing ROMed code.
- Download INTEL or MOTOROLA format hex data.
- Revector any/all interrupts to the user program.
- Online help display of commands and syntax.
- Many more features *** Features may differ in some monitors.

Also of interest is the file EMBEDPC, which allows creation of a custom disk boot loader to completely skip DOS when running an application. Interesting stuff...
 
Working with the assembly source code from Dunfield Development Systems (from the location given in my last post), I was able to create an executable of the monitor/debug program that seems to work OK running in DOSBox. I also put the short user manual inside the zip file. In the interest of full disclosure... I have never used the series of tools required (assembler: ASM86, macro formatter: MACRO, hex formatter: HEXFMT), so it is entirely possible I have done something incorrectly. My next step will be to try to burn the program into ROM for the PC.
 

Attachments

  • mon86.zip
    7.3 KB

The past few weekends for me have consisted of the following:
10% required tasks-to-live sustainably
5% coding
85% RTFM

I've been reading a large number of manuals... The Phoenix BIOS manual, one from American Megatrends, the IBM Technical References, SCons manual and API, and lastly the MASM manual and the 50,000 extremely useful knowledge-base articles that Microsoft has kept on MASM over the years.

As of now, my little debugger draws to the screen, and a list of opcodes has been formatted and stored in memory. Slow progress, but I hope the time I spend doing reading up front will make the coding less painful.
 
Question for Chuck,

Why does MASM reserve space for a stack/bss if they are not the last segments declared in a program? Shouldn't MASM know not to increment the program counter for segments which don't actually contribute data? I suppose .fardata? and .data? may make sense so that programs have scratchpad memory "built-in" when loaded... but why the stack? The only good reason I can think of is so that a valid stack space actually exists within the program when the stack would otherwise eat away at other data.

To make matters more confusing... if .stack is declared before .fardata? (bss), but after everything else (see the program below), the stack data is not emitted either. Presumably, this is because .fardata? doesn't actually emit any code either, and MASM only emits code when data/code has been explicitly defined?

EDIT to the above two paragraphs: This is because the '?' initializer does in fact allocate space for variables, according to the bottom of page 87 of the MASM 6.1 manual... the assembler just doesn't initialize it. Still, this doesn't answer my question about why the data is allocated in certain contexts but not others.


Curiously enough, unless I'm missing something, the MASM manual only gives two hints that .data?, .fardata?, and stack can have initialized data (and by extension, have data emitted in the executable) at all!

Page 42 (Emphasis by me):
You can also define the .DATA? and .CONST segments that go into DGROUP
unless you are using flat model. Although all of these segments (along with the
stack) are eventually grouped together and handled as data segments, .DATA?
and .CONST enhance compatibility with Microsoft high-level languages. In
Microsoft languages, .CONST is used to define constant data such as strings
and floating-point numbers that must be stored in memory. The .DATA?
segment is used for storing uninitialized variables. You can follow this
convention if you want. If you use C startup code, .DATA? is initialized to 0.

Page 51:
The linker processes object files in the order in which they appear on the
command line. Within each module, it then outputs segments in the order given
in the object file. If the first module defines segments DSEG and STACK and
the second module defines CSEG, then CSEG is output last.


Also, could you do me a favor and assemble and link the following program in MASM (I don't have a comparable version), and tell me the order in which segments are linked. Assume STACK_AT_BEGINNING is defined:

Code:
.model small

;Bug? Page 46 of WATCOM Linker manual.
;Order of segments is unexpected.

ifdef STACK_AT_BEGINNING
	.stack
endif

.code
	.startup
	.exit
.data
	db '.data'
	
.fardata?
	far_bss_data dw ?

ifdef STACK_AT_MIDDLE
	.stack
endif

full_seg segment byte public 'FULL_SEG'
	db 'full_seg'
full_seg ends


ifdef STACK_AT_END
	.stack
endif

end

According to page 46 of the WATCOM linker manual, if DOSSEG is not specified in the program OR as a special linker option, segments are output in the order they are declared...

Except they aren't. Even though I declared .stack first, it is output after _TEXT (.code) and _DATA (.data) but before FAR_BSS (.fardata?) and FULL_SEG (full_seg), according to the map file.

I wonder if similar behavior exists in MASM? The only thing I can think of is something on page 34 of the MASM programmer's guide:
The .STACK directive opens and closes the stack segment but does not close the current segment.
 
I got it!


Well, today was a good day in the world of vintage programming... Murphy's Law was semi-violated.

Not only did I get the Option ROM to boot successfully in Bochs (haven't tried real hardware), I now have working - rudimentary, but working - breakpoint int 0x03 and single-step int 0x01 interrupt handlers. I should have something worth burning to an EPROM in a few days. But I've still got some ways to go, and even further to go for the BIOS. Nevertheless, this is good practice for writing assembly, and should help me when writing a BIOS on real hardware.

During boot, the ROM debugger hooks ints 0x01, 0x03, and 0x09, and tests that the breakpoint/single-step handler works. Int 0x03 sets up the trap flag, and is artificially invoked just before the end of the boot routine. In this booting context, int 0x03 runs without patching an instruction or decrementing the pushed instruction pointer (i.e. after the interrupt, the instruction where int 0x03 was will not be retried). In this case, int 0x03 acts just like any other software interrupt.

The int 0x01 handler prints a formatted string with the registers, but doesn't currently print the values, as I don't have a bss ('global' variable) data area set up. Keyboard interrupt (0x09) doesn't work for a similar reason- I have no reliable area to place the address of the old interrupt handler.

For a stack, I'm using the BIOS stack during boot, and just assuming the bss_seg is invalid and going without for now. I might just go ahead and have the debugger use the caller's stack segment, and assume SS is valid no matter the context in which the debugger is called. That can't possibly backfire, right?

One cool thing I've noticed is that if I keep coding this the way I am now, the debugger is completely position-independent code. You can burn it to an EPROM at any place in the upper memory area, provided it is on a boundary where an option ROM is expected to have its header.
 
Well, remember that the area around address 500-5FF is only used by BASIC, so it's probably safe to use for the debugger.

As far as memory size goes, you don't need to call INT 12H--just read location 00413-00414 for memory size in K.

According to this:
http://stanislavs.org/helppc/bios_data_area.html

0x500 to 0x5FF is not guaranteed to be unused by DOS, DOS commands, and of course BASIC :p. Though about 10 bytes are free for a POST scratch area, that unfortunately is not quite enough for my uninitialized variables. Do you know of a resource which gives a comprehensive memory map of DOS from the IVT to the start of COMMAND.COM?


I should be ashamed of the following paragraph, part of my original post:

What I COULD possibly do is declare 0x0300:0x0000 as heap space for variables and just use the caller's stack from when the debugger was invoked as the debugger stack. I would have to write a very small memory manager, though, as there is at least one routine that must return a pointer to allocated data (16-bit hex to ASCII-representation conversion).

First off, it's 0x0030:0x0000. Second off, that's still part of the IVT... which I should know darn well by now.
 
Well, that's always going to be your problem. Originally, 500-5FF was defined as "Program communication area". That really didn't last long as this shows. Pretty much, everything's spoken for.

I suppose that you could appropriate the top 1K or so of RAM at boot, but some folks may not consider that to be a viable option.
 

I just checked in the VM Bochs... DOS stores stuff up at the 639k mark or so (bits and pieces of strings and help text) for some odd reason... I'll take a snapshot of it when I get the chance.
 
You can do what Compaq did and record the input and output of every action of the BIOS, by typing or invoking each one - a legal, "never actually reverse-engineered or even looked at the source" approach. So now you can modify it to your wishes.

Functionality of the IBM BIOS was not determined by looking at IBM code - this was banned. In fact, functionality was determined by a process known as "black boxing", which involved treating the BIOS as a black box and feeding every possible input to it and recording the output.
 

The point is that DOS will use up to whatever value INT 12H returns. Hook that interrupt, knock a K off of the reported number and pass it on. You should be fine.
 

Alright sounds good. Just one other q... do you know if XENIX uses int 0x12? I don't have it, but I'm curious nonetheless if I can make the debugger truly detached from one's choice of OS.

As for games which eat all memory... I'll cross that bridge when I come to it.
 
SO close!

Okay, I'm a bit stumped on this one... anyone know of a good method to go about coding a hex-to-ASCII routine? The routine always outputs to a constant location, but that need not be the case.

Not asking someone to code it for me, but... my naive implementation is the following (and it doesn't work either - the output is garbage):
Code:
;Take number in AX and convert it to ascii hex. Store result in
	;hex_buffer_word.
	;Assumes: ES points to the correct segment
	hex_to_ascii proc near
		;Improvement: Caller passes a pointer as parameter using C-calling
		;convention. What's the easiest way to get far pointer on the stack?
		;push ds, push si?
	
		mov dx, ax
		and ax, 0x000F
		add al, 0x30 ;add ascii bias
		
		;Assembler-time constant... use base-addressing.
		mov byte ptr es:[hex_buffer_word], al
		
		mov ax, dx
		shr ax, 1 ;Will become loop- therefore ax, cl form can't be used.
		shr ax, 1
		shr ax, 1
		shr ax, 1
		and ax, 0x000F
		add al, 0x30
		mov byte ptr es:[hex_buffer_word + 1], al
		mov ax, dx
		shr ax, 1
		shr ax, 1
		shr ax, 1
		shr ax, 1
		and ax, 0x000F
		add al, 0x30
		mov byte ptr es:[hex_buffer_word + 2], al
		mov ax, dx
		shr ax, 1
		shr ax, 1
		shr ax, 1
		shr ax, 1
		and ax, 0x000F
		add al, 0x30
		mov byte ptr es:[hex_buffer_word + 3], al	
		ret
	hex_to_ascii endp
 
I have been using DEBUG to look at the first 1K of memory (that is, the Interrupt Vector Table) to see what's there on the various vintage and new computers I use daily... I have noticed that the space is (at most) about half full. I have only seen some data at the top of that IVT space on one computer: an old Pentium running Windows 95. However, so far I have never seen anything at locations 0000:0300 to 0000:0370. As you have noted, in general the lower memory areas look like the Wild Wild West.
 