MASM/WASM segments COMMON and absolute positioning

cr1901 · Jul 21, 2013

I have two pieces of code in two separate source files using WASM, assuming at least MASM 3.0 (or so) compatibility. I think the code speaks for itself as to what I'm trying to do (just for amusement for now, really):

First source file bios_start.asm:

Code:

public bios_org
biosseg segment byte common 'bios_start'
	org 0xe000
bios_org:
	inc ax
	inc ax ;Dummy instructions
	inc ax
biosseg ends

end

Second source file org_jmp.asm:

Code:

extrn bios_org:near
biosseg segment byte common 'org_jmp'
	org 0xfff0
	DB	0EAH		; HARD CODE JUMP
	DW	bios_org	; OFFSET
	DW	0F000H		; SEGMENT
	db '07/21/13'
	db 0x0ff
	db 0x00
	db 0x00 ;Checksum
biosseg ends
end

According to a book I have that covers MASM 5.0 or so (and this should be MASM 3.0 compatible, so the Watcom Assembler should work), the COMMON combine type should make all segments be overlaid on each other relative to a common address.

Of course, because NOTHING in MASM syntax EVER works as advertised (MASM's lack of consistent documentation and features not working as intuitively expected is the reason I believe 8088 is so difficult, not the assembly language itself), instead of the segments being overlaid on each other, the segments of code are laid out one after another in memory, so bios_start is at 0xe000 and org_jmp is at 0x1dff3... the second segment is relative to the first, when both should be relative to a single address (0xe000) and overwritten if necessary.
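
The 0x1dff3 figure is consistent with the two pieces being concatenated rather than overlaid. A minimal C sketch of that arithmetic follows, assuming the first piece is 0xe003 bytes long (the `org 0xe000` plus three one-byte `inc ax` instructions). The function names are hypothetical and only model the two placement policies; they are not part of any linker.

```c
#include <assert.h>
#include <stdint.h>

/* Length of the first piece: org 0xe000 followed by three 1-byte INC AX. */
uint32_t first_piece_len(void)
{
    return 0xE000u + 3u;
}

/* Hypothetical model of concatenation: the second piece starts right after
   the first, so its ORG is counted from the end of the first piece. */
uint32_t placed_concatenated(uint32_t prev_len, uint32_t org)
{
    assert(org <= 0xFFFFu); /* ORG is a 16-bit offset */
    return prev_len + org;
}

/* Hypothetical model of a COMMON overlay: every piece shares one base,
   so only the ORG decides where the bytes land. */
uint32_t placed_overlaid(uint32_t base, uint32_t org)
{
    assert(org <= 0xFFFFu);
    return base + org;
}
```

With these assumptions, `placed_concatenated(first_piece_len(), 0xFFF0)` gives 0x1DFF3, the address observed in the post, while `placed_overlaid(0, 0xFFF0)` gives the intended 0xFFF0.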

Interestingly enough, attempting the following:

Code:

org 0xe000
public bios_org
biosseg segment byte common 'bios_start'
bios_org:
	inc ax
	inc ax
	inc ax
biosseg ends

end

causes WASM to raise no error, but it also generates no output file at all... it would be nice to simply have the org directive update all labels' offsets without generating 0xe000 bytes of zero-fill.

I'm really not sure what I'm doing wrong... all I want is for my first source file to be placed at offset 0xe000 and the second to be placed at 0xfff0 in a single output file. If it's possible, I also want every label to have 0xe000 added to its offset, so that the leading zero-fill can be dropped and the file size reduced to 8192 bytes.
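
The 8192-byte figure is the top of the 64 KiB segment minus the 0xe000 starting offset. A minimal C sketch of the mapping, assuming the image is trimmed so that file byte 0 corresponds to offset 0xe000; `rom_image_size`, `rom_file_offset` and the two constants are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_SIZE 0x10000u /* one 64 KiB real-mode segment */
#define ROM_BASE     0xE000u  /* offset of the first ROM byte (from the post) */

/* Size of an image covering ROM_BASE through the end of the segment. */
uint32_t rom_image_size(void)
{
    return SEGMENT_SIZE - ROM_BASE;
}

/* Map an in-segment offset to its position in the trimmed image. */
uint32_t rom_file_offset(uint32_t cpu_offset)
{
    assert(cpu_offset >= ROM_BASE && cpu_offset < SEGMENT_SIZE);
    return cpu_offset - ROM_BASE;
}
```

Under these assumptions the code at 0xe000 is the first byte of the file and the jump at 0xfff0 sits at file offset 0x1ff0.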

The order of files into the linker, if it matters, is the same as how I labelled the source files. My linker options correctly generate a raw binary output:

Code:

wlink FILE {bios_start.obj org_jmp.obj} FORMAT RAW BIN NAME bios.bin OPTION NODEFAULTLIBS, START=bios_org,  MAP=bios, STACK=0

Chuck(G) · Jul 21, 2013

Why are you using different class names in your segment directive? Here's an example (even more simplified) and it works as expected.

Code:

seg1	segment		byte common "classy"

	public		startit

startit:
	inc		ax
	inc		ax
	int		20h

seg1	ends
	end

Second file:

Code:

seg1	segment		byte common "classy"

	extern		startit:near

	org		0fff0h
	jmp		startit

seg1	ends
	end

And the load map:

Code:

LINK : warning L4021: no stack segment

 Start  Stop   Length Name                   Class
 00000H 0FFF2H 0FFF3H SEG1                   CLASSY

  Address         Publics by Name

 0000:0000       startit

  Address         Publics by Value

 0000:0000       startit

Everything is where it should be. I don't know if it matters, but I'm using MASM/ML 6.11d.

MASM is pretty good--I've done some very hairy stuff with it over the years. About all that I'd really have liked to see is separate location and address counters, so I could assemble code that's to be moved at runtime.

cr1901 · Jul 22, 2013

Well, that worked... now to get rid of the 56 kilobytes of leading zeros... any directives which could help with that, assuming that bios_start/startit is the entry point?

UPDATE: No, it didn't work... either the first segment gets eaten by the second segment (all zeros), or the linker finishes with no errors but doesn't produce an executable at all... I NEED to be able to link raw binaries for this, and wlink doesn't want to listen.

Chuck(G) said:
Why are you using different class names in your segment directive.

MASM is pretty good--I've done some very hairy stuff with it over the years. About all that I'd really have liked to see is separate location and address counters, so I could assemble code that's to be moved at runtime.

The book I was using said that when you have COMMON combine segments, the names (i.e. biosseg) need to be the same, and that while segments with the same class names are loaded together, they are mainly used for debugging symbols. I wasn't aware both the segment and class names have to be the same.

Maybe back then (what is it that you do for a day job anyway?!) there were better references on MASM...

Chuck(G) · Jul 22, 2013

Well, give MASM 6.11 a try--it's floating around the web; Microsoft has distributed it FOC in some of their DDKs and as far as I know, it's a freebie with MSC.

I did manually verify that my code produces the appropriate binary:

Code:

-u7f0:0
07F0:0000 40            INC     AX
07F0:0001 40            INC     AX
07F0:0002 CD20          INT     20
07F0:0004 0000          ADD     [BX+SI],AL
...

-u7f0:fff0
07F0:FFF0 E90D00        JMP     0000
07F0:FFF3 0000          ADD     [BX+SI],AL
07F0:FFF5 0000          ADD     [BX+SI],AL
...

007F0 is the segment where the image is loaded. You should be happy that you're not trying to use MASM 1.0.
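
The `E90D00` bytes at FFF0 in the listing are a near jump with a 16-bit displacement of 0x000D. The target is the offset of the next instruction plus the displacement, wrapped to 16 bits, which is how a jump from FFF0 reaches 0000. A minimal C sketch of that arithmetic (`near_jmp_target` and `check_listing` are hypothetical helper names):

```c
#include <assert.h>
#include <stdint.h>

/* Target of an 8086 near JMP rel16 (opcode E9): the displacement is added
   to the offset of the following instruction, modulo 64 KiB. */
uint16_t near_jmp_target(uint16_t insn_offset, uint16_t disp)
{
    const uint16_t insn_len = 3; /* E9 plus two displacement bytes */
    return (uint16_t)(insn_offset + insn_len + disp);
}

/* Self-check against the DEBUG listing quoted in the post. */
void check_listing(void)
{
    assert(near_jmp_target(0xFFF0u, 0x000Du) == 0x0000u);
}
```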

cr1901 · Jul 22, 2013

What does your linker command line look like? And how do you get it to output a raw binary?

Chuck(G) · Jul 22, 2013

Just

LINK X1+x2;
EXE2BIN X1.EXE X1.BIN

cr1901 · Jul 22, 2013

I forgot about the EXE2BIN tool... that may make some things easier now that I know that tool works (and apparently it was used in the past to develop BIOS- the Generic XT BIOS also uses it as I now recall).

I was intending to write my little toy BIOS partially in C... as an exercise for mixing C and assembly, and also as a supplement to understanding DOS and BIOS internals.

The problem I perceived with EXE2BIN is that I cannot guarantee that my C compiler won't do something I don't expect with my code unless I explicitly tell the linker to use 'Raw Output'... I don't have access to the std libraries when using C.

This already happened actually... the WATCOM C Compiler defines a symbol based on your current memory model to discourage improper mixing of models I suppose, akin to name mangling... you're expected to read the compiler manual to determine how to mix memory models. Well, I disabled all standard libraries but forgot to enable one just so that single external symbol can be resolved.

Speaking of manuals, the WATCOM linker manual gives an example of how to order segments explicitly for a ROMable device. This linker script is what works for me:

Code:

  name bios                  # name of resulting binary
 OPTION NODEFAULTLIBS, MAP=bios, STACK=0, quiet
 system dos              #type of binary (ELF, PE, MZ, PharLap, ...)
 output raw offset=0xe000 	#Removes MZ header and adds '.bin' extension. Possibly undocumented usage of offset.
 #output raw offset=0xfe000	#Use if absolute addressing is necessary
 file bios_main.obj		#C code file consisting of the main bios
 file bios_start.obj
 file org_jmp.obj
 library clibs.lib	#Resolve _small_code_ symbol
 # add more obj files here
 
 order # in which order should segments be put into binary
     clname BIOS_START offset=0xe000 segment bios_start offset=0xe000
     clname CODE segment _BEGTEXT segment _TEXT #Order: Class, segment (CODE is a Watcom-specific class for C code)
     clname BIOS_END offset=0xfff0
     
     #Use if absolute addressing is necessary
     #clname BIOS_START offset=0xe000 segment bios_start segaddr=0xf000 offset=0xe000
     #clname CODE segment _BEGTEXT segment _TEXT
     #clname BIOS_END segment org_jmp segaddr=0xf000 offset=0xfff0
     
     #clname DATA
     #clname STACK 
     #clname BSS #Not implemented yet

I'll update this as I refine it. My only question for now is- where exactly is the stack located for the BIOS as it boots (after DRAM and DMA is initialized)?

Contrary to what I thought, it appears that directly manipulating the addresses of segments is compiler specific, even among the DOS compilers, which did a great job of mutilating ANSI C with their platform-specific headers.

Chuck(G) · Jul 22, 2013

Wasn't there an Annabooks publication that detailed how to write your own BIOS in C?

In the 5160/5150, SS:SP is initialized to 0030:0100, so you get 256 bytes to do your fooling around.
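
For reference, a real-mode segment:offset pair resolves to a linear address of segment times 16 plus offset. A minimal C sketch, assuming the 20-bit address bus of the 8088; `linear_addr` and `check_addresses` are hypothetical names, and the ROM addresses are the ones used earlier in the thread:

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode linear address: segment * 16 + offset, kept to 20 bits. */
uint32_t linear_addr(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}

void check_addresses(void)
{
    assert(linear_addr(0x0030u, 0x0100u) == 0x00400u); /* initial SS:SP */
    assert(linear_addr(0x0030u, 0x0000u) == 0x00300u); /* lowest stack byte */
    assert(linear_addr(0xF000u, 0xE000u) == 0xFE000u); /* start of the ROM */
    assert(linear_addr(0xF000u, 0xFFF0u) == 0xFFFF0u); /* far jump at the top */
}
```

With SS:SP at 0030:0100 the stack grows down from linear 0x400 toward 0x300, which is where the 256 bytes come from.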

cr1901 · Jul 22, 2013

Chuck(G) said:
Wasn't there an Annabooks publication that detailed how to write your own BIOS in C?

In the 5160/5150, SS:SP is initialized to 0030:0100, so you get 256 bytes to do your fooling around.

Yes, and I even started with the code from that book, but stopped when I realized that I'd have a hell of a time getting the supplied Makefile to work properly. That, and converting code from a Microsoft Word document isn't what I would call enjoyable.

I may go ahead and use that code as a base anyway, but convert it for my own means.

cr1901 · Jul 23, 2013

A BIOS that... beeps the speaker.

Well, I had my first success tonight... I got the PC emulator PCem to boot the BIOS and beep the PC speaker, and then halt!... well it's a start:

bios_main.c ... combine with org_jmp.asm and bios_start.asm and the linker file into a raw binary. In fact, that probably won't work as intended, so I'll just upload a zip file when I clean up the code.

Code:

#include <dos.h>
#include <conio.h>

char __far  * const mono_seg = (char __far  * const) MK_FP(0xB800, 0000);

void __cdecl bios_main(void);
void __cdecl beep(void);

void __cdecl bios_main(void)
{
	register int i;
	//register int j = 0;
	/* for(i = 0; i < 5; i++)
	{
	}

	for(i = 0; i < 255; i++)
	{
		*(mono_seg + i) = 'A';
	} */
	
	/* Timer channel 2 cheap code steal */
	outp(0x43, 0xB6);
	outp(0x42, 0x33);
	outp(0x42, 0x05);
	/* __asm
	{
		mov al, 0xB6
		out 0x43, al
		mov al, 0x33
		out 0x42, al
		mov al, 0x05
		out 0x42, al
	} */
	
	beep();
	/* Set up PIT channel 1 for strobe  mode */
	//outportb(0x76, 0x40); //dx al <-- code is wrong
}


/* Use special calling convention for this/other routines? */
/* No-stack-or-RAM calling convention- input bl, number of 250ms intervals */
void beep(void)
{
	/* Signed vs unsigned... it matters! */
	register unsigned int delay, count;
	
	/* turn on speaker- example from WATCOM clib.pdf, page 399 */
	outp(0x61, inp( 0x61 ) | 0x03);
	
	for(count = 4; count > 0; count--)
	{
		for(delay=65535; delay > 0; delay--)
		{
			
		}
	}
	
	outp(0x61, inp( 0x61 ) & 0xFC);
	/* __asm{
		mov bl, 0x04
		in al, 0x61
		mov ah, al
		or al, 0x03
		out 0x61, al
		
		sub cx, cx
	sound_loop:	
		loop sound_loop
		dec bl
		jnz sound_loop
		
		mov al, ah
		out 0x61, al
	} */
}

Unfortunately... compared to the hand-assembled version, the code is twice as long... AND slower o.0;... Calls to outp/inp are not optimized to their immediate-operand counterparts. The compiler generates code that uses the full general-purpose register width when a byte would suffice (again, this partially has to do with outp/inp). And the loop, for whatever reason, takes more than a second (the beep in the hand-optimized code is 4 loops of 250ms each)... I haven't quite figured that one out, but I'll take a look when I'm less tired.
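
As a side note on the timer setup in bios_main: the bytes 0x33 and 0x05 written to port 0x42 form the 16-bit divisor 0x0533, and the control byte 0xB6 selects channel 2 in square-wave mode with low byte then high byte access. A minimal C sketch of the resulting tone frequency, assuming the nominal 1.193182 MHz PIT input clock of the PC; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ 1193182UL /* nominal 8253 input clock on the PC */

/* Combine the low and high bytes written to port 0x42 into the divisor. */
uint16_t pit_divisor(uint8_t lo, uint8_t hi)
{
    return (uint16_t)(((unsigned)hi << 8) | lo);
}

/* Square-wave frequency in millihertz, to avoid floating point. */
uint32_t pit_millihertz(uint16_t divisor)
{
    assert(divisor != 0); /* a programmed 0 means 65536; not modelled here */
    return (uint32_t)((PIT_INPUT_HZ * 1000ULL) / divisor);
}
```

For the values in the post, `pit_divisor(0x33, 0x05)` is 1331, which works out to a tone of roughly 896 Hz.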

So... why do it in C? Well, to fill in a gap in my technical knowledge of combining C and assembly for use in a bare-metal application (and WITHOUT an IDE doing something 'magical' and me not questioning the computer). Additionally, I somehow think that doing it in C might make the code more modular, so I can swap out modules (i.e. source files) of code depending on the attached hardware, like Turbo board vs non-turbo board, V20 vs non-V20, which was an issue with Generic XT BIOS.

Chuck(G) · Jul 23, 2013

Whose C are you using? MSC with full optimization, while not as good as assembly, still does a fair job.

cr1901 · Jul 24, 2013

I'm using OpenWatcom C, which is basically the real mode Microsoft C Compiler and the Visual C Compiler in one. I don't have any special options enabled, and the documentation isn't clear about which optimizations are enabled by default.

I KNOW that some optimizations are enabled, because disabling optimizations generates code four times as large as assembly XD, as opposed to twice as large. Turning on optimize for space does not change the file size, but causes some opcodes to be rearranged.

Turning on maximum optimizations, based on the compiler documentation's recommendations for 8088 code, generates code that is comparable in length to the equivalent assembly... I just hope that it didn't break the flow control while it did that (I know gcc can break code using max optimizations). Analyzing the assembly, it's worth noting that the compiler took out at least one function call, which saves more space than just using assembly alone (at the cost of losing the subroutine). It's definitely a tradeoff between optimized compiled code and the generated code resembling one's intent (in my case, the missing subroutine).

Chuck(G) · Jul 24, 2013

On MSC 8.00d, it's pretty safe to use /Ox (be sure to include /W3 to catch any obvious goofs) and /Zp.

cr1901 · Jul 25, 2013

Chuck(G) said:
On MSC 8.00d, it's pretty safe to use /Ox (be sure to include /W3 to catch any obvious goofs) and /Zp.

Why is packing to 1-byte a good idea? Because the 8088 takes in only one byte at a time (WATCOM includes a CL.EXE clone)?

Really, you should try WATCOM... I bet it's better than programming on a DOS machine XD. At least with a Windows compiler that can target real mode DOS, I can code in one window, and look up things I need to know on another!

Chuck(G) · Jul 25, 2013

MSC comes with an IDE if you want to use it, and the latest versions of the compiler require at least PM support. For 16-bit x86 code, I've used Microsoft C since it was Lattice C (I still have the old compiler around).

I'm old enough that I don't use an IDE. Even for MCUs, I use a simple command-line interface compiler. You forget that I'm old--coding on paper and staring at listings works for me.

/Zp will mean that all members of a structure will be on byte address boundaries, instead of padding odd-length fields before a word or doubleword. Saves a bunch of space, and--in the case of BIOS structures, is important because you have to conform to a standard interface--so you don't have the luxury of putting words on word boundaries. On an 8088 system, there's no speed penalty, of course. There's also a #PRAGMA() for turning it on and off.
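
To illustrate the effect of byte packing on structure layout, here is a minimal C sketch. It uses `#pragma pack(push, 1)`, a spelling accepted by GCC, Clang and current Microsoft compilers; whether MSC 8.00d accepts exactly this form is an assumption, and the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may pad after 'flag' to align 'value'. */
struct padded_rec {
    uint8_t  flag;
    uint16_t value;
};

#pragma pack(push, 1) /* byte-pack everything until the matching pop */
struct packed_rec {
    uint8_t  flag;
    uint16_t value;
};
#pragma pack(pop)

size_t padded_size(void)         { return sizeof(struct padded_rec); }
size_t packed_size(void)         { return sizeof(struct packed_rec); }
size_t packed_value_offset(void) { return offsetof(struct packed_rec, value); }

/* With byte packing the word sits directly after the byte. */
void check_layout(void)
{
    assert(packed_size() == 3);
    assert(packed_value_offset() == 1);
}
```

The packed structure occupies 3 bytes with `value` at offset 1; the size of the unpacked one depends on the compiler's default alignment.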

I bought the Watcom 32-bit C when it came out--the code generation didn't impress me and I used Microsoft instead. I don't do a lot of 16-bit new programming nowadays.

cr1901 · Aug 1, 2013

Chuck(G) said:
MSC comes with an IDE if you want to use it, and the latest versions of the compiler require at least PM support. For 16-bit x86 code, I've used Microsoft C since it was Lattice C (I still have the old compiler around).

I'm old enough that I don't use an IDE. Even for MCUs, I use a simple command-line interface compiler. You forget that I'm old--coding on paper and staring at listings works for me.

Ya know, maybe I'll know a lot more by the time I'm your age, but how do you manage to code without ever having to look up API references (either BIOS, DOS, Windows) or instruction syntax (PIC12, MSP430)? How'd you survive before/without the Internet- books?! If you did your coding on paper, that's fine... you have easy access to references on the computer (though typing out code from listings- to me- is a chore). But coding in DOS- at least in text mode- implies you have room for one source file open at once. Switching out source files implies swapping to disk, which is not an immediate transition. Having to look up information implies dropping to COMMAND.COM in some manner and loading another program so you can actually have a reasonable viewing window, unless DESQVIEW is set up appropriately (if possible). In today's DOS world, we have LYNX thanks to DJ Delorie porting GCC to a platform Stallman refused to support, but I can envision you back when DOS was the primary OS, pointing your modem to the nearest BBS, and waiting for pages of coding information to load- IF you can even find them XD.

Having access to that help without interrupting my coding is more important to me than an IDE... that was the point I was trying to make.

Chuck(G) · Aug 2, 2013

Howso? There are several DOS editors that allow for several open files at the same time. There's also the use of two computers.

Before the internet, I used real paper books as reference. Ever see the Developer's Reference set for OS/2? It's a thing of beauty. It also helps to have a good memory.

Even when doing music transcribing, I'm not sure that having two displays (one for what I'm writing, the other for what I'm reading) is any better than having the reference score printed on paper.

Do you remember coding forms? Back then, we didn't even use terminals to do programming. They were very expensive and management didn't trust their use as being "undisciplined". You had paper reference material, you wrote your programs down, had them keypunched and then submitted them to be run and waited for the output.

cr1901 · Aug 2, 2013

Howso? There are several DOS editors that allow for several open files at the same time.
With immediate swapping between buffers? I.e. no saving the buffer to a temp file before switching?

There's also the use of two computers.
My current setup... ideally the DOS computer would be headless if "CTTY COM 1" didn't give me "Abort, Retry, Fail?" all the time.

Before the internet, I used real paper books as reference. Ever see the Developer's Reference set for OS/2? It's a thing of beauty. It also helps to have a good memory.

I have not seen the Developer's Reference set for OS/2, but if it's anything like IBM's Technical Reference Manuals... those things are amazing!
My memory is selective and tends to remember the stupid stuff and forget important things.

Even when doing music transcribing, I'm not sure that having two displays (one for what I'm writing, the other for what I'm reading) is any better than having the reference score printed on paper.
Well, there is a law of diminishing returns for monitors, but I haven't composed music since I was 19, and back then, my computer IQ was abysmal.

Do you remember coding forms?
I'm 23... you forget that I'm young. I know a mechanical engineering professor at my university who dealt with FORTRAN... he has 'fond' memories of having to diagnose someone else's punched cards because they forgot that FORTRAN statements don't begin until column 7. I tried learning F77 once before I realized how unbelievably ridiculous it was compared to F90. I still occasionally code in F90, and even started making a Multilayer Perceptron library in it. But the reality is that it doesn't have enough intrinsics (it needs a quicksort intrinsic), and doesn't play nice with the GNU debugger, from experience.

Chuck(G) · Aug 2, 2013

cr1901 said:
With immediate swapping between buffers? I.e. no saving the buffer to a temp file before switching?

Sure, VEDIT, for example, is a great programmer's editor, comes in DOS or Windows (even has an EBCDIC version for the IBM Displaywriter). Swap views, split screen, what have you. Worth checking out if you're interested. There's also Multi-Edit for DOS.

My current setup... ideally the DOS computer would be headless if "CTTY COM 1" didn't give me "Abort, Retry, Fail?" all the time.

I was involved in a project that substituted a Z80 board for the MDA and keyboard and sent the screen to a VT-100 type terminal and translated the VT-100 keycodes to scan codes. You could even splitscreen RS232 data coming in from a different source with the PC output. I still have the card. You could very easily construct something like that today with an MCU.

I know a mechanical engineering professor at my university who dealt with FORTRAN... he has 'fond' memories of having to diagnose someone else's punched cards because they forgot that FORTRAN statements don't begin until column 7. I tried learning F77 once before I realized how unbelievably ridiculous it was compared to F90. I still occasionally code in F90, and even started making a Multilayer Perceptron library in it. But the reality is that it doesn't have enough intrinsics (it needs a quicksort intrinsic), and doesn't play nice with the GNU debugger, from experience.

The column 7 thing hasn't been in the standard since Fortran 9x (yes, you can spell it in lowercase now--I did a brief stint on X3J3). One nice thing about older versions of Fortran is that they're very amenable to automatic optimization.

Trixter · Aug 4, 2013

Chuck(G) said:
For 16-bit x86 code, I've used Microsoft C since it was Lattice C (I still have the old compiler around).

I thought Microsoft C 3.0 was developed internally. (2.0 was indeed rebadged Lattice)

I don't do a lot of 16-bit new programming nowadays.

I'd be amazed if you did ANY 16-bit new programming nowadays. (If you still are, in what fields?)
