
hmalloc in open watcom

Mike Chambers

i've got a program that needs to allocate an array larger than 64 KB, so after some research i found i can supposedly use __huge pointers and hmalloc in open watcom. when i try to use it, the linker complains that hmalloc is an undefined reference. and yes, i'm including malloc.h.

does anybody know what the problem might be? this is the first time i've even heard of an hmalloc function.
 
ah, yes you are right it's halloc. however, it still spits out the same linker error after correcting that.
 
ah, figured it out. halloc needs 2 parameters: the number of array items and the size of each item. that compiles, but the program hangs upon calling halloc. i'll try to figure out why.
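
for reference, the canonical usage looks something like this (just a sketch; the 128 KB size and the names are only for illustration):

Code:
#include <stdio.h>
#include <malloc.h>   /* halloc() and hfree() prototypes */

int main( void )
{
    /* 131072 one-byte elements = 128 KB in a single huge block */
    unsigned char __huge *ram = (unsigned char __huge *)halloc( 131072L, 1 );

    if ( ram == NULL ) {
        printf( "halloc failed\n" );
        return 1;
    }

    ram[70000L] = 0xAA;                    /* indexing a huge pointer can cross the 64 KB boundary */
    printf( "%02X\n", (int)ram[70000L] );

    hfree( ram );
    return 0;
}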
 
I'll add that halloc() can be a real bother unless you include a bit of code to normalize any pointers you do arithmetic on, if you're using it to address arrays larger than 64 KiB. AFAIK, halloc just calls the corresponding DOS allocation function.
 
Question: couldn't you simply get by using classic malloc() and an array of pointers? Then you could have up to 64K worth of pointers, each pointing at an array chunk of up to 64K. That's how I've always done it in Turbo C and Turbo Pascal.

Does the data even have to be an array? A linked list may also be an option if you don't need 'seeking' and are only processing sequentially. Sequential processing of a linked list is often faster than with an array, since each record includes a pointer to the next! It also means no memory limit apart from the largest single element you can allocate.
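
Just to illustrate the idea, a record type could look something like this (a rough sketch; the 1 KB payload and all the names are only examples):

Code:
#include <stddef.h>
#include <malloc.h>   /* _fmalloc() for far allocations */
#include <string.h>   /* _fmemcpy() */

struct node {
    unsigned char    data[1024];   /* payload for one record */
    struct node far *next;         /* each record points at the next one */
};

struct node far *head = NULL;
struct node far *tail = NULL;

/* append one record at the tail; returns 0 if the far heap is exhausted */
int append( const unsigned char far *chunk )
{
    struct node far *n = (struct node far *)_fmalloc( sizeof( struct node ) );
    if ( n == NULL )
        return 0;
    _fmemcpy( n->data, chunk, sizeof( n->data ) );
    n->next = NULL;
    if ( tail != NULL )
        tail->next = n;
    else
        head = n;
    tail = n;
    return 1;
}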

Could you share what it is you're storing in an array? Example code? We might be able to think up other solutions depending on the data set. An array of pointers to separate arrays, for example... make each sub-array 0..0x0FFF, then you could store a 32-bit index where the array-of-pointers index is master_index >> 12, and the sub-array index is simply master_index & 0x0FFF.
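
For example, a rough sketch of that layout (the chunk count, the 128 KB total, and the names are made up just for illustration):

Code:
#include <stdint.h>
#include <malloc.h>                     /* _fmalloc() */

#define CHUNK_SIZE   4096U              /* each sub-array covers indexes 0..0x0FFF */
#define NUM_CHUNKS   32U                /* 32 * 4 KB = 128 KB, just as an example */

static uint8_t far *chunk[NUM_CHUNKS];  /* far pointers to the sub-arrays */

/* allocate all the sub-arrays; returns 0 if the far heap runs out */
int alloc_chunks( void )
{
    unsigned i;
    for ( i = 0; i < NUM_CHUNKS; i++ ) {
        chunk[i] = (uint8_t far *)_fmalloc( CHUNK_SIZE );
        if ( chunk[i] == NULL )
            return 0;
    }
    return 1;
}

/* look up one byte through a 32-bit master index */
uint8_t get_byte( uint32_t master_index )
{
    return chunk[master_index >> 12][master_index & 0x0FFF];
}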
 
that's actually how i initially tried doing this. i made an array of pointers to separate 1 KB chunks and was using it like array[index>>10][index&1023]

the program is a stripped down version of my Fake86 emulator that can run in 16-bit DOS, just for the fun of emulating a PC on an 8088. :p

the array is for the emulated RAM. for some reason it only works right if i only allocate/emulate 48 KB of memory, using both the pointer array and halloc methods. so, i guess the problem might not be related. problem is with 48 KB the best i can boot in the emu is a floppy image of MS-DOS 2.01, and even that leaves only about 5 KB free lol.

it does run though; at 4.77 MHz it took about 4 minutes to run POST and get to an A> prompt. it runs decently enough that text-based programs are usable on a fast 286. however, i'd really like to get it to use at least 128 KB or so.
 
alright, it definitely must have something to do with the way the array is being allocated and/or accessed. using watcom, the EXACT same code was compiled for both 16-bit DOS and 32-bit DOS (the only difference was that i removed the "far" keyword for 32-bit since it's not segmented). the 32-bit version works fine with any size array for the emu's memory up to the full 640 KB. the 16-bit version will still only work right if i tell it to use 48 KB.

this is how my pointer array is defined:

Code:
uint8_t far *ramptr[640]; //640 pointers to 1024 byte chunks


this is how i allocate the memory:

Code:
            for (ramchunk=0; ramchunk<(rambytes>>10); ramchunk++) {
                    ramptr[ramchunk] = (uint8_t far *)_fmalloc(1024);
                    if (ramptr[ramchunk] == (uint8_t far *)NULL) {
                        printf("error!\n");
                        return;
                    }
            }


the actual accesses to the memory chunks are handled in only two functions, read86 and write86. anything in the whole program that wants to read or write emulated memory goes through these:

Code:
uint8_t read86(uint32_t addr32) {
        if (addr32>=0xFE000) {
            return(BIOS[addr32 - 0xFE000]);
        } else if ((addr32>=0xC0000) && (addr32<0xC8000)) {
            return(VBIOS[addr32 - 0xC0000]);
        } else if ((addr32>=0xB8000) && (addr32<0xC0000)) {
            return(vidptr[addr32 - 0xB8000]); //vidptr is a far pointer to the host machine's REAL video memory
        }
        switch (addr32) { //some hardcoded values for the BIOS data area
            case 0x410: //0040:0010 is the equipment word
                return(0x41); //video type (0x41 is VGA/EGA, 0x61 is CGA, 0x31 = MDA)
            case 0x475: //hard drive count
                return(hdcount);
        }
        if (addr32 < rambytes) {
            return(ramptr[addr32>>10][addr32&1023]);
        } else return(0);
}

void write86(uint32_t addr32, uint8_t value) {
        if ((addr32>=0xB8000) && (addr32<0xC0000)) {
            vidptr[addr32 - 0xB8000] = value;
            return;
        }
        if (addr32 < rambytes) {
            ramptr[addr32>>10][addr32&1023] = value;
        }
}

i wonder what i'm missing here... :?
 
I don't have time to go through the code but one technique that is just sending up warning flares is the use of 32 bit integers as pointers. You are just asking for trouble.

If you are doing 16 bit DOS programming then look at all pointers as far pointers - a segment and an offset. If you need to do pointer arithmetic you can normalize into a 32 bit value and then back to a far pointer. I use these in IRCjr:

Code:
#define normalizePtr( p, t ) {       \
  uint32_t seg = FP_SEG( p );        \
  uint16_t off = FP_OFF( p );        \
  seg = seg + (off/16);              \
  off = off & 0x000F;                \
  p = (t)MK_FP( (uint16_t)seg, off );          \
}

#define addToPtr( p, o, t ) {        \
  uint32_t seg = FP_SEG( p );        \
  uint16_t off = FP_OFF( p );        \
  seg = seg + (off/16);              \
  off = off & 0x000F;                \
  uint32_t p2 = seg << 4 | off ;       \
  p2 = (p2) + (o);                       \
  p = (t)MK_FP( (uint16_t)((p2)>>4), (uint16_t)((p2)&0xf) );          \
}

Obviously you don't want to normalize or add to a pointer too often. (The generated code isn't that bad actually.)
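
Something like this is how they get used (just a sketch: the buffer size and names are arbitrary, and FP_SEG/FP_OFF/MK_FP come from <i86.h> in Open Watcom, or <dos.h> with Borland and Microsoft):

Code:
#include <stdint.h>
#include <malloc.h>   /* _fmalloc(), _ffree() */
#include <i86.h>      /* FP_SEG, FP_OFF, MK_FP */

/* assumes the normalizePtr/addToPtr macros above are in scope */
void demo( void )
{
    uint8_t far *buf = (uint8_t far *)_fmalloc( 60000U );
    uint8_t far *p;

    if ( buf == NULL )
        return;
    p = buf;

    p += 50000U;                          /* plain far-pointer math only bumps the offset...   */
    normalizePtr( p, uint8_t far * );     /* ...so fold the large offset back into the segment */

    addToPtr( p, 5000UL, uint8_t far * ); /* or add and renormalize in one step */
    *p = 0xAA;                            /* offset is now 0..15, so no 64 KB wrap on access */

    _ffree( buf );
}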

Huge pointers in Watcom work for comparisons, but I ran into a bug doing pre- and post-increment/decrement (++ and --) on them. After getting burned by that I decided to roll my own.
 
normalizing the pointer was actually the very first method i tried. didn't use any of C's alloc functions, and i directly asked DOS to allocate memory with int 21h function 48h. i stored the base segment it gave me in a variable named ramseg, and then i was doing this to access the memory:

to write:
Code:
pokeb(ramseg + (addr32 >> 4), addr32 & 15, value);

to read:
Code:
peekb(ramseg + (addr32 >> 4), addr32 & 15);


i've also tried doing it this way:
Code:
uint8_t far *ramptr = (uint8_t far *)(((ramseg + (addr32 >> 4)) << 16) | (addr32 & 15));

and then just addressed the byte as *ramptr.
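
for reference, the same far pointer could also be built with MK_FP instead of the manual shift. just a sketch, assuming <i86.h> for open watcom (borland puts MK_FP in <dos.h>):

Code:
#include <i86.h>   /* MK_FP() */

uint8_t far *ramptr = (uint8_t far *)MK_FP( (unsigned)(ramseg + (addr32 >> 4)),
                                            (unsigned)(addr32 & 15) );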

i've tried every method i can think of to do this, but i keep having this problem with all of them. since the exact same code compiled into a 32-bit executable works correctly using any amount of memory, the problem clearly must be in how i am accessing the memory. or at least it would seem that way...

 
Same here--with Microsoft C. One need for normalizing arises when you want to call a library routine that accepts far pointers, but doesn't have a version that knows about huge pointers. Calling, say, _fmemcpy near the limit of an offset can result in a "wrap", as the routine does not perform normalization/reduction of pointers. Like Mike, I found that it was supposed to be taken care of, but wasn't. For the last bit of code that I wrote, maybe a decade ago, that used huge memory structures, I just wrote some separate code to do the allocation, deallocation and pointer arithmetic, rather than bother with the bugs of a particular C implementation.

Code:
//  Bigmem.c:  Large-memory operations.
//  -----------------------------------
//
//    Note that everything is couched in terms of generic FAR
//  memory, not HUGE, which tends to be broken on MSVC.


#include <stdlib.h>
#include <dos.h>

#include "cdtypes.h"
#include "bigmem.h"


//  AllocHuge - Allocate a huge block of memory.
//  --------------------------------------------
//
//    Uses the DOS call to do it.
//
//    Returns NULL if unsuccessful.
//

LPVOID AllocHuge( DWORD BlockSize)
{

  WORD segaddr;

  if ( _dos_allocmem( (WORD) ((BlockSize+15) >> 4), &segaddr))
    return NULL;
  else
    return MAKE_FP( segaddr, 0);
} // AllocHuge

//  FreeHuge - Free a huge block of memory.
//  ---------------------------------------
//
//    NULL arguments are okay and ignored.
//

void FreeHuge( LPVOID What)
{

  if ( What)
    _dos_freemem( GET_FP_SEG(What));
  return;
} // FreeHuge

//  IndexHuge - Compute simplified address of huge pointer.
//  -------------------------------------------------------
//
//    void HUGE *IndexHuge( void HUGE *Base, DWORD Offset)
//

LPVOID IndexHuge( LPVOID Base, DWORD Offs)
{

  Offs += ((PWORD) &Base)[0];               // new offset from base
  ((PWORD) &Base)[1] += (WORD) (Offs >> 4); // fold the carry into the segment
  ((PWORD) &Base)[0] = ((WORD) Offs) & 15;  // keep only the reduced 0..15 offset
  return Base;
} // IndexHuge


//  IncrementHuge - Increment a huge pointer.
//  -----------------------------------------
//
//    void IncrementHuge( LPVOID *Base)
//

void IncrementHuge( LPVOID *Base)
{
  *Base = IndexHuge( *Base, 1);
  return;
} // IncrementHuge
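
Typical use looks something like this (just a quick sketch; the 128 KB size and the variable names are arbitrary):

Code:
/* grab 128 KB straight from DOS, touch a byte past the 64 KB mark, then free it */
LPVOID block = AllocHuge( 131072UL );

if ( block != NULL ) {
    unsigned char far *p = (unsigned char far *)IndexHuge( block, 70000UL );
    *p = 0xAA;
    FreeHuge( block );
}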
 
very nice, chuck! thanks. i will stick this into my program and see if it makes any difference. i'm at college right now, but i'm about to leave and go home. i'll try it when i get there.

in the meantime, here's how it is right now; this includes the needed data files and the open watcom .wpj project file. you can start it with the DOS 2.01 disk image like this:

fake86.exe -fd0 dos201.dsk


by default it will use 48 KB of RAM since that's the only thing that seems to work. it can be changed with the "-m #" command line parameter, where # is the RAM size in KB.

http://rubbermallet.org/fake86-0.12.4.11-dos.zip

it's not that huge; 95% of the code is just the big switch statement with all the emulated opcodes, so it isn't relevant to the issue. it's stripped bare compared to the regular windows/linux version. the file size of the DOS fake86.exe is 44 KB right now. :p
 
i still have the same problem when i use your code, chuck. i expected as much because, looking at it, it appears to basically do what i was already doing with that array of 1 KB far pointers. this one's driving me nuts! if the same code works with large amounts of memory when compiled as a 32-bit executable, i can't understand why it wouldn't work the same way as a 16-bit executable as long as i'm accounting for pointer normalization across segments.
 
the program doesn't actually hang, it continues to run but the memory array doesn't work right. i haven't done any detailed testing on exactly what behavior is occurring, i just know that when it goes to read some memory back it acts like the data is not correct. for example, when trying to boot DOS the emulator's CS:IP keeps changing and the program doesn't crash or anything, but it's just all over the place and doesn't run the code it should be running.

i need to do a detailed analysis. i'll have it do some read/write verification on every byte in the memory array and see where the failure zones are. that may help shed some light on the source of the problem. it still works correctly in the lowest few KB of memory, because it emulates the BIOS and runs the POST, so the memory area where the BIOS sets up its stack must be fine.

EDIT: for the record, the problem is not simply limited to running under DOS. the exact same problem occurs when i compile it as a 16-bit OS/2 or win16 application. i'm also kind of curious as to what happens if i ditch the whole "allocate conventional memory" thing and use EMS.
 
turns out there is no problem at all with the actual memory accesses. i wrote some test code to give the read86 and write86 functions a workout.

Code:
        for (wx=0; wx<256; wx++) {
            for (wy=0; wy<1024; wy++) {
                write86( ((uint32_t)wx<<10) + (uint32_t)wy, wx);
            }
        }
        for (wx=0; wx<256; wx++) {
            for (wy=0; wy<1024; wy++) {
                value8b = read86(((uint32_t)wx<<10) + (uint32_t)wy);
                if (value8b != (uint8_t)wx) {
                    printf("Incorrect byte @ %05Xh: ", ((uint32_t)wx<<10) + (uint32_t)wy);
                }
            }
        }

then i reversed the order in which i wrote the data (from last to first 1 KB chunk) and that turned up no corruption as well.

what in the #$%^ could be happening that makes my code run incorrectly ONLY when using more than 48 KB of emulated RAM, and ONLY when compiled to a 16-bit executable? as i mentioned before it doesn't matter if it's compiled for win16, 16-bit OS/2, or 16-bit DOS. same bug. the memory model doesn't matter either. same problem using any of them from small to huge.

it doesn't even matter what compiler i used. i've tried it with both Borland Turbo C++ and Open Watcom. i haven't tried Microsoft C yet. i will do that next i guess, but i don't expect it to work there either.

if anybody has the inclination to look, there is a link to a zip of the code with data files and watcom project file a few posts above this one. :hammermon:
 
i seriously appreciate your readiness to always help me and the other folks on here, chuck. there's a fair bit of code in the main file cpu.c, so feel free to back out of this. i don't want anybody to waste their time unless they actually like bug-hunting. :)

if you run the exe in the zip, start it like this: fake86.exe -fd0 dos201.dsk -csip

the -csip causes the emulator to keep showing the current CS:IP of the emulation at the top right as it runs. this just runs in 80x25 text mode under DOS. you'll see it running the BIOS from the xtbios.bin file as it POSTs. (or maybe you won't see it if you're on a modern computer, it goes by extremely quickly)

it is set to emulate a 48 KB memory space, and you'll see that after the BIOS code runs it boots the DOS 2.01 floppy image correctly. you can play around with it and see that everything runs properly. then, start the emulator over again using the same command line but this time add -m 256 to the end, and it will try to emulate a 256 KB memory space.

at this point you should see that the BIOS POST has found a memory error. i've also tried a modified version of the BIOS where i made it ignore the memory error and attempt to boot the floppy image anyway, but it just hangs there unable to load DOS correctly. i've also tried it with a DOS 6.22 floppy image, and that one will display "Starting MS-DOS..." correctly but then it also hangs.

you will see the CS:IP that the emu is running continue to change, which shows that it is still running, but something is broken that stops DOS from working. if you were to recompile the same watcom project but set it to build a 32-bit version, and then run that using the exact same command line with -m 256 it works perfectly.

i'm still going through the code myself as well, seeing if something eventually jumps out at me that would cause a problem in a 16-bit program. the header files are a little messy, this was just a quick 'n dirty attempt at making a scaled down DOS version of this program from the regular windows/linux codebase. there are some remnants that i didn't bother removing from the headers. i was just curious as to how quickly it would run on (very) old machines.


EDIT: if you actually did recompile it for 32-bit, a couple changes would have to be made. interrupt calls to the host system would need to be changed. when the code being emulated tries to do an interrupt call to 10h or 16h, it just passes those through to the host system interrupt handlers and resumes emulation. all other interrupts are handled normally by pushing the flags/CS/IP and jumping to the location specified in the IVT.

the keyboard handling would have to be changed to use kbhit() and getch(), and a quick way to see that it's printing the right output is to have a printf("%c", regs.byteregs[regal]) when a call is made to int 10h for function 0Eh. all of this would be done in the intcall86(uint8_t intnum) function.

the farmalloc would also have to be changed to a regular malloc in main() where it allocates the memory. all in all, it's far more trouble than it's worth, so you can just trust me when i say that it does work correctly as a 32-bit app. :)
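
just to show what i mean, the int 10h teletype pass-through would look roughly like this. this is only a sketch: regs and regal come from the real code, but regah and the rest of the structure here are stand-ins for illustration.

Code:
#include <stdio.h>
#include <stdint.h>

/* stand-ins for the emulator's real register storage (illustrative only) */
enum { regal = 0, regah = 1 };
struct { uint8_t byteregs[8]; } regs;

void intcall86(uint8_t intnum) {
        switch (intnum) {
            case 0x10:
                if (regs.byteregs[regah] == 0x0E) { /* teletype output: print the char in AL */
                    printf("%c", regs.byteregs[regal]);
                    return;
                }
                break;
        }
        /* everything else: push flags/CS/IP and jump through the emulated IVT (omitted here) */
}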
 
What happens if you try to run with, say -m 70 or -m 64?

the same thing happens; i've tried all sorts of values. 48 KB is the highest i can go before it fails. the BIOS checks for RAM in 8 KB increments, so the next possible size is 56 KB, which fails. 48 KB makes DOS useless other than listing a directory and reading text files, with all of a whopping 4 KB of free memory.
 