
Can somebody please explain how DOS memory allocation works?

PgrAm

So I'm working on a game for real-mode DOS right now in C++ using WATCOM. I appear to be running out of memory long before I think I should. I allocate all memory as needed using the C++ "new" operator, which I assume uses int 21h, ah = 48h. As far as I know this limits me to the first 640K of memory, with a granularity of 16 bytes. I used QEMM and DOS tells me I have 655360 bytes available.

There are a few things I can do to reduce my memory usage, but I'm fairly sure my issue is a problem of fragmentation. In order to improve this it would help to understand how DOS allocates memory, but I can't find much info on this. Does DOS just return the first free block? Does it try to avoid fragmentation somehow? Does it do any paging? How does it know to avoid TSRs? How can I access more than 640K without using protected mode (e.g. on a 286), and how do I know if there is any?

Thanks.
 
DOS allocates memory in blocks, yes. When allocating, DOS will combine adjacent free blocks. From what I can find, I would guess DOS just scans through memory for the first free block that is big enough.

Another thing: COM programs are automatically allocated the biggest block of memory available, so it might be an idea to release a few paragraphs before trying to allocate even more.
 
Check with your compiler. It was more common to allocate from DOS a large space and then handle the suballocations without talking to DOS. Doing your own garbage collection system is a breeding ground for bugs so use the compiler's implementation if possible.

If you use DOS 5 or later, Function 58h can change the allocation strategy from first fit to last fit or best fit.
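For reference, here is a minimal sketch of the DOS side of this, assuming a 16-bit target and Open Watcom's <dos.h>/<i86.h> headers (function 48h allocates in 16-byte paragraphs, 49h frees, and 58h changes the strategy):

#include <dos.h>
#include <i86.h>
#include <stdio.h>

int main( void )
{
    union REGS r;
    unsigned   seg;

    /* INT 21h, AX=5801h: set allocation strategy (0 = first fit, 1 = best fit, 2 = last fit) */
    r.w.ax = 0x5801;
    r.w.bx = 0x0001;                            /* best fit */
    intdos( &r, &r );

    /* INT 21h, AH=48h via the library wrapper; size is in 16-byte paragraphs */
    if( _dos_allocmem( 1024, &seg ) != 0 ) {    /* ask DOS for 16 KB */
        /* on failure DOS reports the largest free block (in paragraphs) */
        printf( "alloc failed, largest free block: %u paragraphs\n", seg );
        return 1;
    }
    printf( "DOS handed back segment %04X\n", seg );

    /* INT 21h, AH=49h via the library wrapper: give the block back */
    _dos_freemem( seg );
    return 0;
}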

For additional memory, EMS and XMS are possible choices.

Edit: It was easy to find Watcom C's strategy for most allocations is to work off a heap. With small model programs, by default, the heap will be restricted to less than 64kB so running out of memory is easy. C++ new isn't given a nice clear explanatory text; looks like reading source would be necessary.
 
I'm using the compact model in Watcom BTW; that's a small code model and big data model, so that leads me to believe that the heap must be bigger than 64K.
 
Watcom supplies three different functions for heap-related allocation:
_fmalloc(): This is a 'far malloc', as in: it allocates a segment and returns a far-pointer. But beware: it can only handle 64k allocation max! (where other compilers may support farmalloc() with larger blocks of memory).
malloc(): This is either 'near-malloc' in models with a 64k data segment, or an alias for _fmalloc() with larger models.
halloc(): This is 'huge alloc', which allows you to allocate larger blocks of data (this is most similar to farmalloc() in most other compilers).

It makes sense if you think of it like this:
near pointers: assume the current/default segment, store only the offset. So you can only address at most 64k in a single segment.
far pointers: store a segment + offset. However, during pointer manipulation, only the offset is updated, the segment remains the same. So you are still limited to 64k max (but you get good and predictable performance from these pointers).
huge pointers: store a segment + offset. At every pointer update, the entire pointer is updated, so the segment can be changed as well. This effectively makes your code behave like a 'flat' memory model.

You can also use 'near', 'far' and 'huge' as modifiers for static data, eg:
char far data[500];

This has similar behaviour as the above pointers, eg:
near: place the data in the current/default segment.
far: create a new segment and place the data in there (limited to 64k data max)
huge: create as many new segments as required to store all the data (like a 'flat' memory model).

My approach is to use a small memory model. This means that Watcom places code, stack and data in a single segment, so my default is 'everything near'.
I then have control over additional memory by using far/huge, to bring in extra segments for code or data.
This way I can get very compact/efficient use of memory, without being limited to 64k of code or data.
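
To make the difference concrete, here is a small sketch of the three flavours (assuming Open Watcom's <malloc.h> and its __far/__huge keywords; not code from any real project):

#include <malloc.h>
#include <stdio.h>

int main( void )
{
    char        *np;    /* near/default pointer: offset only, <= 64K       */
    char __far  *fp;    /* far pointer: fixed segment, object still <= 64K */
    char __huge *hp;    /* huge pointer: arithmetic can cross segments     */

    np = (char *)malloc( 100 );                 /* default heap for the model   */
    fp = (char __far *)_fmalloc( 32000U );      /* far heap, 64K max per block  */
    hp = (char __huge *)halloc( 100000L, 1 );   /* 100,000 bytes, > 64K allowed */

    if( np == NULL || fp == NULL || hp == NULL ) {
        printf( "out of memory\n" );
        return 1;
    }

    /* crossing a 64K boundary is fine with a huge pointer; with a far
       pointer the offset would simply wrap around within the segment */
    hp[70000L] = 42;

    hfree( hp );
    _ffree( fp );
    free( np );
    return 0;
}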
 
So the question really was "How does Watcom C allocate memory?", not "How does DOS allocate memory?"

Silly me for taking the thread topic seriously. :(

If you're writing 16-bit C, you're almost always better off doing what Scali says--write for the small model (<64K data) and explicitly allocate and type the large structures as _far or _huge, depending on their size. (MSC uses _fmalloc and _halloc for large memory allocation). It's also a good idea to #define or typedef your _far or _huge datatypes, so that if you do move to 32-bit (or 64 bit) code, you can simply null out the pointer qualification without rewriting everything.
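
A sketch of that typedef trick, assuming Watcom's __far keyword and its __386__ macro for 32-bit targets (FARDATA, SpritePtr and load_sprite are just names made up for illustration):

/* hide the qualifier behind a macro so the same source builds 16-bit or flat 32-bit */
#if defined( __386__ ) || defined( _WIN32 )
    #define FARDATA                 /* flat model: the qualifier compiles away */
#else
    #define FARDATA __far           /* 16-bit: explicit far data               */
#endif

typedef unsigned char FARDATA *SpritePtr;   /* hypothetical asset pointer type */

SpritePtr load_sprite( const char *name );  /* same prototype in both builds   */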
 
The actual question was "How does Watcom C++ new allocate memory?" which has yet to be answered. While I think it goes back to the standard heap functions, I haven't verified that.

C++ may not be the best language for tightly coded DOS games. No telling how many assumptions derived from larger more capable systems are waiting to be brought in.
 
So I think I pretty much get it now. While C++ might not be the best language to use here, I chose it because I'm most familiar with it and it has enough low level features, plus I can just drop into assembly whenever I need to. The way I'm using C++ right now does not make it much different from plain C, no exceptions, no RTTI, no STL, etc.

Anyway, I dumped the heap information using _heapwalk() and found that things are not quite as fragmented as I thought, but it's OK because I can immediately think of a way to free up ~42K of memory and probably more.
 
I used QEMM and DOS tells me I have 655360 bytes available.

You have 640K total, not 640K available. On a regularly-configured system, there is typically 580K available for program code and data.

There are a few things I can do to reduce my memory usage, but I'm fairly sure my issue is a problem of fragmentation

Fragmentation is only a problem if you're constantly allocating and freeing a variety of differently-sized blocks. Your actual problem is likely running out of memory, not fragmenting it. Think of ways you can reduce the memory usage of your code, or work more efficiently. Can Watcom produce a memory map of your program? If so, that can help identify any sections of code that are, themselves, large. Are you loading assets in one format, then allocating a lot of memory to transform them into another format for usage? You can solve that by translating them beforehand and storing the post-conversion results on disk.

If you identify that fragmentation is actually an issue, you can simply allocate large blocks of RAM, never free them, and work within them. That's extra work on your part, though. You might want to also build debugging tools into your program. One demo I worked on had odd issues in parts, so I wrote code that displayed memory usage and other stats onscreen so I could watch what was happening. You can observe memory leaks this way, for example.
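
If you do go the allocate-big-blocks-and-never-free route, something as simple as this sketch will do (assuming _fmalloc and __far under 16-bit Watcom; the arena_* names are invented):

#include <malloc.h>
#include <stddef.h>

#define ARENA_SIZE 32000U

static char __far *arena_base;              /* one big block grabbed up front */
static unsigned    arena_used;

int arena_init( void )
{
    arena_base = (char __far *)_fmalloc( ARENA_SIZE );
    arena_used = 0;
    return arena_base != NULL;
}

void __far *arena_alloc( unsigned size )
{
    char __far *p;

    size = ( size + 1U ) & ~1U;             /* keep allocations word aligned */
    if( size > ARENA_SIZE - arena_used )
        return NULL;                        /* arena exhausted */
    p = arena_base + arena_used;
    arena_used += size;
    return p;
}

void arena_reset( void )                    /* "free" everything at once, e.g. on a level change */
{
    arena_used = 0;
}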
 
Don't forget that we programmers of the 80s used a lot of overlays and disk I/O, and when the 286 came out, we accessed extended memory from DOS and used it for storing data. By 1990, having more than 640K on a DOS machine was common. Phar Lap extenders were popular for gaming.
 
Think about it--DOS/360 used 8KB for the supervisor--and ran a background and two foreground partitions. Lots and lots of transient phases. When was the last time you saw a multiprogrammed OS that used that little?

Come to think of it, early CDC 6000 series OSes were completely resident in the PPUs (4K of 12-bit words each). Most of them were used for I/O drivers. No CPU code at all, initially. Eventually some functions, such as "storage move", were done in the CPU, but you're not talking much code there.
 
The actual question was "How does Watcom C++ new allocate memory?" which has yet to be answered. While I think it goes back to the standard heap functions, I haven't verified that.

Yes, I haven't looked into the semantics of new in 16-bit compilers, but I can only hope it's as simple as the 'new' operator taking the allocation-method related to the pointer type it is being assigned to... so near pointers do malloc(), far pointers do _fmalloc() and huge pointers do halloc().
Either that, or perhaps you can just use these keywords in combination with new... so you could do:
char far* pData = new far char[256];
Or something like that.
I'd have to investigate that some time. I just never used C++ in the first place on 16-bit systems, because of the additional compile-time, and having less control over the size and performance.
 
There are some cases where I prefer C++ objects over traditional C code, but on small systems they're pretty much a luxury because of limited resources.

Another concern is that a lot of early x86 C++ runtimes were plagued with memory leaks. Those can be the devil to chase down.
 
Necroposting, but with some good information ...

I was debugging a problem where I expected X amount of memory to be allocated, and I saw a few KB more than X allocated instead. So I went hunting.

First I used _heapwalk, which allows one to examine the state of the Watcom-managed heap. (Far heap in my case.) Watcom consistently has some unallocated space in the heap even if you do not allocate and free memory. (i.e. fragmentation is not the cause; the unused space is at the end.) It also looks like Watcom is able to do each malloc allocation with just 2 bytes of overhead, as opposed to the usual 16 bytes that the native DOS allocation function requires. So for many small allocations the heap routines are more memory efficient, but it is preemptively allocating more memory from DOS than is strictly required. (Two bytes doesn't seem like enough, so I want to dig into that more, but it might be using eye catchers or other special patterns to mark which sections are in use vs. not in use.)

Watcom also has the _heapshrink function, which explicitly shrinks the size of the allocated heap to make as much memory available as possible for another program you might then load next using the system call. It does not defragment or rearrange memory, but it will remove the speculative allocation at the end of memory. At least temporarily ... the next call to allocate heap (malloc) or a runtime function that uses heap will cause the runtime heap allocator to speculatively grab more memory than it needs.
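
For anyone wanting to try the same thing, here is a sketch of a heap dump loop (near-heap version shown; _fheapwalk()/_fheapshrink() are the far-heap equivalents in Watcom):

#include <malloc.h>
#include <i86.h>
#include <stdio.h>

void dump_heap( void )
{
    struct _heapinfo hi;
    int              status;

    hi._pentry = NULL;                      /* start at the beginning of the heap */
    for( ;; ) {
        status = _heapwalk( &hi );
        if( status != _HEAPOK )
            break;
        {
            void __far *p = hi._pentry;
            printf( "%s block at %04X:%04X, size %u\n",
                    hi._useflag == _USEDENTRY ? "USED" : "FREE",
                    FP_SEG( p ), FP_OFF( p ), hi._size );
        }
    }
    if( status != _HEAPEND && status != _HEAPEMPTY )
        printf( "heap looks damaged (status %d)\n", status );

    _heapshrink();                          /* hand unused space at the end of the heap back to DOS */
}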

I'm paranoid about resource use so I'd like to find/implement a run-time hint to make Watcom stop speculatively allocating memory that it does not need. However, the true memory requirements of the program really don't change because of it. If you try to malloc 10KB and Watcom grabs 16, nobody cares. If you are that close to the edge of memory Watcom probably will look at any memory allocation failure and then retry with a smaller amount that still satisfies your malloc. And if it doesn't, that's a bug in their runtime. I'll go look at source code tonight to confirm this is the implemented behavior.

I substituted my own calls to the DOS memory allocator for the malloc calls. The DOS memory allocator doesn't have this behavior, but you still have the general problem that if you go back into a runtime routine that uses a heap function, it's going to speculatively allocate more than is needed.
 
What I've never understood is why most HLLs/compilers even bothered to manage memory themselves. DOS will always be loaded and DOS has memory routines built into the operating system. If you write your own routines, it can't possibly be for space reasons, since the space your routines take up will always be larger than what you might have saved over just using DOS int 21,48 (unless you were dumb about allocating memory, like allocating 2000 16-byte blocks or something). It couldn't have been for fragmentation either, because DOS coalesces free blocks while it operates.

Okay, actually I just thought of one situation where custom allocation was a win: Turbo Pascal went OOP in 5.5, and in 6.0 they rewrote the memory allocator to deal with many small allocations that naturally occur when writing OOP. They could allocate blocks as small as 8 bytes, whose headers contained a far pointer (4 bytes) to the next free block and 4 bytes to indicate its size. For programs doing stuff like new(p), this was a win over DOS once you allocated a few hundred tiny pointers.
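
Sketched in C rather than Pascal, the free-list layout described above looks something like this (minimum block of 8 bytes: a 4-byte far pointer plus a 4-byte size):

/* each free block's header doubles as a free-list node */
typedef struct freeblock {
    struct freeblock __far *next;   /* 4 bytes: far pointer to the next free block   */
    unsigned long           size;   /* 4 bytes: size of this free block              */
} FreeBlock;                        /* header = 8 bytes = smallest allocatable block */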
 
Thanks for the updated info guys!

I might find _heapshrink useful.

For programs doing stuff like new(p), this was a win over DOS once you allocated a few hundred tiny pointers

Yeah, I probably do that more than I should :p.

What I ended up doing was a mix of relying mostly on Watcom and then using my own custom allocations for very specific circumstances. Because my game is significantly larger than could fit in 640k at one time I just have to be careful about when things are loaded and unloaded.
 
What I ended up doing was a mix of relying mostly on Watcom and then using my own custom allocations for very specific circumstances. Because my game is significantly larger than could fit in 640k at one time I just have to be careful about when things are loaded and unloaded.

You may want to double-down on your custom routines and turn them into a completely customized heap manager specifically for your assets. Doom used a library/heap management system that allowed tagging some resources as high-priority (loaded first or non-relocatable, for example sampled audio and main character graphics) and low-priority (menu screens, etc.).
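
A hypothetical sketch of what such a tag-based asset heap could look like under 16-bit Watcom (this is not Doom's actual zone code; every name here is invented):

#include <malloc.h>
#include <stddef.h>

enum asset_tag { TAG_STATIC, TAG_CACHE };   /* STATIC: never purged; CACHE: purgeable */

#define MAX_ASSETS 64

static struct {
    void __far    *ptr;
    enum asset_tag tag;
} assets[MAX_ASSETS];

void __far *asset_alloc( size_t size, enum asset_tag tag )
{
    int i;
    for( i = 0; i < MAX_ASSETS; ++i ) {
        if( assets[i].ptr == NULL ) {
            assets[i].ptr = _fmalloc( size );
            assets[i].tag = tag;
            return assets[i].ptr;
        }
    }
    return NULL;                            /* tracking table is full */
}

void asset_purge( enum asset_tag tag )      /* free everything carrying the given tag */
{
    int i;
    for( i = 0; i < MAX_ASSETS; ++i ) {
        if( assets[i].ptr != NULL && assets[i].tag == tag ) {
            _ffree( assets[i].ptr );
            assets[i].ptr = NULL;
        }
    }
}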
 
You may want to double-down on your custom routines and turn them into a completely customized heap manager specifically for your assets. Doom used a library/heap management system that allowed tagging some resources as high-priority (loaded first or non-relocatable, for example sampled audio and main character graphics) and low-priority (menu screens, etc.).
Yeah, I think this is probably an inevitable part of writing a general-purpose game engine, especially in resource-constrained environments.

Which reminds me, I really need to get back to work on mine...
 
My rule of thumb (back in the day when I still coded real-mode DOS) was to use malloc() if my model was small and use the DOS calls for far memory allocation. The rationale was that C used its own code for "near" memory allocation and DOS calls for far memory anyway. Whether or not that bought me anything, I can't say, but recall that a lot of this was back in the day of buggy 'C' runtimes.
 