
TINY memory model C, compiler recommendation?

deathshadow

Veteran Member
Joined
Jan 4, 2011
Messages
1,378
I'm seriously considering kicking Pascal to the curb. As high level languages go it's always been my first love -- the strict typecasting, the forward declarations, the verbose non-cryptic structures...

C pisses me off a LOT for being needlessly and pointlessly cryptic, letting you create code structures that should never exist in the first place, concepts like range checking being an absolute joke, the whole separate .h .c rubbish, etc, etc...

But after years of using C on other platforms (AVR, ARM, MIPS, Windows), the limitations of Pascal are far outweighing the advantages for me at this point in x86 development.

More so, my recent encounters with the "modern" Pascal community via Delphi, xCode, and FPC have pretty much proven to me that the modern Pascal "developer":

1) Talks the talk but can't walk the walk

2) Guards their own efforts, codebases, and "breakthroughs" like Gollum with his ring, huffing "precious".

3) Seems to actually want to actively discourage people from using the language.

The first is no surprise; I've been dealing with that on the web technologies front since I started doing web dev. You deal with ASP, PHP, SQL, HTML, CSS, and JS long enough, and you get used to 99% of the "experts" talking out their arse!

The second is disturbing after years of working on web technologies where everything is "open by design" and communities where you ask a question, show some code, and other people will just improve it and help you because they can. Honestly the response I've gotten has me re-evaluating my low opinion of the FLOSS movement as I've not seen developer behavior this mercenary since the "paid by the KLOC" days.

The third is just... I mean, I thought I was brusque and direct... the difference, I think, is that when I am, I'll take the time to explain and handhold the person on the other end to a better solution -- not vaguely hint that "there is a solution but I'm not telling you, hahahahah", which seems to be all three failings rolled into one response and then repeated ad nauseam.

Add in, much the same way, the fact that Pascal mostly remains hobbled by the compiler including things whether you use them or not, and the lack of a proper "define" (since constants are just variables with a predefined value sucking down DS space), much less macros...

... and I'm done. ME, Mr. Pascal, worshipper at the throne of all things Wirth... is DONE with Pascal.

So... reasoning aside, which C compiler would you folks recommend for DOS development with a TINY target, likely with NASM .obj files linked in? I'm looking at Borland C++ 5.0 since I have a copy AND the manuals... so a good place for me to start.

But I'm wondering if anyone has tried multiple versions or has any other recommendations for me at least to try. I'm likely going to put together a small sub-test to run across multiple compilers to see which does what I'm trying to accomplish the best... but I really don't know what's out there as I've never really used C under DOS before.

Windows? Unix? Linux? OSX? AVR? ARM? Hell, I've even done C for the Commodore 64... But plain Jane DOS? Much less with a TINY memory model? Never done it before. For the most part back in the day when that was my code target, C compilers were still considered something of a joke.

STILL not my first choice of a language, but the state of compilers may be making the choice for me -- as at least it has proper "define", some compilers offer macros, and if I omit "#include stdio.h" I can have a clean baseline for implementing my own optimized routines instead of blindly trusting some library that **** knows how "optimized" it is... or isn't... much less blindly including constants and procedures I don't even plan on using.

Wait, he's not saying that most stdio libraries aren't properly optimized, is he? Given my experience with "libraries" and "frameworks" the past decade on web technologies, I take nothing for granted anymore. See the stupid "leftpad" fiasco for node/babel, where not only were people blindly including something that could be yanked from the source tree upstream, outside their control, at ANY time -- breaking half the web when it was removed -- but NOBODY bothered looking at the function to see if it was well coded, much less whether the creator was even competent enough to be writing code for others to use!

I'm finding anew that my disgust knows no bounds.
 
OpenWatcom produces very good DOS code with executables that are smaller than the museum Borland products. More importantly, the code generation is very reliable. Having had C compilers that optimized code into a non-functional state, I find I prefer code that works over code that is small. www.openwatcom.org

There are a lot of older compilers. I would skip all the very early (1985 and before) C compilers; code generation was uniformly poor. Borland was middle of the pack. Zortech was very nice until it got picked up by Symantec, where good code goes to die. Microsoft alternated between decent versions (though Watcom and Zortech were generally better for DOS apps) and some of the worst compilers ever sold anywhere. Do not ever consider paying for Microsoft C 6.
 
I'm seriously considering kicking Pascal to the curb. As high level languages go it's always been my first love -- the strict typecasting, the forward declarations, the verbose non-cryptic structures...

I almost suggested this myself after reading your "Squeezing blood from a stone" post.

So... reasoning aside, which C compiler would you folks recommend for DOS development with a TINY target, likely with NASM .obj files linked in? I'm looking at Borland C++ 5.0 since I have a copy AND the manuals...

BC++ is pretty good. I used BC++4.52 for Digger Remastered (after having used Turbo C 2.01 for years - BC++4.52 has better code generation and more features).

I also used Microsoft C++ 8.00c (the most recent one which supports 16-bit code generation IIRC) for one of the parts of 8088MPH. I think it's available for free as part of the WDK. I'm not sure how the code generation compares to Borland. A nice thing about this is that it's a proper Win32 program, so you can run it on a 64-bit Windows machine without VPC or DOSBox or whatever.

Scali used Watcom for some other parts of 8088MPH. I don't have much experience with it myself but you should send him a PM and ask him if he doesn't find this thread.

A while back I looked into making a modern GCC port targeting 16-bit x86. A friend of mine found some old patches that someone else had done and did some work towards updating them to the latest (at that point) GCC. I think he had a basically working compiler at one point, though it was rather "raw" and probably not usable for anything non-trivial. It's at https://github.com/crtc-demos/gcc-ia16 if you're interested in taking a look at that. I'm not sure if the state of LLVM for IA16 is any better but it might be worth a look.
 
... So... reasoning aside, which C compiler would you folks recommend for DOS development with a TINY target, likely with NASM .obj files linked in? I'm looking at Borland C++ 5.0 since I have a copy AND the manuals... so a good place for me to start. ...

Dave Dunfield's Micro-C might be your ticket: http://www.classiccmp.org/dunfield/dos/index.htm (scroll down to the Programming Languages heading).

If that is too limiting, then you can always use the old standby, DJGPP, which can produce 16-bit code (http://www.delorie.com/djgpp/16bit/).
 
Microsoft C, version 8 was my "go to" compiler for many years. It's still very decent. If you know how to use it for both tiny and small memory models, it works fine--particularly when /OX optimization is selected. I've tried several other non-free Cs, and always kept coming back to MS C. The inline assembler is also very good. I wrote my own I/O library.

Earlier versions of the compiler, not so much--some of the bugs were really atrocious.

It and MASM (particularly version 6.x) are what I made my living with for many years. I'm still here, so it must have worked.
 
Great recommendations guys, I've refined my requirements and done some testing...

Micro C was eliminated as it only runs from DOS. One of my goals is to have something I can run from a 64 bit windows or Linux shell native.

Microsoft C, early versions are indeed rubbish, and while the version 8 @Chuck(g) recommended isn't horrible, the resultant file sizes and minimum library are more of a step sideways from TP7 than a step forward. What really makes me reject it, though, is the piss-poor documentation that reads like someone wrote it in Finnish, had a Russian translate it to Japanese, then let Google change it back to broken Engrish.

I don't know what it is, but EVERY Microsoft development tool, the language they use in the manuals, books, tutorials, and anything remotely related to it leaves me thinking I just plodded through Jimmy James, Macho Business Donkey Wrestler.

There are times I oft wonder if I learned the same English as the rest of you.

GCC, given my relationship in the past with all things GNU and experience with the ridiculously painfully slow inefficiencies... no. On general principle, not going there.

DJGPP I've never heard good things about, and dealt briefly with in FPK's early days... seems dead now, so really not looking good. Couldn't even seem to get it to work here without dicking around so, whatever.

BCC 5.02 surprised me with a minimum .COM size three times that of TP7's .exe's... not encouraging. It too doesn't run from inside the 64 bit shell environment... so scratch that.

So like an idiot I try the first one suggested LAST, and it does EXACTLY what I want.

Open Watcom it is. A baseline empty .COM is 988 bytes, my little sample to test the basics of what I want to do is 4k vs. the 8-12k of everything else... It's EXTREMELY well documented, I like that the linker PDF is separate from the C compiler one. Font choice was a little wonky for screen use but that was easy enough to override. If they offered those in print, I'd buy 'em.

I was up and running in a fraction the time of anything else, it's giving me PROPER control over what it outputs, is MORE than feature-rich enough for my needs... Every question I've had in my process of slowly starting to port things to it I've found in the documentation PDF's quite easily. That's more than I can say for a lot of other compilers where you end up stuck with either crappy online tutorials, a poorly formatted .txt file, or something where the developer up and decided it was wrong to put more than 30 words of meaningful text on a page because "ooh, whitespace cool".

So thanks for all the suggestions, and big extra thanks to @krebizfan for being first in the door suggesting Open Watcom and @reenigne for mentioning that @Scali uses it. For the time being, it seems we have a winner.

Now to see if I can port this entire codebase over. Having everything NEAR is going to be nice, no more having to spend time screwing around worrying about any selector other than making sure ES is set to 0xB800... much less the overhead savings that look to be paying off rather quickly -- particularly since, unlike Turbo Pascal, C will actually let me have NEAR pointers.

Also means I can stop screwing around pushing/popping BP and just copy SP to BX on a lot of my code... and it seems to have instruction switches for passing vars on registers as an option too. That will be fun to play with.

It's funny, MOST of my reason for wanting it to run from a local shell was actually to run NASM native as a single one-place build process SEPARATE from my testing environment. Pretty much any assembler under DOSBox is agonizingly slow -- case in point: it took around 20 seconds for NASM to assemble my txtgraph.asm under DOSBox with cycles set to MAX, when TP7 could build and link all my units in under a second. Don't even get me STARTED about MASM or TASM...

Now I can just run my build script and it's gonna be done almost as fast as it took me to type the command.
 
Last edited:
I used Turbo C++ 3.0 for DOS for years. It includes C as well as C++ and supports inline assembler. The code that it produces is simple to understand, but not well optimized. After a while I got tired of writing key parts of mTCP in assembler only to find that the vast majority of the remaining code was slower than snot because of the generated code.

Open Watcom is generally far superior; it has a more modern runtime with more of the C and C++ library routines that I expect. The assembler is good, and their support for inline assembler via pragmas is excellent. The generated code is reasonable. There is some stupidity in it, like its desire to bring in the double byte versions of the library functions, or its insistence on having both a near heap and a far heap when only the far heap will be used. But those things can be worked around.

And lastly, you can see the source code for the compiler and the library. That's really handy when debugging a problem and you suspect the runtime is doing something stupid.
 
I don't know what your baseline OS is, but MS C 8.0 as well as MS C 32-bit and MASM run just fine in DOSemu on Linux with the HXDOS stuff included. MASM 6.1+ is a hugely complex assembler, with macro features that none of the others have, so yeah, it runs slow in its 16-bit version. Runs like the wind in Win32 mode, however.

If you're judging the size of the .COM file by the standard entry-point prologue, then you're including a lot of stuff you don't need. Write your own--I did. MS includes the source code for that.

MS Documentation; well, once upon a time it wasn't too awful. The problem is that, like Windows, stuff just got tacked onto the documentation, rather than the documentation being rewritten, as it should have been. I've been using MASM since the truly awful 1.0 and MSC since it was Lattice, so the documentation doesn't bother me. But it's like Philadelphia--if you lived there, you wouldn't be lost.

One reason for me using the MS compiler/assembler is that at one time, I wrote a bunch of device driver code for Windows. It's just so much easier using MS tools and not worrying about being tripped up by some incompatibility in the DDK code.

I'm surprised that nobody mentioned Zortech C. If you can find an old copy, it's pretty good.

At any rate, congratulations on your choice--best of luck with it.
 
If you're judging the size of the .COM file by the standard entry-point prologue, then you're including a lot of stuff you don't need. Write your own--I did. MS includes the source code for that.
That's the thing that was pissing me off in FPC -- first it's "write your own, it includes the full source" -- like I want to wade through a megabyte of source spread out over dozens of separate files just to make it NOT include things when I'm NOT INCLUDING ANYTHING.

I mean if my entire source file is:

Code:
void main() {}

What the **** is it "including" by default when my target is tiny? I could understand it if I included stdio, but without any libs a baseline C compiler (or any compiler, really) should only MAYBE have some minor range checking code and stack handling in place. ESPECIALLY on tiny. Really, if that breaks 1k and I have to dig in to rewrite ANY form of library that I'm NOT explicitly including in the first damned place and that the compiler transparently pulls out of its arse, then there is something utterly ****tarded with the compiler.
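
For what it's worth, the bytes in an "empty" program come from the startup code, not from anything you #include. Here's a rough, portable sketch of the kind of work a typical crt0 does before main() ever runs -- every name here is illustrative, not any real compiler's runtime:

```c
/* Illustrative stand-ins -- not any real compiler's runtime. */
int user_main(int argc, char **argv);     /* the program's main() */

static char *fake_argv[] = { "prog", 0 };

/* What a typical crt0 does, sketched in C (the real thing is
   assembly and runs before any of your code): */
int crt0_sketch(void)
{
    /* 1. set up segment registers and the stack (asm in reality)     */
    /* 2. initialize the heap and the stdio buffers, hook exit paths  */
    /* 3. parse the command tail (PSP offset 80h under DOS) into argv */
    int rc = user_main(1, fake_argv);
    /* 4. flush buffers, run atexit() handlers, exit back to the OS   */
    return rc;
}

/* An "empty" program still drags steps 1-4 in at link time. */
int user_main(int argc, char **argv) { (void)argc; (void)argv; return 0; }
```

Even with an empty main(), the linker pulls in all four steps, which is roughly where the ~1K baselines come from; a leaner runtime (or supplying your own entry point) is how the size comes down.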

... and apparently that opinion comes across as alien to some folks.

I dunno, maybe I've been doing C on WinAVR for too long now. It's sad when CC65 feels more well thought out.

LAUGH is, I already HAVE all the stuff I need it to do for IO written, so I JUST want to cut the crap out, but with both FPC and MSC, **** knows where to even start and when you ask, the most common answer is "It's undocumented so either pay someone or figure it out on your own."

That's the shit that made me kick modern pascal to the curb in the first place as to be frank, **** that noise. Because I'm gonna waste a year sifting through spaghetti code to make it behave how a compiler actually should behave out of box if it were truly platform neutral. It's sad enough I've basically pissed away two years thanks to my health without the damned tools getting in the way as I'm ramping back into development mode.

Admittedly, platform neutrality was NEVER on Microsoft's to-do list when it came to C, or really any language they made once DOS became their bread and butter.

I think that's what actually really pissed me off; making more work to make it NOT do something, thanks to the entire mess having a nasty case of taking the simplest of things -- entry/exit code wrapping simple high level compiled code -- and turning it into a needlessly, pointlessly, agonizingly convoluted MESS... to the point it ends up more like brain surgery when it should be tinkertoys.

"You can check your anatomy all you want, and even though there may be normal variation, when it comes right down to it, this far inside the head it all looks the same. No, no, no, don't tug on that. You never know what it might be attached to." -- Buckaroo Banzai

But then pretty much my entire development career involved taking systems and codebases that were killing the client on hardware costs and/or speed efficiency, tossing the entire mess in the trash, and providing far better optimized solutions. See web development, where you take a site whose SQL activity was killing a Dual Xeon with 8 gigs of RAM in a five-thousand-a-month managed hosting facility, with an uptime of "uptime, what's that?", and gut it down with zero functionality differences (apart from actually working) to run on a $10/mo VPS with processing and I/O time to burn.

... or even the $20K I cleared as a teenager in one summer of the early '80s with simple scripts that just deleted comments from DiBOL or BASIC code.

Aka, the reason people used to use Borland languages instead of Microsoft ones in the DOS age.
 
Just a thought -- I don't know if there are suitable Oberon compilers that would be capable of doing what you're after? Otherwise disregard my comments if it's already been considered.

It's disappointing that Pascal isn't being supported properly. It's probably why you don't see Pascal cross-compilers being written for 8-bit systems; the ones that exist are mostly derived from C. Though compilers like SDCC, for example, suggest they're based on GCC.
 
My guess is that you wouldn't like any of the C compilers and runtime for ARM MCUs--they usually involve a whole bucketload of setup code before you get to main(). The generated code on the Cortex MCUs can be very efficient, but the MCUs themselves are very complex.

C itself must include support for standard K&R features--the I/O library, memory allocation, etc. If you want tiny, you write your own prologue code and libraries to get rid of those features. If you can do that, then a compiler is pretty much a compiler. A good one will generate tight code and give you control over how the code is generated; a bad one won't. Back in the day, I'd gone so far as to purchase Watcom C++ for 32-bit code. The code generation was so bad that I ended up giving the whole big box full o' books away. Borland C++ was just different enough from Microsoft C to confuse some people who were licensing my code. Since Borland was living in a Microsoft world, it was easiest to write in MS C and let the users of Borland adapt. The MS approach also made it easier for me to support the MS Visual BASIC crowd.

Really, unless something is really, really awful, such as the assemblers for some of the 8-bit MCUs, I've learned to roll with the punches.

If you want tiny, write in assembly.
 
C itself must include support for standard K&R features--the I/O library, memory allocation, etc.
That's the thing though -- the I/O library should be optional should it not? After all that's what stdio.h is, right? I don't include stdio, I don't expect to have an I/O library loaded. I know, crazy me...

... and really, if malloc ends up more than 64 bytes, there's something wrong with it and/or the host OS (if any) -- ESPECIALLY on a 16-bit memory address target where the only thing you should have to worry about is smacking into the stack (assuming the heap sits atop data growing upwards, with the stack growing down from the top of the address space).
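
As a sanity check on that size claim, here is a minimal bump allocator in portable C, with the shape described above (heap growing up toward the stack, no free list). The fixed arena and all names are illustrative:

```c
#include <stddef.h>

/* Illustrative arena standing in for the gap between data and stack. */
static unsigned char heap[4096];
static size_t brk_off = 0;          /* current "break", grows upward */

/* Bump allocator: no free list, no block headers, no free(). */
void *tiny_malloc(size_t n)
{
    void *p;
    n = (n + 1u) & ~(size_t)1u;     /* keep word alignment for 16-bit */
    if (brk_off + n > sizeof heap)  /* would smack into the "stack" */
        return 0;
    p = &heap[brk_off];
    brk_off += n;
    return p;
}
```

On a real tiny-model target the bounds check would compare against SP rather than a fixed arena, but either way the whole thing compiles down to a few dozen bytes, which is the point being made.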

So far my experience with ARM has been fairly problem-free, but mostly I've been using the Arduino command line build on Cortex-M4 in environments (like the Teensy) where I start byte counting. It's supposedly a GCC fork, but you'd never know it since they seem to have removed all the GNU / FLOSS mental enfeeblement.

Of course, I'm all happy -- got three Teensy 3.2 on the way to play with now. 72MHz M4 with 256k of flash and 64k of RAM? Quite a step up given said project was originally planned around a 16MHz ATmega32.
 
Remember that the stdio stuff has to allocate buffers, particularly for stream I/O, manage "raw" vs. "cooked" I/O and do other stuff to stay within the constraints of the standard that probably requires initialization code in the prologue. After all, you get stdout, stderr and stdin in buffered mode without even asking for it. So, if you want to be really tiny, write your own support.

There's a boatload of code hiding in that Arduino implementation on the M4, though, just as there is in the PIC32 Uno setup. Try writing a 200 byte executable using that.

I use gcc and the ST Micro libraries for my work.
 
Blame it on MS labeling. The banner from executing "cl":

Code:
C:\>cl
Microsoft (R) C/C++ Optimizing Compiler Version 8.00c
Copyright (c) Microsoft Corp 1984-1993. All rights reserved.

usage: cl [ option... ] filename... [ -link linkoption... ]

I think that MS started calling it Visual C 1.0 when they started bundling it with an IDE. So you have to distinguish between the compiler and the whole suite. I believe that there might even be an 8.00d out there.
 
Remember that the stdio stuff has to allocate buffers, particularly for stream I/O, manage "raw" vs. "cooked" I/O and do other stuff to stay within the constraints of the standard that probably requires initialization code in the prologue. After all, you get stdout, stderr and stdin in buffered mode without even asking for it. So, if you want to be really tiny, write your own support.
I don't know what part of what I'm saying is failing to register with people... but that's EXACTLY WHAT I'm SAYING. If I don't include stdio.h shouldn't NONE of that EVER apply?!? I'm TRYING to use my own damnit! I shouldn't have to dig into the compiler internals to do it! I omit stdio.h or use my own, why the **** is it still including all the rest of the crap?!? 90% of which ONLY applies to stdio!

I mean, if I DON'T say #include (or in Pascal's case "uses" or $i) shouldn't I just be getting enough to load it into memory, return/exit/terminate properly, and the basic memory management -- and that's IT?!?

'Cause that's NOT what a lot of compilers seem to be doing these days.
 
I've never had any problem that way. Sure, you may have to dummy a couple of things out, but the simplest thing is to remember that main() is just a subroutine like any other. Use a different name instead and link in your own prologue and entry point and you probably won't get anything else.

I think you're dwelling too much on a minor point, but I can illustrate an example if you'd like.
 
Scali used Watcom for some other parts of 8088MPH. I don't have much experience with it myself but you should send him a PM and ask him if he doesn't find this thread.

Yes, for integer code it's very nice. Generates fast code, seems to be quite reliable.
Some pointers:
1) Careful: it uses unsigned chars by default; on x86 it is more common to use signed chars, but there's a compiler switch for that. So if you have code that worked with MSC or BC++ or such, and it doesn't work in OpenWatcom, see if it's your chars.
2) I've had some inconsistent results when using floating point. Couldn't quite put my finger on it, but sometimes it seems the floating point precision/rounding is very bad. I think there are some bugs in its x87-backend. I had some routines that initialized things like sin-tables, sqrt, 1/x-related stuff... And my code broke in OpenWatcom, while it worked in BC++ and MSC. I ended up just precalcing the tables on another system, and including them in the binaries, as a workaround.
3) There is a bug in the FPU detection routine which locks up an 8088 or 8086 system when no FPU is installed. I have patched the libc to fix this issue (doesn't impact machines with FPU or 186 and higher, so you can use these libs for any 16-bit target): https://dl.dropboxusercontent.com/u/72602692/Watcomclibc8088.zip
See also this issue: https://github.com/open-watcom/open-watcom-v2/commit/40dfa157a1f419a477a04392473d188a53495966
Or this earlier thread: http://www.vcfed.org/forum/showthre...-9-built-binaries-lock-up-on-8088-without-FPU
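
Point 1 above is easy to trip over because the difference only shows up once a char holds a value above 0x7F. A portable illustration, written with explicit signed/unsigned so the result doesn't depend on any one compiler's default (the 0x9A value is just an example of a "high" byte):

```c
/* 0x9A is a typical "high" byte: CP437 graphics char, scan code, etc. */
int high_byte_negative_as_signed(void)
{
    signed char c = (signed char)0x9A;      /* -102 after promotion to int */
    return c < 0;                           /* true */
}

int high_byte_negative_as_unsigned(void)
{
    unsigned char c = (unsigned char)0x9A;  /* 154 after promotion to int */
    return c < 0;                           /* always false */
}
```

Code written against a signed-char default (MSC, BC++) that compares chars against 0, or widens them into array indexes, silently changes behavior under an unsigned-char default -- hence the "see if it's your chars" advice.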
 
I don't know what part of what I'm saying is failing to register with people... but that's EXACTLY WHAT I'm SAYING. If I don't include stdio.h shouldn't NONE of that EVER apply?!?

Well, the thing is, int main() is not the entrypoint from the OS point of view. I think what Chuck(G) is trying to say is that for a K&R-compliant environment with int main(), various OS-specific things need to be abstracted and initialized.
So, the actual OS-entrypoint is in a library, and that library calls your int main().
This library is always included by default, and, depending on the granularity of the specific compiler's libraries, may pull in quite a bit of standard functionality.

Of course, with C/C++, headers do not pull in any code whatsoever; libraries do.
Linkers will generally have a switch to override any default libraries, so you can start with a 'clean slate'. You also have to specify your own entrypoint then, because main() is no longer being called by some library routine with its own OS-entrypoint.
But once you do that, you're basically in the same environment as you are when programming assembly. Of course, as Chuck(G) also pointed out, this also means that you can no longer use various C libraries, such as I/O routines and malloc, because normally their initialization routines would be called before reaching main(). You have to explicitly set up these libraries manually, before they can be used.
More often than not, this is not worth the trouble, and you're better off writing your own routines based on OS-specific APIs.
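
The "clean slate" arrangement described above can be sketched in portable C. All the names here (run_program, app_entry) are made up for illustration; the real hookup is a linker switch for the entry point plus, under DOS, a direct exit via int 21h/4Ch:

```c
/* Hypothetical sketch of a custom entry point -- names are illustrative. */
static int program_ran = 0;

/* Your renamed "main": just a subroutine like any other. */
void run_program(void)
{
    program_ran = 1;   /* real code would go poke 0xB800:0000 or similar */
}

/* Your own entry point: no argv parsing, no stdio or heap init.
   In a DOS .COM this would be the first byte at CS:0100 and would
   end with mov ax,4C00h / int 21h rather than returning. */
int app_entry(void)
{
    run_program();
    return program_ran;   /* stand-in for the exit code handed to the OS */
}
```

The tradeoff is exactly as stated: from app_entry onward you're in the same world as hand-written assembly, so any C library routine that relies on runtime initialization (stdio, malloc) is off limits until you set it up yourself.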
 