Which DOS C Compilers had smart linkers?

alank2 · Apr 9, 2022

I think a smart linker is one that will not link in code or data that is not used. I used to use the library technique to accomplish this with Borland products, because their linker would link in the entire object file. Essentially, I would put each function in its own cpp file and then compile them all into a library. The linker would then link in the functions a program uses, but none of the ones it doesn't.
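A minimal sketch of that one-function-per-file layout. The file names and functions are hypothetical, and the three "files" are shown as one listing (with comments marking the boundaries) so that it compiles as-is.

```c
/* ---- add.c (hypothetical): would compile to its own object file ---- */
int add(int a, int b)
{
    return a + b;
}

/* ---- mul.c (hypothetical): would compile to its own object file ---- */
int mul(int a, int b)
{
    return a * b;
}

/* ---- app.c (hypothetical): the program itself ----
 * It references only add(). A linker that works at object-file
 * granularity would pull the add module out of the library and
 * leave the mul module behind. */
int app_total(void)
{
    return add(2, 3);
}
```

Had add() and mul() lived in the same source file, both would end up in the program with such a linker, even though only add() is called.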

So my question is: what other C compilers for DOS could do this automatically? Did gcc call this whole program optimization or something similar?

Chuck(G) · Apr 9, 2022

I'll have to check my MSVC 16-bit version, but I know that Microsoft LINK for 32-bit Windows does this automatically. (see /OPT:REF).

juj · Apr 9, 2022

My understanding is that all linkers do this since the advent of the compilation+linking programming model. That is the purpose of linking: it looks at missing undefined references to symbols in one object file/library, then satisfies those references by adding in the symbols from another object file/library. If there is some code in the library that the main code did not reference, then there is nothing to bind those functions to, so they naturally don't get bound in to anything as a result and get left out.

Maybe there were some really simple linkers that did add everything in a library to the other library. However this would not be much of a simplification: the linker will still need to walk through all the undefined references to symbols in the main program to perform the call site binding of those undefined symbol references to the actual linked in symbols, so it is not as if the linker could somehow skip needing to look into that. So the linker will always end up forming a knowledge of what were the symbols that ended up being needed in a library.
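To make the resolution walk concrete, here is a toy model in C. It is only a sketch: the struct, the module names and the symbols are all made up for illustration and do not reflect any real linker's data structures. Each library member is pulled in as a whole the first time one of its symbols is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* One object module inside a library (hypothetical model). */
struct member {
    const char *name;
    const char *defines[MAX_SYMS]; /* symbols defined here, NULL-terminated */
    const char *needs[MAX_SYMS];   /* symbols referenced here, NULL-terminated */
    int linked;                    /* set to 1 once the module is pulled in */
};

/* Made-up library contents. */
struct member demo_lib[] = {
    { "io.obj",   { "print_line", "read_line" }, { "put_char" }, 0 },
    { "char.obj", { "put_char" },                { NULL },       0 },
    { "math.obj", { "square_root" },             { NULL },       0 },
};

/* Return the index of the member defining sym, or -1 if none does. */
static int find_definer(const struct member *lib, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && lib[i].defines[j]; j++)
            if (strcmp(lib[i].defines[j], sym) == 0)
                return (int)i;
    return -1;
}

/* Pull in the member that defines sym, then resolve whatever that
 * member references. Returns the number of members newly linked. */
static int resolve(struct member *lib, size_t n, const char *sym)
{
    assert(sym != NULL);
    int idx = find_definer(lib, n, sym);
    if (idx < 0 || lib[idx].linked)
        return 0;            /* not in this library, or already present */
    lib[idx].linked = 1;     /* the whole module comes in as one unit */
    int count = 1;
    for (size_t j = 0; j < MAX_SYMS && lib[idx].needs[j]; j++)
        count += resolve(lib, n, lib[idx].needs[j]);
    return count;
}

/* The "main program" has one undefined reference: print_line. */
int demo_link(void)
{
    return resolve(demo_lib, sizeof demo_lib / sizeof demo_lib[0], "print_line");
}
```

In this model demo_link() brings in io.obj and char.obj and never touches math.obj. Note that read_line comes along with io.obj even though nothing calls it, because the unit of selection here is the module, not the function.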

In most compilers there is a way to perform a full library concatenation (sometimes called a "concatenating link" or an "assembling link"). This is generally not a linking process, but a process where all symbols in a library are appended to another library without resolving any symbol references. Library archiving using ar or ranlib conceptually is a process like that, and so is using clang when assembling LLVM .bc bitcode files together. Those procedures are not to be called proper "linking" though. Sometimes people prefer this kind of "linking" process as a simpler way to resolve circular library references, which would otherwise require linking in a library more than once, or using link groups. ( https://stackoverflow.com/questions...tart-group-and-end-group-command-line-options )

If you are seeing a library concatenation taking place, rather than a link, then maybe it is possible that there exists some linker flags to specify which type of operation it is supposed to perform.

Whole Program Optimization (also Link Time Optimization, LTO, or sometimes Full Interprocedural Optimization) is completely different. It is a process where compiler optimization passes are run again after all code has been brought together by linking. Generally optimizations are run at compiling stage, so they only "see" code individually in each compilation unit (.c/.cpp file) alone. After the linker has been run and all code has been linked together, one can do another invocation of the same compiler for a second run of its optimization passes, which now will be able to operate on all code, and so do optimizations that it would not have had visibility to do otherwise, e.g. constant propagation across function callers and callees across compilation units, more aggressive inlining, or removing function formal arguments that were only ever called with a specific constant literal as input parameter. See https://en.wikipedia.org/wiki/Interprocedural_optimization or https://llvm.org/docs/LinkTimeOptimization.html
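A small illustration of the constant propagation case. The file names and functions are hypothetical, the units are shown in one listing so it compiles as-is, and the last function is a hand-written picture of what such a pass could derive, not the output of any particular compiler.

```c
/* ---- scale.c (hypothetical compilation unit 1) ---- */
int scale(int x, int factor)
{
    return x * factor;
}

/* ---- app.c (hypothetical compilation unit 2) ---- */
int scale(int x, int factor);   /* all that app.c sees of unit 1 */

int scaled_sum(int a, int b)
{
    /* Compiled alone, this unit cannot inline scale(), and unit 1
     * cannot know that factor is always 4. */
    return scale(a, 4) + scale(b, 4);
}

/* ---- what a whole-program pass could effectively reduce it to ---- */
int scaled_sum_after_wpo(int a, int b)
{
    return a * 4 + b * 4;   /* scale() inlined, the constant 4 propagated */
}
```

Both versions compute the same result; the second simply no longer needs the call or the factor parameter.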

Ruud · Apr 9, 2022

To be honest, I'm not familiar with linkers and object files. But I'm writing my own Pascal and I ran into a similar problem. If you write a program using Turbo Pascal 3, the COM file contains all possible needed functions. So if you compile an empty program, it is already 10 KB big. The more lines you add, the bigger it becomes.
My Pascal generates macros and the assembler turns these macros into assembly and then a binary file. Every function in my Pascal has its own macro. If a macro is encountered, a flag is set. A macro can call other macros: the function "writeln" will need the macro that outputs a char to the screen. At the end of the Pascal program the compiler checks all flags and if set, the macro is written to the initial source file for the assembler.
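
Roughly, the flag idea could be sketched like this in C. This is not my actual compiler code; the routine names and the table layout are invented just to show the marking and the final pass.

```c
#include <assert.h>

/* Hypothetical runtime routines, one per macro. */
enum { RT_WRITELN, RT_OUTCHAR, RT_READLN, RT_COUNT };

struct routine {
    const char *name;
    int deps[RT_COUNT]; /* routines this one calls, terminated by -1 */
    int used;           /* the flag: set when the routine is needed */
};

struct routine runtime[RT_COUNT] = {
    [RT_WRITELN] = { "writeln", { RT_OUTCHAR, -1 }, 0 },
    [RT_OUTCHAR] = { "outchar", { -1 },             0 },
    [RT_READLN]  = { "readln",  { -1 },             0 },
};

/* Called whenever the compiler meets a routine in the source.
 * Sets its flag and the flags of everything it depends on. */
void mark_used(int id)
{
    assert(id >= 0 && id < RT_COUNT);
    if (runtime[id].used)
        return;                       /* already flagged */
    runtime[id].used = 1;
    for (int i = 0; i < RT_COUNT && runtime[id].deps[i] >= 0; i++)
        mark_used(runtime[id].deps[i]);
}

/* End of compilation: collect the names of the flagged routines,
 * which are the only ones written to the assembler source.
 * Returns how many names were stored in out[]. */
int emit_used(const char *out[], int cap)
{
    int n = 0;
    for (int id = 0; id < RT_COUNT && n < cap; id++)
        if (runtime[id].used)
            out[n++] = runtime[id].name;
    return n;
}
```

A program that only uses writeln would end up with writeln and outchar in the output, and readln would be left out.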

So the first idea of the OP is quite valid: write separate source code for every needed object file. The compiler knows what functions are addressed and tells the linker what object files to use.

But using only parts of an object file? IMHO that is at too high a level.

Chuck(G) · Apr 9, 2022

I did check the 16-bit MS LINK and it doesn't have that option; only the Win32 version does.

whartung · Apr 9, 2022

The whole premise of the library files was to deal with having the routines in individual files. The library files stored several object files and the linker pulled them out as necessary, each one essentially an atomic unit. The library files were necessary because the file systems just got overrun with so many little object files. So, they were bundled.

GeoffB17 · Apr 9, 2022

If you want some numbers..?

I'm still maintaining a DOS application, used by a Trade/Retail warehouse. I'm using MS C7, with substantial extras from a CodeBase package (C code for dBase files and FoxPro CDX indexes). I use the BLINKER link system.

I've never done anything regarding the sort of optimisation referred to, so these numbers relate to the packages 'out-of-the-box'.

My installed application is 615,304 - a single .EXE, using overlays generated within BLINKER.

The libraries referenced are the main CodeBase lib (434,999), which is heavily used, an associated library for screen and utility functions (168,845), plus the Large model C Lib (422,249), which will be referenced by both the other libs and by my code, but nothing like as much as the first two libs mentioned. As well as this, the final .EXE will contain all my code, maybe getting towards 300k.

If the link process was including everything, then it should be about 1.3 MB. It is, however, roughly half of that. So the system must be being fairly selective as to what it includes in the linked .EXE.
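
For anyone who wants to check the sums, a quick sketch in C. The library sizes and the .EXE size are the figures above; the 300,000 for my own code is only the rough "towards 300k" guess, so treat the result as approximate.

```c
/* Size of the .EXE if every byte of every library were linked in. */
long full_link_estimate(void)
{
    long codebase_lib = 434999L;
    long screen_lib   = 168845L;
    long c_runtime    = 422249L;
    long own_code     = 300000L;  /* assumption: rough guess, not measured */
    return codebase_lib + screen_lib + c_runtime + own_code;
}

/* Actual installed .EXE as a whole-number percentage of that estimate. */
long actual_percent_of_estimate(void)
{
    long actual_exe = 615304L;
    return actual_exe * 100L / full_link_estimate();
}
```

The estimate comes to 1,326,093 bytes, and the real .EXE is a little under half of it.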

Geoff

Chuck(G) · Apr 9, 2022

Are we talking about unreferenced routines within an object module, or unreferenced modules within a library? The first, if not detected by the compiler, would be pretty tricky to delete in the linker.

For example, if a non-global routine in the main program module is not referenced by any other routine in that module, the compiler can omit it--I'm not sure about the linker being able to do the same.
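
A minimal C sketch of the distinction, with hypothetical function names. Whether a given compiler actually drops the unused static routine depends on the compiler and its settings; the comments describe what it is allowed to do, not what any particular one does.

```c
/* ---- main program module (hypothetical) ---- */

/* Internal linkage, and called below: has to stay. */
static int helper_used(int x)
{
    return x + 1;
}

/* Internal linkage and no callers in this file. The compiler can see
 * every possible reference, so it may omit the routine from the object
 * module (and will typically warn that it is unused). */
static int helper_unused(int x)
{
    return x * 2;
}

/* External linkage and no callers in this file. Some other module might
 * call it, so the compiler has to emit it; only the link step is in a
 * position to know that nobody ever did. */
int exported_unused(int x)
{
    return x - 1;
}

int run(void)
{
    return helper_used(41);
}
```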

alank2 · Apr 10, 2022

I was reading and found that TopSpeed C/C++ apparently did this for DOS.

I think the win32 linker for Borland does it, but the 16-bit linker doesn't. The workaround is putting each function in its own cpp file, but it is annoying to maintain.
