
Writing Assemblers... What should a good assembler do?

This. If you are going to use it yourself, then make it work the way you want it to work. But it's still helpful to have experience with different assemblers so that you know what kinds of things are possible. Macros are one area with very wide variability, and it can be very tricky to make a useful macro system if you don't have a lot of experience with writing code to parse text.
 
As long as you're using global labels for your macro arguments, then pleas for naming the arguments will fall on deaf ears.

One problem with the ARG1, ARG2, ... scheme is that you cannot easily insert an argument into the macro, since doing so shifts all of the others and the whole macro body has to be reworked. Also, with enough ARG1, ARG2, etc., you find they blur together and lose their meaning.

It's all good when the macro is 5 lines long, but it's different when macros get extensive.

The argument for just using labels is fine; by the same argument you should be able to do:

Code:
SET X = 1
SET Y = 2

MYMACRO X, Y

There's certainly nothing wrong with using external variables in macros, as long as the macro itself does not shadow them inadvertently.

The logic for INCLUDES and MACROs should be much the same, and, arguably, shared, to a point.

Code:
.INCLUDE "MYFILE.ASM", P1, P2, P3
and
Code:
MYMACRO P1, P2, P3
is, for most use cases, essentially an identical process, save that the macro definition is already hanging around in memory rather than being read from a file on the fly. It seems silly to have an include file with 3 lines of code. (And I don't think anyone would bark if you didn't allow macros within macros.)

It's mostly about parsing the parameters, and setting up scope, which you need to do with includes and local or context aware labels anyway.
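To make the shared path concrete, here is a minimal sketch in Python rather than assembler (the function names are hypothetical, and binding the parameters into ARG1, ARG2, ... labels stands in for whatever mechanism your assembler actually uses):

Code:
# Both "MYMACRO P1, P2, P3" and .INCLUDE "MYFILE.ASM", P1, P2, P3 come down to
# "bind the parameters, then feed a stored body back through the assembler".

def bind_args(labels, params):
    # The ARG1, ARG2, ... scheme: parameters land in predefined labels.
    for i, value in enumerate(params, start=1):
        labels[f"ARG{i}"] = value

def call_macro(assemble_line, labels, macro_table, name, params):
    bind_args(labels, params)
    for line in macro_table[name]:        # body already held in memory
        assemble_line(line)

def call_include(assemble_line, labels, filename, params):
    bind_args(labels, params)
    with open(filename) as f:             # body read from disk on the fly
        for line in f:
            assemble_line(line.rstrip())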
 
Thanks for the thoughts.

The Macro and Include logic is mostly the same, but Includes don't get extra parameters, and macros can only be chained, not nested. Also the implementation for both is coded separately.

I'm reasonably confident that the macro code, as I implemented it, works OK, should meet most macro usage requirements, and allows at least what Microsoft's MACRO assembler would have been capable of. Hopefully no one will want to create large, complex macros, since the macro text is stored in assembler memory where it competes with labels for space.

And while I recognise it's for a worldwide user base of 1 person, I'm writing this for fun and challenge - It's like building a ship in a bottle - Hence why I have been adding features I don't presently desire. On the other hand, the PC cross assembler might actually be useful to others in the future. As @Svenska noted previously, there are z80 assemblers available under Linux, but using WSL to run an assembler bothers me, and is mostly why I wrote my own to run under Windows 10,11, 12??? or whatever comes next. So I'll rewrite my PC assembler to match the z80 assembler when I'm done :) That way the code can be assembled on either system. So that much of the effort is not wasted at least.

The other value to me is that I still write assembly commercially, so learning new structures and programming techniques and ideas is still valuable to me. Writing an assembler is just an autodidactic approach to learning more about assembler.

The macro store/define/generation function uses the system parser to strip out comments, unnecessary whitespace, unused operators and all characters below $20 (ASCII space) including tabs, which are replaced only with a space when relevant to the syntax of an argument. This ensures that when a macro is called, I only need to track which term (if any) causes an error, but it all shows up correctly for the line that calls it. The intent there was to minimise the memory requirements.
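As a rough illustration of that normalisation (sketched in Python, not the actual parser; the comment character and the example line are just placeholders):

Code:
def normalise(line, comment_char=';'):
    # Cut the comment off first.
    if comment_char in line:
        line = line[:line.index(comment_char)]
    # Replace tabs and other characters below $20 with spaces, then collapse
    # runs of whitespace to a single separating space.
    cleaned = ''.join(c if ord(c) >= 0x20 else ' ' for c in line)
    return ' '.join(cleaned.split())

print(normalise("\tEQU  X,\tARG1   ; bind first parameter"))  # -> EQU X, ARG1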

It seems like the macro function does what is necessary, and can even pass parameters. Using system labels isn't preferred, but then an assembler by nature restricts what you can do. I don't reserve register names, so labels like HL, DE etc. are perfectly acceptable, which allows even register names to be passed to the macro if necessary, by assigning the register opcode token value to the register.

Using predefined argument names comes from DOS/UNIX type shells, where command line arguments are referenced by an argument indicator - eg, ARGV, ARGC, %1, %2 etc. Using reserved text arguments just means code can't use those reserved words, but there aren't many reserved words in the assembler; the list was kept intentionally short. And ARG isn't a label that someone is all that likely to use. Even if they did, it would generate an easily found error rather than accepting it and allowing the collision. It seems an elegant way to allow in-line parameters without needing to define them, which would create a different kind of rigidity.
As long as you're using global labels for your macro arguments, then pleas for naming the arguments will fall on deaf ears.

I may be using global labels for macro arguments, but that doesn't mean I don't support local labels for the macros also. That code is already in place (though it gives me ideas to improve it). But having a reference to the position of the arguments is a bit of a system constraint. Doing so saves memory over other options and still allows local labels.

One problem with the ARG1, ARG2, ... scheme is that you cannot easily insert an argument into the macro, since doing so shifts all of the others and the whole macro body has to be reworked. Also, with enough ARG1, ARG2, etc., you find they blur together and lose their meaning.
This is actually quite easily done, if the programmer is going to the trouble of declaring local variables anyway, since it's not that different from other macros - just the syntax and structure look different.

Code:
MACRO MYNAME
  LOCAL ; Establish local precedence over the label list.
EQU X,ARG1
EQU Y,ARG2
; Do something here with X and Y.
MEND

Would easily be extended like this:

Code:
MACRO MYNAME
  LOCAL ; Establish local precedence over the label list.
EQU X, ARG1
EQU X1,ARG2
EQU Y, ARG3
EQU Y1,ARG4
; Do something here with X, X1 and Y, Y1...
CLEAR X ; Remove all the labels after X, including X.
GLOBAL ; Restore the global labels back to use.
MEND

All that is needed is to reassign the arguments.

When you consider it, it's not that different from defining a normal macro except the declaration is explicit rather than implicit.

So once the macro goes as far as establishing local labels, and they get defined (which a macro parser has to do anyway), then you can add, remove, or change them without any more difficulty than would be expected.
One thing I note - I'm going to have to write some decent documentation explaining all of that. Going local removes the global table. What I need is another command to partially re-attach the global table after making the local declarations. I can't leave it attached by default, because of collisions, but I can make it possible to attach it later, which is more trivial than it may seem.
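For what it's worth, a toy model of that scoping in Python (purely illustrative; the method names and the attach command are hypothetical, not my actual implementation):

Code:
class LabelTable:
    def __init__(self):
        self.globals = {}
        self.locals = None          # None means we're in global scope
        self.see_globals = True

    def local(self):                # LOCAL: local labels take precedence
        self.locals = {}
        self.see_globals = False    # detached by default to avoid collisions

    def attach_globals(self):       # re-attach globals behind the locals
        self.see_globals = True

    def global_scope(self):         # GLOBAL: back to the global table only
        self.locals = None
        self.see_globals = True

    def define(self, name, value):
        (self.locals if self.locals is not None else self.globals)[name] = value

    def lookup(self, name):
        if self.locals is not None and name in self.locals:
            return self.locals[name]
        if self.see_globals:
            return self.globals.get(name)
        return None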

And again, the more I think about it, creating a very flexible system is only half the battle. Documenting that flexibility is going to be the next challenge.

While my assembler is very small, I think it punches way above its weight for features and function :)

Though I need to pull down a real CP/M system some time and try it out to see how well it assembles on a 4MHz Z80 instead of an emulator.
This. If you are going to use it yourself, then make it work the way you want it to work. But it's still helpful to have experience with different assemblers so that you know what kinds of things are possible. Macros are one area with very wide variability, and it can be very tricky to make a useful macro system if you don't have a lot of experience with writing code to parse text.

It's not that I don't have a lot of experience writing code to parse text... Actually, that's probably incorrect because I don't have a lot of experience, but I've written a few parsers at least.

The challenge with choosing a parsing methodology is that once I've settled on a strategy, anything I add or change later becomes constrained by those earlier decisions.

Hence, knowing as much as I could before I set out seemed like a good idea. Instead I took the Indiana Jones route and made it up as I went :(
 
On the other hand, the PC cross assembler might actually be useful to others in the future. As @Svenska noted previously, there are z80 assemblers available under Linux, but using WSL to run an assembler bothers me...
There are plenty of assemblers that run under Windows and even DOS. I've mentioned Macroassembler AS before; I mostly use it on Linux but it works absolutely great under Windows as well. (I've not tested the DOS version.) Along with MINGW Bash (installed when you install Git for Windows), it's allowed a lot of my assembly systems to Just Run under Windows with no extra effort, which is really nice. (You can see one example of many in my tk80-re repo, which uses the 8080 and 8085 targets.)

BTW, if you want to look at some examples of macro use, I just happened to add four examples to my PC-8001 ROM disassembly yesterday. (That's a link to a dev branch; when it stops working you can just check the code on the main branch.) You might find it worthwhile considering whether your assembler would support these, and, if so, whether they would continue to do the same job of making the programmer's life simpler after any modifications you need to make. (You might also find some space-saving parsing ideas in the code itself, such as the qchar macro/qconstchar routine.)
 
As @Svenska noted previously, there are z80 assemblers available under Linux, but using WSL to run an assembler bothers me, and is mostly why I wrote my own to run under Windows 10,11, 12??? or whatever comes next.
There are many assemblers running on other systems as well. The Z80 is and has been an extremely common target. The list at http://www.z80.info/z80sdt.htm contains 5 entries for 32-bit Windows, for example. RomWBW is also built primarily on Windows.

There is at least one Z80-focused IDE running within a web browser, and a few open, semi-commercial or fully commercial ones as well, plus plugins for existing environments (VS Code, Eclipse). In university, we used Zilog's own IDE, which worked great for assembly (but tended to crash when doing C development). Lack of good tooling is definitely not an issue for the Z80.
 
There are many assemblers running on other systems as well. The Z80 is and has been an extremely common target. The list at http://www.z80.info/z80sdt.htm contains 5 entries for 32-bit Windows, for example. RomWBW is also built primarily on Windows.

There is at least one Z80-focused IDE running within a web browser, and a few open, semi-commercial or fully commercial ones as well, plus plugins for existing environments (VS Code, Eclipse). In university, we used Zilog's own IDE, which worked great for assembly (but tended to crash when doing C development). Lack of good tooling is definitely not an issue for the Z80.

Are there any assemblers you could recommend that operate identically on both a CP/M platform in native z80 and a cross-assembly platform? (eg, I can assemble my code in CP/M or Windows 10/11 without using WSL?)

Thanks
David.
 
In that case, I would simply use a CP/M-based assembler. On Windows or Linux, I would run the same assembler through a command-line CP/M emulator.

RomWBW takes this approach for C code, using HiTech C through a wrapper, which guarantees identical binaries across host systems.
 
In that case, I would simply use a CP/M-based assembler. On Windows or Linux, I would run the same assembler through a command-line CP/M emulator.

RomWBW takes this approach for C code, using HiTech C through a wrapper, which guarantees identical binaries across host systems.

If I get different binaries, I have much bigger problems ! :(

At the moment, the z80 version will assemble the same source as my cross-assembler, but not vice versa. I will rewrite my cross-assembler so they produce the same output.

That's my objective. To create a cross-assembler / native-assembler pair - :)

Everything else I've done, someone had already done, so I'm expecting someone has already written a native/cross-assembler matched pair before.

But I don't want to do it through an emulator - I want the PC running it native since my Dev Environment uses JIT assembly for the entire environment every time I run it.
 
And while I recognise it's for a worldwide user base of 1 person, I'm writing this for fun and challenge
Indeed, for heaven's sake don't listen to us! Your project, your challenges.
The Macro and Include logic is mostly the same, but Includes don't get extra parameters, and macros can only be chained, not nested. Also the implementation for both is coded separately.
For sure. See, to me, I'd think about combining them: save some code memory through reuse, gain some simple functionality (i.e. parameters to includes).

With the code savings you can add in the EQU mapping and clearing that you demonstrated manually. You could literally include those EQU into the source stream, and not change your current code at all. (I know this is harder than it sounds, but conceptually you could do that.)
 
With the code savings you can add in the EQU mapping and clearing that you demonstrated manually. You could literally include those EQU into the source stream, and not change your current code at all. (I know this is harder than it sounds, but conceptually you could do that.)

Adding the parameters to the includes is certainly possible, since once the filename is loaded I could use the same routine for the argument mapping.

Logically that would allow the use of includes where nesting is required rather than chaining, even at the cost of disk space and a geometric growth in passes per nesting level.

Can parameters be passed to object files when using link? I would have thought that the one-way nature of creating object files and then linking them would preclude that, especially as there should be no carry-over of labels from the source to the linked object.

But I'm not quite sure what you mean by the above, as remapping the arguments to local or global variables in the stream of the macro is already done. Anything declared in the macro will be played back and recreated whenever the macro is called.

Can you give me a pseudo-code example?
 
That's my objective. To create a cross-assembler / native-assembler pair - :)
If I get different binaries, I have much bigger problems ! :(
Well, you have much bigger problems. One of the situations where you're going to get different binaries is when the CP/M version runs out of memory and the PC version does not.

To have any hope of two assemblers with completely different source code working in the exact same way, you're at the least going to want a very extensive automated test suite.

But I don't want to do it through an emulator - I want the PC running it native since my Dev Environment uses JIT assembly for the entire environment every time I run it.
I don't understand the problem there. I'm not sure what you mean by "JIT assembly," but what's the issue with running the exact same CP/M program you'd run under CP/M under emulation? You don't want the assembler running far faster on the PC?

You are aware that emulators like RunCPM are just command line programs like any other under Unix and Windows, right?
 
But I'm not quite sure what you mean by the above, as remapping the arguments to local or global variables in the stream of the macro is already done. Anything declared in the macro will be played back and recreated whenever the macro is called.
One of the techniques that compiler writers use is to rewrite higher level expressions into lower level ones that the compiler actually knows how to compile.

As a contrived example, say you had if/else and wanted to add a case statement.
Code:
case
    when a = 1 begin
        ...
    end;
    when b = 1 begin
       ...
    end;
    when c = 1 begin
       ...
    end;
end case;
Rather than writing code in the compiler to "compile" the case statement, when you see the case statement you simply convert it into a bunch of if/else clauses
Code:
if a = 1 then begin
    ...
end;
else if b = 1 then begin
    ...
end;
else if c = 1 then begin
   ...
end;
and then feed THAT into the compiler.

So, in your case, when the assembler sees the macro "MACRO MYMAC, A, B, C" it, instead, translates that into:
Code:
MACRO MYMAC
EQU A ARG1
EQU B ARG2
EQU C ARG3
...
ENDMACRO

In that way, your current macro assembler part remains unchanged, rather there's a pre-processing part that converts the source code that it sees and simply stamps in those EQU clauses for you.
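Sketched in Python, and purely illustrative (the directive spellings are just guesses at your syntax), the pre-processing step is tiny:

Code:
def rewrite_macro_header(line):
    # Turn "MACRO MYMAC, A, B, C" into the MACRO/EQU form the existing
    # macro code already understands; pass every other line through untouched.
    parts = line.split(maxsplit=1)
    if len(parts) < 2 or parts[0].upper() != 'MACRO':
        return [line]
    name, *params = [p.strip() for p in parts[1].split(',')]
    out = [f"MACRO {name}"]
    out += [f"EQU {p} ARG{i}" for i, p in enumerate(params, start=1)]
    return out

for new_line in rewrite_macro_header("MACRO MYMAC, A, B, C"):
    print(new_line)
# MACRO MYMAC
# EQU A ARG1
# EQU B ARG2
# EQU C ARG3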

"Syntax sugar" is the term of art for it. You could even show it on the assembler listing, no reason to hide it. Just gets a little wonky if you're counting lines. But you already have that problem with macro expansion anyway (since macro expansion adds lines that do not have associated file line numbers, assuming you can list out macro expansion in the final listing).

It's almost as if the MACRO command is a little macro itself. (Now the world gets really spinny!)
 
Well, you have much bigger problems. One of the situations where you're going to get different binaries is when the CP/M version runs out of memory and the PC version does not.

To have any hope of two assemblers with completely different source code working in the exact same way, you're at the least going to want a very extensive automated test suite.


I don't understand the problem there. I'm not sure what you mean by "JIT assembly," but what's the issue with running the exact same CP/M program you'd run under CP/M under emulation? You don't want the assembler running far faster on the PC?

You are aware that emulators like RunCPM are just command line programs like any other under Unix and Windows, right?

JIT = Just In Time. That is, when I build my dev/test environment, it assembles it all from source every time, rather than just using preassembled binaries. The batch file assembles all of the files necessary, writes the now-assembled binaries to the disk image and boots the emulator.

So assembling the source under the emulator at boot time is going to take a bit longer than assembling it natively on the PC. Also, the PC will have better debugging tools. Also it all needs to happen from the DOS command line.
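Roughly, that per-run build amounts to something like this (sketched here in Python with placeholder tool names and paths; the real thing is a batch file):

Code:
import glob
import shutil
import subprocess

# Cross-assemble every source file natively on the PC.
for src in glob.glob("src/*.asm"):
    subprocess.run(["crossasm.exe", src], check=True)

# Copy the fresh binaries into the folder the disk-image builder uses.
for binary in glob.glob("src/*.com"):
    shutil.copy(binary, "diskimage/")

# Rebuild the disk image and boot the emulator with it.
subprocess.run(["makedisk.exe", "diskimage/", "dev.img"], check=True)
subprocess.run(["emulator.exe", "dev.img"], check=True)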

Can I be 100% certain both will produce the same binary? Well, no, as a bug is always possible, but we are talking about an *assembler* here. Its job is to produce the correct binary, so I'd expect the same binary even if I just wrote a translator and used someone else's assembler to assemble the binary. In fact, I generally test my assemblers by using source designed for a different assembler, translate the source, then make sure I get the same binary.

As for memory issues, yes, the CP/M version will drop out first, but it won't produce a different binary - it will just produce an error. It's also possible to use the label table manipulating commands to reuse memory in different ways to reduce the load on system memory by deleting old entries that are no longer required, and cleaning up the label table.

The source is around 190K at present, but it should still assemble easily on a 64K machine or even a 32K CP/M machine. As a rule of thumb the labels typically consume around the same memory as the assembled size, so source that generates a 16K ROM will need around 16 to 20K of system memory if long labels are used. Less if short ones.
 
So, in your case, when the assembler sees the macro "MACRO MYMAC, A, B, C" it, instead, translates that into:
Code:
MACRO MYMAC
EQU A ARG1
EQU B ARG2
EQU C ARG3
...
ENDMACRO

In that way, your current macro assembler part remains unchanged, rather there's a pre-processing part that converts the source code that it sees and simply stamps in those EQU clauses for you.

"Syntax sugar" is the term of art for it. You could even show it on the assembler listing, no reason to hide it. Just gets a little wonky if you're counting lines. But you already have that problem with macro expansion anyway (since macro expansion adds lines that do not have associated file line numbers, assuming you can list out macro expansion in the final listing).

It's almost as if the MACRO command is a little macro itself. (Now the world gets really spinny!)

Yes, correct, this is how my Macro feature presently works. And I'm warming to the need for Macros to support nesting as well, and to driving more capability into the label table. That would allow me to turn access to system variables within labels on and off, dropping the requirement to reserve anything more than a single vector and giving macros access to the label system, which is starting to resemble a disk-drive-type data structure now. That would keep my code size low while allowing additional functionality to be implemented.

I like the suggestion.

I don't have the line count issue, as the entire macro is either counted as appended to the calling line or inserted before the next line, and a position counter tracks which term is being used in the macro. It is possible to overflow it since the term counter is only 8 bits, but that still allows for very large macros. Include-type macros just change to local line numbers within the include.

It also facilitates being able to "Include" "Macros" - that is, a macro that creates macros. That way I could have a whole bunch of system-related macros for anything from shimming an interrupt, to managing the process list for an advanced application, through to OS functions. Given the complexities of managing the architecture this assembler is intended for, that makes a lot of sense, especially for writing driver code. And making it easy to intercept data streams or build temporary ramdisks would be useful. Not to mention handling paging and extended memory access.
 
So assembling the source under the emulator at boot time is going to take a bit longer than assembling it natively on the PC.
And on what do you base this? And can you explain why on a single core of my 14-year-old laptop, echo -e 'info\nexit\n' | time -v ./RunCPM runs so fast (under 5 ms) that time just prints the total time used as "0.00" seconds?

Also, the PC will have better debugging tools.
Right. Which is why I'm suggesting running the emulator on your PC, not on some other system.

Also it all needs to happen from the DOS command line.
You should think about where you went wrong when, from my suggestion that you use an emulator, you concluded I was saying you should not use the command line. (I was suggesting no such thing; quite the opposite.)

In fact, I generally test my assemblers by using source designed for a different assembler, translate the source, then make sure I get the same binary.
That's a nice smoke test to make sure your assembler is not completely failing, but is far from comprehensive test coverage. And even with better coverage, still, why try to test what you can prove? As Dijkstra said, "Program testing can be used to show the presence of bugs, but never to show their absence!"

As for memory issues, yes, the CP/M version will drop out first, but it won't produce a different binary - it will just produce an error.
So try it out. Create a program that is too big for the CP/M version to assemble, run it sending the output to out1.com, run the PC version sending the output to out2.com, and compare the two files. Do they compare the same? If not, the output binaries are different.
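For instance, a few lines of Python along these lines will do, and treat a missing output as a failed comparison rather than "nothing to test" (file names follow the example above):

Code:
import os
import sys

def same_output(file1, file2):
    if not (os.path.exists(file1) and os.path.exists(file2)):
        return False            # one assembler produced nothing: not the same
    with open(file1, 'rb') as a, open(file2, 'rb') as b:
        return a.read() == b.read()

if same_output("out1.com", "out2.com"):
    print("outputs match")
else:
    print("outputs differ (or one is missing)")
    sys.exit(1)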

It's also possible to use the label table manipulating commands to reuse memory in different ways to reduce the load on system memory by deleting old entries that are no longer required, and cleaning up the label table.
So? I'm talking about the behaviour when the source file is too large to assemble on CP/M. Substituting another source file doesn't make the problem go away.

I note that in a number of the responses you make above, you seem to be trying to rationalise away your problems rather than dealing with them. I suggest, if you want reliable software, you're better off at least just saying, "Yes, my system is broken in this or that way," so at least users will know what the limitations are. Some of the greatest frustrations of my life have come from vendors hiding known problems with their system, rather than just admitting to them.
 
And on what do you base this? And can you explain why on a single core of my 14-year-old laptop, echo -e 'info\nexit\n' | time -v ./RunCPM runs so fast (under 5 ms) that time just prints the total time used as "0.00" seconds?

I can only assume your CP/M emulator is way faster than the one I wrote, which is only a few times faster than a fast real-z80 based system.

You should think about where you went wrong when, from my suggestion that you use an emulator, you concluded I was saying you should not use the command line. (I was suggesting no such thing; quite the opposite.)

The issue is how to link arguments from the DOS CLI to the Emulator CLI and pass the response back to the DOS CLI. It's not impossible, but my current emulator doesn't handle it.

So try it out. Create a program that is too big for the CP/M version to assemble, run it sending the output to out1.com, run the PC version sending the output to out2.com, and compare the two files. Do they compare the same? If not, the output binaries are different.

I've tried it using artificial limits by reducing system memory until it fails, and the source file size is only limited by disk space. The practical limit is reached when labels and stored file control blocks exceed system memory.

So far, up to 16K+ of output file, it's been the same, but both assemblers handle it fine.

If the memory limit is exceeded, the assembler just errors out, and won't produce a binary to test, so there's nothing to test against. If there's no failure then so far the output is the same.

Since the biggest item I can add to memory is 256 bytes, storage to the stack or list causes an error once remaining available memory is less than 256 bytes.

But it will give an indication how far it got before that happens.

So? I'm talking about the behaviour when the source file is too large to assemble on CP/M. Substituting another source file doesn't make the problem go away.

Make it go away? No. But if a file is split up and routines that don't have to share global parameters are turned into includes, and extra passes are added, then it is possible to recover some memory mid-assembly and get around the problem. It should be possible to assemble a single file of up to 64K like that, and after that, well, that's enough for a version 1.0 :)

I note that in a number of the responses you make above, you seem to be trying to rationalise away your problems rather than dealing with them. I suggest, if you want reliable software, you're better off at least just saying, "Yes, my system is broken in this or that way," so at least users will know what the limitations are. Some of the greatest frustrations of my life have come from vendors hiding known problems with their system, rather than just admitting to them.

Rationalise? Maybe, but that's not the intent. Ensuring I have a way to achieve the functionality is the intent. Discussion helps me see what is possible too. The assembler isn't broken, but I haven't examined it spec for spec against anything other than Macro Assembler. It's about a third of the size for the same or similar functionality as the macro assembler I compared it to, and in some ways it can support a lot more capability. So I'm pretty happy with it. That macro assembler doesn't allow chaining or nesting of macros, and doesn't allow any kind of includes.

To describe my activity, I'm mainly using the Cross Assembler to do actual assembly, because it has fewer memory limitations, but I want to be able to assemble my source on a z80 native system with the same functionality, which is why I'm writing the assembler for z80 as well. It's a key part of the OS I'm writing.

It's more convenient to generate source on the PC though, and much faster. Simply put, it happens in the blink of an eye when I assemble on the PC, but on the emulator, it takes around 10 or so seconds from memory for the same file... So once in a while I do check the binary for consistency, mainly to ensure I haven't broken anything major when updating. The rest of the time, the test files for the z80 assembler don't run on the PC assembler, so I have to run them natively, but that doesn't include the assembler code itself.

A simple goal is that the assembler must be able to assemble itself - :)

Adding in the other stuff makes the assembler more useful, but I can't take advantage of that until I drop the functionality back into Windows, as my z80 assembler is way more advanced than my cross assembler now. My activity there, and my dislike of running in emulators, is why I wrote a Windows 11 compatible z80 assembler in the first place. The only thing I run under an emulator presently is my GAL assembler (GALASM), and that irks me enough that I might end up writing a native GAL assembler for Windows 11 also.

Some of it is just my personal quirks also :)
 
I can only assume your CP/M emulator is way faster than the one I wrote, which is only a few times faster than a fast real-z80 based system.
So the issue isn't that an emulator won't work for you; the issue is that you refuse to use one that will work for you.

The issue is how to link arguments from the DOS CLI to the Emulator CLI and pass the response back to the DOS CLI. It's not impossible, but my current emulator doesn't handle it.
With RunCPM this is easy: generate a submit file in the files area that RunCPM reads as the emulated disk, run that when you start RunCPM, and read out the output files placed there when your program is done.
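Something like the following sketch, say (Python; the "A/0" files-area layout, the copied file names and the CP/M-side command are assumptions about your setup, not guaranteed RunCPM behaviour):

Code:
import shutil
import subprocess

# Drop the source into the directory RunCPM presents as drive A, user 0.
shutil.copy("MYPROG.ASM", "A/0/MYPROG.ASM")

# Pipe the CP/M command line into RunCPM's console, then exit.
commands = "ASM80 MYPROG.ASM\nexit\n"
subprocess.run(["./RunCPM"], input=commands, text=True, check=True)

# Recover the assembled output from the files area.
shutil.copy("A/0/MYPROG.COM", "out_cpm.com")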

I've tried it using artificial limits by reducing system memory until it fails, and the source file size is only limited by disk space. The practical limit is reached when labels and stored file control blocks exceed system memory.
In other words, the size is not limited just by disk space. Even a source file that fits on disk can be too big for your assembler under CP/M.

If the memory limit is exceeded, the assembler just errors out, and won't produce a binary to test, so there's nothing to test against.
Sigh. There is something to test against: try to compare the PC output file to the non-existent CP/M output file and you will find that your diff tool exits with a code that says they're not the same.

Trying to discuss whether a non-existent file is somehow a different thing from a non-matching file is simply a pointless waste of time: either the two systems produce the same output or they don't. Make your life simpler by just looking at things like this as "correct" and "not correct" rather than wasting time on contemplating the equivalent of how many angels will dance on the head of a pin.

If there's no failure then so far the output is the same.
Right. So only sometimes the output is the same; in other cases it fails to be the same. Simple enough.

Make it go away? No. But if ...[large explanation here deleted].
Again, no need to go into a pile of irrelevant details. "It can produce different output and just sometimes I can do things to fix it, but often not" is the same thing as, "it sometimes produces different output."

Rationalise? Maybe, but that's not the intent.
It's not about intent. It's that this thing where someone points out a general problem and another developer suggests a few hacks to fix a few specific instances of the problem while not addressing the general problem is something I've seen often in my career as a software developer, and that approach has never ended up with reliable software.

It's more convenient to generate source on the PC though, and much faster. Simply put, it happens in the blink of an eye when I assemble on the PC, but on the emulator, it takes around 10 or so seconds from memory for the same file...
Well, again, that's your fault: you're deliberately using a slow emulator rather than a fast one, and you're blocking off solutions that will save you not only a lot of testing, but also save you writing your assembler twice over.
 
So the issue isn't that an emulator won't work for you; the issue is that you refuse to use one that will work for you.
I am using emulators. It's a custom build that emulates the environment I'm slowly physically building, including the graphics. But if I can't script the build of multiple ASM files, recover the output and build the disks automatically from the PC, it's not much help to me for what I'm developing. I could modify the one I'm using to do that, but as I mentioned, speed is the issue. It's quicker to build the dev environment each time on the PC.

With RunCPM this is easy: generate a submit file in the files area that RunCPM reads as the emulated disk, run that when you start RunCPM, and read out the output files placed there when your program is done.

RunCPM looks interesting. I'll download it and have a play with it when I get a moment.

In other words, the size is not limited just by disk space. Even a source file that fits on disk can be too big for your assembler under CP/M.

That is possible, if it consumes all the label and macro space in memory, yes.

Sigh. There is something to test against: try to compare the PC output file to the non-existent CP/M output file and you will find that your diff tool exits with a code that says they're not the same.

My diff tool says "File Not Found". That is the expected behaviour.

Trying to discuss whether a non-existent file is somehow a different thing from a non-matching file is simply a pointless waste of time: either the two systems produce the same output or they don't. Make your life simpler by just looking at things like this as "correct" and "not correct" rather than wasting time on contemplating the equivalent of how many angels will dance on the head of a pin.

So far, assuming both complete assembly without error, the binaries are the same. Excepting when I hit a bug in the assembler, which hasn't happened for a while.

Right. So only sometimes the output is the same; in other cases it fails to be the same. Simple enough.

No, when it assembles, both outputs are the same.

It's not about intent. It's that this thing where someone points out a general problem and another developer suggests a few hacks to fix a few specific instances of the problem while not addressing the general problem is something I've seen often in my career as a software developer, and that approach has never ended up with reliable software.

You've made many suggestions. Some fit my needs and were helpful. Others didn't fit and were still helpful. Some potentially may have been misunderstood, but none were ignored. All were appreciated. The software is still being written, so I wouldn't call it reliable yet. It is however functional. I'm in the final stretch now, determining which features will make the cut.

Well, again, that's your fault: you're deliberately using a slow emulator rather than a fast one, and you're blocking off solutions that will save you not only a lot of testing, but also save you writing your assembler twice over.

Three times. I have to write it three times - this is just the second one. And I am using emulators, just not to build the environment. The environment is being built around the emulator for testing. And even then ironically, the emulator I do have is still *too fast* for some of my testing. I'd like to find an emulator that is speed-correct for one of the older machines, if you have any suggestions? Especially one that simulates disk access and read times. I have a feeling I'll have to pull down the Amstrad to do that.

But you do note correctly that I don't like using emulators for certain things, and that's true. In those cases, use of an emulator frustrates me and reduces the enjoyment I get from the process. In those circumstances I just prefer to run everything natively on the PC, from the Windows 10/11 CLI, with PC binaries. Using an emulator for my initial build is one of those things that frustrates me.
 
I am using emulators. It's a custom build that emulates the environment I'm slowly physically building, including the graphics. But if I can't script the build of multiple ASM files, recover the output and build the disks automatically from the PC, it's not much help to me for what I'm developing. I could modify the one I'm using to do that, but as I mentioned, speed is the issue. It's quicker to build the dev environment each time on the PC.
Again, this is not a problem with emulation, this is a problem with the particular emulator you choose to use. RunCPM will do what you need, and do it quickly, if you apply a little cleverness.

My diff tool says "File Not Found". That is the expected behaviour.
And does it return a "success" or "failure" error code?

If it returns "success" when one of the two files you give it is present and the other is missing, you have a pretty bad compare tool (for scripting, anyway) and you should replace it with something that understands that when you ask if two files are the same, and one is missing, that means you do not have two files that are the same.

No, when it assembles, both outputs are the same.
Right. When it assembles. And if it doesn't, the two outputs aren't the same. And you admit that we can fairly easily find cases where the two outputs aren't the same because one fails to assemble.

Three times. I have to write it three times....
Oh my. I don't even want to know.

But you do note correctly that I don't like using emulators for certain things, and that's true. In those cases, use of an emulator frustrates me and reduces the enjoyment I get from the process. In those circumstances I just prefer to run everything natively on the PC, from the Windows 10/11 CLI, with PC binaries. Using an emulator for my initial build is one of those things that frustrates me.
It frustrates you because you're using an emulator very unsuited to that purpose. It's absolutely no problem to use a (different) emulator that will work just as well and, from a human point of view, just as fast as running a separate native program. And even if it were a bit slower, the time it would save you from writing and maintaining two separate programs, much less trying to ensure that they do the same thing, would easily make up for it.

Given the file sizes you're assembling (a couple of hundred K or less), and what your assembler is doing, there is no reason that running the Z80 version under emulation should be significantly slower than using a native PC application. In fact, there's little reason it even should be noticeably slower (to a human).
 
Again, this is not a problem with emulation, this is a problem with the particular emulator you choose to use. RunCPM will do what you need, and do it quickly, if you apply a little cleverness.

I will check out RunCPM when I get some time. Is there a PC x64 installer or binaries available, or is it a case of always needing to set up an appropriate environment to build it?

And does it return a "success" or "failure" error code?

When given two files, it reports "success" and has done for some time now. But it's still too early to confirm success, and I still need to write the third assembler.

It frustrates you because you're using an emulator very unsuited to that purpose. It's absolutely no problem to use a (different) emulator that will work just as well and, from a human point of view, just as fast as running a separate native program. And even if it were a bit slower, the time it would save you from writing and maintaining two separate programs, much less trying to ensure that they do the same thing, would easily make up for it.

Not really. It just seems impractical to use a z80 assembler when I have a perfectly good cross assembler that is better suited to some of the tasks. Some tasks are suited to cross-assembly and some to native assembly. It's about choosing the tool most suitable to the task. I'm sure it would be possible to make the other tool fit, but that's not necessarily the best approach. Generally I update the assembler code, cross-assemble, build the dev environment, run it, then run the assembler code to test - sometimes on a specific test target and sometimes on its own code.

Given the file sizes you're assembling (a couple of hundred K or less), and what your assembler is doing, there is no reason that running the Z80 version under emulation should be significantly slower than using a native PC application. In fact, there's little reason it even should be noticeably slower (to a human).

It's a curious thought you raise. I can evaluate it when I get RunCPM working at some point. There's no doubt the code itself is far more streamlined in the z80 version and should be much faster, as it's all in assembly while the cross-assembler is in BASIC compiled to an x64 EXE. Speed comparisons sound interesting.
 