Writing Assemblers... What should a good assembler do?

cj7hawk · Apr 5, 2024

16 is quite a few! It's cumbersome code, and uses a bit of space, but it's not too much... I suppose around 200 bytes or so to implement the extra support with the data structures I use, which isn't too much if that many is useful.

I never thought of using macros to generate DPBs, but it makes sense, since you can use more common parameters to build a DPB from base data, and know it's going to be correct.

Out of interest, do you have a macro from any version of an assembler that assembles a DPB? I'd love to see how it looks.

Though I should mention that there's no limit to how many labels can pass into a macro, so if you did something like;

SET CYLINDERS,100
SET HEADS,4
SET SPT,32
SET DIRECTORY_ENTRIES,256
SET RESERVED_TRACKS,2
DPB ; Call DPB macro to generate DPB.

Then there's no need to put them all on the same line... The macro will still pick the parameters up on other lines, which avoids crowding, and removes any limits...

I just mean how many parameters should be passed directly to the macro as unnamed arguments on the same line as the call, e.g.,

DPB 100,4,32,256,2

Would do the same as the above, except instead of meaningful label names, the DPB parameters are just ARG1, ARG2, ARG3 etc.

My main thinking was that passing some arguments would allow creation of new "commands" like OUT (C),A as a 16 bit version to make the intent clear - eg

MACRO OUT16
LD BC,ARG1
LD A,ARG2
OUT (C),A
MEND

Would let you do stuff like

OUT16 $03F8,$32

So I was thinking to have parameters for readability, not just functionality.
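To make the positional-argument idea concrete, here is a minimal Python sketch of expanding a macro body by replacing ARG1, ARG2, etc. with the values given on the call line. It assumes plain textual substitution, which may not be how the assembler discussed here does it (it is described as exposing the arguments as labels), and `expand_macro` and `max_args` are hypothetical names used only for illustration.

```python
import re

def expand_macro(body, args, max_args=16):
    """Replace ARG1..ARGn in each body line with the call's arguments."""
    if len(args) > max_args:
        raise ValueError(f"too many arguments: {len(args)} > {max_args}")

    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(args):
            raise ValueError(f"ARG{index} used but {len(args)} arguments given")
        return args[index - 1]

    # \b stops ARG1 from matching inside a longer label such as MYARG1.
    return [re.sub(r"\bARG(\d+)\b", substitute, line) for line in body]

# The OUT16 macro body from the post, called as: OUT16 $03F8,$32
OUT16 = ["LD BC,ARG1", "LD A,ARG2", "OUT (C),A"]
expanded = expand_macro(OUT16, ["$03F8", "$32"])
```

With this approach the argument limit is just a number checked at call time, so raising it costs nothing in the expansion logic itself.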

Would you still think 16 arguments in the command line is needed, or would the first example cover it better?

Thanks
David

daver2 · Apr 5, 2024

See examples in: http://www.cpm.z80.de/manuals/cpm3-sys.pdf.

Dave

cj7hawk · Apr 5, 2024

Thank you for the example !

daver2 said:
See examples in: http://www.cpm.z80.de/manuals/cpm3-sys.pdf.

Oh my goodness, that was exactly what I was trying to avoid... Aside from making the source smaller, is that really a good idea?

It gives an example such as;

DPB$SD: DPB 128,26,77,1024,64,2

Which is really messy and would be difficult to understand without extensive documentation inserts.... it's not outside of what the assembler can handle, but it really was intended to be a bit more reader-friendly. eg;

In my assembler, assuming I wrote a macro module by default, you would enable the module with;

INCLUDE DPBMACRO.ASM

Then;

SET PSIZE,128 ; The physical sector size in bytes.
SET PSPT,26 ; The number of physical sectors per track.
SET TRKS,77 ; The number of tracks on the drive
SET BLS,1024 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF ; The number of tracks to offset.
SET NCKS ; The number of checked directory entries.
DPBCD ; Generate the Disk Parameter Block in the BIOS.

This can also be done with other functions, eg;

SET PSIZE,128 ; The physical sector size in bytes.
SET PSPT,26 ; The number of physical sectors per track.
SET TRKS,77 ; The number of tracks on the drive
SET BLS,1024 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF ; The number of tracks to offset.
SET NCKS ; The number of checked directory entries.
INCLUDE DPBCD.MAC ; Generate the Disk Parameter Block in the BIOS.

Which would keep the Macro code on Disk instead of in memory.

The SKEW example is smaller, eg

SKEW 26,6,1

But even then, there's no certainty as to what the 26 6 and 1 mean without extra comments or referring to the manual.

I was originally thinking that macros should be used in a way that makes sense, and that sometimes arguments might need to be passed that are irrelevant to any calculation in the macro but supply values that might not already exist in the label system, e.g., to indicate that code is being relocated by LDIR,

eg:

MACRO RELOCATE ; RELOCATE FROM,TO,SIZE.
LD HL, ARG1
LD DE,ARG2
LD BC,ARG3
LDIR
MEND

This would make it obvious the register loads were intended for a LDIR... I may have misunderstood common macro use?

cj7hawk · Apr 5, 2024

I should add that my system uses labels for everything... System variables ( that it shares from the assembler to the code at assembly time ) and macros included. So the assembled code has some idea of what is going on at assembly time in a meta sense. Though since the code must assemble itself, that does preclude me from using system variables that are the same as the ones the assembler recognizes, since it creates a meta doppelganger effect and kills self-assembly.

So the only reason to use numbers after the macro call with arguments is that then there is no need to create labels - eg, PSIZE might be desirable as a non-macro label, or be used elsewhere.

The system labels are somewhat intrusive for this reason, so I limit them to just a few necessary variables and arguments.

Labels can also contain meta data in the form of a normal label value, though I'm not sure what use to make of it. Either as a type specifier or as a version identifier were the current ideas, but at the moment, it just reflects the macro size in memory so a SHOWLABEL MACRONAME will reflect the macro memory usage to the console during assembly time. It could also be used to show the versions of the macro though, or other data types.

Svenska · Apr 5, 2024

cj7hawk said:
Would you still think 16 arguments in the command line is needed, or would the first example cover it better?

Your example would work for a single DPB entry, barely.
It would not work well for a complex table of precomputed values.

I've written a table-driven i80 simulator which chains assembly fragments to save space.
Each instruction is a single table entry, as follows:

Code:

.macro instr
    ; table contains four-byte entries = 1024 bytes
    .db low(@2), low(@1), low(@0), 0
.endmacro

.org (PC+255) & 0xff00
instr_table:
    instr    fetch_nop,   exec_nop,   store_nop  ; 00       : NOP
    instr    fetch_dir16, exec_nop,   store_bc   ; 01 nn nn : LXI B, d16
    instr    fetch_a,     exec_nop,   store_mbc  ; 02       : STAX B
    ...

Three arguments is enough in this case. Adding cycle counting requires four (or five for taken/non-taken branches). My i86 emulator already chains 5 fragments per instruction, so seven arguments including simple cycle counting.
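As a rough illustration of what the table above produces, the following Python sketch packs one four-byte entry per opcode and aligns the table to a 256-byte page. It assumes that `@0` is the first macro argument (so the bytes are emitted in reverse order), and the handler addresses are made up purely for the demonstration.

```python
def align_page(pc):
    """Round pc up to the next 256-byte boundary, like (PC+255) & 0xff00."""
    return (pc + 255) & 0xFF00

def instr(fetch, execute, store):
    """Pack one entry: low bytes of the three handlers plus a pad byte."""
    return bytes([store & 0xFF, execute & 0xFF, fetch & 0xFF, 0])

# Hypothetical handler addresses, not taken from the original simulator.
handlers = {
    "fetch_nop": 0x2000, "fetch_dir16": 0x2010, "fetch_a": 0x2020,
    "exec_nop": 0x2100,
    "store_nop": 0x2200, "store_bc": 0x2210, "store_mbc": 0x2220,
}

rows = [
    ("fetch_nop", "exec_nop", "store_nop"),    # 00 : NOP
    ("fetch_dir16", "exec_nop", "store_bc"),   # 01 : LXI B, d16
    ("fetch_a", "exec_nop", "store_mbc"),      # 02 : STAX B
]

table_base = align_page(0x1234)
table = b"".join(instr(*(handlers[name] for name in row)) for row in rows)

def entry_address(opcode):
    """Each opcode indexes a four-byte entry, so 256 opcodes need 1024 bytes."""
    return table_base + 4 * opcode
```

One line per opcode keeps all 256 entries readable at a glance, which is the point being made about argument counts.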

Using globally reserved symbol names for macros is a bad idea without namespacing. Requiring multiple lines for each table entry is too verbose, too cumbersome.

Macros can always be replaced with a pre-processing step. But that is also annoying and produces fragile build systems.

Hard limits suck. At least make them large enough to not matter. "Maximum three arguments and no nesting" is a very hard limit.

cj7hawk · Apr 5, 2024

Svenska said:
Your example would work for a single DPB entry, barely.
It would not work well for a complex table of precomputed values.

I've written a table-driven i80 simulator which chains assembly fragments to save space.
Each instruction is a single table entry, as follows:

Code:

.macro instr
    ; table contains four-byte entries = 1024 bytes
    .db low(@2), low(@1), low(@0), 0
.endmacro

.org (PC+255) & 0xff00
instr_table:
    instr    fetch_nop,   exec_nop,   store_nop  ; 00       : NOP
    instr    fetch_dir16, exec_nop,   store_bc   ; 01 nn nn : LXI B, d16
    instr    fetch_a,     exec_nop,   store_mbc  ; 02       : STAX B
    ...

Three arguments is enough in this case. Adding cycle counting requires four (or five for taken/non-taken branches). My i86 emulator already chains 5 fragments per instruction, so seven arguments including simple cycle counting.

Using globally reserved symbol names for macros is a bad idea without namespacing. Requiring multiple lines for each table entry is too verbose, too cumbersome.

Macros can always be replaced with a pre-processing step. But that is also annoying and produces fragile build systems.

Hard limits suck. At least make them large enough to not matter. "Maximum three arguments and no nesting" is a very hard limit.

I do support chaining macros to any level, and nesting of external macros while memory remains. Macros and Includes are the same in my system - just where they are located and retrieved from differs. They can be mixed up together also. Macro assembler supported neither when I checked the documentation, so I used that as my baseline. Though that might also just be the version of documentation I had for it.

When you say "Namespacing" I'm not sure of the context you mean... Those labels can be reused over and over to create a full set of DPBs, with the variables changing for each DPB context. It's also possible to create a "DPB.ASM" include that removes the labels entirely, and won't affect them being used in the main code at all for some other purpose. Or they can be fully global and shared, or duplicated with different meanings for different modules. The assembler allows the module to determine what the programmer's intent was and implement it like that, so that includes can be included and removed, recovered, switched etc. It's pretty versatile like that.

But system labels are fixed... And while they can be redefined, it's best to leave them alone. I have only a few;

ORGPC - The Program Counter - Can be used by Macros to create instructions and relative values.
PCOFF - The offset added to PC for calculations.
XWLOC - Where in memory we are actually writing to at the moment. ( Target memory, not real-time memory )
XPASS - The Global and Local pass - useful for making sure comments only appear once, not once per pass. Or for differentiating during debug.
SYSFL - System flags, defined but not used yet.

Changing these can change the operation of the assembler, especially the pass number since it affects what pass the assembler is making.

Then there are the macro de facto arguments, which, whichever way you look at it, are usually going to take up label space on most assemblers. Hence ARG1, ARG2, ARG3 etc... Up to ARG16 perhaps, since so far you're the only one to suggest a number ( thank you )... A hard limit is not unreasonable if 16 is a big number, but beyond that I'd probably make it adjustable if there was a valid reason to have a macro with say 64 numerical arguments on a single line of code.

I did think of reserving something simple like X1, X2, X3 etc but that seemed to me to be more likely to be used in user code than the ARG... Though once I use ARG for arguments, the number doesn't really matter. I could use something like SYS-X1, SYS-X2, but I haven't gotten any good idea for using prefix based namespacing yet, so just went with the above...

As for multiple DPB entries, it would look something like this;

; DPB for A drive.
SET PSIZE,128 ; The physical sector size in bytes.
SET PSPT,26 ; The number of physical sectors per track.
SET TRKS,77 ; The number of tracks on the drive
SET BLS,1024 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF,2 ; The number of tracks to offset.
DPBCD ; Generate the Disk Parameter Block in the BIOS.

; DPB for B drive.
SET PSIZE,256 ; The physical sector size in bytes.
SET PSPT,18 ; The number of physical sectors per track.
SET TRKS,80 ; The number of tracks on the drive
SET BLS,2048 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF,2 ; The number of tracks to offset.
DPBCD ; Generate the Disk Parameter Block in the BIOS.

And the macro could do the calculation and generate the table without issue. Given the assembler itself is 190K of source for 9.8K of assembled binary, that might be a problem on smaller systems, but it also allows files to be split across multiple disks, and how big an ASM file is depends on the programmer and not the system or assembler. I guess I like verbose assembly with lots of comments.
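For reference, the calculation such a macro would have to perform can be sketched in Python. This uses the CP/M 2.2-style DPB field layout (SPT, BSH, BLM, EXM, DSM, DRM, AL0, AL1, CKS, OFF); a CP/M 3 DPB adds further fields that are left out here. `build_dpb` and its `removable` flag are hypothetical names, and the sketch is not the DPBCD macro itself.

```python
def build_dpb(psize, pspt, trks, bls, ndirs, off, removable=True):
    """Derive CP/M 2.2-style DPB fields from physical disk parameters."""
    records_per_block = bls // 128             # 128-byte logical records
    blocks = (trks - off) * pspt * psize // bls
    dsm = blocks - 1                           # highest block number
    # The extent mask depends on whether block numbers fit in one byte.
    exm = bls // 1024 - 1 if dsm < 256 else bls // 2048 - 1
    if exm < 0:
        raise ValueError("1K blocks need fewer than 256 blocks per drive")
    dir_blocks = -(-ndirs * 32 // bls)         # ceiling, 32 bytes per entry
    alloc = (0xFFFF << (16 - dir_blocks)) & 0xFFFF
    return {
        "SPT": pspt * psize // 128,            # logical records per track
        "BSH": records_per_block.bit_length() - 1,
        "BLM": records_per_block - 1,
        "EXM": exm,
        "DSM": dsm,
        "DRM": ndirs - 1,
        "AL0": alloc >> 8,                     # directory block bitmap, high
        "AL1": alloc & 0xFF,                   # directory block bitmap, low
        "CKS": ndirs // 4 if removable else 0,
        "OFF": off,
    }

# The A and B drive parameters from the example above.
drive_a = build_dpb(128, 26, 77, 1024, 64, 2)
drive_b = build_dpb(256, 18, 80, 2048, 64, 2)
```

The drive A values are the familiar single-density 8 inch set, so `drive_a` makes a handy sanity check for any macro version of the same arithmetic.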

cj7hawk · Apr 5, 2024

I guess what I thought people would use macros for was things like rewriting the PC and offset - eg, I don't have a "TARGET_ORG" type function, but it could be created by

Code:

MACRO TARGET_ORG
  SET ARG16,ARG1-ORGPC ; Arguments can be reused without creating labels if not specified in the command.
  OFFSET ARG16
MEND

Then if you wanted code assembled for an ORG of $4000 but were assembling it inline somewhere else ( eg around $0100 ) you could just do;

TARGET_ORG $4000

and it would automatically adjust the offset of any fixed address instructions that use the PC.
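The arithmetic behind TARGET_ORG is small enough to sketch in Python. This assumes the offset is simply added to a location when an address is calculated, as the PCOFF description suggests, and that addresses wrap at 16 bits; both function names are hypothetical.

```python
def target_org(target, orgpc):
    """Offset that makes code assembled at orgpc resolve as if at target."""
    return (target - orgpc) & 0xFFFF

def resolved_address(location, offset):
    """Address an instruction would reference for a label at location."""
    return (location + offset) & 0xFFFF

# Assembling inline at $0100 for code that will run at $4000.
offset = target_org(0x4000, 0x0100)
label_at_run_time = resolved_address(0x0120, offset)
```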

Then rather than using a heap of code for functions that would require code space anyway, these seldom used instructions could be created by macros...

The same for conditional assemble - at the moment, I support IFZ, IFNZ, IFNEG and IFPOS based on the subsequent value. Macros could be used to create bit tests and even different syntax, eg,

IFGT LABEL1,LABEL2 ; If Label1 is greater than Label2.

Conditional statements can be opened in a macro also, and closed just like a normal conditional - there is no need to define a conditional end statement within the macro.

As an interesting concept, macros can also be used to *create* new macros - not nesting, literally building a new macro type from scratch. Or add macros as dependencies. Macro creation and macro execution are handled in two completely different ways, so there's no conflict there. Hence it's possible to create macros in source that don't actually show up unless you call them - e.g., adding in a bunch of disk functions.

With the idea in mind that spare arguments can be reused over and over in the code, perhaps having more of them isn't such a bad idea after all...

So far the total assembler size is a little under 10K, which I think is pretty reasonable for the functionality it supports... Though I still could optimize the code a fair bit.

I want to keep it small since the target architecture has a 64K ROMDISK as Drive L: which will have the OS boot, some basic utilities, and also the assembler. 64K isn't a lot when it comes down to it... About 2/3 the size of a typical Osborne 1 disk.

daver2 · Apr 5, 2024

I wrote an entire graphics language in assembler macros once.

People who used the graphics language didn't realise they were (effectively) programming in assembler (or, more correctly, threaded interpreted code and data).

Some people did ask, and were a little surprised...

As you know, programmers will use a facility to the max!

However, if I am doing anything myself, I will still include copious comments - as the software needs to either be maintained or can act as a learning platform for other people.

If your assembler is designed to have a small footprint - you are quite at liberty to constrain the features as you think fit.

Dave

cjs · Apr 5, 2024

cj7hawk said:
Which is really messy and would be difficult to understand without extensive documentation inserts.... it's not outside of what the assembler can handle, but it really was intended to be a bit more reader-friendly.

If you want to be more reader-friendly, start with a named parameter list rather than having arguments just named ARG1, ARG2, etc. When reading a macro with several arguments, it's much harder to keep track of what's what if you have to remember numbers rather than being able to see names. (Names instead of numbers is one of the big reasons we have assemblers in the first place!)

Further, consider again the example of the DPBCD macro, where the two uses of it take up 16 lines of code. Without a parameter list against which the arguments are checked, it's easy enough to accidentally leave out a line in the SET statements before the second call. In that case, the assembler will silently use the value of an argument to the previous call, which will then silently produce broken output that is likely to be rather painful to debug.

Svenska · Apr 5, 2024

cj7hawk said:
As for multiple DPB entries, it would look something like this;

Yeah, no. If I want a concise table, I don't want an overly verbose representation. If this is what your assembler provides, I'd precompute the table outside and use DW statements instead.

cj7hawk said:
When you say "Namespacing" I'm not sure of the context you mean...

In your example, the DPBCD macro reserves the names PSIZE, PSPT, TRKS, BLS, NDIRS and OFF for itself. These names cannot be used anywhere else in the file, which is probably an annoying restriction within a CP/M BIOS.

Obviously, it is possible to prefix each of them with DPB_ (or something else) for some manual namespacing, but that increases the symbol table size and is just another workaround because of an assembler deficiency.

If I have a table of 256 entries (such as the instruction table), I'd prefer not to write 2048 lines of code. The latter is hard to understand, hard to edit, and hard to scroll through even on a big screen. On a slow 80x25 terminal, it'd be pure pain.

cj7hawk · Apr 6, 2024

cjs said:
If you want to be more reader-friendly, start with a named parameter list rather than having arguments just named ARG1, ARG2, etc. When reading a macro with several arguments, it's much harder to keep track of what's what if you have to remember numbers rather than being able to see names. (Names instead of numbers is one of the big reasons we have assemblers in the first place!)

Further, consider again the example of the DPBCD macro, where the two uses of it take up 16 lines of code. Without a parameter list against which the arguments are checked, it's easy enough to accidentally leave out a line in the SET statements before the second call. In that case, the assembler will silently use the value of an argument to the previous call, which will then silently produce broken output that is likely to be rather painful to debug.

Yes, that is correct, though that much is correct for the entire codebase when it comes to determining what went wrong.

In this assembler's case, the easy way to locate that issue would be to use conditional debugging and reporting, and once turned on, it would be very obvious that the parameters to the Macro were incorrect. The user could just dump the parameters to the console at assembly time, and there's just a handful of operations I still want to include, and dumping all of that to a file is the other function I want to use.

A bigger issue I haven't hit yet, which I imagine will occur, is that there is no syntax checking of the assembler macro code at macro-storage time - so if the macro isn't used, it won't be tested, and if it is used, it shows up as a single line of code, and the assembler operates on it as a single line of code. It also either appends it to the macro instruction itself or to the next line, depending on whether the EOL directly follows the macro, or there is a space/documentation line, which I need to address to make finding the correct line easier. Though in fairness it does correctly report the failed macro name, so it should still be obvious.

The main reason for leaving out local variable names in parameters is the way the assembler macro was implemented, since any name would either need to be tokenized, or translated, which would make understanding the error code more difficult. The macro code strips out anything not necessary and repackages the entire macro onto a single line, then provides information around which argument the assembler disagreed with, but since it only returns a single segment of the macro, that is not so easy to figure out when you might have 20 to 30 statements in a typical macro... Though it does let you know roughly where the issue is, and a macro can be assembled out-of-macro also to make sure it assembles correctly.

Also a macro can be manually checked at any time via the SHOWLABEL MACRONAME, which shows the label and the label value, but in the case of macros, it also appends the macro, and it does this in the error handler. There's quite a bit of debugging capability in the assembler itself, even if compiled without the additional debugging switches and code.

The original assembler was nearly 8K in length, so adding in macros, includes, conditional assembly, nesting, chains, debugging output etc, so far has only cost me an additional 2K of code, but increased the source from 140K to 190K - Nearly a third larger. Though it's still only about a third of the size of Macro Assembler for comparable functionality, so it's still a lightweight, and system macros (optional) will also be included in the boot rom so any program can be assembled at any time without loading up the assembler, which would support some JIT style assembly, eg, driver code.

Having macro labels defined at assembly time ( of the Macro ) would mostly impact the size of the assembler, since I would need code to create all the different logic. I originally wrote the macros without any parameters ( as can be observed by the fact that a macro definition doesn't have any arguments, but the macro execution accepts them ). I had some serious bugs which took me a few hours to locate where it would execute the macro *before* loading in the parameters too, which had me confused for a while until I realized I was switching the source location before reading the correct source. The sources are all still completely serial in nature, so the assembler will still handle punched tape as an input source, with lines of any length - it's also still possible to assemble the entire codebase on a single line. Though I truly hope no one would ever do that intentionally.

I ended up keeping the linked list since it allows me to extend on the data structure and that's how I included the macro commands, but I still need to fix up how the internal lists work sometime to get some code space back from the source... But that's not a priority since once working, I'll switch over to writing the assembler documentation next.

cj7hawk · Apr 6, 2024

daver2 said:
I wrote an entire graphics language in assembler macros once.

People who used the graphics language didn't realise they were (effectively) programming in assembler (or, more correctly, threaded interpreted code and data).

Some people did ask, and were a little surprised...

As you know, programmers will use a facility to the max!

However, if I am doing anything myself, I will still include copious comments - as the software needs to either be maintained or can act as a learning platform for other people.

If your assembler is designed to have a small footprint - you are quite at liberty to constrain the features as you think fit.

Dave

I'd love to hear more about the macro GL - Feel free to add some details in here if it's not an imposition - They are probably quite relevant to the topic.

It is designed to have a small footprint, but the biggest issue is that I've never used some of the functionality I'm implementing now, so I lack experience at using these assembly functions which makes writing them a lot more difficult, since I don't have a clear idea of the context of how they are supposed to work in the first place. So I'm reading documentation for other assemblers to try and understand them and guessing at which features might be used.

I'm the kind of programmer who doesn't use facilities to the max, so all the input in this thread has really expanded my understanding. In fact, even working with operating systems for this entire series of projects has massively changed the way I write code, and it's not just me revisiting it - I still code commercially in assembly on other projects from time to time, so I do regard myself as a veteran assembly programmer, but the ways I was taught to write assembly were all very different to how others may have learnt also.

There is no doubt I've developed a new kind of macro... And it's not the same as other macros, but hopefully it has enough in common with existing macros that it wouldn't alienate a programmer. Writing an assembler is a bit like writing a novel. You want it to feel like it flows to the user/reader. If it's too jarring, then the user/reader won't like it and will wander off to other assemblers/books. ( Speaking of which, I also write novels... If you like science fiction, try Turing Evolved - still one of the top reads on Amazon. )

From all of the input to the latest question, it seems like I still need at least 10 to 16 arguments able to be passed to the macro at assembly time as simple parameters. That's a lot more than I would have guessed when I chose 3. I was originally unsure if more than 2 was a good idea so felt like 3 was overkill... I was very much incorrect in that thinking.

cj7hawk · Apr 6, 2024

Svenska said:
Yeah, no. If I want a concise table, I don't want an overly verbose representation. If this is what your assembler provides, I'd precompute the table outside and use DW statements instead.

In your example, the DPBCD macro reserves the names PSIZE, PSPT, TRKS, BLS, NDIRS and OFF for itself. These names cannot be used anywhere else in the file, which is probably an annoying restriction within a CP/M BIOS.

Obviously, it is possible to prefix each of them with DPB_ (or something else) for some manual namespacing, but that increases the symbol table size and is just another workaround because of an assembler deficiency.

If I have a table of 256 entries (such as the instruction table), I'd prefer not to write 2048 lines of code. The latter is hard to understand, hard to edit, and hard to scroll through even on a big screen. On a slow 80x25 terminal, it'd be pure pain.

That makes sense.

They can most definitely be used by objects without any conflict in the current code build.

And I do have easy ways of selecting these in and out of existence - so it's possible to group a set of names, and I could probably switch them in with something like "SELECT <MACRONAME>" and "UNSELECT <MACRONAME>" which would let the assembler know to switch in the macro labels for variables and constants, regardless of any collision with user labels and would protect user label values from being changed, while allowing continuity of names for the macros, so the macros could share values at assembly time.

It's also possible to still allow macros to either access user namespace or make it entirely inaccessible outside of the macro, though I can't imagine the latter is a good thing since it's more likely that the macro will want to use parameters in user-defined labels as per my original BDOS example. The general rule is that if I switch in temporary namespace, then other namespace that doesn't cause namespace collision is still valid and collision namespace is protected.
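The SELECT / UNSELECT behaviour described above can be modelled with a layered lookup. This Python sketch uses `collections.ChainMap` as a stand-in for the label table; it is only an assumption about how such switching might behave, not a description of the assembler's internals, and the label values are invented.

```python
from collections import ChainMap

# User labels, including one that collides with a macro label.
user_labels = {"PSIZE": 512, "BUFFER": 0x8000}
# Labels owned by a hypothetical DPB macro group.
dpb_labels = {"PSIZE": 128, "PSPT": 26}

active = ChainMap(user_labels)

# SELECT: macro labels shadow colliding user labels, others stay visible.
selected = active.new_child(dpb_labels)
psize_inside = selected["PSIZE"]      # macro's value wins on collision
buffer_inside = selected["BUFFER"]    # non-colliding user label still valid
selected["PSIZE"] = 256               # writes land in the macro's layer only

# UNSELECT: drop the macro layer, the user's value is untouched.
active = selected.parents
psize_outside = active["PSIZE"]
```

Because the macro layer is a separate dictionary, it keeps its values between selections, which matches the idea of macros sharing values at assembly time.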

I'm going to give that idea some more thought. I like it.

cjs · Apr 6, 2024

cj7hawk said:
In this assembler's case, the easy way to locate that issue would be to use conditional debugging and reporting, and once turned on, it would be very obvious that the parameters to the Macro were incorrect.

I don't see how. After all, the macro is still getting all the parameters, it's just that the value of one of the six parameters for that particular call is wrong, e.g., PSPT is 16 instead of 18 because you are missing a line that sets it to 18 before that particular macro call, which is one of many. You can be left with dozens or even hundreds of lines to scan through looking for an easy-to-miss error, but even then that's only if you know that that is the particular error anyway, and it could well take a lot of debugging of the running system to get you to that point.

Personally, if I had to generate those tables and had only your macros available, I would probably not use macros at all but just write out the tables by hand, as I see that as being more likely to be reliable.

cj7hawk · Apr 6, 2024

cjs said:
I don't see how. After all, the macro is still getting all the parameters, it's just that the value of one of the six parameters for that particular call is wrong, e.g., PSPT is 16 instead of 18 because you are missing a line that sets it to 18 before that particular macro call, which is one of many. You can be left with dozens or even hundreds of lines to scan through looking for an easy-to-miss error, but even then that's only if you know that that is the particular error anyway, and it could well take a lot of debugging of the running system to get you to that point.

Personally, if I had to generate those tables and had only your macros available, I would probably not use macros at all but just write out the tables by hand, as I see that as being more likely to be reliable.

I feel like I'm not understanding your example properly, especially as I already support all the same ways of calling a macro that other macro assemblers seem to support.

It seems as though you're expecting the macro itself to error due to a missing value, which it can do, but if we're talking about a *wrong* value, I don't really see how it's all that different from any other error. Like most errors, when complete, something will be obvious since the code isn't working. There are just so many ways to debug that.

But given you have to type it in either way, or cut and paste it, then change values, you're comparing

this; ( which I do support as a way to do it ).

DPB$SD: DPB 128,18,77,1024,64,2
vs
DPB$SD: DPB 128,16,77,1024,64,2

to this?

SET PSPT,18 ; The number of physical sectors per track.
SET TRKS,77 ; The number of tracks on the drive
vs
SET PSPT,16 ; The number of physical sectors per track.
SET TRKS,77 ; The number of tracks on the drive

Or maybe even this;

SET PSIZE,128 ; The physical sector size in bytes.
SET PSPT,16 ; The number of physical sectors per track.
SET TRKS,77 ; The number of tracks on the drive
SET BLS,1024 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF,2 ; The number of tracks to offset.
DPBCD ; Generate the Disk Parameter Block in the BIOS.

to this;

SET PSIZE,128 ; The physical sector size in bytes.
SET TRKS,77 ; The number of tracks on the drive ; Note the missing PSPT entry before this one.....
SET BLS,1024 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF,2 ; The number of tracks to offset.
DPBCD ; Generate the Disk Parameter Block in the BIOS.

Perhaps I'm misunderstanding the point. But doing it with full line statements seems a lot easier to understand than a series of numbers... And I do make mistakes in my DPBs from time to time when just relying on numbers. So I usually put in a lot of comment to remind me what the number represents, and what is a normal kind of result to expect to find there.

Also, I can't see the above as a problem that even requires more than a single glance to notice the issue once it's not working, since I would check the DPB, note the wrong PSPT, and go checking, then see the line was either wrong or missing. Seems pretty straightforward so I think I must be misunderstanding.

I can template stuff like that to detect errors, and a macro could just as easily zero out any label values on exit, so that missing values would be easy to spot, or could even generate a warning since none of the values should be zero except offset. Just showing the calculated capacity as a sanity check would be pretty good to quickly highlight any errors.
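A Python sketch of that idea: the macro refuses to run with any zero or unset label, reports the calculated capacity as a sanity check, and zeroes its labels on exit so the next call cannot inherit them. The `dpbcd` function and the `REQUIRED` tuple are hypothetical stand-ins for the macro, not its actual implementation.

```python
REQUIRED = ("PSIZE", "PSPT", "TRKS", "BLS", "NDIRS")

def dpbcd(labels):
    """Check the labels, return the data capacity in bytes, then clear them."""
    missing = [name for name in REQUIRED if labels.get(name, 0) == 0]
    if missing:
        raise ValueError("DPBCD: unset labels: " + ", ".join(missing))
    off = labels.get("OFF", 0)                 # offset may legitimately be 0
    capacity = (labels["TRKS"] - off) * labels["PSPT"] * labels["PSIZE"]
    for name in REQUIRED + ("OFF",):
        labels[name] = 0                       # nothing carries over
    return capacity

labels = {"PSIZE": 128, "PSPT": 26, "TRKS": 77,
          "BLS": 1024, "NDIRS": 64, "OFF": 2}
capacity_a = dpbcd(labels)

# Second DPB with the PSPT line accidentally left out.
labels.update({"PSIZE": 256, "TRKS": 80, "BLS": 2048, "NDIRS": 64, "OFF": 2})
try:
    dpbcd(labels)
    error = None
except ValueError as exc:
    error = str(exc)
```

This catches a forgotten SET, but not a SET with the wrong number in it, which would still need the capacity printout or a look at the generated DPB.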

But I feel like I may have missed the point completely? The current model with expanded capacity following this question should handle both ways of doing it as a macro, so it's not really any different to the other assembler detail provided since it can be done both ways. It can do it on the command line, but I don't know what a reasonable upper limit to a single command line should be.

Or if you mean you might assume ARG3 is PSPT instead of ARG2 in the macro itself, well, it's entirely possible to generate local variables for the macro itself, e.g., EQU PSPT,ARG2 - that works too - and if you prioritise the local version over any existing global version, and delete the local version post-macro execution, then it doesn't interfere with the use of that label. All the stuff you can do with includes, except nesting, you can do with macros, which only chain. ( Which again, is more than Microsoft Macro Assembler seems to allow ).

The number of arguments is easy to make available to the Macro, so it can check and error if required if the first way is better. But it seems that as the quantity of numbers increases, a big string of numbers is less desirable.

Additionally, it's entirely possible to write the macro so it works the same as other assemblers when called from the source. Even on the current version of code. And I can already do a few things that some other macro assemblers don't seem to be able to do, such as writing extended assembler directives and changing the actual assembler system variables. Writing new instructions is also supported, as is complicated maths, even on the parameter line, as all expressions are evaluated by the same code when read from the source.

Hence what I need to know is how many parameters should be passable as direct arguments on the same line as the call... So far the best number I have is 16, but I haven't noted a practical example that exceeds 10 yet, and given the assembler assembles itself, it's fairly trivial for a programmer to customize that element of the assembler without wasting space on code that they might never use.

I'm not trying to write my assembler to work the same as other assemblers though... Already it appears I've chosen a good balance between size and capability. I am trying to include the useful functionality of other assemblers, however, so I need to better understand what that functionality should be.

cjs · Apr 6, 2024

cj7hawk said:
SET PSIZE,128 ; The physical sector size in bytes.
SET TRKS,77 ; The number of tracks on the drive ; Note the missing PSPT entry before this one.....
SET BLS,1024 ; The block size.
SET NDIRS,64 ; The number of directory entries
SET OFF,2 ; The number of tracks to offset.
DPBCD ; Generate the Disk Parameter Block in the BIOS.

Yes, that's the one. In my example of how a problem could arise, say PSPT needs to be set to 14. Above, PSPT silently uses the value 16 instead, since your macro just reads global variables instead of taking parameters, so there's no way to tell that the current value of 16 in PSPT was meant for a different invocation of DPBCD, not this one.
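To make the failure mode concrete, here is a small Python model of the two calling styles. The `symbols` table and both function names are hypothetical. It shows only the behaviour being described, not how any particular assembler implements it.

```python
symbols = {}  # hypothetical global symbol table written by SET

def set_label(name, value):
    symbols[name] = value

def dpbcd_from_globals():
    """Reads whatever the globals currently hold, stale or not."""
    return symbols["PSIZE"], symbols["PSPT"], symbols["TRKS"]

def dpbcd_from_args(*args):
    """Takes explicit parameters, so a missing one is caught at the call."""
    if len(args) != 3:
        raise TypeError(f"DPBCD expects 3 arguments, got {len(args)}")
    return args

# First drive: every value is set.
set_label("PSIZE", 128); set_label("PSPT", 16); set_label("TRKS", 77)
first = dpbcd_from_globals()

# Second drive: PSPT should be 14, but the SET line was forgotten.
set_label("PSIZE", 128); set_label("TRKS", 77)
second = dpbcd_from_globals()   # still sees PSPT = 16, with no error
```

Here `second` comes out identical to `first`, while `dpbcd_from_args(128, 77)` raises an error straight away.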

cj7hawk said:
Also, I can't see the above as a problem that even requires more than a single glance to notice the issue once it's not working, since I would check the DPB, note the wrong PSPT, and go checking, then see the line was either wrong or missing. Seems pretty straightforward so I think I must be misunderstanding.

But when you start seeing mysterious failures in your OS, possibly even weeks down the road, how do you know that it's a wrong parameter in the DPB and not something else somewhere?

cj7hawk said:
I can template stuff like that to detect errors, and a macro could just as easily zero out any label values on exit....

And so what do you do about macros that require parameters that sometimes might be zero?

I also am guessing that you're proposing the programmer do all this checking for and setting of sentinel values in the macro itself, which just seems like a lot of extra work for a form of type checking that a language should be doing for you.

That said, it is your language, and you may be a lot less concerned about early error checking than someone like me.

cj7hawk said:
And I can already do a few things that some other macro assemblers don't seem to be able to do, such as writing extended assembler directives and changing the actual assembler system variables. Writing new instructions is also supported, as is complicated maths, even on the parameter line, as all expressions are evaluated by the same code when read from the source.

I'm wondering if your lack of familiarity with other macro systems is also part of the issue. Everything you've described above is standard stuff for most macro systems. You might find it worthwhile to learn how other macro assemblers do all the above.

daver2 · Apr 6, 2024

And, of course, you can have parameters that are blank. There is usually some form of IF to check for blank parameters (e.g. IFB or IFNB).

Dave

cj7hawk · Apr 6, 2024

daver2 said:
And, of course, you can have parameters that are blank. There is usually some form of IF to check for blank parameters (e.g. IFB or IFNB).

Dave

What does a blank parameter look like? Something like macroname 1,,2 I would assume?

I have no idea what a blank parameter should come out as... I do trap it and invalidate it, since assigning a meaning to nothing is generally not a good idea. I could allow it to evaluate as 0, but that's not a great idea either. The easiest way to implement it in my macros is to leave those parameters as optional. There's no requirement to use all of the parameters in any macro, and detecting that they were not used is also possible within the macro. That is already implemented, and a macro can use any number of parameters, and can choose whether to use additional parameters or not without changing the macro name ( ie, just one macro handles different numbers of parameters ).
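For what it's worth, here is one way a blank could be represented, sketched in Python. The function name and the choice of None as the "blank" marker are assumptions for illustration, not how my assembler or any other one necessarily does it.

```python
def split_args(operand_field):
    """Split a macro operand field on commas, marking blank slots as None."""
    if not operand_field.strip():
        return []                      # no arguments at all
    return [part.strip() or None for part in operand_field.split(",")]

args = split_args("1,,2")              # models: macroname 1,,2
second_is_blank = args[1] is None      # the kind of test an IFB-style check makes
```

That way a blank stays distinct from a zero, and the macro can decide for itself what a blank means.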

I'm not sure what problem that solves though.

cj7hawk · Apr 6, 2024

cjs said:
Yes, that's the one. In my example of how a problem could arise, say PSPT needs to be set to 14. Above, PSPT silently uses the value 16 instead, since your macro just reads global variables instead of taking parameters, so there's no way to tell that the current value of 16 in PSPT was meant for a different invocation of DPBCD, not this one.

There are two different ways to do it with my current macro system. The easiest is to zero it. A more complex way is to make it local and just do it the same as any other macro. Only the definition syntax is different.

cjs said:
But when you start seeing mysterious failures in your OS, possibly even weeks down the road, how do you know that it's a wrong parameter in the DPB and not something else somewhere?

That sort of stuff happens, but usually because I didn't test a specific case when testing. I have one such bug I haven't gone looking for - where single character com files won't execute properly on a newly logged drive, but all others will. It's annoying, but retyping the command fixes it, and I'm not expecting it to be too difficult to fix - DPB errors I usually fix pretty quickly though - I've hit a few of them and find them pretty quick to solve... Usually as soon as I notice them, which might be why the example isn't gelling with me.

cjs said:
And so what do you do about macros that require parameters that sometimes might be zero?

Nothing. It's only a valid check when you know a zero value isn't valid.

cjs said:
I also am guessing that you're proposing the programmer do all this checking for and setting of sentinel values in the macro itself, which just seems like a lot of extra work for a form of type checking that a language should be doing for you.

I didn't say it couldn't do it, I just implemented it in a different way that doesn't do it by default... For example the missing parameter example Daver2 provided works really well with my code, specifically because it doesn't care how many parameters you enter, but it's entirely possible to check for missing parameters with a single line of code in the macro, for the cost of a few bytes. Then it's up to the programmer. If it's not important, I don't see why they would implement a test for the correct number of parameters, and if it is, then they would normally do it. I just create a mechanism both ways instead of forcing one or the other.

cjs said:
That said, it is your language, and you may be a lot less concerned about early error checking than someone like me.

I might be, but then again, I have no idea of other use cases, and writing an assembler - at least a useful one - means considering things I had never considered. A brief list of help from this thread so far includes;

* Conditional assembly.
* Includes
* Binary Includes
* Macros
* Can add opcodes as macros also
* Can add assembler directives as macros
* Can choose between storing macros on disk or in memory
* Can create relocatable code without object files.
* Can emulate the outcomes from object use without objects.
* Can create assembly time debugging routines.
* Macro syntax looks like any other directive.
* More mathematical operators.
* Better data structures.
* Better reports

None of that is supported by the cross-assembler, and originally, I was just trying to build a z80 version of the cross-assembler so I can assemble code on my PC or on z80.

So yeah, this thread is incredibly helpful to me.

cjs said:
I'm wondering if your lack of familiarity with other macro systems is also part of the issue. Everything you've described above is standard stuff for most macro systems. You might find it worthwhile to learn how other macro assemblers do all the above.

Yes, that's definitely a big contributor.

There is remarkably little information to assist in writing either OS's or Assemblers. Lots of information on the detail, but nothing that describes "Here's how to approach the situation"... I'm on my second assembler now and I'm still learning new stuff and rediscovering things that others knew long ago.

Svenska · Apr 6, 2024

Be aware of feature creep. If your scope is to assemble a single project or to reimplement a single assembler, then you don't need to build a fully-featured product. Realistically, you won't have many users either way; at least I don't see a reason to switch to your project yet.

There are many good Z80 assemblers out there already. Some are more advanced than others, some live in their niche (e.g. cassette operation), others are tied to some ecosystem (e.g. compilers or linkers) or have special support (e.g. GBZ80). I've used z80asm because... it is in the Debian repositories and seems to work, that's it.

Unfortunately, more advanced features are rarely portable to other products anyway, unless they specifically target compatibility.
