If you are repeatedly hitting limitations and they annoy you, you should maybe consider doing something about them.
It's more that there's been lots of good ideas, and I'm working out which ones I can fit into the current architecture. Some I still like, but they aren't going to make the first cut, if ever. I've set myself an 11.75K hard limit on the assembler size, on disk, and I've just cracked 10K with the addition of sending the console output to a file, since exporting things like the entire label list is a big output.
But lately, it feels like I'm approaching a final specification for the design.
The biggest challenge, though, is that I didn't have this information when I started, so changing some things is essentially the same as rewriting the assembler and its architecture from scratch. It's not impossible, but it doesn't make sense when it now does everything I wanted, plus a whole bunch of stuff I didn't know I wanted when I set out.
Scoping (or namespaces) is an issue that has been raised before, especially in the macro context. I'm not suggesting that you should build a complex system, but a few flags marking symbols appropriately might be enough to resolve some of these issues. Not all assemblers allow a label named "ADD", for example, which is stupid, in my opinion.
Actually I've never heard of an "ADD" function for labels? Can you tell me how labels can be added together?
Or do you mean an "ADD" label as in something like ADD: or EQU ADD,VALUE? An ADD label is possible on my assembler, but not an ADD macro since the ADD opcode would take preference in the parser. Well, it is possible to have an ADD macro too, but it would be impossible to call it.
Given it's a z80 assembler, you can imagine that I already have labels like A, HL, DE, BC etc., and INC, DEC, LD and all the other opcodes are labels as well. Only system variables are reserved, and even that can be bypassed when I finish the current objective list. I support some pretty crazy label names. The only exceptions are no spaces (though not always) and no operators. But labels can start with numbers, be numbers, and also include some non-alphanumerics. Underscore is an obvious one.
Maximum label length is 120 characters. It used to be 256, but I made it more practical. That's the buffer processing limit for anything that represents a single element in the source.
I've also set up a flag system for special labels to act as group markers, with the capability to slot functions in and out, so an include could create its own label space and turn off the other label spaces, but still recover its own label space for Pass 2. Or it can delete its label space between passes, or it can use temporary labels... Or it can just use global labels. Nothing is set in stone there, it's all up to the programmer and it's easy enough to mix and match. But reuse of the same label over and over is definitely possible.
The switch function creates a "group" of subsequently declared labels, and can have the label search bypass or include the group. Label groups can even be nested - eg, a group can turn on and off groups, even hiding groups from group commands to create group hierarchies. Though I'm still writing the supporting code for that. It's intended to allow included modules to have access to their own labels, always in memory.
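To make the group idea concrete, here is a minimal sketch in Python rather than Z80. It is not the assembler's actual code: the class name, method names and group names are all hypothetical, and it only models the behaviour described above, where a switch starts a group, later labels land in it, and the search can include or bypass each group.

```python
class LabelTable:
    """Toy label table with switchable groups (all names are hypothetical)."""

    def __init__(self):
        # Group name -> {"enabled": bool, "labels": {label: value}}.
        # Dicts keep insertion order, so newer groups come later.
        self.groups = {"GLOBAL": {"enabled": True, "labels": {}}}
        self.current = "GLOBAL"

    def switch(self, group):
        """Start (or resume) a group; later declarations land in it."""
        self.groups.setdefault(group, {"enabled": True, "labels": {}})
        self.current = group

    def set_enabled(self, group, enabled):
        """Include the group in label searches, or make searches bypass it."""
        self.groups[group]["enabled"] = enabled

    def declare(self, name, value):
        self.groups[self.current]["labels"][name] = value

    def lookup(self, name):
        """Search enabled groups only, most recently created first."""
        for group in reversed(list(self.groups.values())):
            if group["enabled"] and name in group["labels"]:
                return group["labels"][name]
        raise KeyError(name)


table = LabelTable()
table.declare("LOOP", 0x100)          # global LOOP
table.switch("MODULE_A")
table.declare("LOOP", 0x200)          # same name, reused inside the group
inside = table.lookup("LOOP")         # the group's LOOP wins while it is on
table.set_enabled("MODULE_A", False)
outside = table.lookup("LOOP")        # falls back to the global LOOP
```

The group's labels are still stored while it is switched off, so turning it back on for Pass 2 recovers them without redeclaring anything.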
Also, the final change to the current plans is to add some minimal nesting capability to macros. They can't nest like includes, but I now want macros to be able to call macros rather than just chaining them. Fortunately, unlike includes, there's not a lot of state to save when nesting macros, so I only need to reserve a small amount of memory for nesting (and it counts as "program memory" within my imposed limit). This will make better use of local variables so that programmers can use macros they didn't write. Programmers can also *force* other includes to operate in local mode even if the include or macro wasn't designed that way. It's very flexible.
I settled on 9 post-call arguments per include or macro in the end, and the arguments are always assigned to the same labels (ie, ARGC, ARG1, ARG2 .... ARG9), just like they are in many shells, but the programmer can rename/reassign/declare them for a specific routine, which makes them permanent and/or local to that routine. The number of arguments set by the last call (either an include or a macro) is held in the variable "ARGC", so code can quickly check that the right number of arguments exist, or error and fail assembly. Speaking of which, I still need a "FATAL" opcode to trigger a fatal assembly error when that happens.
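As a rough illustration of the argument convention, the following Python sketch binds call arguments to ARGC and ARG1..ARG9 and checks the count. The function names and the exception are hypothetical; the exception merely stands in for the not-yet-written "FATAL" opcode.

```python
MAX_ARGS = 9  # the limit of 9 post-call arguments described above


class FatalAssemblyError(Exception):
    """Hypothetical stand-in for a FATAL opcode aborting assembly."""


def bind_args(args):
    """Return the labels a call would define: ARGC plus ARG1..ARGn."""
    if len(args) > MAX_ARGS:
        raise FatalAssemblyError(f"too many arguments: {len(args)}")
    labels = {"ARGC": len(args)}
    for index, value in enumerate(args, start=1):
        labels[f"ARG{index}"] = value
    return labels


def require_argc(labels, expected):
    """Fail assembly unless exactly `expected` arguments were passed."""
    if labels["ARGC"] != expected:
        raise FatalAssemblyError(
            f"expected {expected} arguments, got {labels['ARGC']}"
        )


labels = bind_args([0x4000, 16])
require_argc(labels, 2)  # passes silently; a mismatch would raise
```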
Groups and other label functions can also exist outside of includes and macros, so there's no reason individual routines can't be declared locally, and groups can be turned on and off at will by any code, which means hiding group names allows reuse of even group names from other group controllers within called subroutines.
You are not the first person in the world to see a Z80 and you don't live in a vacuum, so you don't need to invent everything at the same time.
Nobody will hold a grudge if you use a modern IDE on a high-resolution, flicker-free color screen instead of WordStar to write your code. Tools are available, feel free to use them. Even if you eventually will replace all of them for fun. Debugging an emulator is done far better using modern and proven test suites; debugging the assembler is much easier with a decent debugger.
That's why I want two assemblers: a paired z80 assembler and a Windows 10/11, Linux and Mac compatible cross-assembler. I want to have my cake and eat it too. And I'll put more into the cross-assembling IDE than the z80 version has, but functionally, I want them both to use the same syntax and assemble the exact same code from source. A pair so I can assemble locally (keep in mind I'm intending to support JIT assembly in the OS) and still do my development on my PC without the memory limitations. The only time that won't work is when the z80 version runs out of memory, although as noted, the target architecture supports 1MB of memory and even virtual memory, so even that limit may not be an issue in the long term.
I hit a problem with my original first-gen cross-assembler today. I wanted to do something like "BUFFER: BLOCK BUFFERSIZE+1" and it kept failing since the BLOCK command expects only a single number, not a formula. It works nicely under the new assembler, but the old cross-assembler, while mostly compatible, doesn't recognize mathematical declarations in the same way as the z80 assembler does. It has a very rigid data structure, while the new z80 assembler treats things in a more intuitive way. If a number is expected, then all that matters is that a number is provided. HOW it is provided doesn't matter. It can be a specific number, a label, an operator or a long formula involving combinations of all of these, which is evaluated on assembly. When I write the new cross-assembler to be compatible with the z80 assembler that limitation will go away, though both will be backward compatible with the original cross-assembler.
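The "any operand that needs a number accepts a formula" idea can be sketched with a small recursive-descent evaluator. This is Python, not the assembler's code, and it makes simplifying assumptions that are mine: decimal literals only, labels that look like ordinary identifiers, the four basic operators with integer division, and parentheses. The function names are hypothetical.

```python
import re


def tokenize(text):
    """Split an operand into numbers, label names and operators."""
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise ValueError(f"unexpected character in {text!r}")
    return tokens


def evaluate(text, labels):
    """Evaluate an operand such as 'BUFFERSIZE+1' against a label table."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        token = take()
        if token is None:
            raise ValueError("unexpected end of expression")
        if token == "(":
            value = expr()
            if take() != ")":
                raise ValueError("missing ')'")
            return value
        if token == "-":
            return -factor()
        if token.isdigit():
            return int(token)
        if token in labels:
            return labels[token]
        raise ValueError(f"undefined label or stray token: {token}")

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= factor()
            else:
                value //= factor()  # integer division (an assumption)
        return value

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()}")
    return result


# A BLOCK-style directive only ever sees the final number.
size = evaluate("BUFFERSIZE+1", {"BUFFERSIZE": 128})
```

With this shape, a directive like BLOCK never needs to know whether its operand was a literal, a label or a formula; it just asks for a number.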
One of the best things that came out of this thread is the includes and macros and especially the binary includes. I was wondering how to create extensive system functions without using up the RSTs, interrupts or other non-dedicated zero page hooks. Now I can just have a SYSTEM.ASM file on my boot disk, and code can INCLUDE "SYSTEM.ASM" and it will define a bunch of macros to do common system tasks, eg, install drivers, hook interrupts, access extended code space beyond 64K, identify code blocks in the Memory management units, etc. And I can create another called "GRAPHICS.ASM" with graphics extensions. And another "CALCULAT.ASM" for Maths Functions. That lets me predefine all the system functions in an editable, modifiable way without imposing any of it on the programmer as a requirement.
This is important since the target architecture this is designed for has a unified memory/disk architecture, so all memory use shows up on M: and memory blocks need not be contiguous. So there needs to be a way to address memory blocks directly and page parts of a file in and out directly, which means having routines in memory to do that, just like random access to a file. It's also possible just to use random access through the BDOS to achieve the same effect, but blocks in memory are 4K each, so being able to page in Block-N from File <X> to memory-page Y can be turned into a macro, extending the z80 architecture to some extent.
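The arithmetic behind "page in Block-N from File <X> to memory-page Y" is simple enough to sketch. This Python model is only an illustration: it assumes memory pages are the same 4K size as the blocks, treats the file and the 64K address space as plain byte buffers, and uses a hypothetical function name. The real routine would of course go through the memory management hardware.

```python
BLOCK_SIZE = 4096        # 4K blocks, as described above
ADDRESS_SPACE = 65536    # the z80's 64K, i.e. 16 pages of 4K (assumed)


def page_in(memory, file_data, block_n, page_y):
    """Copy block `block_n` of a file into memory page `page_y`.

    Returns the address at which the block now starts.
    """
    src = block_n * BLOCK_SIZE
    dst = page_y * BLOCK_SIZE
    if block_n < 0 or src >= len(file_data):
        raise IndexError(f"file has no block {block_n}")
    if page_y < 0 or dst + BLOCK_SIZE > len(memory):
        raise IndexError(f"no memory page {page_y}")
    # A short final block is padded with zeros to a full page.
    chunk = bytes(file_data[src:src + BLOCK_SIZE]).ljust(BLOCK_SIZE, b"\x00")
    memory[dst:dst + BLOCK_SIZE] = chunk
    return dst


memory = bytearray(ADDRESS_SPACE)
file_data = bytes([1]) * BLOCK_SIZE + bytes([2]) * 100  # one full block, one partial
address = page_in(memory, file_data, block_n=1, page_y=3)
```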
It seems quite an elegant, modular, approach to the problem.
I may not need to invent everything at the same time, but I got a huge number of ideas from this thread, so I'm grateful. It would have been nice to have them all before I started, but that's part of the learning curve.