
Kaypro 4-83 graphic overlay

gertk

Hello,

Since the Kaypro 4-83 does not have any graphic capabilities I decided to develop a small simple graphic extension board.
The board creates a graphical 560x240 pixel overlay on top of the text display. The board has its own memory.

Synchronization with the original video output is done by connecting some wires to a few points and lifting one pin in the video circuit of the mainboard.
To keep interfacing simple I decided not to try and get it memory mapped but instead control the overlay board through the two unused B ports of the two Z80 PIO adapters.
The mainboard has labeled solderpads for all the unused pins.
After first adding pin headers to those solder pads next to the PIOs, connecting the board is fairly easy.

Writing to the graphic memory is done in three steps:

  1. set X offset
  2. set Y offset
  3. write data

All done by IN and OUT instructions to the PIO chips.

It is also possible to read back the data from the graphic memory.

The rather odd resolution of 560 pixels horizontal comes from the fact that the Kaypro displays 5x8 character bitmaps in a 7x10 matrix.
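
As a quick check of that arithmetic, here is a minimal Python sketch. It assumes the Kaypro's standard 80 column by 24 row text screen (not stated above), and the constant names are only illustrative.

```python
# Character cell geometry as described above: 7 dots wide, 10 scan lines high.
CELL_WIDTH = 7
CELL_HEIGHT = 10

# Assumption: the standard 80 column by 24 row Kaypro text screen.
TEXT_COLUMNS = 80
TEXT_ROWS = 24

overlay_width = TEXT_COLUMNS * CELL_WIDTH    # pixels per scan line
overlay_height = TEXT_ROWS * CELL_HEIGHT     # visible scan lines

print(overlay_width, overlay_height)
```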

So far I have made a prototype and tested it with BBC BASIC and some small assembler routines (for speed), and the results are already quite nice.
There are still some small glitches when writing to the graphic memory resulting in some artifacts which I hope might be solvable by synchronizing some of the outputs of the GAL 16V8 (which acts as 'logic glue' on the board).

When the Kaypro is reset the output of the graphics overlay is disabled and the Kaypro works as usual.
By setting a single bit to zero the graphics overlay is activated.

The whole circuit consists of 6 TTL chips, a fast 32 kByte SRAM chip and a GAL16V8 chip.

The (crude) schematic is attached as a PDF. I will add a list of which connections go where if anybody is interested.

View attachment kpgfx.pdf

some example images
2020-10-14 14.03.02_small.jpg

2020-10-13 12.26.06_small.jpg

photo_2020-10-15_20-58-17.jpg

the prototype
photo_2020-10-08_22-09-49.jpg

Gert
 
Nice!

Looking at the schematic I see you have A0-A6 connected to the "X" latch and A7-A14 connected to the "Y" latch; I assume that means that the way you have the framebuffer laid out it's essentially 1024 bit-wide (128 byte) line that you're only using part of... I guess the question I'd have: Since the Kaypro natively has 7 pixel wide characters, are you using 80 bytes across and not using one of the bits in each character, or are you using 70 full bytes? ... Looking at the schematic I'm going to guess it's the first, because you're using a signal called "DCTC" as both the load signal for 74LS166 and to clock the address counters. (Which I'm guessing is pulled from the Kaypro's shift register load signal?)
 
Nice!

Looking at the schematic I see you have A0-A6 connected to the "X" latch and A7-A14 connected to the "Y" latch; I assume that means that the way you have the framebuffer laid out it's essentially 1024 bit-wide (128 byte) line that you're only using part of...

That is correct, the Kaypro uses a similar scheme: 128 bytes per line with 80 bytes visible, by using a clever way of wiring the address lines they still manage to keep it in 2 kBytes.
My counters are chugging along with this scheme resulting in 128 bytes per line and by combining the HS and VS signals it blanks out the rest (by resetting the shift register)
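
To make the layout concrete, here is a small Python sketch of the address mapping, assuming the X latch drives A0-A6 and the Y latch drives A7-A14 as discussed. The function and constant names are made up for illustration and are not taken from any code for the board.

```python
BYTES_PER_LINE = 128     # counter stride: A0-A6 select the byte within a line
VISIBLE_BYTES = 80       # only the first 80 bytes of each line are displayed
VISIBLE_LINES = 240      # overlay height in scan lines
SRAM_SIZE = 32 * 1024    # the board's 32 kByte SRAM


def overlay_address(x_byte, y_line):
    """Combine the X latch (A0-A6) and Y latch (A7-A14) into an SRAM address."""
    if not 0 <= x_byte < BYTES_PER_LINE:
        raise ValueError("x_byte must fit in 7 bits")
    if not 0 <= y_line < 256:
        raise ValueError("y_line must fit in 8 bits")
    return (y_line << 7) | x_byte


# Address of the last visible byte on the last visible line.
last_visible = overlay_address(VISIBLE_BYTES - 1, VISIBLE_LINES - 1)
print(hex(last_visible))

# Memory spanned by the visible lines versus bytes actually shown.
print(BYTES_PER_LINE * VISIBLE_LINES, VISIBLE_BYTES * VISIBLE_LINES)
```

With a 128 byte stride the 240 visible lines span 30720 bytes, so the whole overlay fits in the 32 kByte SRAM even though 48 bytes of every line are never shown.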

I guess the question I'd have: Since the Kaypro natively has 7 pixel wide characters, are you using 80 bytes across and not using one of the bits in each character, or are you using 70 full bytes? ... Looking at the schematic I'm going to guess it's the first, because you're using a signal called "DCTC" as both the load signal for 74LS166 and to clock the address counters. (Which I'm guessing is pulled from the Kaypro's shift register load signal?)

Yes I use 7 bits per character also, resulting in the 560 pixels horizontal resolution.

/DCTC is the character clock (inverted) taken from pin 1 of U6, it also latches the parallel input of the 74F166 shift register
TC is the non inverted signal coming out pin 15 of the pixel counter U1 (both signals are also available on U4)
I needed both edges for the 74HC590A counters (one to count and one to latch the output register)

The 13.9 MHz signal is taken from pin 8 of U2

The HS signal is coming from pin 8 of U32 (aka A10 of the character rom), VS is taken from solder pad E4

Data lines KP_D0 to KP_D7 are connected to solderpads E14 to E7 (U52 PIO)

/ENA is connected to E18 (U52 PBRDY line, it pulses for every data byte written to port B of the PIO)
Solderpad E16 (U52 /PBSTB) is connected to ground (this gives the shortest pulse on E18, about 500 ns on my 5 MHz clocked Z80)

LDLO connects to E29 (PB0 of U72) when high selects the X latch
LDHI connects to E28 (PB1 of U73) when high selects the Y latch
KPWR connects to E31 (PB2 of U73) when writing this signal is high, reading is low
/VON connects to E34 (PB7 of U73) when low the graphic overlay is enabled, it is pulled high with a 10k resistor so when the Kaypro is reset the overlay is off.

Port B of U73 is used in mode 3, which leaves the other pins free for use as either inputs or outputs.
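
The control scheme above can be modelled in a few lines of Python. This is only a behavioural sketch under assumptions: the real code uses Z80 OUT instructions to port addresses that are not listed here, the exact ordering of the control and data writes is my reading of the three steps, and `OverlayModel` and `write_byte` are hypothetical names.

```python
# Control bits on port B of the control PIO, as listed above.
LDLO = 0x01    # PB0: high selects the X latch
LDHI = 0x02    # PB1: high selects the Y latch
KPWR = 0x04    # PB2: high when writing, low when reading
VON_N = 0x80   # PB7: /VON, low enables the overlay


class OverlayModel:
    """Toy model of the board: X/Y latches in front of 32 kByte of RAM."""

    def __init__(self):
        self.ram = bytearray(32 * 1024)
        self.control = VON_N   # the pull-up keeps the overlay off after reset
        self.x = 0
        self.y = 0

    def out_control(self, value):
        """Stands in for an OUT to port B of the control PIO."""
        self.control = value & 0xFF

    def out_data(self, value):
        """Stands in for an OUT to port B of U52 (one /ENA pulse)."""
        if self.control & LDLO:
            self.x = value & 0x7F          # X latch drives A0-A6
        if self.control & LDHI:
            self.y = value & 0xFF          # Y latch drives A7-A14
        # RAM is only written when neither latch is selected and KPWR is high.
        if not self.control & (LDLO | LDHI) and self.control & KPWR:
            self.ram[(self.y << 7) | self.x] = value & 0xFF

    def overlay_enabled(self):
        return not self.control & VON_N


def write_byte(board, x, y, data, overlay_on=True):
    """Assumed three-step sequence: set X, set Y, write data."""
    base = 0 if overlay_on else VON_N
    board.out_control(base | LDLO)
    board.out_data(x)
    board.out_control(base | LDHI)
    board.out_data(y)
    board.out_control(base | KPWR)
    board.out_data(data)


board = OverlayModel()
write_byte(board, x=3, y=5, data=0xAA)
print(hex(board.ram[(5 << 7) | 3]))
```

Passing `overlay_on=False` keeps /VON high during the writes, which matches the idea of clearing the graphics RAM first and only then switching the overlay on.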

/GFX_OUT from the GAL is fed into (the lifted) pin 5 of U15

In my prototype I used a 20V8 GAL and the equations are here (it turned out a GAL 16V8 can handle all the signals too)
photo_2020-10-16_11-10-24.jpg

None of the outputs of the GAL are synchronized (yet), which could explain the glitches I sometimes see.
In the new circuit I have made provision to synchronize with either /DCTC or TC but have not tried this yet.

BTW reading from the graphic memory needs an extra dummy 'IN' instruction because of the odd PBRDY/PBSTB handshake the PIO does when the port is in input mode.
 
That is correct, the Kaypro uses a similar scheme: 128 bytes per line with 80 bytes visible, by using a clever way of wiring the address lines they still manage to keep it in 2 kBytes.

I'm really curious how they did that, I'll have to see if I can dig up the schematic.

I've been working on my own video project, a from-scratch video card intended to emulate early S100 video cards like the Polymorphic VTI or Solid State Music VB1B (or a TRS-80's built-in video) in addition to doing full bitmap graphics, and a working constraint I initially adopted was that the memory for each video line had to be a power of two in width because it simplifies a number of things significantly. (Which is why 64x16 was a popular video format for a couple years.) I didn't know when I started that apparently that same constraint secretly applied to some early 80 column cards as well. The Kaypro's apparently another example.

None of the outputs of the GAL are synchronized (yet), which could explain the glitches I sometimes see.
In the new circuit I have made provision to synchronize with either /DCTC or TC but have not tried this yet.

I assume the glitches are the normal "hash" you see on video systems that don't arbitrate the CPU with wait states when the video memory is being accessed by the video out circuitry? (IE, like a TRS-80 Model I or an original IBM CGA card.) I'm not sure how you'd solve that without generating wait states or using dual-ported memory.
 
Most of the glitches/artifacts occur on the leftmost column and the pattern is often the same: 1010101 (and it gets stored into RAM)

There is also some (barely visible) hash during the writes but that was expected, the Kaypro video circuit has the same 'problem'

The timing is quite critical I guess as the write signal, /RAMWR, and the outputs of the latches and buffer are only active during the low portion of the /ENA signal (PBRDY).
Might be violating the timing rules of the SRAM a bit here :)

I did use a very fast SRAM chip (I have a whole bunch of the old 32 kByte PC cache rams of 15 and 20 ns in the 'skinny' DIP28 format)

Below I attached the schematics for the IO/Video and CPU circuits of the Kaypro 2 & 4-83.

View attachment kaypro4_83_cpu.pdf
View attachment kaypro4_83_io.pdf
View attachment kaypro4_83_video.pdf

P.S. It seems I have left out the logic equation for the /LOE and /CNTRES signals in the image, but they are as follows:

Code:
' COUNTER RESET '
CNTRES = VS;
CNTRES.OE = OE;

'LATCH and BUFFER OUTPUT ENABLE, active only when reading or writing data '
LOE = ENA * !LDLO * !LDHI;
LOE.OE = OE;
 
I posted the circuit diagrams of the CPU, IO and Video circuit of the Kaypro above.

Also tried to (RC) delay the output enables of the latches and buffer but it makes no difference.
Maybe I need to toggle the /CS or /OE of the SRAM but then I need to have an output on the GAL...
The video output inversion can be skipped as there is a spare NAND port on U5 available.
For blanking the output I can also use the shift register reset output freeing up one output on the GAL.
 
Just took a look at the Kaypro's video schematic, and I have to admit I'm stuck scratching my head why they used a six bit edge-triggered latch plus an 8:1 multiplexer instead of a shift register for dot generation, but I guess there's no reason why that wouldn't work?

For blanking the output I can also use the shift register reset output freeing up one output on the GAL.

I think I'm a little thick this morning to work it out, so I have some questions about what you've learned from driving the shift register!

First: You're driving the Load pin (and clocking the GAL) with a once-per-character pulse from the Kaypro's motherboard? My dumb question, since it's a thing I actually haven't tried yet with the video dingus I'm building: the characters in this machine are 7 bits wide. When a character is less than 8 bits does triggering a load automatically make the next bit clocked out bit0 again, or are you having to trigger a clear on the shift register? It doesn't *look* like you are with the GAL code.

Second: I'm curious, do you really need to work the clear input on the shift register? (And if you do I'd genuinely like to know why, in case it's a thing I need to worry about.) On the thing I'm building at first I thought I needed to use either the CE/Inhibit input or the clear input on the shift register to keep it from outputting recirculated data when I was outside the active video area, but then I realized if pin 1 was tied low like you have it'd just clock out an endless string of zeros when it wasn't within 8 cycles of the last load command. But maybe there's something about the overlay or cooperating with the Kaypro's native circuitry that makes you need it?

In case you're curious at all I'm using a GAL on my project too; I'll probably be using at least one other because the thing I haven't tackled yet is the memory arbitration, which may well run into whatever issues you're having. (I'm likewise intending to use a fast 32k SRAM; I have a couple 35ns Cypress parts on hand that I hope are fast enough to do the needful, but right now the prototype is just generating fixed content from a 70ns flash chip) The GAL that's in there now is clocked at the pixel clock (12mhz) and four of the outputs are set up in registered mode as a 4-bit counter that I can hold in halt/reset or allow to free-run. Address generation is done by an AVR 8-bit CPU running synchronously at the pixel clock, and the shift register load signal is generated by the GAL based on that counter, which I start free-running at the start of each active line area. (That frees up instructions otherwise needed to toggle the shift register so I can do more per character while still giving a predictable cadence so I can prevent doing the address loads that would unsettle the memory output when the load is going on.)
 
Just took a look at the Kaypro's video schematic, and I have to admit I'm stuck scratching my head why they used a six bit edge-triggered latch plus an 8:1 multiplexer instead of a shift register for dot generation, but I guess there's no reason why that wouldn't work?

No idea why they used this elaborate concept, also using a 4 bit counter starting at 9 to get 7 dots effective, then only using 5 bits from the rom, scanning 10 lines per character vertically and use only 8... A10 and the enable pin(s) of the rom are also used for blanking. The character counter is running from 0-127 and 0-79 are visible (maybe that way it was easier to generate the horizontal timing)
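
The counter arithmetic can be sketched in Python. This assumes the 4 bit counter counts up from its preload value to 15 and is then reloaded, which is the usual arrangement but is not spelled out above. The names are illustrative only.

```python
COUNTER_BITS = 4
PRELOAD = 9   # value the dot counter starts from, per the description above

# Assumption: the counter runs from PRELOAD up to its terminal count (15).
dot_states = list(range(PRELOAD, 1 << COUNTER_BITS))
dots_per_character = len(dot_states)

ROM_BITS_USED = 5          # dots per character that come from the character ROM
SCANLINES_PER_ROW = 10     # scan lines per character row
GLYPH_SCANLINES = 8        # scan lines that actually carry glyph data

print(dot_states)
print(dots_per_character - ROM_BITS_USED, "spacing dots per character")
print(SCANLINES_PER_ROW - GLYPH_SCANLINES, "spacing lines per row")
```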

I think I'm a little thick this morning to work it out, so I have some questions about what you've learned from driving the shift register!

First: You're driving the Load pin (and clocking the GAL) with a once-per-character pulse from the Kaypro's motherboard? My dumb question, since it's a thing I actually haven't tried yet with the video dingus I'm building: the characters in this machine are 7 bits wide. When a character is less than 8 bits does triggering a load automatically make the next bit clocked out bit0 again, or are you having to trigger a clear on the shift register? It doesn't *look* like you are with the GAL code.

Yes, parallel loading the shift register is shifting bit D7 (input H) out first so that way it stays in sync with the 7 bit width scheme. I'm keeping the shift register reset during the blanking (vertical and horizontal) otherwise the data keeps on going for all 128 bytes and I get retrace problems. The serial input can either be tied to ground or Vcc since the shift register never gets to the bit from input A (aka D0)
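
Here is a small Python sketch of what that means for software, assuming D7 is the leftmost dot and D0 is never displayed, as described above. The helper names are hypothetical.

```python
DOTS_PER_BYTE = 7   # only 7 of the 8 loaded bits reach the screen


def shift_out(byte, dots=DOTS_PER_BYTE):
    """Dots shifted out after a parallel load, input H (D7) first."""
    return [(byte >> (7 - i)) & 1 for i in range(dots)]


def pixel_to_byte_and_mask(x):
    """Map a horizontal pixel position (0-559) to a byte column and bit mask."""
    column, dot = divmod(x, DOTS_PER_BYTE)
    return column, 0x80 >> dot


print(shift_out(0b10000010))          # leftmost and rightmost visible dots set
print(shift_out(0b00000001))          # D0 alone never shows up
print(pixel_to_byte_and_mask(559))    # last pixel of the last visible column
```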

Second: I'm curious, do you really need to work the clear input on the shift register? (And if you do I'd genuinely like to know why, in case it's a thing I need to worry about.) On the thing I'm building at first I thought I needed to use either the CE/Inhibit input or the clear input on the shift register to keep it from outputting recirculated data when I was outside the active video area, but then I realized if pin 1 was tied low like you have it'd just clock out an endless string of zeros when it wasn't within 8 cycles of the last load command. But maybe there's something about the overlay or cooperating with the Kaypro's native circuitry that makes you need it?


To keep the screen from getting overlaid with random data during startup (making the Kaypro's own video unreadable), I keep the video output disabled until 'switched on' by the /VON input.
The graphics ram can then be cleared (or written) first and then switched on. At the moment I am using the /VON in the graphic video inverter stage of the GAL.
The /W output of the Kaypro video multiplexer U41 is fed into the NAND gate U15, used as inverter, by lifting pin 5 I have an extra input for my graphic video but it needs to be inverted too :D

In case you're curious at all I'm using a GAL on my project too; I'll probably be using at least one other because the thing I haven't tackled yet is the memory arbitration, which may well run into whatever issues you're having. (I'm likewise intending to use a fast 32k SRAM; I have a couple 35ns Cypress parts on hand that I hope are fast enough to do the needful, but right now the prototype is just generating fixed content from a 70ns flash chip) The GAL that's in there now is clocked at the pixel clock (12mhz) and four of the outputs are set up in registered mode as a 4-bit counter that I can hold in halt/reset or allow to free-run. Address generation is done by an AVR 8-bit CPU running synchronously at the pixel clock, and the shift register load signal is generated by the GAL based on that counter, which I start free-running at the start of each active line area. (That frees up instructions otherwise needed to toggle the shift register so I can do more per character while still giving a predictable cadence so I can prevent doing the address loads that would unsettle the memory output when the load is going on.)

When using 8 pixel wide characters there is no need to reset the shift register as every 8 pixels you need to clock in new data anyway, by tying the serial input low you can leave the dotclock running during blank, just stop the parallel load from happening and it will work fine.

Arbitration is the hardest part. You either need a synchronous system of video and CPU (as can be done with the 6502 for example: first half of the CPU clock is video access, second half is CPU access), or, as with a Z80, you have to add wait states or use some other clever technique.

I have thought of the following:
The CPU writes data and address to some latches and then the video circuit handles the writing (synchronously with the character clock)
With a fast SRAM chip this should be doable.

You have a 12 MHz dotclock and it gets divided by 8 for your character clock, so the character clock is 1.5 MHz and your time per character access is about 0.67 usec. By using the 'other edge' of that character clock you could write the data from the latches into the SRAM without disturbing the reads (which would happen on the leading edge of the clock). If the CPU never writes to the latches more often than once per character time (about 0.67 usec) there would be no contention. If the CPU is faster you could add a fixed wait state for the video memory range (or use some NOPs before accessing the video RAM again).
You do need a flipflop set by the writing of the latches by the CPU which indicates that writing is needed, and is reset by the actual write by the video circuit.
Otherwise the video ram will be filled with the same data over and over...
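
A behavioural sketch of that idea in Python, purely as an illustration: the class and method names are hypothetical, and the model ignores real timing beyond "one possible write per character clock".

```python
class DeferredWriter:
    """CPU writes go into latches; the video side commits them later."""

    def __init__(self, size=32 * 1024):
        self.ram = bytearray(size)
        self.addr_latch = 0
        self.data_latch = 0
        self.pending = False      # the flip-flop set by a CPU write
        self.writes_done = 0

    def cpu_write(self, addr, data):
        """CPU side: fill the latches and flag that a write is needed."""
        self.addr_latch = addr
        self.data_latch = data & 0xFF
        self.pending = True

    def character_clock_other_edge(self):
        """Video side: commit the latched write once, then clear the flag."""
        if self.pending:
            self.ram[self.addr_latch] = self.data_latch
            self.pending = False
            self.writes_done += 1


writer = DeferredWriter()
writer.cpu_write(10, 0x55)
for _ in range(3):                # three character clocks pass
    writer.character_clock_other_edge()
print(hex(writer.ram[10]), writer.writes_done)
```

Without the `pending` flag the same latch contents would be written on every character clock, which is the repeated-write problem mentioned above.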
 
Yes, parallel loading the shift register is shifting bit D7 (input H) out first so that way it stays in sync with the 7 bit width scheme. I'm keeping the shift register reset during the blanking (vertical and horizontal) otherwise the data keeps on going for all 128 bytes and I get retrace problems. The serial input can either be tied to ground or Vcc since the shift register never gets to the bit from input A (aka D0)

To keep the screen from getting overlaid with random data during startup (making the Kaypro's own video unreadable), I keep the video output disabled until 'switched on' by the /VON input.

... I was about to say I'm still a little confused, but I think I get it? I was about to say the datasheet I have for the '166 seems to indicate that when you hold "clear" the output at Qh is low, so that shouldn't really be any different than if clear isn't held low and the '166 is happily running along clocking out the low states it's getting from pin 1 being grounded, but... in the kaypro I assume the problem is you're still getting the "load" pin toggled and loading random memory contents even when you're out of the active video area, so you're pulling "clear" to squelch that.

(Would it be any more efficient to make the shift register load switchable on/and/off by the GAL instead of manipulating clear? I guess it probably wouldn't make much difference. At least the takeaway is my scheme should be fine as long as I'm only generating load signals in the active area.)

I was wondering about the less-than-8-bits thing because I'm thinking of trying to do a VGA output version, but because of the limits on how fast I can clock the CPU I'm using instead of a discrete timing chain I might have to do six bit wide characters instead of eight. (Right now I'm doing a 512x192 effective pixel area on composite with the 12mhz clock, I was thinking with a 20mhz clock, which is the limit without overclocking, I could do 384x192 (triple-scanned) using the exact VESA timings for 800x600@60hz, which uses a 40mhz clock.) I'll need to reprogram the GAL magic that generates the loads to reset the counter at six bit intervals instead of free-running and just rolling over...
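
The numbers in that plan can be checked with a short Python sketch. The 64 column count is an assumption (it is what 384 pixels at 6 dots per character works out to), and the constant names are illustrative.

```python
# VESA 800x600@60Hz figures as quoted above.
VESA_PIXEL_CLOCK_MHZ = 40
VESA_ACTIVE_WIDTH = 800
VESA_ACTIVE_HEIGHT = 600

DOT_CLOCK_MHZ = 20        # the CPU's limit without overclocking
CHAR_WIDTH = 6            # dots per character in the proposed mode
COLUMNS = 64              # assumption: 64 characters per line
LINES = 192
SCAN_REPEAT = 3           # triple-scanned

# Each generated dot covers this many VESA pixels.
vesa_pixels_per_dot = VESA_PIXEL_CLOCK_MHZ // DOT_CLOCK_MHZ
max_dots_per_line = VESA_ACTIVE_WIDTH // vesa_pixels_per_dot

used_dots = COLUMNS * CHAR_WIDTH
used_scanlines = LINES * SCAN_REPEAT

print(used_dots, "of", max_dots_per_line, "dots per line")
print(used_scanlines, "of", VESA_ACTIVE_HEIGHT, "scan lines")
```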

Arbitration is the hardest part. You either need a synchronous system of video and CPU (as can be done with the 6502 for example: first half of the CPU clock is video access, second half is CPU access), or, as with a Z80, you have to add wait states or use some other clever technique.

Yeah, I can definitely see why the home computer makers loved the 6502 so much. Slave the CPU to the pixel clock and the job's done for you. Alas I'm more interested in this phase targeting S-100 and TRS-80-like Z-80 applications. :)

I have thought of the following:
The CPU writes data and address to some latches and then the video circuit handles the writing (synchronously with the character clock)
With a fast SRAM chip this should be doable.

You have a 12 MHz dotclock and it gets divided by 8 for your character clock, so the character clock is 1.5 MHz and your time per character access is about 0.67 usec. By using the 'other edge' of that character clock you could write the data from the latches into the SRAM without disturbing the reads (which would happen on the leading edge of the clock). If the CPU never writes to the latches more often than once per character time (about 0.67 usec) there would be no contention. If the CPU is faster you could add a fixed wait state for the video memory range (or use some NOPs before accessing the video RAM again).
You do need a flipflop set by the writing of the latches by the CPU which indicates that writing is needed, and is reset by the actual write by the video circuit.
Otherwise the video ram will be filled with the same data over and over...

Yeah, I was trying to think at one point if a scheme like that with latches really helped me, but I keep getting stuck on the fact that while you might be able to use one to make *writes* synchronous you're boned when the CPU wants to read from memory; either you take the glitch or you need to generate wait states because you won't be able to preload a read latch for when the CPU request comes in.

Since the TRS-80 is one of my design inspirations I was looking at how the Model III de-glitched its display (the Model I didn't), and if I understand its service manual correctly it basically has logic to pull down the WAIT signal for the Z80 if video RAM is accessed outside of a blanking area and holds it until it goes into blank. Worst case that would halt the CPU for 64 characters' worth of video output. (IE, it looks like it makes no attempt to make the access hold any more granular than that.) If that's a legit way to go that's easy, I could just implement similar logic that defines "not in blank" by that line that lets the load counter in the GAL run. It does seem like there should be a "smarter" way to do it, though, definitely.

(Ultimately maybe I don't care about video glitch artifacts as long as I can actually make the memory read/writes reliable, but glitch free would be "nice".)

Here's a picture of the video output from mine as it stands. I just this week solved a problem with another function I had the GAL for, which was a selectable 2:1 clock divider for the pixel clock so I could do both "high" and low-res modes in hardware. (512 vs 256 pixel; the TRS-80 and some of the S-100 cards I want to be able to emulate had 32 column modes.) Shows high and low res alternately running with the same source bitmap. (With the lower res I can move a 32 bytes "viewport" around on a 64 byte line by adjusting counter offsets.)

hi_res_low_res.jpg
 
... I was about to say I'm still a little confused, but I think I get it? I was about to say the datasheet I have for the '166 seems to indicate that when you hold "clear" the output at Qh is low, so that shouldn't really be any different than if clear isn't held low and the '166 is happily running along clocking out the low states it's getting from pin 1 being grounded, but... in the kaypro I assume the problem is you're still getting the "load" pin toggled and loading random memory contents even when you're out of the active video area, so you're pulling "clear" to squelch that.
Yes the Kaypro counters keep running and thus /DCTC is also active during the retrace/blanking periods.

(Would it be any more efficient to make the shift register load switchable on/and/off by the GAL instead of manipulating clear? I guess it probably wouldn't make much difference. At least the takeaway is my scheme should be fine as long as I'm only generating load signals in the active area.)

Since the dotclock is not fed into the GAL it would be tricky to rely on the GAL to generate the load for the shift register.

Yeah, I was trying to think at one point if a scheme like that with latches really helped me, but I keep getting stuck on the fact that while you might be able to use one to make *writes* synchronous you're boned when the CPU wants to read from memory; either you take the glitch or you need to generate wait states because you won't be able to preload a read latch for when the CPU request comes in.

For reading you could do the same, it just takes two accesses: one for setting the address and the second for reading the data. That is what is happening with my board also since the /ENA pulse is happening after read from the PIO, and the data is thus latched too late, a second 'IN' instruction then reads the PIO latched data.
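
The dummy read can be illustrated with a simplified Python model. It is not cycle-accurate and only captures the behaviour described above: an IN returns whatever the PIO latched earlier, and the fresh byte is latched after that read. `PioInputModel` is a hypothetical name.

```python
class PioInputModel:
    """Simplified PIO input port: an IN returns the previously latched byte."""

    def __init__(self, board_output):
        self.board_output = board_output   # callable: byte the overlay RAM drives
        self.latch = 0x00                  # stale until the first /ENA pulse

    def in_data(self):
        value = self.latch                 # the CPU sees the old latch contents
        self.latch = self.board_output()   # /ENA pulses after the read and latches fresh data
        return value


ram_byte = 0x5A                            # pretend this is the addressed RAM byte
pio = PioInputModel(lambda: ram_byte)
dummy = pio.in_data()                      # first IN: stale value, triggers the latch
real = pio.in_data()                       # second IN: the byte that was addressed
print(hex(dummy), hex(real))
```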


Since the TRS-80 is one of my design inspirations I was looking at how the Model III de-glitched its display (the Model I didn't), and if I understand its service manual correctly it basically has logic to pull down the WAIT signal for the Z80 if video RAM is accessed outside of a blanking area and holds it until it goes into blank. Worst case that would halt the CPU for 64 characters' worth of video output. (IE, it looks like it makes no attempt to make the access hold any more granular than that.) If that's a legit way to go that's easy, I could just implement similar logic that defines "not in blank" by that line that lets the load counter in the GAL run. It does seem like there should be a "smarter" way to do it, though, definitely.

(Ultimately maybe I don't care about video glitch artifacts as long as I can actually make the memory read/writes reliable, but glitch free would be "nice".)

On the Acorn Atom the video is also not synchronized to the CPU (and they used a 6502 and a 6847 which can do that perfectly!) and with the games they just waited for the vertical blank to appear and then read/wrote the screen data. Outside that area there was also 'snow' on the screen. Alas, a lot of CPU cycles were lost.

Here's a picture of the video output from mine as it stands. I just this week solved a problem with another function I had the GAL for, which was a selectable 2:1 clock divider for the pixel clock so I could do both "high" and low-res modes in hardware. (512 vs 256 pixel; the TRS-80 and some of the S-100 cards I want to be able to emulate had 32 column modes.) Shows high and low res alternately running with the same source bitmap. (With the lower res I can move a 32 bytes "viewport" around on a 64 byte line by adjusting counter offsets.)

View attachment 64175

Pictures look very nice, and yes you can do some real magic with those GAL's but I often run out of outputs... :)
 
Since the dotclock is not fed into the GAL it would be tricky to rely on the GAL to generate the load for the shift register.

Yeah, I wasn't thinking of having the GAL generate the load, just mute it if you're not in the active area, IE, pseudocode:

/SRLOAD = /BLANK * /DCTC

Where "blank" is whatever combination of conditions that say you're on an active section of a line... but, actually, how you're doing it with holding clear is probably better. The thing that actually pushed me over into generating the LOAD signal with the GAL on my design is I was having some spurious issues after I added the "low-res" clock divider that looked like issues with the LOAD signal registering, and I *think* the culprit may have been the propagation delay added to the pixel clock was causing it to "miss" the load signal going direct from the timing. Whatever it was generating the load signal on a counter again directly in phase with the pixel clock cleaned it up. I suppose there's a chance if you gated LOAD on the GAL it would put the shifted pixels slightly out of register with the Kaypro's pixels?

For reading you could do the same, it just takes two accesses: one for setting the address and the second for reading the data. That is what is happening with my board also since the /ENA pulse is happening after read from the PIO, and the data is thus latched too late, a second 'IN' instruction then reads the PIO latched data.

I was planning to make my board straight memory-mapped (although there will probably be a page register so it doesn't actually take a full 12k+ of linear address space, at least unless you want it to), so I don't have the PIO load cycles to get a jump on the memory addresses. Pretty much stuck with techniques that work inside of a single memory read/write cycle.

Something that's crossed my mind is maybe having some kind of "state machine" based on the SR load counter that would present a wait state on video memory access if it happens within, say, 3/4's of the character cycle leading up to the shift register load, but open it up for a couple bits immediately after the load has been latched. I need to sit down with a piece of paper and compare the machine state timing of the Z-80 with a cycle like that and see if that would leave useful windows long enough for the Z-80 to shove in a read or write and finish with enough time for the address bus to switch back to the "CRTC" and settle to get a clean pixel output for the shift register. Depending on the ratio between CPU speed and character timing I have a feeling it might just end up synchronously sampling T-states during the "wrong" period and ride to the end of the active line anyway, making it the same as the Model III's wait-for-blank method.
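
As a thought experiment, that window idea might look like this in Python. Everything here is hypothetical: it assumes 8 dots per character with the shift register load at phase 0, and a window that stays open for the 2 dots right after the load.

```python
DOTS_PER_CHARACTER = 8   # dot clocks between shift register loads
OPEN_DOTS = 2            # hypothetical: dots after the load where the CPU may access


def wait_dots(phase):
    """Dot clocks the CPU would be held before the next open window.

    Phase 0 is the dot on which the shift register loads; phases 1 through
    OPEN_DOTS are open, everything else asserts WAIT.
    """
    phase %= DOTS_PER_CHARACTER
    if 1 <= phase <= OPEN_DOTS:
        return 0
    return (1 - phase) % DOTS_PER_CHARACTER


closed_phases = [p for p in range(DOTS_PER_CHARACTER) if wait_dots(p) > 0]
worst_case = max(wait_dots(p) for p in range(DOTS_PER_CHARACTER))
print(closed_phases, worst_case)
```

With these numbers 6 of the 8 phases (three quarters of the character cycle) are closed, and the worst case hold is 6 dot clocks. Whether a Z80 access actually fits inside a 2 dot window is exactly the open question raised above.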

(At 12mhz there's an odd number of pixel clocks per line with my timing, so drifting in and out of "register" would be a real possibility too.)

On the Acorn Atom the video is also not synchronized to the CPU (and they used a 6502 and a 6847 which can do that perfectly!) and with the games they just waited for the vertical blank to appear and then read/wrote the screen data. Outside that area there was also 'snow' on the screen. Alas, a lot of CPU cycles were lost.

I vaguely recall the early versions of the Commodore PET used memory too slow for the 6502's lockstep to work, so they had a vertical refresh interrupt that they intended all screen updates to happen during. (Which means they hash if you write directly to the screen.) But later ones did the timing sync thing?

Pictures look very nice, and yes you can do some real magic with those GAL's but I often run out of outputs... :)

Doing things like counters in GALs makes me realize it's probably time to suck it up and try a full CPLD. It's kind of a bummer there aren't any "internal" registers, if you want a counter you lose an output for every bit. :p
 
Some years ago I connected an 8 bit ISA VGA card to a Z80: http://kgelabs.nl/?p=147
Needed a special 'initial wait state generator' to get the memory access right (see the handdrawn circuit on that page above)

The VGA card asserted the /WAIT line too late for the Z80 to act upon (especially during writes). The circuit now asserts the /WAIT line almost immediately after /MREQ and holds it for the first cycles.
After the circuit has taken /WAIT low, the VGA card itself takes the /WAIT over until the memory access is done.
 
The VGA card asserted the /WAIT line too late for the Z80 to act upon (especially during writes). The circuit now asserts the /WAIT line almost immediately after /MREQ and holds it for the first cycles.
After the circuit has taken /WAIT low, the VGA card itself takes the /WAIT over until the memory access is done.

Huh. Looking at an ISA reference it says "Devices using this signal to insert wait states should drive it low immediately after detecting a valid address decode and an active read or write command". I guess the important part there is the "and an active read or write command"; looking at the graphs for a Z-80 bus cycle the Z80 on a read cycle asserts READ at the *same time* as MREQ, while the ISA bus timing does show IOCHRDY not coming until after the read/write pulse starts. So I guess always pre-emptively toggling the insertion of a WAIT on decode is indeed what you'd have to do to be sure when adapting an ISA device.

Thank you for this, I was looking for some practical examples of how to do WAITs on the Z80.
 
Small update:

I noticed that the 79th column of my graphics overlay only displayed the leftmost pixel. It seems the horizontal sync signal I used for blanking starts at the beginning of the character.
By using the horizontal sync signal from pin E2 it now displays the 79th column correctly, but also columns 80, 81, 82 etc. up to the end of the screen...
Not a real big problem as you can simply wipe those bytes and then it is fine.
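
Wiping those extra bytes could look roughly like the Python sketch below. It models the board with a stand-in class that follows the three-step access (set X offset, set Y offset, write data); the class, the function names and the `columns_per_line` value are hypothetical, since the real number of byte columns per line depends on how the counters on the board are wired.

```python
VISIBLE_COLUMNS = 80   # 560 pixels / 7 pixels per character cell
ROWS = 240


class FakeOverlay:
    """Stand-in for the PIO-driven board (not the real hardware access)."""

    def __init__(self):
        self.x = 0
        self.y = 0
        self.mem = {}

    def set_x(self, x):
        self.x = x

    def set_y(self, y):
        self.y = y

    def write(self, value):
        self.mem[(self.x, self.y)] = value & 0xFF


def wipe_extra_columns(board, columns_per_line,
                       first_extra=VISIBLE_COLUMNS, rows=ROWS):
    """Clear every byte column from first_extra up to the end of each line."""
    for y in range(rows):
        for x in range(first_extra, columns_per_line):
            board.set_x(x)    # step 1: X offset
            board.set_y(y)    # step 2: Y offset
            board.write(0)    # step 3: data


board = FakeOverlay()
wipe_extra_columns(board, columns_per_line=96)  # 96 is an assumed value
```

On the real machine the three calls would be OUT instructions to the PIO B ports, and an assembler loop would be the faster choice.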

The 1010101 dot patterns which sometimes occurred on the far left side are gone now.
Still, I am getting minor corruption at random places on screen during writes and cannot put my finger on the cause.
I reprogrammed the GAL to also emit an /OE signal for the ram hoping it might help but it made no difference.

The graphics video on/off is now done by an additional logic equation on the reset of the shift register and works fine freeing up an output pin on the GAL (now used for the /OE of the RAM)

The inversion of the graphics output of the shift register is now done by the unused NAND gate of U15 (lifted pins 1, 2, 3 and 6 from the socket; video from the graphics board enters on pins 1+2, and pin 3 is connected to pin 6 for mixing with the original Kaypro output).
 
Still struggling with the weird stray pixels and corruptions.
When I write horizontal lines almost no corruption happens; when writing vertical lines, stray pixels appear at random places.

Hope you can see this video: https://www.youtube.com/watch?v=_fJVwtxcsWM

The hash is quite visible when hammering the read/writes but it causes no permanent corruptions.
 
Huh. So when you say "no permanent corruptions" do you mean the offset/extra dots that are appearing between the vertical "pinstripes" don't read back as incorrect if you read the memory contents back? (Because they do seem to be stable on the screen.)
 
... wait, never mind the above, I realized you were just referring to the hash not leaving a mark.

In the video it kind of looked like a lot of the dots were offset a consistent distance from the column they were supposed to be in, could this be simply a noise/crosstalk issue? I spent hobby time this weekend rebuilding my prototype to use an AVR 324 instead of a 328 for the timing generator, and after neatening up the wiring I have significantly reduced “jailbar” interference, but that’s let me notice some individual flickering dots. What’s interesting about them is their locations are pretty consistent and I can make them go away by resting my hand on the data bus wires; it’s like certain magic combinations of address and data bus conditions can add up to enough noise to trigger a glitch.
 
Yes I was thinking about interference also but the only way I could influence the corruptions is by laying my fingers on the pins of the buffer and/or latches and squeezing hard.
Viewing the signals with the oscilloscope they are nice and square.

My guess is that the write timing is too critical. Since the SRAM /WR pulse is generated by the PIO's BRDY signal and this also switches the outputs around from the counters to the latches, the moment the SRAM /WR rises the address latches and databuffer also leave the SRAM address- and databus and the counter outputs take over again.
According to the datasheet of the SRAM the write hold time can be as short as 0 ns so in theory it should work.

Another possibility is that the latches are too slow (74HC573) and the address they provide is not stable as the SRAM /WR pulse is released.

The length of the BRDY pulse from the PIO (with BSTB and BRDY connected together) should be one clock cycle: 200 ns, as my CPU is clocked at 5 MHz at the moment, but switching it back to 2.5 MHz makes no difference. In the circuit diagram it is difficult to see whether the PIO and SIO chips are also clocked from the same source as the CPU; I have to check that with the oscilloscope.
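
As a quick check of the pulse length: one clock period is simply the reciprocal of the clock frequency. The Python snippet below works it out for both CPU speeds, under the assumption (not yet verified on the board) that the PIO runs from the same clock as the CPU. The function name is just for illustration.

```python
def clock_period_ns(freq_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / freq_hz


for mhz in (5.0, 2.5):
    print(f"{mhz} MHz -> {clock_period_ns(mhz * 1e6):.0f} ns per clock")
```

So a one-clock BRDY pulse would be 200 ns at 5 MHz and 400 ns at 2.5 MHz, if the PIO really shares the CPU clock.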

If not reading or writing the SRAM through the PIO the screen is rocksteady, never misses a pixel, in fact I only reset both counters at once on the vertical sync and there are no weird byte shifts or such per line.


I will try to replace the transparent 74HC573 latches I have in there now with edge-triggered 74HC574 or 74F574 parts. If that has no success I will hook up the Saleae16 logic analyzer to the board and see what is going on...
 
For laughs here's a picture of one of those magic "flickering yet persistent" dots on my prototype. Cleaning up the wiring on the breadboard nixed most of them, but that one dot visible under the pterosaur's chin in low-res mode below:

324proto_32column.jpg

Was still around, at least in the initial round of testing I did yesterday evening. (I generated a couple of new test plates with information useful for actually verifying that I was getting the correct number of pixels per line, et al, and for demoing different-width framebuffers offset in different memory locations and horizontal/vertical hardware scrolling.) I could make it go away by putting my finger on the flash chip, and then it mostly disappeared after I yanked out the MCU for reprogramming the second time and shoved some wiring around a little. Curiously the dot was *not* there in high-res mode, even though that's referencing the same location in the same frame buffer, only in low-res. Some very minor timing difference...

(I have a suspicion based on the location of the glitch that this is related to address line A12 updating, because in the wiring rats-nest that crosses a wire that shares a pull-up between the shift register and the ROM chip...)

The lesson I'm taking away from this is that noise/crosstalk/signal integrity is a harsh mistress when you're playing with multi-MHz circuits. :p I'm sure this is going to be super-fun now that I'm about out of excuses to not start working on the RAM interfacing/contention circuitry.
 