Well, there's a lot of "that depends" in this. Do you need attributes? Text (number of rows) or graphics too?
Anything wrong with a good old 6845?
Mostly the issue with just using a 6845 is it needs quite a lot of support circuitry. Trying to be "clever" and seeing if I can minimize things a bit.
The code I'm starting with for my idea is Grant Searle's video processor, which uses a 16 MHz ATmega328 combined with a 74LS166 to do 80x25 with the character buffer stored internally on the chip. There are 8 machine cycles available per character, so the display routine for each cell is basically:
"increment the pointer into the line buffer (assembled previously) that has the byte needed to display the dots for this character at this row"
"stuff the byte at that location into the output port the shift register is on"
"toggle the load latch on the shift register"
"untoggle the load latch"
My idea is to have the shift register sitting in front of the output of a character ROM, use the output port to supply 4 bits of row data for the character ROM and 4 bits for the character line (remember, the TRS-80's video screen is 16 rows high), and use a 6-bit counter to supply the address within the line. (Obviously this won't work for an 80 column line, or any other format where the number of characters per line isn't a power of two.) Then I have 8 cycles to:
1: toggle OE/MEMR for the RAM character buffer chip
2: toggle OE/MEMR for the character generator ROM
3: toggle the load latch on the shift register
4: untoggle all of the above. (All this toggling is achieved by writing pre-baked byte patterns to an output port)
5: send a pulse to increment the counter that's the bottom 6 bits of the memory address
6: untoggle that pulse.
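To make the six steps concrete, here is a rough C model of the "pre-baked byte patterns" idea. It runs on a PC, not on the AVR. The bit assignments (CTL_*), the active-high polarity, and the names video_sim, port_write and run_cell are all assumptions made up for illustration; real OE pins are usually active-low, so the actual patterns would likely be inverted. On the real chip each write would be a single-cycle OUT from a preloaded register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical control-port bit assignments, all modeled as active-high. */
enum {
    CTL_RAM_OE  = 1u << 0,  /* enable character RAM output      */
    CTL_ROM_OE  = 1u << 1,  /* enable character generator ROM   */
    CTL_SR_LOAD = 1u << 2,  /* shift register parallel load     */
    CTL_CNT_CLK = 1u << 3   /* clock for the 6-bit column counter */
};

/* One pre-baked pattern per step in the list above. */
static const uint8_t cell_patterns[6] = {
    CTL_RAM_OE,                             /* 1: RAM drives the ROM address */
    CTL_RAM_OE | CTL_ROM_OE,                /* 2: ROM drives the dot byte    */
    CTL_RAM_OE | CTL_ROM_OE | CTL_SR_LOAD,  /* 3: latch dots into shifter    */
    0,                                      /* 4: release everything         */
    CTL_CNT_CLK,                            /* 5: bump the column counter    */
    0                                       /* 6: end of the counter pulse   */
};

typedef struct {
    uint8_t  port;    /* last value written to the control port */
    uint8_t  column;  /* modeled 6-bit counter                  */
    unsigned loads;   /* number of shift register loads seen    */
} video_sim;

/* Model one port write; the counter and shifter react to rising edges. */
static void port_write(video_sim *s, uint8_t value) {
    uint8_t rising = (uint8_t)(value & ~s->port);
    if (rising & CTL_CNT_CLK) s->column = (uint8_t)((s->column + 1) & 0x3F);
    if (rising & CTL_SR_LOAD) s->loads++;
    s->port = value;
}

/* Run the six writes for one character cell; returns the write count. */
unsigned run_cell(video_sim *s) {
    unsigned writes = 0;
    for (unsigned i = 0; i < 6; i++) {
        port_write(s, cell_patterns[i]);
        writes++;
    }
    return writes;
}
```

Six writes per cell leaves two of the eight cycles spare, but only if the per-cell code is unrolled; a loop branch would eat into that margin.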
All of the things above are *supposedly* single cycle instructions, so I think the math works out? Obviously I'll still need to multiplex the memory address lines to share with the CPU, or use a dual-port RAM chip if I want to make it non-conflicting. (The latter is tempting but, man, those turkeys are expensive.)
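For reference, the addressing scheme described earlier (4 bits of character line plus a 6-bit column counter) could be sketched in C like this. The nibble order on the port, the ROM address layout (character code in the upper bits, glyph row in the low 4), and the function names are assumptions for illustration, not something taken from existing code.

```c
#include <assert.h>
#include <stdint.h>

enum { TEXT_ROWS = 16, TEXT_COLS = 64, GLYPH_ROWS = 16 };

/* Output-port byte: high nibble = text line, low nibble = glyph row.
   The nibble order is an assumption. */
uint8_t pack_row_port(uint8_t text_row, uint8_t glyph_row) {
    assert(text_row < TEXT_ROWS && glyph_row < GLYPH_ROWS);
    return (uint8_t)((text_row << 4) | glyph_row);
}

/* Character RAM address: the port's text-line nibble supplies A9..A6,
   the external 6-bit counter supplies A5..A0. */
uint16_t char_ram_address(uint8_t port_byte, uint8_t column_counter) {
    uint8_t text_row = port_byte >> 4;
    return (uint16_t)((text_row << 6) | (column_counter & 0x3F));
}

/* Character ROM address: RAM output (character code) in the upper bits,
   glyph row from the port in the low 4 bits (assumed ROM layout). */
uint16_t char_rom_address(uint8_t char_code, uint8_t port_byte) {
    return (uint16_t)((char_code << 4) | (port_byte & 0x0F));
}
```

Because the row bits and the counter bits are simply wired side by side, the line length has to be a power of two: 16 lines of 64 characters fills the 10-bit address space (0 to 1023) exactly.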
I think this is basically similar to the recipe that Tynemouth Software uses in their Mini-Pet and Minstrel computers, but I have no doubt my implementation is going to be clunkier. (Their video processor is closed source.) One thought is to have a second shift register sitting directly on the RAM output lines and set up the code so I can also optionally do a full 512x192 graphics mode. (Just use the character generator row address lines as upper address lines for a framebuffer instead. I actually intend to test the thing by just burning a bitmap into a flash ROM and using the address lines to clock that out row by row.)
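A quick sketch of where each byte of that bitmap would have to live, assuming 12 scan lines per text row (192 / 16) and that the 4 row-address bits sit directly above the 10-bit text address. The function name fb_address and the exact bit positions are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { LINES_PER_ROW = 12, FB_WIDTH_BYTES = 64, FB_HEIGHT = 192 };

/* Map pixel row y (0..191) and byte column (0..63) to a memory address.
   Assumed layout: A13..A10 = scan line within the text row,
   A9..A6 = text row, A5..A0 = byte column. */
uint16_t fb_address(uint8_t y, uint8_t x_byte) {
    assert(y < FB_HEIGHT && x_byte < FB_WIDTH_BYTES);
    uint8_t text_row = (uint8_t)(y / LINES_PER_ROW);
    uint8_t scan     = (uint8_t)(y % LINES_PER_ROW);
    return (uint16_t)((scan << 10) | (text_row << 6) | x_byte);
}
```

Under these assumptions the bitmap still fits in a contiguous 12 KB (addresses 0 to 12287), but not in raster order: consecutive scan lines are 1 KB apart, so a test image burned into flash would need to be shuffled into this layout first.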
Another thing that would be needed for a full TRS emulation would be a toggle-able clock divider to do a 32 column text mode, of course.