
Tandy 1000 TL - 85 Color Demo

Since this stuff is a bit Tandy specific, what about a small memory expansion card that filled in upper conventional RAM - the same RAM the video controller is scanning from? Then add an IDE interface to that with on-board DMA. It would be cheating, but the first design iterations I did for JR-IDE (on paper) did just that. The DMA transfer loop, lane steering, and latch logic fit in an ATF1508. The 1MB of RAM was organized 16 bits wide, with the two parts on the back of the PLD along with the IDE header. Block transfers to an on-chip address target could run as fast as any modern IDE device supports, with an independent clock. Again... cheating, but a thought. It would still use the on-board video for output.

Wow... complex... My goal would be to get a "stock" Tandy 1000 TL. I say "stock" even though I will be using an XT-IDE, but only because I do not have another mass storage solution. Were there any hard drives that maxed out the ISA bus at that time?
 
The ADP-50 is often cited as the fastest 8-bit ISA hard-disk controller. With an ATA drive in a PC/XT it will do about 300 KB/s; it should be a little faster in a TL given the faster CPU and 16-bit RAM bus.
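For scale: a full 320x200 16-color frame is 32,000 bytes, so roughly 300 KB/s buys only nine or ten full-frame updates per second before deltas or compression enter the picture.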
 
Ok - slideshow time... Help me pick a theme. I am open to suggestions and/or image submissions. Resolution doesn't matter, and you can just send a hyperlink. I plan on making a slideshow that will fit on one 720K floppy disk, so 10 pictures. I plan to make two slide shows: one of images you guys suggest and one of NSFW images (you have to, it is required). Actually, I was thinking of making three - maybe one dedicated to video games, modern or classic, but they have to be games that used more than 16 colors, or colors that the Tandy 16-color palette did not have... Open to suggestions.
 
This is something I've obviously put a lot of thought into (and very recently, code). I have solutions to the bandwidth, compression, and color problems, but being of the demoscene mentality, I'm hesitant to share my thoughts/ideas until I actually use them in a "world first" production :) But it would be a jerk move for me to not offer any advice, so here's some advice:


  • Measure your target hardware before starting. Knowing what your limitations are will guide your progress. Measure hard disk streaming speed and blit-to-video-ram speed. You may find that your original idea is no longer feasible but a new idea will pop into your head... sometimes better than the old idea.
  • If chjmartin2 hasn't viewed http://archive.org/details/8088CorruptionExplained yet, merely watching that should give him some ideas. Also grab the latest 8088flex version because it's capable of stable uninterrupted 60fps given enough bandwidth (use the pop.tmv example). Examine the playback code; it's fairly simple, all things considered.
  • On any CPU under 25MHz, video playback is an I/O problem, not a CPU problem. If you're using more than 5% CPU to play animation/"video", you're doing something wrong.
  • Use your powerful modern computer to do all the heavy lifting. If preprocessing something takes 30 minutes on your old machine, don't do it on your old machine. Your development will go much faster this way. (The 8088 Corruption encoder was written for/in Windows.)
  • Delta unpacking will require assembly. If BASIC is your environment, read up on POKEing opcode values into memory and then calling the routines (a minimal sketch follows this list). For short routines, debug.com is helpful (type "a 0100" to start assembling and then "u 0100" to view the binary of what you just wrote).
  • If you EVER plan on playing audio with the video, build that into your plans at the very start. Even if you're not going to use it right away (but you should, since the TL has a built-in DMA DAC).
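
To make the POKE-and-CALL idea concrete, here is a minimal GW-BASIC sketch (purely illustrative): it parks a few hand-assembled bytes in an integer array and far-calls them. The routine is just MOV AX,0003h / INT 10h / RETF, a harmless BIOS "set 80x25 text mode" call, so the mechanism can be verified before poking anything real.

  10 ' Minimal POKE-and-CALL sketch: a few machine-code bytes in an integer array
  20 DIM MC%(9)             ' scratch space inside BASIC's data segment
  30 DEF SEG                ' default segment = BASIC's data segment (where MC% lives)
  40 P = 0: B = 0: I = 0    ' create the scalars first so the array doesn't move
  50 P = VARPTR(MC%(0))     ' offset of the array = where the code bytes go
  60 FOR I = 0 TO 5: READ B: POKE P + I, B: NEXT I
  70 DATA 184,3,0           ' B8 03 00   MOV AX,0003h
  80 DATA 205,16            ' CD 10      INT 10h (BIOS: set video mode)
  90 DATA 203               ' CB         RETF (GW-BASIC CALL is a far call)
  100 CALL P                ' far-calls the poked routine, which RETFs back to BASIC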

And the #1 tip I can give if you are working in 320x200 or higher: Ordered dither. (NOT Floyd-Steinberg)
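
To show what that means in practice, here is a tiny BASIC sketch (purely illustrative, not from any real encoder) that ordered-dithers a horizontal gradient with the standard 4x4 Bayer matrix. The point to notice is that each pixel's threshold is fixed by its position, so image areas that don't change between frames dither to exactly the same pattern every time.

  10 ' 4x4 ordered (Bayer) dither - illustrative sketch, dithering a gradient
  20 SCREEN 1                      ' CGA 320x200 just so PSET has a target
  30 DIM D(3,3)
  40 FOR J = 0 TO 3: FOR I = 0 TO 3: READ D(I,J): NEXT I: NEXT J
  50 DATA  0,  8,  2, 10
  60 DATA 12,  4, 14,  6
  70 DATA  3, 11,  1,  9
  80 DATA 15,  7, 13,  5
  90 ' A pixel turns on when its level beats the fixed threshold for its cell;
  100 ' no error spills onto neighbours the way diffusion dithers do.
  110 FOR Y = 0 TO 199: FOR X = 0 TO 319
  120   LEVEL = X * 255 / 319                 ' stand-in for a real pixel value
  130   T = (D(X MOD 4, Y MOD 4) + .5) * 16   ' thresholds 8, 24, ..., 248
  140   IF LEVEL > T THEN PSET (X, Y), 3 ELSE PSET (X, Y), 0
  150 NEXT X: NEXT Y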

And from advice, I now switch to opinion: The TL series has 640x200x16 -- if you are going for color fidelity and not framerate, use it. You already have a temporal "mix" of 85 colors; now think about mixing spatially. 640x200 on a fuzzy CGA monitor fuzzes the pixels so much that they blend together...
 
It seems like an optimized code generator driven by the video content may be in order. You could implement macro-block-style updates, only the macro blocks could be as small as a single pixel updated with one instruction. But instead of blits, generate instruction sequences off-line that either update large blocks with word-immediate moves, or use the sequences sparingly to raise image quality in the places where it reduces the mean error the most. The generated code takes more space to store than raw block image data, but code that updates only a subset of pixels can be slightly smaller than a wholesale block replacement, produce a good-enough update, and run faster.
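
A crude way to picture the generator (just a sketch, with OLD.BIN and NEW.BIN standing in as hypothetical raw dumps of two frames' video RAM): diff the frames byte by byte and print one immediate MOV per changed byte, ending in a far return so the player can simply CALL the assembled block. A smarter pass would coalesce neighbouring changes into word moves and emit short loops for long runs, exactly as described above.

  10 ' Offline "compiled delta" generator - illustrative sketch only.
  20 ' Output is assembly source; assemble it offline and ship the bytes.
  30 ' ES is assumed to point at video RAM when the generated routine runs.
  40 OPEN "OLD.BIN" FOR RANDOM AS #1 LEN=1: FIELD #1, 1 AS O$
  50 OPEN "NEW.BIN" FOR RANDOM AS #2 LEN=1: FIELD #2, 1 AS N$
  60 OPEN "DELTA.ASM" FOR OUTPUT AS #3
  70 FOR A = 0 TO 31999                     ' 32,000 bytes = one 320x200x16 frame
  80   GET #1: GET #2                       ' next byte of the old and new frames
  90   IF N$ <> O$ THEN PRINT #3, "  MOV BYTE PTR ES:[" + MID$(STR$(A), 2) + "], " + MID$(STR$(ASC(N$)), 2)
  100 NEXT A
  110 PRINT #3, "  RETF"
  120 CLOSE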

If you drive a select 2012+ GM vehicle, the key-on splash animation screen is actually generated from an evolved sprite compiler I wrote in 1991 for an old parallax scroller game! The concept twisted a few minds when I first brought it up at work to increase frame rates. But it was immensely satisfying to teach some new dogs old tricks.
 
That's totally awesome!
 
It was named 'Gade'. We had a publisher in Austin lined up but ultimately weren't able to deliver on deadlines, mostly because of a lack of local artistic talent (in Little Rock).
 
On Tandy 1000 video...

Measure your target hardware before starting. Knowing what your limitations are will guide your progress. Measure hard disk streaming speed and blit-to-video-ram speed. You may find that your original idea is no longer feasible but a new idea will pop into your head... sometimes better than the old idea.

Already have that in mind. I am getting an XT-IDE, which should get me 250 to 300 KB/s read/write.


If chjmartin2 hasn't viewed http://archive.org/details/8088CorruptionExplained yet, merely watching that should give him some ideas. Also grab the latest 8088flex version because it's capable of stable uninterrupted 60fps given enough bandwidth (use the pop.tmv example). Examine the playback code; it's fairly simple, all things considered.

You know I have! You just don't remember. http://www.atariage.com/forums/topic/195039-mattel-aquarius-movie-player/

On any CPU under 25MHz, video playback is an I/O problem, not a CPU problem. If you're using more than 5% CPU to play animation/"video", you're doing something wrong.

Understood

Use your powerful modern computer to do all the heavy lifting. If preprocessing something takes 30 minutes on your old machine, don't do it on your old machine. Your development will go much faster this way. (The 8088 Corruption encoder was written for/in Windows.)

The frame encoder is in FreeBASIC.

Delta unpacking will require assembly. If BASIC is your environment, read up on POKEing opcode values into memory and then calling the routines. For short routines, debug.com is helpful (type "a 0100" to start assembling and then "u 0100" to view the binary of what you just wrote).

GW-BASIC was just used because it gives easy access to the Tandy 1000 video modes. It was not a requirement (I plan on using ASM), but it was fun to make it a simple BLOAD, page-flip demo.

If you EVER plan on playing audio with the video, build that into your plans at the very start. Even if you're not going to use it right away (but you should, since the TL has a built-in DMA DAC).

Right - I have to leave space for it within each frame of the data stream.


And the #1 tip I can give if you are working in 320x200 or higher: Ordered dither. (NOT Floyd-Steinberg)

I want to argue about this. I am using Stucki and I like the diffusion dithers better than any of the Bayer stuff. You've said this to me before. I just cannot agree!


And from advice, I now switch to opinion: The TL series has 640x200x16 -- if you are going for color fidelity and not framerate, use it. You already have a temporal "mix" of 85 colors; now think about mixing spatially. 640x200 on a fuzzy CGA monitor fuzzes the pixels so much that they blend together...

It should be trivial to do it at 640x200 rather than 320x200. I wasn't sure how many pages I would have available in GW-BASIC, so I went ahead and started here. That one is on the list too. I think step one will be to do it at whatever resolution I can support, straightforwardly: load data from disk, transfer to video RAM, send to the DAC, repeat, and see what I get for throughput to start...
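
For the throughput check, something as simple as timing repeated BLOADs straight into video RAM gives a first number before any ASM gets written. A sketch (FRAME.BSV being a hypothetical 16KB screen image saved earlier with BSAVE from the video segment):

  10 ' Rough disk-to-video-RAM throughput check - sketch only
  20 SCREEN 1                   ' plain CGA 320x200 to keep the sketch simple
  30 DEF SEG = &HB800           ' video RAM segment
  40 T! = TIMER
  50 FOR N = 1 TO 10
  60   BLOAD "FRAME.BSV", 0     ' one 16KB frame: disk straight onto the screen
  70 NEXT N
  80 PRINT "Seconds for 10 frames:"; TIMER - T!
  90 DEF SEG                    ' restore BASIC's default segment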

Sorry if I am treading in your territory! :)
 
I want to argue about this. I am using Stucki and I like the diffusion dithers better than any of the Bayer stuff. You've said this to me before. I just cannot agree!

Then you must not be considering deltas in your scheme, because diffusion dither schemes change nearly every pixel.

Sorry if I am treading in your territory! :)

On the contrary, I am intrigued to see what you can produce. Competition benefits consumers!
 
Not sure. It's a very simple animation, only 3-5 seconds, 16 bpp color, 15 fps. But the person before me had a series of full-frame JPEGs and a commercial (and expensive) JPEG decoder for ARM doing frame-by-frame decode and blits. It took >65% CPU during a very time-critical window. By encoding some of the lower-complexity frames as generated ARM Thumb move-immediates, incs, and stores, with some basic loops where appropriate (long RLE runs) to draw the frame deltas directly into the off-screen frame buffer (relative to two frames back), and just zipping up the raw higher-complexity frames, the CPU dropped to <15% on the start-up animation and <5% on the power-flow animations. You can see different ones on the Camaro, Verano, LaCrosse, Volt, Terrain, and a few others.

The zen of code optimization has been lost on a lot of young people. Michael would not be pleased.
 
Well... it is just a lot easier to buy the component and wire it up. It drives me nuts that with all of the CPU power we have today you still have to wait. Operating systems (Windows) are complete bloat-ware, and I am sure the Linux guys will come out and talk about how great that system is, blah, blah - it is still bloat-ware. Just imagine if all of these coders had to work in ASM and really optimize their code. It is too easy to be lazy and bang it out in some high-level language (don't get me started) rather than really understand what is going on. To me it is a side effect of how easy it has become to work on systems, upgrade them, change out parts, etc. I realize that I never had to use a punch card, or an S-100, or program by flipping switches, but if I had to I am sure I could figure something out. I am by no means an ASM guru (better at Z80 than x86) and not even a very good coder, but it makes me so angry when I think about how many calculations per second and how much memory modern computers have, and how TERRIBLE the software for them is. The only stuff that comes close in my mind is game engines, but I bet there is plenty of lazy bloat-code there too...

Who started this?
 
Then you must not be considering deltas in your scheme, because diffusion dither schemes change nearly every pixel.

Good point. I thought the argument was that ordered dither produces the best visual at 320x200. So yes, if I want frame-to-frame delta compression, then ordered would be the better method. Your suggestion makes me think of trying to figure out how to make a series of tiles and XOR them.... hmmmm....

On the contrary, I am intrigued to see what you can produce. Competition benefits consumers!

The Atari 2600 guy got mad at me for working on something similar to what he had done, so I continue to be cautious. I just like trying to optimize displays and enjoy watching video on different machines. If a tool exists, I use it (C64); if it doesn't, maybe I'll write it (Aquarius), and so on... I have a pretty nice collection of old consoles and old computers - too many to explore. Right now the Tandy 1000 TL is on my desk, so that is where the effort goes. I had a meltdown today though and lost my whole system - luckily my programs are all saved, but my development rig has to be reimaged. (My Make-It-486 upgrade chip was drawing too much power from the bus, and stupid me had the hard drive power coming off of the MFM controller card - don't ask me why, I have no idea why I did that...)
 
No. At the six-minute mark in the video below you can see one of the Volt power-flow animations. Those used to be all full-frame JPEG decodes, including the car silhouette, dithered backdrop, etc. Now, in '13, each frame is a generated ARM Thumb function that draws the delta change relative to two frames back. The code is loaded dynamically from a shared object (.so) and executed. The first two frames are full compressed raw bitmaps. There are two frame functions at the end that draw the delta change back to frames 1 and 2 for continuous looping. It was a 12-fold increase in speed overall, since so little changes from frame to frame (wheel spin and energy trail).

http://www.youtube.com/watch?v=WQhT49nWPvY
 
Sweet! Inter-frame compression implemented as generated code on an ARM... sweet!
 
It drives me nuts that with all of the CPU power we have today you still have to wait.

I wouldn't necessarily blame programmers; most problems today are I/O-bound (both memory and persistent storage).

My machine takes almost 7 minutes to boot up because I choose to pre-cache a lot of background stuff (I have 12 GB of RAM, so it's not a problem). I once cursed that loading all the DLLs necessary on bootup was a poor design, until I realized that if everything were statically linked into every binary I would have even bigger problems.
 