I was using "UMA" in the sense that Eudimorphodon seems to have meant it: the CPU has direct access to the frame buffer RAM.
That is... an interesting definition of "UMA" that nobody else uses.
UMA = Unified Memory Architecture, also called "shared memory", where the CPU and the graphics frame buffer share the same RAM chips. The CPU and graphics chip use the same memory for software, data, and the frame buffer, and access it alternately. That means if the CPU wants access, it has to wait until the video chip has read its part.
Yes, this. To repeat, AGAIN: UMA in this context is "Unified Memory Architecture", a term that appears to have been invented by, or at least latched onto by, SGI to describe the architecture of their O2 workstation, and quickly picked up by AMD and Intel to describe computers with integrated video chipsets. (And, almost forgot, it's also *widely* used when discussing the architecture of the ARM-based Macintoshes.) This is a term that's been pretty well known since the late 1990s (it's been used to describe Intel chipsets since the i810), but it's clear the concept it describes, IE, a system where under "normal circumstances" you have a common pool of memory shared between CPU code execution and video refresh, goes back to, yes, machines like the Apple II. Unfortunately this acronym shares "UMA" with
Uniform Memory Access, which describes multiprocessor systems that all have equal-priority shared access to a common pool of memory (versus "NUMA", where the concepts of "Local" and "Non-Local" memory apply), and it's clear that this clash is causing some brainrot problems here.
For the rest of this discussion I am referring *solely* to "Unified Memory Architecture" as it applies to graphics.
Anyway. I mean, sure, if you really want to pick a fight about it you could argue that "UMA" as described in the SGI literature describes more than just framebuffers, it also refers to things like texture memory... but this is functionally an evolution of the tricks that many old 8-bit computers did with walking around the base address of the video buffer to do things like hardware scrolling and double-buffering. So as far as I'm concerned "UMA" applies as well to an Amiga or Atari 800 as it does to an M2 Macintosh.
but what Eudimorphodon was talking about
in his post above would not be "more UMA" (or UMA at all) because it still has the same issue as standard CGA: if both the graphics circuitry and the CPU try to access the same address at the same time, one will be slowed or fail to read. (The latter case is what produces "snow" on the display.)
I mentioned this in the same post, but here's a point about CGA:
It doesn't snow in either the 40 column text mode or the graphics modes.
If you look at the technical details you'll find that the CGA card implements the IOREADY line correctly, inserting just enough wait states that *almost* all the time the CPU can read/write the RAM without serious performance hits *and* without causing screen disturbances, in every mode other than 80 column text. That mode requires twice the bandwidth on the video refresh side as all the other supported modes, and if priority were given to video instead of the CPU you'd basically have to hold the CPU in wait through the entirety of the active horizontal line period. So instead of extending the IOREADY mechanism to cover that specific mode, IBM implemented a polling method that lets you manually check whether you're in the active display area, if you're really dying to avoid snow.
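To put rough numbers on that "twice the bandwidth" claim, here's my own back-of-envelope arithmetic (not figures from the IBM manual; I'm assuming the standard CGA pixel clocks and 2 bytes fetched per character cell):

```python
# Rough CGA video-refresh bandwidth comparison (illustrative round figures).
# Each character cell fetch pulls 2 bytes (character code + attribute byte).
PIXEL_CLOCK_80COL = 14_318_180          # Hz: 80-column mode pixel clock
PIXEL_CLOCK_40COL = PIXEL_CLOCK_80COL / 2   # 40-column mode runs at half rate
PIXELS_PER_CHAR = 8
BYTES_PER_FETCH = 2

fetch_rate_80 = PIXEL_CLOCK_80COL / PIXELS_PER_CHAR * BYTES_PER_FETCH
fetch_rate_40 = PIXEL_CLOCK_40COL / PIXELS_PER_CHAR * BYTES_PER_FETCH

print(f"80-col refresh fetch rate: {fetch_rate_80/1e6:.2f} MB/s")  # 3.58 MB/s
print(f"40-col refresh fetch rate: {fetch_rate_40/1e6:.2f} MB/s")  # 1.79 MB/s
assert fetch_rate_80 == 2 * fetch_rate_40   # exactly double the demand
```

That doubled fetch rate during the active line is what leaves no free memory cycles for the CPU in 80-column mode.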
Anyway, regarding this:
The reasons for not using the video card RAM as sole system RAM are pretty obvious if you start to work out the design: you need a kilobyte of memory at $000-$3FF for the interrupt vector table, and working out how to share that with the frame buffer and the program, while it could be done, would be rather a pain.
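For anyone wondering where that kilobyte figure comes from, it falls straight out of the 8086 interrupt model (a quick sanity check of my own, not anything from the design document):

```python
# The 8086 reserves 256 interrupt vectors starting at physical address 0,
# each vector being a 4-byte segment:offset pair.
VECTORS = 256
BYTES_PER_VECTOR = 4            # 2-byte offset + 2-byte segment

ivt_size = VECTORS * BYTES_PER_VECTOR
print(ivt_size, hex(ivt_size - 1))  # size and last byte of the table

assert ivt_size == 1024             # exactly 1 KB
assert ivt_size - 1 == 0x3FF        # the table occupies $000-$3FF
```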
I kind of hope whoever posted that "Prototype PC" design document sees this thread and re-links it, because apparently I completely fail at using the site's search engine. Because I don't have it in front of me I can't comment on whether the document actually addresses that, or whether I'm misremembering the "0k scenario" entirely, but I do recall that the version of the PC described in said article had a lot of differences from what actually shipped; among other things, the "proto-CGA" card described might not have had the 80 column mode yet. It's possible the idea of running from CGA memory was chucked in there by someone who didn't know about the interrupt vector limitation, or it's also possible they considered a scenario where the CGA memory could have been addressed from 0K.
Back to this nitpicking about "UMA": the definition I used in that post was a machine that "normally" uses the same physical memory bank(s) for both video refresh and executing CPU code. Applying it to any machine that has a memory-mapped framebuffer stretches it to absurdity, because outside of some weird special cases (say, some really old machines like the original TRS-80 that leave a bit out), sure, you can run code from video memory if you *really* want to. Machines like the Apple II, the Commodore VIC-20 and 64 (almost any home computer using the 6502, actually), the Sinclair ZX Spectrum, etc., completely fit this definition, as do 16-bit computers like the Commodore Amiga and the 128K IBM PCjr and Tandy 1000. If this is what a computer is doing then I don't think it matters whether the memory being shared is physically on the same PCB as the CPU; that's just a construction detail, as long as said memory is the "main" memory. This hypothetical prehistoric PC prototype just running from a proto-CGA card would thus qualify as UMA, at least until you add dedicated CPU RAM to it. At which point it... stops being UMA? Maybe we can have a whole separate argument about what happens when you add more memory to a UMA machine and said memory is NOT accessible to the video hardware.
Most of the machines described above can *only* use memory directly connected to their video hardware for video refresh; IE, if you sit down and pick through how they're put together the video hardware is essentially acting as the DRAM controller and providing refresh to the attached memory; the video hardware is *not* a self-contained DMA busmaster sharing a "neutral" pool of memory with the CPU(*). In these machines if you slap a memory card on the bus it's only there for the CPU, and depending on the architecture of the machine it may or may not be faster than the shared memory; Amiga people *specifically* make a distinction between "fast ram" and "chip ram", for instance.
Here's an edge case: going back to the S100 era, there were video systems you could call UMA that *were* implemented as neutral busmasters using RAM not on the video card. The classic example is in fact the Cromemco Dazzler; it worked as a DMA master, periodically grabbing a line's worth of data for video refresh from whatever memory card you pointed it at, and as a result it could do things like double-buffering just by poking a new base address into its register. Here's the thing, though: it also demonstrates why this wasn't a very good solution. The Dazzler would only work with SRAM cards because it had no patience for getting interrupted by anything else, and even then it still relied on having a line cache for storing a blob of pixels to keep from starving out the CPU, which is a strategy that only works for a low resolution/text system. (There was a fairly common Intel CRTC, the 8275, that also relied on doing DMA to external memory just like the Dazzler, using a built-in 80 character row buffer. That works great if you only need to fill that cache every 8 scanlines or so; not so well if you're *constantly* having to let the DMA controller own the shared memory bank.) Because of the scale of the contention issues you run into with this, I personally consider "UMA" to only really apply to machines where the shared memory in question is controlled/refreshed specifically with the timing requirements of the video circuitry in mind.
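The payoff of that 8275-style row buffer is easy to quantify. Rough arithmetic of my own (assuming the typical 80-character row and 8 scanlines per character row; exact figures depend on the programmed video format):

```python
# Why a row buffer helps: compare DMA transactions with and without one.
CHARS_PER_ROW = 80       # characters fetched per text row (assumed)
SCANLINES_PER_ROW = 8    # scanlines drawn from each character row (assumed)

# Without a buffer, the controller must re-fetch the row on every scanline.
fetches_without_buffer = CHARS_PER_ROW * SCANLINES_PER_ROW
# With a row buffer, one burst of 80 fetches serves all 8 scanlines.
fetches_with_buffer = CHARS_PER_ROW

print(fetches_without_buffer // fetches_with_buffer)   # -> 8x fewer bus grabs
assert fetches_without_buffer == 8 * fetches_with_buffer
```

An 8x reduction in bus occupancy is fine for text, but a bitmapped mode has no repeated rows to cache, which is exactly the contention problem described above.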
Yeah, again, I mostly agree it was a dumb/unworkable idea with CGA only having 16K on it, but I do think it's *interesting* to think about an alternative universe where they decided the solution was to slap another 16K on the card. 32K minus 2K for the 40 column text mode would have almost been competitive at the time, as dumb as that is, and having that extra RAM on there *could* have enabled PCjr-level color in an expanded machine. But yeah, didn't happen.
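For what it's worth, the arithmetic on that hypothetical 32K card does check out (my numbers, not anything IBM wrote down):

```python
# PCjr-level color: 320x200 at 16 colors means 4 bits per pixel.
WIDTH, HEIGHT, BPP = 320, 200, 4
framebuffer_bytes = WIDTH * HEIGHT * BPP // 8
print(framebuffer_bytes)                 # -> 32000
assert framebuffer_bytes <= 32 * 1024    # just squeaks into a 32K card

# 40-column text: 40x25 cells at 2 bytes (character + attribute) each.
text_page = 40 * 25 * 2
print(text_page)                         # -> 2000
assert text_page <= 2 * 1024             # the "2K" the 40-column screen eats
```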