To reiterate:
What actually makes the computer look at the character ROM? Is that the hardware?
Is it the kernal that tells the video RAM to look in location xxxx?
The screen generation process is entirely in hardware. (That is why the older PETs can display a screen full of perfectly formed garbage characters even if their CPU is halted or the ROMs are corrupted. The newer ones require the ROMs to initialize the piece of hardware that forms the characters, i.e., the CRTC, but once it's running it's again completely "hardwired".) Basically it works like this (generalizing based on the original 40 column PET and oversimplifying a little):
There are a series of binary counters. One counts to forty, for the number of characters in a line of text, and another counts to 25 for the number of lines. 40x25=1000, the total number of characters on the screen, and that explains why your PET has 1K of video RAM. That part seems easy enough to understand, right? You stick a value in a memory location and it shows up on the screen as the counter counts through the 1K of video RAM? But... wait a minute: it's not that easy. Each character is 8 scan lines tall. So you can't just read a character once and stick it up on the screen in one blow, since the video circuitry only generates one line of dots at a time. And how does it know what those dots should be?
There's another counter for the number of scan lines in each character. It counts to 8. For every line of text the scanning hardware reads 40 contiguous memory locations 8 times over; when it's done it moves on to the next 40 locations. What does that accomplish? Each character is an 8 bit binary number. The lower 7 bits, 0-127, are used directly to specify which character is displayed. The top bit, 128, specifies whether a character is displayed as "normal" or reverse video. When the value is read from the character memory that bit goes straight to a gate that controls an inverter: if that bit is set then black becomes white and vice versa. What about the other seven bits? Well, remember how we're counting to 8? Those seven bits are combined with the three bits (000 through 111) of the 8 count to produce a 10 bit memory address. That's how we get the 8 dots that make up one scan line's slice of a character. So:
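To make the counting and address arithmetic concrete, here's a rough Python sketch of it. The function names are made up for illustration, and it assumes the scan line count forms the low three bits of the address (so each character's 8 bytes sit next to each other in the ROM); the description above only says the two values are "combined".

```python
COLUMNS = 40      # characters per line of text
ROWS = 25         # lines of text on the screen
SCAN_LINES = 8    # scan lines per character

def rom_address(screen_code, scan_line):
    """10 bit character ROM address: low 7 bits of the character value
    plus the 3 bit scan line count (assumed to be the low bits)."""
    return ((screen_code & 0x7F) << 3) | (scan_line & 0x07)

def is_reversed(screen_code):
    """Top bit (128) selects reverse video; it never reaches the ROM."""
    return bool(screen_code & 0x80)

def fetch_order(text_row):
    """Yield (video RAM offset, scan line) pairs in the order the hardware
    reads them for one line of text: the same 40 locations, 8 times over."""
    base = text_row * COLUMNS
    for scan_line in range(SCAN_LINES):
        for column in range(COLUMNS):
            yield base + column, scan_line

if __name__ == "__main__":
    print(COLUMNS * ROWS)               # total screen locations
    print(rom_address(0x81, 7))         # character 1 (reversed), last scan line
    print(len(list(fetch_order(0))))    # video RAM reads per line of text
```

Note that a character with the reverse bit set produces exactly the same ROM address as the normal one; the inversion happens afterwards, on the dots.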
If I understand it correctly you can't just replace the ROM with RAM because you don't have a location to poke a value into to change the character set in that location.
Yes. Unlike a machine that allows user-definable character tables to reside in memory space accessible on the CPU bus, that 10 bit memory address is fed into a completely private address bus that lives between the PET's character ROM and the address generation circuitry in the video hardware. The CPU cannot intervene in what happens there *at all*. Completely under control of the video hardware, the ROM is fed an address generated by the combination of the binary value of the character and the scan line position, and the 8 bit data byte it returns is fed directly into a shift register which clocks the bits out to the hardware which controls whether the electron beam sweeping across the monitor lights up a spot or not. The only thing the user can modify on a normal PET is a latch on the character ROM that activates or deactivates the top address line. (You may have noticed that 10 bits=1K, while a PET has a 2K character ROM. By activating or deactivating the latch you can control which half of the ROM the reads come from and therefore control which of the two hardcoded character sets is used when rendering.)
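Here's a rough Python model of that last stage, just to show how little there is for the CPU to influence. The names (`render_slice`, `charset_latch`) are invented for the example, and the ROM contents are a blank placeholder with one made-up pattern, not a real PET character ROM dump. It also assumes the shift register sends the most significant bit out first and that the scan line count is the low three address bits.

```python
def render_slice(rom, screen_code, scan_line, charset_latch):
    """Return the 8 dots (1 = lit) for one scan line of one character.

    rom           -- 2K of character data (placeholder contents here)
    screen_code   -- the byte read from video RAM
    scan_line     -- 0..7, from the scan line counter
    charset_latch -- 0 or 1, drives the top address line of the ROM
    """
    address = (((charset_latch & 1) << 10)
               | ((screen_code & 0x7F) << 3)
               | (scan_line & 0x07))
    data = rom[address]
    # Shift register: clock the byte out one bit at a time (MSB first assumed).
    dots = [(data >> bit) & 1 for bit in range(7, -1, -1)]
    # Reverse video: the top bit of the screen code flips every dot.
    if screen_code & 0x80:
        dots = [dot ^ 1 for dot in dots]
    return dots

if __name__ == "__main__":
    rom = bytearray(2048)               # blank 2K stand-in for the ROM
    rom[(1 << 3) | 2] = 0b00111100      # made-up pattern: character 1, scan line 2
    print(render_slice(rom, 0x01, 2, 0))    # normal
    print(render_slice(rom, 0x81, 2, 0))    # reverse video
    print(render_slice(rom, 0x01, 2, 1))    # other half of the ROM
```

The only input here that software controls directly is `charset_latch` (plus, of course, the bytes in video RAM); everything in `rom` is fixed, which is the whole problem.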
Thus the only way to have user-generated characters on a PET is to actually modify the hardware so you can substitute a writeable memory of some sort for the character ROM, and from there follows the hardware discussion about RAMs and dual porting and MUXes, etc.
I don't have a reference for a period PET add-on that provided this capability, but there was a fairly common third-party add-on for the TRS-80 that did exactly what's being discussed:
http://www.trs-80.org/80-grafix/
(The TRS-80 Model I and III had quite similar video hardware to the PET.)