I think up to the 386, most of the x86 chips Intel released were tri-state multiplexed
FWIW, "tri-state" doesn't have anything to do with multiplexing; it refers to a standard feature of IC logic where a chip can set an output to a high-impedance state to effectively "disconnect" itself from the circuit. It's a vital and standard feature of any kind of bus-based architecture, i.e. whenever multiple devices sit on a common wire and take turns driving it. I mean, yes, tristate logic is a feature you need to make a multiplexer, but CPUs with non-multiplexed busses also (almost always*) support tristate on their address busses and control lines in order to allow busmastering by other devices.
(*As a counter-example, the original 6502 CPU didn't support tristating the address bus, which is why practically all complex 6502-based computers have a set of 74LS244 or equivalent tristate buffers on the address lines. Some successors, like WDC's W65C02S and Commodore's 6510 used in the C64, added internal tristate.)
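To make the "take turns driving it" idea concrete, here's a rough toy model in Python. It's only a sketch: `Z` and `resolve_bus` are names I made up for illustration, and treating any two simultaneous drivers as an error is a simplification (real hardware only actually fights when the drivers disagree on the level).

```python
from typing import Optional, Sequence

Z = None  # stand-in for high impedance: the device is "disconnected" from the wire


def resolve_bus(outputs: Sequence[Optional[int]]) -> Optional[int]:
    """Return the value seen on a shared bus, given each device's output.

    Each entry is either an integer the device is driving, or Z if the
    device has tristated its output.
    """
    driving = [value for value in outputs if value is not Z]
    if not driving:
        return Z  # nobody is driving, so the bus floats
    if len(driving) > 1:
        # Two enabled outputs on the same wire is the thing tristate exists to prevent.
        raise ValueError("bus contention: more than one device is driving the bus")
    return driving[0]


# CPU drives the bus while the DMA controller stays tristated...
cpu_turn = resolve_bus([0x42, Z])
# ...then the CPU lets go so the other busmaster can take over.
dma_turn = resolve_bus([Z, 0x99])
```

None of this involves multiplexing. The same rule applies whether the wire carries only address bits or address and data at different times.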
The 386DX was also hampered by the tri-state bus, it would have been a much faster part if it had dedicated address and data lines.
The 80286 was the first x86 CPU with a demultiplexed bus. It's a significant reason it manages to be faster than its real-mode-only half-sister, the 80186. (Both CPUs include improvements like having dedicated address calculation hardware, but the 186 bus cycle needs more time to complete because of the address latching step. The 8 MHz Tandy 2000 would usually run about even on benchmarks with the original 6 MHz IBM 5170, but the later zero-wait-state 6 MHz 5162 would beat it.)
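Some back-of-the-envelope arithmetic shows why those machines line up the way they do. This is a sketch under assumptions that aren't stated above: a four-clock minimum bus cycle on the 186, a two-clock minimum on the 286, and one wait state on the original 5170. It only looks at memory cycle time and ignores everything else that goes into a benchmark; `bus_cycle_ns` is just a helper name for the example.

```python
def bus_cycle_ns(clock_mhz: float, clocks_per_cycle: int, wait_states: int = 0) -> float:
    """Length of one memory bus cycle in nanoseconds."""
    return (clocks_per_cycle + wait_states) * 1000.0 / clock_mhz


# Assumed figures: 4 clocks per bus cycle (multiplexed 186),
# 2 clocks per bus cycle (demultiplexed 286), 1 wait state on the 5170.
tandy_2000 = bus_cycle_ns(8.0, clocks_per_cycle=4)                 # 500 ns
ibm_5170 = bus_cycle_ns(6.0, clocks_per_cycle=2, wait_states=1)    # 500 ns
ibm_5162 = bus_cycle_ns(6.0, clocks_per_cycle=2, wait_states=0)    # about 333 ns

for name, ns in [("Tandy 2000", tandy_2000), ("IBM 5170", ibm_5170), ("IBM 5162", ibm_5162)]:
    print(f"{name}: {ns:.0f} ns per bus cycle")
```

Under those assumptions the 8 MHz 186 and the 6 MHz one-wait-state 286 end up with the same memory cycle time, and the zero-wait-state 286 is about a third quicker per access.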
To be fair to the 386, being "only" roughly as cycle-efficient as the 286 isn't that much of a slam; the 286 was legit a pretty fast and efficient CPU when it came out, and the 386 compared pretty evenly with other CPUs of its generation, like the 68020. In terms of ISA/architecture it was a huge leap... at least in theory. The real shame is mostly that it took so long for mainstream software to really support it.
Motorola got it right with the 68000.
A dirty little secret about the 68000 is there are benchmarks out there on which the 286 will hang it out to dry; it's not actually *that fast* of a CPU, even compared to the 8086. (Apple engineers used to gripe about the 6502 being faster than it in certain circumstances. It's probably more precise to say the 6502 was more *consistent*, honestly, but for some of what Apple was doing the 6502 could in fact perform the necessary byte-size operations in fewer cycles.) It takes enough cycles to execute most instructions that having a multiplexed bus probably wouldn't have impacted its IPC that much. The reason it was so popular and long-lived was more about its architecture than raw performance; by essentially emulating a 32-bit CPU it's a much more friendly architecture for higher-level languages and big-memory applications than the segmented architectures of its 16-bit contemporaries.
That said, Motorola did probably make the right call with putting the 68000 in its big fat 64-pin package instead of resorting to multiplexing. The 8086's 40-pin package is kind of a joke when you consider how many pins it would *actually* take to fully expand not only its data and address busses, but its status and control signals. Motorola could have shaved a few bucks off the package costs by multiplexing everything the way Intel did with the 8086, but considering their CPU die was already almost three times larger it wouldn't really have been worth the savings.