They can cause not only software but also hardware incompatibilities. What will handle the 16 MB address space of the 65816, for example? Anyway, to allege that a given CPU "does work successfully" merely by running a program or two on the given system is a very blind and, in this case, rather empirical approach... In-depth knowledge and rigorous testing are required. I have read how Intel used to test their 486 and Pentium processors once upon a time... A short look at the 65816 datasheet should be sufficient to learn that direct replacement is impossible. Such a replacement is a gamble, with the constant doubt that any fault or software bug in the system could be caused by an incompatible CPU... Are you comfortable with that? For a collector's computer... that runs for a few minutes a day... maybe...
I've just done some more reading... turns out the 65C816 will work in the Apple //e motherboard if you bend out pin 1, the pin mentioned earlier as causing too much power drain. The 65C816 runs fine with just a 16-bit address bus... turns out that the pins that differ between the 65C02 and 65C816 are mostly (7 out of 8) not connected on the //e. There is a problem using a 65C816 in an Apple ][ or ][ plus, as one of these pins was used there for a memory buffer purpose that the //e does not use. A project over at 6502.org built a small "wedge" PCB to directly replace the 6502 with the 65C816. The '816 is designed to be very compatible with the 65C02... as both were designed by the same engineer (Bill Mensch) at the same company (Western Design Center), this is not surprising.
It was mentioned in discussions on C.S.A2 that the main benefit of running a 65C816 in your //e is being able to use Merlin 16, which runs 4 to 5 times faster because it has access to true 16-bit instructions instead of having to emulate them in 8-bit code (which Merlin 8 does). It was also mentioned that the 65C816 add-on for the Ramworks was somewhat unique in that it did use the 24-bit addressing to directly address the Ramworks memory... I wonder if any software (other than a patched Appleworks) made use of the direct memory addressing on the Ramworks?
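For anyone wondering what the 24-bit addressing amounts to in numbers, here is a rough Python sketch of the arithmetic only. The function names are made up for illustration, and it says nothing about how the Ramworks add-on actually mapped its memory. The last helper assumes a socket that ignores the bank byte entirely (as a plain 16-bit address bus would), so every bank aliases onto the same 64 KB.

```python
ADDRESS_BITS = 24          # width of a full 65C816 address
BANK_SIZE = 1 << 16        # 64 KB per bank (the 16-bit part of the address)
TOTAL_BYTES = 1 << ADDRESS_BITS   # 16,777,216 bytes = 16 MB


def split_address(addr):
    """Split a 24-bit address into (bank byte, 16-bit offset)."""
    if not 0 <= addr < TOTAL_BYTES:
        raise ValueError("address does not fit in 24 bits")
    return addr >> 16, addr & 0xFFFF


def join_address(bank, offset):
    """Combine a bank byte and a 16-bit offset into a 24-bit address."""
    if not 0 <= bank <= 0xFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("bank must fit in 8 bits, offset in 16 bits")
    return (bank << 16) | offset


def seen_on_16bit_bus(addr):
    """Assumption: hardware that never looks at the bank byte sees only
    the low 16 bits, so all 256 banks land on the same 64 KB."""
    return split_address(addr)[1]


if __name__ == "__main__":
    bank, offset = split_address(0x012345)
    print(f"bank ${bank:02X}, offset ${offset:04X}")
    print(f"{TOTAL_BYTES // BANK_SIZE} banks of {BANK_SIZE} bytes")
```

In other words, a stock //e socket only ever gets the offset part, and something like the Ramworks add-on would have to pick up the bank byte as well to reach beyond 64 KB directly.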
So, if you develop in assembly on a //e (on real hardware, not emulated), a Transwarp card with a 65C816 installed might be a good way to speed up your work. Many people have done this upgrade, and software compatibility seems to be very good, likely because the 65C816 was designed to run 65C02 software without issues, for the most part. Programs (mainly copy-protected ones) that deliberately execute opcodes that were illegal on the 65C02 but are valid instructions on the 65C816 will fail, though the same programs likely also failed on the Apple IIgs.
Edit: I forgot to mention that some implementations of the 65C816 do not seem to like the floating inputs on the unused pins, but those chips are somewhat rare. The 65C816 chips found in most IIgs systems seem to work fine on the 8-bit Transwarp... though killing IIgs systems for their CPUs makes little sense, as you might as well have a IIgs instead of a IIe... unless you have a defective IIgs motherboard, in which case salvaging and reusing the CPU makes sense.