The principle of operation of the Catweasel is very simple. There's a free-running counter (speed selectable between roughly 7, 14, 28 and 56 MHz), a block of SRAM (128 KiB, i.e. 131,072 bytes) and a second counter that supplies the SRAM address. Every time a pulse arrives from reading a floppy, the lower 7 bits of the counter are stored into the SRAM, the store address is incremented and the counter is reset. The high-order bit of each stored byte is connected to the drive's index output, so you have an idea of whether or not you're within the index area when a sample is taken. A track is read (depending on program control, this can be index-to-index or until the SRAM is full) and then the PC reads the SRAM. There's a very basic set of latches that can control the density select, step and direction outputs, but the timing for those is done under program control--just a simple bit-bang affair.
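To make the sample format concrete, here's a rough sketch of how a program might unpack a captured buffer. The function name, the exact 14 MHz clock figure and the assumption that a set high bit means "index active" are mine for illustration; the real hardware's polarity and exact clock rates may differ.

```python
def decode_samples(sram, clock_hz):
    """Split raw capture bytes into (interval_seconds, index_flag) pairs.

    Assumes one byte per flux pulse: low 7 bits = counter value since the
    previous pulse, high bit = state of the index line (polarity assumed).
    """
    samples = []
    for byte in sram:
        count = byte & 0x7F        # lower 7 bits: ticks since last pulse
        index = bool(byte & 0x80)  # high-order bit: index signal
        samples.append((count / clock_hz, index))
    return samples


# Two pulses 28 ticks apart at a nominal 14 MHz, i.e. 2 microseconds each;
# the second one was sampled while the index line was active.
for seconds, index in decode_samples(bytes([0x1C, 0x9C]), 14_000_000):
    print(f"{seconds * 1e6:.2f} us, index={index}")
```

Note that a 7-bit count tops out at 127 ticks, so the clock has to be chosen so that the longest expected gap between pulses still fits.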
It's up to the software to figure things out. Writing works in the opposite direction--the program stores the transition times to the SRAM and the counter is used to bang the write data line at the appropriate time.
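As an example of the "figuring out" part, here's a minimal sketch that sorts raw counter values into whole bit cells. It assumes an MFM high-density disk, where flux transitions nominally fall 2, 3 or 4 microseconds apart (1 microsecond cells), and a nominal 14 MHz counter; the function names and the nearest-cell rounding are my own simplification, not what any particular package does.

```python
def classify_intervals(counts, clock_hz, cell_s=1e-6):
    """Map raw counter values to the nearest whole number of bit cells."""
    ticks_per_cell = clock_hz * cell_s
    return [max(1, round(c / ticks_per_cell)) for c in counts]


def suspicious(cells, valid=(2, 3, 4)):
    """Return positions whose cell count is not a legal MFM spacing."""
    return [i for i, n in enumerate(cells) if n not in valid]


counts = [27, 29, 43, 55, 41, 90]          # slightly jittery samples
cells = classify_intervals(counts, 14_000_000)
print(cells)              # [2, 2, 3, 4, 3, 6]
print(suspicious(cells))  # [5]
```

Real decoders usually track drive speed drift rather than trusting a fixed cell width, so treat this as the simplest thing that could work.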
Everything else is left to the program. This is the principle of operation of all of the other similar bits of kit, such as DeviceSide, KryoFlux, etc. They differ only in details.
Mind you, you can't simply copy a disk by feeding back what you read. There are various effects, such as quantization error and bit-crowding (which calls for write precompensation), that need to be accommodated. A few years back, I posted a way to do this using a simple ATmega microcontroller-with-SRAM setup, where a 16 MHz clock was more than sufficient to process 1.44 MB floppies. Not a lot of interest, so I let it go without much other comment. Modern MCUs have sufficient SRAM to be able to do this all on one chip, not to mention built-in USB OTG facilities. So there's really not much in the way of hardware to do--just sharpen your coding pencil...
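Here's a hedged sketch of what that accommodation might look like on the write side. Instead of replaying the raw timings, the program starts from intervals already snapped to whole bit cells (which removes the quantization jitter) and then nudges each transition toward its nearer neighbour, since closely spaced transitions tend to read back pushed apart. The neighbour-gap rule, the 1000 ns cell and the 125 ns shift are illustrative assumptions on my part; real controllers typically pick the shift from the bit pattern and often vary it by track.

```python
def write_intervals_ns(cells, cell_ns=1000, precomp_ns=125):
    """Turn cell counts into write intervals (ns) with simple precompensation."""
    # Nominal transition times, exact multiples of the cell width.
    times, t = [], 0
    for n in cells:
        t += n * cell_ns
        times.append(t)

    # Shift each transition toward whichever neighbouring gap is shorter.
    shifted = []
    for i, t in enumerate(times):
        prev_gap = cells[i]
        next_gap = cells[i + 1] if i + 1 < len(cells) else prev_gap
        if prev_gap < next_gap:
            t -= precomp_ns   # near neighbour is behind: write early
        elif prev_gap > next_gap:
            t += precomp_ns   # near neighbour is ahead: write late
        shifted.append(t)

    # Convert back to intervals between successive transitions.
    out, last = [], 0
    for t in shifted:
        out.append(t - last)
        last = t
    return out


print(write_intervals_ns([2, 4, 2, 2]))  # [1875, 4250, 1875, 2000]
```

The total track length is unchanged by the shifts; only the spacing between neighbouring transitions moves.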