snq
Experienced Member
I've taken up old-school programming again and I'm wondering whether there are any ways to optimize a regular memory copy.
I'm concentrating on performance on a 286, and so far rep movsw seems to be as fast as it gets. I've tried unrolling the loop, reading 8 bytes at a time into 4 registers and then writing them back, and some alignment tweaks, but nothing seems to beat the regular rep movsw. In fact, whatever I do seems to be significantly slower.
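To be explicit about the baseline, here is a rough C model of what the plain word copy does. It's only a sketch: the name copy_words is made up for illustration, and the trailing single-byte move for odd lengths is an assumption about how the leftover byte would be handled, not something rep movsw does by itself.

```c
#include <stddef.h>

/* Hypothetical helper (not from any library): models "rep movsw"
   followed by one optional "movsb". It performs n/2 two-byte moves,
   then a single byte move when n is odd. Regions must not overlap. */
static void copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = n / 2;        /* the count that would go in CX */

    while (words--) {            /* one iteration stands in for one movsw */
        d[0] = s[0];
        d[1] = s[1];
        d += 2;
        s += 2;
    }
    if (n & 1)                   /* assumed tail handling: one movsb */
        *d = *s;
}
```

The unrolled variants I tried do the same work per word, just with explicit loads and stores in place of the string instruction.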
On modern machines there are plenty of ways to improve performance over rep movsd, and it just seems too easy if rep movsw is actually the fastest method on a 286.
Anyone got some tricks up their sleeve here?