2010/10/16

Considerations on Using the XCHG Instruction

This part is taken from Wikipedia...

The Intel 8086, released in 1978, included an instruction named XCHG. It swaps a register with a register, or a register with memory, but cannot swap the contents of two memory locations.

On the common x86 architecture, the XCHG instruction with a memory operand has an implicit LOCK prefix, so the operation is atomic. That atomicity is expensive: the processor has to synchronize with every other device that can access main memory, which can cost tens to hundreds of CPU cycles depending on the processor and on contention. By comparison, on many processors a single MOV instruction can be issued in the same clock cycle as other common instructions. Between two registers, XCHG may still be slower than three MOVs, but it is smaller, so it may be useful where code size matters. The x86 XCHG is primarily useful in its locking form, for writing the locking primitives used in threaded or multiprocessing applications.
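
To make the "locking primitive" use concrete, here is a minimal spinlock sketch in C11. It is not part of the quoted article. On x86, compilers typically implement an atomic exchange on an int with an XCHG against memory, but that mapping is an assumption about the compiler. The names SpinLock, spin_trylock, spin_lock and spin_unlock are made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type for illustration: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} SpinLock;

/* Atomically write 1 and look at the previous value.
   If it was 0, the caller now owns the lock. */
bool spin_trylock(SpinLock *lock)
{
    return atomic_exchange_explicit(&lock->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Busy-wait until the exchange observes a free lock. */
void spin_lock(SpinLock *lock)
{
    while (!spin_trylock(lock)) {
        /* spin */
    }
}

/* Release the lock with a plain atomic store. */
void spin_unlock(SpinLock *lock)
{
    atomic_store_explicit(&lock->locked, 0, memory_order_release);
}
```

The whole lock is a single exchange, which is why the atomic form of XCHG earns its cost here and nowhere else.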

MOV reg, 0 vs. XOR reg, reg

To set a register to zero, prefer XOR reg, reg over MOV reg, 0.
Why? For the two things that matter most in code optimization: code size and execution speed.

Let's look at the two instructions after they are assembled into machine code:
[Figure: comparison between MOV and XOR instructions in the Delphi BASM CPU view]
XOR EAX, EAX uses only 2 bytes, while MOV EAX, 0 consumes 5 bytes (duh)...

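As for speed, the extra cost of MOV is not a separate read from memory (RAM): the immediate value is part of the instruction encoding, so it arrives together with the rest of the instruction stream. The gain from XOR comes from the shorter encoding, which puts less pressure on instruction fetch and the instruction cache, and from the fact that many modern x86 processors recognize XOR reg, reg as a zeroing idiom that does not depend on the previous value of the register. One caveat: unlike MOV, XOR modifies the flags, so MOV reg, 0 is still the right choice when the flags must be preserved.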
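
The size difference can be written down as raw bytes. The small C sketch below spells out the two encodings as byte arrays. The array and function names are made up for illustration, and the code only measures the encodings; it does not execute them.

```c
#include <stddef.h>

/* Machine code for "xor eax, eax": opcode 31 /r with ModRM byte C0.
   (33 C0 is an equally valid 2-byte encoding of the same instruction.) */
const unsigned char xor_eax_eax[] = { 0x31, 0xC0 };

/* Machine code for "mov eax, 0": opcode B8 followed by a 32-bit
   immediate, which is where the four extra bytes come from. */
const unsigned char mov_eax_0[] = { 0xB8, 0x00, 0x00, 0x00, 0x00 };

/* Encoded length of each instruction, in bytes. */
size_t xor_size(void) { return sizeof xor_eax_eax; }
size_t mov_size(void) { return sizeof mov_eax_0; }
```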

Swap Two Integer Variables

//The following procedure uses the XCHG instruction to swap two
//DWORD variables. No stack temporary is used; the only scratch
//storage is the ECX register. Keep in mind that XCHG with a memory
//operand is implicitly locked (see above), so this routine is
//compact rather than fast: a swap built from four plain MOVs
//is usually quicker.

procedure SwapInt(var i1,i2: Integer); assembler; register;
asm
  mov  ecx, [eax]   // ECX := i1
  xchg ecx, [edx]   // i2 := old i1, ECX := old i2
  mov  [eax], ecx   // i1 := old i2
end;

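Note: this routine destroys the ECX register. Under Delphi's register calling convention that is allowed, since EAX, EDX and ECX may be freely modified. If you paste the code into a larger asm block where ECX must be preserved, you can avoid pushing and popping it by replacing the two MOV instructions with XCHG instructions: the third XCHG puts the original ECX value back. The price is three locked memory operations instead of one.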
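
For comparison, here is the same swap as a plain C sketch (swap_int is a name chosen for this example). An optimizing compiler will usually keep the temporary in a register and emit ordinary loads and stores without any locked XCHG. That is an expectation about typical compilers, not a guarantee.

```c
/* Swap two ints through a local temporary.
   Also safe when both pointers refer to the same variable. */
void swap_int(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```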

Fast replacement for Abs() function

WARNING: This function works only with Integer variables and constants! For Low(Integer) the result overflows and stays negative.
function Abs(i: Integer): Integer; assembler; register;
asm
  mov edx, eax
  sar edx, 31      // EDX := 0 if EAX >= 0, -1 if EAX < 0
  xor eax, edx     // "xor eax, edx" + "sub eax, edx" is equal to:
  sub eax, edx     // if edx = -1 then neg eax
end;
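
The same sign-mask trick can be sketched in C. The version below does the arithmetic on unsigned values so that every step is well defined, and it returns an unsigned result so that the most negative input has a representable answer. The name abs_branchless is made up for this example.

```c
#include <stdint.h>

/* Branch-free absolute value using a sign mask.
   mask is 0 for non-negative input and 0xFFFFFFFF for negative input,
   so (bits ^ mask) - mask either leaves the value alone or negates it
   in two's complement. */
uint32_t abs_branchless(int32_t value)
{
    uint32_t bits = (uint32_t)value;
    uint32_t mask = 0u - (bits >> 31);   /* 0 or 0xFFFFFFFF */
    return (bits ^ mask) - mask;
}
```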

Fast replacement for Round() function

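//This function uses the FPU to perform a fast rounding operation.
//It assumes the default FPU rounding mode (round to nearest, ties
//to even) and a Value that fits in a 32-bit Integer.
function FastRound(const Value: Double): Integer; assembler; register;
const
  c: Int64 = $18000000000000; // 2^52 + 2^51
var
  x: Double;
  i: Integer absolute x;
asm
  fild c          // ST(0) := 2^52 + 2^51
  fadd Value      // the rounded integer lands in the low mantissa bits
  fstp x          // store the sum as a Double
  mov  eax, i     // low 32 bits of x = rounded result
end;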
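
Why the magic constant works is easier to see in a portable sketch. Adding 2^52 + 2^51 pushes the sum into a range where a Double has no fraction bits left, so the hardware rounds it to an integer, and the low 32 bits of the stored value hold that integer in two's complement. The C version below assumes IEEE 754 doubles, the default round-to-nearest mode, no value-changing compiler options such as fast-math, and an input that fits in 32 bits. The name fast_round is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Round to the nearest integer (ties to even) with the magic-number trick. */
int32_t fast_round(double value)
{
    double shifted = value + 6755399441055744.0;  /* 2^52 + 2^51 */
    uint64_t bits;
    uint32_t low;
    int32_t result;

    /* Reinterpret the double as raw bits without breaking aliasing rules. */
    memcpy(&bits, &shifted, sizeof bits);

    /* The low 32 bits of the mantissa are the rounded value. */
    low = (uint32_t)bits;
    memcpy(&result, &low, sizeof result);
    return result;
}
```

Note that ties go to the even neighbour, so 2.5 rounds to 2 and 3.5 rounds to 4.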

Reverse Bit Order

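// The following function reverses the bit order of a DWORD value
// (Integer, Cardinal). Shown on 4 bits: 1101 -> 1011. On the full
// 32 bits, bit 0 trades places with bit 31, bit 1 with bit 30, and so on.
function reverseBits(_EAX: Cardinal): Cardinal; assembler; register;
asm
  xor edx, edx      // EDX accumulates the result
  mov ecx, 32       // one iteration per bit
@Again:
  shr eax, 1        // lowest bit of EAX -> carry flag
  rcl edx, 1        // carry flag -> lowest bit of EDX
  loop @Again
  mov eax, edx      // return the result in EAX
end;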
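
The same shift-out, shift-in loop reads like this as a C sketch (reverse_bits is a name chosen for this example). Each iteration moves the lowest bit of the input into the lowest bit of the result, just as the SHR/RCL pair does through the carry flag.

```c
#include <stdint.h>

/* Reverse the order of all 32 bits in value. */
uint32_t reverse_bits(uint32_t value)
{
    uint32_t result = 0;
    for (int i = 0; i < 32; i++) {
        result = (result << 1) | (value & 1u);  /* take the lowest bit */
        value >>= 1;                            /* expose the next bit */
    }
    return result;
}
```

Applied to a full 32-bit value, the 4-bit pattern 1101 ends up in the top nibble: the input $0000000D becomes $B0000000.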