patch and then see how each hooking engine does.

I'll test:

* [EasyHook](https://easyhook.github.io/)
* [PolyHook](https://github.com/stevemk14ebr/PolyHook)
* [MinHook](https://www.codeproject.com/Articles/44326/MinHook-The-Minimalistic-x-x-API-Hooking-Libra)

hooked itself poses.

Namely:

* Are jumps relocated?
* What about RIP addressing?
* If there's a loop at the beginning / if it's a tail recursive function, does

Test case: Small
================

This is just a very small function; it is smaller than the hook code will be -
so how does the library react?

```ASM
_small:
    xor eax, eax
    ret
```

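To put a number on "smaller than the hook code": assuming the usual encodings
(`xor eax, eax` as `31 C0`, `ret` as `C3`), `_small` is 3 bytes long. The sketch
below compares that against two common hook stubs, a 5-byte `jmp rel32` and a
14-byte `jmp [rip+0]` followed by an absolute 64-bit address. The helper name
and the choice of stubs are assumptions made for illustration; they are not
taken from any of the tested libraries.

```python
# Bytes of _small, assuming the usual encodings:
#   31 C0   xor eax, eax
#   C3      ret
SMALL = bytes([0x31, 0xC0, 0xC3])

# Two common hook stubs (illustrative, not tied to a specific engine):
JMP_REL32_SIZE = 5    # E9 <rel32>
JMP_ABS64_SIZE = 14   # FF 25 00 00 00 00 <8-byte address>


def fits(function_size: int, hook_size: int) -> bool:
    """Return True if the hook can be written without running past the function."""
    return function_size >= hook_size


for name, size in (("jmp rel32", JMP_REL32_SIZE), ("jmp [rip+0]", JMP_ABS64_SIZE)):
    print(f"{name}: {size} bytes, fits into _small: {fits(len(SMALL), size)}")
```

Neither stub fits, so an engine either has to refuse the hook or has to know
that the bytes following `ret` are padding it may overwrite.
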
Test case: Branch
=================

Instead of the FASM code I'll show the disassembled version, so you can see the
instruction lengths & offsets.

```ASM
0026 | 48 83 E0 01 | and rax,1
002A | 74 17       | je test_cases.0043 ----+
002C | 48 31 C0    | xor rax,rax            |
002F | 90          | nop                    |
0030 | 90          | nop                    |
0031 | 90          | nop                    |
0032 | 90          | nop                    |
0033 | 90          | nop                    |
0034 | 90          | nop                    |
0035 | 90          | nop                    |
0036 | 90          | nop                    |
0037 | 90          | nop                    |
0038 | 90          | nop                    |
0039 | 90          | nop                    |
003A | 90          | nop                    |
003B | 90          | nop                    |
003C | 90          | nop                    |
003D | 90          | nop                    |
003E | 90          | nop                    |
003F | 90          | nop                    |
0040 | 90          | nop                    |
0041 | 90          | nop                    |
0042 | 90          | nop                    |
0043 | C3          | ret <------------------+
```

This function has a branch in the first 5 bytes. Hooking it detour-style isn't
possible without fixing that branch in the trampoline. The NOP sled is just so

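As a sketch of what fixing the branch means: the `je` at `002A` is encoded as
`74 17`, a short jump whose 8-bit displacement is relative to the end of the
instruction, so it targets `002A + 2 + 17h = 0043`. Once the instruction is
copied into a trampoline, the displacement has to be recomputed, and because
the trampoline is usually more than 127 bytes away the short form has to be
widened to the 6-byte near form `0F 84 rel32`. The function name and the
trampoline address below are made up for illustration; this is not how any
particular engine implements it.

```python
import struct


def relocate_je_rel8(insn: bytes, old_addr: int, new_addr: int) -> bytes:
    """Re-encode a short `je rel8` (74 xx) as `je rel32` (0F 84 xx xx xx xx)
    so that it still reaches its original target from new_addr."""
    if len(insn) != 2 or insn[0] != 0x74:
        raise ValueError("not a short je")
    rel8 = struct.unpack("b", insn[1:2])[0]   # signed 8-bit displacement
    target = old_addr + 2 + rel8              # relative to the end of the 2-byte insn
    rel32 = target - (new_addr + 6)           # relative to the end of the 6-byte insn
    if not -2**31 <= rel32 < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\x0f\x84" + struct.pack("<i", rel32)


# The je at 002A from the listing. The trampoline is assumed to start at 1000h
# with the 4-byte `and rax,1` copied first, so the je lands at 1004h.
fixed = relocate_je_rel8(bytes([0x74, 0x17]), old_addr=0x2A, new_addr=0x1004)
print(fixed.hex(" "))
```

The widened instruction is 4 bytes longer than the original, so everything
copied after it shifts as well. That is why relocation has to be done
instruction by instruction rather than with a plain `memcpy`.
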
A quick and dirty[1] test for this is re-implementing the well-known C `rand`
function.

```ASM
public _rip_relative
_rip_relative:
    mov rax, qword[seed]    ; reads 8 bytes from a 4-byte variable
    mov ecx, 214013         ; multiplier
    mul ecx                 ; edx:eax = eax * ecx
    add eax, 2531011        ; increment
    mov [seed], eax         ; store the new state

    shr eax, 16             ; return bits 16..30 of the state
    and eax, 0x7FFF
    ret

seed dd 1
```

The very first instruction uses RIP-relative addressing, thus it needs to be
fixed in the trampoline.

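The fix itself is plain arithmetic. A RIP-relative operand stores a signed
32-bit displacement relative to the address of the *next* instruction, so when
the instruction moves, the displacement has to change by the same distance in
the opposite direction, and the result must still fit into 32 bits. The sketch
below assumes the 7-byte encoding `48 8B 05 disp32` for
`mov rax, [rip+disp32]`; the addresses and the helper name are invented for the
example.

```python
import struct


def relocate_rip_disp32(insn: bytes, disp_off: int, old_addr: int, new_addr: int) -> bytes:
    """Patch the disp32 at insn[disp_off:disp_off+4] so the RIP-relative
    operand still points at the same absolute address after the move."""
    (disp,) = struct.unpack_from("<i", insn, disp_off)
    target = old_addr + len(insn) + disp      # RIP = address of the next instruction
    new_disp = target - (new_addr + len(insn))
    if not -2**31 <= new_disp < 2**31:
        raise ValueError("trampoline too far away for a disp32")
    patched = bytearray(insn)
    struct.pack_into("<i", patched, disp_off, new_disp)
    return bytes(patched)


# mov rax, [rip+0x22] at 0x1000 reads from 0x1000 + 7 + 0x22 = 0x1029
original = bytes.fromhex("488b0522000000")
moved = relocate_rip_disp32(original, disp_off=3, old_addr=0x1000, new_addr=0x5000)
print(moved.hex(" "))
```

The range check is the interesting part: if the trampoline ends up more than
2 GB away from the data, the displacement cannot be patched and the instruction
would have to be rewritten entirely.
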
Test case: AVX & RDRAND
=======================

The AMD64 instruction set is extended with every CPU generation. Because the
hooking engines need to know the instruction lengths and their side effects to
properly apply their hooks, they need to keep up.

are disagreements on whether I've picked good candidates of "exotic" or new
instructions, but those were the first that came to mind.

Test case: loop and TailRec
===========================

My hypothesis before starting this evaluation was that those two cases would
make most hooking engines fail. Back in the good ol' days of x86, detour hooking
didn't require any special thought because nearly every function started with
the same prologue, `PUSH EBP; MOV EBP, ESP`. That is 3 bytes on its own and 5
bytes with the hotpatching `MOV EDI, EDI` in front, exactly the size of the
5-byte `JMP +-2GB` hook[2]. That isn't so easy for AMD64: a) the hook sometimes
needs to be *way* bigger, and b) due to changes in the calling convention and
the general architecture of AMD64, there just isn't a common prologue used for
almost all functions anymore.

Those by themselves aren't a problem, since the hooking engines can fix all the
instructions they would overwrite. However, I hypothesized that only a few would
check whether the function contained a loop that jumps back into the
instructions that have been overwritten. Consider this:

```ASM
public _loop
_loop:
    mov rax, rcx
@loop_loop:
    mul rcx
    nop
    nop
    nop
    loop @loop_loop ; lol
    ret
```

There are only 3 bytes that can be safely overwritten. Right after that is the
destination of the jump backwards. This is a very simple (and kinda pointless)
function, so detecting that the loop might lead to problems shouldn't be hard.
Basically the same applies for the next example:

```ASM
public _tail_recursion
_tail_recursion:
    test ecx, ecx
    je @is_0
    mov eax, ecx
    dec ecx
@loop:
    test ecx, ecx
    jz @tr_end

    mul ecx
    dec ecx

    jnz @loop
    jmp @tr_end
@is_0:
    mov eax, 1
@tr_end:
    ret
```

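A check of this kind can be sketched in a few lines: after decoding the
function, collect every branch target and see whether one of them lands inside
the bytes that get moved to the trampoline. The offsets below are hand-computed
for `_loop`, assuming the usual encodings (`mov rax, rcx` 3 bytes, `mul rcx` 3
bytes, `nop` 1 byte, `loop rel8` 2 bytes). The tuple layout and the function
names are hypothetical and not how any of the tested engines represent code.

```python
from typing import List, Optional, Tuple

# (offset, length, branch target offset or None)
Insn = Tuple[int, int, Optional[int]]

LOOP_FUNC: List[Insn] = [
    (0, 3, None),   # mov rax, rcx
    (3, 3, None),   # @loop_loop: mul rcx
    (6, 1, None),   # nop
    (7, 1, None),   # nop
    (8, 1, None),   # nop
    (9, 2, 3),      # loop @loop_loop
    (11, 1, None),  # ret
]


def relocated_length(insns: List[Insn], hook_size: int) -> int:
    """Bytes that must move to the trampoline: whole instructions covering the hook."""
    covered = 0
    for _, length, _ in insns:
        if covered >= hook_size:
            break
        covered += length
    return covered


def branches_into_hook(insns: List[Insn], hook_size: int) -> List[int]:
    """Offsets of branches that stay in place but target the moved bytes.

    A jump to offset 0 is not reported: it lands on the hook itself rather
    than in the middle of the overwritten bytes."""
    moved = relocated_length(insns, hook_size)
    return [off for off, _, target in insns
            if off >= moved and target is not None and 0 < target < moved]


print(relocated_length(LOOP_FUNC, 5))    # bytes moved for a 5-byte jmp
print(branches_into_hook(LOOP_FUNC, 5))  # offsets of offending branches
```

With a hook size of 3 the list comes back empty, which matches the 3 safely
overwritable bytes. By the same count, `@loop` in `_tail_recursion` sits at
offset 8 (four 2-byte instructions precede it, assuming short encodings), so a
5-byte hook would not reach it while a 14-byte one would.
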
(Preliminary) Results
=====================

+----------+-----+------+------------+---+------+----+-------+
|      Name|Small|Branch|RIP Relative|AVX|RDRAND|Loop|TailRec|
+----------+-----+------+------------+---+------+----+-------+

then thrown away by the multiplication. It's shitty code is what I'm saying.

In retrospect I should have used a jump table like a switch-case could be
compiled into. That would be read-only data. Oh well.

[2] And Microsoft decided at some point to make it even easier for their code
with the advent of hotpatching.