The code injection framework is a means of taking control over the program flow and running injected code.
This allows us to effectively change the code without requiring the complete source code, editing only what we need.
When you know the assembly, you already have the source code.
The goal of the code injection framework is to make it easy to modify an ARM binary so that it points to our own code.
There are some very basic methods to steal the control flow based on patterns in the generated assembly.
Function pointers are relevant when a table of function pointers is used, like where a set of commands map to their own functions.
For this, replacing the function pointer is enough to steal control flow.
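As a minimal sketch of that idea, the Python below overwrites one 32-bit little-endian entry in a table of function pointers. The helper name, the four-entry table, and every address in it are hypothetical and only serve the illustration.

```python
import struct


def replace_pointer(image: bytearray, image_base: int,
                    slot_address: int, new_target: int) -> int:
    """Overwrite one 32-bit little-endian pointer and return the old value."""
    offset = slot_address - image_base
    if offset < 0 or offset + 4 > len(image):
        raise ValueError("slot lies outside the image")
    (old_target,) = struct.unpack_from("<I", image, offset)
    struct.pack_into("<I", image, offset, new_target)
    return old_target


# Hypothetical command table with four handlers, loaded at 0x02000000.
BASE = 0x02000000
table = bytearray(struct.pack("<4I",
                              0x02001001, 0x02001101, 0x02001201, 0x02001301))

# Redirect the third command (index 2) to our own handler.
old = replace_pointer(table, BASE, BASE + 2 * 4, 0x023C8001)
print(hex(old), table.hex())
```

Returning the old pointer lets the replacement function call the original handler if it only needs to wrap it.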
Repoints are relevant when a table is used and all that's desired is different elements in that table.
This is particularly useful when the desire is to expand the table beyond its initial memory scope.
Hooks are a more invasive method of directly inserting code that jumps to a different location.
This involves manually overwriting existing code with a jump to a different location that holds code we manage.
There are two hook constructs used by hg-engine: register-specific long jumps and full function replacements.
The register-specific jump is simple:
ldr rN, =SymbolToJump
bx rN
.pool
00 48 - ldr r0, [pc, #0]
00 47 - bx r0
EF BE AD DE - 0xDEADBEEF little endian
or, when the hook does not start on a word boundary (here using r3):
01 4B - ldr r3, [pc, #4]
18 47 - bx r3
00 00 - padding so that the 0xDEADBEEF literal is word aligned
EF BE AD DE - 0xDEADBEEF little endian
The bytes follow from the register and the ADDRESS like so:
(ADDRESS & 0x3) != 0, (48 | register), (register << 3), 47, [00, 00,] (LE)ADDRESS
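The byte layout above can be sketched in Python as follows. The function name and parameters are hypothetical, and the sketch assumes that the alignment test applies to the location being overwritten while the trailing little-endian word is the destination. The source writes both simply as ADDRESS.

```python
def build_register_hook(hook_at: int, target: int, register: int = 0) -> bytes:
    """Build the Thumb bytes for: ldr rN, =target / bx rN / literal pool."""
    if not 0 <= register <= 7:
        raise ValueError("only the low registers r0-r7 fit these encodings")
    if hook_at & 1:
        raise ValueError("Thumb instructions must be halfword aligned")

    # Assumption: the pad is needed when the hook site is not word aligned,
    # because the 32-bit literal after the two instructions must be.
    needs_pad = (hook_at & 0x3) != 0

    code = bytes([
        int(needs_pad),      # imm8 of ldr: 0 -> [pc, #0], 1 -> [pc, #4]
        0x48 | register,     # ldr rN, [pc, #imm]
        register << 3,       # bx rN (low byte)
        0x47,                # bx rN (high byte)
    ])
    if needs_pad:
        code += b"\x00\x00"  # two bytes of padding before the literal
    return code + target.to_bytes(4, "little")


aligned = build_register_hook(0x02000000, 0xDEADBEEF, register=0)
unaligned = build_register_hook(0x02000002, 0xDEADBEEF, register=3)
print(aligned.hex(" "))
print(unaligned.hex(" "))
```

Because bx selects the instruction set from bit 0 of the address, a caller jumping to Thumb code would pass a target with that bit set.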
The second construct has a big requirement: it must maintain the values of all of the registers.
The obvious solution would be to turn to the stack at this point.
The stack would allow a value to be stored temporarily, as long as it is retrieved and everything is realigned before control of the program flow is returned to the original ROM.
However, the ARM calling convention throws a wrench in the plans by stipulating that functions with 5 or more arguments receive their later arguments on the stack.
This adds another requirement to our second type of hook: complete stack preservation.
The only way to achieve this becomes clear: usage of the branch with link (bl) instruction. This is an instruction that only modifies one register: the link register.
The link register is kept in order to track program flow and return to the caller of the function once the function is finished.
In order to use bl and still keep the original link register, there needs to be some location where we can store the value of the link register for later restoration.
hg-engine then takes advantage of a quirk of the NDS that isn't present in its predecessor, the GBA: the ROM is not mapped to executable space.
Code always needs to be copied into the EWRAM before it can run.
What this lets us do is actually just store the link register directly alongside the executable code that we are modifying at runtime for safe restoration.
This even holds up under recursion, as long as all of the recursion happens outside of the original ROM code, so we do not have to worry about that.
This snippet looks something like this:
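push {r5-r6}            @ borrow two scratch registers
ldr r5, =lrStorageSpace @ r5 = address of the storage word
mov r6, lr
str r6, [r5]            @ save the caller's return address
pop {r5-r6}             @ restore both registers; the stack is back to its original state
bl SymbolToJump         @ call our code; only lr is modified
ldr r1, =lrStorageSpace
ldr r1, [r1]            @ reload the saved return address (overwrites r1)
mov pc, r1              @ return to the original caller
.pool
lrStorageSpace:
.word 0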
As an instruction, bl does a lot more behind the scenes than the others. Even in THUMB it is actually 2 instructions: the first supplies the high half of the offset to the target and the second supplies the low half.
With a zero offset they are encoded as F000 F800, in that order.
The combination of the two instructions encodes how many bytes are to be skipped over. The remaining 11 bits of each instruction store half of the 22-bit total.
The actual value of bytes is double of what is specified in the instruction (because there is no use in specifying a byte-aligned address to jump to), allowing for a total range of 2^23 bytes that are possible to skip.
To allow both forward and backward jumping, this is a 2's complement signed integer.
This gives a total range of -0x400000 to +0x3FFFFE bytes. Given that the NDS has exactly 4 MB of EWRAM, this completely covers every bit of code editing for our purposes.
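The encoding described above can be worked through in Python. This is a sketch rather than the project's own script: the function names are hypothetical, and it relies on the Thumb rule that the offset is measured from the address of the bl plus 4.

```python
def encode_thumb_bl(source: int, target: int) -> bytes:
    """Encode a Thumb bl placed at `source` that jumps to `target`."""
    offset = target - (source + 4)       # PC reads as instruction address + 4
    if offset % 2:
        raise ValueError("offset must be halfword aligned")
    if not -0x400000 <= offset <= 0x3FFFFE:
        raise ValueError("target is out of range for bl")
    high = 0xF000 | ((offset >> 12) & 0x7FF)   # upper 11 bits of offset / 2
    low = 0xF800 | ((offset >> 1) & 0x7FF)     # lower 11 bits of offset / 2
    return high.to_bytes(2, "little") + low.to_bytes(2, "little")


def decode_thumb_bl(source: int, code: bytes) -> int:
    """Recover the target address from the four bytes of a Thumb bl."""
    high = int.from_bytes(code[0:2], "little")
    low = int.from_bytes(code[2:4], "little")
    value = ((high & 0x7FF) << 11) | (low & 0x7FF)   # 22-bit field
    if value & (1 << 21):                            # 2's complement sign bit
        value -= 1 << 22
    return source + 4 + value * 2


# A zero offset yields the base encodings F000 F800 (bytes 00 F0 00 F8).
zero = encode_thumb_bl(0x02000000, 0x02000004)

# Forward and backward jumps inside the 4 MB of EWRAM round-trip cleanly.
forward = encode_thumb_bl(0x0206DED0, 0x023C8000)
backward = encode_thumb_bl(0x023C8000, 0x0206DED0)
print(zero.hex(" "), forward.hex(" "), backward.hex(" "))
```

Rejecting out-of-range offsets up front is a design choice for the sketch, so that a bad hook fails when the bytes are built instead of jumping somewhere unintended at runtime.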
That covers the idea behind all of the hook insertions. What is needed next is a way to specify the hooks easily, so that a script can automate their placement in the binaries.
In order to get this, we need to get a dump of the symbols in the objects that we create and where they have been assembled to--their memory addresses.
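As a rough illustration of that step, the sketch below parses a symbol dump in the three-column `address type name` layout that GNU nm prints by default. Which tool hg-engine actually uses is not stated here, and the function name and the sample addresses are made up for the example.

```python
def parse_symbol_dump(text: str) -> dict:
    """Map each code symbol name to its address from nm-style output."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue                 # e.g. undefined symbols have no address
        address, kind, name = parts
        if kind in ("T", "t"):       # keep symbols from the code section only
            symbols[name] = int(address, 16)
    return symbols


# Hypothetical dump; the addresses are placeholders, not real build output.
sample_dump = """\
023c8000 T CreateBoxMonData
023c8120 T CheckCanTakeItem
         U memcpy
023c9000 D SomeData
"""

symbols = parse_symbol_dump(sample_dump)
print(symbols)
```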
Per research done by a fellow community member, Mikelan98, the NDS Pokémon games actually have a quite large region of memory that by and large goes unused in gameplay, from 0x023C8000 through the end of the EWRAM. This gives us a region of memory in which we can store all of our generated code.
The NDS introduced an overlay system with which code can be inserted into the game with individual binaries and selectively loaded at runtime.
This code is position-dependent, unlike REL files from e.g. the N64 or DLL files, but we forgive Nintendo for their early embedded systems explorations.
Our code can directly be added as a new overlay and loaded in using the built-in system when the game starts to give us a permanent code expansion.
The format decided on was simple:
overlayNum symbolName addressToJumpFrom [usedRegister]
for example...
arm9 CreateBoxMonData 0206DED0
0012 CheckCanTakeItem 02241334 0
CreateBoxMonData is jumped to from the memory address 0x0206DED0 in the ARM9 using the second hook that maintains registers and stack.
CheckCanTakeItem is jumped to from the memory address 0x02241334 in overlay 12 using r0 to get there, clobbering r0 in the process with the destination address.
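A line in this format can be parsed with a few lines of Python. The sketch below is an illustration under assumptions, not the project's own parser: the `Hook` type and `parse_hook_line` are hypothetical names, and the first field is kept as a string because the text does not say how overlay numbers are meant to be read.

```python
from typing import NamedTuple, Optional


class Hook(NamedTuple):
    binary: str                 # "arm9" or an overlay number such as "0012"
    symbol: str                 # name of the injected function
    address: int                # address to jump from
    register: Optional[int]     # None selects the register-preserving hook


def parse_hook_line(line: str) -> Hook:
    """Parse: overlayNum symbolName addressToJumpFrom [usedRegister]."""
    fields = line.split()
    if len(fields) not in (3, 4):
        raise ValueError(f"expected 3 or 4 fields, got {len(fields)}")
    binary, symbol, address = fields[:3]
    register = int(fields[3]) if len(fields) == 4 else None
    return Hook(binary, symbol, int(address, 16), register)


full = parse_hook_line("arm9 CreateBoxMonData 0206DED0")
via_r0 = parse_hook_line("0012 CheckCanTakeItem 02241334 0")
print(full)
print(via_r0)
```

Leaving the register as `None` when the fourth field is absent gives a script a simple way to pick between the two hook constructs.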