
Analysis and Visualization of Common Packers

HITBSecConf2008 - Kuala Lumpur
Ero Carrera - ero.carrera@gmail.com
Reverse Engineer at zynamics GmbH
Chief Research Officer at VirusTotal

Introduction

An historical perspective
Originally meant to save space by reducing the redundancy in executable file formats.
Simply compressed parts or the whole of the executable.
Created a new "envelope" around it that restored the original executable and then passed control to it.
The decompressing envelope did little more than restore the executable.

Evolution of the techniques

Compression provided a trivial degree of obfuscation, but obfuscation nonetheless.
It was easy to add additional measures in the decompressing envelope.

Overview of the techniques

Destruction of informational components
Import address table: simple late reconstruction into its original form, or construction of new connectivity artifacts between the original code and imported modules
Strings

Anti-debug
Aimed at making tracing hard
Using SEHs triggered by hard-to-handle exceptions
Confusing debuggers by throwing the INTs they use
Calling hard-to-hook low-level APIs/syscalls
Checking for hooks

Anti-environment
VM detection: VMWare, VirtualPC, etc
Techniques aimed against specific tools: OllyDBG, Softice, IDA, etc

Breaking tools
Tricks detecting, confusing or aimed at crashing some of the most common tools: IDA, OllyDBG, Procdump, Softice

Anti-analysis
Code obfuscation: adding junk code, using opaque predicates
Code transformation: virtual machines
Flow obfuscation (SEH, Nanomites)

Tools
Bochs: provides a high-level view; no need to worry about most of the anti-* techniques
Windbg: can do kernel-mode debugging, hook syscalls, look deeper than user-mode tools
Memoryze: inspection of physical/virtual memory, for the real hardcore

Obfuscation & Anti-Analysis

Basic trickery against analysis algorithms
Most tools will linearly disassemble a chunk of code.
Introduce a non-terminal flow branching instruction (not a ret or jmp).
Make it point later in the code, to the middle of what would be an instruction if disassembling linearly.
Result => confusion (latest IDA, 5.3, has some workarounds against this).
Also: indirect obfuscation through heavy optimization.

Example: ASPack (original)
0101F001 60          pusha
0101F002 E803000000  call near ptr loc_101F007+1
0101F007 E9EB045D45  jmp  near ptr 465EF4F7h

Example: ASPack (fixed)
0101F001 60          pusha
0101F002 E803000000  call loc_101F008
0101F007 E9          db 0E9h
0101F008 EB04        jmp  short loc_101F00E

Example: Linear Disassembly
0DFE000000  or  eax, 0FEh
3D02000000  cmp eax, 2
E901000000  jmp 0x1
75B8        jnz short near ptr 0FFFFFFC9h
F3C001C0    rep rol byte ptr [ecx], 0C0h
C3          ret

Example: Linear Disassembly
0DFE000000  or  eax, 0FEh
3D02000000  cmp eax, 2
E901000000  jmp 0x1 // HERE
75          db 0x75
HERE:
B8F3C001C0  mov eax, 0xc001c0f3
C3          ret
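The difference between the two listings above can be reproduced with a toy disassembler. The following is a minimal sketch, not a real decoder: `decode` is a hypothetical helper that only understands the handful of opcodes in this byte sequence, and offsets are relative to the start of the buffer rather than real addresses. A linear sweep decodes the stray 0x75 byte as a `jnz`, while a recursive traversal follows the `jmp` and recovers the `mov`.

```python
import struct

# Bytes from the listing above: or / cmp / jmp +1 / stray 0x75 / mov / ret.
CODE = bytes.fromhex("0DFE000000" "3D02000000" "E901000000" "75" "B8F3C001C0" "C3")


def decode(code, off):
    """Decode one instruction at `off`.

    Returns (length, text, branch_target, falls_through).
    Toy decoder: it only knows the few opcodes that appear in CODE.
    """
    op = code[off]
    if op in (0x0D, 0x3D, 0xB8):                  # or / cmp / mov eax, imm32
        imm = struct.unpack_from("<I", code, off + 1)[0]
        name = {0x0D: "or", 0x3D: "cmp", 0xB8: "mov"}[op]
        return 5, f"{name} eax, {imm:#x}", None, True
    if op == 0xE9:                                # jmp rel32
        target = off + 5 + struct.unpack_from("<i", code, off + 1)[0]
        return 5, f"jmp {target:#x}", target, False
    if op == 0x75:                                # jnz rel8
        target = off + 2 + struct.unpack_from("<b", code, off + 1)[0]
        return 2, f"jnz {target:#x}", target, True
    if code[off:off + 3] == b"\xF3\xC0\x01":      # rep rol byte ptr [ecx], imm8
        return 4, f"rep rol byte ptr [ecx], {code[off + 3]:#x}", None, True
    if op == 0xC3:                                # ret
        return 1, "ret", None, False
    raise ValueError(f"unknown opcode {op:#x} at offset {off}")


def linear_sweep(code):
    """Decode instructions back to back, ignoring control flow."""
    off, out = 0, []
    while off < len(code):
        length, text, _, _ = decode(code, off)
        out.append((off, text))
        off += length
    return out


def recursive_traversal(code, entry=0):
    """Decode only what is reachable by following branches from `entry`."""
    seen, work, out = set(), [entry], []
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in seen:
            seen.add(off)
            length, text, target, falls_through = decode(code, off)
            out.append((off, text))
            if target is not None and 0 <= target < len(code):
                work.append(target)
            if not falls_through:
                break
            off += length
    return sorted(out)


if __name__ == "__main__":
    for name, result in (("linear", linear_sweep(CODE)),
                         ("recursive", recursive_traversal(CODE))):
        print(name)
        for off, text in result:
            print(f"  {off:02x}: {text}")
```

Real tools combine both strategies with further heuristics; the sketch only shows why a single misplaced byte is enough to desynchronize a purely linear pass.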

Example: Opaque Predicate
0DFE000000  and eax, 2
3D02000000  cmp eax, 2
7401        jz  0x01
75B8        jnz short near ptr 0FFFFFFC6h
F3C001C0    rep rol byte ptr [ecx], 0C0h
C3          ret

Example: Opaque Predicate
0DFE000000  and eax, 2
3D02000000  cmp eax, 2
7401        jz  HERE
75          db 0x75
HERE:
B8F3C001C0  mov eax, 0xc001c0f3
C3          ret
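The idea behind an opaque predicate can be sketched in a few lines. The example below uses a well-known textbook predicate (the product of two consecutive integers is always even), not the one from the listing above; the function names are made up for illustration. The condition looks data-dependent to a static tool, yet one branch is never taken, so the protector is free to fill that branch with junk or with bytes that desynchronize a disassembler.

```python
def opaque_true(x: int) -> bool:
    """Always True: x * (x + 1) is a product of consecutive integers, hence even."""
    return (x * (x + 1)) % 2 == 0


def protected(x: int) -> str:
    """A branch guarded by the opaque predicate; only one side ever runs."""
    if opaque_true(x):
        return "real code"
    return "junk"  # dead branch, present only to mislead analysis


if __name__ == "__main__":
    # Brute-force check over a small range, as an analyst might do before
    # deciding that the branch can be folded away.
    print(all(opaque_true(x) for x in range(-1000, 1000)))
```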

[Figure: an executable image laid out in memory pages. Each memory page holds function chunks, and a single function is made up of several chunks, each a run of address / instruction (operand, ...) lines.]

[Figure: Function A and Function B, each a sequence of address / instruction lines, with shared blocks used by both.]

Junk, polymorphic and static

Junk Code: pusha/popa, Non-Standard Branching, Junk JMP insertion

Example: Junk Code I (Themida)
018900CE 48          dec   eax
018900CF 60          pusha
018900D0 B9FBDEF000  mov   ecx, 0F0DEFBh
018900D5 50          push  eax
018900D6 9C          pushf
018900D7 E912000000  jmp   loc_18900EE
018900D7 [junk data]

Example: Junk Code II (Themida)
018900EE loc_18900EE:
018900EE E90E000000    jmp  loc_1890101
018900EE [junk data]
01890101 loc_1890101:
01890101 9D            popf
01890102 5E            pop  esi
01890103 61            popa
01890104 0F844E06FA7A  jz   loc_7C830758
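One simple way to read through this kind of junk JMP insertion is to follow the unconditional jumps and emit only the instructions that actually execute. The sketch below is an assumption-laden illustration: the `LISTING` dictionary is a hand-made transcription of the two Themida listings above (addresses, mnemonics and sizes), and `linearize` is a hypothetical helper, not part of any tool mentioned in the talk.

```python
# address -> (mnemonic, operands, size in bytes), transcribed from the listings.
# For "jmp" the operand is the numeric target so it can be followed.
LISTING = {
    0x018900CE: ("dec", "eax", 1),
    0x018900CF: ("pusha", "", 1),
    0x018900D0: ("mov", "ecx, 0F0DEFBh", 5),
    0x018900D5: ("push", "eax", 1),
    0x018900D6: ("pushf", "", 1),
    0x018900D7: ("jmp", 0x018900EE, 5),
    0x018900EE: ("jmp", 0x01890101, 5),
    0x01890101: ("popf", "", 1),
    0x01890102: ("pop", "esi", 1),
    0x01890103: ("popa", "", 1),
    0x01890104: ("jz", "loc_7C830758", 6),
}


def linearize(listing, start):
    """Walk the listing from `start`, following jmps and skipping junk data."""
    out, addr, seen = [], start, set()
    while addr in listing and addr not in seen:
        seen.add(addr)
        mnemonic, operands, size = listing[addr]
        if mnemonic == "jmp":
            addr = operands          # follow the jump, emit nothing
            continue
        out.append(f"{mnemonic} {operands}".strip())
        addr += size
    return out


if __name__ == "__main__":
    for line in linearize(LISTING, 0x018900CE):
        print(line)
```

With the jumps folded away, the pusha/pushf and popf/popa bracketing around the junk becomes visible as one straight-line block, which is the form a later clean-up pass would work on.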

Virtual Machines
Visual Basic, Java, Python, Perl, Ruby, .NET
Starforce, x86 Virtualizer, VMProtect, Themida/CodeVirtualizer
At a high-level it’s a: fetch, decode, handle algorithm

[Figure: standard code runs directly on the real CPU; virtualized code runs on a virtual CPU, which in turn runs on the real CPU.]

[Figure: virtual CPU. (1) Fetch: virtual CPU opcodes (opA reg1, reg2; opB reg2; branchA XYZ) are read at the instruction pointer. (2) Decode: the decoder looks up the operand in a table and calls the handler (handler for opA, opB, opC, branchA). (3) Execute handler: real CPU opcodes. (4) Update registers: general purpose, instruction pointer, stack pointer.]
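The fetch, decode, handle cycle can be sketched as a tiny interpreter. Everything here is invented for illustration: the opcode numbers, the fixed three-byte instruction format and the four virtual registers are assumptions, and real protectors such as those named above use far more elaborate (and randomized) instruction sets.

```python
# Hypothetical virtual opcodes; each instruction is 3 bytes: opcode, a, b.
LOAD, ADD, DEC, JNZ, HALT = 0x01, 0x02, 0x03, 0x04, 0xFF


def run(program):
    """Interpret `program` and return the virtual general purpose registers."""
    regs = [0, 0, 0, 0]   # virtual general purpose registers
    ip = 0                # virtual instruction pointer

    def h_load(a, b):     # regs[a] = immediate b
        regs[a] = b

    def h_add(a, b):      # regs[a] += regs[b]
        regs[a] = (regs[a] + regs[b]) & 0xFFFFFFFF

    def h_dec(a, b):      # regs[a] -= 1 (b unused)
        regs[a] = (regs[a] - 1) & 0xFFFFFFFF

    def h_jnz(a, b):      # branch to offset b if regs[a] != 0
        return b if regs[a] != 0 else None

    handlers = {LOAD: h_load, ADD: h_add, DEC: h_dec, JNZ: h_jnz}

    while True:
        opcode, a, b = program[ip:ip + 3]   # 1. fetch
        ip += 3
        if opcode == HALT:
            return regs
        handler = handlers[opcode]          # 2. decode: table lookup
        new_ip = handler(a, b)              # 3. execute handler (real CPU code)
        if new_ip is not None:              # 4. update the virtual registers
            ip = new_ip


# Multiply 3 by 5 with a loop: r0 accumulates, r1 counts down, r2 holds 3.
PROGRAM = bytes([
    LOAD, 0, 0,
    LOAD, 1, 5,
    LOAD, 2, 3,
    ADD, 0, 2,      # offset 9: loop body
    DEC, 1, 0,
    JNZ, 1, 9,
    HALT, 0, 0,
])

if __name__ == "__main__":
    print(run(PROGRAM))
```

From the analyst's side, the original x86 instructions no longer exist in the binary; only the bytecode and the handler table do, which is why recovering the handlers is the first step of most attacks on such protections.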

Virtual Machine Countermeasures
Rolf Rolles and Boris Lau have already shown that optimization/reduction techniques can help.
Translating to an intermediate representation and performing optimization on the code leads to reduced forms.
You could use a tool like Peter’s “Find executable code” to discover instruction handlers from a memory dump of the VM.

Advanced Packers
Some of the hardest current packers are VMProtect, Themida, Armadillo.
They incorporate some complex, custom techniques.
Usually commercial products protectors.

Armadillo

Armadillo
Double process debugging, debug blocker
Nanomites
Strategic Code Splicing
Armadillo's invalid instructions: LOCK prefix, Invalid MOV

[Figure: Nanomites. The parent process debugs the child process. The child's code (push ebp; mov ebp, esp; push 0; push 0; call XYZ; cmp eax, 0xff; INT 3; ...; mov [UWZ], 0; INT 3; ...; pop ebp; ret) contains INT 3 instructions. When the child hits an INT 3 the parent catches it, looks up the address, finds the target, sets the target in the child context and resumes the child, transferring control.]
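The parent's side of this scheme (catch the INT 3, look up the address, find the target, set it in the child context) can be sketched as a table lookup. This is a simulation under stated assumptions, not Armadillo's actual data layout: the `NANOMITES` table, its addresses and the `resolve` helper are all hypothetical, and the real table is stored encrypted. The only hard fact used is that ZF is bit 6 of EFLAGS.

```python
ZF = 0x40  # zero flag, bit 6 of EFLAGS

# How each removed branch decides, given the child's EFLAGS at the INT 3.
CONDITIONS = {
    "jmp": lambda eflags: True,
    "jz": lambda eflags: bool(eflags & ZF),
    "jnz": lambda eflags: not (eflags & ZF),
}

# Hypothetical table kept by the parent process:
# INT 3 address -> (original branch type, taken target, fall-through address)
NANOMITES = {
    0x00401010: ("jz", 0x00401040, 0x00401012),
    0x00401025: ("jmp", 0x00401000, 0x00401027),
}


def resolve(table, int3_address, eflags):
    """Return the address the child should resume at after this INT 3."""
    kind, taken, fall_through = table[int3_address]
    return taken if CONDITIONS[kind](eflags) else fall_through


if __name__ == "__main__":
    # Child stopped at 0x00401010 with ZF set: the original jz would be taken.
    print(hex(resolve(NANOMITES, 0x00401010, ZF)))
```

In a real debugger loop the returned address would be written into the child's instruction pointer before resuming it. Without the table, a dump of the child alone contains only INT 3 bytes where the branches used to be.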

Themida

Themida's API obfuscation
The general algorithm can be summarized as:
Retrieve the API's function body
Perform a basic analysis and disassembly
Reconstruct the API's function body inserting junk in between each of the real instructions
Re-assemble functionality: keep the semantics, change the syntax

[Figure: Standard Imports. The Executable references the Imported DLL.]

[Figure: A function references other code. The Executable references an Exported DLL Function in the Imported DLL, which in turn references an Internal DLL Function.]

[Figure: Some of the references are kept. The Themida protected executable holds an obfuscated copy of the DLL Function, which still references the Internal DLL Function inside the Imported DLL.]

Reconstruction
The algorithm has limitations.
References to other functions within the DLL are kept.
Same for true branches of conditional branches.
Those two points can allow us to do API discovery by studying their connectivity.
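Because the references into the DLL survive obfuscation, one plausible way to do this API discovery is to compare the set of addresses an obfuscated function still references with the set each genuine export references. The sketch below assumes those sets have already been extracted by a disassembler; the export names, the addresses and the choice of Jaccard similarity as the score are all illustrative assumptions, not the method used in the talk.

```python
def jaccard(a, b):
    """Similarity of two reference sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify(obfuscated_refs, export_refs):
    """Return (best matching export name, score) for an obfuscated function."""
    best = max(export_refs, key=lambda name: jaccard(obfuscated_refs, export_refs[name]))
    return best, jaccard(obfuscated_refs, export_refs[best])


# Hypothetical data: internal DLL addresses referenced by each real export.
EXPORT_REFS = {
    "ExportA": {0x7C801000, 0x7C801040},
    "ExportB": {0x7C801040, 0x7C802000, 0x7C802080},
    "ExportC": {0x7C803000},
}

# References that survived in one obfuscated copy inside the protected binary.
OBFUSCATED = {0x7C802000, 0x7C802080}

if __name__ == "__main__":
    name, score = identify(OBFUSCATED, EXPORT_REFS)
    print(name, round(score, 2))
```

Exports that reference nothing inside the DLL cannot be told apart this way, so in practice the score would be combined with other surviving features, such as the conditional branch targets mentioned above.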

Themida's obfuscation
Adds lots of branching and junk.
Keeps few "real" instructions per obfuscated block.
IDA can “easily” deal with the branching, although bogus calls break IDA analysis and lead to broken obfuscated functions.
Some scripting can make this look better.


Current state
Packing vs unpacking
Packing is not always a symmetric process; sometimes it can't be undone perfectly.
You won’t get the original process back.
Can it be done generically? In some cases the answer is "mostly" yes.
You will almost always be able to obtain code close to its original form.

Recent techniques
skape documented an elegant trick in Uninformed 10 a few weeks ago.
It attacks a basic heuristic used by most “generic” unpackers: tracking execution transfer to “dirty-memory”.
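The dirty-memory heuristic itself is easy to state in code. The sketch below is a simplified model, assuming the unpacker already has a trace of write and execute events keyed by virtual address; the event format and the `first_dirty_execution` name are invented for illustration. It flags the first instruction executed from a page that was written earlier, which is usually taken to be the unpacked code's entry point.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages


def first_dirty_execution(events):
    """Return the first executed address that lies in a previously written page.

    `events` is an iterable of (kind, virtual_address) with kind in
    {"write", "exec"}. Returns None if no such execution happens.
    """
    dirty_pages = set()
    for kind, address in events:
        page = address // PAGE_SIZE
        if kind == "write":
            dirty_pages.add(page)
        elif kind == "exec" and page in dirty_pages:
            return address
    return None


# Hypothetical trace: the stub at 0x401000 writes to 0x405000, then jumps there.
TRACE = [
    ("exec", 0x00401000),
    ("write", 0x00405010),
    ("write", 0x00405014),
    ("exec", 0x00401020),
    ("exec", 0x00405010),
]

if __name__ == "__main__":
    print(hex(first_dirty_execution(TRACE)))
```

Note that the bookkeeping is done purely on virtual addresses. If the same physical memory is reachable through a second virtual range, writes through one range and execution through the other never meet in `dirty_pages`.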

[Figure: timeline of WRITE operations to Virtual Address Range A over TIME.]

[Figure: two views of Virtual Address Range A mapped by the MMU onto the same Physical Memory.]

[Figure: the same dual mapping, with the WRITE going through one virtual range and the EXECUTE through the other.]

Countermeasures
Windbg can see the mappings from virtual to physical.
Not too hard to spot double-mapped regions.
Bochs and other low-level emulators can easily do it as well.
Requires kernel-mode access or “higher”.
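Once the virtual-to-physical mappings are available, spotting double-mapped regions amounts to grouping virtual pages by the physical frame behind them. The sketch below assumes the mappings have already been dumped into a plain dictionary (for example from a kernel debugger or an emulator); the page numbers are made up and `double_mapped` is a hypothetical helper.

```python
from collections import defaultdict


def double_mapped(page_table):
    """Return {physical frame: [virtual pages]} for frames mapped more than once.

    `page_table` maps virtual page number -> physical frame number.
    """
    by_frame = defaultdict(list)
    for virtual_page, frame in page_table.items():
        by_frame[frame].append(virtual_page)
    return {frame: sorted(pages) for frame, pages in by_frame.items() if len(pages) > 1}


# Hypothetical dump: virtual pages 0x401 and 0x7F0 share physical frame 0x1A1.
PAGE_TABLE = {
    0x400: 0x1A0,
    0x401: 0x1A1,
    0x7F0: 0x1A1,
    0x402: 0x1B3,
}

if __name__ == "__main__":
    for frame, pages in double_mapped(PAGE_TABLE).items():
        print(hex(frame), [hex(p) for p in pages])
```

A legitimate process also shares frames in some situations (mapped files, shared sections), so a hit is a lead to investigate rather than proof of an evasion attempt. An unpacker could equally track dirtiness per physical frame instead of per virtual page.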

References
Reversing: Secrets of Reverse Engineering, Eldad Eilam
Déprotection semi-automatique de binaire, Yoann Guillot & Alexandre Gazet http://metasm.cr0.org/SSTIC08-article-Guillot_GazetDeprotection_Semi_Automatique_Binaire.pdf
Virtual Machine Threats, Peter Ferrie http://www.symantec.com/avcenter/reference/Virtual_Machine_Threats.pdf

References II
A Quick Survey on Automatic Unpacking Techniques, Daniel Reynaud http://indefinitestudies.wordpress.com/2008/09/25/automatic-unpacking/
Using dual-mappings to evade automated unpackers, skape http://www.uninformed.org/?v=10&a=1
Dealing with Virtualization packer, Boris Lau http://www.datasecurity-event.com/uploads/boris_lau_virtualization_obfs.pdf

References III
Rolf Rolles blog in OpenRCE https://www.openrce.org/blog/browse/RolfRolles
Oreans Themida/CodeVirtualizer http://www.oreans.com
ReWolf's x86 Virtualizer http://www.openrce.org/blog/view/847/x86_Virtualizer_-_source_code

References IV
VMProtect http://www.vmprotect.ru/
Deroko's Nanomite's write up http://www.phearless.org/i3/Nanomites_And_Misc_Stuff.txt
Memoryze, Mandiant http://www.mandiant.com/software/memoryze.htm

Thanks Dhillon & the HitB crew!
Q&A