Copyright F-Secure 2010. All rights reserved. Protecting the irreplaceable | f-secure.com

Reverse Engineering II: The Basics

This document is only to be distributed to teachers and students of the Malware Analysis and Antivirus Technologies course and should only be used in accordance with the course guidelines.


Agenda

• Very basics

• Intel x86 crash course

• Basics of C reversing


Binary Numbers

1 0 1 1                               = 0xB     - Nibble (4 bits)
1 0 1 1  1 1 0 1                      = 0xBD    - Byte (8 bits)
1 0 1 1  1 1 0 1  0 0 1 1  1 0 0 1    = 0xBD39  - Word (16 bits)


Byte Order a.k.a. Endianness

Offset: 00 01
Bytes:  12 34   = 0x3412 (Little Endian)
                = 0x1234 (Big Endian)

Offset: 00 01
Bytes:  34 12   = 0x1234 (Little Endian)
                = 0x3412 (Big Endian)


Little Endian Dword

Offset: 00 01 02 03
Bytes:  12 34 56 78   = 0x78563412

Offset: 00 01 02 03
Bytes:  78 56 34 12   = 0x12345678


Endianness Matters

• Data exchange between computers

• Networking protocols

• File formats for disk storage
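The byte-order rules above come up whenever a tool parses a file or a network packet. The following is a minimal sketch in C, not part of the course material: the helper names (read_le16, read_be16, read_le32, read_be32) are hypothetical. It assembles each value from individual bytes, so the result does not depend on the endianness of the machine running the code.

```c
#include <stdint.h>

/* Read a 16-bit value whose least significant byte is stored first. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 16-bit value whose most significant byte is stored first. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 32-bit little-endian value (the x86 in-memory layout). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit big-endian value (the usual network byte order). */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

With the bytes 12 34 56 78 from the slides, read_le32 gives 0x78563412 and read_be32 gives 0x12345678.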


System Endianness

Little Endian        Big Endian           Switchable Endianness
Intel x86            PowerPC (exc. G5)    ARM
Intel 8051           Sparc (exc. v9)      Alpha
Most uControllers    System/370           Intel IA64


ASCII Code

Range          Contents                          Examples
0x00 - 0x1F    Control Characters                Backspace, Line feed
0x20 - 0x3F    Digits and Punctuation            0-9 <> = .,: *-()!
0x40 - 0x5F    Upper-case Letters and Special    ABCD... @[]\^_
0x60 - 0x7E    Lower-case Letters and Special    abcd... `{}|~


ASCII Example

H  e  l  l  o     1  2  3  4
48 65 6C 6C 6F 20 31 32 33 34

http://en.wikipedia.org/wiki/ASCII


Unicode Strings

ff fe  48 00  65 00  6c 00  6c 00  6f 00
BOM    H      e      l      l      o

UTF-16 / UCS-2

http://en.wikipedia.org/wiki/UTF-16/UCS-2

http://en.wikipedia.org/wiki/Category:Unicode
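When scanning a binary for strings, the byte order mark tells you how to read the 16-bit code units that follow. The sketch below is an illustration only: the names detect_utf16_bom and utf16le_to_ascii are hypothetical, and the decoder deliberately handles just the plain-ASCII subset shown on the slide.

```c
#include <stddef.h>
#include <stdint.h>

enum bom { BOM_NONE = 0, BOM_UTF16_LE, BOM_UTF16_BE };

/* U+FEFF is stored as FF FE in little endian and FE FF in big endian. */
enum bom detect_utf16_bom(const uint8_t *buf, size_t len)
{
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) return BOM_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) return BOM_UTF16_BE;
    return BOM_NONE;
}

/* Convert BOM-prefixed UTF-16LE text to a C string.
 * Returns the number of characters written, or -1 if the input has no
 * little-endian BOM, contains a non-ASCII code unit, or does not fit. */
int utf16le_to_ascii(const uint8_t *buf, size_t len, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0 || detect_utf16_bom(buf, len) != BOM_UTF16_LE)
        return -1;

    for (size_t i = 2; i + 1 < len; i += 2) {
        if (buf[i + 1] != 0x00 || buf[i] >= 0x80)
            return -1;              /* high byte set: not plain ASCII */
        if (n + 1 >= out_size)
            return -1;              /* no room for character plus NUL */
        out[n++] = (char)buf[i];
    }
    out[n] = '\0';
    return (int)n;
}
```

Fed the twelve bytes from the slide, the decoder returns 5 and writes "Hello".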


String Storage

• ASCIIZ: Zero-terminated ASCII

• Pascal: Size byte + ASCII string

• Delphi: Size Dword + ASCII or Unicode string

H e l l o

ASCIIZ: 48 65 6C 6C 6F 00

Pascal: 05 48 65 6C 6C 6F
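The practical difference between the two layouts is where the length comes from. A small sketch, using hypothetical helper names (asciiz_length, pascal_length, pascal_chars): an ASCIIZ string has to be scanned for its terminator, while a Pascal string carries its length in the first byte.

```c
#include <stddef.h>
#include <stdint.h>

/* ASCIIZ: count bytes until the terminating zero byte. */
size_t asciiz_length(const uint8_t *s)
{
    size_t n = 0;
    while (s[n] != 0x00)
        n++;
    return n;
}

/* Pascal: the first byte is the length (so at most 255 characters). */
size_t pascal_length(const uint8_t *s)
{
    return s[0];
}

/* Pascal: the characters start right after the size byte. */
const uint8_t *pascal_chars(const uint8_t *s)
{
    return s + 1;
}
```

For the two byte sequences on the slide both length functions return 5, and pascal_chars points at the byte 0x48 ('H').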


Intel x86 Architecture

[Image. Image Copyright © 2004 GNU]


Introduction to Intel x86

• Started with 8086 in 1978

• Continued with 8088, 80186, 80286, 386, 486, Pentium, 686 ...

• CISC architecture

• 32-bit is called x86-32 or IA-32

• 64-bit is called x86-64, AMD64, EM64T

• 80386 introduced in 1985

• Has a 32-bit word length

• Has eight general-purpose registers

• Supports paging and virtual memory

• Addresses up to 4GiB of memory


Data Register Layout

[Figure: data register layout. Image Copyright © 1997-2008 Intel Corporation]


Data Registers

Registers             Name             Typical use
AL / AH / AX / EAX    Accumulator      Arithmetic operations
BL / BH / BX / EBX    Data index       General data storage, index
CL / CH / CX / ECX    Loop counter     Loop constructs
DL / DH / DX / EDX    Data register    Arithmetic
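The 8-bit and 16-bit names are views onto the same 32-bit register, not separate storage. The sketch below models that relationship with shifts and masks; the function names (reg_ax, reg_al, reg_ah, set_al) are hypothetical and only illustrate which bits each name covers.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}

/* AL is the low byte of AX. */
uint8_t reg_al(uint32_t eax)
{
    return (uint8_t)(eax & 0xFFu);
}

/* AH is the high byte of AX (bits 8..15 of EAX). */
uint8_t reg_ah(uint32_t eax)
{
    return (uint8_t)((eax >> 8) & 0xFFu);
}

/* Writing AL replaces the low byte and leaves the upper 24 bits alone. */
uint32_t set_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}
```

With EAX = 0x12345678, AX is 0x5678, AH is 0x56 and AL is 0x78. The same pattern applies to EBX, ECX and EDX.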


Address Registers

IP / EIP    Instruction Pointer    Program execution
SP / ESP    Stack Pointer          Stack operation
BP / EBP    Base Pointer           Stack frame
SI / ESI    Source Index           String operation
DI / EDI    Destination Index      String operation


Segment Registers

CS              Code Segment      Program code
DS              Data Segment      Program data
ES / FS / GS    Other Segments    Other uses


EFLAGS Register

[Figure: EFLAGS register layout. Image Copyright © 1997-2008 Intel Corporation]


Mnemonic Examples

MOV EAX, 1    Move 1 into EAX
ADD EDX, 5    Add 5 to EDX
SUB EBX, 2    Subtract 2 from EBX
AND ECX, 0    Bit-wise AND ECX with 0 (clears ECX)
XOR EDX, 4    Bit-wise eXclusive OR EDX with 4
SHL ECX, 6    Shift ECX left by 6 bits
ROR EBX, 3    Bit-wise rotate EBX right by 3 bits
INC ECX       Increment ECX by 1
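Most of these instructions map directly onto C operators (+, -, &, ^, <<, ++). Rotation is the exception because C has no rotate operator, so compilers and decompilers express it as a pair of shifts. A hedged sketch, with hypothetical helper names ror32 and shl32:

```c
#include <stdint.h>

/* ROR: bits shifted out on the right re-enter on the left.
 * The count is masked to 0..31, as the CPU does for 32-bit operands. */
uint32_t ror32(uint32_t value, unsigned count)
{
    count &= 31u;
    if (count == 0)
        return value;               /* avoid an undefined shift by 32 */
    return (value >> count) | (value << (32u - count));
}

/* SHL: bits shifted out on the left are lost, zeros enter on the right. */
uint32_t shl32(uint32_t value, unsigned count)
{
    return value << (count & 31u);
}
```

For example, ror32(1, 3) moves the lowest bit around to bit 29 and yields 0x20000000, while shl32(1, 6) yields 64.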


More Mnemonics

JNZ label     Jump if not zero (same as JNE, jump if not equal)
JMP label     Unconditional jump to label
CALL func     Call function
RET           Return from function
LOOP label    Decrement ECX, jump to label if ECX is not zero
PUSH EAX      Push EAX onto the stack
POP EDI       Pop the top of the stack into EDI
LODSB         Load byte from DS:ESI into AL and step ESI
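To see how LODSB and LOOP combine, here is a rough C model of a loop that reads ECX bytes starting at ESI and adds them up. This is a sketch under assumptions: the function name sum_bytes is hypothetical, the direction flag is assumed clear (ESI counts upward), and the zero-count guard is added for safety and is not something the CPU does.

```c
#include <stdint.h>

uint32_t sum_bytes(const uint8_t *esi, uint32_t ecx)
{
    uint32_t sum = 0;

    /* Guard added for this sketch: a real LOOP entered with ECX = 0
     * decrements to 0xFFFFFFFF first and keeps iterating. */
    if (ecx == 0)
        return 0;

    do {
        uint8_t al = *esi++;        /* LODSB: AL = [ESI], then ESI = ESI + 1 */
        sum += al;                  /* ADD with the zero-extended byte       */
    } while (--ecx != 0);           /* LOOP: ECX--, jump back if ECX != 0    */

    return sum;
}
```

Called with the bytes 01 02 03 and a count of 3, the model returns 6.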


Reversing C code


Basic Data Types

• char - 1 byte

• short - 2 bytes

• int - 4 bytes (platform word)

• long - 4 bytes (on 32-bit x86 and 64-bit Windows; 8 bytes on most 64-bit Unix-like systems)

• float - 4 bytes floating point

• double - 8 bytes floating point


Pointers and Arrays

• Pointers can point to any memory location

• One-dimensional arrays are flat memory

• Multi-dimensional arrays are flat memory too (row-major); arrays of pointers add a level of indirection

a[0] a[1] a[2] a[3]

char a[4];
char *b, c;

c = a[2];
b = a;
c = *(b+2);
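The two assignments to c on the slide read the same byte: indexing is pointer arithmetic in disguise. A minimal sketch with hypothetical function names (third_by_index, third_by_pointer) that isolates each form:

```c
/* a[2]: the compiler computes "base address + 2 * sizeof(char)". */
char third_by_index(const char a[4])
{
    return a[2];
}

/* *(b + 2): the same address computation, written out explicitly. */
char third_by_pointer(const char *b)
{
    return *(b + 2);
}
```

Both functions typically compile to the same load instruction, which is why a disassembly alone cannot tell you whether the source used array or pointer syntax.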


Composite Types: Structure

• Memory is allocated for all members

• Members are accessible separately

struct {
    unsigned int id;
    unsigned short age;
    char name[16];
} record;


Alignment

• Structure members are aligned to their natural boundary (at most the platform word size); padding bytes are inserted where needed

• #pragma pack(n) directive can change it

• #pragma pack(1) removes alignment

• Important when reconstructing structures


Structure Storage

Aligned: sizeof(record) = 24
    long id;          4 bytes
    short age;        2 bytes
    char name[16];    16 bytes
    2-byte padding

Packed: sizeof(record) = 22
    long id;          4 bytes
    short age;        2 bytes
    char name[16];    16 bytes
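The two sizes can be checked directly with sizeof and offsetof. The sketch below is an illustration, not the course's code: the struct names are hypothetical, fixed-width types replace long and short so the member sizes match the slide on any platform, and #pragma pack(push, 1) / #pragma pack(pop) is the form accepted by MSVC, GCC and Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may add padding. */
struct record_aligned {
    uint32_t id;        /* 4 bytes */
    uint16_t age;       /* 2 bytes */
    char     name[16];  /* 16 bytes */
};

/* Packed layout: no padding at all. */
#pragma pack(push, 1)
struct record_packed {
    uint32_t id;
    uint16_t age;
    char     name[16];
};
#pragma pack(pop)

/* Number of padding bytes the compiler added to the default layout. */
size_t padding_bytes(void)
{
    return sizeof(struct record_aligned) - sizeof(struct record_packed);
}
```

On common x86 compilers this gives sizes of 24 and 22. Because a char array needs no alignment, the two padding bytes there land after name, rounding the structure up to a multiple of 4. When reconstructing a structure from a binary, comparing observed field offsets against both layouts is a quick way to tell whether packing was used.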


Composite Types: Union

• Memory is allocated for the largest member

• Holds only one member at a time

union foo {
    int one;
    char two;
};
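Both properties of a union can be observed from C. In this sketch the helper names foo_size and members_overlap are hypothetical; the union itself is the one from the slide.

```c
#include <stddef.h>

union foo {
    int  one;
    char two;
};

/* The union is as large as its largest member. */
size_t foo_size(void)
{
    return sizeof(union foo);
}

/* Every member starts at the same address, so they share storage. */
int members_overlap(void)
{
    union foo u;
    return (void *)&u.one == (void *)&u.two;
}
```

In a disassembly a union shows up as the same stack or memory offset being accessed with different operand sizes, for example as a dword in one place and as a byte in another.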


Control Structures

• Conditional Branch

• Iteration

• Switch-Case

• Goto label


Conditional Branch: if

int example_if()
{
    int foo = 0;

    if (foo)
    {
        do_one_thing();
    }
    else
    {
        do_another();
    }
}

var_C = dword ptr -0Ch

    push    ebp
    mov     ebp, esp
    sub     esp, 18h
    mov     [ebp+var_C], 0
    cmp     [ebp+var_C], 0
    jz      short loc_1F27
    call    _do_one_thing
    jmp     short locret_1F2C
loc_1F27:
    call    _do_another
locret_1F2C:
    leave
    retn


Iteration: for

int example_for()
{
    int i;

    for (i=0; i<10; i++)
    {
        if (check_something(i))
            break;
    }
}

    push    ebp
    mov     ebp, esp
    sub     esp, 28h
    mov     [ebp+var_C], 0
    jmp     short loc_1F51
loc_1F3D:
    mov     eax, [ebp+var_C]
    mov     [esp], eax
    call    _check_something
    test    eax, eax
    jnz     short locret_1F57
    lea     eax, [ebp+var_C]
    inc     dword ptr [eax]
loc_1F51:
    cmp     [ebp+var_C], 9
    jle     short loc_1F3D
locret_1F57:
    leave
    retn
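A useful intermediate step when reversing a loop is to transliterate the listing into C line by line, keeping the labels as goto targets, and only then fold it back into a for statement. The sketch below does that for the listing above. The names for_loop_as_listed, never_stop and stop_at_three are hypothetical, the callback parameter stands in for _check_something, and returning the counter is an assumption made so the behaviour can be observed (the original leaves the return value unset).

```c
/* Hypothetical stand-ins for _check_something. */
int never_stop(int i)    { (void)i; return 0; }
int stop_at_three(int i) { return i == 3; }

int for_loop_as_listed(int (*check_something)(int))
{
    int var_C;                          /* local at [ebp+var_C]          */

    var_C = 0;                          /* mov [ebp+var_C], 0            */
    goto loc_1F51;                      /* jmp short loc_1F51            */

loc_1F3D:
    if (check_something(var_C) != 0)    /* call; test eax, eax; jnz      */
        goto locret_1F57;
    var_C++;                            /* lea eax, [..]; inc dword ptr  */

loc_1F51:
    if (var_C <= 9)                     /* cmp [ebp+var_C], 9; jle       */
        goto loc_1F3D;

locret_1F57:
    return var_C;                       /* assumption: expose the counter */
}
```

Note how the compiler placed the condition at the bottom and turned i < 10 into a "less than or equal to 9" comparison; both are common patterns worth recognising.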


Iteration: while

int example_while()
{
    int i = 0;

    while (i < 100)
    {
        if (check_something(i))
            break;
    }
}

    push    ebp
    mov     ebp, esp
    sub     esp, 28h
    mov     [ebp+var_C], 0
    jmp     short loc_1F77
loc_1F68:
    mov     eax, [ebp+var_C]
    mov     [esp], eax
    call    _check_something
    test    eax, eax
    jnz     short locret_1F7D
loc_1F77:
    cmp     [ebp+var_C], 64h
    jl      short loc_1F68
locret_1F7D:
    leave
    retn


Branching: Switch-Case

int example_switch()
{
    int i = 1;

    switch (i)
    {
    case 0:
        do_one_thing();
        break;
    case 1:
        do_another();
        break;
    default:
        check_something(i);
    }
}

    push    ebp
    mov     ebp, esp
    sub     esp, 38h
    mov     [ebp+var_C], 1
    mov     eax, [ebp+var_C]
    mov     [ebp+var_1C], eax
    cmp     [ebp+var_1C], 0
    jz      short loc_1FAB
    cmp     [ebp+var_1C], 1
    jz      short loc_1FB2
    mov     eax, [ebp+var_C]
    mov     [esp], eax
    call    _check_something
    jmp     short locret_1FB9
loc_1FAB:
    call    _do_one_thing
    jmp     short locret_1FB9
loc_1FB2:
    call    _do_another
    jmp     short $+2
locret_1FB9:
    leave
    retn


Branching: Goto

int example_goto(void)
{
    open_files();

    if (do_one_thing())
        goto error;

    if (do_another())
        goto error;

    close_files();
    return 1;

error:
    close_files();
    return 0;
}

    push    ebp
    mov     ebp, esp
    sub     esp, 18h
    call    _open_files
    call    _do_one_thing
    test    eax, eax
    jnz     short loc_1FE6
    call    _do_another
    test    eax, eax
    jnz     short loc_1FE6
    call    _close_files
    mov     [ebp+var_C], 1
    jmp     short loc_1FF2
loc_1FE6:
    call    _close_files
    mov     [ebp+var_C], 0
loc_1FF2:
    mov     eax, [ebp+var_C]
    leave
    retn


Function Calling Conventions

• Common calling conventions:

    • __stdcall - Standard calls on Windows
    • __cdecl - Most common C calling convention
    • __fastcall - Uses registers for arguments
    • __thiscall - Pass ‘this’ pointer in ECX in C++

• Most important: Who is going to clean the stack?

• Mixing them between caller and callee leaves the stack unbalanced and will usually crash the program


Simple C Program

int foobar(int x, int y)
{
    int z;

    return x;
}

int main(void)
{
    int z = foobar(1, 2);
}


__cdecl Calls

Caller:
    PUSH arg2
    PUSH arg1
    CALL function
    ADD  ESP, 8

Callee:
    PUSH EBP
    MOV  EBP, ESP
    SUB  ESP, 4
    MOV  EAX, [EBP+8]
    MOV  ESP, EBP
    POP  EBP
    RET

Stack (highest address first):
    ARG2         arg2: EBP+12
    ARG1         arg1: EBP+8
    RET Addr.
    Saved EBP
    LOC1         loc1: EBP-4


__stdcall Calls

Caller:
    PUSH arg2
    PUSH arg1
    CALL function

Callee:
    PUSH EBP
    MOV  EBP, ESP
    SUB  ESP, 4
    MOV  EAX, [EBP+8]
    MOV  ESP, EBP
    POP  EBP
    RETN 8

Stack (highest address first):
    ARG2         arg2: EBP+12
    ARG1         arg1: EBP+8
    RET Addr.
    Saved EBP
    LOC1         loc1: EBP-4
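The only difference between the two sequences is who removes the arguments: the caller (ADD ESP, 8) or the callee (RETN 8). The following toy model in C replays both on a simulated stack. Everything in it is an assumption made for illustration: the struct cpu type, the helper names, the 16-word stack and the placeholder return address are not from the slides, and real code of course runs on the hardware stack.

```c
#include <stdint.h>

#define STACK_WORDS 16

struct cpu {
    uint32_t stack[STACK_WORDS];    /* simulated stack memory            */
    uint32_t esp, ebp, eax;         /* ESP and EBP hold byte offsets     */
};

void cpu_init(struct cpu *c)
{
    c->esp = STACK_WORDS * 4;       /* empty stack: ESP at the top       */
    c->ebp = 0;
    c->eax = 0;
}

static void push(struct cpu *c, uint32_t v)
{
    c->esp -= 4;
    c->stack[c->esp / 4] = v;
}

static uint32_t pop(struct cpu *c)
{
    uint32_t v = c->stack[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Callee body from the slides; ret_pop is the operand of RETN. */
static void callee(struct cpu *c, uint32_t ret_pop)
{
    push(c, c->ebp);                        /* PUSH EBP                  */
    c->ebp = c->esp;                        /* MOV EBP, ESP              */
    c->esp -= 4;                            /* SUB ESP, 4 (loc1)         */
    c->eax = c->stack[(c->ebp + 8) / 4];    /* MOV EAX, [EBP+8] (arg1)   */
    c->esp = c->ebp;                        /* MOV ESP, EBP              */
    c->ebp = pop(c);                        /* POP EBP                   */
    (void)pop(c);                           /* RET: pop return address   */
    c->esp += ret_pop;                      /* RETN n: also drop n bytes */
}

uint32_t call_cdecl(struct cpu *c, uint32_t arg1, uint32_t arg2)
{
    push(c, arg2);                          /* PUSH arg2                 */
    push(c, arg1);                          /* PUSH arg1                 */
    push(c, 0xDEADBEEFu);                   /* CALL: placeholder address */
    callee(c, 0);                           /* callee ends with RET      */
    c->esp += 8;                            /* ADD ESP, 8: caller cleans */
    return c->eax;
}

uint32_t call_stdcall(struct cpu *c, uint32_t arg1, uint32_t arg2)
{
    push(c, arg2);
    push(c, arg1);
    push(c, 0xDEADBEEFu);
    callee(c, 8);                           /* RETN 8: callee cleans     */
    return c->eax;
}
```

In both cases ESP ends up where it started and EAX holds arg1. If a __cdecl caller invoked the __stdcall callee, the arguments would be removed twice and ESP would end up 8 bytes too high, which is the stack imbalance that makes mixed conventions crash.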


Reading

C Programming Information:
http://www.cprogramming.com/

Intel x86 Function-call Conventions:
http://www.unixwiz.net/techtips/win32-callconv-asm.html