DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis. Lok Kwong Yan and Heng Yin, Syracuse University, Air Force Research Laboratory. USENIX 2012. Presentation: 2012-09-11, 曾毓傑


Page 1

DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis
Lok Kwong Yan and Heng Yin
Syracuse University
Air Force Research Laboratory

USENIX 2012

Presentation: 2012-09-11 曾毓傑

Page 2

Outline
• Introduction
• Background
• Architecture
• Interface & Plugins
• Evaluation
• Discussion & Conclusion

Page 3

INTRODUCTION

Page 4

Introduction
• Malicious applications exist in both official and unofficial marketplaces, at rates of 0.02% and 0.2% respectively
• Virtualization-based analysis approach
  • The analysis runs underneath the entire virtual machine
  • It is difficult for an attack within the VM to disrupt the analysis
  • The semantic contextual information is lost when the analysis component is moved out of the box
• We need to intercept certain kernel events and parse kernel data structures to reconstruct the semantic knowledge

Page 5

DroidScope
• Reconstructs two levels of semantic knowledge
  • OS-level: to understand the activities of the malware process and its native components
  • Java-level: to comprehend the behaviors of the Java components
• Built on top of the QEMU emulator
• Analysis tools built on it
  • Native instruction tracer
  • Dalvik instruction tracer
  • API tracer
  • Taint tracker

Page 6

BACKGROUND

Page 7

Android System Overview

[Figure: Android system. Annotations: the parent process for all Android processes; libdvm.so provides the Java-level abstraction; kernel data structures]

Page 8

DroidScope Overview

Page 9

ARCHITECTURE

Page 10

Architecture
• The changes are integrated into the QEMU emulator
  • The emulator comes from the Android SDK
  • The Android system itself is left unchanged
  • Different virtual devices can therefore be loaded
• Reconstruct OS-level and Java-level views
  • Monitors how the malware's Java components communicate with the Android Java framework
  • Monitors how the malware's native components interact with the Linux kernel
  • Monitors how the malware's Java components and native components communicate through the JNI interface

Page 11

Reconstructing OS-level View
• Basic Instrumentation
  • Insert extra instructions during the code translation phase to observe system status

[Figure: target instructions are translated by the Tiny Code Generator (TCG) into native instructions; additional code is added for detection]

Page 12

Reconstructing OS-level View (Cont.)
• For example, a context switch on the ARM architecture changes the c2_base0 and c2_base1 registers, which store the page table address
• Extract semantic knowledge
  • System calls
  • Running processes and threads
  • Memory maps
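The context-switch example above can be sketched in a few lines. This is a minimal illustration, not DroidScope's implementation: the class name, the `on_base_register_write` hook, and the listener signature are all hypothetical, and the sketch simply assumes that some instrumentation calls the hook each time the guest writes the page table base register.

```python
from typing import Callable, List, Optional


class ContextSwitchMonitor:
    """Reports a context switch whenever the page table base value changes."""

    def __init__(self) -> None:
        self.current_base: Optional[int] = None
        # Each listener receives (old_base, new_base).
        self.listeners: List[Callable[[Optional[int], int], None]] = []

    def on_base_register_write(self, new_base: int) -> bool:
        """Hypothetical hook, called on every write to c2_base0.

        Returns True when the write switches to a different address space.
        """
        if new_base == self.current_base:
            return False  # same page table, so no switch happened
        old_base = self.current_base
        self.current_base = new_base
        for listener in self.listeners:
            listener(old_base, new_base)
        return True


monitor = ContextSwitchMonitor()
switches = []
monitor.listeners.append(lambda old, new: switches.append((old, new)))
for value in (0x4000, 0x4000, 0x8000):
    monitor.on_base_register_write(value)
print(switches)
```

Comparing against the previous value keeps repeated writes of the same base address from being reported as switches.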

Page 13

Reconstructing OS-level View (Cont.)
• System calls
  • The ARM architecture uses the supervisor call instruction svc #0 to make system calls; the system call number is in register R7
• Processes and threads
  • Read the task_struct structure for process information
  • pid, tgid, pgd, uid, gid, euid, egid, comm, cmdline, thread_info
  • The sys_fork, sys_execve, sys_clone, and sys_prctl system calls trigger the information update
• Memory maps
  • mm_struct
  • sys_mmap2 triggers the information update
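A rough sketch of the system call handling described above, under stated assumptions: only the ARM-mode encoding of `svc #0` is recognized (Thumb is ignored), the syscall table is a small illustrative subset of the ARM EABI numbers, and the function name and the returned action strings are hypothetical.

```python
from typing import Dict, Optional, Tuple

SVC_0_ARM = 0xEF000000  # ARM-mode encoding of `svc #0` with condition AL

# Illustrative subset of the ARM EABI Linux system call numbers.
SYSCALL_NAMES = {
    2: "sys_fork",
    11: "sys_execve",
    120: "sys_clone",
    172: "sys_prctl",
    192: "sys_mmap2",
}

# System calls that should refresh each reconstructed view.
PROCESS_UPDATERS = {"sys_fork", "sys_execve", "sys_clone", "sys_prctl"}
MEMORY_MAP_UPDATERS = {"sys_mmap2"}


def classify_syscall(insn: int, regs: Dict[str, int]) -> Optional[Tuple[str, str]]:
    """Return (syscall name, view to refresh), or None if insn is not svc #0."""
    if insn != SVC_0_ARM:
        return None
    number = regs["r7"]  # the system call number is passed in R7
    name = SYSCALL_NAMES.get(number, f"sys_{number}")
    if name in PROCESS_UPDATERS:
        action = "update_processes"
    elif name in MEMORY_MAP_UPDATERS:
        action = "update_memory_maps"
    else:
        action = "none"
    return name, action


print(classify_syscall(0xEF000000, {"r7": 192}))
```

A real tool would re-read task_struct or mm_struct from guest memory when the action is not "none"; that step depends on kernel-specific structure offsets and is left out here.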

Page 14

Reconstructing Java-level View
• Dalvik Instructions
  • Knowing which instruction is executing right now
  • Register R15 (the native program counter) points into the mterp code that handles the currently executing Dalvik instruction
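One way to read the R15 observation: if the interpreter's opcode handlers sit at fixed-size slots after a base address, the native program counter alone identifies the Dalvik opcode. The sketch below assumes 64-byte handler slots and 256 opcodes, which matches the classic mterp layout as far as I know but should be treated as an assumption; the function names and the tiny opcode table are illustrative.

```python
from typing import Optional

HANDLER_SIZE = 64   # assumption: each mterp opcode handler occupies a 64-byte slot
NUM_OPCODES = 256   # assumption: one slot per possible opcode byte

# Small illustrative subset of Dalvik opcode names.
OPCODE_NAMES = {0x00: "nop", 0x01: "move", 0x0E: "return-void", 0x6E: "invoke-virtual"}


def dalvik_opcode_at(pc: int, ibase: int) -> Optional[int]:
    """Map a native PC (R15) to a Dalvik opcode, given the mterp base address."""
    offset = pc - ibase
    if offset < 0 or offset >= HANDLER_SIZE * NUM_OPCODES:
        return None  # the PC is outside the interpreter's handler region
    return offset // HANDLER_SIZE


def is_handler_entry(pc: int, ibase: int) -> bool:
    """True only at the first instruction of a handler, so each bytecode is seen once."""
    return dalvik_opcode_at(pc, ibase) is not None and (pc - ibase) % HANDLER_SIZE == 0


base = 0x1000
pc = base + 0x6E * HANDLER_SIZE
print(OPCODE_NAMES[dalvik_opcode_at(pc, base)], is_handler_entry(pc, base))
```

Checking for the handler entry avoids reporting the same Dalvik instruction once per native instruction executed inside its handler.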

Page 15

Reconstructing Java-level View (Cont.)
• Just-In-Time Compiler
  • Hot, heavily used instructions are compiled into native machine code
  • Execution of that compiled code skips the mterp component

[Figure annotations: dvmGetCodeAddr() is called for the address of the compiled code; to disable the JIT, flush the JIT cache, return NULL, and reset the counter]

Page 16

Reconstructing Java-level View (Cont.)
• Dalvik Virtual Machine States
  • Record registers R4 to R8, which store the DVM state
    R4: Program Counter
    R5: Stack Frame Pointer
    R6: InterpState Structure
    R7: Current Instruction (first 16-bit code unit, including the opcode)
    R8: mterp Base Address

Page 17

Reconstructing Java-level View (Cont.)
• Java Objects
  • Obtaining data inside Java objects, such as string data

Page 18

Symbol Information
• Native library symbols
  • Use objdump to retrieve symbol information
  • Malware is often stripped of all symbol information
• Dalvik or Java symbols
  • Use dexdump to retrieve symbol information
  • The data structures of the DVM also contain some symbol information
  • The InterpState structure (register R6) has a method field that points to the Method structure for the currently executing method
  • The Method structure has a name field that points to the method name
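Once symbols have been extracted, for example from objdump output, resolving an address to a symbol is a sorted lookup. The sketch below assumes the symbols are already available as (start address, size, name) tuples; the class name and the sample symbols are made up for illustration, and parsing real objdump output is not shown.

```python
import bisect
from typing import Iterable, Optional, Tuple


class SymbolTable:
    """Resolves addresses to "name+offset" using a sorted list of symbols."""

    def __init__(self, symbols: Iterable[Tuple[int, int, str]]) -> None:
        # Each entry is (start address, size in bytes, name).
        self._symbols = sorted(symbols)
        self._starts = [start for start, _, _ in self._symbols]

    def resolve(self, address: int) -> Optional[str]:
        # Find the last symbol whose start address is <= address.
        index = bisect.bisect_right(self._starts, address) - 1
        if index < 0:
            return None
        start, size, name = self._symbols[index]
        if address >= start + size:
            return None  # the address falls in a gap between symbols
        return f"{name}+0x{address - start:x}"


# Hypothetical symbols, standing in for a parsed symbol dump.
table = SymbolTable([(0x2000, 0x80, "func_b"), (0x1000, 0x40, "func_a")])
print(table.resolve(0x1010))
```

For stripped binaries the table is simply empty and every lookup returns None, which is why the DVM's own Method structures are a useful second source of names.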

Page 19

INTERFACE & PLUGINS

Page 20

Interface & Plugins
• APIs for analysis customization
  • The instrumentation logic in DroidScope is complex and dynamic
  • An event-based interface facilitates custom analysis tool development
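An event-based interface of this kind can be pictured as a small callback registry: the instrumentation core raises named events and plugins subscribe to the ones they care about. The sketch below is a generic illustration; the `EventBus` class and the event names are hypothetical and are not DroidScope's actual API.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[..., None]


class EventBus:
    """Minimal callback registry: plugins register handlers for named events."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def register(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def unregister(self, event: str, handler: Handler) -> None:
        if handler in self._handlers.get(event, []):
            self._handlers[event].remove(handler)

    def dispatch(self, event: str, **details: Any) -> int:
        """Call every handler for the event; return how many were called."""
        handlers = list(self._handlers.get(event, []))
        for handler in handlers:
            handler(**details)
        return len(handlers)


bus = EventBus()
trace = []
# Hypothetical event name; a plugin records every Dalvik opcode it is told about.
bus.register("dalvik_insn_begin", lambda opcode: trace.append(opcode))
bus.dispatch("dalvik_insn_begin", opcode=0x6E)
bus.dispatch("syscall", number=192)  # no handler registered, so nothing happens
print(trace)
```

Keeping registration separate from instrumentation lets a plugin enable expensive events only for the process it is analyzing.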

Page 21

Sample Plugin
• Set up which program is to be analyzed and print all Dalvik opcode information

Page 22

API Implementation
• API tracer
  • Instrument the invoke* and execute* Dalvik bytecodes to identify and log method invocations
• Native instruction tracer
  • Gather information about each instruction, including the raw instruction, its operands, and their values
• Dalvik instruction tracer
  • Decode instructions into dexdump format, including values and all available symbol information
• Taint tracker
  • Monitor sensitive information and keep track of data propagation
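The taint tracker's core idea, marking sensitive data and following it as it is copied or combined, can be shown with a toy model. This is a simplified sketch and not DroidScope's design: locations are plain strings, labels are strings such as "IMEI", and the class and method names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable


class TaintTracker:
    """Toy taint propagation over named locations (registers or memory cells)."""

    def __init__(self) -> None:
        self._taint: Dict[str, FrozenSet[str]] = {}

    def taint_source(self, location: str, label: str) -> None:
        """Mark a location as holding sensitive data with the given label."""
        self._taint[location] = self.labels(location) | {label}

    def propagate(self, dst: str, srcs: Iterable[str]) -> None:
        """Model `dst = f(srcs)`: dst receives the union of the source labels."""
        combined: FrozenSet[str] = frozenset()
        for src in srcs:
            combined |= self.labels(src)
        if combined:
            self._taint[dst] = combined
        else:
            self._taint.pop(dst, None)  # overwritten with clean data

    def labels(self, location: str) -> FrozenSet[str]:
        return self._taint.get(location, frozenset())


tracker = TaintTracker()
tracker.taint_source("r0", "IMEI")
tracker.propagate("r2", ["r0", "r1"])   # r2 = r0 + r1, so r2 becomes tainted
tracker.propagate("mem[0x100]", ["r2"])  # store r2 to memory
print(sorted(tracker.labels("mem[0x100]")))
```

A sink check then amounts to asking whether `labels(location)` is non-empty for the data about to leave the device.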

Page 23

EVALUATION

Page 24

Evaluation
• Benchmarks check efficiency and capability
• 7 benchmark apps
  • AnTuTu Benchmark
  • AnTuTu CaffeineMark
  • CaffeineMark
  • CF-Bench
  • Mobile Processor Benchmark
  • Benchmark by Softweg
  • Linpack

Page 25

Evaluation
• Performance
• Capability
  • Analysis of DroidKungFu
  • Analysis of DroidDream

Page 26

DISCUSSION & CONCLUSION

Page 27

Discussion
• Limited Code Coverage
  • A general drawback of dynamic analysis
  • By manipulating the return values of function calls, we may increase the code coverage
• Other Dalvik Analysis Tools
  • Dalvik/Java Static Analysis: Woodpecker, DroidMOSS
  • Native Static Analysis: IDA, binutils, BAP
  • Android Dynamic Analysis: TaintDroid, DroidRanger
  • Linux Kernel Dynamic Analysis: logcat, adb

Page 28

Conclusion
• We presented DroidScope, a fine-grained dynamic binary instrumentation tool for Android that rebuilds two levels of semantic information