A Spam Mail-based Solution for Botnet Detection and Network Bandwidth Protection
許富皓 (Fu-Hau Hsu), Department of Computer Science and Information Engineering, National Central University
Outline
Introduction
Background
System Design
Work Flow
Evaluation
Related Work
Conclusion
Spam Mails and Bots
As of 2009, research shows that more than 80% of spam mails are sent by the bots of botnets, called spam bots hereafter.
Spam mails take up more than 50% of network bandwidth.
Objectives
Detect members of botnets.
Filter spam mails.
Save network bandwidth.
Observation
According to our observation, the majority of spam bots are not e-mail servers; spam bots usually only send mails and do not receive them.
E-mail Architecture
Botnets & Spam Mails
System Layout
[Figure: end users and the mail server exchange SMTP, POP3, and IMAP traffic through the Packet Analyzer. Other labels: confirmation host (confirm), honeypot (redirect/block).]
System Component – Packet Analyzer (PA) (1)
Located at a router.
Detects spam bots based on the IP ack-packets that use the SMTP(S), POP3(S), or IMAP(S) protocol and whose sizes are less than 200 bytes (described later).
Uses a credit number to record the mail transmission status of an IP address.
IP Threat Level Table (IPTLT)
Columns: IP, Credit, Action
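A minimal sketch of how one IPTLT row could be represented, assuming the fields that appear in the SpamFinder log output later in these slides (ip, credit, nat, mail server, action). The class name, the field types, and the dictionary keyed by IP address are illustrative assumptions, not the kernel implementation.

```python
from dataclasses import dataclass


@dataclass
class IptltEntry:
    """One row of the IP Threat Level Table (hypothetical layout)."""
    ip: str
    credit: int = 0            # mail transmission status of this IP
    nat: bool = False          # would be set by the NAT detection mechanism
    mail_server: bool = False  # would be set after the Confirmer checks the host
    action: int = 0            # 0 as in the logged entries; other codes are not given


# The table itself: one entry per source IP seen in mail traffic.
iptlt = {}


def lookup(ip: str) -> IptltEntry:
    """Return the entry for ip, creating it the first time the IP is seen."""
    if ip not in iptlt:
        iptlt[ip] = IptltEntry(ip)
    return iptlt[ip]
```

Cleaning the table periodically then amounts to calling `iptlt.clear()` on a timer, which also discards earlier confirmation results.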
System Component – Packet Analyzer (PA) (2)
Cleans the IPTLT periodically to handle dynamically allocated IP addresses (for example, via DHCP).
Adds a NAT detection mechanism to avoid harming innocent hosts behind a NAT.
Credit Number
A credit number is a property of an IP address.
PA assigns a credit number to every IP address that has appeared as the source IP address in a mail packet (SMTP/POP/IMAP).
Operations of Credit Number
Increasing operation: When PA detects an SMTP mail packet, the credit number of the packet's source IP is increased by 1.
Decreasing operation: When PA detects a POP/IMAP mail packet, the credit number of the packet's source IP is decreased by a higher value, 3.
Note: Analysis of real-world traffic shows that, within a network, the ratio of sent mails to received mails is 1:3.
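The two operations can be sketched as follows. This is an illustration only: the step sizes (+1 and -3) come from the slide, while the port sets, the classification of packets by destination port, and the names `credit_table` and `update_credit` are assumptions made for the example.

```python
from collections import defaultdict

SEND_STEP = 1      # from the slide: +1 per SMTP mail packet
RECEIVE_STEP = 3   # from the slide: -3 per POP/IMAP mail packet

# Assumption: packets are classified by their well-known destination ports.
SMTP_PORTS = frozenset({25, 465, 587})
RETRIEVAL_PORTS = frozenset({110, 995, 143, 993})  # POP3(S), IMAP(S)

credit_table = defaultdict(int)  # source IP -> credit number


def update_credit(src_ip: str, dst_port: int) -> int:
    """Apply the increasing or decreasing operation and return the new credit."""
    if dst_port in SMTP_PORTS:
        credit_table[src_ip] += SEND_STEP
    elif dst_port in RETRIEVAL_PORTS:
        credit_table[src_ip] -= RECEIVE_STEP
    return credit_table[src_ip]
```

A host that only sends keeps climbing, while any POP/IMAP activity pulls the number back down quickly, which matches the observation that spam bots send mails but do not receive them.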
Approach to Reduce Packet Analyzer Performance Overhead
Sampling: A router-based solution should avoid high performance overhead; hence, the Packet Analyzer can sample packets instead of inspecting every one.
The sample rate is an adjustable parameter.
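The slides do not say how packets are sampled. One simple possibility, shown here purely as an assumption, is to inspect every k-th packet, where k is derived from the sample rate. The class and method names are hypothetical.

```python
class PacketSampler:
    """Counter-based sampler: inspects one packet out of every `interval`."""

    def __init__(self, sample_rate: float):
        if not 0 < sample_rate <= 1:
            raise ValueError("sample_rate must be in (0, 1]")
        # A rate of 0.002 (0.2%) means one packet out of every 500.
        self.interval = round(1 / sample_rate)
        self.seen = 0

    def should_inspect(self) -> bool:
        """Count the packet and report whether it should be analyzed."""
        self.seen += 1
        return self.seen % self.interval == 0
```

With a sample rate of 1.0 every packet is inspected; lowering the rate trades detection latency for less per-packet work on the router.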
Avoid Noise Created by Large Mails
Regardless of a mail's size, the number of protocol-related packets exchanged between the sender and the receiver is similar.
The sizes of protocol-related packets are usually smaller than 200 bytes.
To avoid counting large mails sent by normal users multiple times, we filter out e-mail related packets larger than 200 bytes.
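The size filter can be expressed as a single predicate. This is a sketch under stated assumptions: the 200-byte limit is from the slides, but the port set and the function name are hypothetical, and the slides do not say how a packet of exactly 200 bytes is treated (it is excluded here).

```python
MAX_PROTOCOL_PACKET_SIZE = 200  # bytes, from the slides

# Assumption: well-known ports for SMTP(S), POP3(S), and IMAP(S).
MAIL_PORTS = frozenset({25, 465, 587, 110, 995, 143, 993})


def counts_toward_credit(dst_port: int, total_len: int) -> bool:
    """True for small, protocol-related mail packets; False for everything else."""
    return dst_port in MAIL_PORTS and total_len < MAX_PROTOCOL_PACKET_SIZE
```

Under this rule a 1500-byte segment carrying part of a mail body is ignored, so a single large mail contributes roughly the same number of counted packets as a small one.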
System Component – Confirmer (1)
Located at a confirmation host.
Checks whether a host is a mail server, because a mail server may exhibit the same behavior as a bot.
It connects to the SMTP port of the host to check whether the host is a mail server.
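A minimal sketch of such a check, assuming that "is a mail server" means the host accepts a TCP connection on the SMTP port and answers with the standard 220 greeting. The function name, the default port 25, and the timeout are assumptions; the slides do not describe the Confirmer's exact test.

```python
import socket


def is_mail_server(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if host answers on the SMTP port with a 220 greeting."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
    except OSError:
        # Refused, unreachable, or timed out: treat as "not a mail server".
        return False
    return banner.startswith(b"220")
```

A suspect host for which this returns False would keep its spam-bot classification, while a True result would be recorded so that the legitimate mail server is not blocked.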
System Component – Confirmer (2)
The record that an IP address belongs to a confirmed host is kept in the IPTLT until the IPTLT is cleaned up; hence, an IP address needs to be confirmed only once between cleanups.
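Work Flow
[Figure: On the Linux router, packets enter NetFilter PREROUTING in kernel space, where the Packet Analyzer inspects e-mail related traffic and updates the IP threat level table (columns: IP, Credit, Action), which is cleaned up periodically. A kernel thread passes each suspect IP to the Confirmer on the confirmation host, which checks SMTP on the suspect host and returns the check result to fill the action field. The Packet Analyzer fetches the action to accept or drop packets.]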
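The decision path in the work flow can be sketched as follows. This is a simplified, synchronous illustration: in the figure the confirmation runs through a kernel thread and a separate host, whereas here it is a plain callback. The threshold of 150 is the example value used in the evaluation slides; the names `Entry`, `handle_packet`, `ACCEPT`, and `DROP` are assumptions.

```python
from dataclasses import dataclass
from typing import Callable

CREDIT_THRESHOLD = 150  # example value from the evaluation slides

ACCEPT, DROP = "accept", "drop"


@dataclass
class Entry:
    """Per-IP state, reduced to what the decision needs (hypothetical)."""
    credit: int = 0
    confirmed: bool = False
    action: str = ACCEPT


def handle_packet(entry: Entry, check_smtp: Callable[[], bool]) -> str:
    """Fetch the action for a packet, confirming the source host if needed."""
    if entry.credit > CREDIT_THRESHOLD and not entry.confirmed:
        # Suspect IP: ask the Confirmer once, then fill the action field.
        entry.action = ACCEPT if check_smtp() else DROP
        entry.confirmed = True
    return entry.action
```

Because the result is cached in the entry, the confirmation callback runs at most once per IP until the table is cleaned up.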
Performance Evaluation Scenario
Send 10,000 mails, each 3 KB in size.
Transmit the mails through the router with and without SpamFinder.
[Figure labels: SMTP server, SpamFinder, end user (e-mail client).]
Host: ASUS Desktop AS-D672; CPU: Intel Pentium 4 Dual Core 3.2 GHz; RAM: 4 GB; LAN: Gigabit Ethernet NIC; OS: Windows 7
Host: ASUS Desktop AS-D360; CPU: Intel Pentium 4 3.0 GHz; RAM: 512 MB; LAN1: 10 Mb/100 Mb Ethernet Controller; LAN2: 10 Mb/100 Mb Ethernet Controller; OS: Fedora 10, kernel 2.6.27
Performance Evaluation: Evaluation Result
Overhead O(n%), where n is the sample rate:
O(100%) = 4.13%
O(0.2%) = 3.8%
[Figure: Performance Evaluation, average time to send a mail (axis range 0.1 to 0.135).]
Effectiveness Evaluation Scenario
Analyze real-world traffic (about 1,300 computers on the NCTU dorm network) provided by NBL (Network Benchmarking Lab)@NCTU.
Analyze the whole-day traffic of 6/13/2010 (about 2 TB).
Replay the traffic at 250 to 350 Mb/s.
[Figure: real-world traffic logs are replayed to SpamFinder.]
Host: HP CQ-45 Notebook; CPU: Intel Core 2 Duo P7450 / 2.13 GHz; RAM: 4 GB; LAN: 10/100/1000 Gigabit Ethernet LAN; OS: Fedora 12, kernel 2.6.32
Effectiveness Evaluation
According to the analysis results:
The ratio of sent data to received data is 1:3.
With a credit threshold of 150, SpamFinder can save 25% of e-mail related traffic.
Average packet drop ratio: 0.31%. NBL uses a Cisco 7609 router to collect packet traces, whereas we use a notebook for our analysis.
Effectiveness Evaluation
According to the analysis results:
SpamFinder detected 2 spam bots after analyzing 1 day of traffic from the NCTU dorm network.
Note: According to reports from tyc.edu.tw, on average 4.1 hosts per day on the NCU campus are reported as spam hosts.
Effectiveness Evaluation
[Figure: ratio of e-mail related traffic saved (y-axis, 23.6% to 25.2%) versus credit number threshold (x-axis: 150, 200, 300, 450, 650).]
Effectiveness Evaluation
[5068] ip: 140.#.#.135, credit: 18147, nat: 0, mail server: 0, action: 0
...
Jun 17 14:11:51 [SEND] 140.#.#.135:4552 -> 74.125.157.27:25, total_len:1470
Jun 17 14:11:51 [SEND] 140.#.#.135:4552 -> 74.125.157.27:25, total_len:1470
Jun 17 14:11:51 [SEND] 140.#.#.135:4552 -> 74.125.157.27:25, total_len:1326
Jun 17 14:11:52 [SEND] 140.#.#.135:4839 -> 165.131.174.40:25, total_len:1500
Jun 17 14:11:52 [SEND] 140.#.#.135:4839 -> 165.131.174.40:25, total_len:1500
Jun 17 14:11:52 [SEND] 140.#.#.135:4839 -> 165.131.174.40:25, total_len:896
Jun 17 14:11:53 [SEND] 140.#.#.135:4832 -> 74.86.7.196:25, total_len:1500
Jun 17 14:11:53 [SEND] 140.#.#.135:4832 -> 74.86.7.196:25, total_len:1500
Jun 17 14:11:53 [SEND] 140.#.#.135:4832 -> 74.86.7.196:25, total_len:811
...
(this pattern repeats 3628 times)
Effectiveness Evaluation
[3370] ip: 140.#.#.148, credit: 8203, nat: 0, mail server: 0, action: 0
...
Jun 14 16:24:14 [SEND] 140.#.#.148:6508 -> 148.123.15.75:25, total_len:1500
Jun 14 16:24:14 [SEND] 140.#.#.148:6508 -> 148.123.15.75:25, total_len:1500
Jun 14 16:24:14 [SEND] 140.#.#.148:6508 -> 148.123.15.75:25, total_len:1142
Jun 14 16:24:17 [SEND] 140.#.#.148:6534 -> 75.126.136.141:25, total_len:1500
Jun 14 16:24:17 [SEND] 140.#.#.148:6534 -> 75.126.136.141:25, total_len:1500
Jun 14 16:24:17 [SEND] 140.#.#.148:6534 -> 75.126.136.141:25, total_len:1500
Jun 14 16:24:17 [SEND] 140.#.#.148:6526 -> 74.125.43.27:25, total_len:1470
Jun 14 16:24:17 [SEND] 140.#.#.148:6526 -> 74.125.43.27:25, total_len:1470
Jun 14 16:24:17 [SEND] 140.#.#.148:6526 -> 74.125.43.27:25, total_len:1470
...
(this pattern repeats 2050 times)
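To inspect a flagged host by hand, log excerpts like the two above can be parsed and summarized. This is a minimal sketch that assumes only the `[SEND] src:port -> dst:port, total_len:N` layout visible in the excerpts; the function names are hypothetical and not part of SpamFinder.

```python
import re
from collections import Counter

# Matches one [SEND] record; spaces around "->" are optional, as in the logs.
SEND_RE = re.compile(
    r"\[SEND\]\s+(?P<src>\S+?):(?P<sport>\d+)\s*->\s*"
    r"(?P<dst>\S+?):(?P<dport>\d+),\s*total_len:(?P<total_len>\d+)"
)


def parse_send_records(log_text: str) -> list:
    """Extract every [SEND] record found in log_text as a dictionary."""
    records = []
    for m in SEND_RE.finditer(log_text):
        records.append({
            "src": m["src"],
            "dst": m["dst"],
            "dport": int(m["dport"]),
            "total_len": int(m["total_len"]),
        })
    return records


def smtp_destinations(records: list) -> Counter:
    """Count packets per destination server on SMTP port 25."""
    return Counter(r["dst"] for r in records if r["dport"] == 25)
```

A host that contacts many distinct SMTP servers within a few seconds, as in both excerpts, shows up as a counter with many keys.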
Related Work
BotGraph: Large Scale Spamming Botnet Detection, USENIX’09. Webmail botnet account detection.
BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection, USENIX’08. Network behavior based detection.
Wide-scale botnet detection and characterization, HotBots’07, USENIX.
Limitation
If a host's e-mail sending traffic passes through the router but its e-mail receiving traffic does not, the host would be considered a spam bot.
SpamFinder cannot detect e-mails that are sent and received through webmail, but popular webmail services have their own effective anti-spam mechanisms to filter spam mails.
Attack Analysis and Future Work
Attackers might send forged IP packets to defame target hosts; we could check for the existence of the related connections to detect this behavior.
In the future, the spam mails (or IP packets) sent from bots will be redirected to a honeypot for further analysis.
Conclusion
We propose a network-level spam bot detection mechanism, SpamFinder.
We implemented it on a Linux router and evaluated it using real-world traffic provided by NBL (Network Benchmarking Lab)@NCTU.
The evaluation results show that SpamFinder has low performance overhead and can detect spam bots and protect network bandwidth effectively.
End
Q&A