Post on 01-Jan-2016
Performance Evaluation of Packet Classification on FPGA-based TCAM Emulation Architectures
GLOBECOM (Global Communications Conference), 2012
Presenter: NTHU 101062607 李若萍
Outline
•Introduction
•Related Work
•TCAM Emulation
•RAM-based TCAM Architecture
•Performance Evaluation
•Conclusion
Introduction
•Packet fields are used as keys to determine the best matching rule and apply a corresponding action.
▫Exact matching
▫Prefix matching
▫Range matching
•How to find the best matching rule?
▫Each rule is assigned a cost.
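The three match types and the cost-based selection can be sketched as follows. This is a minimal software illustration, not the hardware scheme discussed in the slides: the `Rule` class, the helper names, the example rules, and the convention that the lowest cost wins are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Matcher = Callable[[int], bool]

def exact(value: int) -> Matcher:
    """Exact matching: the field must equal the value."""
    return lambda field: field == value

def prefix(value: int, prefix_len: int, width: int) -> Matcher:
    """Prefix matching: only the top prefix_len bits of a width-bit field are compared."""
    shift = width - prefix_len
    return lambda field: (field >> shift) == (value >> shift)

def in_range(low: int, high: int) -> Matcher:
    """Range matching: the field must lie in [low, high]."""
    return lambda field: low <= field <= high

@dataclass
class Rule:
    cost: int                 # assumption: a lower cost means a better match
    matchers: List[Matcher]   # one matcher per packet field
    action: str

def classify(rules: Sequence[Rule], fields: Sequence[int]) -> Optional[Rule]:
    """Return the matching rule with the lowest cost, or None if nothing matches."""
    hits = [r for r in rules if all(m(f) for m, f in zip(r.matchers, fields))]
    return min(hits, key=lambda r: r.cost, default=None)

# Hypothetical rules over (destination IP, protocol, destination port).
rules = [
    Rule(1, [prefix(0x0A000000, 8, 32), exact(6), in_range(0, 1023)], "drop"),
    Rule(2, [prefix(0, 0, 32), exact(6), in_range(0, 65535)], "forward"),
]

best = classify(rules, (0x0A010203, 6, 80))   # both rules match; cost 1 wins
```

A packet that matches several rules is resolved by cost alone, which is the role the priority encoder plays in a hardware TCAM.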
Introduction (cont.)
•CAM (Content Addressable Memory)
[Figure: CAM cell. An SRAM cell holds Data, a comparator (≠) checks it against KEY, and the result drives the match line (VCC).]
▫Match = (Key ≠ Data)
▫Match line = !Match

Data  Key  Match line
0     0    1
1     0    0
0     1    0
1     1    1

•TCAM (Ternary Content Addressable Memory)
[Figure: TCAM cell. Same as the CAM cell, plus a Mask SRAM cell whose output is ANDed (&) with the comparison result.]
▫Match = (Key ≠ Data) & Mask
▫Match line = !Match

Data  Key  Mask  Actual Data      Match line
0     0    0     X (don’t care)   1
1     0    0     X                1
0     0    1     0                1
1     0    1     1                0
0     1    0     X                1
1     1    0     X                1
0     1    1     0                0
1     1    1     1                1
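The two cell equations can be checked against the truth tables with a few lines of Python. This is a bit-level sketch; the function names are illustrative, and the intermediate signal that the slide calls "Match" is named `mismatch` here because it is 1 when key and data differ.

```python
def cam_match_line(data: int, key: int) -> int:
    """CAM cell: the match line is high only when key equals data."""
    mismatch = data ^ key            # slide's "Match" signal: key != data
    return 1 - mismatch              # match line = !Match

def tcam_match_line(data: int, key: int, mask: int) -> int:
    """TCAM cell: mask = 0 turns the stored bit into a don't care."""
    mismatch = (data ^ key) & mask   # slide's "Match" signal, gated by the mask
    return 1 - mismatch              # match line = !Match

# Rebuild both truth tables in the same row order as the slide.
cam_table = [(d, k, cam_match_line(d, k)) for k in (0, 1) for d in (0, 1)]
tcam_table = [
    (d, k, m, tcam_match_line(d, k, m))
    for k in (0, 1) for m in (0, 1) for d in (0, 1)
]
```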
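Introduction (cont.)
•TCAMs (Ternary Content Addressable Memories)
[Figure: the compared key is checked against the rules stored in the TCAM (e.g. the entry 0 1 X 1) at memory addresses 1, 2, ..., N. A priority encoder reduces the compared result to a single memory address (3 in the figure), which is used as an index into RAM to find the corresponding action.]
▫Capacity constraints
▫Storage inefficiency
▫High power consumption
▫Limited scalability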
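The lookup path in the figure (TCAM search, priority encoder, action RAM) can be sketched in software as below. The entry patterns, the action names, 0-based addressing, and the convention that the lowest matching address has the highest priority are assumptions of this sketch.

```python
from typing import List, Optional, Tuple

def parse_ternary(pattern: str) -> Tuple[int, int]:
    """Turn a pattern such as '01X1' into (data, mask); mask bit 0 means don't care."""
    data = mask = 0
    for ch in pattern:
        data = (data << 1) | (1 if ch == "1" else 0)
        mask = (mask << 1) | (0 if ch in "xX" else 1)
    return data, mask

def tcam_search(entries: List[str], key: int) -> List[bool]:
    """Compare the key against every stored entry (done in parallel in hardware)."""
    lines = []
    for pattern in entries:
        data, mask = parse_ternary(pattern)
        lines.append(((key ^ data) & mask) == 0)
    return lines

def priority_encoder(match_lines: List[bool]) -> Optional[int]:
    """Return the lowest matching address (assumed to be the highest priority)."""
    for address, hit in enumerate(match_lines):
        if hit:
            return address
    return None

def lookup(entries: List[str], actions: List[str], key: int) -> Optional[str]:
    """Use the encoded address as an index into the action RAM."""
    address = priority_encoder(tcam_search(entries, key))
    return None if address is None else actions[address]

entries = ["0000", "1111", "01X1", "XXXX"]          # hypothetical rule set
actions = ["action A", "action B", "action C", "default"]
result = lookup(entries, actions, 0b0111)           # hits "01X1" before "XXXX"
```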
Introduction (cont.)
•Purpose: we investigated performance and trade-offs related to TCAM emulation in FPGAs (Field-Programmable Gate Arrays), not ASICs (Application-Specific Integrated Circuits).
•We considered the impact of encoding different key ranges on rules for different configurations in terms of the search key length and the number of rules.
Related Work
•Hardware-assisted packet classification
▫Decision tree: hierarchically splitting the rule patterns hampers incremental updates.
▫Decomposition: the cross-producting stage is problematic.
▫Exhaustive search: predictable memory requirements.
TCAM Emulation
[Figure: native TCAM compared with emulated TCAM]
RAM-based TCAM Architecture
•m-bit key (m = 10), split into chunks of w bits; each chunk addresses one RAM block.
▫Number of RAM blocks = m/w, rounded up so that every key bit is covered
▫Block size = 2^w (addresses 0 ~ 2^w-1)

w            RAM blocks    Block size
1            10/1 = 10     2^1 = 2 (0~1)       native TCAM
2            10/2 = 5      2^2 = 4 (0~3)
8 (= m-2)    10/8 → 2      2^8 (0~2^8-1)
9 (= m-1)    10/9 → 2      2^9 (0~2^9-1)
10 (= m)     10/10 = 1     2^10 (0~2^10-1)     full address expansion

•BRAM demand: (m/w) * 2^w bits
•BRAM modes = depth * width
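The block arithmetic for the m = 10 example can be reproduced with a small helper. This sketch makes two assumptions that are not stated on the slide: m/w is rounded up to a whole number of blocks so that every key bit is covered, and the (m/w) * 2^w figure is read as RAM bits per stored rule. The function name is illustrative.

```python
from typing import Tuple

def emulation_layout(m: int, w: int) -> Tuple[int, int, int]:
    """Return (RAM blocks, block depth, RAM bits per rule) for an m-bit key
    split into w-bit address chunks."""
    blocks = -(-m // w)            # ceiling division without floating point
    depth = 2 ** w                 # each block is addressed by w key bits
    bits_per_rule = blocks * depth
    return blocks, depth, bits_per_rule

# Sweep from the native-TCAM end (w = 1) to full address expansion (w = m).
layouts = {w: emulation_layout(10, w) for w in (1, 2, 8, 9, 10)}
```

Under these assumptions the small chunk sizes cost 20 bits per rule while full address expansion costs 1024, which is the memory side of the trade-off the slides describe.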
RAM-based TCAM Architecture (cont.)
•n = 64, m = 16 (16-bit key)
▫w = 6
▫m/w = 16/6 → 3 RAM blocks (two 6-bit chunks plus one 4-bit chunk)
▫block size = 2^w = 2^6 = 64
[Figure labels: 2^16*64; 2^8*32*4]
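A software model of the RAM-based lookup may help make the architecture concrete. It is a sketch under stated assumptions, not the evaluated hardware design: each RAM word is taken to hold one bit per rule, the last chunk is allowed to be narrower than w, and the rule patterns and function names are hypothetical.

```python
from typing import List, Tuple

Ram = Tuple[int, int, List[int]]   # (start bit, chunk width, words)

def chunk_matches(pattern: str, value: int, width: int) -> bool:
    """True if a ternary chunk such as '10XX' matches the chunk value."""
    bits = format(value, f"0{width}b")
    return all(p in "xX" or p == b for p, b in zip(pattern, bits))

def build_rams(rules: List[str], m: int, w: int) -> List[Ram]:
    """Precompute, for every chunk value, a bit vector of the rules it matches."""
    rams = []
    for start in range(0, m, w):
        width = min(w, m - start)             # last chunk may be narrower
        words = []
        for value in range(2 ** width):
            word = 0
            for index, pattern in enumerate(rules):
                if chunk_matches(pattern[start:start + width], value, width):
                    word |= 1 << index        # bit i set: rule i matches
            words.append(word)
        rams.append((start, width, words))
    return rams

def search(rams: List[Ram], key: int, m: int, n: int) -> int:
    """Read one word per RAM block and AND them; bit i set means rule i matches."""
    result = (1 << n) - 1
    for start, width, words in rams:
        shift = m - start - width
        chunk = (key >> shift) & ((1 << width) - 1)
        result &= words[chunk]
    return result

rules = ["0000111100001111", "1010XXXXXXXXXXXX", "XXXXXXXXXXXXXXX1"]
rams = build_rams(rules, m=16, w=6)
hits = search(rams, 0b1010000000000001, m=16, n=len(rules))   # rules 1 and 2
```

The resulting bit vector would then feed a priority encoder, exactly as the match lines of a native TCAM do.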
Performance Evaluation
•Resource Utilization
▫A TCAM bit typically demands 16 transistors, while a RAM bit demands only 6.
▫TCAM => w*m*16
▫TCAM emulation => (m/w)*(2^w)*6
[Figure: transistor count of the emulated TCAM, (m/w)*(2^w)*6, versus the native TCAM, w*m*16]
Performance Evaluation (cont.)
[Figure: memory requirement, (m/w)*(2^w) bits]
Performance Evaluation (cont.)
•Classification Throughput
▫A crucial factor for evaluating emulated TCAM performance on an FPGA is the actual classification throughput, measured in packets per second (pps).
Performance Evaluation (cont.)
•Range Impact
▫We assess the impact of supporting different ranges in terms of memory requirements and classification rate.
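To illustrate why range support costs memory, the sketch below contrasts two generic ways of storing a range. Neither is claimed to be the encoding evaluated in the paper. In a RAM block addressed by w bits, a range is stored by setting the rule's bit at every address inside it, which takes a single rule slot but requires 2^w words. With ternary entries, the same range has to be split into several prefixes. Function names and the example range are illustrative.

```python
from typing import List

def encode_range_in_ram(ram: List[int], rule_index: int, low: int, high: int) -> None:
    """Set the rule's bit in every RAM word whose address lies in [low, high]."""
    for address in range(low, high + 1):
        ram[address] |= 1 << rule_index

def range_to_prefixes(low: int, high: int, width: int) -> List[str]:
    """Split [low, high] into the minimal list of ternary prefixes."""
    prefixes = []
    while low <= high:
        # Largest power-of-two block that is aligned at low ...
        size = low & -low if low else 1 << width
        # ... and does not run past high.
        while size > high - low + 1:
            size >>= 1
        wildcard_bits = size.bit_length() - 1
        fixed_bits = width - wildcard_bits
        head = format(low >> wildcard_bits, f"0{fixed_bits}b") if fixed_bits else ""
        prefixes.append(head + "X" * wildcard_bits)
        low += size
    return prefixes

w = 4
ram = [0] * (2 ** w)                                   # one 4-bit-addressed block
encode_range_in_ram(ram, rule_index=0, low=1, high=14)
ternary_entries = range_to_prefixes(1, 14, w)          # six entries for one rule
```

In this example the RAM encoding uses one rule slot across 16 words, while the ternary encoding needs six entries for the same range.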
Conclusion
•Classification rates above 300 Mpps for both large keys and large rule sets can be achieved with only a few megabits of RAM when considering up to medium-size range intervals (512-2048).
•Supporting both large ranges and large rule sets tends to demand considerable memory resources, which also penalizes the resulting classification rate.
Thank you!
The End.