

Page 1: Embedded Systems Laboratory ArChiVED: Architectural Checking via Event Digests for High Performance Validation Department of Computer Science Engineering

Embedded Systems Laboratory

ArChiVED: Architectural Checking via Event Digests for High Performance Validation

Department of Computer Science and Engineering, National Sun Yat-sen University

Presenter: Hsiang-Hao Liang

Date: Monday, July 5, 2015

Chang-Hong Hsu; Chatterjee, D.; Morad, R.; Gal, R.; Bertacco, V.

Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014


2

Abstract

Simulation-based techniques play a key role in validating the functional correctness of microprocessor designs. A common approach for validating microprocessors (called instruction-by-instruction, or IBI checking) consists of running an RTL and an architectural simulation in lock-step, while comparing the processor's architectural state at each instruction retirement.

This solution, however, cannot be deployed on long regression tests, because of the limited performance of RTL simulators. Acceleration platforms have the performance power to overcome this issue, but are not amenable to the deployment of an IBI checking methodology. Indeed, validation on these platforms requires logging activity on-platform and then checking it against a golden model off-platform.


3

Abstract

Unfortunately, an IBI checking approach following this paradigm entails a large slowdown for the acceleration platform, because of the sizable amount of data that must be transferred off-platform for comparison against the golden model.

In this work we propose a sequence-by-sequence (SBS) checking approach that is efficient and practical for acceleration platforms. Our solution validates the test execution over sequences of instructions (instead of individual ones), thus greatly reducing the amount of data transferred for off-platform checking.

We found that SBS checking delivers the same bug-detection accuracy as traditional IBI checking, while reducing the amount of traced data by more than 90%.


4

What’s the Problem

Using IBI checking on an acceleration platform has drawbacks (it is inefficient):

It slows down the acceleration platform.

A sizable amount of data must be transferred off-platform.

Creating an equivalent scheme for an acceleration platform is challenging:

The checking functionality is usually too complex to be implemented in hardware, and it is complicated by re-orderings in architectural state updates.

The recording rate necessary to gather information for IBI checking is too high (it cancels out the performance advantage of acceleration).


5

Related Works

gem5 simulator [2], including support for:

1. multiple cache coherence protocols

2. interconnect models

3. most commercial ISAs (ARM, ALPHA, MIPS, Power, SPARC, and x86)

4. booting Linux on three of them (ARM, ALPHA, and x86)

Acceleration and emulation platform validation [1, 7, 9, 12]

A run-time verification scheme [14, 15]

Traces based on the first symptom that manifests [6]

This paper: ArChiVED


6

Proposed Method

A sequence-by-sequence (SBS) checking approach that is efficient and practical for acceleration platforms.

Challenges: handling the lack of event correlation, and reducing the amount of traced data.

Every four IC events are combined into one epoch.

During off-line

Two trace buffers: one is being filled while the other is being transferred.
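The epoch grouping on this slide can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of events, the name `segment_epochs`, and the reading of IC as an instruction-commit event and RU as a register-update event are assumptions, and the epoch length of four is just the slide's example.

```python
from typing import Iterable, List, Tuple

# Hypothetical event encoding: (kind, id, value), where kind is "IC" or "RU".
Event = Tuple[str, int, int]

def segment_epochs(events: Iterable[Event], ic_per_epoch: int = 4) -> List[List[Event]]:
    """Split an event stream into epochs that each close on an IC event."""
    epochs: List[List[Event]] = []
    current: List[Event] = []
    ic_seen = 0
    for ev in events:
        current.append(ev)
        if ev[0] == "IC":
            ic_seen += 1
            if ic_seen == ic_per_epoch:
                epochs.append(current)   # epoch closes on its last IC event
                current, ic_seen = [], 0
    if current:
        epochs.append(current)           # trailing, not yet complete epoch
    return epochs

# Eight instructions, each producing one register update followed by its commit.
stream: List[Event] = []
for i in range(8):
    stream.append(("RU", i % 3, 100 + i))        # (RU, register, value)
    stream.append(("IC", 0x1000 + 4 * i, 0))     # (IC, pc, unused)

epochs = segment_epochs(stream)
```

With this input the stream splits into two epochs of eight events each, and every complete epoch ends on an IC event.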


7

Overview of the SBS checking

SBS checking accrues and compresses several architectural events over a period of time before it transfers the information off-platform for comparison with a golden model.

The accrual and compression process must avoid the loss of critical information and preserve bug detection potential.

The possible downsides of this approach:

It limits the sensitivity to discrepancies between corresponding architectural update events. Solution: more data recording (a longer checksum).

A discrepancy has to be identified in the cumulative record of a large number of events. Solution: using their bug detection.

The last entry of an epoch must always be an IC event.


8

Overview of the ArChiVED checking flow

Our SBS solution checks iteratively the consistency between a simulation trace generated on an acceleration platform and a golden model.

The process takes two inputs: the trace’s digest from the acceleration platform and the unmodified trace generated by the golden model.

The epoch’s comparison flow consists of three main parts: epoch segmentation, RU events adjusting, and checksum computation.


9

ArChiVED’s checking flow - Epoch segmentation and RU events adjusting

The first step aligns epochs obtained from the accelerator simulation with those from a golden model.

A mismatch reveals a bug due to incorrect program flow.

The second step is to match the RU length vectors between the golden model’s epoch and the digest’s epoch.

For each register with a different length, our checker attempts to move RU events in the golden model across the epoch’s boundary, until it can attain a match.

If we cannot find a set of RU events that matches the digest, we flag a bug for missing register updates.
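The RU adjusting step can be sketched as follows. This is a simplified illustration, not the paper's implementation: the `(register, value)` encoding, the name `adjust_ru_events`, and the restriction to a single neighbouring epoch are assumptions. The sketch preserves the order of updates per register only.

```python
from collections import Counter
from typing import Dict, List, Tuple

RU = Tuple[int, int]  # hypothetical encoding of an RU event: (register, value)

def adjust_ru_events(golden_cur: List[RU], golden_next: List[RU],
                     digest_lengths: Dict[int, int]) -> bool:
    """Move golden RU events across the epoch boundary (in place) until
    golden_cur holds digest_lengths[r] events for every register r.
    Returns False when no match exists (missing register updates); registers
    handled before the failing one may already have been adjusted."""
    golden_lengths = Counter(reg for reg, _ in golden_cur)
    for reg in sorted(set(golden_lengths) | set(digest_lengths)):
        diff = digest_lengths.get(reg, 0) - golden_lengths.get(reg, 0)
        if diff > 0:
            # The digest saw more updates: pull the oldest ones for this
            # register from the next golden epoch.
            idx = [i for i, ev in enumerate(golden_next) if ev[0] == reg][:diff]
            if len(idx) < diff:
                return False
            golden_cur.extend([golden_next[i] for i in idx])
            for i in reversed(idx):
                del golden_next[i]
        elif diff < 0:
            # The digest saw fewer updates: push the newest ones for this
            # register into the next golden epoch.
            idx = [i for i, ev in enumerate(golden_cur) if ev[0] == reg][diff:]
            pushed = [golden_cur[i] for i in idx]
            for i in reversed(idx):
                del golden_cur[i]
            golden_next[:0] = pushed
    return True

golden_cur = [(1, 10), (2, 20), (1, 11)]
golden_next = [(2, 21), (3, 30)]
matched = adjust_ru_events(golden_cur, golden_next, {1: 1, 2: 2})
unmatched = adjust_ru_events([(1, 10)], [], {1: 2})
```

In the first call one update of register 1 is pushed into the next epoch and one update of register 2 is pulled back, so the lengths match. In the second call the golden model has no further update of register 1 to pull, so the check reports a mismatch.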


10

ArChiVED’s checking flow - Checksum computation

While the previous steps have already ruled out many bug manifestation possibilities, other manifestation types still remain:

RU events may occur with incorrect register values.

The wrong ordering among RU events updating the same register is also an indicator of a bug.

If everything matches, we move on to the next epoch; otherwise we flag a bug for incorrect event orderings or corrupted event values.

Desirable characteristics of a checksum scheme:

A small logic footprint in hardware.

On-the-fly checksum computation, so that less storage is required.

Sensitivity to event ordering.

Low aliasing.
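The three verdicts of the checking flow can be put together in a small sketch. Everything about the digest layout here is an assumption made for illustration: the `Digest` fields, the use of the epoch's last PC to detect program-flow mismatches, and a 32-bit per-register checksum that rotates left by one bit before XOR-ing in each value. The cross-boundary RU adjustment is left out for brevity, so a length mismatch is flagged immediately.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

MASK32 = 0xFFFFFFFF  # assumed checksum width

@dataclass
class Digest:
    last_pc: int                 # PC of the IC event that closes the epoch
    ru_lengths: Dict[int, int]   # register -> number of RU events
    checksums: Dict[int, int]    # register -> order-sensitive checksum

def make_digest(last_pc: int, ru_events: List[Tuple[int, int]]) -> Digest:
    """Accumulate per-register counts and checksums on the fly."""
    lengths: Dict[int, int] = {}
    sums: Dict[int, int] = {}
    for reg, value in ru_events:
        lengths[reg] = lengths.get(reg, 0) + 1
        acc = sums.get(reg, 0)
        acc = ((acc << 1) | (acc >> 31)) & MASK32  # rotate left by one bit
        sums[reg] = acc ^ (value & MASK32)
    return Digest(last_pc, lengths, sums)

def check_epoch(accel: Digest, golden: Digest) -> Optional[str]:
    """Return None when the epoch matches, otherwise the kind of bug."""
    if accel.last_pc != golden.last_pc:
        return "incorrect program flow"
    if accel.ru_lengths != golden.ru_lengths:
        return "missing register updates"
    if accel.checksums != golden.checksums:
        return "incorrect event orderings or corrupted event values"
    return None

golden = make_digest(0x1010, [(1, 5), (1, 9), (2, 7)])
reordered = make_digest(0x1010, [(1, 9), (1, 5), (2, 7)])
dropped = make_digest(0x1010, [(1, 5), (2, 7)])
wrong_pc = make_digest(0x1014, [(1, 5), (1, 9), (2, 7)])
```

Checking `reordered`, `dropped`, and `wrong_pc` against `golden` yields the three different verdicts, while checking `golden` against itself returns `None`.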


11

Checksum Schemes Example

Three simple checksum schemes:

Architectural state - the checker regularly compares the register file and the PC register values. Shortcomings: it is insensitive to intra-epoch event re-orderings, and it has a high probability of aliasing.

XOR - simply updates the checksum by applying an exclusive-or operation. Shortcoming: XOR cannot preserve ordering information between events.

Rotate-and-XOR (an improvement on the XOR checksum scheme) - left-rotates the accumulated checksum by one bit before updating it with a new message.
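A minimal sketch comparing the XOR and rotate-and-XOR schemes, assuming a 32-bit checksum and integer messages (both are assumptions; the slides do not give the width or the message format). It shows why the rotation matters: swapping two events leaves the plain XOR checksum unchanged but changes the rotate-and-XOR one.

```python
from typing import Iterable

MASK32 = 0xFFFFFFFF  # assumed checksum width

def xor_checksum(messages: Iterable[int]) -> int:
    """Plain XOR: commutative, so event order is lost."""
    acc = 0
    for m in messages:
        acc ^= m & MASK32
    return acc

def rotl32(x: int, n: int = 1) -> int:
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotate_xor_checksum(messages: Iterable[int]) -> int:
    """Left-rotate the accumulator by one bit, then XOR in the new message."""
    acc = 0
    for m in messages:
        acc = rotl32(acc) ^ (m & MASK32)
    return acc

in_order = [5, 9, 3]
swapped = [9, 5, 3]   # the first two events exchanged

same_xor = xor_checksum(in_order) == xor_checksum(swapped)                  # True
same_rot = rotate_xor_checksum(in_order) == rotate_xor_checksum(swapped)    # False
```

Both functions keep a single accumulator and update it per event, which fits the on-the-fly, small-footprint requirements listed earlier. Rotate-and-XOR still aliases for some inputs; it only lowers the chance compared with plain XOR.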


12

Before Experiment

I expect to see three things: the reduction in the amount of data, the recording rate, and what kinds of faults are injected and where they can be injected.


13

Experiment Environment

Built on the gem5 simulator with the ARMv7 ISA: a total of 64 integer registers and 168 special-purpose registers.

Test programs are executed using the cycle-accurate out-of-order O3CPU model in gem5.

Eight distinct testbenches from the SPEC CPU2006 integer benchmark suite are used.

Each testbench is executed many times (approximately 350 times) (to eliminate redundant situations).

Bugs are injected broadly in each of the key design modules: fetch stage, instruction buffer, execution stage and register file.


14

Experiment Result

SBS’s recording rate is much smaller than IBI’s, and very practical.

Indeed, at practical epoch lengths, we can attain a 99% reduction in the recording bit-rate compared to an IBI solution.

a recording rate of “ ”bits/cycle

IBI checking is a special case of SBS with N = 1
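A back-of-the-envelope model shows how the epoch length drives the recording rate. It is not the paper's measurement: the digest size, the instructions-per-cycle figure, and the simplification that one fixed-size digest is emitted per epoch of N committed instructions are all hypothetical, as is the reading of N as the epoch length.

```python
def recording_rate(digest_bits: int, epoch_len: int, ipc: float) -> float:
    """Bits recorded per cycle when one digest of digest_bits is emitted
    every epoch_len committed instructions at ipc instructions per cycle."""
    cycles_per_epoch = epoch_len / ipc
    return digest_bits / cycles_per_epoch

DIGEST_BITS = 128   # hypothetical digest size
IPC = 1.0           # hypothetical instructions per cycle

ibi_rate = recording_rate(DIGEST_BITS, epoch_len=1, ipc=IPC)     # N = 1
sbs_rate = recording_rate(DIGEST_BITS, epoch_len=100, ipc=IPC)   # N = 100
reduction = 1 - sbs_rate / ibi_rate
```

Under these assumptions the rate scales as 1/N, so an epoch length of 100 already gives a reduction of about 99% relative to N = 1. A real digest may grow with the epoch length, which this model ignores.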


15

Experiment Result (Cont.)

Figure 4 indicates that the XOR and rotate-and-XOR checksum schemes achieved average detection ratios of >85% and 93%, respectively.

SBS’ bug detection accuracy is fairly insensitive to epoch length (the difference between an epoch length of 100 and one of 100,000 is only 7%).

In contrast, the architectural-state checker degrades quickly with longer epochs, up to 30%.


16

Experiment Result (Cont.)

The rotate-and-XOR checksum scheme shines particularly at detecting rarely-occurring bugs, and it is the only scheme whose detection ratio stays consistently high over a broad range of epoch lengths.


17

Conclusion

As a whole, this paper presents a checking scheme design and describes the checking flow in detail, and the experiment results (such as accuracy) are reasonable.

But there is one defect: the authors did not tell me what kinds of faults they injected.