10.1.1.186.4635
7/29/2019 10.1.1.186.4635
1/91
Equalizing Filter Design for Cross-talk Cancellation
by
Jihong Ren
B. Sc. (Electrical Engineering), Huazhong University of Science and Technology, 1995
M. Eng. (Electrical Engineering), Huazhong University of Science and Technology, 1998
M. Sc. (Neuroscience), The University of British Columbia, 2000
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE STUDIES
(Department of Computer Science)
We accept this thesis as conforming to the required standard
The University of British Columbia
June 2002
© Jihong Ren, 2002
Abstract
As interconnect line width and spacing decrease and operating clock rates increase,
interconnect has become a bottleneck in developing high-speed integrated circuits, multichip
modules, printed circuit boards, and systems. With small line spacing, mutual capacitance
and inductance approach the level of self-capacitance and inductance, and can severely de-
grade signal integrity. The well-known equalizing filter method can significantly improve
signal integrity. This thesis explores the effectiveness of equalizing filters in cross-talk can-
cellation for high-speed, off-chip buses. It demonstrates that linear programming provides
effective methods for designing cross-talk canceling equalizing filters that greatly increase
the bandwidth of high-speed digital buses.
Contents
Abstract
Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
1.1 Motivation
1.2 Method and Proposed System Structure
1.3 Contributions
1.4 Thesis Outline
2 Background
2.1 Transmission channel limitations
2.2 Equalization
2.2.1 Design Methods
2.2.2 Application of equalizing filters in cross-talk cancellation for the local telephone subscriber loop
2.3 Summary
3 Coupled Distributed RLC Interconnect Model
3.1 Interconnect Model
3.2 Bus parameters and Simulation results
4 Linear Equalizing Filter Design
4.1 Measurements of filter performance
4.2 Preliminaries
4.2.1 Convolution
4.2.2 Matrix Representations of Convolution
4.3 Least Squares Optimization Method with Pseudo-random Input
4.3.1 Motivation
4.3.2 Least Square problem formulation
4.3.3 An example
4.4 Linear Programming Method with Worst-case Input
4.4.1 Motivation
4.4.2 Linear Programming Problem formulation
4.4.3 Smoothing filter
4.4.4 An example
4.5 Testing results: Comparison of LSQ method and LP method
4.5.1 Worst-case input sequence
4.5.2 Indirect coupling
4.5.3 Over-fitting
4.5.4 Minimum bit time
4.6 Time-variant Linear FIR Filter
4.7 Optimized Smoothing Filter
4.8 Summary
5 Predictor-Corrector Algorithm with Model Reduction
5.1 Mehrotra's predictor-corrector algorithm
5.2 Implementation Details
5.2.1 Starting and Stopping
5.2.2 Solving the linear systems
5.3 Ill-conditioning and Model Reduction
5.4 Results
6 Conclusions and Future Work
6.1 Future work
Bibliography
List of Tables
4.1 Performance of equalizing filters with different sizes for a bus 32-bits wide and 5 cm long.
4.2 Performance of equalizing filters with different sizes for buses 32-bits wide. All filters designed using the LP method.
4.3 Performance of different smoothing filters with equalizing filters designed by the LP method at 300 ps.
5.1 linprog() iteration display
5.2 Iteration display of our approach: Mehrotra interior-point method with model reduction.
List of Figures
1.1 Proposed transmission network structure.
2.1 A coupled microstrip transmission line.
2.2 Simple lumped model for two coupled interconnects
2.3 Block diagram of an equalized transmission channel (from [3]).
2.4 Simplified model for full-duplex transmission over a linear multi-input/multi-output channel
3.1 Analytical solution from equation 3.16 vs. Spice simulation results
4.1 An illustrative eye diagram
4.2 Example of a data eye
4.3 Predistorted signal: equalizing filter output
4.4 Examples of output signal for 32-bit interconnect network
4.5 Eye-diagrams for a 32-bit interconnect network
4.6 Frobenius norm of the bus impulse response.
4.7 System with smoothing filter at the receiver end.
4.8 Example of output signals for systems with and without the equalizing filter designed
4.9 Pseudo-random test: eye diagrams for systems with and without the equalizing filter designed
4.10 Worst-case test vs. Pseudo-random test
4.11 Worst-case performance of different equalizing filters designed with the LP method
4.12 Indirect coupling between non-adjacent lines
4.13 Eye diagram for system with equalizing filters designed by the LP method. Grey traces indicate high signal transmitted. Black traces indicate low signal transmitted.
4.14 Magnitude of overshoot increases with the size of the equalizing filter designed with the LP method
4.15 The convolution procedure of the time-variant FIR filter
Acknowledgments
First of all, I would like to thank my supervisor Dr. Mark Greenstreet. This thesis would not
have been possible without his inspiration, extensive support, patience and encouragement.
I also would like to thank my husband, Rui Li, for his consistent support.
JIHONG REN
The University of British Columbia
June 2002
Chapter 1
Introduction
1.1 Motivation
Advances in digital integrated circuit (IC) fabrication technology have resulted in
exponential growth in the speed and integration levels of ICs. With more and more circuits
placed on each die, high-performance systems require larger and larger I/O bandwidth. This
demand has been addressed by increasing the number of high-speed signals and the per-pin
interconnection bandwidth. Although the number of I/Os has increased from
pins in the 1970s, to several hundred pins per IC now [18], this growth is being rapidly
out-paced by the bandwidth demands. To continue to improve overall system performance,
the per-pin interconnection bandwidth must scale with the speed and integration level of
ICs. However, without new approaches, we will soon reach the limit set by the intrinsic
properties of copper lines.
The number of I/Os increases by 12% per year, half of which is due to the increase
in chip perimeter and half of which is due to the increase in pin density. On chip, both the
number of devices and clock rates have increased at 50-60% per year, creating a growing
bandwidth gap. Higher bit-rates and pin densities have come to a point that interconnections
are no longer well-behaved short interconnections. With the decreasing cross sectional ar-
eas of interconnections, the line resistance per unit length has increased to a point that long
interconnections can no longer be considered lossless. Resistive effects are particularly se-
vere at high bit-rates because of both the high frequency roll-off of RC transmission lines
and the increase of resistance with frequency due to the skin effect. To achieve maximum
packing density, designers attempt to place signal lines as close to each other as possible.
This introduces problems of electromagnetic coupling (cross-talk) which are exacerbated
by high data rates. Cross-talk has become a critical issue in interconnect performance and
hence overall system performance. Traditionally, cross-talk is reduced by carefully control-
ling line geometry and arranging circuits to decrease the coupled line length. Moreover,
signaling conventions that are less susceptible to coupled energy can be used. These meth-
ods reduce cross-talk in a somewhat ad-hoc way. For example, as a rule-of-thumb, a ratio of
two-to-one for line spacing against line width is commonly used, based on the assumption
that cross-talk decreases monotonically with the increase in line spacing. However, this
simple assumption can fail for high bit-rate design. The relationship between line spacing
and line width is non-linear, and a two-to-one ratio between width and spacing may actu-
ally result in higher coupled energy than smaller line spacing [11][20]. Furthermore, while
these methods might reduce the amount of cross-talk, the problem of cross-talk still exists.
New approaches in cross-talk reduction are needed.
1.2 Method and Proposed System Structure
Equalizing filters have been used effectively for cross-talk cancellation in acoustic
applications such as telephone subscriber line systems [1][6][7]. Recently, they have been used to
compensate for the frequency-dependent attenuation of transmission lines [2].
Figure 1.1: Proposed transmission network structure. The diagram shows a transmitter feeding
a filter network (one filter per wire), the bus, and a receiver.
This thesis explores the effectiveness of equalizing filters in cross-talk cancellation
for high-bandwidth, digital communication. The proposed system structure is depicted in
figure 1.1. In this transmission system, an equalizing filter is assigned to each wire of the
bus. Each filter takes the input signals on a wire and its adjacent wires as its inputs, and
outputs a predistorted signal onto the wire. For an $n$-bit bus, the filter system can be viewed
as an $n \times n$ network. Cross-talk is eliminated if the filter network is designed so that
the concatenation of the filter network and the bus has a frequency response in the form of a
diagonal matrix.
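As a minimal sketch of the diagonalization idea at a single frequency, assume a hypothetical 3-wire bus whose response matrix H has unit gain on each wire and a coupling gain c = 0.2 between adjacent wires. These values are chosen for illustration and are not taken from the thesis. The neighbour-only filter network G below, which subtracts c times each adjacent input, is a simple first-order canceller and not the optimized design developed in Chapter 4:

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_off_diagonal(m):
    """Largest magnitude among the off-diagonal (cross-talk) entries."""
    n = len(m)
    return max(abs(m[i][j]) for i in range(n) for j in range(n) if i != j)


c = 0.2  # assumed coupling gain between adjacent wires (illustrative)

# Hypothetical bus response: each wire couples only to its direct neighbours.
H = [[1.0, c, 0.0],
     [c, 1.0, c],
     [0.0, c, 1.0]]

# Hypothetical filter network: each filter sees its own wire and its neighbours.
G = [[1.0, -c, 0.0],
     [-c, 1.0, -c],
     [0.0, -c, 1.0]]

HG = matmul(H, G)  # response of the filter network followed by the bus

crosstalk_before = max_off_diagonal(H)
crosstalk_after = max_off_diagonal(HG)
```

With these assumed values the adjacent-wire terms of the product cancel, and only a second-order term of magnitude c squared remains between the two outer wires, so the product is much closer to a diagonal matrix than H itself.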
Several optimal filter design strategies are explored, such as the linear programming
method and the least-squares method. Matlab simulation results show that the resulting
filters dramatically reduce cross-talk and substantially increase the maximum bandwidth
that can be achieved by buses on PC boards. Thus, the equalizing filter method is promising
for cross-talk cancellation and merits further investigation.
1.3 Contributions
This thesis demonstrates that linear programming models provide effective methods for
designing cross-talk canceling equalizing filters that greatly increase the bandwidth of high
speed digital buses on printed circuit boards. The following are the major contributions
supporting this thesis:
• Equalizing filter design for high speed digital buses can be formulated as a least
squares optimization problem, using an $L_2$ metric for optimality. This metric ensures
the quality of the received signal on average.
• The $L_\infty$ metric corresponds to the traditional eye height measurement of signal
integrity and guarantees worst-case performance. The filter design problem for
$L_\infty$ optimality can be formulated as a linear programming problem.
• An evaluation of the linear programming and least squares methods for a variety of
filter configurations shows that both offer a dramatic increase in bandwidth when
compared with a bus with no filter or with transmitter pre-emphasis without cross-talk
cancellation. Furthermore, the filters designed for the $L_\infty$ optimality criterion
using linear programming significantly outperform their counterparts designed by the
traditional least-squares method, when evaluated for digital data transmission.
• To evaluate these methods, I implemented them both using Matlab. In doing so,
I found that the Matlab optimization package does not always converge for the linear
programming problems presented in this thesis. Therefore, I implemented an interior-point
method with a model reduction technique that successfully solves the linear
programming problems encountered.
1.4 Thesis Outline
In this thesis, Chapter 2 introduces the equalizing filter technique and its existing applica-
tions. Chapter 3 describes a coupled distributed RLC model for transmission lines. Based
on this model, Chapter 4 discusses various techniques, such as least squares and linear
programming, that I explored to design optimal linear FIR equalizing filters. Chapter 5 is
devoted to Mehrotra's interior point method with a model reduction technique that is used
to solve the particular linear programming problem introduced in Chapter 4.
Chapter 2
Background
Computer system performance is often limited by communication bandwidths between
chips and between subsystems. A typical signaling system consists of a transmitter, a chan-
nel, and a receiver. The transmitter encodes digital information as analogue waveforms on
the transmission channel, such as a circuit board trace. On the other end of the transmission
channel, the receiver samples and quantizes the signal to recover the original digital
information. Although we often think of transmission channels such as wires as ideal, with
zero resistance, capacitance and inductance, real wires have parasitic circuit elements
that depend on their geometry. Moreover, with
small line spacing, inductive and capacitive cross-talk can severely degrade signal integrity.
With the growth in integration levels, interconnect line width and spacing decrease,
and interconnect has become a bottleneck in high-speed digital designs.
This chapter first discusses the channel characteristics, particularly PC board traces.
I then provide background on the equalizing filter technique and an overview of its related,
existing applications.
Figure 2.1: A coupled microstrip transmission line. The drawing labels the trace width w,
trace thickness t, trace spacing s and dielectric height h.
Figure 2.2: Simple lumped model for two coupled interconnects
2.1 Transmission channel limitations
Transmission channels, such as PC board traces and coaxial or twisted-pair cables, have
limited bandwidths that are determined by their physical characteristics: the size and con-
struction of their conductor and shield, and the dielectric material. In this thesis, I am
particularly interested in high-speed interconnect on PC boards. Thus, the following dis-
cussion focuses on PC board traces. Figure 2.1 shows typical microstrip interconnections.
A simple lumped model for two coupled interconnects is shown in figure 2.2.
The resistance per unit length of a trace is given by the resistivity of the trace
material (typically copper) divided by the cross-sectional area of the trace. The cross-sectional
area is the product of the width of the trace and its thickness. The width is determined by the
design. The thickness is specified when the board is manufactured, in ounces of copper
per square foot. A board with 1 oz copper has a conductor thickness
of roughly 35 microns. More accurate models consider the skin effect: at high frequencies,
currents flow closer to the surface of the trace, resulting in a frequency-dependent increase
in the series resistance [10][3].
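The low-frequency resistance calculation described above can be sketched as follows. The trace width (75 µm) and thickness (34.5 µm) are the values quoted later in Section 3.2. The resistivity of 1.72e-8 ohm·m is the commonly quoted room-temperature figure for annealed copper and is an assumption here, since the thesis does not state the value it used. The skin effect is ignored.

```python
# Approximate room-temperature resistivity of annealed copper (assumed value).
RESISTIVITY_COPPER_OHM_M = 1.72e-8

trace_width_m = 75e-6        # trace width from Section 3.2
trace_thickness_m = 34.5e-6  # 1 oz copper, from Section 3.2

# Cross-sectional area is width times thickness.
area_m2 = trace_width_m * trace_thickness_m

# DC resistance per unit length = resistivity / cross-sectional area.
resistance_ohm_per_m = RESISTIVITY_COPPER_OHM_M / area_m2
resistance_ohm_per_cm = resistance_ohm_per_m / 100.0
```

Under these assumptions the result is roughly 0.066 ohm/cm, which is in line with the series resistance listed among the bus parameters in Section 3.2.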
The capacitance per unit length ($C$) and the inductance per unit length ($L$) of a
microstrip trace are determined by many factors including its width and height and its
separation from the ground plane. Electric and magnetic fields between adjacent traces lead to the
coupling capacitance, $C_m$, and the mutual inductance, $L_m$, respectively.
For PC board traces, the loss in transmission is primarily due to the series resistive
component of the copper ($R$). Because of this loss, without a special transmission scheme,
off-chip signaling on long wires, even with good current-mode signaling methods, is limited
to about 1GHz [2]. Full-swing unterminated signaling methods that are used in most digital
systems have even lower limits. With narrow wires and smaller line spacing, the coupling
inductance and capacitance between adjacent lines approach the level of self-inductance and
capacitance. In high speed circuits, because of fast signal rise times, coupling effects are
severe and have become a primary concern for present and future high-speed high-density
circuit design. Besides the resistive properties of the line, the coupling effects further limit
the maximum bit-rate at which data can be transmitted correctly.
2.2 Equalization
An ideal transmission channel would in all cases deliver the near-end signal $V_{in}(t)$ from
the driver without distortion to the far-end receiver, i.e. $V_{out}(t) = V_{in}(t - \tau)$, where $\tau$ is
the propagation delay across the channel. Thus, an ideal channel would have the transfer
function $H(s) = e^{-s\tau} I$, where $s = j\omega$ and $I$ is the identity matrix. If an equalizing filter has a
transfer function that equals the inverse of the transfer function of the channel, the
concatenation of the equalizer and the channel has a flat frequency and phase response. This is the
equalization technique widely used to actively compensate for the channel transfer
function. Channel equalization can be performed at the transmitter end, as shown in figure 2.3,
preceding the actual channel driver. Transmitters that utilize equalizing filters are called
pre-distorting transmitters. The equalizing filter can also be incorporated into the receiver,
which is called receiver equalization, or split between the two ends.
Figure 2.3: Block diagram of an equalized transmission channel (from [3]): transmitter,
equalizer G(s), channel H(s).
Pre-distorting Transmitters
Pre-distorting transmitters integrate equalizing filters, commonly realized as finite
impulse response (FIR) digital filters. While infinite impulse response (IIR) [9] fil-
ters can be more flexible than FIR, they are generally not used for high data rate
transmission because of the difficulty of calculating the IIR recurrence (i.e. feed-
back) at very high rates. The inputs to the equalizing FIR filters are the present and
past transmitted symbols. The output of the FIR filter is a weighted sum of these
symbols. The length of the filter depends on the number of symbols that affect the
response of the channel to the current symbol. The filter coefficients depend on the
channel characteristics.
Pre-distorting transmitters were first used by Poulton et al. [2] in a serial channel over
copper wires at 4Gb/s to reduce intersymbol interference caused by frequency depen-
dent attenuation of the channel. Later, other groups [4][17] used the same technique
to design high-speed serial link transceivers. FIR equalizing filters built into trans-
mitters are easy to implement at very high speed because of the availability of trans-
mitted symbols at the transmitter end. Furthermore, because the transmitted symbols
are either 1s or 0s, multiplication with the filter coefficients is easy. For example, in
[2], a five-tap FIR filter is implemented with digital adders, and a digital-to-analog
converter (DAC) is used to generate pre-distorted pulses. However, because
transmitters generally do not have information about the received signals, FIR filter coefficients
are obtained either by characterization of channel properties in advance [2][4], or by
adaptive implementation with feedback information from the receiver end [17].
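The weighted sum computed by a pre-distorting FIR filter can be sketched as below. The function name, the three tap values and the bit pattern are hypothetical and chosen only for illustration. They are not the five-tap design of [2].

```python
def fir_predistort(symbols, taps):
    """Return the FIR output for each symbol time.

    The output at time n is the sum over k of taps[k] * symbols[n - k], i.e. a
    weighted sum of the present symbol and the len(taps) - 1 past symbols.
    Symbols before the start of the sequence are treated as 0.
    """
    output = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * symbols[n - k]
        output.append(acc)
    return output


# Hypothetical coefficients: main tap followed by two correction taps.
taps = [1.0, -0.25, 0.1]
symbols = [0, 1, 1, 0, 1]  # transmitted bits

levels = fir_predistort(symbols, taps)
```

Because each symbol is either 0 or 1, every product in the inner loop is either zero or the tap value itself, which is why the text notes that the multiplication is easy to implement in hardware.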
Receiver Equalization
Receiver equalization can be realized either with analog filters preceding the analog-
to-digital converter (ADC) or with digital filters following the ADC. The latter one is
the usual technique because digital filters are easy to implement and adapt. Moreover,
more complex and non-linear filters can be implemented. However, it is well-known
that receiver equalization amplifies high frequency noise [8]. Furthermore, historically,
high speed ADC technology has lagged behind high speed DAC technology. Therefore,
pre-distorting transmitters are commonly used in high speed transmission systems
that run at GHz speeds. Recently, Horowitz's group realized an 8-Gsamples/s ADC in
0.25 µm CMOS, which makes high speed links with equalization at the receiver end
possible [19].
2.2.1 Design Methods
The following are two methods that are currently used to design equalizing filters.
Zero-forcing method
The transfer function of the channel can be derived from models established for
each particular channel (reviewed in [18]). The frequency response of the channel,
and from it the desired frequency response of the equalizing filter, is then calculated at
each frequency point. This set of discrete points is used to obtain a discrete impulse
response function using the inverse Fourier transform. The following two steps are used
to obtain a more manageable impulse response function.
Windowing: $h_w(n) = h_d(n)\,w(n)$, where $h_d(n)$ is the desired impulse response
and $w(n)$ is the windowing function. This step is needed to obtain a filter with
a finite number of taps.
Delaying: $h_w(n)$ is shifted to the right until the samples are all indexed by a
non-negative integer to obtain a causal filter.
In practice, large windows must be used to obtain effective equalizing filters. Ac-
cordingly, many researchers have turned to using optimization methods to obtain
good approximate equalizing filters. This is the approach that I take in this thesis.
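The windowing and delaying steps can be sketched as follows, assuming a rectangular window and a desired impulse response stored as a mapping from (possibly negative) sample index to value. The function name and sample values are hypothetical. Practical designs often use tapered windows instead.

```python
def window_and_delay(h_desired, half_width):
    """Truncate and shift a two-sided impulse response into a causal tap list.

    h_desired maps integer sample indices (which may be negative) to values.
    Windowing: a rectangular window keeps only indices in
    [-half_width, half_width], giving a finite number of taps.
    Delaying: the kept samples are shifted right by half_width so that
    tap k of the result corresponds to original index k - half_width.
    """
    return [h_desired.get(n, 0.0) for n in range(-half_width, half_width + 1)]


# Hypothetical desired (non-causal) impulse response of an equalizer.
h_desired = {-3: 0.01, -2: -0.05, -1: 0.2, 0: 1.0, 1: -0.3, 2: 0.08, 3: -0.02}

taps = window_and_delay(h_desired, half_width=2)
```

The samples at indices -3 and 3 fall outside the window and are discarded, which is the approximation error that motivates the large windows mentioned above.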
Least Squares Minimization
With an ideal transmission channel, the received signal is a delayed version of the
transmitted signal. Using least squares minimization, the equalizing filter design
problem becomes that of determining the filter coefficients that minimize the $L_2$ norm
of the difference between the received signal and the delayed version of the
transmitted signal.
This method is used in optimal pre-emphasis equalizing filter design in [2][19] to
build serial links that operate at over 1 Gigabit per second. It is also widely used to
design equalizers for telephone subscriber systems [1][6][7].
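A minimal single-channel sketch of least squares equalizer design is given below. It assumes a hypothetical two-sample channel impulse response and chooses the FIR taps so that the channel convolved with the filter is as close as possible, in the $L_2$ sense, to a delayed unit impulse. This impulse-response formulation and the helper names are assumptions for illustration. The pseudo-random-input formulation of Section 4.3 is not reproduced here.

```python
def convolution_matrix(h, n_taps):
    """Matrix A such that A @ g equals the convolution of h with taps g."""
    rows = len(h) + n_taps - 1
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n_taps)]
            for i in range(rows)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares_equalizer(h, n_taps, delay):
    """Taps g minimizing ||h * g - delayed impulse||_2 via the normal equations."""
    a = convolution_matrix(h, n_taps)
    rows = len(a)
    target = [1.0 if i == delay else 0.0 for i in range(rows)]
    ata = [[sum(a[r][i] * a[r][j] for r in range(rows)) for j in range(n_taps)]
           for i in range(n_taps)]
    atd = [sum(a[r][i] * target[r] for r in range(rows)) for i in range(n_taps)]
    g = solve(ata, atd)
    residual = [sum(a[r][j] * g[j] for j in range(n_taps)) - target[r]
                for r in range(rows)]
    return g, sum(e * e for e in residual)


channel = [1.0, 0.5]  # hypothetical channel: main pulse plus one echo
taps, err_sq = least_squares_equalizer(channel, n_taps=3, delay=0)
```

For this assumed channel the squared error without any filter is 0.25 (the echo sample), and the three-tap least squares filter reduces it by more than an order of magnitude.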
Figure 2.4: Simplified model for full-duplex transmission over a linear multi-input/multi-output
channel. $H(t)$, $G(t)$, $P(t)$ and $R(t)$ are the impulse responses of the far-end channel,
near-end channel, transmitter filter and receive filter respectively. The diagram also labels
the signals a(t), b(t) and n(t).
2.2.2 Application of equalizing filters in cross-talk cancellation for the local
telephone subscriber loop
Equalizing filters are used to reduce intersymbol interference caused by the characteristics
of a single channel [2][4][17][19]. Until now, no work has been reported on the application
of equalizing filters in cross-talk cancellation for high speed buses that run at multi-Gb/s.
Along with the limited bandwidth of transmission channels, cross-talk is another critical
problem that limits the maximum data rate that can be achieved by high density wide buses.
Local telephone subscriber loops have the same problem. Bundles of twisted copper wires
are used in local telephone subscriber loops. Because of the close physical proximity, cross-
talk interference from neighbouring channels is one of the major limitations on the max-
imum data rate that can be achieved over the loops [7]. Multichannel equalization can
effectively suppress both near- and far-end cross-talk [6][7].
In these papers, a cable of twisted pairs that is terminated at a single physical loca-
tion is treated as a single multi-input/multi-output channel. Cross-talk is then characterized
by off-diagonal components of the matrix impulse response of the channel. The multichan-
nel adaptive FIR equalizers, the transmitter and the receiver process the entire vector of
inputs and outputs (see figure 2.4). Rather than directly diagonalizing the system
transfer function matrix, the multichannel equalizers are designed to minimize the $L_2$ norm of
the difference between the received signal and the transmitted waveform. In Salz's work
[16], the minimum mean square error (MMSE) linear equalizer for the channel is
completely specified, assuming uncorrelated data and white noise. Later, Honig et al. [6]
generalized Salz's work by assuming correlated data symbols, pulse amplitude modulation
(PAM) signals and colored noise.
2.3 Summary
The equalization technique has been successfully used to compensate for resistive effects of
transmission lines [2][4][17]. With this technique and carefully chosen signaling methods,
multi-Gb/s serial links have been built. Equalization is also commonly used in telephone
subscriber systems to cancel near-end and far-end cross-talk [7][1][6]. In this thesis, I
explored the effectiveness of the equalization technique in cross-talk cancellation for high-
speed, off-chip buses. Moreover, whereas least squares optimization is the technique
commonly used to design equalizing filters, this thesis is the first work to formulate the
optimal equalizing filter design problem for high-speed digital buses as a linear
programming problem.
Chapter 3
Coupled Distributed RLC
Interconnect Model
3.1 Interconnect Model
An electrical model of a uniform transmission line has inductance $L$, resistance $R$,
capacitance $C$ and parallel conductance $G$, all per unit length. The term $G$ models the effects of
current leakage and is practically zero for most digital transmission on integrated circuits
and printed circuit boards.
We would like our system to be able to operate at bit rates greater than 2 Gbits/sec.
Assuming that the rise and fall times are 10% of the bit time, edges have an electrical length
of $l$ = Rise time (ps)/Delay (ps/cm) = 50 (ps)/33 (ps/cm) = 1.51 cm, where 33 ps/cm is
the propagation delay of light in a vacuum. The propagation delay of signals traveling in other media
such as a PCB trace is larger [10], and thus the corresponding electrical length is
even smaller. For example, the common FR-4 printed circuit board material has a dielectric
constant of about 4.5 and a propagation delay of about 71 ps/cm, so the electrical length of an
edge at 2 Gbits/sec is 0.7 cm. As a rule of thumb, distributed models should be used when the wire
length is greater than or equal to $l/6$. Thus the critical dimension separating lumped from
distributed systems for printed circuit boards is 0.117 cm. The wire lengths we consider here
are in the range of 2 to 50 cm. Thus a distributed model is needed to correctly model the
behavior of this system at multi-gigabit/sec data rates. Assuming the TEM mode of wave
propagation, for a lossy multiconductor system of $n$ wires, we have an $n \times n$ inductance matrix
$L$, capacitance matrix $C$ and resistance matrix $R$, where $L_{ij}$ and $C_{ij}$ are the mutual inductance
and coupling capacitance between lines $i$ and $j$, respectively. For simplicity, the following
assumptions are made:
• Coupling between lines is entirely due to mutual inductance and mutual capacitance.
There is no conductance between wires of the bus or between wires of the bus and
ground. Only coupling between adjacent lines is taken into account. We ignore
direct coupling between wires of the bus that are not adjacent.
• Every wire is assumed to have the same characteristics.
• Wires are assumed to be arranged around a cylinder so that every wire is the same as
the others.
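The lumped-versus-distributed estimate made earlier in this section can be reproduced with a short calculation. This is a sketch under two assumptions: the propagation delay in the dielectric is the vacuum delay scaled by the square root of the dielectric constant, and the rule-of-thumb threshold is one sixth of the electrical length of an edge, which is the factor implied by the 0.7 cm and 0.117 cm figures in the text.

```python
import math

VACUUM_DELAY_PS_PER_CM = 33.36  # reciprocal of the speed of light

bit_rate_hz = 2e9
bit_time_ps = 1e12 / bit_rate_hz    # 500 ps at 2 Gbits/sec
rise_time_ps = 0.10 * bit_time_ps   # rise time is 10% of the bit time

dielectric_constant = 4.5           # FR-4, as quoted in the text
fr4_delay_ps_per_cm = VACUUM_DELAY_PS_PER_CM * math.sqrt(dielectric_constant)

# Electrical length of an edge and the critical wire length (assumed l / 6 rule).
edge_length_cm = rise_time_ps / fr4_delay_ps_per_cm
critical_length_cm = edge_length_cm / 6.0


def needs_distributed_model(wire_length_cm):
    """True when the wire is at least as long as the critical dimension."""
    return wire_length_cm >= critical_length_cm
```

With these assumptions the delay comes out near 71 ps/cm, the edge length near 0.7 cm and the critical length near 0.12 cm, so every wire in the 2 to 50 cm range considered here requires a distributed model.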
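With the above assumptions, the $R$ and $L$ matrices are shown below. The capacitance
matrix $C$ has the same structure as $L$.
$$
R = \begin{pmatrix} R_0 & & \\ & \ddots & \\ & & R_0 \end{pmatrix}, \qquad
L = \begin{pmatrix}
L_0 & L_m & 0 & \cdots & L_m \\
L_m & L_0 & L_m & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
L_m & 0 & \cdots & L_m & L_0
\end{pmatrix}
\qquad (3.1)
$$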
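The behavior of this distributed system can be described by the following partial
differential equations, where the voltage vector $V$ and current vector $I$ are both functions of
position $x$ and time $t$.
$$\frac{\partial V(x,t)}{\partial x} = -R\,I(x,t) - L\,\frac{\partial I(x,t)}{\partial t} \qquad (3.2)$$
$$\frac{\partial I(x,t)}{\partial x} = -C\,\frac{\partial V(x,t)}{\partial t} \qquad (3.3)$$
Taking the Fourier transform of these equations yields:
$$\frac{\partial \hat{V}(x,\omega)}{\partial x} = -(R + j\omega L)\,\hat{I}(x,\omega) \qquad (3.4)$$
$$\frac{\partial \hat{I}(x,\omega)}{\partial x} = -j\omega C\,\hat{V}(x,\omega) \qquad (3.5)$$
where $\hat{V}$ is the Fourier transform of $V$, $\hat{I}$ is the Fourier transform of $I$, and $j = \sqrt{-1}$.
Differentiating equation 3.4 with respect to $x$ and substituting equation 3.5 into the result
gives
$$\frac{\partial^2 \hat{V}(x,\omega)}{\partial x^2} = (R + j\omega L)(j\omega C)\,\hat{V}(x,\omega) \qquad (3.6)$$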
Let . Let be a diagonalizing matrix for , i.e., is the diag-
onal matrix whose diagonal elements are the eigenvalues of . Rewriting equation 3.6
with yields:
(3.7)
Let and , we get
(3.8)
This differential equation has the general solution
(3.9)
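For intuition, the single-wire (scalar) case of this solution can be evaluated numerically. The sketch below assumes that the values 0.066 ohm/cm, 3.99 nH/cm and 0.8 pF/cm listed in Section 3.2 are the per-unit-length series resistance, self-inductance and self-capacitance, and it ignores coupling entirely. The names gamma, alpha and beta are conventional labels introduced here, not the thesis's notation.

```python
import cmath
import math

R = 0.066     # ohm/cm, assumed series resistance per unit length
L = 3.99e-9   # H/cm, assumed self-inductance per unit length
C = 0.8e-12   # F/cm, assumed self-capacitance per unit length


def propagation_constant(freq_hz):
    """Scalar propagation constant sqrt((R + jwL)(jwC)) in 1/cm."""
    omega = 2.0 * math.pi * freq_hz
    series_impedance = complex(R, omega * L)
    shunt_admittance = complex(0.0, omega * C)
    # cmath.sqrt returns the root with non-negative real part, which is the
    # decaying forward-travelling wave.
    return cmath.sqrt(series_impedance * shunt_admittance)


gamma = propagation_constant(1e9)
alpha = gamma.real   # attenuation in nepers per cm
beta = gamma.imag    # phase shift in radians per cm

# Amplitude remaining after a 5 cm line for the forward wave alone.
remaining_amplitude = math.exp(-alpha * 5.0)
```

The fact that gamma has both a real and an imaginary part is the scalar counterpart of the complex-valued solution discussed next. For a low-loss line the attenuation is close to R divided by twice the characteristic impedance.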
For a bus with non-zero resistive and capacitive or inductive components, the elements of
and are complex numbers. Combining equation 3.9 with the definition of yields:
(3.10)
Assuming all source ends are terminated with an impedance of and the load ends are
left open, we have the following boundary conditions.
length (3.11)
(3.12)
Combined with equation 3.4 and 3.10, the first boundary condition given above yields:
length (3.13)
From equation 3.10, we know that:
(3.14)
Thus, equation 3.12 yields:
(3.15)
Equations 3.13, 3.15 yield the final solution
(3.16)
with
length length
length
(3.17)
where is the identity matrix. Note that , and . Thus, the
frequency response of the bus is:
length length (3.18)
with defined in equation 3.17. The inverse Fourier transform yields the impulse
response of the bus which is used extensively in the next chapter. Note that the frequency
response of the bus is a square matrix at each frequency. The impulse response of the bus
is also a square matrix at each time sample. Entry $(i, j)$ at time $t$ denotes the response on
wire $i$ at time $t$ given an impulse input on wire $j$ at time 0.
3.2 Bus parameters and Simulation results
I validated the model derived above by comparing its predictions with Spice simulations.
Figure 3.1a shows the solution of equation 3.16 computed with Matlab, and figure 3.1b shows the Spice
simulation results. The parameters used in both simulations are: bus width = 3, length =
5 cm, R = 0.066 ohm/cm, C = 0.8 pF/cm, L = 3.99 nH/cm, = 0.31, = 0.23,
= 5.0 V, bit time = 500 ps, = 10% * bit time = 50 ps. These parameters correspond to
microstrip lines 34.5 µm thick (1 oz copper), 75 µm wide with 75 µm separation between
lines, running above a ground plane with a dielectric thickness of 100 µm and a dielectric
constant of 4.5. The bus parameters are computed using formulas given in [10].
Figure 3.1: Analytical solution from equation 3.16 (upper panel) vs. Spice simulation
results (lower panel) of 3-bit bus. All lines are quiet except line 1.
Chapter 4
Linear Equalizing Filter Design
In this chapter, I present techniques for the design of linear equalizing filters. I first in-
troduce the idea of a data eye and its use to quantify filter performance. The next section
defines notations that simplify the mathematical presentation of linear equalizing filter de-
sign. Then, I introduce the least squares (LSQ) method and the linear programming (LP)
method, followed by test results. Finally, based on the linear FIR filter designs, time-variant
FIR filter design and optimal smoothing filter design are discussed.
4.1 Measurements of filter performance
The effects of distortion and noise are often illustrated using eye diagrams. An illustrative
eye diagram is shown in figure 4.1. It is called an eye diagram because of its shape. During the
sample interval, the signal must be either distinctly high or distinctly low; it must not go through the
center of the eye. This allows the receiver to unambiguously determine the value of the
bit that was transmitted. The signal can change between sampling intervals. I also restrict
how high (or low) the signal may go; otherwise, with scaling, any eye opening can be made
arbitrarily large.
Figure 4.1: An illustrative eye diagram. [Plot of v(t) against t across a sample interval and the next sample interval, marking the eye width w, the high and low targets, $h_{under}$, $h_{over}$, and the good and bad signal regions.]
Eye height, $h_{height}$, is defined as
height under target over (4.1)
where $h_{under}$ and $h_{over}$ are defined in figure 4.1. The eye height and width are often used as
an indication of signal integrity. Figure 4.2 shows how a data eye is formed by overlaying
a signal waveform over multiple cycles.
The eye width, $w$ in figure 4.1, is the time during which the separation between high-going
and low-going signals is greater than zero. In practice, the receiver will attempt to sample
the signal near the moment of the widest eye opening. Due to uncertainties in the timing of
the transmitter and receiver and in the delay of the interconnect, the actual sampling may
occur at some time other than this ideal. The eye-width gives an indication of the robustness
of the interface to these timing uncertainties.
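The way a data eye is formed, and how its width can be read off, can be sketched in Python as follows. This is a minimal illustration, not the measurement code used for the results in this thesis: it assumes the waveform is already aligned so that each bit occupies exactly `taps_per_bit` consecutive samples, it folds over one bit period only, and the function names are hypothetical.

```python
def fold_into_eye(samples, bits, taps_per_bit):
    """Cut a waveform into one trace per bit and group the traces by bit value."""
    high, low = [], []
    for k, bit in enumerate(bits):
        trace = samples[k * taps_per_bit:(k + 1) * taps_per_bit]
        if len(trace) < taps_per_bit:
            break  # ignore an incomplete final bit
        (high if bit > 0 else low).append(trace)
    return high, low


def eye_opening(high, low, taps_per_bit):
    """Separation at each tap: lowest high-going trace minus highest low-going trace."""
    return [min(tr[t] for tr in high) - max(tr[t] for tr in low)
            for t in range(taps_per_bit)]


def eye_width_in_taps(opening):
    """Longest run of consecutive taps whose separation is greater than zero."""
    best = run = 0
    for value in opening:
        run = run + 1 if value > 0 else 0
        best = max(best, run)
    return best


# Illustrative waveform: four bits, four taps per bit.
bits = [1, -1, 1, -1]
samples = [0.0, 0.8, 0.9, 0.1,
           0.0, -0.8, -0.9, -0.1,
           0.1, 0.7, 1.0, 0.0,
           -0.1, -0.9, -0.7, 0.0]
high, low = fold_into_eye(samples, bits, 4)
opening = eye_opening(high, low, 4)
width = eye_width_in_taps(opening)
```

In this example the eye is open only at the two middle taps, so the eye width is half of the bit time.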
In this thesis, the effectiveness of a filter is quantified in the following three ways:
• the eye height of the output signal given a pseudo-random input sequence;
• the eye height of the output signal given the worst-case input sequence;
• the smallest bit time (or highest bit rate) at which the eye height of the output signals
is greater than a specified amount, e.g. 50% of the nominal signal level, and the eye
width is greater than another specified amount, e.g. 25% of the bit time.
Figure 4.2: Example of a data eye. The upper panel shows a random signal. Its corresponding
eye diagram is shown in the lower panel.
4.2 Preliminaries
By defining some notation up-front, the presentation of the filter design methods can be
more succinct and direct. The responses of filters and buses are naturally written as con-
volutions while linear and least squares problems are naturally formulated with matrices.
Here I define some notation to show the connection between various convolutions and their
corresponding matrix representations.
Let be a vector of size . The components are . I'll write
to denote the size of , to denote the norm of , and to denote the norm
of .
Some matrix abbreviations used below are:
The identity matrix
The matrix of zeros
The matrix where
(4.2)
4.2.1 Convolution
Linear Convolution: Let and be two vectors. The linear convolution of and is
the vector of size defined below:
(4.3)
Linear convolution is commutative and associative.
Circular Convolution: Let and be two vectors in . Let denote the
circular convolution of and :
(4.4)
Circular convolution is commutative and associative.
Let be a vector and be an integer with . The zero-extension of pads
with zero elements to produce a vector of size :
extend (4.5)
Zero-extension is a linear operator:
extend (4.6)
Let extend be the left matrix on the right hand side of the equation.
Linear convolution can be expressed as circular convolution of zero-extended vec-
tors:
extend extend (4.7)
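The relationship in equation 4.7 can be checked numerically with a short Python sketch. The function names are hypothetical, vectors are indexed from 0, and the sketch assumes the usual definitions of linear and circular convolution; zero-extension pads at the end of the vector.

```python
def linear_convolution(u, v):
    """Linear convolution; the result has len(u) + len(v) - 1 entries."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def circular_convolution(u, v):
    """Circular convolution of two equal-size vectors (indices wrap modulo n)."""
    n = len(u)
    if len(v) != n:
        raise ValueError("vectors must have the same size")
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


def extend(v, n):
    """Zero-extension: pad v with zeros to produce a vector of size n."""
    if n < len(v):
        raise ValueError("target size is smaller than the vector")
    return list(v) + [0.0] * (n - len(v))


u = [1.0, 2.0, 3.0]
v = [0.5, -1.0]
n = len(u) + len(v) - 1
via_circular = circular_convolution(extend(u, n), extend(v, n))
```

Padding both vectors to the size of the linear result leaves no room for wrap-around, which is why the two convolutions agree.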
Block Linear Convolution: Let be a matrix. We can think of as a column
of matrices:
...
(4.8)
where each of the is a matrix. The block linear convolution of and
is defined similarly as linear convolution:
(4.9)
The block linear convolution of matrix and vector is defined simi-
larly.
Block Circular Convolution: The block circular convolution of and , where
:
(4.10)
Block circular convolution is associative. It is commutative if the product of the sub-
matrices is commutative, for example, if the sub-matrices are all symmetric or all circulant
(circulant matrices are defined in section 4.2.2 below). Extending the extend operator to block
matrices, let be a matrix, and let .
extend (4.11)
Zero extension on block matrices is a linear operator just as it is for vectors.
Block linear convolution can be expressed as block circular convolution of zero-
extended matrices:
extend extend (4.12)
where .
4.2.2 Matrix Representations of Convolution
In this section, I will first present matrix representations for linear convolution, then extend
it to block linear convolution.
Let be a vector. Let be the circulant matrix [5] generated by
:
(4.13)
The form of this circulant matrix is depicted below:
......
......
(4.14)
Let and be two vectors of the same size. Equations 4.4 and 4.13 yield:
(4.15)
Furthermore, if , , . . . , are all vectors of the same size, then
(4.16)
Note that matrix multiplication of circulant matrices is commutative and associative, just
like the corresponding convolution.
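The correspondence between circulant matrices and circular convolution can be illustrated with a small Python sketch. It assumes the convention that a circulant matrix is generated by its first column, so that entry (i, j) equals v((i - j) mod n); the helper names are hypothetical.

```python
def circulant(v):
    """Circulant matrix whose first column is v: entry (i, j) is v[(i - j) mod n]."""
    n = len(v)
    return [[v[(i - j) % n] for j in range(n)] for i in range(n)]


def mat_vec(matrix, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def mat_mul(a, b):
    """Matrix-matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def circular_convolution(u, v):
    """Circular convolution of two equal-size vectors."""
    n = len(u)
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


u = [1, 2, 3]
v = [4, 5, 6]
product = mat_vec(circulant(u), v)       # circulant matrix times vector
convolved = circular_convolution(u, v)   # the same values, computed directly
```

Under this convention, multiplying two circulant matrices gives the circulant matrix of the convolved generators, and the order of the factors does not matter.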
Let be a matrix, and let row be the vector such that
row (4.17)
Likewise, let col be the vector such that
col (4.18)
Convolution can be expressed with all arguments represented as matrices:
col
(4.19)
Using equation 4.7, linear convolution can be expressed using matrix multiplication:
extend extend (4.20)
Define as the matrix given by
extend (4.21)
The form of this matrix is depicted below:
[Matrix whose columns are successively shifted copies of v(0), v(1), ..., v(m-1), with zeros elsewhere.] (4.22)
where . The linear convolution of can be written as
col
(4.23)
where
(4.24)
The matrix representation for linear convolution described above can be extended
to block linear convolution. Let be a matrix. As described in the previous
section, the matrix can be regarded as a column of submatrices of dimension
each.
The block circulant matrix generated by is
......
...
(4.25)
For those who prefer formulas to ellipses:
div div(4.26)
Let be matrices. Equations 4.10 and 4.25 yield:
(4.27)
Using equation 4.12, block linear convolution of and
can be expressed using matrix multiplication:
extend extend
extend
col
(4.28)
where is defined as extend , , and col is
defined in the obvious manner.
Block linear convolution of with can also be expressed as
matrix multiplication:
extend (4.29)
where .
Define the following operators:
block creates circulant blocks from vector .
div (4.30)
The form of this matrix is depicted below:
...
(4.31)
vec2cir converts a vector to a circulant matrix:
vec2cir extend block (4.32)
Define as the matrix given by vec2cir . This matrix has the
same form as (see equation 4.14), except that now each block is a circulant
matrix of size . Notice that is a block circulant matrix and
extend col
With these operators, it is straightforward to see that for and ,
col (4.33)
where .
The block linear convolution of two vectors and is defined
as:
col (4.34)
where . The block linear convolution of
can be written as
col
(4.35)
where
(4.36)
4.3 Least Squares Optimization Method with Pseudo-random
Input
4.3.1 Motivation
As discussed in the previous section, an ideal bus would in all cases deliver the near-end
signal to the far-end receiver without distortion, with some amount of delay. Thus, in the
ideal case, the expected output signal is simply a delayed version
of the input signal. The goal of filter design is to find a set of filter coefficients that makes
the output signal as close to this ideal output signal as possible. Following the example of
[6][7], I use RMS error ($L_2$ metric) in this section as a measure of the distance of the filter
output from the ideal, delayed signal. In this case, filter design can be formalized as a least
squares optimization problem. In section 4.4, I use the worst-case difference between a signal
and the target as a measure of distance ($L_\infty$ metric) and show that the resulting filter design
problem is an instance of linear programming.
4.3.2 Least Squares problem formulation
Input
Consider a bus with bus wires. Let input denote the length of the input training
sequence in bit times. Thus, an input is a function that gives a value, +1 or -1, for
each wire bus at each time input . This function can be
represented by a vector, input, with input bus denoting the value of the
wire at time . Because filter coefficients are given in tap times, oversampling is
needed to convert the input from a sequence in bit times to a sequence in tap times.
Let input input bus be a vector and be a positive integer. The oversample op-
erator, oversample input bus computes in input bus which is the times
oversample of input:
in input bus div bus bus
The oversample operator is linear. In particular, oversample input bus is a ma-
trix with:
oversample input bus
if div bus div
and bus bus
otherwise
Thus,
oversample input bus oversample input bus input
Define input as the vector given by oversample input bus .
The form of this vector is depicted below:
input
input
input
...
input
Repeats bus more times
input
input
...
input
Repeats bus more times
...
input
(4.37)
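The oversample operator can be sketched in Python as follows. The sketch assumes the input vector stores one block of bus-width values per bit time, wire by wire, and that oversampling simply repeats each block once per tap, as in the depiction above. The names `oversample`, `w_bus`, and `k` are hypothetical.

```python
def oversample(values, w_bus, k):
    """Repeat each bit-time block of w_bus wire values k times (k taps per bit)."""
    if len(values) % w_bus != 0:
        raise ValueError("input length must be a multiple of the bus width")
    return [values[w_bus * (i // (k * w_bus)) + i % w_bus]
            for i in range(len(values) * k)]


# Two wires, two bit times, three taps per bit (illustrative values).
bits = [+1, -1,   # bit time 0: wire 0, wire 1
        -1, -1]   # bit time 1: wire 0, wire 1
taps = oversample(bits, w_bus=2, k=3)
```

Each output entry is a copy of exactly one input entry, which is why the operator can be written as a matrix of zeros and ones.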
Buses
In much the same manner as above, the impulse response of a bus with bus wires is
a column of bus bus matrices with each matrix giving the response corresponding
to a particular delay. Let bus denote the length of the bus impulse response in tap
times. The bus impulse response can be represented by a bus bus bus matrix, bus
where bus bus out in is the response of the out output wire of the bus after
a delay of tap times to the in input wire.
Let in be the vector for the input of the bus in tap time.
in input
Let in denote the value of the input at tap time t:
in in bus bus
Likewise, let bus denote the bus impulse response at time :
bus bus bus bus
Let output be the vector for the output of the bus and let output
be the output at tap time :
output bus in (4.38)
Equation 4.38 has the form of a block linear convolution. Thus,
output col bus in (4.39)
where bus input. Moreover, in this thesis, for simplicity, I assume that
wires are arranged around a cylinder (see chapter 3). This means that all wires have
the same characteristics, and bus is a circulant matrix. Let be the
vector whose bus bus components are the first column of bus .
That is,
bus block
bus
Thus the output signal of the bus given input in bit time is:
output col input (4.40)
where bus input.
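The block linear convolution behind equation 4.38 can be written out directly in Python. This is a minimal sketch with hypothetical names: `h_bus` is a list with one square matrix per tap of delay, indexed as `h_bus[tau][out_wire][in_wire]`, and `inputs` is a list with one vector of wire values per tap time.

```python
def bus_output(h_bus, inputs):
    """Block linear convolution: output(t) = sum over tau of h_bus[tau] * inputs[t - tau]."""
    w = len(inputs[0])
    n_out = len(h_bus) + len(inputs) - 1
    out = [[0.0] * w for _ in range(n_out)]
    for tau, h in enumerate(h_bus):
        for t, x in enumerate(inputs):
            for o in range(w):
                out[t + tau][o] += sum(h[o][i] * x[i] for i in range(w))
    return out


# Illustrative 2-wire bus with a 2-tap impulse response.
h_example = [
    [[1.0, 0.0], [0.0, 1.0]],      # delay 0: no coupling
    [[0.5, 0.25], [0.25, 0.5]],    # delay 1: some cross-talk
]
impulse_on_wire_0 = [[1.0, 0.0]]
response = bus_output(h_example, impulse_on_wire_0)
```

Driving a single wire with an impulse reads out one column of every matrix in the impulse response, including the cross-talk seen on the other wire.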
Filter
In figure 1.1, a filter is depicted for each wire of the bus. Because all wires have
the same characteristics, I assume that every filter is the same. For a bus-bit bus,
this filter system can also be viewed as a bus bus input/output network. Thus,
similar to the bus, the input/output relationship of the filter system with fir taps can
be expressed as:
filterOutput col input (4.41)
where is the filter coefficient vector of the filter for wire 0 of the bus. It
has the form depicted below:
fir
(4.42)
where denotes the contribution of the input on wire 0 at time 0 to the filter output
for wire at time .
Because the bus is symmetric, I restrict my attention to symmetric filters. That is, in
the vector depicted above,
for fir bus
Moreover, inputs on wires far away produce very little cross-talk. Therefore, it may
be practical to force the filter coefficients for these wires to zero to simplify the im-
plementation of the filter. In this thesis, filters with various sizes are investigated.
Filter size is defined as filter length filter width. A fir fir filter contains fir sets
of fir filter coefficients for inputs on wire itself and the fir nearest wires in both
directions. Its filter coefficient vector fir is depicted below:
fir
fir
fir
...
fir
fir
...
fir fir
Define filterExtend fir fir bus as the matrix depicted below:
(4.43)
where denotes the horizontal concatenation of a column vector
with a matrix . Operator filterExtend fir fir bus transforms fir to the full
filter coefficient vector in equation 4.42.
filterExtend fir fir bus filterExtend fir fir bus fir (4.44)
Denote fir as the vector given by filterExtend fir fir bus , which equals the full
filter coefficient vector depicted previously. Thus, with equation 4.41, the output
signal of the filter system with fir fir filters can be expressed as:
filterOutput col fir input (4.45)
where fir input.
Target signal
Let be the target signal which is a delayed version of the input signal.
I considered two ways to approximate the expected delay:
• LC delay: length · $\sqrt{LC}$.
• Approximate the delay by determining the peak of the Frobenius norm of the
bus impulse response.
The second one is more accurate because the effect of resistance is also taken into
account, especially for long buses where RC delay dominates LC delay. In this thesis,
all results are obtained with the second method.
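The second delay estimate can be sketched in a few lines of Python. The sketch assumes the impulse response is available as a list of square matrices, one per tap, and returns the tap index at which the Frobenius norm is largest; the function names and the sample data are hypothetical.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of the squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def estimate_delay(h_bus):
    """Tap index at which the Frobenius norm of the impulse response peaks."""
    norms = [frobenius_norm(h) for h in h_bus]
    return max(range(len(norms)), key=norms.__getitem__)


# Illustrative response: quiet at first, peaking at tap 2, then decaying.
h_example = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.1, 0.05], [0.05, 0.1]],
    [[0.6, 0.2], [0.2, 0.6]],
    [[0.2, 0.1], [0.1, 0.2]],
]
delay_in_taps = estimate_delay(h_example)
```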
Let be the following matrix:
if
otherwise
where is the approximated delay in tap time and
input
The target signal is given by
input (4.46)
Output signal
With the above analysis, it is straightforward to express the output signal of the system
with fir fir filters in figure 1.1 using matrix multiplication. Let the vector output
represent the output of the system in tap time:
output col h fir input
col h input fir
h input extend fir bus filterExtend fir fir bus fir
(4.47)
where input fir bus.
Let
h input extend fir bus filterExtend fir fir bus
(4.48)
Then
output fir (4.49)
Least Squares Problem
With equations 4.46 and 4.49, the least squares problem is:
firfir
Given and , I used QR decomposition (i.e. the backslash command in Matlab) to
find the vector fir that minimizes the least square error of the over-determined system
fir .
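A toy version of this least squares fit is sketched below in Python. It is deliberately simplified and is not the procedure used for the results in this thesis: it considers a single wire, uses an impulse as the training input, and solves the normal equations by Gaussian elimination instead of the QR decomposition behind the Matlab backslash command. The channel `h` and all function names are hypothetical.

```python
def convolution_matrix(h, n_taps):
    """Tall matrix A such that A times f is the linear convolution of h and f."""
    rows = len(h) + n_taps - 1
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n_taps)]
            for i in range(rows)]


def solve_least_squares(a, b):
    """Minimize the squared error of a*f - b via the normal equations."""
    n = len(a[0])
    # Augmented system [A^T A | A^T b].
    m = [[sum(row[i] * row[j] for row in a) for j in range(n)]
         + [sum(row[i] * bi for row, bi in zip(a, b))] for i in range(n)]
    for col in range(n):
        # Partial pivoting, then eliminate the entries below the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    f = [0.0] * n
    for i in reversed(range(n)):
        f[i] = (m[i][n] - sum(m[i][j] * f[j] for j in range(i + 1, n))) / m[i][i]
    return f


h = [1.0, 0.5]                      # hypothetical single-wire channel
a = convolution_matrix(h, 3)        # 3-tap equalizer
target = [1.0, 0.0, 0.0, 0.0]       # ideal output: an undistorted impulse
f = solve_least_squares(a, target)
```

The normal equations are easy to write down but square the condition number of the problem, which is one reason a QR-based solver is the better choice for large, poorly conditioned systems.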
4.3.3 An example
To show the effectiveness of the equalizing filter approach in cross-talk cancellation, consider
a 32-bit bus with length 5 cm. The electrical parameters of the bus are: R = 0.066
ohm/cm, C = 0.8 pF/cm, L = 3.99 nH/cm, = 0.31, = 0.23. Filter design parameters
are: fir , fir , taps per bit = 4, bit time = 400 ps, length of the training sequence,
input bits.
For this particular example, in equation 4.46 and 4.49,
Bus width bus = 32. Length of the bus impulse response, bus is set to be 16 taps.
Thus, h is a vector of length: bus bus .
input is a vector of length: bus input .
fir .
is a matrix of size .
Pseudo-random input sequences are used as test sequence. By comparing the eye
opening with and without the filter designed, we get an indication of the effectiveness of
the filter.
Figure 4.3 shows the predistorted input signal on wire 1. The waveforms in fig-
ure 4.4 clearly show that the filter greatly reduces the overshoot and undershoot,
which is also shown by the eye diagrams in figure 4.5. With the FIR filter designed,
the eye height is increased from 31% to 82%. This tells us that equalizing filters are a very
promising method in cross-talk cancellation for high speed buses. More thorough testing
results are presented in section 4.5.
4.4 Linear Programming Method with Worst-case Input
In this section, a linear programming method is introduced with the assumption that we can
solve the formulated linear programming problem.
Figure 4.3: Predistorted signal: equalizing filter output. [Plot of voltage (V) against t (ns), showing the input signal on wire 1 and the predistorted signal on wire 1.]
4.4.1 Motivation
Although the least squares optimization method works and the FIR filter described greatly
improves the eye height of signals transmitted, this method has several shortcomings.
• The filter designed by the LSQ optimization method depends strongly on the pseudo-random
input pattern used as the training sequence. To get a good filter design, a long
training sequence must be used, which makes the filter design very slow
for wide buses, which occur frequently in practice.
• The design objective is to transmit the bits without error. It is assumed that as long as
a bit satisfies the eye specification, it will be received correctly. Thus, moving
bits that already satisfy the eye specification closer to the target signal doesn't matter.
It's the worst-case pattern that determines the eye height. Thus, the $L_2$ metric doesn't
strictly correspond to eye height, the metric defined in equation 4.1. For example, it
is possible that for a training sequence, some filter coefficient set produces a very small
RMS error but the output signal has one bad trace. It is that one bad trace which determines
the eye height. Certainly, we can reformulate the same problem as a linear programming
problem, such that for a given training sequence (pseudo-random input),
the $L_\infty$ metric is minimized. However, in order to guarantee worst-case performance,
ideally, all possible input combinations should be part of the training sequence. This
is obviously not practical.
It turns out that for a given set of filter coefficients, the worst-case input pattern can
be determined and thus the worst-case eye height can be computed. This section is
devoted to this method, which minimizes the $L_\infty$ metric over all possible inputs.
Figure 4.4: Examples of output signals for a 32-bit interconnect network with (lower panel)
and without (upper panel) 8 × 2 equalizing filters designed with the LSQ method. [Both panels plot voltage (V) against t (ns) and show the input and output signals on wire 1.]
Figure 4.5: Eye diagrams for a 32-bit interconnect network with (lower panel) and without
(upper panel) 8 × 2 equalizing filters designed with the LSQ method. Red traces indicate
high signal transmitted. Blue traces indicate low signal transmitted. [Both panels plot voltage (V) against t (ps). Panel titles: system without filters, eye height 29%, eye width 75%; system with 8*2 filters, eye height 82%, eye width 75%.]
First, I show that the search space for the $L_\infty$ metric is convex even when more general,
non-linear filters are considered. Formalize an eye height specification as a sequence of
tuples:
A filter satisfies if and only if for every input every output satisfies:
output input
output input
(4.50)
where is the number of taps per bit, is the expected delay of the bus, output
h input and is the filter function. Let denote filter satisfies eye .
Let and be two filters that satisfy some eye opening constraint . That is,
and . Let , where . Because the bus, h , is linear,
a system with produces output signals that are the same linear combination of what is
produced by systems with and . It then follows from equation 4.50 that .
Thus the space of filters that satisfy eye opening constraint E is convex.
The objective is to send -1, 1 signals down the bus as clearly as possible. In this
system, every wire of the bus has the same configuration and thus is interchangeable. Thus,
the original objective is the same as trying to send down 1 on wire 1 as clearly as possible
with the worst-case disturbances from other wires and preceding and following bits.
The output signal on wire 1 for the current bit is simply the sum of the effects
on wire 1 at the current bit from:
• the input signal on wire 1 for the current bit, which is the signal expected to come
through if there is no disturbance;
• the input signal on wire 1 at other bit times and the input signals on other wires
for the current bit and other bits, which produce disturbances on the first wire at the
current bit.
Thus, the optimization problem can be restated as the following: Given that the
current bit input on the first wire is 1, find the best set of filter coefficients that makes the
output signal on wire 1 at the current bit as close to 1 as possible for the worst-case input
sequence which produces the largest disturbances on wire 1 from other bit times and other
wires.
subject to
undisturbed disturbances
undisturbed disturbances
(4.51)
4.4.2 Linear Programming Problem formulation
I now focus on the practical case where the filter is linear and FIR, and show that the design
problem is an instance of linear programming. The goal remains to send down 1 along
wire 1 as clearly as possible. A quantified version of this goal is: for the worst case input
sequence with 1 at the current bit, the output signal at some given sampling time is as close
to 1 as possible. That is, at this sampling point, the eye height is as high as possible. A
reasonable sampling point is , the delay of the bus. Equation 4.51 shows that to formulate
the LP problem for the equalizing filter design, we need to know the undisturbed output at
the sampling point and the largest total disturbances at the sampling point.
Let in be the input sequence that is 1 bit long and only the bit on the first wire is 1:
inif
otherwise
(4.52)
Because the whole system is linear and circulant, the response to this pulse input in gives
us all the information we need to compute the output for the worst-case disturbances. From
section 4.3, for the system with fir fir FIR filters, we know that the corresponding output
is given by G fir, where G is as given by equation 4.48:
G h in extend fir bus filterExtend fir fir bus
where fir bus is the length of the response in tap time. Different rows of G fir
represent the response on some wire at some tap time. The contribution of the bit from
equation 4.52 to the output at the sampling time is:
undisturbed row bus G fir (4.53)
Responses from other wires and responses from the first wire arising from earlier and later
bits are the disturbances. For example, the disturbance on wire 1 at the sampling time
caused by input on the second wire bit times earlier, is the same as the disturbance on wire
2 at the sampling time from the input on the first wire bit times earlier. Moreover, it is the
same as the response of the original pulse input from equation 4.52 observed on wire 2 at
the tap time that is bit times later than the sampling time. These are due to the linearity
and symmetry of the system. Thus,
disturbance row bus G fir (4.54)
where disturbance is the disturbance on the first wire at the sampling time given an
input of 1 on the wire bit times earlier. If the disturbances from other bit times and
other wires are all positive, we get the largest total disturbance and hence the worst-case
disturbances. Let d denote the worst-case, positive disturbance on wire 1 at the sam-
pling time from the input on wire bit times earlier. Noting that each input to the filter
is either +1 or -1, the following inequality constraints compute the absolute value function
needed to obtain d :
d rowbus
G fir
d row bus G fir(4.55)
Because the cost function is positive monotonic in each of the d , either the first con-
straint or the second constraint is tight at the optimal point.
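The effect of the two constraints in equation 4.55 can be illustrated with a short Python sketch. At the optimum each disturbance variable equals the absolute value of the corresponding response, so the worst-case level at the sampling point is the undisturbed output minus the sum of those absolute values. The function names and the numbers are hypothetical.

```python
def worst_case_level(undisturbed, disturbances):
    """Lowest level at the sampling point when +1 is sent.

    Every disturbing input is +1 or -1, so in the worst case each
    disturbance works against the wanted signal with magnitude |d|.
    """
    return undisturbed - sum(abs(d) for d in disturbances)


def highest_level(undisturbed, disturbances):
    """Highest level at the sampling point when +1 is sent."""
    return undisturbed + sum(abs(d) for d in disturbances)


# Illustrative responses at the sampling point (not measured values).
undisturbed = 0.9
disturbances = [0.05, -0.1, 0.02]
low = worst_case_level(undisturbed, disturbances)
high = highest_level(undisturbed, disturbances)
```

Because the sign of each disturbing input is free, the sign of each response is irrelevant; only its magnitude enters the worst case.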
Let be the matrix that contains all the rows in G that matter. The total number of
rows is . To calculate the total disturbances from other bit times and wires, ideally,
an infinitely long history should be considered because of the infinite impulse response
of the bus. This is not practical. Notice that most of the energy of the impulse response
extends over about 6 times the LC delay of the bus (see figure 4.6). For the particular bus
model presented in this thesis, the LC delay is about 250 ps.
Figure 4.6: Frobenius norm of the bus impulse response. [Plot of the Frobenius norm of the impulse response of the bus against t (ps), marking the peak, the future side, and the history side.]
All the results presented here are obtained with:
bus length (tap time)
Moreover, notice that the bus impulse response does not rise immediately to the peak. This
means that not only the history bits affect the output of the current bit but also a few future
bits. Among the rows, there are future bits, 1 current bit and the rest are history
bits. For bus
row bus row bus G (4.56)
Thus rowbus
fir gives the undisturbed output. Here is the equalizing filter
design problem as a linear programming problem in fir d :
row bus
row bus
fir
d
(4.57)
4.4.3 Smoothing filter
It was found that if we average a few taps of the current output bit and use that as the
objective function of the LP problem, the eye height obtained is better than simply asking
the optimizer to bring one tap of the current output bit as close to 1 as possible. However,
a corresponding smoothing filter is needed at the receiver end in order to get the desired
output signal. Fortunately, such averaging behavior is typical of the input circuits on real
chips [10]. The new system structure is shown in figure 4.7. Smoothing filters will be
further examined in section 4.7.
Assuming we are averaging over 3 taps (a more sophisticated strategy will be dis-
cussed in section 4.7), define a smoothing operator:
smooth
......
...
(4.58)
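A three-tap averaging operator of this kind can be sketched in Python as follows. The alignment of the averaging window is an assumption of this sketch: each output tap averages the current tap and the two preceding ones, and taps before the start of the signal are treated as zero. The function names are hypothetical.

```python
def smooth(signal, window=3):
    """Moving average over the current tap and the window - 1 preceding taps."""
    out = []
    for t in range(len(signal)):
        taps = [signal[t - k] if t - k >= 0 else 0.0 for k in range(window)]
        out.append(sum(taps) / window)
    return out


def smoothing_matrix(n, window=3):
    """The same operator as an n x n matrix, so it can multiply other matrices."""
    return [[1.0 / window if 0 <= i - j < window else 0.0 for j in range(n)]
            for i in range(n)]


signal = [3.0, 3.0, 3.0, 0.0, 0.0]
smoothed = smooth(signal)
as_matrix = smoothing_matrix(len(signal))
```

Writing the operator as a matrix keeps the whole system linear, so the smoothed response can still be expressed as a matrix applied to the filter coefficient vector.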
Figure 4.7: System with smoothing filter at the receiver end. [Block diagram: Transmitter, Equalizing filter, Bus, Smoothing filter, Receiver.]
With the smoothing operator, now
G smooth h in extend fir bus filterExtend fir fir bus
(4.59)
Moreover, the delay of the bus might not be the best sampling point. It was found
that the best sampling point depended on the filter size and the bit time. For example, at
300 ps bit time, equalizing filters designed with 1 tap extra delay in addition to the
bus delay give the highest eye height (81%) among all filters. It is also better than the
system without the smoothing filter (74%). In the rest of this thesis, all testing results are
obtained with the extra delay varied to give the best eye height.
4.4.4 An example
The following example shows the effectiveness of the equalizing filter approach (LP method)
in cross-talk cancellation. I use the same bus parameters and filter size as the example given
in section 4.3.3. A pseudo-random test sequence is used.
For this particular example, the LP problem formulated has the following properties:
• fir .
• Number of disturbance variables, d : 223. Thus, the total number of variables is 240.
• Number of constraints: 448.
From figures 4.5 and 4.9, note that the eye height for the filter designed by the LP
method ($L_\infty$ norm) is slightly higher than that for the LSQ filter ($L_2$ norm), 84% vs. 82%.
As expected, optimizing for eye height produces greater actual eye height than the average-case
optimization of the LSQ method. The eye width for LP is significantly smaller than
that for LSQ, 50% vs. 75%. This is expected because the LP filter is optimized for eye
height at a specific sampling point, whereas the LSQ objective function considers the entire
waveform. Section 4.5 presents further comparisons.
The speed of FIR filter design with the LP method largely depends on the size of the
LP problem formulated. Thus it depends on how many bits (number of disturbances) are
used to design the filter and on the size of the filter. The number of disturbances is determined
by the length of the bus impulse response in bit times. The smaller the bit time, the larger the
LP problem. For an 8 × 2 filter design at a 400 ps bit time, on a Linux box with an 800 MHz Pentium
III CPU and 256 MB memory, the design finishes within a few seconds. Based on this method, I
investigated other variations of linear FIR filters, such as time-variant linear FIR filters and
other types of smoothing filters.
4.5 Testing results: Comparison of LSQ method and LP method
4.5.1 Worst-case input sequence
In sections 4.3.3 and 4.4.4, pseudo-random input sequences were used to measure the eye
opening (eye height and eye width). By comparing the eye opening with and without the
filter designed, we get an indication of the effectiveness of the filter. A shortcoming of
using pseudo-random input sequences as the testing sequence is that the result varies considerably
from run to run if the input sequence is not long enough. But simulation with a very long input
sequence takes a long time.
Figure 4.8: Example of output signals for systems with (lower panel) and without (upper
panel) the equalizing filter designed with the LP method. [Both panels plot voltage (V) against t (ns) and show the input and output signals on wire 1. Panel titles: system without filters; system with 8*2 filters.]
Figure 4.9: Pseudo-random test: eye diagrams for systems with (lower panel) and without
(upper panel) the equalizing filter designed with the LP method. Red traces indicate high
signal transmitted. Blue traces indicate low signal transmitted. [Both panels plot voltage (V) against t (ps). Panel titles: system without filters, eye height 29%, eye width 75%; system with 8*2 filters, eye height 84%, eye width 50%.]
Inspired by the LP filter design procedure, I used the worst-case
input sequence for each filter instead of a pseudo-random input sequence as the testing
sequence. Since the total disturbance from other bit times and other wires is the largest for
the worst-case input, the eye opening is the smallest among all input sequences and hence
the most representative.
For a given set of filter coefficients, the worst-case input sequence (whose length depends
on the quantity defined in equation 4.48 and on the number of taps per bit) can be found
as follows:
1. For every wire and every bit, calculate the resulting disturbance on wire 1 at a
   given sampling time. If the filter coefficients are obtained with the LP method, the
   sampling time is the same as the one used in the LP filter design. For an arbitrary
   set of filter coefficients, the sample point is not defined in advance. Instead, I consider
   every tap time as a possible sample point and select the one with the best eye height
   as the sample point. Accordingly, the worst-case input sequence is determined by
   finding the worst-case input for each possible sampling time and concatenating these
   sequences together.
2. Set the input bit being received on wire 1 to 1. We are looking for the largest negative
   disturbance when 1 is sent. The negation of this input sequence is also a worst-case
   input sequence.
3. For every other wire and bit, choose the input value according to the sign of its
   disturbance, so that its contribution at the sampling time is negative.
In this section, all testing results are obtained with input sequences that are con-
catenations of the worst-case input sequence and pseudo-random input sequences, unless
otherwise indicated.
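
The search for the worst-case input described above can be sketched in Python. This is a
minimal sketch under stated assumptions, not the code used for the results in this thesis:
it assumes the system is linear, that inputs take the values +1 and -1, and that a table
of per-bit contributions at the sampling time is already available. The names
worst_case_inputs, sampled_value and disturbance are hypothetical, as are the numbers in
the example.

```python
def worst_case_inputs(disturbance, target=(0, 0)):
    """Pick +1/-1 inputs that minimise the sampled value when +1 is sent.

    disturbance[w][b] is the (hypothetical) contribution, at the sampling time
    on wire 1, of a +1 input on wire w at bit b. The entry at `target` is the
    wanted signal itself rather than a disturbance.
    """
    inputs = []
    for w, row in enumerate(disturbance):
        chosen = []
        for b, d in enumerate(row):
            if (w, b) == target:
                chosen.append(1)    # the bit being received is a 1
            elif d > 0:
                chosen.append(-1)   # flip the sign so the contribution is negative
            else:
                chosen.append(1)    # contribution is already negative (or zero)
        inputs.append(chosen)
    return inputs


def sampled_value(disturbance, inputs):
    """Sampled value on wire 1, assuming contributions add linearly."""
    return sum(d * x
               for d_row, x_row in zip(disturbance, inputs)
               for d, x in zip(d_row, x_row))


# Made-up contributions for 3 wires and 3 bit times (row 0 is wire 1).
example = [[1.0, -0.1, 0.05],
           [0.2, -0.05, 0.0],
           [0.02, 0.01, -0.01]]
worst = worst_case_inputs(example)
worst_value = sampled_value(example, worst)
```

Because each input contributes independently under the linearity assumption, choosing the
sign of every bit separately gives the same result as an exhaustive search over all input
sequences, at a cost that grows linearly with the number of bits.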
[Plots: "Eye diagram for system with 8*3 filters (eye height 80%, eye width 50%)" (upper panel)
and "Eye diagram for system with 8*3 filters (eye height 92%, eye width 50%)" (lower panel);
axes: t (ps) and voltage (V).]
Figure 4.10: Worst-case test (upper panel) vs. Pseudo-random test (lower panel): eye dia-
grams for systems with equalizing filters designed with the LP method. Red traces
indicate high signal transmitted. Blue traces indicate low signal transmitted.
[Plot: eye height (0% to 100%) versus filter width (0 to 20), with one curve each for
filters of 4, 8, 12 and 16 taps.]
Figure 4.11: Worst-case performance of different equalizing filters designed with the LP
method. Only equalizing filters with eye width greater than 25% are shown. Simulation
parameters are: /cm, pF/cm, nH/cm, = 0.31, =
0.23. Filter design parameters are: taps per bit = 4, bit time = 300 ps.
The upper panel of figure 4.10 shows an eye diagram obtained with such a test sequence.
Comparison with the eye diagram in the lower panel shows that the worst-case input
sequence occurs rarely in pseudo-random input, and that there is a significant difference
between the worst-case eye height (80%) and the pseudo-random eye height (92%). This
suggests that if a certain bit error rate can be tolerated, for example by using an
error-correcting code, the maximum bit rate could be improved further.
4.5.2 Indirect coupling
For simplicity in the distributed coupled RLC model, I only considered capacitive and in-
ductive coupling between adjacent lines. Thus, originally I thought that it should be enough
to equalize one line when only information on this line itself and its two nearest neighbours
Figure 4.12: Indirect coupling between non-adjacent lines
is used by the filter design. For a three-bit interconnect, this method considers all wires
and therefore does give the optimal result. However, for a bus with more than three lines,
considering more lines than just the adjacent ones (increasing the filter width) achieves
better cross-talk cancellation. Figure 4.11 shows the performance of filters of
different sizes at 300 ps bit time. Compared with a narrower filter, a wider filter has a much
greater eye opening. This indicates that although the direct coupling between non-adjacent
lines is weak and is ignored in the interconnect network model, the indirect coupling between
non-adjacent lines is strong and should not be ignored in the equalizing filter design.
Figure 4.11 also shows that, for the bus considered, this trend nears its asymptote when
the filter width is larger than 4.
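
The following toy calculation illustrates why indirect coupling appears even when the
model contains only nearest-neighbour coupling. It is a sketch, not the distributed
coupled RLC model used in this thesis: the lumped coupling matrix, the coupling value 0.3
and the function names are assumptions chosen for illustration. A signal that couples
from one wire to its neighbour, and from that neighbour to the next wire, shows up as a
non-zero entry two positions off the diagonal in the square of the coupling matrix.

```python
def nearest_neighbour_coupling(n_wires, c):
    """Tridiagonal matrix: each wire couples only to its adjacent wires."""
    return [[c if abs(i - j) == 1 else 0.0 for j in range(n_wires)]
            for i in range(n_wires)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


K = nearest_neighbour_coupling(5, 0.3)   # hypothetical coupling strength
K2 = matmul(K, K)                        # two coupling steps in a row

direct_1_to_3 = K[0][2]     # zero: no direct coupling between wires 1 and 3
indirect_1_to_3 = K2[0][2]  # non-zero: path wire 1 -> wire 2 -> wire 3
```

In this toy model the two-step term is the product of two nearest-neighbour couplings,
so it is small when the coupling is weak but not negligible when the coupling ratio is
large.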
The indirect coupling between non-adjacent lines is illustrated in figure 4.12. From
figure 4.12, a pattern of transfer function in the frequency domain was conjectured. That is,
If this pattern existed, a simple filter considering all lines could be designed. Unfortunately,
since we are considering far-end noise cancellation, and far-end noise is a function not only
of input voltage but also of distance, this speculated pattern does not occur in practice.
[Plot: "Eye diagram for system with 8*16 filters (eye height 99%, eye width 0%)";
axes: t (ps) and voltage (V).]
Figure 4.13: Eye diagram for system with equalizing filters designed by the LP
method. Red traces indicate high signal transmitted. Blue traces indicate low signal trans-
mitted.
4.5.3 Over-fitting
Compared with the LSQ method, the LP method is fast and guarantees worst-case
performance. However, it has an over-fitting problem. Figure 4.14 shows that the
magnitude of overshoot increases with the size of the equalizing filter designed. This suggests
that with more degrees of freedom, the optimizer tends to put more energy into the filter in
order to get a higher eye height at the sampling time, which results in much greater overshoot.
For some inputs, the output signal changes abruptly but is close to the target right at the
sampling time, resulting in a larger eye height but also a much smaller eye width. This effect
is clearly visible in figure 4.13, which shows an eye diagram obtained with an 8×16 filter
designed with the LP method. The eye height of this diagram is the largest among all filters
with length 8, but the eye has almost no width and there is an enormous amount of overshoot.
[Plots: overshoot (V, logarithmic scale from 1 to 1000000) versus number of filter
coefficients (0 to 300) in the upper panel, and versus filter width (0 to 20) in the lower
panel, with one curve each for filters of 4, 8, 12 and 16 taps.]
Figure 4.14: Magnitude of overshoot increases with the size of the equalizing filter designed
with the LP method. Upper panel shows this trend with the number of filter coefficients as
the horizontal axis. Lower panel shows this trend with the filter width as the horizontal axis,
for a given filter length. All simulations are done with the same parameters as in figure 4.11.
The lower panel of figure 4.14 shows that for a given filter length, the magnitude of overshoot
increases with the filter width. In contrast, for filter widths less than 6, the magnitude of the
overshoot does not follow this trend with the filter length. Thus, the filter width plays a more
critical role than the filter length in the over-fitting problem. This could be explained by the
fact that wires further away produce less disturbance on the wire than history bits on the wire
itself. Thus, with a longer filter, the optimizer can easily push the eye height up without
putting more energy into the filter. When the filter is long enough to cover the most
significant portion of the bus impulse response, this trend reaches its limit. As figure 4.11
shows, at 300 ps bit time, filters with a length of 8 taps already do a very good job
of cross-talk cancellation. Compared with 12-tap filters, 16-tap filters do not significantly
improve the eye height.
Moreover, for the bus considered, the improvement in eye height from increasing the filter
width also stops when the filter width is more than 4 (see figure 4.11). So,
very long and wide filters will not bring more benefit, yet the filter becomes more and more
complicated and expensive. In this sense, the over-fitting problem of the LP method may not
be a serious problem in practice.
Furthermore, instead of trying to bring 1 tap as close to 1 as possible, the LP method
can easily be formulated to bring 2 taps as close to 1 as possible. By doing this, sharp
transitions are avoided and hence the amount of overshoot is decreased. In practice, this
method reduces the severity of the over-fitting problem of the LP method when designing
large filters.
4.5.4 Minimum bit time
To simplify design and yet achieve reasonable cross-talk cancellation, an important question
is how many lines away should be considered when designing the equalizing filter. In other
Equalizing filters designed with both the LP method and the LSQ method effectively
improve the maximum bit rate of the bus.
Equalizing filters designed with the LP method have better performance than equal-
izing filters designed with the LSQ method for every configuration considered. As
discussed further below, the advantage for the LP method is most pronounced for
wide filters.
An filter is a good choice in terms of performance and cost. Although An
filter does improve the eye height at lower bit rate (see figure 4.11), it has similar
minimum bit time as the filter.
Note that width = 1 corresponds to separate pre-emphasis for each line. With width = 1, LSQ
and LP have similar performance. Because the focus of this work is on cross-talk cancellation,
high-frequency attenuation caused by the skin effect is not built into the bus model. Because
of this, the performance of the system without filters (width = 0) and of systems with
pre-emphasis filters (width = 1) is similar (740 ps vs. 679 ps). With cross-talk cancellation,
the performance of the bus is greatly improved (2.7 to 3.4 times higher bit rate
than independent pre-emphasis). For wider filters, LP is significantly better than LSQ. This
might not be a completely fair comparison, because all the LP results were obtained with
an additional smoothing filter at the receiver end, whereas the LSQ results were obtained
without any smoothing filter. The LSQ method can easily be applied to the system with
a smoothing filter at the receiver end. However, the LSQ method with smoothing is no
better than LSQ with an increased number of taps. For example, 8-tap filters designed by
the LSQ method with a 3-tap smoothing filter are no better than 12-tap filters designed by
the LSQ method without any smoothing filter. Table 4.1 shows that the performance of
the LP method is better than this upper bound of the performance of the LSQ method with
4.6 Time-variant Linear FIR Filter
A bit time consists of several tap times. In most examples presented previously, there are
4 taps per bit. Among them, the first tap and the last tap are bit-transition taps. The other
two are stable taps. The receiver samples stable taps and is insensitive to the input value at
bit-transition taps. The FIR filter designed above has no information about bit transitions. It
treats every tap the same, no matter whether it is a bit-transition tap or a stable tap. What if
we treat them differently and let the filter have knowledge of bit transitions? This leads
to the design of a time-variant linear FIR filter. The idea is to assign a set of filter
coefficients to each tap per bit. For example, for an 8×3 FIR filter and 4 taps per bit, the
time-variant linear FIR filter contains 4 sets of 8×3 filter coefficients. By doing this, the
optimizer can differentiate transition taps from stable taps, and assign different filter
coefficients to their corresponding filters. The way this time-variant filter works is
illustrated in figure 4.15. In the figure, different sets of filter coefficients are indicated
by different line styles.
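
The idea can be sketched in Python as follows. This is a minimal sketch, not the
implementation used in this thesis: the function names, the data layout, and the rule
that output tap n uses coefficient set n mod (taps per bit) are assumptions made for
illustration.

```python
def time_variant_fir(inputs, coeff_sets):
    """Apply a time-variant FIR filter across several wires.

    inputs[w][n]        : tap-rate sample n on wire w (wires inside the filter width)
    coeff_sets[p][w][t] : coefficient for tap phase p, wire w, delay t
    Output sample n uses the coefficient set with index n mod len(coeff_sets)
    (an assumed mapping between tap positions and coefficient sets).
    """
    taps_per_bit = len(coeff_sets)
    n_samples = len(inputs[0])
    output = []
    for n in range(n_samples):
        coeffs = coeff_sets[n % taps_per_bit]
        acc = 0.0
        for w, row in enumerate(coeffs):
            for t, c in enumerate(row):
                if n - t >= 0:              # samples before time 0 are taken as zero
                    acc += c * inputs[w][n - t]
        output.append(acc)
    return output


def plain_fir(inputs, coeffs):
    """Ordinary (time-invariant) FIR filter: a single coefficient set."""
    return time_variant_fir(inputs, [coeffs])


# Two wires, two delays per wire, 4 taps per bit; all values are made up.
x = [[1, 1, 1, 1, -1, -1, -1, -1],
     [1, -1, 1, -1, 1, -1, 1, -1]]
c = [[0.5, 0.25], [0.1, -0.05]]
same_sets = time_variant_fir(x, [c, c, c, c])
reference = plain_fir(x, c)
```

When all coefficient sets are equal, the time-variant filter reduces to the plain FIR
filter, so same_sets and reference agree sample by sample.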
This problem can be formulated as a linear programming problem in the same way as the
original simple linear FIR filter design. The only difference between the two linear
programming formulations is the way the filter output is calculated. Let
$\mathrm{fir}_1, \ldots, \mathrm{fir}_k$ denote the set of filter coefficient vectors, one for
each of the $k$ taps per bit. The time-variant linear FIR filter coefficient vector is:

$$F = \begin{bmatrix} \mathrm{fir}_1 \\ \mathrm{fir}_2 \\ \vdots \\ \mathrm{fir}_k \end{bmatrix} \qquad (4.60)$$
[Diagram labels: FIR 1, FIR 2, FIR 3, FIR 4.]
Figure 4.15: The convolution procedure of the time-variant FIR filter. Different sets of filter
coefficients are indicated by different line style.
For a given input sequence input in bit time, the filter output is:
filterOutput shuffle
in
in
in
in
F (4.61)
where in input with input fir bus , and shuffle
is the following matrix:
shuffle if div and div
otherwise
(4.62)
The correctness of the time-variant FIR filter designed is checked by adding a set
of equality constraints which specify that all four sets of filter coefficients are equal. After
adding the equality constraints, this design gives the same set of filter coefficients as the
simple FIR filter designed in section 4.4, and the same objective value. So the time-variant
FIR filter should be at least as good as the simple FIR filter designed in section
4.4. For example, with 4 taps per bit and the same design parameters as for figure 4.10,
an 8×3 time-variant FIR filter has a worst-case eye height of 86% (vs. 80% for the simple FIR
filter). The improvement in eye height shows that it does help to assign different filter
coefficients to different taps. However, the benefit might not be large enough to justify any
extra cost in an implementation.
4.7 Optimized Smoothing Filter
The system structure shown in figure 4.7 naturally leads to the topic of optimized smoothing
filter design. In previous sections, a smoothing filter that simply averages over 3 taps was
used. Test results show that this is a good choice. However, it is not optimal. For example,
the eye diagrams show that the weights assigned to those 3 taps should not be the
same. The first tap and the last tap are closer to the tap transition and should have smaller
weights. The middle tap contributes more to the eye height and should be assigned a larger
weight. Moreover, there is no reason to limit the window size of the smoothing filter to only
3 taps.
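
The difference between uniform and non-uniform smoothing weights can be illustrated
with a short Python sketch. The function name smooth, the centred window, the weights
[1, 2, 1] and the sample values are all assumptions chosen for illustration; they are not
the optimized weights.

```python
def smooth(samples, weights):
    """Weighted moving average over a centred window.

    The weights are normalised so that they sum to 1. Only positions where the
    whole window fits inside `samples` are returned.
    """
    total = sum(weights)
    w = [x / total for x in weights]
    half = len(w) // 2
    return [sum(w[k] * samples[n - half + k] for k in range(len(w)))
            for n in range(half, len(samples) - half)]


# Made-up received taps around a sampled 1: the outer taps are still in transition.
taps = [0.4, 1.0, 0.4]
uniform = smooth(taps, [1, 1, 1])    # plain 3-tap average
weighted = smooth(taps, [1, 2, 1])   # larger weight on the middle tap
```

With these values the weighted average lies closer to the level of the middle tap than
the uniform average does, which is the effect that motivates unequal weights.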
The system where the coefficients of both the equalizing filter and the smoothing
filter are taken as variables is not linear because output values of the bus depend on the
product of the coefficients of the two filters