initial cell serch paper

Upload: acidwarrior

Post on 08-Aug-2018


  • 8/22/2019 Initial Cell Serch Paper

    1/63

    1

    Comparison of Initial Cell Search Algorithms for W-CDMA Systems

    by

    Sanat Kamal Bahl

    Thesis submitted to the Faculty of the Graduate School

    of the University of Maryland in partial fulfillment

    of the requirements for the degree of Master of Science

    2002


    Title of Thesis: Comparison of Initial Cell Search Algorithms for W-CDMA Systems

    Sanat Kamal Bahl, Master of Science, 2002

    Thesis directed by: James F. Plusquellic

    Assistant Professor

    Dept. of Computer Science and Electrical Engineering

    ABSTRACT

    In this thesis, an Improved Cell Search Design (Improved CSD) using cyclic codes is compared with the 3GPP Cell Search Design using comma free codes (3GPP-comma free CSD) in terms of (1) hardware utilization on a field programmable gate array (FPGA) and (2) acquisition time for different probabilities of false alarm. Our results indicate that, for a channel whose signal-to-noise ratio is degraded by additive white Gaussian noise (AWGN), the Improved CSD achieves faster synchronization with the base station and has lower hardware utilization than the 3GPP-comma free CSD scheme under the same design constraints.


    Table of Contents

    1.0 Introduction
    2.0 Background
    3.0 Cell Search Algorithm
    3.1 Synchronization Channels in W-CDMA
    3.2 Cell Search Algorithm
    3.2.1 Stage 1: Slot Synchronization
    3.2.2 Stage 2: Frame Synchronization and Code Group Identification
    3.2.3 Stage 3: Scrambling Code Identification
    4.0 Improved Cell Search Design
    4.1 Stage 1: Slot Synchronization
    4.2 Stage 2: Frame Synchronization and Code Group Identification
    4.3 Stage 3: Scrambling Code Identification
    4.3.1 Scrambling Code Generator
    4.3.2 Descrambler
    5.0 3GPP-comma free Cell Search Design
    5.1 Stage 2 of 3GPP-comma free Cell Search Design
    5.2 Reduced Length FHT Design
    6.0 Experimental Method and Results
    6.1 Experimental Method
    6.1.1 FPGA Design Process
    6.2 Experimental Results
    7.0 Summary, Conclusions and Future Work
    7.1 Summary
    7.2 Conclusions
    7.3 Future Work
    8.0 References


    List of Abbreviations

    AMPS Advanced Mobile Phone Service

    ASIC Application Specific Integrated Circuit

    A/D Analog-to-Digital

    AWGN Additive White Gaussian Noise

    BS Base Station

    Cp Primary Synchronization Code

    Cssc Secondary Synchronization Code

    Cs Cyclic Hierarchical Sequence

    CLB Configurable Logic Block

    CPICH Common Pilot Channel

    D/A Digital-to-Analog

    DFT Discrete Fourier Transform

    DSP Digital Signal Processing

    DS-CDMA Direct Sequence-Code Division Multiple Access

    FHT Fast Hadamard Transformer

    FPGA Field Programmable Gate Array

    GIC Group Indicator Code

    GPS Global Positioning System

    GSM Global System for Mobile communication

    LC Logic Cell

    LFSR Linear Feedback Shift Register

    LUT Look-Up Table

    MS Mobile Station

    PSC Primary Synchronization Code

    P-SCH Primary Synchronization Channel

    SSC Secondary Synchronization Code

    SNR Signal-to-Noise Ratio


    SCH Synchronization Channel

    S-SCH Secondary Synchronization Channel

    3G Third Generation

    3GPP Third Generation Partnership Project

    TIA Telecommunications Industry Association

    W-CDMA Wideband-Code Division Multiple Access


    List of Figures

    1 DS-CDMA Transmitter-Receiver Block Level Diagram
    2 Synchronization Channels in Cell Search
    3 Hierarchical Matched Filter (64-chip and 4-symbol accumulation)
    4 Hierarchical Matched Filter (16-chip and 16-symbol accumulation)
    5 Slot Boundary Detection
    6 Frame Synchronization and Code Group Identification
    7 Scrambling Code Generator
    8 Multiple Scrambling Code Generator
    9 Scrambling Code Identification
    10 Individual Stage of FHT
    11 16 chip FHT
    12 Hadamard Code Metrics (Butterfly Operation)
    13 2-Slice Virtex-E CLB
    14 Detailed View of Virtex-E Slice
    15 Comparison of Improved CSD and 3GPP-comma free CSD, PFA = 10^-3
    16 Comparison of Improved CSD and 3GPP-comma free CSD, PFA = 10^-4


    List of Tables

    1 Hierarchical Matched Filter (16 and 64-chip Accumulation)
    2 Sequences X1,i and X2,i for Code Groups 1 to 32
    3 Masking Functions used in Stage 3: Scrambling Code Generator
    4 Allocations of SSCs for Secondary SCH
    5 Timing Diagram of Inputs to FHT
    6 Reduced Length Walsh Sequences (256 chip sequence to 16 chip sequence)
    7 Hardware Specifications of System: Quantization 4 Input Data Bits
    8 Hardware Specifications of FHT: 16 and 256 chip sequence


    Chapter 1

    Introduction

    1.0 Introduction

    First generation (1G) mobile communications systems were based on analog technology and were introduced in the early to mid 1980s. These 1G systems had a number of limitations, including (1) low quality voice service, (2) limited capacity and (3) inability to provide global roaming.

    Digital second generation (2G) systems were then developed in Europe and the US. They included (1) the Global System for Mobile communication (GSM), which uses time division multiple access (TDMA), in which each user is assigned a particular time slot; (2) the TDMA/136 specification, defined in the US in 1988 by the Telecommunications Industry Association (TIA) with the aim of digitizing the analog Advanced Mobile Phone Service (AMPS); and (3) IS-95, proposed in the US to provide better voice quality and higher capacity and based on CDMA technology. However, the different 2G technologies were not interoperable and were not available across geographic areas. In addition, the low bit rate of 2G systems could not meet subscriber demand for multimedia services. Third generation (3G) systems aim to solve these problems by promising global roaming across 3G standards, higher data rates, improved quality of service and support for multimedia applications. The most popular candidates for 3G cellular systems are CDMA2000 and Wideband-Code Division Multiple Access (W-CDMA) [1] [2]. Both schemes are based on Direct Sequence-Code Division Multiple Access (DS-CDMA) technology, in which the data signals are directly modulated by a digital code signal.

    In a spread spectrum CDMA system, the transmitted signal is spread over a frequency band that is wider than the minimum bandwidth required to transmit the information being sent. In a typical scenario with multiple users, or mobile stations (MSs), in a cell, each user has a unique scrambling code, chosen to have low cross-correlation with the codes of the other users. The signal received by the MS from the transmitting base station (BS) is correlated with the user's scrambling code. This despreads only the signal of that particular user, whereas the other spread spectrum signals remain spread. A block diagram of a DS-CDMA transmitter and receiver is shown in Figure 1. Spreading consists of multiplying the input data by a scrambling code sequence whose bit rate is much higher than the data bit rate. At the receiving side, the signal is multiplied by the same scrambling code sequence, exactly synchronized to the received code sequence. The Encoding block shown in Figure 1 adds error correcting bits and performs interleaving in order to protect the information bits from channel noise and interference. The reverse operations are performed in the Decoding stage at the receiver.
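    The spreading and despreading steps described above can be illustrated with a minimal sketch. It assumes noise-free baseband samples, perfect code synchronization, and two short, made-up 8-chip codes that happen to be orthogonal; the names `spread`, `despread`, `code_a` and `code_b` are hypothetical and are not taken from the thesis.

```python
def spread(symbols, code):
    """Multiply every data symbol (+1/-1) by each chip of the code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each code-length block with the code and keep the sign."""
    n = len(code)
    decisions = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        corr = sum(r * c for r, c in zip(block, code))
        decisions.append(1 if corr >= 0 else -1)
    return decisions


# Hypothetical 8-chip codes with zero cross-correlation (toy values only).
code_a = [1, 1, 1, 1, -1, -1, -1, -1]
code_b = [1, -1, 1, -1, 1, -1, 1, -1]

data_a = [1, -1, -1, 1]
data_b = [-1, -1, 1, 1]

# The two spread signals add on the channel.
channel = [x + y for x, y in zip(spread(data_a, code_a), spread(data_b, code_b))]

# Each receiver recovers only the data spread with its own code.
recovered_a = despread(channel, code_a)
recovered_b = despread(channel, code_b)
```

    Real scrambling codes are far longer and only approximately uncorrelated, so the residual cross-correlation shows up as interference rather than cancelling exactly as it does here.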


    The main difference between W-CDMA and CDMA2000 is that W-CDMA supports asynchronous BSs whereas CDMA2000 relies on synchronized BSs. Synchronous CDMA systems need an external time reference; for example, a Global Positioning System (GPS) clock can be used by all BSs to synchronize their operations. This allows the MS to use different phases of the same scrambling code to distinguish between adjacent BSs. In an asynchronous CDMA system, each BS has an independent time reference, and the MS does not have prior knowledge of the relative time difference between the various BSs. The advantage of asynchronous operation is that it eliminates the need to synchronize the BSs to an accurate external timing source. However, since there is no external time synchronization between adjacent BSs, different phases of the same code cannot be used to distinguish adjacent BSs. Thus, in an asynchronous CDMA system, adjacent BSs can only be identified by using distinct scrambling codes. Consequently, cell search, the process of achieving code, time and frequency synchronization of the MS with the BS, takes longer than in a synchronous CDMA system. Cell search is further complicated by the presence of signals intended for other MSs within the cell as well as signals from other BSs. Thus, for asynchronous CDMA systems it is very important to develop algorithms and hardware implementations that perform cell search with low acquisition time and minimum hardware resources.

    Figure 1: DS-CDMA Transmitter-Receiver Block Level Diagram (blocks shown: Data, Encoding, Scrambling Code Generator, Baseband and D/A at the transmitter; A/D, Baseband, Scrambling Code Synchronization, Scrambling Code Generator, Decoding and Data at the receiver)

    Cell search is performed according to the algorithm proposed by Wang et al. [3]. In that algorithm, code and time synchronization is achieved first, assuming a large frequency error, and frequency synchronization is performed afterwards. In this study we consider the problem of achieving code and time synchronization. This process is divided into three stages: (1) slot synchronization, (2) frame synchronization and code group identification, and (3) scrambling code identification. This thesis presents a 3G Partnership Project (3GPP) cell search design using cyclic codes (Improved CSD) to achieve faster synchronization at lower hardware complexity. The second part of this thesis compares two designs for performing initial cell search, the Improved CSD and the 3GPP cell search design using comma free codes (3GPP-comma free CSD), in terms of (1) acquisition time and (2) hardware specifications on a Xilinx Virtex-E XCV1000E field programmable gate array (FPGA).

    The thesis also proposes design improvements in stage 2 of the 3GPP-comma free CSD beyond those proposed by Li et al. [4]. The 3GPP-comma free CSD proposed in this thesis uses a Fast Hadamard Transformer (FHT) in stage 2 that achieves lower hardware complexity and faster decoding. Furthermore, masking functions are used in stage 3 of both the Improved CSD and the 3GPP-comma free CSD to reduce the number of scrambling code generators required as described in previous work [4]. This reduces the ROM size required to store the initial phases of the scrambling code generators in stage 3. The Improved CSD proposed in this thesis aims to achieve faster synchronization between the MS and the BS and thus to improve system performance. Experiments using accumulation over multiple slots in stage 1 indicate that, for an additive white Gaussian noise (AWGN) channel at a high signal-to-noise ratio, the Improved CSD achieves faster synchronization with the BS and has lower hardware utilization than the 3GPP-comma free CSD scheme under the same design constraints.

    The thesis is organized as follows. Work done by other research groups and suggestions by the 3GPP working group are presented in Chapter 2. Chapter 3 describes the synchronization channels in W-CDMA cell search and introduces the three step cell search algorithm used in W-CDMA for synchronization between the MS and the BS. Chapter 4 describes the Improved cell search design using cyclic codes, proposed as a means of achieving faster synchronization. Chapter 5 discusses the 3GPP cell search design using comma free codes. Chapter 6 presents the experimental method and the results of comparing the two cell search algorithms on a Xilinx Virtex-E XCV1000E FPGA. Chapter 7 gives a summary, discussion, and an overview of future directions of this research.


    Chapter 2

    Background

    Cell search design is critical because it affects system performance, so efficient receiver structures and algorithms are needed to reduce the cell search time. This Chapter summarizes efforts by research groups and the 3GPP working groups to design efficient schemes and algorithms for each of the three stages of the cell search algorithm.

    2.0 Background

    Wang et al. propose a pipelined process for the first three stages of the cell search algorithm [3]. The cell search scenarios considered in their study are (1) initial cell search, when a mobile is switched on, and (2) target cell search, during idle and active modes of the MS. Instead of a serial cell search that sequentially searches through code, time and frequency, their method first acquires code and time synchronization assuming a large frequency error and then performs frequency synchronization [3] [5].

    The synchronization code sequences used in stage 1 and stage 2 of the cell search algorithm are made up of bits called "chips", each of which is either +1 or -1. The synchronization code sequences are 256 chips long. A traditional matched filter would therefore require a very large adder circuit (a 256-input adder) to sum the correlation results, which wastes hardware resources. Hence, Siemens and Texas Instruments, in their working group draft, have suggested a hierarchical matched filter design that uses two matched filters to reduce the hardware complexity significantly [6]. The details of the hierarchical matched filter design are presented in Chapter 4.

    The 3GPP specification uses comma free codes in stage 2 of the cell search algorithm [7] [8]. Nortel Networks, in their working group proposal, have suggested the use of cyclic codes in the SCHs [9]. The use of cyclic codes for generating the synchronization codes is explained in more detail in Chapter 4. These cyclic codes can reduce hardware utilization and acquisition time if the receiver is properly designed.

    To reduce the complexity of searching through all 512 scrambling codes, the concept of code grouping and group indicator codes (GIC) was introduced [10]. The scrambling code is identified by first detecting the code group. Once the code group is detected, the scrambling code used by the cell can be identified easily because each code group contains only a limited number of codes. This reduces the cell search time significantly, and the idea was accepted in the 3GPP specifications. To further reduce cell search time, frame boundary synchronization is also achieved in stage 2 after identifying the code group and slot ID [11].

    Ericsson, in their working group draft, have proposed increasing the number of code groups in stage 2 of the cell search [12]. Increasing the number of code groups reduces the number of scrambling codes in each code group. Their proposed scheme uses either 256, 128 or 64 code groups in stage 2 of the cell search. They claim that the scheme using 256 code groups is preferred, as it requires only two scrambling code correlators in stage 3 of initial cell search and achieves reduced hardware complexity.

    In stage 2 of the 3GPP-comma free CSD presented in this thesis, an FHT design is proposed as a replacement for the Golay correlator presented by Li et al. [4]. An FHT provides an efficient technique to detect the code group and slot ID in stage 2. Previous FHT designs [13] [14] use a large amount of hardware resources; hence, a fast and efficient Hadamard transformer is needed to reduce the hardware utilization and to perform faster decoding. A compact and efficient FHT design will also draw less power in the handset.

    Siemens, in their working group draft, have suggested the use of masking functions in stage 3 to reduce the design complexity of generating the scrambling codes in parallel [15]. The use of masking functions reduces the number of scrambling code generators required to generate the codes in parallel. Any masking functions can be selected by the designer as long as they generate codes with minimum overlap. The use of masking functions reduces the hardware significantly compared with the previous design by Li et al. [4].

    Li et al. have designed an application specific integrated circuit (ASIC) for performing cell search in W-CDMA systems [4]. In stage 1 and stage 2 of their cell search design, the authors use a Golay correlator [16] to detect the code group and slot ID. In stage 3 of the cell search algorithm, 16 scrambling code generators are used to generate the codes in parallel.

    In summary, most of the literature in this area presents simulation results and does not investigate the hardware complexity of the proposed design schemes; the exception is the work presented by Li et al. [4]. The designs used by mobile manufacturers are proprietary, and very few documents describe their actual design schemes. It is critical to consider a practical hardware implementation of the cell search algorithm, especially because chip area and power utilization are the two most important factors in a mobile handset.


    Chapter 3

    Cell Search Algorithm

    3.0 Cell Search Algorithm

    This Chapter describes the synchronization channels in W-CDMA cell search and introduces the cell search algorithm used to synchronize the MS with the BS in W-CDMA systems.

    3.1 Synchronization Channels in W-CDMA

    In CDMA systems, spreading codes are used to differentiate physical channels from the same transmitter, and scrambling codes are used to differentiate transmitters. The MS needs to achieve code and time synchronization with the BS before any communication with the BS can start. The process of searching for a code and achieving synchronization with the BS is called cell search. Cell search is performed in two scenarios: when an MS is switched on (initial cell search) and during active or idle mode (target cell search). Target cell search is used to find handover candidates during a call. Cell search needs to be completed with minimum delay, as it affects system performance.

    Each cell in a CDMA system is identified by its downlink scrambling code, which is 38,400 chips long. The 38,400 chips form a radio frame, which is divided into 15 slots of 2,560 chips each [7].

    Figure 2 shows the slot and frame structure of the three synchronization channels used in cell search: the Primary-Synchronization Channel (P-SCH), the Secondary-Synchronization Channel (S-SCH) and the Common Pilot Channel (CPICH) [7] [17]. The P-SCH and the S-SCH are together called the Synchronization Channel (SCH). In the P-SCH, a 256 chip sequence is transmitted at the start of each slot. The same P-SCH sequence is used by all BSs, so a single matched filter is sufficient to detect the slot boundary. To reduce the complexity of the matched filter implementation, a hierarchical scheme is used, as explained in detail in Chapter 4. The S-SCH carries 15 different sequences, one in each slot, for the different code groups, and is repeated every frame. These sequences are used to identify the code group. The CPICH carries the downlink common pilot symbols scrambled by the scrambling code of the BS. Each slot of this channel is divided into 10 symbols, each 256 chips long.

    Figure 2: Synchronization Channels in Cell Search (one frame = 15 slots = 38,400 chips, 10 msec; one slot = 2,560 chips, 0.67 msec, carrying 10 CPICH symbols; one symbol = 256 chips, 0.067 msec; channels shown: P-SCH, S-SCH, CPICH)
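    The frame, slot and symbol sizes quoted above fit together as a short piece of arithmetic. The sketch below only restates the numbers given in the text and in Figure 2; the chip rate is derived from them rather than taken from a specification, and the variable names are illustrative.

```python
CHIPS_PER_FRAME = 38_400
SLOTS_PER_FRAME = 15
CHIPS_PER_SYMBOL = 256
FRAME_DURATION_S = 10e-3  # 10 msec per frame, as labeled in Figure 2

chips_per_slot = CHIPS_PER_FRAME // SLOTS_PER_FRAME       # chips in one slot
symbols_per_slot = chips_per_slot // CHIPS_PER_SYMBOL     # CPICH symbols in one slot
chip_rate = CHIPS_PER_FRAME / FRAME_DURATION_S            # chips per second

slot_duration_ms = chips_per_slot / chip_rate * 1e3
symbol_duration_ms = CHIPS_PER_SYMBOL / chip_rate * 1e3

print(chips_per_slot, symbols_per_slot)
print(round(chip_rate / 1e6, 2), "Mchip/s")
print(round(slot_duration_ms, 3), "ms per slot")
print(round(symbol_duration_ms, 4), "ms per symbol")
```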

    To reduce the complexity of synchronizing to the BSs in W-CDMA, the concept of code grouping and the use of group indicator codes (GIC) were introduced [10]. The 512 scrambling codes used in W-CDMA are divided into code groups. After the code group is identified, only the scrambling code used by the cell within that group needs to be detected. The number of candidate scrambling codes depends on how many code groups are used in stage 2 of the design. For example, if 32 code groups are used in stage 2, then there are 16 candidate scrambling codes in stage 3. Similarly, if 64 code groups are used, there are 8 candidate scrambling codes. Although the number of scrambling codes is fixed at 512, the number of code groups can be increased from 32 to 256 [12]. The complexity is further reduced by combining frame synchronization and code group identification in stage 2 of the cell search algorithm [11].
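    The trade-off between the number of code groups and the number of stage 3 candidates is a simple division of the 512 scrambling codes. The helper below is a hypothetical illustration of that arithmetic, assuming the codes are split evenly across groups.

```python
TOTAL_SCRAMBLING_CODES = 512


def codes_per_group(num_groups):
    """Number of stage 3 candidates when 512 codes are split evenly."""
    if TOTAL_SCRAMBLING_CODES % num_groups != 0:
        raise ValueError("512 codes cannot be split evenly into this many groups")
    return TOTAL_SCRAMBLING_CODES // num_groups


# Group counts mentioned in the text: 32, 64, 128 and 256.
candidates = {groups: codes_per_group(groups) for groups in (32, 64, 128, 256)}
for groups, count in candidates.items():
    print(f"{groups:3d} code groups -> {count:2d} candidate scrambling codes")
```

    With 256 code groups only two candidates remain, which is consistent with the two scrambling code correlators mentioned for that scheme in Chapter 2.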

    3.2 Cell Search Algorithm

    The process of achieving code and time synchronization in the cell search algorithm is divided into three stages: (1) slot synchronization, (2) frame synchronization and code group identification, and (3) scrambling code identification [3] [7] [8] [18].

    3.2.1 Stage 1: Slot Synchronization


    During stage 1 of the cell search procedure, the MS uses the SCH's Primary Synchronization Code (PSC) to acquire slot synchronization to a cell. This is typically done with a single matched filter matched to the PSC, which is common to all cells. The slot timing of the cell can be obtained by detecting peak values in the matched filter output. The starting position of the synchronization code may be determined from observations over one slot duration. However, decisions based on a single slot may be unreliable when the signal-to-noise ratio (SNR) is low or when fading is severe. Reliable slot synchronization is required to minimize cell search time. To increase reliability, observations are made over multiple slots and the results are combined, which makes it more likely that the correct slot boundary is identified.
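    The benefit of combining observations over several slots can be seen in a small sketch. The numbers are made-up matched filter magnitudes for a toy slot with only 6 timing positions (a real slot has 2,560), and position-by-position summation is just one possible way of combining the results; `accumulate` and `peak_position` are hypothetical helper names.

```python
def accumulate(profiles):
    """Add the per-slot matched filter magnitudes position by position."""
    return [sum(column) for column in zip(*profiles)]


def peak_position(profile):
    """Index of the largest value, taken as the slot boundary estimate."""
    return max(range(len(profile)), key=profile.__getitem__)


# Hypothetical magnitudes for three slots; the true boundary is position 2.
# Noise produces a larger value elsewhere in the first two slots.
slot_profiles = [
    [0.1, 0.9, 0.7, 0.2, 0.1, 0.3],
    [0.2, 0.1, 0.8, 0.1, 0.9, 0.2],
    [0.3, 0.2, 0.9, 0.1, 0.2, 0.1],
]

single_slot_estimates = [peak_position(p) for p in slot_profiles]
combined_estimate = peak_position(accumulate(slot_profiles))

print("per-slot estimates:", single_slot_estimates)
print("estimate after accumulation:", combined_estimate)
```

    The noise peaks fall at different positions in different slots, while the PSC peak recurs at the same position, so it dominates after accumulation.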

    3.2.2 Stage 2: Frame Synchronization and Code Group Identification

    During stage 2 of the cell search procedure, the MS uses the SCH's Secondary Synchronization Code (SSC) to achieve frame synchronization and identify the code group of the cell found in stage 1. This is done by correlating the received signal with all possible SSC sequences and identifying the maximum correlation value. Since the cyclic shifts of the sequences are unique, the code group as well as the frame synchronization is determined.
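    The role of unique cyclic shifts can be sketched with a toy codebook. Everything below is an assumption made for illustration: the three codewords are generated by a made-up rule and are not the 3GPP or Nortel sequences, and the decoder compares hard SSC decisions instead of accumulating soft correlation values as a receiver would.

```python
FRAME_SLOTS = 15


def make_codeword(step):
    """Toy codeword: the SSC index sent in each of the 15 slots (made-up rule)."""
    return [1 + (k * step) % FRAME_SLOTS for k in range(FRAME_SLOTS)]


# Hypothetical codebook with three code groups.
CODEBOOK = {group: make_codeword(step) for group, step in enumerate((1, 2, 4))}


def cyclic_shift(seq, s):
    """Sequence as seen by a receiver that starts observing at slot s."""
    return seq[s:] + seq[:s]


# Every (group, shift) pair must give a distinct sequence for decoding to work.
all_shifts = {
    tuple(cyclic_shift(cw, s)) for cw in CODEBOOK.values() for s in range(FRAME_SLOTS)
}
assert len(all_shifts) == len(CODEBOOK) * FRAME_SLOTS


def identify(observed):
    """Return the (group, starting slot) whose shifted codeword matches best."""
    best = None
    for group, codeword in CODEBOOK.items():
        for s in range(FRAME_SLOTS):
            score = sum(o == c for o, c in zip(observed, cyclic_shift(codeword, s)))
            if best is None or score > best[0]:
                best = (score, group, s)
    return best[1], best[2]


# The receiver starts 4 slots into the frame of group 2 and makes one wrong decision.
observed = cyclic_shift(CODEBOOK[2], 4)
observed[7] = 16
group, slot_offset = identify(observed)
print("code group:", group, "first observed slot:", slot_offset)
```

    Because no two shifted codewords coincide, the best match fixes both the code group and the position of the frame boundary, which lies `FRAME_SLOTS - slot_offset` slots after the first observed slot.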

    3.2.3 Stage 3: Scrambling Code Identification

    During stage 3 of the cell search procedure, the MS determines the exact primary scrambling code used by the cell. The primary scrambling code is typically identified through symbol-by-symbol correlation over the CPICH with all codes within the code group identified in stage 2. In this stage, a threshold value is used to decide whether the code has been identified. The threshold value can be predetermined from the probability of false alarm [19].
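    One textbook way to relate a threshold to a false alarm probability is sketched below. It is not necessarily the method of [19]: it assumes a real-valued correlation over `num_chips` chips of a +1/-1 code, with independent zero-mean Gaussian noise of standard deviation `noise_sigma` on every chip, so that the noise-only correlation is Gaussian with standard deviation `noise_sigma * sqrt(num_chips)`. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detection_threshold(pfa, num_chips, noise_sigma=1.0):
    """Threshold that a noise-only correlation exceeds with probability pfa."""
    sigma_corr = noise_sigma * sqrt(num_chips)
    return sigma_corr * NormalDist().inv_cdf(1.0 - pfa)


def code_identified(correlation, threshold):
    """Declare the scrambling code found when the correlation passes the threshold."""
    return correlation > threshold


# Thresholds for one 256-chip symbol at the two false alarm probabilities
# that appear in the thesis figures (10^-3 and 10^-4).
threshold_1e3 = detection_threshold(1e-3, num_chips=256)
threshold_1e4 = detection_threshold(1e-4, num_chips=256)
print(round(threshold_1e3, 2), round(threshold_1e4, 2))
```

    A smaller false alarm probability gives a higher threshold, so fewer wrong codes are accepted at the cost of a higher chance of missing the correct one.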

    This three stage cell search algorithm simplifies the synchronization of the MS with the BS. Each stage and its hardware implementation are explained in the following Chapters.


    Chapter 4

    Improved Cell Search Design

    4.0 Improved Cell Search Design

    This Chapter describes the Improved CSD using a set of cyclic codes. The cyclic codes were proposed by Nortel Networks for use on the Secondary SCH [9]. These cyclic codes allow very efficient detection and improve the cell search in terms of acquisition time and hardware utilization. The three stages of the cell search design and their hardware implementation are explained in Sections 4.1, 4.2 and 4.3.

    4.1 Stage 1: Slot Synchronization

    The MS first needs to acquire the PSC, which is common to all the BSs. This code is 256 chips long. The matched filter output is given by

    $$Y = \sum_{j=0}^{255} R_j C_{p,j} \qquad (1)$$

    where $R_j$ is the j-th sample of the received complex signal and $C_{p,j}$ is the j-th bit of the PSC.

    Hence, a traditional matched filter implementation would require 256 taps and a large adder circuit. This would increase both the delay and the power consumption at the receiver, which is not desirable. Thus, a hierarchical structure is proposed for the matched filter operations, which needs fewer taps, less circuitry and lower power consumption [6]. The PSC consists of an unmodulated hierarchical sequence of length 256 chips, transmitted once every slot. The PSC is the same for every BS in the system and is transmitted time-aligned with the slot boundary. The PSC is chosen to have good auto-correlation properties. This means that when the PSC sequence is correlated with itself, the interference from adjacent BSs is minimized and a high peak value is obtained.
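    Equation (1) can be written out directly as a 256-term sum. The sketch below is only an illustration: the `psc` list is a made-up +1/-1 pattern standing in for the real PSC, the received samples are noise-free, and `matched_filter_output` is a hypothetical name.

```python
def matched_filter_output(received, psc):
    """Equation (1): sum of R_j * C_p,j over one 256-sample window."""
    if len(received) != len(psc):
        raise ValueError("window and code must have the same length")
    return sum(r * c for r, c in zip(received, psc))


# Made-up 256-chip +1/-1 pattern used as a stand-in for the PSC.
psc = [1 if (j * j) % 7 < 3 else -1 for j in range(256)]

# Noise-free complex samples, aligned with the code.
aligned = [complex(c, 0) for c in psc]
y_aligned = matched_filter_output(aligned, psc)

# The same samples with an unknown 90 degree carrier phase rotation.
rotated = [1j * c for c in psc]
y_rotated = matched_filter_output(rotated, psc)

# A window that is off by one chip.
shifted = aligned[1:] + aligned[:1]
y_shifted = matched_filter_output(shifted, psc)

print(abs(y_aligned), abs(y_rotated), abs(y_shifted))
```

    Taking the magnitude of the complex output keeps the peak at 256 regardless of the phase rotation, while a misaligned window gives a smaller value. Each output needs 256 multiply-accumulate operations, which is the cost the hierarchical structure is meant to reduce.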

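    The hierarchical sequence used for the PSC is constructed from two constituent
    sequences X1 and X2 of length n1 and n2, respectively, using the following equation

    $$C_p(n) = X_1(n \bmod n_2) + X_2(n \;\mathrm{div}\; n_1) \pmod 2, \qquad n = 0, 1, \ldots, n_1 n_2 - 1 \qquad (2)$$

    where n1=n2=16.

    The constituent sequences X1 and X2 are both defined as:

    X1=X2=(1,1,-1,-1,-1,-1,1,-1,1,1,-1,1,1,1,-1,1) [9].

    The sequences are listed here in bipolar (+1/-1) form, in which the modulo 2 addition
    of binary chips corresponds to multiplication of the bipolar values.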
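As an illustration of equation (2), the following minimal sketch builds the 256-chip PSC from the constituent sequences quoted above. It works in bipolar form, so the modulo 2 addition becomes a product. The function names are hypothetical and chosen only for this example.

```python
# Sketch: build the 256-chip PSC from equation (2) in bipolar (+1/-1) form.
X1 = [1, 1, -1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, 1]
X2 = list(X1)  # the text defines X1 = X2


def hierarchical_sequence(x1, x2):
    """C(n) = x1[n mod n2] * x2[n div n1]; the product replaces modulo 2 addition."""
    n1, n2 = len(x1), len(x2)
    return [x1[n % n2] * x2[n // n1] for n in range(n1 * n2)]


def cyclic_autocorrelation(seq, lag):
    """Correlation of the sequence with a cyclically shifted copy of itself."""
    n = len(seq)
    return sum(seq[i] * seq[(i + lag) % n] for i in range(n))


psc = hierarchical_sequence(X1, X2)
peak = cyclic_autocorrelation(psc, 0)
largest_sidelobe = max(abs(cyclic_autocorrelation(psc, lag))
                       for lag in range(1, len(psc)))
print(len(psc), peak, largest_sidelobe)
```

The zero-lag value equals the sequence length (256), which is the peak the slot synchronization stage looks for; every other lag gives a strictly smaller magnitude.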

    The hierarchical matched filter can be designed in different ways, as shown in Table 1.

    Table 1: Hierarchical Matched Filter (16 and 64 chip Accumulation)

                     16 chip       16 symbol     64 chip       4 symbol
                     Accumulator   Accumulator   Accumulator   Accumulator
    Register Taps    16            16            64            4
    Adder Length     16            16            64            4

    The hierarchical matched filter consists of two concatenated matched filter blocks. The
    design using 64 taps is shown in Figure 3. This solution is not ideal for two reasons.
    First, the matched filter design requires 64 taps. Second, the design needs a 64-input
    adder, as shown in Figure 3. A better solution is to use the design shown in Figure 4.
    Hence, in stage 1 of both the Improved CSD and the 3GPP-comma free CSD, the
    hierarchical matched filter using 16 chip and 16 symbol accumulation is used.

    Figure 3: Hierarchical Matched Filter (64 chip and 4 symbol accumulation). Blocks shown: InData, ShiftRegister 1, PSCH Code, Adder Tree 1, ShiftRegister 2, Adder Tree 2, Result.

    In this design, the first matched filter receives the input signals serially from the BS.
    Correlation over X1 (16 chip accumulation) is performed before correlation over X2 (16
    symbol accumulation). However, the two matched filters can be interchanged and the
    selection is an implementation option. After 16 clock cycles, when shift register 1 is
    filled, the data stored in shift register 1 is matched in parallel with the code applied to
    the taps of the matched filter (tap coefficients). The tap coefficients are the PSC sequences,
    which are the same for all the BSs. Hence, the same matched filter structure can be used
    for all the BSs. The adder circuit is implemented as a tree structure with the 16 inputs
    applied in parallel. If the data in shift register 1 matches the tap coefficients,
    the result of the adder tree is the highest value possible (16 or greater). The second
    matched filter has a shift register 2 of 256 registers. Only 16 taps are needed, one
    on every sixteenth value of shift register 2. The result from the first adder tree is
    stored in shift register 2 of the second matched filter. After 256 clock cycles, shift
    register 2 is filled with the results from the first matched filter. The data in shift
    register 2 is then matched in parallel with the tap coefficients. The tap coefficients
    are the same as the PSC sequence. If the data matches the code sequence, the result of
    the second adder tree is 256 or greater in magnitude, corresponding to the peak value.
    An advantage of this scheme is that no multiplier circuit is needed, as the correlations
    can be performed using an adder/subtractor circuit.

    Figure 4: Hierarchical Matched Filter (16 chip and 16 symbol accumulation). Blocks shown: InData, ShiftRegister 1, PSCH Code, Adder Tree 1 (3 levels of adders), ShiftRegister 2, Adder Tree 2 (3 levels of adders), Result.
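The two-stage structure described above can be sketched in software as follows. This is only an illustrative model under simplifying assumptions (real-valued, noiseless chips, one sample per chip, hypothetical function names); it is not the FPGA design itself. It checks that 16-chip accumulation followed by 16-symbol accumulation gives the same output as the direct 256-tap correlation of equation (1).

```python
import random

X1 = [1, 1, -1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, 1]
X2 = list(X1)
PSC = [X1[n % 16] * X2[n // 16] for n in range(256)]  # equation (2), bipolar form


def direct_matched_filter(rx, code):
    """Equation (1) at every offset: one 256-tap correlation per output sample."""
    span = len(code)
    return [sum(rx[t + j] * code[j] for j in range(span))
            for t in range(len(rx) - span + 1)]


def hierarchical_matched_filter(rx, inner, outer):
    """Two concatenated 16-tap filters: chip accumulation, then symbol accumulation."""
    n_in, n_out = len(inner), len(outer)
    # First filter: 16-chip accumulation against the inner sequence (X1).
    partial = [sum(rx[t + a] * inner[a] for a in range(n_in))
               for t in range(len(rx) - n_in + 1)]
    # Second filter: 16 taps on every sixteenth partial sum, weighted by X2.
    span = n_in * n_out
    return [sum(partial[t + n_in * b] * outer[b] for b in range(n_out))
            for t in range(len(rx) - span + 1)]


rng = random.Random(0)
offset = 37  # hypothetical slot-boundary position inside the test buffer
rx = [rng.choice((-1, 1)) for _ in range(600)]
rx[offset:offset + 256] = PSC  # embed one clean copy of the PSC

direct = direct_matched_filter(rx, PSC)
hier = hierarchical_matched_filter(rx, X1, X2)
peak_at = max(range(len(hier)), key=lambda t: hier[t])
print(peak_at, hier[peak_at])
```

Each output sample costs 16+16 additions in the hierarchical version instead of 256 in the direct version, which is the saving discussed in the text.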

    Each memory cell in shift register 1 is 4 bits wide, assuming that the signal at the input
    to the digital receiver is sampled with a 4-bit analog-to-digital (A/D) converter. Shift
    register 2 is 8 bits wide to store the result from the first adder tree block. To perform
    the correlation, it is not necessary to perform 16*16 operations but only 16+16
    accumulation operations, which leads to a considerable reduction in hardware complexity.
    The hardware complexity of the hierarchical matched filter is calculated as follows. In
    one slot period (2,560 chips), the receiver has to perform at least 81,920 complex
    additions (2,560*(16+16)). The traditional matched filter implementation without the
    hierarchical structure would require 256 complex additions per chip position, i.e.
    655,360 (2,560*256) per slot. Thus, the hierarchical matched filter achieves a saving of
    a factor of 8 in terms of complex additions. From Figure 2, each slot has a duration of
    0.67 msec (670 µsec). The complexity of stage 1 in terms of real additions per second is
    245 Madds/sec (81,920*2/670 µsec). The incoming complex signal is divided into two
    components, the cosine part called the "in-phase" (I-phase) and the sine part called the
    "quadrature-phase" (Q-phase). The factor of 2 is for the two branches I and Q of the
    complex signal. Thus, stage 1 of the initial cell search needs 81,920 complex additions
    per slot and a computing power of 245 Madds/sec.
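The complexity figures quoted above can be reproduced with a few lines of arithmetic. The constant names are hypothetical; the values are the ones given in the text.

```python
CHIPS_PER_SLOT = 2560
SLOT_DURATION_S = 670e-6  # slot duration as quoted in the text (0.67 msec)

hier_adds_per_slot = CHIPS_PER_SLOT * (16 + 16)    # hierarchical matched filter
direct_adds_per_slot = CHIPS_PER_SLOT * 256        # plain 256-tap matched filter
saving_factor = direct_adds_per_slot // hier_adds_per_slot
real_adds_per_second = hier_adds_per_slot * 2 / SLOT_DURATION_S  # I and Q branches

print(hier_adds_per_slot)                  # 81920
print(direct_adds_per_slot)                # 655360
print(saving_factor)                       # 8
print(round(real_adds_per_second / 1e6))   # 245
```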

    There are two such hierarchical matched filters for the I and Q channels of the received

    complex signal as shown in Figure 5. The correlation results over I and Q channels are

    combined non-coherently over 1 slot duration and the result is stored in an accumulator

    which is implemented as a shift register. The output of the accumulator is given to a com-

    parator block to detect the peak value corresponding to the slot boundary of the closest BS

    and the MS needs to synchronize with this BS. As the code can be affected by AWGN and

    fading, accumulation over multiple slots is needed to correctly identify the slot boundary.

    It is important that the slot boundary is correctly identified in order to avoid the cost of

    increased acquisition time in case the wrong slot boundary is given to stage 2.
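The non-coherent combining and multi-slot accumulation described above can be sketched as follows. The slot length of 8 offsets and the correlation values are made-up toy data, chosen so that one slot on its own points to the wrong offset; the function names are hypothetical.

```python
def noncoherent_energy(corr_i, corr_q):
    """Energy per offset: squared I correlation plus squared Q correlation."""
    return [i * i + q * q for i, q in zip(corr_i, corr_q)]


def accumulate_slots(energies_per_slot):
    """Sum the per-offset energies of several slots, offset by offset."""
    return [sum(column) for column in zip(*energies_per_slot)]


def detect_slot_boundary(accumulated):
    """Comparator: index of the largest accumulated energy."""
    return max(range(len(accumulated)), key=lambda t: accumulated[t])


# Toy matched filter outputs for three slots (I branch, Q branch).
slots = [
    ([1, 0, 2, 6, 1, 0, 1, 2], [0, 1, 1, 5, 0, 2, 1, 0]),
    ([0, 7, 1, 4, 2, 0, 1, 1], [1, 6, 0, 3, 1, 1, 0, 2]),  # noise burst at offset 1
    ([2, 1, 0, 7, 0, 1, 2, 0], [1, 0, 1, 4, 1, 0, 1, 1]),
]
energies = [noncoherent_energy(i, q) for i, q in slots]
accumulated = accumulate_slots(energies)

print(detect_slot_boundary(energies[1]))   # single noisy slot
print(detect_slot_boundary(accumulated))   # after accumulation over three slots
```

In this toy data the second slot alone would select offset 1, while the accumulated energies select offset 3, which is why accumulation over multiple slots reduces the risk of handing a wrong slot boundary to stage 2.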

    Figure 5: Slot Boundary Detection. Blocks shown: InData (I-Phase and Q-Phase), ShiftRegister 1, PSCH Code, Adder Tree 1 (3 levels of adders), ShiftRegister 2, Adder Tree 2 (3 levels of adders), Non-Coherent Detection Block, Accumulator, Comparator. Outputs: Slot Boundary Value, Stage 1 Complete.

    4.2 Stage 2: Frame Synchronization and Code Group Identification

    The Secondary SCH consists of 15 sequences belonging to a family of cyclic codes
    (SSCs), each of length 256 chips. These SSCs are transmitted repeatedly in parallel with
    the Primary SCH. The procedure for constructing the cyclic codes is similar to that of the
    hierarchical sequence (equation 2) for the Primary SCH, except that it uses specific
    sequences of length 16 from Table 2 for each code group.

    The procedure for constructing the cyclic hierarchical sequence $C_s^{i,1}$ for slot 1 is
    exactly the same as that for constructing the hierarchical sequence $C_p$ for the Primary
    SCH. The sequence $C_s^{i,1}$ for slot 1 is referred to as the zero cyclic shift sequence,
    as no shift is applied to the constituent sequence $X_1^{i}$. For slots 2 to 15, the cyclic
    codes are constructed from the two constituent sequences $X_1^{i,k-1}$ and $X_2^{i,k-1}$, of
    length n1 and n2 respectively, using the following formula

    $$C_s^{i,k}(n) = X_2^{i,k-1}(n \bmod n_2) + X_1^{i,k-1}(n \;\mathrm{div}\; n_1) \pmod 2, \qquad n = 0, 1, \ldots, n_1 n_2 - 1 \qquad (3)$$

    where i is the code group number, k=2,3,..,15 is the slot number, n is the chip number
    in the slot, n1=n2=16, and the constituent sequences $X_1^{i,k-1}$ and $X_2^{i,k-1}$ in each
    code group i are chosen from the sequences in Table 2 [9].

    The constituent sequence $X_2^{i,k-1}$ (inner sequence) is exactly equal to the base
    sequence $X_2^{i}$ in every slot, i.e. $X_2^{i,k-1} = X_2^{i}$ for all k. The constituent
    sequence $X_1^{i,k-1}$ (outer sequence) is formed from the base sequence $X_1^{i}$ by a cyclic
    right shift of $X_1^{i}$ by k-1 positions (from 0 to 14) for each slot number k, from 1 to 15.
    The generation of the cyclic codes can be understood by considering the following example.

    For the first code group the sequences are given by

    $X_1^{1,0}$=(1,1,1,-1,-1,-1,1,-1,-1,1,1,-1,1,-1,1,1), k=1 for slot 1, no cyclic shift

    $X_1^{1,1}$=(1,1,1,1,-1,-1,-1,1,-1,-1,1,1,-1,1,-1,1), k=2 for slot 2, cyclic right shift by 1 position

    $X_1^{1,14}$=(1,-1,-1,-1,1,-1,-1,1,1,-1,1,-1,1,1,1,1), k=15 for slot 15, cyclic right shift by 14 positions.
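The cyclic right shift in this example is easy to reproduce. The sketch below uses the base sequence of code group 1 from Table 2; the function names are hypothetical.

```python
X1_GROUP1 = [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1]  # code group 1


def cyclic_right_shift(seq, positions):
    """Rotate the sequence to the right by the given number of positions."""
    positions %= len(seq)
    if positions == 0:
        return list(seq)
    return list(seq[-positions:]) + list(seq[:-positions])


def outer_sequence(base, slot_number):
    """Outer constituent sequence for slot k: base shifted right by k - 1 positions."""
    return cyclic_right_shift(base, slot_number - 1)


slot_sequences = {k: outer_sequence(X1_GROUP1, k) for k in range(1, 16)}
print(slot_sequences[2])
print(slot_sequences[15])
```

The sequences for slots 2 and 15 match the ones listed in the example, and the 15 shifted sequences of this code group are all different, which is what lets the shift identify the slot.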

    Table 2: Sequences X1i and X2i for Code Groups 1 to 32

    Code Group   Sequence
     1             1  1  1 -1 -1 -1  1 -1 -1  1  1 -1  1 -1  1  1
     2             1 -1  1  1 -1  1  1  1 -1 -1  1  1  1  1  1 -1
     3             1  1 -1  1 -1 -1 -1  1 -1  1 -1  1  1 -1 -1 -1
     4             1 -1 -1 -1 -1  1 -1 -1 -1 -1 -1 -1  1  1 -1  1
     5             1  1  1 -1  1  1 -1  1 -1  1  1 -1 -1  1 -1 -1
     6             1 -1  1  1  1 -1 -1 -1 -1 -1  1  1 -1 -1 -1  1
     7             1  1 -1  1  1  1  1 -1 -1  1 -1  1 -1  1  1  1
     8             1 -1 -1 -1  1 -1  1  1 -1 -1 -1 -1 -1 -1  1 -1
     9             1  1 -1  1 -1 -1 -1  1  1 -1  1 -1 -1  1  1  1
    10             1 -1 -1 -1 -1  1 -1 -1  1  1  1  1 -1 -1  1 -1
    11            -1  1 -1 -1 -1 -1 -1  1  1  1  1 -1  1 -1  1 -1
    12            -1 -1 -1  1 -1  1 -1 -1  1 -1  1  1  1  1  1  1
    13             1 -1 -1 -1  1 -1 -1  1 -1 -1 -1  1  1  1  1 -1
    14             1  1 -1  1  1  1 -1 -1 -1  1 -1 -1  1 -1  1  1
    15             1 -1 -1 -1 -1  1  1 -1  1  1  1 -1  1  1  1 -1
    16             1  1 -1  1 -1 -1  1  1  1 -1  1  1  1 -1  1  1
    17             1 -1  1  1 -1  1 -1  1  1  1 -1  1  1  1 -1  1
    18             1  1  1 -1 -1 -1 -1 -1  1 -1 -1 -1  1 -1 -1 -1
    19             1 -1 -1 -1  1 -1 -1  1  1  1  1 -1 -1 -1 -1  1
    20             1  1 -1  1  1  1 -1 -1  1 -1  1  1 -1  1 -1 -1
    21            -1 -1 -1  1  1 -1 -1  1  1 -1  1 -1  1  1 -1 -1
    22            -1  1 -1 -1  1  1 -1 -1  1  1  1  1  1 -1 -1  1
    23            -1 -1  1 -1  1 -1  1 -1 -1  1  1 -1 -1 -1 -1 -1
    24            -1  1  1  1  1  1  1  1 -1 -1  1  1 -1  1 -1  1
    25            -1  1  1  1 -1 -1  1  1  1 -1 -1 -1 -1 -1 -1  1
    26            -1 -1  1 -1 -1  1  1 -1  1  1 -1  1 -1  1 -1 -1
    27            -1  1  1  1  1  1 -1 -1  1 -1 -1 -1  1  1  1 -1
    28            -1 -1  1 -1  1 -1 -1  1  1  1 -1  1  1 -1  1  1
    29            -1  1 -1 -1  1  1  1  1  1 -1  1  1  1  1 -1  1
    30            -1 -1 -1  1  1 -1  1 -1  1  1  1 -1  1 -1 -1 -1
    31            -1  1  1  1 -1 -1  1  1 -1  1  1  1  1  1  1 -1
    32            -1 -1  1 -1 -1  1  1 -1 -1 -1  1 -1  1 -1  1  1

    The same procedure for forming the cyclic codes is used for the other code groups.
    A 16-chip base sequence admits 16 cyclic shifts, so the 32 code groups give 512 (32X16)
    different cyclic codes with a length of 256 chips each, of which the 15 slots of a frame
    use 15 per code group. This set of cyclic codes has good correlation properties that
    make its members good candidates for the SSCs. Many pairs of cyclic codes are fully
    orthogonal, as their cross correlation is zero; other pairs have small cross correlation.
    The cross correlation of each cyclic hierarchical sequence $C_s^{i,k}$ with the $C_p$ code
    of the Primary SCH is small. The cyclic codes are unique for each code group/slot
    location pair. Thus, it is possible to uniquely determine both the scrambling code group
    and the frame timing in the second stage of the initial cell search.

    By identifying the code group/slot location pair that gives the maximum correlation
    value, both the code group and the frame synchronization are determined. The output
    from the matched filter is passed to a non-coherent block, which computes the energy
    over the I and Q channels and passes the result to the comparator module, as shown in
    Figure 6. One slot search period (2,560 chips) is enough to uniquely identify the correct
    code group and the frame timing in the second stage of acquisition when the
    signal-to-noise ratio is high. This is one major difference from the 3GPP-comma free CSD,
    where at least three slots are necessary to uniquely identify the correct code group and
    frame timing. The Improved CSD also uses a smaller ROM (32X16) to store the cyclic
    codes, compared with the 3GPP-comma free CSD, which uses a ROM of size 32X60 to
    store the comma free codes.
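The search over code group/slot pairs can be sketched as below. This is a simplified, noiseless model and not the hardware design: only code groups 1 and 2 from Table 2 are included, and the inner sequence X2 is an assumption (it is not listed separately in this section, so the PSC constituent sequence is used purely as a stand-in). The function names are hypothetical.

```python
# Two base sequences copied from Table 2; the other 30 groups are omitted for brevity.
BASE_X1 = {
    1: [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1],
    2: [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1, 1, 1, 1, 1, -1],
}
# ASSUMPTION: stand-in inner sequence, used for illustration only.
INNER_X2 = [1, 1, -1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, 1]


def cyclic_code(group, slot):
    """Equation (3) in bipolar form for code group `group` and slot number `slot`."""
    shift = (slot - 1) % 16
    base = BASE_X1[group]
    outer = base[-shift:] + base[:-shift] if shift else list(base)
    return [INNER_X2[n % 16] * outer[n // 16] for n in range(256)]


def identify_group_and_slot(rx_i, rx_q):
    """Try every (group, slot) pair and keep the one with the largest energy."""
    best = None
    for group in BASE_X1:
        for slot in range(1, 16):
            code = cyclic_code(group, slot)
            corr_i = sum(r * c for r, c in zip(rx_i, code))
            corr_q = sum(r * c for r, c in zip(rx_q, code))
            energy = corr_i ** 2 + corr_q ** 2
            if best is None or energy > best[0]:
                best = (energy, group, slot)
    return best[1], best[2]


rx_i = cyclic_code(2, 7)   # one slot of received chips, I branch
rx_q = [0] * 256           # Q branch left empty in this toy example
detected = identify_group_and_slot(rx_i, rx_q)
print(detected)
```

A single 256-chip observation is enough here because every (group, slot) pair maps to a different code, in line with the one-slot claim made for high signal-to-noise ratio.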


    The input data samples for the Secondary SCH are stored in an input buffer with 256
    complex memory cells, called the Secondary Buffer, as shown in Figure 6. These input
    data samples are produced after waveform matched filtering and sampling at the chip rate.
    The result from the hierarchical matched filter is then passed to a non-coherent module,
    which calculates the energy over the I and Q channels and passes it to a comparator block.

    The ROM-stored code sequences given in Table 2 are each tried in succession before the
    data from the next slot comes in. The data in the shift register is latched until all these
    sequences have been correlated. This is achieved in stage 2 of the Improved CSD scheme
    using two clocks: a slow clock, called the system clock in the design, and a fast clock,
    which runs at 5X the system clock. The sampling is performed at the slow clock rate
    (system clock). Once the data is latched in the buffer, the fast clock (5X system clock)
    is used to perform the correlations.

    Figure 6: Frame synchronization and Code Group Identification. Blocks shown: Sampling Counter, Secondary Buffer (1 to 256), Shift Register 1, Shift Register 2, Code Register 1, Code Register 2, ROM 32 X 16 (Cyclic Codes), Matched Filter 1 and Matched Filter 2 (5X SysClock), Adder Tree 1 and Adder Tree 2 (3 levels each), Non-coherent Detection Block, Comparator. Inputs: I-Phase, Q-Phase, Slot Boundary Value, Enable Stage 1 Complete. Outputs: Code Group, Slot ID, Stage 2 Complete.

    The comparator block outputs the code group from Table 2 that correlates most strongly
    with the data sequence, together with the number of shifts that have been applied to the
    code group sequence. The number of shifts is the same as the slot ID. From the slot ID
    the frame boundary can easily be identified, because the number of slots in a frame is
    fixed at 15.

    4.3 Stage 3: Scrambling Code Identification

    After achieving code group and frame synchronization, the scrambling code is identified

    by correlating the symbols in the CPICH with all possible scrambling codes in the code

    group. The codes are generated using a scrambling code generator and the descrambling

    operation is carried out using a descrambler. The details of the scrambling code generator

    and the descrambler used in stage 3 of the cell search are explained in Sections 4.3.1 and

    4.3.2 respectively.

    4.3.1 Scrambling Code Generator

    Each cell is allocated one and only one primary scrambling code. The scrambling code
    sequences are constructed by combining two real sequences into a complex sequence [7].
    Each of the two real sequences is constructed as the position-wise modulo 2 sum of
    38,400 chip segments of two binary sequences generated by means of two generator
    polynomials of degree 18. Let x and y be the two sequences respectively. The resulting
    sequences constitute segments of a set of Gold sequences. The x sequence is constructed
    using the primitive polynomial $1+X^7+X^{18}$. The y sequence is constructed using the
    polynomial $1+X^5+X^7+X^{10}+X^{18}$. The sequence depending on the chosen scrambling
    code number n is denoted as $z_n$. Furthermore, let x(i), y(i) and $z_n(i)$ denote the ith
    symbol of the sequences x, y, and $z_n$, respectively. The sequences x and y are
    constructed as

    $$x(i+18) = x(i+7) + x(i) \pmod 2, \qquad i = 0, 1, \ldots, 2^{18}-20 \qquad (4)$$

    $$y(i+18) = y(i+10) + y(i+7) + y(i+5) + y(i) \pmod 2, \qquad i = 0, 1, \ldots, 2^{18}-20 \qquad (5)$$

    The nth Gold code sequence $z_n$, $n = 0, 1, \ldots, 2^{18}-2$, is then defined as

    $$z_n(i) = x((i+n) \bmod (2^{18}-1)) + y(i) \pmod 2, \qquad i = 0, 1, \ldots, 2^{18}-2 \qquad (6)$$

    Finally, the nth complex scrambling code sequence $s_n$ is defined as

    $$s_n(i) = z_n(i) + j\,z_n((i+131{,}072) \bmod (2^{18}-1)), \qquad i = 0, 1, \ldots, 38{,}399 \qquad (7)$$

    The pattern from phase 0 up to the phase of 38,399 is repeated for every radio frame.
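Equations (4) to (7) translate almost directly into code. The sketch below is illustrative only and rests on two assumptions that the text does not state: the initial register contents (x(0)=1 with x(1)..x(17)=0, and y(0)..y(17)=1, as given in 3GPP TS 25.213) and the mapping of binary 0/1 to +1/-1 for the complex output. The function names are hypothetical.

```python
PERIOD = 2 ** 18 - 1
Q_OFFSET = 131072  # offset of the quadrature branch in equation (7)


def generate_x(length):
    x = [1] + [0] * 17  # ASSUMPTION: initial state taken from 3GPP TS 25.213
    for i in range(length - 18):
        x.append((x[i + 7] + x[i]) % 2)  # equation (4)
    return x[:length]


def generate_y(length):
    y = [1] * 18  # ASSUMPTION: initial state taken from 3GPP TS 25.213
    for i in range(length - 18):
        y.append((y[i + 10] + y[i + 7] + y[i + 5] + y[i]) % 2)  # equation (5)
    return y[:length]


def scrambling_code(n, num_chips):
    """First `num_chips` chips of s_n, with binary 0/1 mapped to +1/-1."""
    needed = Q_OFFSET + num_chips + n
    if needed > PERIOD:
        raise ValueError("this sketch only covers indices that do not wrap around")
    x, y = generate_x(needed), generate_y(needed)

    def z(i):
        return (x[i + n] + y[i]) % 2  # equation (6); no wrap-around needed here

    def bipolar(bit):
        return 1 - 2 * bit

    return [complex(bipolar(z(i)), bipolar(z(i + Q_OFFSET)))  # equation (7)
            for i in range(num_chips)]


first_chips = scrambling_code(0, 4)
print(first_chips)
```

Generating the sequences as long lists keeps the sketch close to the equations; a hardware generator would instead keep only the two 18-bit registers, as in Figure 7.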


    The scrambling code generator used to generate the long codes is shown in Figure 7. A
    total of $2^{18}-1$ = 262,143 scrambling codes, numbered 0,1,..,262,142, can be generated
    using the code generator. However, not all the scrambling codes are used. The scrambling
    codes are divided into 512 sets, each consisting of a primary scrambling code and 15
    secondary scrambling codes. The primary scrambling codes consist of scrambling codes
    n=16*i where i=0,1,..,511. The ith set of secondary scrambling codes consists of
    scrambling codes 16*i+k, where k=1,2,..,15. There is a one-to-one mapping between each
    primary scrambling code and the 15 secondary scrambling codes in a set, such that the ith
    primary scrambling code corresponds to the ith set of secondary scrambling codes. The
    set of primary scrambling codes is further divided into 32 scrambling code groups, each
    consisting of 16 primary scrambling codes. The jth scrambling code group consists of
    primary scrambling codes 16*16*j+16*k, where j=0,1,..,31 and k=0,1,..,15.
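The numbering scheme can be checked with a short sketch. The function names are hypothetical; the formulas are the ones given above for 32 code groups of 16 primary codes each.

```python
NUM_GROUPS = 32
PRIMARY_PER_GROUP = 16


def primary_code_number(i):
    """Scrambling code number of the ith primary code, i = 0..511."""
    return 16 * i


def secondary_code_numbers(i):
    """The 15 secondary scrambling codes attached to the ith primary code."""
    return [16 * i + k for k in range(1, 16)]


def group_primary_codes(j):
    """Primary scrambling codes of the jth code group, j = 0..31."""
    return [16 * PRIMARY_PER_GROUP * j + 16 * k for k in range(PRIMARY_PER_GROUP)]


def group_of_primary(n):
    """Code group that a primary scrambling code number belongs to."""
    return n // (16 * PRIMARY_PER_GROUP)


all_primaries = [n for j in range(NUM_GROUPS) for n in group_primary_codes(j)]
print(len(all_primaries), group_primary_codes(1)[:3], group_of_primary(8176))
```

The 32 groups together cover exactly the 512 primary codes 0, 16, ..., 8176, so once stage 2 has found the code group, stage 3 only has to test 16 candidates.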

    Figure 7: Scrambling Code Generator. Two 18-stage shift registers (stages 0 to 17) with modulo 2 adders. Outputs: I Channel Code, Q Channel Code.

    In stage 3, 16 scrambling codes need to be generated in parallel. If the scrambling code

    generator shown in Figure 7 is used to generate the codes then 16 such code generators

    would be required. However, generating the codes in parallel using 16 code generators

    could be expensive as a huge ROM would be required to store the initial phases for all the

    16 code generators.

    Table 3: Masking Functions used in Stage 3: Scrambling Code Generator

              Masking Function For I Channel   Masking Function For Q Channel
              Code in LFSR 1                   Code in LFSR 1
    Code1     000000000000000001               001000000001010000
    Code2     000000000000000010               010000000010100000
    Code3     000000000000000100               100000000101000000
    Code4     000000000000001000               000000001000000001
    Code5     000000000000010000               000000010000000010
    Code6     000000000000100000               000000100000000100
    Code7     000000000001000000               000001000000001000
    Code8     000000000010000000               000010000000010000

    Figure 8: Multiple Scrambling Code Generator. Blocks shown: LFSR 1, LFSR 2, Masking Functions for I Channel and Q Channel, ROM 32 X 18 (Initial Phases for Code generator). Outputs: I Channel Code, Q Channel Code.

    In order to reduce the hardware utilization, stage 3 of both designs uses only one
    scrambling code generator to generate 16 codes in parallel when 32 code groups
    are used, as shown in Figure 8. Sixteen masking functions are used to generate the codes
    in parallel [15]. Masking functions can generate codes which have minimum overlap and
    reduce the hardware circuitry to a single scrambling code generator at the expense of a
    few logic gates. The masking functions used for generating the codes are given in Table 3.
    The masking functions for the I and Q channel codes in linear feedback shift register
    (LFSR) 2 were kept fixed at 000000000000000001 and 001111111101100000, respectively.
    Besides reducing the hardware from 16 code generators to one code generator, the design
    also reduces the ROM size from 512X18, which would be needed if 16 code generators
    were used, to 32X18.
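The effect of a masking function can be illustrated on the x register alone: XOR-ing a masked subset of register cells yields a shifted copy of the same sequence, so no second generator is needed. The sketch assumes that the rightmost mask character selects register position 0, that position k holds x(i+k), and that the register starts from the 3GPP TS 25.213 initial state; none of these conventions is spelled out in the text. Only LFSR 1 is modelled, and the function names are hypothetical.

```python
def x_sequence(length):
    x = [1] + [0] * 17  # ASSUMPTION: initial state taken from 3GPP TS 25.213
    for i in range(length - 18):
        x.append((x[i + 7] + x[i]) % 2)
    return x


def mask_taps(mask):
    """Register positions selected by an 18-character mask (leftmost = position 17)."""
    return [17 - idx for idx, bit in enumerate(mask) if bit == "1"]


def masked_output(x, mask, i):
    """XOR of the masked register cells when the register holds x(i)..x(i+17)."""
    return sum(x[i + tap] for tap in mask_taps(mask)) % 2


I_MASK_CODE3 = "000000000000000100"  # Table 3, Code3, I channel
Q_MASK_CODE1 = "001000000001010000"  # Table 3, Code1, Q channel

x = x_sequence(131072 + 200 + 18)
i_shift_ok = all(masked_output(x, I_MASK_CODE3, i) == x[i + 2] for i in range(200))
q_shift_ok = all(masked_output(x, Q_MASK_CODE1, i) == x[i + 131072] for i in range(200))
print(mask_taps(Q_MASK_CODE1), i_shift_ok, q_shift_ok)
```

Under these conventions the Code3 I channel mask reproduces x delayed by 2 chips, and the Code1 Q channel mask reproduces the 131,072-chip offset that equation (7) requires for the quadrature branch.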

    Table 3 (continued): Masking Functions used in Stage 3: Scrambling Code Generator

              Masking Function For I Channel   Masking Function For Q Channel
              Code in LFSR 1                   Code in LFSR 1
    Code9     000000000100000000               000100000000100000
    Code10    000000001000000000               001000000001000000
    Code11    000000010000000000               010000000010000000
    Code12    000000100000000000               100000000100000000
    Code13    000001000000000000               000000001010000001
    Code14    000010000000000000               000000010100000010
    Code15    000100000000000000               000000101000000100
    Code16    001000000000000000               000001010000001000

    4.3.2 Descrambler

    Descrambling is carried out using data over the CPICH and the codes generated by the
    scrambling code generator and masking functions. Counters are used, as shown in Figure
    9, to keep track of the votes obtained after the descrambling and the comparison
    operations. After these operations are completed, the final step is to decide whether
    cell search has been successful and a code has been found. For this purpose a parameter
    called probability of false alarm rate (PFA) is used to predefine the threshold value
    (VTH) [19]. The relation can be expressed by the following equation

    $$P_{FA} = e^{-V_{TH}/V} \qquad (8)$$

    where V is twice the variance of the I and Q components.

    If the counter exceeds VTH then the cell search operation is declared a success and the
    particular long code is identified.
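Equation (8) can be inverted to obtain the threshold for a target false alarm probability, V_TH = -V ln(P_FA). The sketch below shows this; the value of V is a made-up example and the function names are hypothetical.

```python
import math


def threshold_from_pfa(p_fa, v):
    """Invert equation (8): V_TH = -V * ln(P_FA)."""
    if not 0.0 < p_fa < 1.0:
        raise ValueError("P_FA must lie strictly between 0 and 1")
    return -v * math.log(p_fa)


def pfa_from_threshold(v_th, v):
    """Equation (8): P_FA = exp(-V_TH / V)."""
    return math.exp(-v_th / v)


V = 2.0  # hypothetical value: twice the variance of the I and Q components
for p_fa in (1e-2, 1e-3, 1e-4):
    print(p_fa, round(threshold_from_pfa(p_fa, V), 3))
```

A smaller target P_FA gives a larger threshold, so fewer false detections are accepted at the cost of needing a stronger or longer-accumulated correlation before success is declared.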

    Figure 9: Scrambling Code Identification. Blocks shown: Multiple Scrambling Code Generator (ROM 32X18 with Initial Phases for Code generator, Masking Functions for I Channel and Q Channel), Descrambler 1 to Descrambler 16 (inputs: Data, I Channel Code, Q Channel Code), counter 1 to counter 16, First Comparator Block, Second Comparator Block, Threshold Value. Outputs: Increment Counter, Long Code, Code Found.

    Chapter 5

    3GPP-comma free Cell Search Design

    5.0 3GPP-comma free Cell Search Design

    This Chapter discusses stage 2 of the 3GPP cell search design using comma free codes.

    Stage 1 and stage 3 for the 3GPP-comma free CSD design were kept the same as the

    Improved CSD to compare stage 2 of both the designs. A Fast Hadamard Transformer

    (FHT) is proposed to be used in stage 2 of the cell search algorithm. To reduce the hard-

    ware utilization of the FHT design, reduced length Walsh sequences are proposed as

    explained in Section 5.1.

    5.1 Stage 2 of 3GPP-comma free Cell Search Design

    In CDMA systems, the BS identifies each user in a cell by a unique scrambling code. In

    order to minimize the interference in a cell when two users transmit at the same time,

    orthogonal (Walsh) codes are used. The Walsh codes are generated using a Walsh-Hadamard
    function. When these Walsh codes are transmitted by the BS, they are affected by
    interference, fading and noise, which may be AWGN. At the receiver, decoding logic is
    required to determine which of the Walsh codes is most likely to have been
    sent. An FHT can be used to provide such decoding circuitry.

    The table provided in the 3GPP Specifications for the comma free codes is for 64 code
    groups. For comparison with the Improved CSD scheme, which uses 32 code groups, only
    32 of the possible 64 code groups are used. The 32 Secondary SCH sequences are
    constructed such that their cyclic shifts are unique, i.e., a non-zero cyclic shift less
    than 15 of any of the 32 sequences is not equivalent to some cyclic shift of any other of
    the 32 sequences. Also, a non-zero cyclic shift less than 15 of any of the sequences is
    not equivalent to itself with any other cyclic shift less than 15. Table 4 lists the
    sequences of SSCs used to encode the 32 different scrambling code groups [7].
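The uniqueness property can be made concrete with a small lookup. The sketch below uses only the first four rows of Table 4 (code groups 0 to 3) and assumes an error-free observation of all 15 slots; the function names are hypothetical.

```python
# First four rows of Table 4 (scrambling code groups 0 to 3).
SSC_ALLOCATION = {
    0: [1, 1, 2, 8, 9, 10, 15, 8, 10, 16, 2, 7, 15, 7, 16],
    1: [1, 1, 5, 16, 7, 3, 14, 16, 3, 10, 5, 12, 14, 12, 10],
    2: [1, 2, 1, 15, 5, 5, 12, 16, 6, 11, 2, 16, 11, 15, 12],
    3: [1, 2, 3, 1, 8, 6, 5, 2, 5, 8, 4, 4, 6, 3, 7],
}


def cyclic_left_shift(seq, s):
    return tuple(seq[s:] + seq[:s])


def build_lookup(table):
    """Map every cyclic shift of every row to (code group, starting slot)."""
    lookup = {}
    for group, seq in table.items():
        for shift in range(len(seq)):
            key = cyclic_left_shift(seq, shift)
            if key in lookup:
                raise ValueError("cyclic shifts are not unique")
            lookup[key] = (group, shift)
    return lookup


LOOKUP = build_lookup(SSC_ALLOCATION)


def decode(observed):
    """Return (code group, slot index at which the observation started)."""
    return LOOKUP[tuple(observed)]


# The receiver starts listening at slot 11 of a frame sent with code group 2.
observed = SSC_ALLOCATION[2][11:] + SSC_ALLOCATION[2][:11]
print(decode(observed))
```

Because no shifted row collides with any other, the observed SSC order identifies both the code group and where in the frame the observation began.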

    Table 4: Allocation of SSCs for Secondary SCH

    Scrambling
    Code Group    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
    Group 0       1  1  2  8  9 10 15  8 10 16  2  7 15  7 16
    Group 1       1  1  5 16  7  3 14 16  3 10  5 12 14 12 10
    Group 2       1  2  1 15  5  5 12 16  6 11  2 16 11 15 12
    Group 3       1  2  3  1  8  6  5  2  5  8  4  4  6  3  7
    Group 4       1  2 16  6  6 11 15  5 12  1 15 12 16 11  2
    Group 5       1  3  4  7  4  1  5  5  3  6  2  8  7  6  8
    Group 6       1  4 11  3  4 10  9  2 11  2 10 12 12  9  3
    Group 7       1  5  6  6 14  9 10  2 13  9  2  5 14  1 13
    Group 8       1  6 10 10  4 11  7 13 16 11 13  6  4  1 16
    Group 9       1  6 13  2 14  2  6  5  5 13 10  9  1 14 10
    Group 10      1  7  8  5  7  2  4  3  8  3  2  6  6  4  5
    Group 11      1  7 10  9 16  7  9 15  1  8 16  8 15  2  2
    Group 12      1  8 12  9  9  4 13 16  5  1 13  5 12  4  8
    Group 13      1  8 14 10 14  1 15 15  8  5 11  4 10  5  4
    Group 14      1  9  2 15 15 16 10  7  8  1 10  8  2 16  9
    Group 15      1  9 15  6 16  2 13 14 10 11  7  4  5 12  3
    Group 16      1 10  9 11 15  7  6  4 16  5  2 12 13  3 14
    Group 17      1 11 14  4 13  2  9 10 12 16  8  5  3 15  6
    Group 18      1 12 12 13 14  7  2  8 14  2  1 13 11  8 11
    Group 19      1 12 15  5  4 14  3 16  7  8  6  2 10 11 13
    Group 20      1 15  4  3  7  6 10 13 12  5 14 16  8  2 11
    Group 21      1 16  3 12 11  9 13  5  8  2 14  7  4 10 15
    Group 22      2  2  5 10 16 11  3 10 11  8  5 13  3 13  8
    Group 23      2  2 12  3 15  5  8  3  5 14 12  9  8  9 14
    Group 24      2  3  6 16 12 16  3 13 13  6  7  9  2 12  7
    Group 25      2  3  8  2  9 15 14  3 14  9  5  5 15  8 12
    Group 26      2  4  7  9  5  4  9 11  2 14  5 14 11 16 16
    Group 27      2  4 13 12 12  7 15 10  5  2 15  5 13  7  4

    Table 4 (continued): Allocation of SSCs for Secondary SCH

    Scrambling
    Code Group    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
    Group 28      2  5  9  9  3 12  8 14 15 12 14  5  3  2 15
    Group 29      2  5 11  7  2 11  9  4 16  7 16  9 14 14  4
    Group 30      2  6  2 13  3  3 12  9  7 16  6  9 16 13 12
    Group 31      2  6  9  7  7 16 13  3 12  2 13 12  9 16  6

    The 16 SSCs, ($C_{ssc,1}$,..,$C_{ssc,16}$), are complex-valued with identical real and
    imaginary components, and are constructed from position-wise multiplication of a
    Hadamard sequence and a sequence z, defined as z=(b,b,b,-b,b,b,-b,-b,b,-b,b,-b,-b,-b,-b,-b),
    where b=(1,1,1,1,1,1,-1,-1,1,-1,1,-1,1,-1,-1,1). The Hadamard sequence is obtained from
    one of the rows of a Hadamard matrix, which consists of +1 and -1 entries. The rows and
    columns of the Hadamard matrix are mutually orthogonal. The following examples show
    how to construct a Hadamard matrix

    $$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad H_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

    In general the Hadamard matrix can be defined recursively as

    $$H_{2N} = \begin{bmatrix} H_N & H_N \\ H_N & -H_N \end{bmatrix} \qquad (9)$$

    where $H_N$ is a matrix of size N X N.

    If a vector X with length N is an input then the vector Y obtained as a result of the
    Hadamard transform is equal to

    $$Y = H_N X \qquad (10)$$


    The entries in Table 4 denote which SSC to use in the different slots for the different
    scrambling code groups, e.g. the entry "5" means that SSC $C_{ssc,5}$ shall be used for the
    corresponding scrambling code group and slot. The kth SSC, $C_{ssc,k}$, k=1,2,..,16, can be
    calculated using the following expression:

    $$C_{ssc,k} = (1+j)\,\big(H_m(0)z(0),\; H_m(1)z(1),\; H_m(2)z(2),\; \ldots,\; H_m(255)z(255)\big) \qquad (11)$$

    where m=16(k-1) and $H_m(n)$ is the nth element of row m of the Hadamard matrix.
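A minimal sketch of equation (11) follows. It builds the 256 X 256 Hadamard matrix with the recursive construction and uses b and z exactly as printed in this section; the function names are hypothetical and the code is meant only to illustrate the construction.

```python
def hadamard(order):
    """Recursive (Sylvester) construction; `order` must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


B = [1, 1, 1, 1, 1, 1, -1, -1, 1, -1, 1, -1, 1, -1, -1, 1]  # b as printed in the text
Z_SIGNS = [1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, -1, -1, -1, -1, -1]
Z = [sign * chip for sign in Z_SIGNS for chip in B]           # 256 chips

H256 = hadamard(256)


def ssc(k):
    """k-th secondary synchronization code, k = 1..16, from equation (11)."""
    m = 16 * (k - 1)
    return [complex(1, 1) * (H256[m][n] * Z[n]) for n in range(256)]


def real_correlation(a, b):
    """Correlation of the real parts of two codes."""
    return sum(int(p.real) * int(q.real) for p, q in zip(a, b))


codes = {k: ssc(k) for k in range(1, 17)}
print(real_correlation(codes[3], codes[3]), real_correlation(codes[3], codes[9]))
```

Since z only flips signs position by position, the 16 codes inherit the orthogonality of the Hadamard rows they are built from.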

    As each element of the Hadamard matrix is either +1 or -1, the multiplication operation
    used in equation 11 can be reduced to a series of addition/subtraction operations. In
    general, for an N-point input sample, the FHT algorithm needs to perform $N \log_2 N$
    addition and subtraction operations.
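The operation count can be illustrated with a generic in-place butterfly version of the transform. This is a textbook software formulation, not the serial hardware structure of the thesis; the function names are hypothetical.

```python
def fast_hadamard_transform(samples):
    """Butterfly FHT; returns (transform, number of additions/subtractions)."""
    y = list(samples)
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    operations = 0
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                upper, lower = y[i], y[i + half]
                y[i] = upper + lower        # adder output
                y[i + half] = upper - lower  # subtractor output
                operations += 2
        half *= 2
    return y, operations


def direct_hadamard_transform(samples):
    """Reference Y = H_N X, with H[r][c] = (-1)^popcount(r & c)."""
    n = len(samples)
    return [sum(x if bin(r & c).count("1") % 2 == 0 else -x
                for c, x in enumerate(samples))
            for r in range(n)]


# A noiseless 16-chip Walsh code (row 5 of H_16) as the received word.
walsh_row_5 = [1 if bin(5 & c).count("1") % 2 == 0 else -1 for c in range(16)]
metrics, ops = fast_hadamard_transform(walsh_row_5)
best_row = max(range(16), key=lambda r: metrics[r])
print(ops, best_row, metrics[best_row])
```

For N=16 the transform uses 16*4 = 64 additions and subtractions, compared with 16*16 for the direct matrix product, and the largest output metric points at the transmitted Walsh row.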

    Figure 10 shows an individual stage of the FHT. Each stage has an upper and a lower

    input terminal. The upper input terminal is configured to receive multiple input signals

    which are either Walsh chips (if the stage is the first stage of the FHT) or intermediate cor-

    relation coefficients (if the stage is not the first stage of the FHT). If an input of N-Walsh

    chips is to be processed then the upper input terminal receives N/2 input signal bits and the

    lower input terminal receives the other N/2 input bits.

    Figure 10: Individual Stage of FHT. Labels: Upper Input Terminal, Lower Input Terminal, Enable, adder/subtractor, Output to Next Stage of FHT.

    Figure 11: 16 chip FHT
    [Figure labels: Input Data Bits from Buffer from Stage 1; Data to FHT; Phase 1 to
    Phase 5; Shift Register; Adder; Adder/Subtractor; Enable (En); 3 Bit Counter with
    MSB (C2) and LSB (C0); Hadamard Code Metrics; Hadamard Row Ids; Buffer for Slot 1 to
    Slot 15; Comparator; Detector; ROM 32X60; Comma Free Codes; Table 4 3GPP 25.213
    v4.0; Register to Store Code Group and Slot ID; Sampling Counter; Slot Boundary
    Value; Stage 1 Complete]


    Figure 11 shows the design of a FHT structure used for decoding a 16 chip sequence.
    The proposed design is more compact and efficient than previous designs [13] [14].
    The inputs to the FHT are applied according to the timing diagram shown in Table 5.
    Because the inputs are applied in a non-sequential order, a buffer is required to
    store the vectors before passing them to the FHT structure. To decode a 16 chip
    sequence, a buffer of 16 registers is required. The addition and subtraction
    operations in the FHT algorithm generate correlation coefficients for the received
    Walsh code. The correlation coefficients express the likelihood that a received
    codeword is the correct Walsh code.

    Table 5: Timing Diagram of Inputs to FHT

    Phase 1 Upper Input    0  1  2  3  4  5  6  7
    Phase 1 Lower Input    8  9 10 11 12 13 14 15
    Phase 2 Upper Input    0  1  2  3
    Phase 2 Lower Input    4  5  6  7
    Phase 3 Upper Input    0  1
    Phase 3 Lower Input    2  3
    Phase 4 Upper Input    0
    Phase 4 Lower Input    1
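    The pattern in Table 5 can be generated for any power-of-two length. The sketch
    below assumes that the table entries are index positions at the input of each phase;
    the function name fht_input_schedule is hypothetical.

```python
def fht_input_schedule(n):
    """Return [(phase, upper_indices, lower_indices), ...] for an n-point FHT."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    schedule = []
    width = n // 2            # number of inputs per terminal in phase 1
    phase = 1
    while width >= 1:
        upper = list(range(0, width))
        lower = list(range(width, 2 * width))
        schedule.append((phase, upper, lower))
        width //= 2           # each phase handles half as many inputs
        phase += 1
    return schedule


for phase, upper, lower in fht_input_schedule(16):
    print(f"Phase {phase} Upper Input", *upper)
    print(f"Phase {phase} Lower Input", *lower)
```

    For n = 16 the loop prints the eight rows of Table 5.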


    Figure 12: Hadamard Code Metrics (Butterfly Operation)
    [Figure columns: Input (y1 to y16), Phase 1, Phase 2, Phase 3, Phase 4. Phase 1 forms
    the pairwise sums and differences y1+y2, y1-y2, ..., y15+y16, y15-y16. Each later
    phase adds and subtracts pairs of results from the previous phase. Phase 4 yields
    the 16 metrics, from
    ((y1+y2)+(y3+y4))+((y5+y6)+(y7+y8))+((y9+y10)+(y11+y12))+((y13+y14)+(y15+y16))
    to
    ((y1-y2)-(y3-y4))-((y5-y6)-(y7-y8))-((y9-y10)-(y11-y12))-((y13-y14)-(y15-y16)).]


    The correlation coefficients are also called the Hadamard code metrics and are
    generated as shown in Figure 12 for a 16-point FHT. This operation is also called the
    butterfly operation, and it is used in other digital signal processing (DSP)
    applications such as calculating the discrete Fourier transform (DFT). The Walsh code
    having the largest metric is then selected as the code most likely to have been
    transmitted.
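    Selection by largest metric can be sketched in Python. For clarity the sketch uses
    direct correlation against every row rather than the FHT; both give the same
    metrics. The function names are hypothetical, the row index is 0-based, and the
    Sylvester ordering of the rows is an assumption.

```python
def hadamard_matrix(n):
    """Sylvester construction of the n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # H_2n = [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def most_likely_row(received):
    """Correlate `received` with every row; return (best_row_index, metrics)."""
    h = hadamard_matrix(len(received))
    metrics = [sum(c * r for c, r in zip(row, received)) for row in h]
    best = max(range(len(metrics)), key=lambda k: metrics[k])
    return best, metrics


# Row 5 (0-based) with one chip inverted to mimic a channel error.
sent = hadamard_matrix(16)[5]
received = list(sent)
received[3] = -received[3]
best_row, metrics = most_likely_row(received)
```

    With a single inverted chip the correct row still has the largest metric, which is
    why the largest coefficient is taken as the decision.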

    It is the job of the detector to find which code group and slot ID are being used,
    from the table provided in the 3GPP specifications [7], using the three Hadamard rows
    (Walsh codes). The detector needs to identify the code group in the minimum amount of
    time, which uses a lot of hardware resources. Also, if the correct sequence of
    Hadamard rows is not identified and given to the detector, additional clock cycles
    are wasted while it tries to find the sequence in the table provided in the 3GPP
    specifications. The detection circuitry locates the sequence in the table and hence
    finds the code group and slot ID. In the 3GPP-comma free CSD implementation, two
    clocks are not needed. Even if two clocks were used, only a marginal gain would be
    achieved, and only in the detection phase (phase 5) shown in Figure 11. This is
    because detection of the code group and slot ID cannot start until at least three
    slots have been identified by phases 1 to 4.
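    The table search performed by the detector can be sketched as a cyclic window match.
    The table below is a small hypothetical example with three groups and five slots per
    frame; it does not contain the 3GPP values, and the function name detect_code_group
    is likewise an assumption.

```python
# Hypothetical toy table: group -> Hadamard row IDs, one per slot.
# These are NOT the 3GPP code group sequences.
TOY_TABLE = {
    0: [1, 1, 2, 3, 4],
    1: [1, 3, 2, 4, 4],
    2: [2, 2, 1, 4, 3],
}


def detect_code_group(table, observed):
    """Return (group, slot_id) whose cyclic window equals `observed`.

    slot_id is the 0-based slot in which the first observed row was received.
    Returns None when there is no match or the match is ambiguous.
    """
    observed = list(observed)
    matches = []
    for group, sequence in table.items():
        length = len(sequence)
        for slot in range(length):
            window = [sequence[(slot + k) % length] for k in range(len(observed))]
            if window == observed:
                matches.append((group, slot))
    return matches[0] if len(matches) == 1 else None


result = detect_code_group(TOY_TABLE, [3, 4, 1])
```

    In the toy table every cyclic window of three rows is unique, so three observed rows
    identify both the group and the slot position. A wrong row sequence matches nothing,
    which corresponds to the wasted search cycles described above.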

    The number of stages in the FHT design depends on the length of the Walsh sequence.
    Each subsequent stage receives an input from the previous stage in half the number of
    clock cycles required for the previous stage. This is achieved by reducing the length
    of the shift register by a factor of two for each subsequent stage of the FHT.


    A counter is used as a clock to determine the time interval at which each successive
    pair of input signals is received by the FHT. The upper shift registers in each of
    the stages are always enabled, whereas the lower shift registers are enabled by the
    bits of the counter. The counter bit C0 is the LSB and C2 is the MSB. Counter bit C2
    alternates every four clock cycles (000...011, 100...111), and bit C0 alternates
    every clock cycle (000, 001, ... etc.). The number of bits in the counter depends on
    the number of stages, which in turn depends on the length of the Walsh-Hadamard
    sequence to be used. If there are N Walsh chips then the counter length must be
    log2 N bits. The length of the shift register in stage s of the design is given by
    the relation (N/4)/2^s. For example, the length of the shift registers used in the
    first stage of the FHT is (16/4)/2^0 = 4. The lengths of the registers used in the
    other stages can be calculated similarly.
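    The sizing relations quoted above can be written as small Python helpers. The
    function names are hypothetical, and stage numbering starts at s = 0 for the first
    stage, as in the worked example in the text.

```python
def counter_bits(n_chips):
    """Counter length in bits for N Walsh chips: log2(N), as stated in the text."""
    if n_chips < 2 or n_chips & (n_chips - 1):
        raise ValueError("n_chips must be a power of two")
    return n_chips.bit_length() - 1


def shift_register_length(n_chips, stage):
    """Shift register length in stage s (s = 0 is the first stage): (N/4)/2^s."""
    return (n_chips // 4) // (2 ** stage)


def counter_bit(cycle, k):
    """Value of counter bit Ck at a given clock cycle (C0 is the LSB)."""
    return (cycle >> k) & 1


first_stage_length = shift_register_length(16, 0)
c2_waveform = [counter_bit(cycle, 2) for cycle in range(8)]
c0_waveform = [counter_bit(cycle, 0) for cycle in range(8)]
```

    For N = 16 the first stage length evaluates to 4, matching the example in the text.
    The two waveforms show C2 changing every four cycles and C0 changing every cycle.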

    In the first stage, the input signals corresponding to Walsh chips 0 to 7 arrive at
    the upper adder, whereas Walsh chips 8 to 15 are applied to the adder/subtractor
    circuit in the lower half of stage 1. During the first four clock cycles, the data
    bits from the adder unit are selected by multiplexer 1 in stage 1, and the lower
    shift register of stage 1 is enabled to store the outputs from the adder/subtractor
    unit. Thus, at the end of four clock cycles, the upper shift register stores the
    result of addition of the first four pairs, whereas the lower shift register stores
    the result of subtraction. In the fifth clock cycle, C2 goes high, which disables the
    lower shift register in stage 1. The result of the upper shift register in stage 1
    and the adder output from stage 1, which gives the addition of a new pair of inputs,
    are then passed on to the adder and adder/subtractor unit in stage 2. Thus, each
    subsequent stage receives its input from the previous stage. This process is repeated
    for each of the other stages in the FHT. At the end of eight clock cycles, all 16
    correlation coefficients have been generated, and the Walsh-Hadamard codeword with
    the largest coefficient is selected as the one most likely to have been transmitted.
    The design is flexible and can easily be modified to handle any chip sequence whose
    length is a power of two.

    5.2 Reduced Length FHT Design

    Careful inspection of the 256X256 matrix shows that the 256 chip sequences can be
    identified by the 16 chip sequences shown in Table 6.

    Table 6: Reduced Length Walsh Sequences (256 chip sequence to 16 chip sequence)

    Row   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
      1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
      2   1  -1   1  -1   1  -1   1  -1   1  -1   1  -1   1  -1   1  -1
      3   1   1  -1  -1   1   1  -1  -1   1   1  -1  -1   1   1  -1  -1
      4   1  -1  -1   1   1  -1  -1   1   1  -1  -1   1   1  -1  -1   1
      5   1   1   1   1  -1  -1  -1  -1   1   1   1   1  -1  -1  -1  -1
      6   1  -1   1  -1  -1   1  -1   1   1  -1   1  -1  -1   1  -1   1
      7   1   1  -1  -1  -1  -1   1   1   1   1  -1  -1  -1  -1   1   1
      8   1  -1  -1   1  -1   1   1  -1   1  -1  -1   1  -1   1   1  -1
      9   1   1   1   1   1   1   1   1  -1  -1  -1  -1  -1  -1  -1  -1
     10   1  -1   1  -1   1  -1   1  -1  -1   1  -1   1  -1   1  -1   1
     11   1   1  -1  -1   1   1  -1  -1  -1  -1   1   1  -1  -1   1   1
     12   1  -1  -1   1   1  -1  -1   1  -1   1   1  -1  -1   1   1  -1
     13   1   1   1   1  -1  -1  -1  -1  -1  -1  -1  -1   1   1   1   1
     14   1  -1   1  -1  -1   1  -1   1  -1   1  -1   1   1  -1   1  -1
     15   1   1  -1  -1  -1  -1   1   1  -1  -1   1   1   1   1  -1  -1
     16   1  -1  -1   1  -1   1   1  -1  -1   1   1  -1   1  -1  -1   1


    Thus, in a CDMA receiver, only the first 16 chips of the entire Walsh sequence need
    to be used. The buffer used to store the input values is also reduced in length from
    256 to 16 registers. These design changes lead to considerable savings in hardware
    resources, and the reduced length Walsh sequence helps in achieving faster decoding.
    The FHT designs using 16 and 256 chip sequences were synthesized and the hardware
    resources utilized were compared on a Xilinx Virtex-E XCV1000E FPGA.
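    The length reduction can be checked numerically under one assumption that the text
    does not state explicitly: that the 256 chip sequences of interest are the first 16
    rows of a Sylvester-ordered 256X256 Hadamard matrix. The function name
    sylvester_hadamard is hypothetical.

```python
def sylvester_hadamard(n):
    """Sylvester construction of the n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # H_2n = [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


h16 = sylvester_hadamard(16)
h256 = sylvester_hadamard(256)

# First 16 chips of each of the first 16 rows of the 256 x 256 matrix.
prefixes = [row[:16] for row in h256[:16]]

# Under the stated assumption, the prefixes reproduce the 16 x 16 matrix and
# each 256 chip row is its 16 chip prefix repeated 16 times.
prefixes_match = prefixes == h16
rows_repeat = all(h256[r] == h16[r] * 16 for r in range(16))
```

    Because the 16 prefixes are all different, the first 16 chips are enough to tell
    these rows apart, which is what allows the buffer to shrink from 256 to 16 registers.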


    Chapter 6

    Experimental Method and Results

    6.0 Experimental Method and Results

    This Chapter explains the method used to measure the acquisition time for both of the
    cell search designs, the Improved CSD and the 3GPP-comma free CSD. Section 6.1.1
    provides details of the FPGA used for prototyping the algorithms and for comparing
    the hardware specifications of both designs. Section 6.2 presents the acquisition
    time measurements and the hardware comparison. It also compares the hardware
    utilization of the FHT design using 256 and 16 chip sequences.

    6.1 Experimental Method

    The acquisition time was measured by counting the number of clock cycles used by the
    RTL simulation. Since the input chip rate is given by the 3GPP specifications, the
    clock cycle count can be converted into an acquisition time. For comparing the
    hardware specifications and the maximum frequency of operation of both designs on the
    FPGA, the Xilinx Foundation ISE software was used to generate the bit map file for
    programming the FPGA. The details of the FPGA and the design process used for the
    hardware comparison are explained in Section 6.1.1.
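    The conversion from a cycle count to a time can be sketched as follows. The W-CDMA
    chip rate of 3.84 Mcps, 2560 chips per slot and 15 slots per frame are standard
    values. The sketch assumes one clock cycle per chip, which the text does not state,
    and the function name cycles_to_msec is hypothetical.

```python
CHIP_RATE_HZ = 3.84e6      # W-CDMA chip rate
CHIPS_PER_SLOT = 2560
SLOTS_PER_FRAME = 15


def cycles_to_msec(cycles, clock_hz=CHIP_RATE_HZ):
    """Convert a clock cycle count to milliseconds.

    Assumption: the simulation clock runs at `clock_hz`; by default this is
    one clock cycle per chip.
    """
    return 1000.0 * cycles / clock_hz


slot_msec = cycles_to_msec(CHIPS_PER_SLOT)
frame_msec = cycles_to_msec(CHIPS_PER_SLOT * SLOTS_PER_FRAME)
```

    Under these assumptions one frame of 15 slots corresponds to 10 msec. A different
    clock-to-chip ratio can be modelled by passing another value for clock_hz.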


    6.1.1 FPGA Design Process

    The FPGA used for prototyping the designs is a Xilinx Virtex-E XCV1000E BG560 with a
    speed grade of 6. As the name suggests, FPGAs are capable of being reconfigured to
    implement any desired digital circuit. This is made possible by having a large number
    of small configurable logic blocks (CLBs) and a connection mechanism between these
    blocks, which is used to interconnect the CLBs according to the design. The basic
    building block of the Virtex-E CLB is the logic cell (LC). Each Virtex-E CLB contains
    four LCs, organized in two similar slices, as shown in Figure 13 [20]. A LC includes
    a 4-input function generator, carry logic, and a storage element. Virtex-E function
    generators are implemented as 4-input look-up tables (LUTs). Along with the LUTs, the
    CLB also contains D flip-flops for storing data. The output from the function
    generator in each LC drives both the CLB output and the D input of the flip-flop. A
    detailed view of a Virtex-E slice is shown in Figure 14 [20].


    Figure 13: 2-Slice Virtex-E CLB

    Figure 14: Detailed View of Virtex-E Slice


    The entire design was coded in Verilog at the Register Transfer Level (RTL). The RTL

    design was then synthesized using the Synopsys FPGA Express synthesis tool available

    with the Foundation ISE software. The bit map generated was then used to program the

    FPGA using the JTAG cable.

    6.2 Experimental Results

    To compare the acquisition time of the Improved CSD and the 3GPP-comma free CSD,
    experiments were carried out using input vectors generated in Matlab. The threshold
    values determined for the two false alarm probabilities (PFA = 10^-3 and
    PFA = 10^-4) were 28 and 37 respectively. The number of clock cycles between the
    start of the system and the point when the counter in stage 3 exceeds the computed
    threshold value was determined. The equivalent gate count and maximum frequency of
    operation were compared for both designs using a 256 chip sequence in stage 2 and the
    same design constraints in the FPGA Express synthesis tool on a Xilinx Virtex-E
    XCV1000E FPGA.
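    The measurement described above amounts to finding the first clock cycle at which
    the stage 3 counter exceeds the threshold. A minimal Python sketch is given below.
    The thresholds 28 and 37 come from the text; the counter trace and the function name
    are hypothetical and are not measured data.

```python
def cycles_until_threshold(counter_values, threshold):
    """Return the 1-based clock cycle at which the counter first exceeds
    `threshold`, or None if it never does."""
    for cycle, value in enumerate(counter_values, start=1):
        if value > threshold:
            return cycle
    return None


# Hypothetical counter trace, one value per clock cycle (illustration only).
trace = [0, 5, 12, 20, 27, 28, 29, 36, 38]

cycles_pfa_1e3 = cycles_until_threshold(trace, 28)   # threshold for PFA = 10^-3
cycles_pfa_1e4 = cycles_until_threshold(trace, 37)   # threshold for PFA = 10^-4
```

    A higher threshold is crossed later in the same trace, so a lower false alarm
    probability costs additional clock cycles.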

    From the experiments conducted, it was observed that the Improved CSD uses fewer
    slots in stage 2 to achieve synchronization than the 3GPP-comma free CSD. The results
    indicate that when averaging is carried out over 15 slots in stage 1 of both designs
    (PFA1 = 10^-3 and VTH1 = 28), the Improved CSD has an acquisition time of 13.66 msec
    as compared to 14.53 msec for the 3GPP-comma free CSD. Thus, the Improved CSD
    achieves an improvement of 0.87 msec for an AWGN channel (Figure 15). Similarly, an
    improvement of 0.87 msec was observed when PFA2 = 10^-4 and VTH2 = 37 (Figure 16).
    Figures 15 and 16 show the acquisition time measures for 2, 4, 8 and 15 slots in
    stage 1 of the design. The number of slots in the other stages, as discussed in
    previous Chapters, was kept fixed: one slot in stage 2 of the Improved CSD, three
    slots in stage 2 of the 3GPP-comma free CSD, and 15 slots in stage 3 of both designs.


    Figure 15: Comparison of Improved CSD and 3GPP-comma free CSD, PFA = 10^-3
    [Plot: "Acquisition Time Measures: Quantization 4 Input Data Bits"; x-axis: Number
    of Slots in Stage 1; y-axis: Acquisition Time (in msec); curves: Improved CSD,
    3GPP-comma free CSD]

    Figure 16: Comparison of Improved CSD and 3GPP-comma free CSD, PFA = 10^-4
    [Plot: "Acquisition Time Measures: Quantization 4 Input Data Bits"; x-axis: Number
    of Slots in Stage 1; y-axis: Acquisition Time (in msec); curves: Improved CSD,
    3GPP-comma free CSD]


    As seen from Table 7, the Improved CSD had a lower equivalent gate count (136,297
    versus 144,180) and a higher maximum frequency of operation (22.066 MHz versus
    12.887 MHz) on a Xilinx Virtex-E XCV1000E FPGA than the 3GPP-comma free CSD when the
    same constraints were used in the synthesis of both designs.

    In the FHT design, the input Walsh sequence length can be reduced from 256 chips to
    16 chips to reduce the hardware utilization. This leads to considerable savings in
    hardware resources. The buffer used to store the input values is reduced in length
    from 256 to 16 registers, and the reduced length Walsh sequence helps in achieving
    faster decoding. The FHT designs using 16 and 256 chip sequences were synthesized and
    the hardware resources utilized were compared using a Xilinx Virtex-E XCV1000E FPGA.
    The hardware utilization of both FHT designs is compared in Table 8.

    The results for the reduced length sequence indicate that the FHT design using a 16
    chip sequence achieves a 90% reduction in hardware resources (equivalent gate count)
    as compared to the design which uses a 256 chip sequence. Also, the maximum frequency
    of operation of the 16 chip FHT (35.679 MHz) is more than double that of the 256 chip
    FHT (16.025 MHz).

    Table 7: Hardware Specifications of System: Quantization 4 Input Data Bits

    FPGA XCV 1000E BG560    Number of Slice   Number of 4   Equivalent   Max. Frequency of Operation
    Speed Grade 6           Registers         Input LUTs    Gate Count   (Post Route Timing)
    Improved CSD            9086              7354          136297       22.066 MHz
    3GPP-comma free CSD     10141             7777          144180       12.887 MHz

    Table 8: Hardware Specifications of FHT: 16 and 256 chip sequence

    FPGA XCV 1000E BG560    Number of Slice   Number of 4   Equivalent   Max. Frequency of Operation
    Speed Grade 6           Registers         Input LUTs    Gate Count   (Post Route Timing)
    FHT 16 chips            71                173           1591         35.769 MHz
    FHT 256 chips           1070              1370          17,191       16.025 MHz
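    The comparisons drawn from Tables 7 and 8 can be cross-checked with a few lines of
    Python. The numbers are copied from the tables (the 16 chip FHT frequency is the
    Table 8 entry); the function name reduction_percent is hypothetical.

```python
def reduction_percent(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Table 8: FHT with 256 chips versus FHT with 16 chips.
fht_gate_reduction = reduction_percent(17191, 1591)
fht_frequency_ratio = 35.769 / 16.025

# Table 7: 3GPP-comma free CSD versus Improved CSD.
csd_gate_reduction = reduction_percent(144180, 136297)
csd_frequency_ratio = 22.066 / 12.887

print(f"FHT gate count reduction: {fht_gate_reduction:.1f}%")
print(f"FHT frequency ratio:      {fht_frequency_ratio:.2f}")
print(f"CSD gate count reduction: {csd_gate_reduction:.1f}%")
print(f"CSD frequency ratio:      {csd_frequency_ratio:.2f}")
```

    The FHT gate count reduction comes out slightly above 90% and the frequency ratio is
    above 2, consistent with the statements in the text.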


    Chapter 7

    Summary, Conclusions and Future Work

    7.0 Summary, Conclusions and Future Work

    In this Chapter, the conclusions drawn from the experimental results are summarized
    and the scope for future work is outlined.

    7.1 Summary

    In Chapter 2, we discussed some of the previous work done by other research groups

    and also the 3GPP working group suggestions. Chapter 3 introduced the cell search algo-

    rithm, which is divided into three stages to simplify the synchronization between the MS

    and the BS. Chapter 4 discussed the Improved CSD which is the proposed design scheme

    to perform initial cell search. The hierarchical matched filter design proposed by Siemens

    and Texas Instruments was used in stage 1 of both the cell search designs [6]. In stage 2 of

    the initial cell search algorithm, two possible design schemes were compared: the

    Improved CSD which uses cyclic codes and the 3GPP-comma free CSD using the comma

    free codes. The details of the Improved CSD are described in Chapter 4. In stage 3 of

    both the cell search designs, masking functions are proposed to reduce the hardware utili-

    zation as compared to the previous design described by Li et al. [4]. Chapter 5 described

    the 3GPP-comma free CSD using a FHT design in stage 2 of the cell search algorithm.

    Further design improvements are suggested in the FHT design by reducing the length of


    the input Walsh sequence from 256 chips to 16 chip sequences. Chapter 6 discussed the

    experimental method and presented the results in terms of acquisition time and hardware

    utilization for both the Improved CSD and the 3GPP-comma free CSD. The hardware uti-

    lization of the FHT design using 256 chip sequences and the reduced length (16 chip

    sequences) are also presented.

    7.2 Conclusions

    For an AWGN channel model in a high signal-to-noise ratio environment, it was found
    that accumulation over one slot in the Improved CSD scheme and accumulation over
    three slots in the 3GPP-comma free CSD scheme in stage 2 of the cell search algorithm
    give correct code group and slot boundary identification. Because fewer slots are
    required, the Improved CSD uses fewer clock cycles in stage 2 than the 3GPP-comma
    free CSD to detect the code group and slot ID. This reduction in the number of clock
    cycles leads to faster acquisition, fewer dropped calls and lower power consumption
    during the synchronization between the MS and the BS. The use of cyclic codes in the
    Improved CSD also results in lower hardware utilization and a higher maximum
    frequency of operation than the 3GPP-comma free CSD. In conclusion, the Improved CSD
    is a better cell search design than the 3GPP-comma free CSD since it has a faster
    acquisition time and lower hardware utilization.


    7.3 Future Work

    This thesis investigates code and time synchronization of the cell search algorithm. In

    addition to code and time synchronization, frequency synchronization between the MS

    and the BS needs to be achieved. The receiver design presented in this thesis would need

    to include another module to achieve frequency synchronization. Also, the cell search

    considered in this thesis is initial cell search. There is another cell search called target cell

    search which needs to be performed during a call and when a MS is in motion and moves

    from one cell to another. VLSI implementations to perform target cell search efficiently

    need to be investigated.

    Kiessling et al. [21] suggest performance enhancements to the W-CDMA initial cell
    search algorithm. The authors consider the advantages of oversampling and passing multiple

    candidates in the cell search stages instead of one candidate to reduce the cell search time.

    Passing multiple candidates in each of the stages will reduce the cell search time but

    increase the design complexity a