00049932


  • 7/29/2019 00049932

    1/5

    A NEW SPEECH PROCESSING SCHEME FOR ATM SWITCHING SYSTEMS

    Takao SUZUKI, Osamu NOGUCHI, Kiyoshi YOKOTA and Yasuo SHOJI

    Digital Communications Laboratories, Oki Electric Industry Co., Ltd.
    4-10-3, Shibaura, Minato-ku, Tokyo 108 Japan

    ABSTRACT

    From the viewpoint of speech communication services for an Asynchronous Transfer Mode (ATM) network, and in order to establish the necessary conditions for speech processing over an ATM network, we have developed a new speech processing scheme applied at the end of the ATM network. Speech signals are processed basically by two techniques: silence deletion for speech compression and low-bit-rate speech coding with 32-kbit/s Adaptive Differential PCM (ADPCM). In order to reduce speech quality degradation caused by lost ATM cells in network congestion conditions, we have proposed a new cell reconstruction algorithm using waveform substitution for ADPCM-coded speech based on the pitch estimation method. In addition, to maintain good speech quality we applied some new algorithms for speech processing. We confirmed through subjective evaluation tests that the proposed speech processing scheme for the ATM network could provide good speech quality up to a cell loss rate of about 3% for public communications. We have developed two kinds of custom LSIs for implementing these speech processing algorithms.

    1. INTRODUCTION

    For future multi-media communication systems, integrated packet switching systems using Asynchronous Transfer Mode (ATM) technology could become the leading candidates, and many studies have been carried out in this field. Broadband architecture for ATM network applications has been put forward in CCITT Study Group XVIII by the Broadband Task Group (BBTG). [1] ATM systems seem to have the flexibility needed for future communication services that are not yet clearly defined. However, it is important to assure the feasibility of speech communications, which may well become the earliest application as a major communication service in an ATM network.

    ATM is a transport multiplexing technique for Broadband ISDN (BISDN), where the information is packed into fixed-size "ATM cells", which are relatively short packets. For speech communications over an ATM network, in order to reduce speech quality degradation caused by lost ATM cells during network congestion, we have proposed a new cell reconstruction using waveform substitution for ADPCM-coded speech based on the pitch estimation method. In addition, to maintain good speech quality we have applied some new algorithms for speech processing.

    This paper describes the development and evaluation of the Speech Processing Unit (SPU) at the end of the ATM network, which provides high-performance use of the network and maintains good speech quality. We have carried out this development for experimental equipment which is combined with a conventional analog 2-wire subscriber interface. The SPU is equally applicable to speech terminals with an ISDN I-interface and with a future ATM interface. In section 2, functions for the ATM interface and speech processing algorithms are proposed. Custom LSIs which were used to implement these speech processing algorithms are described in section 3. Performance evaluations by segmental S/N measurements and subjective tests are made in section 4. Features of ADPCM cell reconstruction and feasibility for speech communications are discussed in section 5.

    2. FUNCTIONS AND ALGORITHMS OF SPEECH PROCESSING UNIT

    2.1 Functions for ATM Interface

    Fig. 1 shows the functional block configuration for an ATM interface, which is composed of a Speech Processing Unit (SPU) and a Cell Assembler/Disassembler (CAD). The functions of the SPU transmitter are speech detection, silence detection and ADPCM encoding. Those of the SPU receiver are ADPCM cell reconstruction, ADPCM decoding, silence reconstruction and noise injection. In the ATM cell assembler of the CAD, speech samples are collected until they fill an ATM cell for cell assembly. The ATM cell disassembler of the CAD has the functions of cell disassembly, delay jitter compensation and cell loss detection.

    So that the SPU and CAD functions of the ATM interface support circuit-mode speech services as adaptation layer functions over an ATM network, we have chosen a short cell information field size of 32 octets for speech samples, which corresponds to ADPCM speech packetization blocks of 8 ms (32 kbit/s × 8 ms = 256 bits = 32 octets).

    2.2 Speech Detection and Silence Detection

    On the SPU transmitter, the key technique of speech detection is precise estimation of the background noise power during pause or silence durations. [2] By calculating the silence noise power, an adaptive power threshold for speech detection and a noise level code for silence reconstruction are obtained.

    The signal power is estimated with an 8 ms moving window which has 64 speech samples of PCM coding, so it is possible to precisely detect the power variation of the speech waveforms. The short-time power P_n, obtained by a low-pass operation method [5] using the previous power P_{n-1} and the input signal X_n, can be calculated as:

    49.6.1. CH2655-9/89/0000-1515 $1.00 © 1989 IEEE


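    $$P_n = \left(1 - \frac{1}{64}\right) P_{n-1} + \frac{1}{64} X_n^2$$

    The speech/silence detection V, made by comparing the power P_n with the adaptive power threshold P_TH, can be given as:

    $$V = \begin{cases} 1 \ (\text{speech}) & \text{if } P_n \ge P_{TH} \\ 0 \ (\text{silence}) & \text{if } P_n < P_{TH} \end{cases}$$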
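As a rough illustration of the two expressions above, the following Python sketch applies the recursive power update and the threshold comparison sample by sample. The function names, the initial power `p0` and the use of a fixed threshold `p_th` are assumptions made for illustration; the adaptive update of the threshold from the estimated background noise power is not shown.

```python
def short_time_power(samples, p0=0.0, window=64):
    """Recursive (low-pass) short-time power:
    P_n = (1 - 1/N) * P_{n-1} + (1/N) * X_n^2, with N = window."""
    alpha = 1.0 / window  # N = 64 samples in 8 ms implies 8 kHz sampling
    p = p0                # assumed initial power (hypothetical parameter)
    powers = []
    for x in samples:
        p = (1.0 - alpha) * p + alpha * x * x
        powers.append(p)
    return powers


def detect_speech(powers, p_th):
    """Return 1 (speech) where P_n >= P_TH and 0 (silence) otherwise."""
    return [1 if p >= p_th else 0 for p in powers]
```

For a constant-amplitude input the estimate converges toward the squared amplitude, which is the behaviour expected of a first-order low-pass filter applied to the instantaneous power.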


    information is reconstructed and then input to the ADPCM decoder. After the ADPCM code is converted to the PCM signal, the PCM signal is again input to LSI-2. LSI-2 then adds the pseudo-noise produced from white noise to the silence duration. The continuous PCM signal is then transferred to the SLIC through the EC.

    3.2 LSI Architecture

    LSI-1 and LSI-2 have the same LSI architecture but have different micro-programs for the digital signal processor (DSP). The LSI architecture provides powerful performance for autocorrelation and power calculation, suitable memory capacity for cell reconstruction, and direct interfacing to the ADPCM CODEC, EC and CAD. For example, the autocorrelation calculation R(j, k) in Eq. (6) is processed in two cycles using the pipeline technique. This makes possible real-time, no-delay operation for cell loss compensation.
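To make the role of the autocorrelation calculation more concrete, the sketch below shows a generic pitch-based waveform substitution on PCM samples. It is not the authors' ADPCM-domain algorithm and does not reproduce Eq. (6): the function names, the lag search range of 20 to 147 samples and the energy-normalized correlation score are all assumptions chosen for illustration. The history buffer is assumed to hold at least `window + min_lag` samples.

```python
import math


def estimate_pitch_lag(history, min_lag=20, max_lag=147, window=64):
    """Pick the lag whose delayed segment best matches the most recent samples."""
    n = len(history)
    recent = history[n - window:]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        if window + lag > n:
            break  # not enough past samples for this lag
        delayed = history[n - window - lag:n - lag]
        corr = sum(a * b for a, b in zip(recent, delayed))
        energy = sum(b * b for b in delayed)
        if energy == 0.0:
            continue
        score = corr / math.sqrt(energy)  # energy-normalized cross-correlation
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def substitute_lost_cell(history, cell_len=64, **kwargs):
    """Fill a lost cell by repeating the last estimated pitch period."""
    lag = estimate_pitch_lag(history, **kwargs)
    start = len(history) - lag
    return [history[start + (i % lag)] for i in range(cell_len)]
```

For a strictly periodic history the substituted cell continues the waveform without a phase jump, which is the intuition behind replacing a lost cell with pitch-aligned past samples rather than with zeros.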

    Fig. 2 shows the signal processing LSIs developed for the SPU. The LSI is produced using 2-µm CMOS technology with a chip size of 8.1 × 7.2 mm². The power consumption is 90 mW and the cycle time is 180 ns.

    4. PERFORMANCE EVALUATIONS FOR SPEECH PROCESSING UNIT

    4.1 Segmental S/N Measurements

    Using computer simulation, the ADPCM-coded cell information is randomly discarded. The segmental S/N is measured to confirm the effect of cell reconstruction. The segmental S/N (SNSEG) measurement is given by:

    $$\mathrm{SNSEG} = \frac{1}{P} \sum_{p=1}^{P} 10 \log_{10} \frac{\sum_{i=0}^{63} S(n-i)^2}{\sum_{i=0}^{63} \left[ S(n-i) - S_r(n-i) \right]^2} \quad \text{(dB)}$$

    where S(n-i) is the input signal of the ADPCM encoder, Sr(n-i) is the reconstructed output signal of the ADPCM decoder and P is the number of cells in the active speech duration.

    Fig. 3 shows the segmental S/N as a function of the cell loss rate for the G.721 ADPCM. Fig. 4 shows the same relationship for the Advanced ADPCM. In both cases, the pitch estimation method improves the segmental S/N over the full range of the cell loss rate as compared with the zero substitution method.
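The segmental S/N measure defined above can be sketched in Python as follows. The function name and the handling of cells with zero signal or zero error energy are assumptions; the caller is assumed to pass only samples from the active speech duration, split into cells of 64 samples.

```python
import math


def segmental_snr(original, reconstructed, cell_len=64):
    """Average per-cell S/N in dB over consecutive cells of `cell_len` samples."""
    ratios_db = []
    for start in range(0, len(original) - cell_len + 1, cell_len):
        s = original[start:start + cell_len]
        r = reconstructed[start:start + cell_len]
        signal = sum(x * x for x in s)
        noise = sum((x - y) ** 2 for x, y in zip(s, r))
        if signal == 0.0 or noise == 0.0:
            continue  # assumption: skip cells where the ratio is undefined
        ratios_db.append(10.0 * math.log10(signal / noise))
    if not ratios_db:
        raise ValueError("no cell with a defined S/N")
    return sum(ratios_db) / len(ratios_db)
```

Because the ratio is averaged in dB per cell, a single cell replaced by zeros contributes 0 dB and pulls the average down sharply, which is why this measure is sensitive to the choice of substitution method.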

    4.2 Subjective Evaluation Tests

    The real-time evaluation equipment shown in Fig. 5, which provides a pseudo-configuration of the ATM network, comprises two SPUs. Using this equipment, we have evaluated the total speech quality for speech communications in an ATM network. The speech terminal is an analog telephone, and the speech signal is transmitted to the SPU through the 2-wire analog line, the SLIC and the EC. The speech information is randomly discarded in the Digital Impairment (DI) block, which corresponds to the pseudo-network.

    The informal subjective evaluation test with the relative opinion score was carried out by 20 listeners using 5 s speech samples from 2 male and 2 female speakers. In order to evaluate the improvement in speech quality accurately, we applied the 5-point grades used with the relative opinion score. [7] To obtain reference criteria for normal telephone use, noise and naturalness of speech were evaluated as follows:

    Score   Impairment Scale
    4       Imperceptible
    3       Perceptible but not Annoying
    2       Slightly Annoying
    1       Annoying
    0       Very Annoying

    Fig. 6 shows the relative opinion score as a function of the cell loss rate for the proposed pitch estimation method using Advanced ADPCM-coded speech. The proposed pitch estimation method shows a remarkable improvement in speech quality for the discarded cells. It is possible to maintain good speech quality up to a cell loss rate of 3% using cell information of 32 octets.

    5. DISCUSSION

    5.1 Features of ADPCM Cell Reconstruction

    When embedded ADPCM coding [8] is applied to the ATM network, it is necessary, for example, to divide ADPCM-coded speech into high-priority and low-priority ATM cells. The low-priority cells are discarded in extreme congestion conditions at a network node. Hence the processing delay for reconstructing ATM cells may be increased by priority processing.

    The proposed ADPCM cell reconstruction, which substitutes ADPCM code for lost cells based on the speech pitch estimation method, has the following features:

    • No priority processing requirement in the network.
    • Short processing delay for reconstructing speech.
    • Good speech quality at a cell loss rate of a few percent.

    5.2 Feasibility for Speech Communications

    The proposed speech processing scheme does not require any particular priority processing at each network node in the ATM network. At an ATM node composed of transmission and switching systems, ATM cells must be transferred at high speed; therefore this speech processing scheme may be preferable in terms of the end-to-end delay over the ATM network.

    In the case of no priority processing at a network node, we consider the ATM network to be a simple ten-node tandem network of queuing systems with independent processes. Assuming that ATM cells are statistically discarded at an output buffer of the node and that the buffer size K is sufficiently small, we apply the M/D/1/K queue model. [9] Fig. 7 shows the probability of cell loss as a function of the offered load using the M/D/1/K model. Supposing that a cell loss rate up to 3% (≤ 3 × 10^-2) can be accepted for speech communications with the proposed speech processing scheme, we obtain high offered loads: ρ = 0.81 when K = 10 and ρ = 0.92 when K = 20.
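For readers who want to explore the M/D/1/K model numerically, the following Python sketch computes the loss probability of a single node from the embedded Markov chain at departure epochs, using the standard M/G/1/K relation between the departure-epoch empty probability and the blocking probability. The function name is hypothetical, and K is assumed here to be the total system capacity including the cell in service; the paper's exact definition of buffer size, and whether its 3% target applies per node or end to end, are not stated, so the results need not match Fig. 7.

```python
import math


def mdk_loss_probability(rho, k):
    """Cell loss probability of a single M/D/1/K queue with offered load rho.

    Assumption: k is the total system capacity (cell in service included).
    """
    if rho <= 0.0 or k < 1:
        raise ValueError("rho must be positive and k at least 1")
    # a[j]: probability of j Poisson arrivals during one deterministic service time
    a = [math.exp(-rho) * rho ** j / math.factorial(j) for j in range(k)]
    # Unnormalized departure-epoch probabilities pi[0..k-1] of the embedded chain,
    # obtained recursively from the balance equations for states 0..k-2.
    pi = [1.0]
    for j in range(k - 1):
        tail = sum(pi[i] * a[j - i + 1] for i in range(1, j + 1))
        pi.append((pi[j] - pi[0] * a[j] - tail) / a[0])
    pi0 = pi[0] / sum(pi)
    # Standard M/G/1/K result: P_loss = 1 - 1 / (pi0 + rho)
    return 1.0 - 1.0 / (pi0 + rho)
```

With k = 1 the expression reduces to ρ / (1 + ρ), the loss probability of a single server with no waiting room, which is a convenient sanity check. The forward recursion is adequate for the small K considered here; for much larger K a more numerically robust solver would be preferable.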

    5.3 Aspects of PCM Speech Services

    We covered 32-kbit/s ADPCM-coded speech for ATM cells in this paper. However, the application of regular 64-kbit/s PCM speech was also considered. We studied and developed the PCM cell reconstruction of speech processing in the same custom LSI. For PCM speech processing we applied a new cell loss reconstruction algorithm based on the waveform matching method. We confirmed that this PCM cell reconstruction could provide good speech quality up to a cell loss rate of about 8% using speech information of 32 octets, which corresponds to PCM speech blocks of 4 ms.

    6. CONCLUSION

    In this paper we have proposed a new speech processing scheme for an ATM network. As a result of the subjective evaluation tests, we have confirmed that good speech quality for speech communications was obtained up to a cell loss rate of about 3% for ADPCM speech and about 8% for PCM speech. Features of ADPCM cell reconstruction and feasibility for speech communications were discussed.

    ACKNOWLEDGEMENT

    The authors would like to thank Dr. Atsushi Fukasawa, General Manager of Digital Communications Laboratories, for his encouragement and support. We would like to thank Mr. Shosaku Tsukagoshi, General Manager of System VLSI R & D Dep. B, for his help in the development of the LSIs. Thanks are also due to our colleagues for their cooperation during the subjective evaluation tests.

    REFERENCES

    [1] CCITT Recommendation I.121, "Broadband Aspects of ISDN", Temporary Document 140, June 1988.
    [2] J. F. Lynch Jr., J. G. Josenhans and R. E. Crochiere, "Speech/Silence Segmentation for Real-Time Coding via Rule Based Adaptive Endpoint Detection", IEEE ICASSP '87 Proceedings, pp. 1348-1351, 1987.
    [3] Y. Shoji, O. Noguchi and T. Suzuki, "Development of High Performance DCMS with 3-bit and 4-bit Coding ADPCM", IEEE ICC '88 Proceedings, pp. 1598-1602, 1988.
    [4] D. J. Goodman, O. G. Jaffe, G. B. Lockhart and W. C. Wong, "Waveform Substitution Techniques for Recovering Missing Speech Segments in Packet Voice Communications", IEEE ICASSP '86 Proceedings, pp. 105-108, 1986.
    [5] L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals", Prentice-Hall, 1978.
    [6] H. Ando, O. Noguchi, R. Miyamoto, S. Tsukagoshi and N. Yonekura, "DSP LSI Configuration to Implement an Advanced ADPCM Scheme", IEEE ISCAS '87 Proceedings, pp. 919-922, 1987.
    [7] N. S. Jayant and P. Noll, "Digital Coding of Waveforms", Prentice-Hall, pp. 658-665, 1984.
    [8] D. O. Bowker and C. A. Dvorak, "Speech Transmission Quality of Wideband Packet Technology", IEEE GLOBECOM '87 Proceedings, pp. 1887-1889, 1987.
    [9] K. Sriram and W. Whitt, "Characterizing Superposition Arrival Processes in Packet Multiplexers for Voice and Data", IEEE J. SAC, SAC-4, 6, pp. 833-846, 1986.

    Fig. 1 Functional Block Configuration for ATM Interface
    [Figure: Analog Telephone, 2-wire analog line, SLIC, EC, Speech Processing Unit (SPU) and ATM cell ASM/DASM forming the ATM Interface (Adaptation Layer), connected to the ATM-SW in the ATM Network. SLIC: Subscriber Line Interface; EC: Echo Canceller.]


    Fig. 2 Signal Processing LSI (LSI-1, LSI-2 for SPU)

    Fig. 3 Segmental S/N vs. Cell Loss Rate (G.721 ADPCM)
    [Plot: x-axis Cell Loss Rate (%); curves for Pitch Estimation (ADPCM) and Zero Substitution (ADPCM).]

    Fig. 4 Segmental S/N vs. Cell Loss Rate (Advanced ADPCM)
    [Plot: x-axis Cell Loss Rate (%); curves for Pitch Estimation (ADPCM) and Zero Substitution (ADPCM), with a No Cell Loss reference point; cell length 32 octets.]

    Fig. 5 Evaluation Equipment (Hardware Simulator for ATM Interface)

    Fig. 6 Relative Opinion Score vs. Cell Loss Rate (Advanced ADPCM)
    [Plot: x-axis Cell Loss Rate (%); curves for Pitch Estimation (ADPCM), Zero Substitution (ADPCM) and Zero Substitution (PCM), with a reference point for ADPCM speech with no cell loss.]

    Fig. 7 Probability of Cell Loss vs. Offered Load (M/D/1/K Model)
