Apple Computer Inc. v. Burst.com, Inc.

Filing 108

Declaration of Allen Gersho in Support of 107 Response of Burst.com, Inc.'s Opposition to Plaintiff Apple Computer, Inc.'s Motion for Summary Judgment on Invalidity Based on Kramer and Kepley Patents, filed by Burst.com, Inc. (Attachments: # 1 Exhibit A to A. Gersho Declaration, # 2 Exhibit B to A. Gersho Declaration, # 3 Exhibit C to A. Gersho Declaration, # 4 Exhibit D to A. Gersho Declaration, # 5 Exhibit E to A. Gersho Declaration, # 6 Exhibit F to A. Gersho Declaration (Part 1), # 7 Exhibit F to A. Gersho Declaration (Part 2), # 8 Exhibit G to A. Gersho Declaration, # 9 Exhibit H to A. Gersho Declaration, # 10 Exhibit I to A. Gersho Declaration (Part 1), # 11 Exhibit I to A. Gersho Declaration (Part 2), # 12 Exhibit J to A. Gersho Declaration, # 13 Exhibit K to A. Gersho Declaration, # 14 Exhibit L to A. Gersho Declaration, # 15 Exhibit M to A. Gersho Declaration, # 16 Exhibit N to A. Gersho Declaration, # 17 Exhibit O to A. Gersho Declaration) (Related document(s) 107) (Crosby, Ian) (Filed on 6/7/2007)

Apple Computer Inc. v. Burst.com, Inc., Doc. 108, Att. 13 (Case 3:06-cv-00019-MHP, Document 108-14, filed 06/07/2007).

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-30, NO. 4, APRIL 1982, p. 702
Trellis Waveform Coders (L. C. Stewart, R. M. Gray, and Y. Linde)

Abstract-New algorithms for the design of trellis encoding data compression systems are described. The main algorithm uses a training sequence of actual data from a source to improve an initial trellis decoder. An additional algorithm extends the constraint length of a given decoder. Combined, these algorithms allow the automatic design of a trellis encoding system for a particular source. The algorithms' effectiveness for random sources is demonstrated through performance comparisons with other source coding systems and with theoretical bounds. The algorithms are applied to the practical problem of the design of trellis and hybrid codes for medium-to-low-rate speech compression.

[Fig. 1. Data compression system: the source sequence x(j) enters a search encoding algorithm (delta modulation, Viterbi, or M, L algorithm); the resulting channel sequence u(j) drives a nonlinear filter, ADPCM, or fake process decoder to produce the reproduction sequence y(j).]

INTRODUCTION

In the past decade considerable progress has been made towards an understanding of tree and trellis encoding data compression systems. In these discrete-time or sampled-data systems (Fig. 1), a deterministic but possibly nonlinear decoding filter transforms the channel sequence into a reproduction sequence. The channel sequence is chosen by a tree or trellis search encoding algorithm to minimize the distortion between the reproduction sequence and the source sequence. Results in information theory have proved the existence of tree and trellis systems which operate close to the theoretical limits of performance, but these papers offer only existence proofs, without describing ways of actually constructing good codes [20], [15].

The study of encoding algorithms for tree and trellis codes has been of interest since the earliest work on convolutional channel coding. The Viterbi algorithm [11] is optimum for searching the trellis structures associated with finite-state decoders, but has computational cost exponential in the constraint length of the code. Some of the nonoptimal algorithms, such as the Fano and stack algorithms, have been derived from problems of decoding error correcting codes; others have been designed specifically for the tree source encoding problem. The M, L algorithm [21], for example, is simply a breadth-first tree search. Other search algorithms are classed as depth-first or metric-first (distortion-first) types [2], [3].

Because many effective encoding algorithms are known, the more difficult part of the design of a tree or trellis coding system lies in the design of the decoder.

Manuscript received June 2, 1981; revised September 29, 1981. This paper was presented in part at the IEEE International Symposium on Information Theory, Santa Monica, CA, February 1981. This work was supported in part by the Joint Services Electronics Program under Contract DAAG-0047, by the U.S. Army Research Office under Contract DAAG29-80-K-0073, and by the Xerox Corporation.
L. C. Stewart is with the Information Systems Laboratory, Stanford University, Stanford, CA 94305, and the Computer Science Laboratory, Xerox Palo Alto Research Center, Palo Alto, CA 94304.
R. M. Gray is with the Information Systems Laboratory, Stanford University, Stanford, CA 94305.
Y. Linde is with the Codex Corporation, Mansfield, MA 02048.

Decoders may be drawn from any of the traditional waveform coding techniques, such as predictive quantization. Such plagiarized decoder systems achieve improved performance by replacing their traditional encoders by a tree or trellis search algorithm. In the literature, various other methods for selecting decoders have been proposed. For speech sources, for example, attempts have been made at the design of tree decoders based on predictive quantizers which match the average correlation properties of speech [4]. Variational methods have been proposed to improve decoders based on predictive quantizers, and Linde and Gray have used the notion of a fake process to suggest decoders [5], [24]. While most of these codes based on traditional systems incorporate structures similar to recursive digital filters, and thus have such a large number of states that they generate tree codes, several investigators have used decoders incorporating either transversal digital filter approximations of recursive filters or decoders originally based on finite impulse response models [23]. These decoders have a relatively small number of states and, therefore, a trellis structure.

In this paper we describe new algorithms for the design of trellis encoding data compression systems. The main algorithm uses a training sequence of actual data from a source to improve an initial trellis decoder. An additional algorithm extends the constraint length of a given decoder. Combined, these algorithms allow the automatic design of a trellis encoding system for a particular source. The decoders designed with these techniques are locally optimal and, we conjecture, globally optimal for certain sources [17]. The algorithms' effectiveness for random sources is demonstrated through performance comparisons with other source coding systems and with theoretical bounds. The algorithms are also applied to the practical problem of the design of trellis and hybrid codes for medium-to-low-rate speech compression. Additional information, including a Soundsheet with the speech coding results, may be found in [34]. The goal of this paper is not the presentation of particular coding results, but the presentation of design methods which offer the potential of improving the performance of any existing trellis coder, or of constructing a new coder for a source of entirely unknown character.

BLOCK SOURCE CODE DESIGN ALGORITHM

As a step towards the description of the trellis code design algorithms, we describe an algorithm for the design of block source codes, or vector quantizers, based on a long training sequence of symbols from a source. This algorithm is a multidimensional version of a quantizer design method of Lloyd [25] and is more completely reported by Linde et al. [22].
An N-level, k-dimensional quantizer is a function f that assigns to each input vector x a reproduction vector x' drawn from a finite alphabet A = {y_i : i = 0, ..., N-1}. The function f returns the index of the selected reproduction vector, or codeword. The quantizer is described by the reproduction alphabet and an encoding rule. For a particular input (training) sequence {x_j : j = 0, ..., n-1}, the encoding rule induces a partition S = {S_i : i = 0, ..., N-1} of the input sequence into the disjoint sets S_i = {j : f(x_j) = i} of the time indexes of those input vectors mapping into the ith reproduction vector.

The fidelity of reproduction is measured by a nonnegative distortion measure d(x, x'). Many distortion measures have been proposed for various applications, but perhaps the best known is the squared-error measure

d(x, x') = sum_{i=0}^{k-1} (x_i - x_i')^2.

An obvious encoding rule maps each source vector into the reproduction vector giving minimum distortion. A tie-breaking rule is necessary, but since ties are generally low-probability events, nearly any rule will do.

The initial conditions for the design algorithm are N, the desired number of reproduction vectors, A^0, an initial quantizer, and {x_j}, a long training sequence of symbols from the source. The algorithm consists of the repetition of two steps: finding the best encoding of the training sequence for a given set of reproduction vectors, and finding the best set of reproduction vectors for the given encoding.

Encode: Given A^m, the reproduction alphabet of generation m, find the minimum distortion partition S^m = P(A^m) by mapping each element of the training sequence to the minimum distortion reproduction vector: f(x) = i such that d(x, y_i) <= d(x, y_j) for all j, with some tie-breaking rule.

Update Reproduction Alphabet: Given a partition S^m, find the minimum distortion reproduction alphabet for generation m+1, A^{m+1} = A(S^m) = A(P(A^m)), by setting y_i^{m+1}, the ith reproduction vector of the new alphabet, to the value giving the minimum average distortion over the training sequence vectors indexed by elements of S_i^m. This value will be the generalized centroid, or center of gravity under the distortion measure, of those training sequence values which were reproduced by the value y_i^m. In the case of the squared-error distortion measure, this center of gravity calculation is just the sample average over a partition: the new codeword is the average of those training sequence samples encoded by the old codeword.

Repeated application of these two steps must result in decreasing, or at least nonincreasing, sample average distortion over the training sequence. Since the average distortion is nonnegative and decreasing, it must eventually reach a limit. Although the limit may not be the global optimum, it is at least a local optimum. The algorithm may be stopped either when a fixed point is reached (when no changes occur in the reproduction vectors or partitions), or when the reduction in average distortion per iteration falls below some threshold.

If the vector source {x_j} is ergodic and stationary, the quantizer designed by this algorithm will work as well on data from outside the training sequence as on the training sequence itself. However, operation of the design algorithm does not depend on either stationarity or ergodicity.
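The two-step iteration above is straightforward to express in code. The following is a minimal sketch rather than the authors' implementation: it assumes squared-error distortion, a training set of k-dimensional vectors, and the simplest empty-cell rule (keep the old codeword); the function name and parameters are illustrative.

```python
import numpy as np

def design_block_quantizer(training, codebook, threshold=1e-3, max_iter=50):
    """Iterative (Lloyd-style) design of an N-level vector quantizer.

    training: (n, k) array of training vectors {x_j}
    codebook: (N, k) array holding the initial reproduction alphabet A^0
    Returns the improved codebook and its average squared-error distortion.
    """
    codebook = np.array(codebook, dtype=float)
    prev_dist = None
    for _ in range(max_iter):
        # Encode: map each training vector to its minimum-distortion codeword.
        dists = ((training[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        index = dists.argmin(axis=1)                    # the partition S^m
        avg_dist = dists[np.arange(len(training)), index].mean()

        # Stop when the relative reduction in distortion falls below the threshold.
        if prev_dist is not None and (prev_dist - avg_dist) / prev_dist < threshold:
            break
        prev_dist = avg_dist

        # Update: replace each codeword by the centroid (sample average)
        # of the training vectors in its partition cell.
        for i in range(len(codebook)):
            cell = training[index == i]
            if len(cell) > 0:
                codebook[i] = cell.mean(axis=0)
            # Empty cell: keep the old codeword (one of the rules discussed later).
    return codebook, avg_dist
```

An initial codebook can be taken, for example, as N vectors drawn from the training sequence itself; each pass can only lower the sample average distortion, mirroring the convergence argument above.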
TRELLIS CODE DESIGN ALGORITHM

The most general case of a trellis decoder consists of a finite-state machine driving a table-lookup codebook of decoder output values. Symbols arriving from the channel drive the finite-state machine, which in turn selects reproduction symbols from the codebook. In what follows, we specialize the finite-state machine to the traditional case of a shift register containing the most recent k channel symbols (Fig. 2). The contents of the shift register are used as a table index to select the decoder output codeword. The trellis code design algorithm seeks to fill in the contents of the codebook.

[Fig. 2. Shift register implementation of a one bit per symbol trellis decoder (constraint length 9): channel symbols shift through the register, whose contents address the table-lookup codebook; the selected codeword is the decoder output (to the D/A).]

Suppose that we have a table-lookup trellis decoder, a single-symbol additive distortion measure, and an optimal encoding (search) algorithm, such as the Viterbi algorithm, which will find the trellis encoding of a source sequence giving the lowest possible sample average distortion. The symbols of the channel sequence, path, or map chosen by the encoding algorithm associate the sequence of source symbols with the reproduction sequence of codewords. This association can be used as the partition function in a trellis adaptation of the block quantizer design algorithm. (See Table I.)

The initial conditions for the design algorithm are a finite-state decoder with initial codebook C^0 (such as the shift register decoder described above), a single-symbol distortion measure (such as squared error), and {x_j}, a long training sequence of symbols from some source. The algorithm consists of two key steps: finding the best encoding of the training sequence for a given codebook, and finding the best codebook for the given encoding. Executed alternately, these steps provide an iterative design algorithm for improving the initial trellis decoder.

Find Encoding: Given the trellis codebook for generation m, find the minimum average distortion encoding of the training sequence. This encoding induces a partition on the training sequence so that the elements of the partition cell corresponding to a certain codeword are the time indexes of those elements of the training sequence which were reproduced by that codeword.

Find Codebook: Given the partition for generation m, find the minimum distortion codebook for generation m+1. The updated value for a certain codeword will be the value giving the minimum average distortion over those elements of the training sequence indexed by the partition cell corresponding to that codeword.

The encoding function (the trellis search) does not necessarily map a source symbol into the minimum distortion codeword, but maps the entire training sequence into the minimum average distortion sequence of codewords.

Each iteration of this procedure can only result in decreasing, or at least nonincreasing, average distortion. Because the encoding algorithm is optimal, the new encoding can result in distortion no worse than the distortion due to using the old path with the new codewords, which, in turn, is at least as good as the distortion due to using the old path with the old codewords. If the same path is chosen twice, the algorithm has reached a fixed point and will proceed no further.

Each new codeword has the value which minimizes the average distortion over its partition cell. Each codeword in generation m+1 will be the centroid of those elements of the training sequence which were encoded by the corresponding codeword of generation m. Some rule must be adopted to cover the case of an empty partition cell. Some approaches to this problem are to retain the old codeword, to copy the value of another codeword, or to use the centroid of the entire training sequence as the new codeword.

If the distortion measure is squared error, d(x, x') = (x - x')^2, then the centroid computation for the updated codewords is particularly simple: y_i^{m+1} is the sample average of the training samples in partition cell S_i^m. This expression does not work when a partition cell is empty. As mentioned, there are many ways to handle this situation; the goal of the method selected must be to select a value that will be used by the next pass of the encoder. Once the partition cell is no longer empty, the design algorithm will take care of adjusting the value of the associated codeword.

TABLE I
TRELLIS DECODER DESIGN ALGORITHM

(0) Initialization: Given a distortion threshold e >= 0, a q-ary noiseless channel, a constraint-length R decoder, an initial codebook C^0 with cardinality ||C^0|| = N = q^R, and a training sequence {x_j : j = 0, ..., n-1}, set m = 0.
(1) Given C^m = {y_i^m : i = 0, ..., N-1}, the codebook for generation m, find the minimum distortion trellis encoding {x̂_j : j = 0, ..., n-1} of the training sequence. This encoding induces a partition on the training sequence {S_i^m : i = 0, ..., N-1} with S_i^m = {j : x̂_j = y_i^m}. Each set S_i^m = P(C^m, {x_j}) contains the time indexes of those elements of the training sequence which were encoded by codeword y_i^m.
(2) Compute the average distortion D_m = n^{-1} sum_{j=0}^{n-1} d(x_j, x̂_j).
(3) If at least one full iteration has been completed (m > 0) and the decrease in distortion has fallen below the threshold e, (D_{m-1} - D_m)/D_{m-1} < e, then halt with C^m as the final codebook. Otherwise go to step (4).
(4) Find the optimal codebook C^{m+1} for generation m+1 as C^{m+1} = C(P(C^m, {x_j})), with C^{m+1} = {y_i^{m+1} : i = 0, ..., N-1} and y_i^{m+1} the value y satisfying sum_{j in S_i^m} d(x_j, y) <= sum_{j in S_i^m} d(x_j, y') for all y'. Replace m by m+1 and go to step (1).
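A minimal sketch of the Table I iteration for a scalar source and squared error follows. It assumes the shift register state update (q*r + u) mod q^k described in the extension section below, allows the search to start from any register state, and keeps the old codeword when a partition cell comes up empty; the function names are illustrative, not the paper's notation.

```python
import numpy as np

def viterbi_encode(source, codebook, q=2):
    """Minimum average distortion encoding of `source` over the shift register
    trellis: entry r of `codebook` is the decoder output when the register
    holds the value r.  Returns the register state per source symbol (the
    partition indexes) and the total squared error of the best path."""
    n, N = len(source), len(codebook)
    cost = np.zeros(N)                               # any starting state allowed
    back = np.zeros((n, N), dtype=int)               # surviving predecessor states
    for j in range(n):
        new_cost = np.full(N, np.inf)
        for prev in range(N):
            for u in range(q):
                state = (q * prev + u) % N           # next register contents
                c = cost[prev] + (source[j] - codebook[state]) ** 2
                if c < new_cost[state]:
                    new_cost[state] = c
                    back[j, state] = prev
        cost = new_cost
    state = int(cost.argmin())                       # trace back the best path
    states = np.empty(n, dtype=int)
    for j in range(n - 1, -1, -1):
        states[j] = state
        state = back[j, state]
    return states, float(cost.min())

def improve_trellis_codebook(training, codebook, q=2, eps=1e-3, max_iter=20):
    """Table I style iteration: encode with the current codebook, then replace
    each codeword by the sample average of the training samples it reproduced."""
    codebook = np.array(codebook, dtype=float)
    prev_dist = None
    for _ in range(max_iter):
        states, total = viterbi_encode(training, codebook, q)
        avg_dist = total / len(training)
        if prev_dist is not None and (prev_dist - avg_dist) / prev_dist < eps:
            break
        prev_dist = avg_dist
        for i in range(len(codebook)):
            cell = training[states == i]
            if len(cell) > 0:                        # empty cells keep the old value
                codebook[i] = cell.mean()
    return codebook, avg_dist
```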
EXTENSION

In this section, we describe a method of constructing a decoder based on a shift register of length k+1 from a decoder of length k. Our method has the advantage of constructing a decoder with sample average distortion over the training sequence at least as low as that of the starting (shorter) decoder. We call this method extension because the new, longer decoder is constructed by adding an additional stage to the shift register of the starting decoder.

Again, the decoder of a trellis encoding system is implemented by a shift register driving a table-lookup codebook. Arriving channel symbols are shifted into the least significant end of the register, and the contents of the register act as the table address or codeword index to generate the decoder output. For a q-ary channel and a constraint-length k decoder, let the register contents be r. With the arrival of symbol u from the channel, the new register contents will be (qr + u) mod q^k, and the decoder will produce codeword y_{(qr + u) mod q^k}. Now suppose the shift register is extended by one stage to length k+1; the codebook must increase in size from q^k entries to q^{k+1} entries. We fill in the new codeword values by duplicating the old codebook q times, so that the symbol stored in the new register stage does not affect the decoder output. Let the old codebook contain codewords {y_i : i = 0, ..., q^k - 1}; then the extended codebook will contain codewords y'_i = y_{i mod q^k}, i = 0, ..., q^{k+1} - 1.

This procedure produces a decoder of constraint length k+1 whose behavior is identical to the behavior of the original constraint-length k decoder. This new decoder can be used as the initial guess for the decoder improvement algorithm.

From an initial constraint-length 1 decoder, alternate application of the decoder improvement algorithm and the extension algorithm permits the design of successively longer codes (Table II).
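In code, extension amounts to repeating the lookup table. A sketch under the assumptions above (the added stage occupies the most significant register position, so an extended index selects the old codeword at that index modulo q^k):

```python
import numpy as np

def extend_codebook(codebook, q=2):
    """Extension step: grow a constraint-length k codebook (q**k entries) into
    a constraint-length k+1 codebook (q**(k+1) entries) by repeating the old
    table q times, so the new register stage does not yet affect the output."""
    return np.tile(np.asarray(codebook, dtype=float), q)
```

Because the duplicated entries reproduce the old decoder's outputs exactly, the extended decoder starts from the same sample average distortion, and the improvement algorithm can only lower it.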
Extension can be illustrated by considering its application to a one bit per symbol (q = 2) system using squared-error distortion. Let the initial codebook for the constraint-length 1 decoder include codewords -1 and +1. The trellis code design algorithm for this constraint length selects a locally optimal two-level scalar quantizer for the training sequence. Assuming that the mean of the training data is 0, the optimum partition for the final constraint-length 1 decoder collects all the samples in the training sequence with positive values in one partition cell and all the samples with negative values in the other. We have S_0 = {j : x_j > 0} and S_1 = {j : x_j < 0}; ties are broken by assigning the associated sample to the first partition. The two codeword values will be the means, respectively, of the positive and negative training sequence samples.

The first extension of this decoder, to constraint length 2, contains four codewords. The original partition is refined in a very intuitive way: the partition cell containing the positive-valued samples of the training data is divided into two cells containing, respectively, the positive samples which were preceded in the training data by other positive samples, and those which were not. S_00 = {j : x_j > 0, x_{j-1} > 0} and S_10 = {j : x_j > 0, x_{j-1} < 0} are the two subcells created from S_0 (using binary notation for the partition cell indexes).

TABLE II
A COMBINED EXTENSION AND IMPROVEMENT ALGORITHM

Let C_k^m refer to a constraint-length k codebook at the mth iteration of the decoder improvement algorithm.
(1) Design an initial codebook C_1^0 by finding a good q-level quantizer for the training sequence.
(2) Given C_k^0, an initial code of a certain constraint length, use the trellis code design algorithm to improve it. The extent of this procedure will be governed by the value of e in the trellis code algorithm.
(3) If the constraint length k is sufficiently large, or the distortion provided by C_k^m sufficiently low, then halt.
(4) Use the extension algorithm to produce the codebook C_{k+1}^0 from C_k^m and return to step (2).
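A hypothetical driver for the Table II procedure, reusing the improve_trellis_codebook and extend_codebook sketches above; the initial q-level codebook shown here is only a placeholder for the "good q-level quantizer" of step (1), and the loop bound plays the role of step (3).

```python
import numpy as np

def design_by_extension(training, q=2, max_k=7, eps=1e-3):
    """Alternate Table I improvement and extension, as in Table II."""
    # Step (1): a crude initial constraint-length 1 codebook with q codewords.
    codebook = np.linspace(training.min(), training.max(), q)
    results = {}
    for k in range(1, max_k + 1):
        # Step (2): improve the code at the current constraint length.
        codebook, dist = improve_trellis_codebook(training, codebook, q, eps)
        results[k] = (codebook.copy(), dist)
        # Step (4): extend, unless the code is already long enough.
        if k < max_k:
            codebook = extend_codebook(codebook, q)
    return results
```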
CODING RANDOM SOURCES

The basic trellis code design algorithm guarantees only two things: it will eventually halt, and the average distortion obtained over the training data will be nonincreasing with each successive iteration. These guarantees are of considerable theoretical interest, but the algorithm would be of little practical value if it either failed to produce improved codes or converged slowly. In practice, its convergence is almost always extremely rapid (a few iterations usually suffice) and performance improvements are usually obtained.

The Gaussian i.i.d. and Gauss-Markov (autoregressive) sources are common yardsticks of source coding. In this section, one bit per symbol trellis codes designed by the iterative algorithm are compared with the fake process trellis codes of Linde-Gray, with some recent results by other researchers, and with the rate-distortion limits [18]. Since existing tree and trellis codes already perform quite close to the theoretical limits, there is not a lot of room for improvement, but the design algorithm is able to obtain 0.2-0.4 dB performance gains. Since the design may be accomplished off-line, these gains are obtained without additional operating complexity.

Rate-Distortion Functions

For the memoryless Gaussian source with unit variance and squared-error distortion, the rate-distortion function is R(D) = (1/2) lg(1/D), equivalently D = 2^{-2R}, where D is the average squared error and R is the information rate in bits per symbol. In this formula, R is the theoretical minimum information rate required for any data compression system with average distortion no greater than D [8].

A first-order Gauss-Markov source is the output of a first-order digital filter driven by Gaussian i.i.d. symbols:

x_j = a x_{j-1} + w_j, |a| < 1,

where w is distributed N(0, s^2). For s^2 = 1 and sufficiently small distortion, the Shannon lower bound for this source with squared-error distortion is

R(D) = (1/2) lg((1 - a^2)/D), D <= (1 - a)/(1 + a).

Memoryless Gaussian Sources

Fig. 3 compares decoders designed by the iterative design algorithm with the rate-distortion bound, with the best one-bit scalar quantizer (the Lloyd-Max quantizer [31]), and with the Linde-Gray scrambling function decoder [22].

[Fig. 3. Performance of fake process codes and trained random codes on the memoryless Gaussian source: mean squared error versus constraint length (equivalently, number of codewords, 4 to 512) for the Lloyd-Max quantizer, the scrambling function fake process decoder, the trained decoders, and the rate-distortion bound.]

Pearlman [32] reports squared-error performance of 0.303 and 0.301 for length 9 and 10 decoders, respectively, designed using a new theory for source coding with constrained alphabets. These results are very close to ours. Pearlman's decoders use only four discrete codeword values, but a fairly complex function is used to map the decoder register contents to the four codewords. Fehn and Noll [10] have obtained similar results using a randomly populated trellis.

The trained decoders of Fig. 3 were designed using the Viterbi encoding algorithm for ten iterations on a training sequence of 20 000 samples. Table-lookup decoders with random codewords were used as the "initial guess" for the design algorithm. The results of Fig. 3 represent the trained decoder performance on data from outside the training sequence.

From theory [8, ch. 4] we know that the reproduction values of a data compression system should have a distribution yielding the appropriate distortion-rate function (this technique is used in [10]). Since we would eventually like to use the trellis code design algorithm for sources with an unknown distribution, we may not be able to calculate the best R-D distribution for the initial codebook. For this reason, we have repeated the approach of [22] and used initial codewords drawn from the sample distribution of the source, based on the training sequence.

First-Order Gauss-Markov

Fig. 4 presents performance results of trained decoders for the first-order Gauss-Markov source with correlation coefficients between 0.35 and 0.95. Shift register decoders of constraint lengths 5 and 6 were designed by a combination of the extension algorithm and the decoder improvement algorithm, starting from initial constraint-length 1 decoders with codewords +1 and -1. The resulting decoders are compared with the rate-distortion function and with the truncated predictive quantization decoders (TPQD) described in [23].

[Fig. 4. Performance of TPQD codes and extension codes on first-order Gauss-Markov sources: mean squared error versus correlation coefficient for the original TPQD decoders and the extension decoders.]

The TPQD decoders are shift register trellis decoders obtained by truncating the impulse response of a predictive quantizer. When used with a trellis search algorithm as the encoder, the TPQD system can outperform ordinary predictive quantization. As demonstrated here, the iterative design algorithm can obtain still better performance. We note that tree-searched predictive quantizers [10] can obtain similar results for highly correlated sources.

Additional research of Gray and Linde for one bit per symbol block codes designed using the block code design algorithm yields results comparable to those reported here. For a = 0.85, a 128 codeword block code has been designed which yields an average distortion of 0.101. For a = 0.90, an average distortion of 0.073 has been obtained [18].
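The source model and the Shannon lower bound quoted above are easy to reproduce numerically. The following is a sketch that follows the paper's normalization (unit-variance innovations, with the bound stated for the small-distortion region given above); names are illustrative.

```python
import numpy as np

def gauss_markov(n, a, seed=0):
    """n samples of the first-order Gauss-Markov source
    x_j = a * x_{j-1} + w_j, with w_j i.i.d. N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    prev = 0.0
    for j, w in enumerate(rng.standard_normal(n)):
        prev = a * prev + w
        x[j] = prev
    return x

def shannon_lower_bound(D, a):
    """R(D) = 0.5 * lg((1 - a**2) / D) bits per symbol, valid in the
    small-distortion region D <= (1 - a) / (1 + a) quoted above."""
    if D > (1 - a) / (1 + a):
        raise ValueError("outside the small-distortion region")
    return 0.5 * np.log2((1 - a ** 2) / D)
```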
CODING SPEECH SOURCES

In this section, we turn to the problem of designing tree and trellis encoding systems for speech. We discuss trellis codes for the original speech waveform, trellis codes for the residual signal of linear predictive coding systems, and a tree encoding system using a hybrid decoder for the speech waveform. The results of the hybrid tree codes have been very encouraging, as these codes can provide good quality speech at rates of one bit per sample through an automatic design procedure.

Previous Tree and Trellis Codes for Speech

The use of tree and trellis techniques for speech coding has a long history. Most of the early work was in the area we have called plagiarized decoders: the application of a tree search algorithm to the encoding problem in a standard coding system. In the literature, this technique is usually referred to as delayed decision [7], [35]. By using a tree encoder, which delays the encoding process by a number of samples in order to observe the consequences of particular encodings, delayed decision systems are able to achieve improved performance. The general consensus has been that such methods can yield improvements in signal-to-noise ratio, but that the improvements are not usually audible [12]. A second class of approaches to tree coding of speech has been the design of tree decoders based on short- or long-term correlation functions of speech [4], [19], [30]. Systems both with fixed and with adaptive decoders have been built [10], [26], [37]. More recently, a third approach has been the use of an optimization algorithm to improve an initial decoder [5]. The technique uses a variational method and a training sequence to adapt the tap weights of a transversal filter. This approach is similar in spirit to the techniques of this paper, but considers only linear decoder structures.

Digital speech data sampled at 6500 Hz were made available by Signal Technology, Inc. of Santa Barbara, CA. Since the residual excited and hybrid coding systems discussed later are associated with linear predictive coding techniques, we will also refer to the speech data as broken into LPC frames of 128 samples each, corresponding to a rate of about 50 frames/s. The code design efforts discussed below used two segments of 600 frames, or about 12 s, each. One segment was used as the training sequence for the design algorithms. The second segment, by a different speaker, was used as a test sequence to check the performance of designed decoders on different data. While the speech segments were not phonetically balanced or otherwise selected, they were carefully recorded in a low noise environment and with good gain control.

The speech coding systems discussed here use the M, L algorithm for both decoder design and code operation. The algorithm parameters were set so that the algorithm maintained 20 path sequences as possible encodings, with an encoding delay of 31 symbols. It is widely held [12], [35] that a less intensive search will achieve most of the benefits of delayed decision encoding, but our primary purpose was the design of speech codes, rather than a search for the most efficient encoding algorithm. Trellis decoders were designed using the extension method from small initial codebooks. At each constraint length, the design algorithm was run for at most six iterations or halted when the change in signal-to-noise ratio measured in decibels fell below 1 percent. Decoders were designed for data rates of 1/2 bit/speech sample, 1 bit/sample, and 2 bits/sample.
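The (M, L) search itself is a simple breadth-first procedure. The sketch below, written with the parameters quoted above (M = 20 survivors, decision delay L = 31 symbols), keeps the M lowest-distortion paths through the shift register trellis, commits each channel symbol by following the currently best path, and discards survivors that disagree with a committed symbol. It is an illustration of the search idea under those assumptions, not the authors' implementation.

```python
import numpy as np

def ml_encode(source, codebook, q=2, M=20, L=31):
    """(M, L) breadth-first trellis search over the shift register decoder."""
    N = len(codebook)
    # Each survivor: (accumulated distortion, register state, channel symbols so far)
    survivors = [(0.0, 0, [])]
    released = []
    for j, x in enumerate(source):
        candidates = []
        for dist, state, path in survivors:
            for u in range(q):
                nxt = (q * state + u) % N            # next register contents
                candidates.append((dist + (x - codebook[nxt]) ** 2, nxt, path + [u]))
        candidates.sort(key=lambda c: c[0])          # keep the M best paths
        survivors = candidates[:M]
        if j >= L:
            # Decision with delay L: commit the oldest undecided symbol of the
            # best path and drop survivors that disagree with it.
            symbol = survivors[0][2][len(released)]
            released.append(symbol)
            kept = [s for s in survivors if s[2][len(released) - 1] == symbol]
            survivors = kept or survivors[:1]
    released.extend(survivors[0][2][len(released):])  # flush the final tail
    return np.array(released)
```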
Speech Waveform Trellis Codes

The simplest way to apply trellis encoding to speech is to build a trellis decoder for the original speech waveform. 3250 bits/s were required for the rate 1/2 code, 6500 bits/s were required for the rate 1 code, and 13 000 bits/s were required for the 2 bits/sample code.

Fig. 5 contains signal-to-noise ratios in decibels for the various speech waveform trellis codes designed, measured outside the training sequence. The information is presented as a function of the base two logarithm of the size of the codebook. This value is equivalent to the number of address bits required by a memory implementing the codebook.

[Fig. 5. Speech waveform trellis code performance. Figures are decibel signal-to-noise ratio versus log N, the base-2 logarithm of the codebook size. Curve a is for the 1/2 bit per sample code outside the training sequence, curve b for the rate 1 code, and curve c for the 2 bits per speech sample code.]

The decrease in performance of both the rate 1/2 code and the rate 1 code for large codebooks outside the training sequence indicates that the training sequence was of insufficient length for the design of codebooks with more than about 512 codewords. If the training sequence is too short, the decoder becomes specialized to the particular sequence rather than to the properties of the source. Audibly, as might be expected, the 3250 bit/s system is very noisy, although it is intelligible. The rate 1 systems are an improvement, although still quite noisy, and the rate 2 systems, at 13 000 bits/s, are still sufficiently noisy to be classed as "communications quality," not quite up to the standards of long distance telephony.

Linear Predictive Coding and Vector Quantization

Linear predictive coding of speech has been very successful [27]. The basic principle of the method is a decomposition of the speech signal into an excitation function, e.g., the vocal cords, and an all-pole linear filter model of the vocal tract. LPC systems operate by breaking up the speech signal into segments or frames, estimating the filter and excitation parameters for each frame, and transmitting quantized versions of the parameters to the decoder, which uses the values to operate a synthesizer. The length of a frame is typically 10-50 ms. While there are many methods for estimating the filter parameters, all seek to minimize the power of the error or residual signal [14].
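For reference, one common way to obtain such a minimum-residual-energy filter is the autocorrelation method solved by the Levinson-Durbin recursion. The paper does not specify its analysis procedure, so the sketch below is only one standard choice; it assumes a nonsilent frame and a tenth-order predictor by default.

```python
import numpy as np

def lpc_coefficients(frame, order=10):
    """All-pole (LPC) analysis of one speech frame by the autocorrelation
    method, solved with Levinson-Durbin; minimizes the residual energy.
    Returns (a, gain) for the analysis filter A(z) = 1 - sum_k a[k] z**-k."""
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order)
    err = r[0]                                       # assumes r[0] > 0 (nonsilent frame)
    for i in range(order):
        acc = r[i + 1] - np.dot(a[:i], r[i:0:-1])
        k = acc / err                                # reflection coefficient
        a_new = a.copy()
        a_new[i] = k
        a_new[:i] = a[:i] - k * a[i - 1::-1]
        a = a_new
        err *= (1.0 - k * k)
    return a, np.sqrt(err)

def residual(frame, a):
    """Prediction residual: pass the frame through the analysis filter A(z)."""
    pred = np.zeros(len(frame))
    for k, ak in enumerate(a, start=1):
        pred[k:] += ak * frame[:-k]
    return frame - pred
```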
The speech residuals, the result of passing the original speech waveform through the LPC analysis filter A(z) or A(z)/σ, play a role in the various methods for estimating the excitation function and are sometimes themselves quantized and transmitted.

There has been much work done on the best way to encode the LPC parameters for transmission [13]. Recent research into applications of the block quantizer design algorithm has led to vector quantization methods of encoding the gain and model parameters simultaneously [9], [33]. This method is illustrated by Fig. 6. During the design phase of the vector quantizer, a finite collection of gain/model combinations typical of speech is selected by using the block code design algorithm together with a distortion measure between the waveform segment and the filter coefficients. During operation, for each frame a parallel test of each filter is made in search of the one giving the minimum residual energy. The index of the filter is then transmitted.

[Fig. 6. Vector quantization of LPC.]

The most difficult part of LPC encoding has been the estimation of the excitation parameters, which determine whether a speech frame is voiced or unvoiced and, if voiced, determine the pitch. While there are a plethora of methods, none has been completely satisfactory. Part of the difficulty is that there are some speech sounds, like "v", that are partly voiced and partly unvoiced, and part is due to the desire that a speech coding system continue to operate in a reasonable manner in the presence of nonspeech background noise or with multiple simultaneous speakers.

In response to the problems of estimating the excitation part of the LPC decomposition, several systems have been proposed which encode either the actual LPC residual signal or some associated signal. Adaptive predictive coding (APC), the voice excited vocoder (VEV), and residual excited LPC (RELP) systems all fall into this category [6], [7], [36]. The major difficulty in each of these systems lies in the encoding of the spectrally flat residual signal while retaining a low transmission rate. A typical method used is to band limit the residuals and then to use some form of adaptive PCM to transmit the resulting signal. At the receiver, a nonlinearity is used to restore some energy to higher frequencies, and the resulting signal is used to excite a standard LPC synthesis filter. Tree and trellis encoding offers the potential of transmitting the residual signal at low rates without resorting to these downsampling and spectral extension techniques.
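A sketch of the per-frame filter selection of Fig. 6, reusing the residual helper above: every inverse filter in the codebook is tried on the frame and the one leaving the least residual energy wins, so only its index (9 bits for a 512-entry codebook) is transmitted. Estimating the gain from the winning residual's energy is an illustrative simplification of the combined gain/model codebook described in the text; all names here are assumptions.

```python
import numpy as np

def select_lpc_codeword(frame, filter_codebook):
    """Vector-quantized LPC in the spirit of Fig. 6.

    filter_codebook: list of predictor coefficient vectors (e.g., 512 entries).
    Returns the winning index and the gain-normalized residual of that filter.
    """
    best_index, best_energy, best_res = -1, np.inf, None
    for i, a in enumerate(filter_codebook):
        res = residual(frame, a)            # analysis filter from the sketch above
        energy = float(np.dot(res, res))
        if energy < best_energy:
            best_index, best_energy, best_res = i, energy, res
    gain = max(np.sqrt(best_energy / len(frame)), 1e-12)
    return best_index, best_res / gain      # normalized residual, as in the text
```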
Trellis Encoded Residuals

Transmission of a coded form of the residuals, combined with the model or model and gain portions of an LPC system, should have several advantages. First, since standard LPC systems successfully use either white noise or an impulse train as excitation functions, it is evident that errors in the details of the residual waveform or synthesizer driving process may not result in much perceptual deterioration of the speech at the synthesizer output. Second, coding of the actual residuals should tend to reduce the effects of failures of the voiced/unvoiced model and the effects of background noise.

For these systems, a vector quantized LPC system was used, operating with a codebook of 512 tenth-order filters for a rate of 9 bits per LPC frame, or about 450 bits/s. The LPC system supplied combined gain and model, so that the residual signal, generated by passing the original speech through the filter A(z)/σ, was normalized by the LPC gain. The LPC filter set was designed by Rebolledo using the block code design algorithm and the Itakura-Saito distortion measure [16], [33].

Using 450 bits/s of side information for the gain and model parameters, combined with an original sampling rate of 6500 Hz, trellis RELP speech coding systems were produced operating at 3700 bits/s, 6950 bits/s, and 13 450 bits/s for the 1/2, 1, and 2 bits/symbol cases. As in the case of the waveform coding experiments, the trellis decoders for the speech residuals were designed using the extension method, starting from constraint-length 1 decoders. The distortion measure used was the simple squared-error measure.

The most evident artifact in the RELP speech was a growl-like sound at the LPC frame rate. In our judgment, the RELP system offered improved quality over the waveform coder of Fig. 5 at equivalent rate.

Hybrid Tree Codes

In the trellis-coded RELP systems just discussed, the rate-constrained residual signal is reproduced according to the trellis code distortion measure (squared error) without regard for the eventual waveform after the receiver passes the decoded residuals through the LPC synthesis filter. It was with some surprise, therefore, that we observed that the output waveform of the RELP system was close to that of the original speech signal. As a result of this observation, we decided to test a composite decoder consisting of the σ/A(z) LPC synthesis filter combined with a trellis decoder front end. This "hybrid tree code" system is shown in Fig. 7. The only difference between the RELP system and this new system is that the hybrid system encoder seeks a channel sequence which will, at the decoder output, match the original speech waveform, while the RELP system encoder seeks a channel sequence which will, at the output of the trellis decoder, match the speech residuals.

[Fig. 7. Hybrid tree coding system: the encoder performs LPC codeword selection and transmits the LPC codewords (450 b/s) together with the trellis channel sequence; the decoder uses the trellis codebook and the LPC codebook to produce the reconstructed speech.]

As before, the LPC portion of the decoder consists of one of 512 tenth-order filters selected by a vector quantizer. The filter selection is made independently on each frame of speech. The trellis portion of the hybrid decoder is in fact the same decoder as was used in the RELP systems; it was designed using the trellis code design algorithm on the speech residuals from the training sequence.

Fig. 8 gives performance data for hybrid tree encoding systems operating at 1/2, 1, and 2 bits/sample for the driving process, plus 450 bits/s for the LPC portion of the system. As before, the figure gives signal-to-noise ratios in decibels for the entire 10 s speech segment. The combined channel rates for these codes are 3700, 6950, or 13 450 bits/s. Although these rates are the same as for the RELP system, the hybrid tree system produces considerably better perceptual quality.

[Fig. 8. Hybrid tree code performance. Performance figures are in decibel SNR for the original speech waveform versus log N. Curve a is for the 1/2 bit per sample code outside the training sequence, curve b for the rate 1 code, and curve c for the rate 2 code.]

In the usual predictive coding systems (including adaptive systems), the prediction filter is driven by a variable step-size quantizer, which is not a particularly good model of speech residuals. In the hybrid tree coder, the trellis front end has been specifically designed for speech residuals. It makes intuitive sense that the closer the driving process of an APC-like system is to residuals, the better the overall system will perform. Although the vector quantized LPC filters used in this paper are different from those used in most APC systems, the predictor from an APC system could be easily used.
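A sketch of the composite decoder of Fig. 7: a shift register trellis front end trained on residuals produces the excitation, which drives the selected all-pole synthesis filter. The encoder side (not shown) runs the same delayed-decision search as before but scores candidate paths against the original speech at the output of this whole chain rather than against the residuals. The filter representation, gain handling, and names below are assumptions for illustration.

```python
import numpy as np

def hybrid_decode(channel_symbols, trellis_codebook, a, gain, q=2):
    """Hybrid decoder sketch: trellis front end followed by LPC synthesis.

    a: predictor coefficients of the frame's selected filter (A(z) = 1 - sum a_k z^-k)
    gain: LPC gain for the frame, scaling the trellis-decoded excitation.
    """
    N = len(trellis_codebook)
    # Trellis front end: table-lookup decoding of the channel symbols.
    state, excitation = 0, []
    for u in channel_symbols:
        state = (q * state + u) % N
        excitation.append(trellis_codebook[state])
    # LPC synthesis: y[n] = gain * e[n] + sum_k a[k] * y[n - 1 - k].
    y = np.zeros(len(excitation))
    for n, e in enumerate(excitation):
        acc = gain * e
        for k in range(len(a)):
            if n - 1 - k >= 0:
                acc += a[k] * y[n - 1 - k]
        y[n] = acc
    return y
```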
Descriptions of trellis implementations of otherwise conventional APC systems may be found in [10]. In [30], a collection of hybrid decoders consisting of LPC filters with a fixed trellis front end is used for universal coding of a speech waveform. That system differs from the hybrid system described in this section in two ways: only 16 LPC filters are used, rather than the much larger set of 512 filters considered here, and a fake process trellis decoder matched to a Laplacian distribution is used, rather than a trellis decoder designed explicitly for speech residuals. Matsuyama and Gray do suggest in [30] the use of a spectral distortion measure for filter selection in place of universal coding, and this approach is taken in [29]. The same encoding method is used in [1], although the decoders considered do not include a trellis component and the encoder is a single path search such as in standard APC. In an informal subjective comparison, the hybrid tree encoding system described in this section has, for a given channel rate, given speech quality significantly improved over that of the similar system of [29].

SUMMARY

We have presented algorithms for the design of trellis encoding data compression systems based on a training sequence of data. The algorithms have been tested on random sources and applied to the problem of speech compression. The relatively poor performance of the fixed-trellis (nonadaptive) speech waveform coder, together with the observed waveform tracking ability of the RELP systems, suggests that it is advantageous to design a composite, or hybrid, decoder which makes use of knowledge about speech. In the hybrid tree system, a spectral distortion measure is used on a frame-by-frame basis to select that filter from a finite collection of LPC filters which best matches the short-term speech spectrum. The chosen filter is used together with a fixed trellis decoder designed to emulate speech residuals. The combined decoder is used as the code generator for a classic squared-error tree encoding system.

The design methods presented in this paper have two main applications. First, any existing coding system whose decoder can be described at least in part by a trellis structure can potentially be improved by the algorithms. Second, the algorithms can construct new trellis decoders from scratch for sources with unknown characteristics. Most other code construction methods depend upon special knowledge of the source.

REFERENCES

[1] J.-P. Adoul, L. J. Debray, and D. Dalle, "Spectral distance measure applied to the design of DPCM coders with L predictors," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Denver, CO, Apr. 1980, pp. 512-515.
[2] J. B. Anderson, "Effectiveness of sequential tree search algorithms in speech digitization," in Proc. Int. Conf. Commun., Boston, MA, June 1979, pp. 8.1.1-8.1.6.
__ , "Recent ' advancessequential in encoding of analog waveforms," in Proc. Nut. Telecommun. Conf., Birmingham, AL, Dec.1978,pp.19.4.1-19.4.5. J. B. Anderson and J. B. Bodie, "Tree encoding of speech," IEEE Trans. Inform. Theory, vol. IT-21, pp. 379-387, July 1975. J. B. Anderson and C. W. Law, "Real-number convolutional codes for speech-likequasi-stationarysources," IEEE Trans. Inform. Theory, vol. IT-23, pp. 778-782, Nov. 1977. predictivecoding of B. S . Atal and M. R. Schroeder, "Adaptive speech signals," Bell Syst. Tech. J . , vol. 49, pp. 1973-1986, Oct. 1970. D. W. Becker A. andJ. Viterbi, "Speech digitization and compression by adaptive predictive codingwith delayed decision," in Proc. IEEE Nut. Telecomrnun. Conf., New Orleans, LA, 1975, pp. 4611846123. T . Berger, Rate Distortion Theory, A Mathematical Basis for Data Compression. EnglewoodCliffs, NJ: Prentice-Hall, 197 1. A. Buzo, A. H. Gray, Jr., R. M. Gray, and J. D. Markel, "Speech coding based vector on quantization," IEEE Trans. Acoust., Speech, SignalProcessing, vol. ASSP-28, pp. 562-574, Oct. 1980. H. G . Fehnand P. NOH. "Tree and trellis coding of speechand stationary speech-like signals," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing. Denver, CO, Apr. 1980, pp. 547-550. G. D. Forney, Jr., "The Viterbi algorithm," Proc. IEEE. vol. 61, pp. 268-278, Mar. 1973. J.Gibson, coding D. "Tree versus differential encoding of speech," presented Int. Inform. at Conf. the `Theory, Santa Monica, CA, Feb. 1981. A. M. Gray, Jr., R. M. Gray, and J. D. Markel, "Comparison of IEEE optimal quantizations of speech reflection coefficients," Trans.Acoust.,Speech,SignalProcessing, vol. ASSP-25, pp. 9-23, Feb.1977. A. H. Gray and D. Y. Wong, "The Burg algorithm for LPC speech analysislsynthesis," IEEE Trans.Acoust.,Speech, SignalProcessing, vol. ASSP-28, pp. 609-615, Dec. 1980. R. M. Gray, "Time-invariant trellis encoding of ergodic discretetime sources with a fidelity criterion," IEEE Trans. Inform. Theory, vol. IT-23, pp. 71-83, Jan. 1977. R.M.Gray, A. Buzo, H. A. Gray, and Jr., Y. Matsuyama, "Distortion measuresspeech for processing," IEEE Trans. Acoust., Speech, Signal Processing, vol.ASSP-28, pp. 367-376, Aug. `1980. R. M. Gray, J. C. Kieffer, and Y. Linde, "Locally optimal block quantizerdesign," Inform.Contr., vol. 45,pp. 178-198, May 1980. Y. Linde, "Vector quantizers and predictive R.M. Gray and quantizersfor Gauss-Markov sources,'! IEEE Trans. Commun., vol. COM-30, pp. 381-389, Feb. 1982. N. S . Jayant and S.A. Christensen, "Tree-encodingof speech using the ( M , L)-algorithm adaptive and quantization," IEEE Trans. Commun., vol. COM-26, pp. 13761379, Sept. 1978. F.Jelinek,"Treeencoding ofmemoryless discrete timesources with a fidelity criterion." IEEE Trans. Inform. Theory, vol. IT-15, pp. 584-590, Sept.1969. F. Jelinek and J. B. Anderson, "Instrumentable tree encoding of information sources," IEEE Trans. Inform. Theory,vol. IT-17, PU. .. 118-1 19, Jan. `1971. Y. Linde,A.Buzo,andR.M.Gray,"An algorithm for vector quantizer design," IEEE Trans. Commun., vol. COM-28, pp. 8495, Jan,. 1980. Y. Lindeand R.M.Gray,"Thedesign of treeandtrellisdata compressionsystems,"Inform.Syst.Lab.rep., Stanford Univ., Stanford, CA, SEL Tech. Rep. 6504-2, Feb. 1978. , "A fake process approach to data compression," IEEE Trans. Commun. vol. COM-26, pp. 840-847, June 1978. S . P. Lloyd, "Least squares quantization in PCM's," Bell Labs. Paper, Murray Hill, NJ, 1957; also, IEEE Trans. Inform Theory, vol.IT-28,Mar.1982. J . W. 
Mark, "Adaptivetrellisencoding of discrete-timesources with a distortion measure,'' IEEE Trans. Commun.. vol. COM-25, pp. 408417,'Apr. 1977. Linear Prediction of J. Markel D. Gray, and H. A. Speech. New York: Springer-Verlag. 1976. Y. Matsuyama, "Process distortion measures and signal processing," presented at the Int. Conf. Inform. Theory, Santa Monica, CA, Feb. 1981. -, "Speech'compression systems using a set of inverse filters," Ph.D. dissertation, Inform. Syst. Lab., Stanford Univ., Stanford, CA,Aug.1978. Y. Matsuyama and R. M. "Universal coding Gray, tree for speech," IEEE Trans. Inform. Theory, vol. IT-27, pp. 31-40, Jan. 1981. J. Max, "Quantizing for minimum distortion," IRE Trans. Inform. Theory, vol. IT-6, pp. 7-12, Mar. 1960. W. A. Pearlman, "A sliding block code for small user alphabets with performance near the rate-distortion limit," presented at /he Int. Conf. Inform. Theory, Santa Monica, CA, Feb. 1981. G. Rebolledo, "Speech and waveform coding based on vector quantization,"Ph.D.dissertation,Inform.Syst.Lab.,Stanford Univ., Stanford, CA, 1982. L.C.Stewart,"Trellisdatacompression,"Ph.D.dissertation, Inform. Syst. Lab., Stanford Univ., Stanford, June CA, 1981; available as Tech. Rep: L905-I. J. Uddenfeldt and L. Zetterberg, H. "Algorithms delayed for IEEE encoding in delta modulation speech-like with signals," Trans. Commun., vol. COM-24, pp. 652-658, June 1976. C. K. Un and D. T. Magill, "The residual-excited linear prediction vocoder with transmission rate below 9.6 kbitsls," IEEE' Trans. Commun., vol. COM-23, pp. 14661474, Dec.1975. S . G. Wilson and S . Husain, "Adaptive tree encoding of speech at 8000 bitsls withfrequency a weighted error criterion,': IEEE Trans. Commun., vol.COM-27,pp. 165-170, Jan. 1979. D. Y. Wong, F. B. Juang, and A. H. Gray, "An 800 bps speech compression system basedon vector quantization," presented at the Int.Conf.Inform.Theory,SantaMonica,CA,Feb.1981;also, IEEE Trans. Acoust., Speech, Signal Processing,to be published. Case 3:06-cv-00019-MHP 7 10 Document 108-14 Filed 06/07/2007 Page 10 of 10 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-30, NO. 4, APRIL 1982 Lawrence C. Stewart (S'74-M'EO) received the S . B . degree from the Massachusetts Institute of Technology,Cambridge, in 1972andthe M.S. and Ph.D. degrees Stanford from University, Stanford, CA, in 1977 and 1981, respectively, all in electrical engineering. Since 1977, he has been with the Xerox Palo Alto Research Center, Palo Alto, CA, joining the Computer Science Laboratory there in 1981. His current interests are in data compression, packet radio, computer communications, and computer systems. Dr. Stewart is a mc:mber of Eta Kappa Nu and Tau Beta Pi. I


