# Full text of "High-speed recursive digital filter realization"

## See other formats

LIBRARY MONTEREY. CALIFORNIA 93940 NPS-62-84-061 <( POSTGRADUATE SCHOOL Monterey, California HIGH-SPEED RECURSIVE DIGITAL FILTER REALIZATION Herschel H. Loomis, Jr. Bhaskar Sinha 6 February 1984 public release; distribution unlimited. onic Systems Command D.C. 20360 FedDocs D 208. 14/2 NPS-62-84-061 NAVAL POSTGRADUATE SCHOOL Monterey, California Commodore Robert H. Shumaker Superintendent David A. Schrady Provost This work was supported by the Naval Electronic Systems Command, Washington, D.C. Reproduction of all or part of this report is authorized. This report was prepared by: ij axTa . LIBRARy SECURITY CLASSIFICATION OF THIS PAGE 0«f« Entmrmd) REPORT DOCUMENTATION PAGE READ INSTRUCTIONS BEFORE COMPLETING FORM 1. REPORT NUMBER 2. GOVT ACCESSION NO. NPS-62-84-061 3. RECIPIENT’S CATALOG NUMBER 4. title (and Subtltie) High-Speed Recursive Digital Filter Realization 5. TYPE OF REPORT 6 PERIOD COVERED Technical 6. PERFORMING ORG. REPORT NUMBER 7. AOTHORf*; Herschel H. Loomis, Jr. Bhaskar Sinha 8. contract or grant NUMBER('«J »■ PERFORMING ORGANIZATION NAME ANO ADDRESS Naval Postgraduate School Monterey, CA 93943 10. program ELEMENT. PROJECT, TASK AREA 6 WORK UNIT NUMBERS 62734N N0003983WRDY078 t1. CONTROLLING OFFICE NAME ANO ADDRESS Naval □ ectronic Systems Command Washington, DC 20360 12. REPORT DATE 6 February 1984 13. NUMBER OF pages 14. MONITORING AGENCY NAME & ADD RESS( H dl Karan t from ControUing OUlca) 15. SECURITY CLASS, (ot thl» ttpott) UNCLASSIFIED DECLASSIFICATION/ DOWNGRADING schedule 16. DISTRIBUTION ST AT EmEHT (o( thi a Raport) Approved for public release; distribution unlimited. 17. DISTRIBUTION STATEMENT (o( tha abatraet antarmd in Block 20» it dlUarant from Raport) IS. supplementary NOTES This is a preprint of a paper accepted for publication by CIRCUITS, SYSTEMS, AND SIGNAL PROCESSING. 19. KEY WORDS ^Continue on ravataa alda it nacaaamry and idantlfy by block numbar) Digital filters, high rate computation, high-speed logic, pipelined logic, recursive digital filters. 20. ABSTRACT ^Con(/nu« on ravaraa alda It naeaaamry and Idantlty by block numbar) Pipeline techniques have been successfully applied to speeding up processing in both general and special purpose digital computers. Application of these techniques to non-recursive (FIR) filters has been suggested and is quite straightforward. Application to recursive (HR) filters has not previously been shown. In this paper, the technique for applying pipeline techniques to recursive filters is shown and the advantages and disadvantages of the technique are discussed. Using these techniques, recursive digital filters DD I jaN^3 1473 EDITION OF t MOV 68 IS OBSOLETE 5/N 0102- LF- 014- 6601 SECURITY CLASSIFICATION OF THIS PAGE (Whmn Dmtm Snttmd) security classification of this page fWhan Dmim Bntmrmd) operating at hitherto impossibly high rates can be designed. S/N 0102- LF- 014- 6601 SECURITY CLASSIFICATION OF THIS PAGeni7i#n Bntmr^d) High-Speed Recursive Digital Filter Realization Herschel H. Loomis, Jr. Bhaskar Sinha Pre-print of paper to appear in CIRCUITS, SYSTEMS, AND SIGNAL PROCESSING High-Speed Recursive Digital Filter Realization* Herschel H. Loomis, Jr.** Bhaskar Sinha*** ABSTRACT Pipeline techniques have been successfully applied to speeding up processing in both general and special purpose digital computers. Application of these techniques to non-recursive (FIR) filters has been suggested and is quite straightforward. Application to recursive (HR) filters has not previously been shown. In this paper, the technique for applying pipeline techniques to recursive filters is shown and the advantages and disadvantages of the technique are discussed. Using these techniques, recursive digital filters operating at hitherto impossibly high rates can be designed. *The basic research for this idea was supported by Department of Electrical and Computer Engineering, University of California, Davis, Cal ifornia 95616. The basic idea is the subject of a pending patent application. Subsequent research was supported by the U.S. Naval Hectronic Systems Command. **Department of □ectrical and Computer Engineering, Naval Postgraduate School , Monterey, Cal ifornia 93943. On leave from the Department of □ectrical and Computer Engineering, University of California, Davis, California 95616. ***International Business Machines Corporation, Boca Raton, Florida 33432. High-Speed Recursive Digital Filter Realization ABSTRACT Pipeline techniques have been successfully applied to speeding up proc- essing in both general and special purpose digital computers. Application of these techniques to non-recursive (FIR) filters has been suggested and is quite straightforward. Application to recursive (IIR) filters has not previously been shown. In this paper, the technique for applying pipeline techniques to recursive filters is shown and the advantages and disadvantages of the tech- nique are discussed. Using these techniques, recursive digital filters oper- ating at hitherto impossibly high rates can be designed. 1 INTRODUCTION Digital filters have an important place in the technology of processing signals. Because of the sample theorem, the sampling rate which is also the clock rate of the filter, must be higher than twice the highest frequency component in the signal to be filtered. Thus the use of digital filters in real time application involving high frequencies demands high sample rate. Pipeline techniques are well known [1,2] ways to increase the clock rate (throughput) of a particular state-of-the-art realization of a logic module. For example, the Cray I supercomputer has a clock rate of 80 Mhz and requires a 7 stage pipeline multiplier to perform 64-bit floating point multiplication at the rate of one product every 12.5 ns. [3]. Other workers have investigated residue number techniques to obtain high rate digital filters [4]. However, difficulties in implementing division by a constant impose limitations on such real izations , making scaling awkward and precluding anything resembling floating point. Pipeline techniques can thus be used to produce multipliers and adders that operate at the maximum clock rate possible for a given state of the art of logic devices. For example, TRW makes a non-pipeline 16 bit integer multiply chip with input and output registers that can be operated at the rate of about 9 MHz. Pipeline techniques could be applied to that design, producing a new chip that might have three stages and operate at near 40 MHz. Discrete logic realization of the basic functions of multiplication and division can yield even better performance. At the extreme we have some of the integrated circuit ECL lines. Based on the experience gained with the pipeline supercomputers, it is reasonable to expect that a three stage 12-bit integer multiply and 1 to 2 stage 16-bit integer adder could be built to operate at around 80 MHz. Floating point operation could be obtained at the same rate 2 with a cost of perhaps only 2 additional stages for the multiplier and 3 more for the adder. What finally limits the clock rate in a pipeline system is the net delay of all logic and the register delay between successive registers, however pipeline techniques will yield the highest throughput or clock rate possible with a given state of the art. Now it should be clear why we should consider the application of pipeline techniques. 1) higher clock (sample) rates are possible for a given state of the art. 2) more complex operations (i.e. floating point) are possible for a given state of the art and clock rate. Pipeline techniques can be easily applied to non-recursive systems, that is, systems without feedback. In the digital filter arena, this would correspond to non-recursive or Finite Impulse Response (FIR) filters. These techniques are straight forward and proceed as follows: An FIR filter can be described by a transfer function in the delay operator (z"l or D) as in (1) I - H(z-l) = + a. +• . . + a -m m ( 1 ) Figure 1 shows the realization of such a filter using delayless add and multiply modules as well as unit delays. That realization is unrealistic and does not capitalize on the advantages of pipeline processors as discussed above. Let us suppose that we have a 1 -stage pi pel ine adder and a 3-stage pi pel ine mul tipi ier. 3 Figure 2 shows a pipeline realization of the transfer function given in (1). In fact, (1) is not exactly realized because of the delay of the adders. Instead the output of the filter is y delayed by 7 clock periods. This, however, is not a serious difficulty. Thus, pipeline realization of FIR filters is straight forward, costing only an added delay in the resulting output sequence rate, and perhaps some additional adders over the realization of figure 1, if the adder has more than one stage. What then about doing this for HR filters? Applying pipeline techniques to recursive calculations is not so strai ghtforward as it is in the non- recursive case, and was first reported in [1] in connection with accumulator design. In the next section we shall consider the design of recursive (HR) filters. Following that, we discuss the general attributes of the solution and some of the unsolved problems associated with the technique. Finally, an example of a sixth order Butterworth filter is treated, to illustrate the technique. 4 PIPELINE RECURSIVE DESIGN An order recursive or Infinite Impulse Response (IIR) filter can be characterized by a ratio of two nth degree polynomials in z“l, the delay operator, as shown in (2) y = ag + 3iZ ^ + ... + amZ 71 ^ 1 - bjZ ^ -b2Z ^ -n X = A(z) "ffTzT "" ( 2 ) This representation can be expressed using D, another notation for the delay operator, as shown in (3). Time is measured in integer steps by the index i. y(T) . A(D) ■ mi x(i) (aQ Sj^D + ... + a^^D ) x(i) (1 - b.D - ... - b d" 1 n ( 3 ) where D*^x(i) = x(i-k) Rearranging (3) yields the familiar recursive formulation of the filter equation, as shown in (4) y(i) = (ag + a^D + ... + a^o"’) x(i) + (b^D + ... + b^o") y(i) (4) Equation (4) leads directly to the canonical representation of the filter as shown in figure 3, assuming the availability of a multiply-add unit with unit delay (one clock pulse period delay). This realization actually does not yield y, but in fact produces Dy, y delayed by one clock pulse period. The canonical form shown in this figure can be realized, but unfortunately because the multiply-add unit is very complex, the realization will be quite slow. If we assume 12 bits of significance to the representation of the numbers in the filter, we find that the unit has two 1-input, 12-bit constant multipliers 5 feeding a three-input by 12-bit adder. All of this complexity forces the clock pulse period to be very long, and the clock or sampling rate as a consequence to be low. A typical value or clock rate using state-of-the-art Schottkey integrated circuit PROMs and adders would be on the order of 10 MHz. A slight improvement in the rate of operation is possible if we use the basic module shown in figure 4a. A delay of 1 clock pulse period is produced by the output register. This module could be built to operate at a clock rate of perhaps 12 MHz, compared with the realization of figure 3. What is desired is some means to take advantage of higher clock rate pipe- line realization of the basic multiply-add module, thus producing a higher sam- pling rate realization of the recursive digital filter. In order to increase the clock rate of the multiply-add unit, we first use the two input version, and then insert several registers for intermediate re- sults in the coefficient multiply and adder sections. For example, a 2-stage version is shown in Figure 4b that could be made to operate at 16 MHz. The ndt result is a kp, + k^-stage (register) pipeline multiply-add unit as shown in figure 5, with kp, stages involved in the coefficient multiply and k^ stages in the adder. When we attempt to use the module of figures 4 or 5, we find it impossible to insert it in the feedback loop, preserving a minimum delay of only one. So, if we are to use the pipeline module, some other way must be found. For the purposes of the derivation which follows, we will restrict our consideration to second-order filters, that is m=2, n=2. This in no way limits the result, for the process used is not dependent on n, except for the value of the individual coefficients. On the other hand, higher order filters can also be realized as the cascade of rn/2"I 2 ^^ order filters, and at most one filter of lower order. Now we let n=2 and consider that specific case of Y T " A(z"^) aQ + a^z”^ + a2z“^ 1 - ( 5 ) or represented in terms of the delay operator D, the usual computer representation of delay. x( i) _ yTTT 2 Qq + 820 1 - b^D - b20^ ( 6 ) Writing this in a different form we have y(i) = (aQ + a^O + a 2 D^)x(i) + (b^D + b 20 ^)y( i) ( 7 ) which can also be written in difference equation form: y, . a„x, t ajx,.; t ajX,.^ * (^1-1 * '>2^t-2> <«> For subsequent development we will use (7); furthermore, we will omit the (i) notation. We will relate that form to the other forms where appropriate. Let us first ignore the A(D) polynomial by substituting x' = A(D)x (9) into (7) yielding y = x' + (b^Dy + b20^y) (10) Delaying (10) and substituting for Dy in (9) we obtain y = x' + b 2 ^( 0 x' + bjD^y + b 20 ^y) + b 2 D^y 7 or y = (1 + b^D)x' + (b^ + b 2 )D^y + bjb 2 D^y (11) (11) Is now a third order difference equation describing the same digital fil- ter. That is to say, the filter described by (11) has the same transfer char- acteristic as the one described by (7) or (10). Delaying (10) by and substituting into (11) yields y = [1 + bjD + (b^ + b 2 )D^]x' + (2bjb2 + b^)D^y + {hj + b^b2)D^ (12) In general, successively more delayed versions of (10) can be substituted, raising the order of the difference equation, but, more importantly, raising also the minimum delay associated with the feedback y terms. The general higher order difference equation has the following general form, provided we started from a second-order equation: y = [1 + D + 4'’’ * ••• 4”’ P+1 P+2 (13) -th In fact, if we began with an n order difference equation, we would have an equation of p+n order resulting from the foregoing process as shown in (14) y = [1 + D + ... + Dp]x' (H) The coefficients ajP^ and b|P^ can be calculated by following the process described above in detail. For example, it should be clear from (12) that 8 ( 15 ) .( 2 ) = = b. .( 2 ) (2) - OK K 4. k3 . k(2) >2 “ 2b^b2 * ^4 2 2 + b^b2 where the original b coefficients are from (10). We will cover the process in detail a little later in this section. Now let us recall what x' is in terms of the original IIR filter as des- cribed in (7). Substituting x' = A(D)x into (13) yields: y = [1 + D + ... + nP] [ag + a^O + a20^]x This equation when multiplied out for the case of p=2 yields (17) y = + aj^^D + ... + a^^^D^^lx >(2) n3 + [bA ^ 0 y + b; ^ D y] (2) n4 (16) (17a) and w' = [aj^^ + aj^^D + ... + a^^^D^]x + w' + b|^^D^ w'] (l^la) These two equations look very similar now, the only difference being that the non-recursive parts of (18a) is the non-recursive part of (17a) delayed by A + 4. If both sides of (17a) are delayed by that amount, (17b) results: [a^2) * a|2)D . ... . af >D^]x (17b) 9 Finally, it should be noted that D^‘'’^y in (17b) is the same as w' in (18a). substituting D^''’^y = w' in (17b) produces (17c): w' = [a^^^ + aj^^ + ... + aj^^0^]x + w' + b|^^D^ w'] (17c) which is the same as (18a). What this all means is that the structure of figure 6 will produce an out- put w' which is simply the desired signal y, delayed by a+4 sampling periods. This digital filter structure uses pipeline logic units in both the recursive and non-recursive portions to realize the desired output. The design still needs to be completed, even though the heart of the structure has been developed. Figure 7 shows the structure of the non- recursive portion of the filter to complete the example. The circuit of figure 7 produces D^Cag + a^H +...+ a 4 D^]x which corresponds to the D^Cag + ajD +...+ a 40 ^]x term required as the out- put of the non-recursive part of figure 7. Therefore, the pipel ine digital filter of figure 7 combined with figure 6 produces that is y, de- layed by 8 sample periods. Now we return to the calculation of the and b^^^ coefficients. More general relations than (15) can be developed in difference equation form with respect to p. Starting with the p augmented order (p+2 order) difference equation (13) we can derive the p+1 augmented order equation by substituting for [(10) delayed by into (13). 10 y = [1 + ajP^ D + ... + dP]x' + b|jP{ x' +(bj dP"^^ y + b 2 dP'^^ y)] * b<P| y [1 + ,(P> D t ... + c,<P> dP t b<P> dP*']x’ * (b[,Sl bi - b<P>) dP "2 y . b'Pj bj 0 P" 3 y From (19) we can see that „(P+1) “p+1 . b';r> •b'Sl = b'?l bi * b(P> “ bp?l bj Where agP^= 1, b|*^^= bj, and b 2 *^^= b. (19) (20a) (20b) (20c) Table 1 shows the values of the coefficients of (13) for the values of p from 0 ‘f' ^ through 5 as derived from (20). Furthermore, multiplication of the p*”^ order polynomial ip> 1 0 1 p by the original 2 ag + aj^D + a20 yields the polynomial a (P) 0 + a<P>0 + • • • + a (p)nP+2 p+2^ 11 for p as shown in (16) and (18). Table 2 shows the values of aj^^ from 0 through 5. The unique structure of this realization, which differs from conventional realizations of digital filters is contained in the recursive portion, where the basic feedback loop involving the longest delay passes through p+2 pipeline nrl stages, for the realization of a 2”° order filter. In the example devel- oped, p was 2, causing us to use a 4^ order pipeline filter represen- tation of the original 2*^*^ order filter. In general, we assume we have stage pipeline add units and k^ stage pipeline multipliers. The structures of the ralization of the feedback portion of the filter would be as shown in figure 8. Since the maximum delay around the loop must match the p augmented difference equation order, we have p = 2k^.k^-2 (21) The FIR portion of the filter is shown in general terms in figure 14. From this figure, it should be clear that 4 ' ^ ^ ^ "m and that the over all delay is therefore 4 + 2IC5 = [io 92(I<3)]|C3 t t (22) STABILITY ISSUES In the previous section we have seen the general process of realization of a digital filter transfer function using pipel inemultiply-add units with delay 2. We have also shown a realization using 2-stage multiply-add units. The essence of the techique is to represent a second-order section by means of a (2+p) order section of a particular form, where p is determined by the number of stages in the pipeline multiply and add units. 12 We know from other considerations that the transfer function of the higher order realization must in fact be the same as the original and therefore must have the same poles and zeros in the z plane. Thus, the operation which trans- forms (7) into (17) can be thought of in the following way: (aQ + a^z'^ + a2z'^) (1 + aJP^z'^ + ... + aJ^^^z'P) (1 - b^z"'*’ - b2Z^) (1 + + ... + Op^^z"*^) a(p) + a(p) ,-l + + a(p) ^-P-^ aQ + aj z + ... + ap ^2 ^ 1 - b^P^ z'P"^ - b^P^ z'P"^ p+1 p+2 (23) (24) Where the coefficients of the a polynomial depend on the original bj and b 2 of (5) and (7). These a coefficients are governed by (20) and some were tabulated in Table lb. Thus, we raise the order of the denominator, introduc- ing p new poles and corresponding cancelling zeros. These new poles correspond to the roots of 0=1+ D + ... + dP (25) The roots of a(D) of course are the poles in the z~^ plane and are the reciprocal of the poles in z plane. One concern in the realization is that the filter be stable, that is, that it have an impulse response which decays to zero. A sufficent condition for the stability of a digital filter with transfer function Hfz”^) is that the poles of H(z~^) in the z~^ plane lie outside the unit circle, and that the order of the numerator (-m) be less than or equal in magnitude to the order (-n) of the denominator. A more familiar sufficient condition is that the poles of the transfer function of z (H(z)) be inside the unit circle. 13 Now, if we start with a stable filter, we would like to be assured that the poles of the augmented order filter all be outside the unit circle in the z~^ plane. We desire this because, even though the added poles are cancelled by the zeros of (1 + + ... + realization imperfec- tions will prevent exact cancellation and the augmented filter would be unstable. In order to examine the stability question, we will make use of Jury's stability test [6] as applied to the successively higher degree denominator polynomials in z as p increases. We start with the denominator of (5), written as a polynomial in z P(z) = z^ - b^z - b^ (26) To insure that the original poles of (5) are inside the unit circle and hence that the original filter is stable, we apply Jury's test with necessary cond itions: P(l) = 1 - bj - b^ > 0 I (-1)'’P(-1) = 1 + bj - b2 > 0 / and sufficient condition: \ (27) I I < 1 J Hence, for this second-order digital filter to be stable, the values of bj and b 2 must be in the shaded area as shown in Fig. 10. Note that this is the case when p = 0, i.e., no augmentation. Now, assuming that the original filter transfer function (5) is stable, i.e., (27) is satisfied, it is desirable to determine conditions to assure 14 stability of the new p poles introduced in the system when the transfer function is p-augmented. For p = 1, the new polynomial introduced in the numerator and denominator is F(z) = z + (28) Applying Jury's test, conditions for stability are F(l) = 1 + bj > 0 1 (29) (-1)"f(- 1) = 1 - bj > 0 I This condition, | bj | < 1, is shown graphically in figure 11. Similarly, for p = 2, the added polynomial is F(z) = z^ + bjz + (bj + b^) (30) Jury's test gives conditions for stable F(l) = 1 + bj + (bj + b2) > 0 (-1)"f(- 1) = 1 - bj + (bj + b, and the sufficient condition is roots. The necessary conditions are The region defined by conditions (31) is shown graphically in figure 12. Finally, for p = 3 F(z) = z^ + bjZ^ + (bj + + (bj + 2b^b2) 15 and the conditions for stable poles are (figure 13): F(l) = 1 + + (bj + b2) + (bj + > 0 (-1)"f(- 1) = 1 - b^ + (b^ + b2) - (bj + Zbjb^) > 0 I bf + 2b^b2 I < 1 I (bj + 2b^b2)^ - 1 I > I bi(bj + 2bjb2) - (b^ + b2) | This procedure can be continued for higher values of p; in fact general stability tests based on Jury’s test have been developed [6]. Unfortunately, those tests are cumbersome to apply. Instead, let us take another approach. From figures 10-13 it may be observed that as the value of p increases, the area of stability (the shaded areas in the figures) tends to increase. It was seen graphically that as p increased, the stable region came to be closer and closer to the original shaded area for p = 0, i.e., figure 10. This leads to the obvious conjecture that all stable filters have a stable augmented filter for some large enough p. We will follow the procedure suggested by Voelcker [73 for Block Filters to show that for a large value of augmentation, p, the pipelined version of the original filter will always be stable, provided that the original second order filter is stable. This is a very important and useful conclusion and implies that an augmented stable realization is always possible if the original trans- fer function is stable. The proof for this fact follows: Since the process of augmentation implies multiplying the numerator and the denominator by the same polynomial, the original transfer function may be equated to the augmented transfer function as 16 where N(z) is the numerator of the original transfer function. Oenoting the denominator of the original transfer function as 0(z), since this original fil- ter is stable, the roots of D(z) are inside the unit circle. Figure 14(a) shows the original filter where Hp(z) is a pole only function representing the denominator, (1/D(z)) of the original filter. The process of augmenting the difference equation suggests building the filter as shown in figure 14(b) where and Thus (33) Hence, the nonrecursive transfer function a(z) may be written as a(z) = Hp(z) - z -(p+1) G(z) nrzT (34) Since H is a pole only transfer function P 2 a. z . i=l 1 z 17 Taking the inverse transform V"> " D (•1 Since initial poles, they are inside the unit cir- c,e. Hence h^(n) 1s an Infinite se,eence decreasing in ^agnitede. If the sequence is truncated at the (p + 1) term, the remaining portion of the sequence 'h'pCn) = i=l n 0, 1, •••, p otherwi se Taking the z-transform 1^p(z) =^1^p(n)z-'’ n=0 P 2 E n=0 i=l E^iEH i=l n^ 2 i=l fi - (z,./z)P*n '1 1 - U,/z) j 2 2 , , ,p+l a.. _ a-Cz-Zz)*" ^ i - Kzjz) " S i - {z./z) i=l i=l V 18 Thus , Hp(z) = Hp(z) - •(P+1) (35) Comparing (34) and (35) and, since 1i'p(z) is a finite impulse response transfer function, equating 1lp(2) and a(z) 2 a 7 G(z) _ ^ ^i^i ITTzI' 2^ i - z./z i=l ’ Thus G(z) i=l a.Zi P+1 D(z) (z^./z) (36) The intent of this procedure is to prove that, for some high value of p, all roots of a(z) are inside the unit circle in the z-domain. From (33) D(z)a(z) = 1 - z"^P'"^^G(z) (37) Since 0(z) is the denominator of the original transfer function, all roots of D(z) are inside the unit circle. Hence all roots of a(z) are also inside if, and only if, all roots of the right hand expression of (37) are inside the unit circle. To do this, let tilde (~) denote the result of the mapping z~^ z, i.e., Tiz) = f(z"^) and X(z) is defined as A(z) = 1 - z^P^^^^(z) =^(z)a(z) (38) All roots of D(z) are exterior (outside the unit circle). Hence, all roots of a(z) are exterior if all roots of A(z) are exterior. To prove this, Rouche's theorem is used which states that: "If f(z) and g(z) are analytic inside and on a closed contour C, and | g(z) | < j f(z) | on C, then f(z) and f(z) + g(z) have 19 the same number of zeroes inside C. " Clearly, A in (38) is analytic within and on the unit circle. The constant "1" has no interior zeroes. Hence A(z) will have no interior zeroes if, on the unit circle. z I < 1 (39) From (4.20), z<P*‘>G(z) z=e (40) But I Zi I < 1 for all i because these are poles of the original filter. Thus, for some value of p greater than some critical value, equation (40) will satis- fied. Note that (40) may be satisfied for smaller value of p also. From the above discussion, it may be concluded that for some high value of augmentation, it is always possible to ensure that a(z) has roots inside the unit circle. Since these roots are the new additional poles of the augmented system, the new high order transfer function would be stable. As a result of this fact, the following design method is suggested. We assume that the designer is given a stable recursive digital filter transfer function H(z) and a desired sampling (clock) rate of operation, fj. 1. Find k(f, stage pipeline multiply and k^ stage pipeline add units that will operate at clock frequency f^. 2. Factor H(z) into Ji ()l - 1 if n is odd) 2*^^ order (and if n is s t odd, at most one 1 order) transfer functions. H(z) = H^(z) . H 2 (z) ...Hj^(z) where 4 = fn/21 3. For each H-j(z), i = 1, 2, ... z. Realize H^(z) as follows: a. Set p - 2kg + km - 2, 20 b. Determine the polynomial 1 - = Bp(z~^), - 1 . c. Solve for the roots of Bp(z ) = 0 (41) d. If all roots of B ( 2 "^) = 0 are outside the unit cir- P cle, go to e, else set p = p + 1 and return to step 35 . e. H-j(z) can be realized by a stable p augmented order difference equat ion , a'P) . a'P>z-l . ... a '* ' 1 - b'P’z-P-l - ^ °p+l^ °p+2^ (24) where p 2 2kg + k^, - 2 Because the last result, we know that we can always find a large enough p so that (24) will be stable. ' In the next section, we will illustrate the method with an example filter taken from the 1 iterature. EXAMPLE We will treat the Butterworth filter considered by Oppenheim and Shafer [8], This is a sixth order filter whose original transfer function H(s) is given by H(s) = 0.20238 ( s^+0 . 396S+0 . 5871 ) ( s^+1 . 083s+0 . 5871 ) ( s^+1 . 4802s+0 . 5871 ) (42) Applying the bi-1 inear transformation, a sixth order difference equation is obtained that can be factored in three second order filters. H(z) = H^(z) • W^iz) • H 3 (z) (43) 21 where \ _ .0007378 (1 + 2z~^ + z'^) rl. (Z) - _i _y ^ (1 - 1.2686 z ^ + 0.7051 z H^u) a - ^ (1 -1.0106 z ^ + 0.3583 z H (z) = — (1 (1 - 0.9044 z ^ + 0.2155 z '^) The poles in the z'^plane of these equations are as follows Pj = 0.8996 ± j.7804 I P J = 1.1909 P, = 1.4103 ± j.8956 1^2 1.6706 P, = 2.0984 ± j.4870 P 3 I = 2.1542 (44a) (44b) (44c) (45a) (45b) (45c) Thus all three of the factoring second order transfer functions are stabl e. p augmented order difference equations of the form of (16) were derived for each of the three sections, for values of p from 1 through 5. and H 2 were unstable for p=l and stable for all values of p from 2 through 5. H 3 was stable for all values of p from 1 through 5. If we had a 2 stage adder and a 2 stage multiplier, (21) would yield a p of 4, which for our example adds stable poles to each section. The general structure of each of the 2nd order (4-augmented) sections would be as shown in figure 15. 22 Finally, the values for the coefficient of each of the multipliers would be as shown in table 3. f4l Note that the a' ’ coefficient for the first section are all on the order of .001 in magnitude. These coefficient could easily be computed to more significant figures, although realization of more significant figures in the coefficient of an actual filter would require either scaling or the use of a floating point number system. This example shows how an IIR filter can be realized using high rate pipe- line modules, with the ultimate objective of achieving a higher sample rate than possible with non-pipel ined multiplier and adders. 23 SUMMARY AND CONCLUSIONS In this paper we have developed a method for applying pipeline techniques to the design of high speed recursive digital filters. Using these techniques, recursive digital filters operating at rates hitherto impossible can be designed. The general structure of the filter and the method for calculating the multiplier coefficients is presented. The stability of the resulting real- ization has been investigated and a technique for ascertaining the stability of the realization is presented. A significant example taken from the literature illustrates the technique and demonstrates the existence of a stable, high rate pipeline realization of a practically useful filter. 4 24 REFERENCES 1. Loomis, H. H. Jr., "The Maximum Rate Accumulator," IEEE TEC, EC-15, Number 4, Aug. 1966, pp. 628-639. 2. Chen, T. C. , "Overlap and Pipeline Processing," INTRODUCTION TO COMPUTER ARCHITECTURE, H Stone Editor, Science Research Associate, Chicago, Chapter 9. 3. Cray Research, Inc., CRAY-1 HARDWARE REFERENCE MANUAL, Cray Research Publication No. 2240004, Rev E, May 15, 1979. 4. M. A. Soderstrand and E. L. Fields, "A High Speed Low-Cost Recursive Digital Filter Using Residue Number Arthmetic," PROC IEEE, July 1977. 5. Sinha, B., PIPELINED FINITE STATE MACHINE ARCHITECTURE APPLIED TO DIGITAL FILTERS, Ph.D. Dissertation, University of California, Davis, CA, September 1983. 6. Nagrath, I. J. and Gopal , M. , CONTROL SYSTEMS ENGINEERING, Hal sted Press, John Wiley and Sons, 1982. 7. Voelcker, H. B. and Hartquest, E. E., "Digital Filtering via Block Recursion," IEEE TRANSACTIONS on Audio and Electroacoustics, June 1970. 8. Oppenheim, A. and Shafer, R. , DIGITAL SIGNAL PROCESSING," Prentice Hall, Englewood Cliffs, NJ, 1975. 25 Coeff of Coeff of p b<p> P+1 b(P) %+2 0 ^1 ^2 1 '>1 " ‘'2 '>l‘>2 2 bj + 2b^b2 b^b2 + b2 3 4 2 2 bj + 3b^2 + b2 4 2 2 3 + 3b^b2 + b^ 4 b^ + 4b^b2 + 3b^b2 4 2 2 3 bjb2 + 3b^b2 + b^ 5 b^+5bjb2+6b^b2+b2 b^b2+4b^b2+3b^b2 Table la. b's for p = 0, 1, ...5. P “0 "7^ “1 Jp) “2 Jp) “3 Jp) “4 0 1 0 0 0 0 0 1 1 0 0 0 0 2 1 ^1 »1 *■ ‘‘2 0 0 0 3 1 ^1 ^1 ^ ’'2 b^ + 2b^b2 0 0 4 1 ^1 b^ + 2b^b2 4 2 2 b^ + 3bp2 + b2 0 5 1 ^1 ^1 ^ *"2 bj + 2b^b2 4 2 2 b^ + 3bp2 + b2 b^ + 4b^b2 + 3b^b2 Table lb. q s Tor* p~ 1> ... B. 26 Q- IT3 O o o O c:: CNJ CNi -Q CO + cv -Q CO ^ -Q -r CD —4 -Q CNJ fO -1- CNI CNJ -Q ^ CNJ CNJ ^ CNJ CNJ -Q ^ + CO + ..—V CNI + CNJ CL CNJ ^ CNJ ^ -Q CNJ ^ fT3 o o o O jC CO j2 CO -Q CO -h ^ -f- 4- ^ »— 4 LT> ^ -Q X -Q CNi CNI fO 1— • la la + CNI CNJ X. -Q — X CNJ CNJ 1—4 CNI CNJ -O ^ -C X -K — X CO 4- . — X CNJ CNJ CNJ + CNI CNi -Q -Q -Q CJ ^ f— ^ CNJ «— f 1—4 -Q CNJ ^ ^ -C -Q -Q CO ^ -Q .-—V CNJ CO CNJ -Q CO CNJ CL + + 4^r -K + ^ — LD ro ^ ^ 4 CO ^ -4- rr ^ CO 1-4 <T3 o o o -Q -Q LO ^ -Q -Q X. ^ x, -Q ^ CNJ ^ CNJ 1—4 CNJ fT3 »T3 fO O ro ro + + + la + + CNJ CNJ CNI CNJ -Q + ^ CNJ CNI CNJ CNJ CNI -—V -Q -Q ^ ^ -Q ^ »— «v CNJ ^ CNJ CNJ ^ CNJ CNJ 1— 9 1—9 CNI CL -Q -Q -Q -Q ^ -Q -C -Q -Q ' — ■+• CNJ CO CNJ + CO CNJ -K fT3 o o CVJ + CNJ ^ -K + CNJ ^ + + CNI 1-i n 4 -Q ^ 1-4 CO ^ -a ^ 1—9 CO —1 jG -Q — ' -Q -Q -Q OJ CNJ CNJ CNJ <n O 1— 9 ^T3 O 1 — 4 ro (T3 + «T3 fT3 -♦- ra ra + + -1- S. CNJ CNJ CNI -Q ^ CNJ t— » CNJ ^ CNJ 1-^ CNJ CL -Q -Q -Q ^cn CNJ H- CNJ CNJ -H (o o CNJ ^ -Q + CNJ 8 *-l + CNJ r— 9 1 -H -f- CNI 1-4 ^ CSJ -Q CVJ m --4 ^ -Q CO ^ ^ ^ CO ^ ^ <T3 (T3 -Q CNJ ^ CNJ -Q ^ — 'CNJ -J- ,—4 ITS ^ fO <T3 O »T3 + O fo H- O fO -f fT3 + fO + fa + C\J CNJ CNJ CNJ CNJ CNJ CNI CNi CNJ <T3 -Q <T3 ^ (T3 ^ ta Q. + -J- + + + + + + + ^CVJ C\J CNJ *-» ^ CNJ »— 1 *— 1 CNJ i—» •— 9 CNJ ^ tT3 flj -Q -Q "x. — ^ f— 4 Xx— ^ ' 1— H X>^— ^ 1—4 ro o O fTS O <T3 o »a »T3 + »T3 + 9T3 + fa ■4' O*^ «T3 ro rrs <T3 fa CL -K -K + -K + f-H » '< <— 4 fT3 »T3 X) O O o O o «T3 8T3 fa fa CL o O o o O o fT3 <T3 »T3 fa ra fa CL O f-H CNJ cn m 27 Table 2. Coefficients of eqiiation (17) 28 to realize 6th order Butterworth filter. High-Speed Recursive Digital Filter Realization Figure 1. Figure 2. Figure 3. Figure 4a. Figure 4b. Figure 5. Figure 6. Figure 7. Figure 8. Figure 9. Figure 10. Figure 11. Figure 12. Figure 13. Figure 14. Figure 15. LIST OF FIGURES Simple FIR Realization Realization of A(D)x using 3-stage pipeline multipliers, 2-stage pipeline adders and unit delays Canonical Realization of a Recursive Filter One-stage pipeline multiply-add unit One-stage pipeline multiply feeding one-stage add kg + k^i stage pipeline multiply-add module Heart of the pipeline recursive filter realization Non-recursive portion of pipeline filter realization Heart of more general pipeline filter realization Realization of FIR portion of filter Condition for stability of the original filter Condition for stability of the augmented filter: p=l Condition for stability of the augmented filter: p=2 Condition for stability of the augmented filter: p=3 (a) Original transfer function (b) Equivalent augmented transfer function Sixth-order pipeline realization of second-order recursive filter 29 >N 30 gure 1. Simple FIR Realization X +• 31 Figure 2. Realization of A(D)x using 3-staqe pipeline ^Itipliers, 2-stage pipeline adders and unit delays X 32 Figure 3. Canonical Realization of a Recursive Filter Figure 4a. One-stage pipeline multiply-add unit Figure 4b, One-stage pipeline multiply feeding one-stage add 33 X Figure 5. ka + stage pipeline mul ti ply-add module 34 D+ - +0 <3 O 35 Figure 6. Heart of the pipeline recursive filter realization X 5 _ Non-recursive portion of pipeline filter realization A-l-2k ! 3 o o o o 37 Figure 8. Heart of more general pipeline filter realization X 38 Note* figure Illustrates case where p+l is integer multiple of k . 2 Polynomial* F(z) = z -b^z-b^sQ Conditions for Stability* 1-b,-b2>0 © l + b,-b2>0 (D |b2|<l <D Figure 10. Condition for stability of the original filter 39 Polynomial- F(z)= z+b^ = 0 Conditions for Stability- l + b ,>0 © 1 - b ,>0 ® »b 1 Figure 11. Condition for stability of the augmented filter: p 40 2 2 Polynomial' F(z)=z +b^z + (b^ + b2) Conditions for Stability' l + b^+(b^+b2) >0 © 1 + b +(b^-hb2 >0 ® I b + b I < 1 © I 2| Figure 12. Condition for stability of the augmented filter: p=2 41 Polynomial: F(z)= z'’+bi2^+(o^+b2)2+(b^+2b-|b2) Conditions for Stability: l+b^+(b^+b2)+(b^+2b^b2) > 0 l-bi+(b^+b2)-(b^+2b^b2) > 0 Fig. 13 Stability of augmented filter; p=3 © © © 0 (a) Hp(z) (b) Figure 14. (a) Original transfer function (b) Equivalent augmented transfer function 43 GO % CT Q cn T3 I CD Figure 15. Sixth-order pipeline realization of second-order recursive filter DISTRIBUTION LIST No. Copies 1. Defense Technical Information Center Cameron Station Alexandria, Virginia 22214 2. Library, Code 0142 Naval Postgraduate School Monterey, Cal ifornia 93943 3. Commander Attention: CAPT Glover, Code 61 Naval Electronic Systems Command National Center 1 Washington, D.C. 20360 4. Dr. Sydney Parker, Code 62Px Department of nectrical and Computer Engineering Naval Postgraduate School Monterey, California 93943 5. Dr. Donald E. Kirk, Code 62 Ki Department of Hectrical and Computer Engineering Naval Postgraduate School Monterey, California 93943 6. Dr. Richard Twogood Mail Stop L-156 Lawrence Livermore National Laboratory P.O. Box 808 Livermore, California 94550 7. Dr. Herschel H. Loomis, Jr., Code 62Lm Department of nectrical and Computer Engineering Naval Postgraduate School Monterey, California 93943 8. Dr. Michael Soderstrand Department of nectrical and Computer Engineering University of Cal ifornia Davis, California 95616 9. Dr. Ralph A1 gazi Chairman Department of nectrical and Computer Engineering University of California Davis, California 95616 2 2 1 1 1 1 6 1 1 45 No. Copies 10. Dr. Abraham Sheingold, Code 62 1 Department of □ectrical and Computer Engineering Naval Postgraduate School Monterey, Cal ifornia 93943 11. Dr. Bhaskar Sinha 6 IBM Corporation 1000 NW 51 Street Boca Raton, Florida 33432 12. Director, National Security Agency 1 Attention: Mr. Jerry Freidman, W3 Fort George G. Meade, Maryland 20755 13. Naval Research Laboratory 1 Code 9110, PME 106-5 4555 Overl ook Avenue SW Washington, D.C. 20375 Attention; Mr. Jim Morgan 14. Naval Research Laboratory 1 Code 9110, PME 106-5 4555 Overlook Avenue SW Washington, D.C. 20375 Attention: CDR Richard Rowe 15. Director, National Security Agency 1 Attention: Dr. Jeff Friedhoffer, R631 Fort George G. Meade, Maryland 20755 46 f