Skip to main content

Full text of "High-speed recursive digital filter realization"

See other formats


LIBRARY 



MONTEREY. CALIFORNIA 93940 



NPS-62-84-061 



<( 



POSTGRADUATE SCHOOL 



Monterey, California 



HIGH-SPEED RECURSIVE DIGITAL 
FILTER REALIZATION 



Herschel H. Loomis, Jr. 
Bhaskar Sinha 



6 February 1984 



public release; distribution unlimited. 



onic Systems Command 
D.C. 20360 




FedDocs 
D 208. 14/2 
NPS-62-84-061 




NAVAL POSTGRADUATE SCHOOL 
Monterey, California 



Commodore Robert H. Shumaker 
Superintendent 



David A. Schrady 
Provost 



This work was supported by the Naval Electronic Systems Command, Washington, 

D.C. 

Reproduction of all or part of this report is authorized. 

This report was prepared by: 



ij axTa . LIBRARy 



SECURITY CLASSIFICATION OF THIS PAGE 0«f« Entmrmd) 



REPORT DOCUMENTATION PAGE 


READ INSTRUCTIONS 
BEFORE COMPLETING FORM 


1. REPORT NUMBER 2. GOVT ACCESSION NO. 

NPS-62-84-061 


3. RECIPIENT’S CATALOG NUMBER 


4. title (and Subtltie) 

High-Speed Recursive Digital Filter 
Realization 


5. TYPE OF REPORT 6 PERIOD COVERED 

Technical 


6. PERFORMING ORG. REPORT NUMBER 


7. AOTHORf*; 

Herschel H. Loomis, Jr. 
Bhaskar Sinha 


8. contract or grant NUMBER('«J 


»■ PERFORMING ORGANIZATION NAME ANO ADDRESS 

Naval Postgraduate School 
Monterey, CA 93943 


10. program ELEMENT. PROJECT, TASK 
AREA 6 WORK UNIT NUMBERS 

62734N 

N0003983WRDY078 


t1. CONTROLLING OFFICE NAME ANO ADDRESS 

Naval □ ectronic Systems Command 
Washington, DC 20360 


12. REPORT DATE 

6 February 1984 


13. NUMBER OF pages 


14. MONITORING AGENCY NAME & ADD RESS( H dl Karan t from ControUing OUlca) 


15. SECURITY CLASS, (ot thl» ttpott) 

UNCLASSIFIED 


DECLASSIFICATION/ DOWNGRADING 

schedule 


16. DISTRIBUTION ST AT EmEHT (o( thi a Raport) 

Approved for public release; distribution unlimited. 


17. DISTRIBUTION STATEMENT (o( tha abatraet antarmd in Block 20» it dlUarant from Raport) 


IS. supplementary NOTES 

This is a preprint of a paper accepted for publication by CIRCUITS, 
SYSTEMS, AND SIGNAL PROCESSING. 


19. KEY WORDS ^Continue on ravataa alda it nacaaamry and idantlfy by block numbar) 

Digital filters, high rate computation, high-speed logic, pipelined logic, 
recursive digital filters. 


20. ABSTRACT ^Con(/nu« on ravaraa alda It naeaaamry and Idantlty by block numbar) 

Pipeline techniques have been successfully applied to speeding up processing 
in both general and special purpose digital computers. Application of these 
techniques to non-recursive (FIR) filters has been suggested and is quite 
straightforward. Application to recursive (HR) filters has not previously 
been shown. In this paper, the technique for applying pipeline techniques 
to recursive filters is shown and the advantages and disadvantages of the 
technique are discussed. Using these techniques, recursive digital filters 



DD I jaN^3 1473 EDITION OF t MOV 68 IS OBSOLETE 

5/N 0102- LF- 014- 6601 



SECURITY CLASSIFICATION OF THIS PAGE (Whmn Dmtm Snttmd) 




security classification of this page fWhan Dmim Bntmrmd) 



operating at hitherto impossibly high rates can be designed. 



S/N 0102- LF- 014- 6601 



SECURITY CLASSIFICATION OF THIS PAGeni7i#n Bntmr^d) 



High-Speed Recursive Digital Filter Realization 



Herschel H. Loomis, Jr. 
Bhaskar Sinha 



Pre-print of paper to appear in 
CIRCUITS, SYSTEMS, AND SIGNAL PROCESSING 



High-Speed Recursive Digital Filter Realization* 

Herschel H. Loomis, Jr.** 

Bhaskar Sinha*** 

ABSTRACT 

Pipeline techniques have been successfully applied to speeding up 
processing in both general and special purpose digital computers. Application 
of these techniques to non-recursive (FIR) filters has been suggested and is 
quite straightforward. Application to recursive (HR) filters has not 
previously been shown. In this paper, the technique for applying pipeline 
techniques to recursive filters is shown and the advantages and disadvantages 
of the technique are discussed. Using these techniques, recursive digital 
filters operating at hitherto impossibly high rates can be designed. 



*The basic research for this idea was supported by Department of Electrical and 
Computer Engineering, University of California, Davis, Cal ifornia 95616. The 
basic idea is the subject of a pending patent application. Subsequent research 
was supported by the U.S. Naval Hectronic Systems Command. 

**Department of □ectrical and Computer Engineering, Naval Postgraduate School , 
Monterey, Cal ifornia 93943. On leave from the Department of □ectrical and 
Computer Engineering, University of California, Davis, California 95616. 



***International Business Machines Corporation, Boca Raton, Florida 33432. 



High-Speed Recursive Digital Filter Realization 



ABSTRACT 



Pipeline techniques have been successfully applied to speeding up proc- 
essing in both general and special purpose digital computers. Application of 
these techniques to non-recursive (FIR) filters has been suggested and is quite 
straightforward. Application to recursive (IIR) filters has not previously 
been shown. In this paper, the technique for applying pipeline techniques to 
recursive filters is shown and the advantages and disadvantages of the tech- 
nique are discussed. Using these techniques, recursive digital filters oper- 
ating at hitherto impossibly high rates can be designed. 



1 



INTRODUCTION 



Digital filters have an important place in the technology of processing 
signals. Because of the sample theorem, the sampling rate which is also the 
clock rate of the filter, must be higher than twice the highest frequency 
component in the signal to be filtered. Thus the use of digital filters in 
real time application involving high frequencies demands high sample rate. 

Pipeline techniques are well known [1,2] ways to increase the clock rate 
(throughput) of a particular state-of-the-art realization of a logic module. 
For example, the Cray I supercomputer has a clock rate of 80 Mhz and requires a 
7 stage pipeline multiplier to perform 64-bit floating point multiplication at 
the rate of one product every 12.5 ns. [3]. 

Other workers have investigated residue number techniques to obtain high 
rate digital filters [4]. However, difficulties in implementing division by a 
constant impose limitations on such real izations , making scaling awkward and 
precluding anything resembling floating point. 

Pipeline techniques can thus be used to produce multipliers and adders 
that operate at the maximum clock rate possible for a given state of the art of 
logic devices. For example, TRW makes a non-pipeline 16 bit integer multiply 
chip with input and output registers that can be operated at the rate of about 
9 MHz. Pipeline techniques could be applied to that design, producing a new 
chip that might have three stages and operate at near 40 MHz. 

Discrete logic realization of the basic functions of multiplication and 
division can yield even better performance. At the extreme we have some of the 
integrated circuit ECL lines. Based on the experience gained with the pipeline 
supercomputers, it is reasonable to expect that a three stage 12-bit integer 
multiply and 1 to 2 stage 16-bit integer adder could be built to operate at 
around 80 MHz. Floating point operation could be obtained at the same rate 



2 



with a cost of perhaps only 2 additional stages for the multiplier and 3 more 
for the adder. 

What finally limits the clock rate in a pipeline system is the net delay 
of all logic and the register delay between successive registers, however 
pipeline techniques will yield the highest throughput or clock rate possible 
with a given state of the art. 

Now it should be clear why we should consider the application of pipeline 
techniques. 

1) higher clock (sample) rates are possible for a given state of the 

art. 

2) more complex operations (i.e. floating point) are possible for a given 
state of the art and clock rate. 

Pipeline techniques can be easily applied to non-recursive systems, that 
is, systems without feedback. In the digital filter arena, this would 
correspond to non-recursive or Finite Impulse Response (FIR) filters. These 
techniques are straight forward and proceed as follows: 

An FIR filter can be described by a transfer function in the delay 
operator (z"l or D) as in (1) 



I - H(z-l) = 



+ a. 



+• . . + a 



-m 



m 



( 1 ) 



Figure 1 shows the realization of such a filter using delayless add and 
multiply modules as well as unit delays. That realization is unrealistic and 
does not capitalize on the advantages of pipeline processors as discussed 
above. 

Let us suppose that we have a 1 -stage pi pel ine adder and a 3-stage 
pi pel ine mul tipi ier. 



3 



Figure 2 shows a pipeline realization of the transfer function given in 
(1). In fact, (1) is not exactly realized because of the delay of the adders. 
Instead the output of the filter is y delayed by 7 clock periods. This, 
however, is not a serious difficulty. 

Thus, pipeline realization of FIR filters is straight forward, costing 
only an added delay in the resulting output sequence rate, and perhaps some 
additional adders over the realization of figure 1, if the adder has more than 
one stage. 

What then about doing this for HR filters? Applying pipeline techniques 
to recursive calculations is not so strai ghtforward as it is in the non- 
recursive case, and was first reported in [1] in connection with accumulator 
design. 

In the next section we shall consider the design of recursive (HR) 
filters. Following that, we discuss the general attributes of the solution and 
some of the unsolved problems associated with the technique. Finally, an 
example of a sixth order Butterworth filter is treated, to illustrate the 
technique. 



4 



PIPELINE RECURSIVE DESIGN 



An order recursive or Infinite Impulse Response (IIR) filter can 
be characterized by a ratio of two nth degree polynomials in z“l, the delay 
operator, as shown in (2) 



y = 



ag + 3iZ ^ + ... + amZ 



71 ^ 

1 - bjZ ^ -b2Z ^ 






-n 



X = 



A(z) 

"ffTzT "" 



( 2 ) 



This representation can be expressed using D, another notation for the 
delay operator, as shown in (3). Time is measured in integer steps by the 
index i. 



y(T) 



. A(D) 

■ mi 



x(i) 



(aQ Sj^D + ... + a^^D ) x(i) 

(1 - b.D - ... - b d" 

1 n 



( 3 ) 



where 

D*^x(i) = x(i-k) 

Rearranging (3) yields the familiar recursive formulation of the filter 
equation, as shown in (4) 



y(i) = (ag + a^D + ... + a^o"’) x(i) + (b^D + ... + b^o") y(i) (4) 

Equation (4) leads directly to the canonical representation of the filter as 
shown in figure 3, assuming the availability of a multiply-add unit with unit 
delay (one clock pulse period delay). This realization actually does not yield 
y, but in fact produces Dy, y delayed by one clock pulse period. The canonical 
form shown in this figure can be realized, but unfortunately because the 
multiply-add unit is very complex, the realization will be quite slow. If we 
assume 12 bits of significance to the representation of the numbers in the 
filter, we find that the unit has two 1-input, 12-bit constant multipliers 



5 



feeding a three-input by 12-bit adder. All of this complexity forces the clock 
pulse period to be very long, and the clock or sampling rate as a consequence 
to be low. A typical value or clock rate using state-of-the-art Schottkey 
integrated circuit PROMs and adders would be on the order of 10 MHz. A slight 
improvement in the rate of operation is possible if we use the basic module 
shown in figure 4a. A delay of 1 clock pulse period is produced by the output 
register. This module could be built to operate at a clock rate of perhaps 12 
MHz, compared with the realization of figure 3. 

What is desired is some means to take advantage of higher clock rate pipe- 
line realization of the basic multiply-add module, thus producing a higher sam- 
pling rate realization of the recursive digital filter. 

In order to increase the clock rate of the multiply-add unit, we first use 
the two input version, and then insert several registers for intermediate re- 
sults in the coefficient multiply and adder sections. For example, a 2-stage 
version is shown in Figure 4b that could be made to operate at 16 MHz. The ndt 
result is a kp, + k^-stage (register) pipeline multiply-add unit as shown in 
figure 5, with kp, stages involved in the coefficient multiply and k^ stages 
in the adder. 

When we attempt to use the module of figures 4 or 5, we find it impossible 
to insert it in the feedback loop, preserving a minimum delay of only one. So, 
if we are to use the pipeline module, some other way must be found. 

For the purposes of the derivation which follows, we will restrict our 
consideration to second-order filters, that is m=2, n=2. This in no way limits 
the result, for the process used is not dependent on n, except for the value of 
the individual coefficients. On the other hand, higher order filters can also 
be realized as the cascade of rn/2"I 2 ^^ order filters, and at most one 
filter of lower order. 



Now we let n=2 and consider that specific case of 



Y 

T " 



A(z"^) 



aQ + a^z”^ + a2z“^ 
1 - 



( 5 ) 



or represented in terms of the delay operator D, the usual computer 
representation of delay. 



x( i) _ 
yTTT 



2 

Qq + 820 

1 - b^D - b20^ 



( 6 ) 



Writing this in a different form we have 

y(i) = (aQ + a^O + a 2 D^)x(i) + (b^D + b 20 ^)y( i) 



( 7 ) 



which can also be written in difference equation form: 

y, . a„x, t ajx,.; t ajX,.^ * (^1-1 * '>2^t-2> <«> 

For subsequent development we will use (7); furthermore, we will omit the (i) 
notation. We will relate that form to the other forms where appropriate. 

Let us first ignore the A(D) polynomial by substituting 

x' = A(D)x (9) 

into (7) yielding 

y = x' + (b^Dy + b20^y) (10) 

Delaying (10) and substituting for Dy in (9) we obtain 
y = x' + b 2 ^( 0 x' + bjD^y + b 20 ^y) + b 2 D^y 



7 



or 



y = (1 + b^D)x' + (b^ + b 2 )D^y + bjb 2 D^y (11) 

(11) Is now a third order difference equation describing the same digital fil- 
ter. That is to say, the filter described by (11) has the same transfer char- 
acteristic as the one described by (7) or (10). Delaying (10) by and 
substituting into (11) yields 

y = [1 + bjD + (b^ + b 2 )D^]x' 

+ (2bjb2 + b^)D^y + {hj + b^b2)D^ (12) 



In general, successively more delayed versions of (10) can be substituted, 
raising the order of the difference equation, but, more importantly, raising 
also the minimum delay associated with the feedback y terms. The general 
higher order difference equation has the following general form, provided we 
started from a second-order equation: 



y = [1 + D + 4'’’ * ••• 4”’ 






P+1 



P+2 



(13) 



-th 



In fact, if we began with an n order difference equation, we would have 
an equation of p+n order resulting from the foregoing process as shown in (14) 



y = [1 + D + ... + Dp]x' 



(H) 



The coefficients ajP^ and b|P^ can be calculated by following the process 
described above in detail. For example, it should be clear from (12) that 



8 



( 15 ) 



.( 2 ) = 



= b. 



.( 2 ) 






(2) - OK K 4. k3 . k(2) 

>2 “ 2b^b2 * ^4 



2 2 
+ b^b2 



where the original b coefficients are from (10). We will cover the process in 
detail a little later in this section. 

Now let us recall what x' is in terms of the original IIR filter as des- 
cribed in (7). Substituting x' = A(D)x into (13) yields: 

y = [1 + D + ... + nP] [ag + a^O + a20^]x 



This equation when multiplied out for the case of p=2 yields (17) 
y = + aj^^D + ... + a^^^D^^lx 



>(2) n3 



+ [bA ^ 0 y + b; ^ D y] 



(2) n4 



(16) 



(17a) 



and 



w' = [aj^^ + aj^^D + ... + a^^^D^]x 

+ w' + b|^^D^ w'] 



(l^la) 



These two equations look very similar now, the only difference being that 
the non-recursive parts of (18a) is the non-recursive part of (17a) delayed by 
A + 4. If both sides of (17a) are delayed by that amount, (17b) results: 



[a^2) * a|2)D . ... . af >D^]x 



(17b) 



9 



Finally, it should be noted that D^‘'’^y in (17b) is the same as w' in 
(18a). substituting D^''’^y = w' in (17b) produces (17c): 

w' = [a^^^ + aj^^ + ... + aj^^0^]x 



+ w' + b|^^D^ w'] 



(17c) 



which is the same as (18a). 

What this all means is that the structure of figure 6 will produce an out- 
put w' which is simply the desired signal y, delayed by a+4 sampling periods. 
This digital filter structure uses pipeline logic units in both the recursive 
and non-recursive portions to realize the desired output. 

The design still needs to be completed, even though the heart of the 
structure has been developed. Figure 7 shows the structure of the non- 
recursive portion of the filter to complete the example. 

The circuit of figure 7 produces D^Cag + a^H +...+ a 4 D^]x which 
corresponds to the D^Cag + ajD +...+ a 40 ^]x term required as the out- 
put of the non-recursive part of figure 7. Therefore, the pipel ine digital 
filter of figure 7 combined with figure 6 produces that is y, de- 

layed by 8 sample periods. 

Now we return to the calculation of the and b^^^ 

coefficients. More general relations than (15) can be developed in difference 
equation form with respect to p. 

Starting with the p augmented order (p+2 order) difference equation (13) 
we can derive the p+1 augmented order equation by substituting for 

[(10) delayed by into (13). 



10 



y = 



[1 + ajP^ D + ... + dP]x' 

+ b|jP{ x' +(bj dP"^^ y + b 2 dP'^^ y)] 



* b<P| y 



[1 + ,(P> D t ... + c,<P> dP t b<P> dP*']x’ 
* (b[,Sl bi - b<P>) dP "2 y . b'Pj bj 0 P" 3 y 



From (19) we can see that 



„(P+1) 

“p+1 . 

b';r> 



•b'Sl 

= b'?l bi * b(P> 

“ bp?l bj 



Where agP^= 1, b|*^^= bj, and b 2 *^^= b. 



(19) 



(20a) 

(20b) 

(20c) 



Table 1 shows the values of the coefficients of (13) for the values of p from 0 

‘f' ^ 

through 5 as derived from (20). Furthermore, multiplication of the p*”^ 
order polynomial 



ip> 1 

0 1 p 

by the original 

2 

ag + aj^D + a20 



yields the polynomial 



a 



(P) 

0 



+ a<P>0 



+ 



• • • 



+ a 



(p)nP+2 

p+2^ 



11 



for p 



as shown in (16) and (18). Table 2 shows the values of aj^^ 

from 0 through 5. 

The unique structure of this realization, which differs from conventional 
realizations of digital filters is contained in the recursive portion, where 
the basic feedback loop involving the longest delay passes through p+2 pipeline 

nrl 

stages, for the realization of a 2”° order filter. In the example devel- 
oped, p was 2, causing us to use a 4^ order pipeline filter represen- 
tation of the original 2*^*^ order filter. 

In general, we assume we have stage pipeline add units and k^ stage 

pipeline multipliers. The structures of the ralization of the feedback portion 
of the filter would be as shown in figure 8. Since the maximum delay around 
the loop must match the p augmented difference equation order, we have 

p = 2k^.k^-2 (21) 

The FIR portion of the filter is shown in general terms in figure 14. 
From this figure, it should be clear that 

4 ' ^ ^ ^ "m 

and that the over all delay is therefore 

4 + 2IC5 = [io 92(I<3)]|C3 t t (22) 

STABILITY ISSUES 

In the previous section we have seen the general process of realization of 
a digital filter transfer function using pipel inemultiply-add units with delay 
2. We have also shown a realization using 2-stage multiply-add units. The 
essence of the techique is to represent a second-order section by means of a 
(2+p) order section of a particular form, where p is determined by the 
number of stages in the pipeline multiply and add units. 



12 



We know from other considerations that the transfer function of the higher 
order realization must in fact be the same as the original and therefore must 
have the same poles and zeros in the z plane. Thus, the operation which trans- 
forms (7) into (17) can be thought of in the following way: 






(aQ + a^z'^ + a2z'^) (1 + aJP^z'^ + ... + aJ^^^z'P) 

(1 - b^z"'*’ - b2Z^) (1 + + ... + Op^^z"*^) 

a(p) + a(p) ,-l + + a(p) ^-P-^ 

aQ + aj z + ... + ap ^2 ^ 

1 - b^P^ z'P"^ - b^P^ z'P"^ 

p+1 p+2 



(23) 



(24) 



Where the coefficients of the a polynomial depend on the original bj and 
b 2 of (5) and (7). These a coefficients are governed by (20) and some were 
tabulated in Table lb. Thus, we raise the order of the denominator, introduc- 
ing p new poles and corresponding cancelling zeros. These new poles correspond 
to the roots of 



0=1+ D + ... + dP (25) 

The roots of a(D) of course are the poles in the z~^ plane and are 
the reciprocal of the poles in z plane. 

One concern in the realization is that the filter be stable, that is, that 
it have an impulse response which decays to zero. A sufficent condition for 
the stability of a digital filter with transfer function Hfz”^) is that 
the poles of H(z~^) in the z~^ plane lie outside the unit circle, 
and that the order of the numerator (-m) be less than or equal in magnitude to 
the order (-n) of the denominator. A more familiar sufficient condition is 
that the poles of the transfer function of z (H(z)) be inside the unit circle. 



13 



Now, if we start with a stable filter, we would like to be assured that 
the poles of the augmented order filter all be outside the unit circle in the 
z~^ plane. We desire this because, even though the added poles are 
cancelled by the zeros of (1 + + ... + realization imperfec- 

tions will prevent exact cancellation and the augmented filter would be 
unstable. 

In order to examine the stability question, we will make use of Jury's 
stability test [6] as applied to the successively higher degree denominator 
polynomials in z as p increases. 

We start with the denominator of (5), written as a polynomial in z 

P(z) = z^ - b^z - b^ (26) 



To insure that the original poles of (5) are inside the unit circle and 
hence that the original filter is stable, we apply Jury's test with necessary 
cond itions: 

P(l) = 1 - bj - b^ > 0 I 

(-1)'’P(-1) = 1 + bj - b2 > 0 / 

and sufficient condition: \ (27) 

I I < 1 J 

Hence, for this second-order digital filter to be stable, the values of bj 
and b 2 must be in the shaded area as shown in Fig. 10. Note that this is the 
case when p = 0, i.e., no augmentation. 

Now, assuming that the original filter transfer function (5) is stable, 
i.e., (27) is satisfied, it is desirable to determine conditions to assure 



14 



stability of the new p poles introduced in the system when the transfer 
function is p-augmented. For p = 1, the new polynomial introduced in the 
numerator and denominator is 

F(z) = z + (28) 

Applying Jury's test, conditions for stability are 

F(l) = 1 + bj > 0 1 

(29) 

(-1)"f(- 1) = 1 - bj > 0 I 

This condition, | bj | < 1, is shown graphically in figure 11. Similarly, for 
p = 2, the added polynomial is 

F(z) = z^ + bjz + (bj + b^) (30) 



Jury's test gives conditions for stable 
F(l) = 1 + bj + (bj + b2) > 0 

(-1)"f(- 1) = 1 - bj + (bj + b, 

and the sufficient condition is 




roots. The necessary conditions are 




The region defined by conditions (31) is shown graphically in figure 12. 
Finally, for p = 3 

F(z) = z^ + bjZ^ + (bj + + (bj + 2b^b2) 



15 



and the conditions for stable poles are (figure 13): 

F(l) = 1 + + (bj + b2) + (bj + > 0 

(-1)"f(- 1) = 1 - b^ + (b^ + b2) - (bj + Zbjb^) > 0 
I bf + 2b^b2 I < 1 

I (bj + 2b^b2)^ - 1 I > I bi(bj + 2bjb2) - (b^ + b2) | 

This procedure can be continued for higher values of p; in fact general 
stability tests based on Jury’s test have been developed [6]. Unfortunately, 
those tests are cumbersome to apply. 

Instead, let us take another approach. 

From figures 10-13 it may be observed that as the value of p increases, 
the area of stability (the shaded areas in the figures) tends to increase. It 
was seen graphically that as p increased, the stable region came to be closer 
and closer to the original shaded area for p = 0, i.e., figure 10. This leads 

to the obvious conjecture that all stable filters have a stable augmented 

filter for some large enough p. 

We will follow the procedure suggested by Voelcker [73 for Block Filters 

to show that for a large value of augmentation, p, the pipelined version of the 

original filter will always be stable, provided that the original second order 
filter is stable. This is a very important and useful conclusion and implies 
that an augmented stable realization is always possible if the original trans- 
fer function is stable. The proof for this fact follows: 

Since the process of augmentation implies multiplying the numerator and 
the denominator by the same polynomial, the original transfer function may be 
equated to the augmented transfer function as 




16 




where N(z) is the numerator of the original transfer function. Oenoting the 
denominator of the original transfer function as 0(z), since this original fil- 
ter is stable, the roots of D(z) are inside the unit circle. Figure 14(a) 
shows the original filter where Hp(z) is a pole only function representing 
the denominator, (1/D(z)) of the original filter. 

The process of augmenting the difference equation suggests building the 
filter as shown in figure 14(b) where 




and 




Thus 




(33) 



Hence, the nonrecursive transfer function a(z) may be written as 



a(z) = Hp(z) - z 



-(p+1) G(z) 

nrzT 



(34) 



Since H is a pole only transfer function 
P 



2 a. 




z . 



i=l 1 



z 



17 



Taking the inverse transform 



V"> " D 

(•1 

Since initial poles, they are inside the unit cir- 

c,e. Hence h^(n) 1s an Infinite se,eence decreasing in ^agnitede. If the 

sequence is truncated at the (p + 1) term, the remaining portion of the 
sequence 



'h'pCn) = 



i=l 



n 0, 1, •••, p 



otherwi se 



Taking the z-transform 



1^p(z) =^1^p(n)z-'’ 

n=0 



P 2 

E 

n=0 i=l 



E^iEH 

i=l n^ 



2 

i=l 



fi - (z,./z)P*n 

'1 1 - U,/z) j 



2 2 , , ,p+l 

a.. _ a-Cz-Zz)*" 



^ i - Kzjz) " S i - {z./z) 



i=l 



i=l 



V 



18 



Thus , 



Hp(z) = 



Hp(z) - 



•(P+1) 




(35) 



Comparing (34) and (35) and, since 1i'p(z) is a finite impulse response 
transfer function, equating 1lp(2) and a(z) 

2 a 7 
G(z) _ ^ ^i^i 

ITTzI' 2^ i - z./z 

i=l ’ 



Thus 



G(z) 

i=l 



a.Zi 



P+1 



D(z) 



(z^./z) 



(36) 



The intent of this procedure is to prove that, for some high value of p, all 
roots of a(z) are inside the unit circle in the z-domain. From (33) 

D(z)a(z) = 1 - z"^P'"^^G(z) (37) 

Since 0(z) is the denominator of the original transfer function, all roots of 
D(z) are inside the unit circle. Hence all roots of a(z) are also inside if, 
and only if, all roots of the right hand expression of (37) are inside the unit 
circle. To do this, let tilde (~) denote the result of the mapping 
z~^ z, i.e., Tiz) = f(z"^) and X(z) is defined as 

A(z) = 1 - z^P^^^^(z) =^(z)a(z) (38) 

All roots of D(z) are exterior (outside the unit circle). Hence, all roots of 
a(z) are exterior if all roots of A(z) are exterior. To prove this, Rouche's 
theorem is used which states that: "If f(z) and g(z) are analytic inside and 

on a closed contour C, and | g(z) | < j f(z) | on C, then f(z) and f(z) + g(z) have 



19 



the same number of zeroes inside C. " Clearly, A in (38) is analytic within and 
on the unit circle. The constant "1" has no interior zeroes. Hence A(z) will 
have no interior zeroes if, on the unit circle. 



z 



I < 1 




(39) 



From (4.20), 



z<P*‘>G(z) 



z=e 




(40) 



But I Zi I < 1 for all i because these are poles of the original filter. Thus, 
for some value of p greater than some critical value, equation (40) will satis- 
fied. Note that (40) may be satisfied for smaller value of p also. 

From the above discussion, it may be concluded that for some high value of 
augmentation, it is always possible to ensure that a(z) has roots inside the 
unit circle. Since these roots are the new additional poles of the augmented 
system, the new high order transfer function would be stable. 

As a result of this fact, the following design method is suggested. 

We assume that the designer is given a stable recursive digital filter 
transfer function H(z) and a desired sampling (clock) rate of operation, fj. 

1. Find k(f, stage pipeline multiply and k^ stage pipeline add units 
that will operate at clock frequency f^. 

2. Factor H(z) into Ji ()l - 1 if n is odd) 2*^^ order (and if n is 

s t 

odd, at most one 1 order) transfer functions. 



H(z) = H^(z) . H 2 (z) ...Hj^(z) 



where 4 = fn/21 



3. For each H-j(z), i = 1, 2, ... z. Realize H^(z) as follows: 
a. Set p - 2kg + km - 2, 



20 



b. Determine the polynomial 1 - = Bp(z~^), 

- 1 . 



c. Solve for the roots of Bp(z ) = 0 



(41) 



d. If all roots of B ( 2 "^) = 0 are outside the unit cir- 

P 

cle, go to e, else set p = p + 1 and return to step 35 . 

e. H-j(z) can be realized by a stable p augmented order difference 

equat ion 



, a'P) . a'P>z-l . ... a 
'* ' 1 - b'P’z-P-l - 

^ °p+l^ °p+2^ 



(24) 



where p 2 2kg + k^, - 2 

Because the last result, we know that we can always find a large 
enough p so that (24) will be stable. 

' In the next section, we will illustrate the method with an example filter 
taken from the 1 iterature. 

EXAMPLE 

We will treat the Butterworth filter considered by Oppenheim and Shafer 
[8], This is a sixth order filter whose original transfer function H(s) is 
given by 



H(s) = 



0.20238 



( s^+0 . 396S+0 . 5871 ) ( s^+1 . 083s+0 . 5871 ) ( s^+1 . 4802s+0 . 5871 ) 



(42) 



Applying the bi-1 inear transformation, a sixth order difference equation is 
obtained that can be factored in three second order filters. 



H(z) = H^(z) • W^iz) • H 3 (z) 



(43) 



21 



where 



\ _ .0007378 (1 + 2z~^ + z'^) 

rl. (Z) - _i _y 

^ (1 - 1.2686 z ^ + 0.7051 z 

H^u) a - 

^ (1 -1.0106 z ^ + 0.3583 z 

H (z) = — (1 

(1 - 0.9044 z ^ + 0.2155 z '^) 

The poles in the z'^plane of these equations are as follows 
Pj = 0.8996 ± j.7804 
I P J = 1.1909 
P, = 1.4103 ± j.8956 



1^2 



1.6706 



P, = 2.0984 ± j.4870 



P 3 I = 2.1542 



(44a) 

(44b) 

(44c) 

(45a) 

(45b) 

(45c) 



Thus all three of the factoring second order transfer functions are 
stabl e. 

p augmented order difference equations of the form of (16) were derived 
for each of the three sections, for values of p from 1 through 5. and 

H 2 were unstable for p=l and stable for all values of p from 2 through 5. 
H 3 was stable for all values of p from 1 through 5. 

If we had a 2 stage adder and a 2 stage multiplier, (21) would yield a p 
of 4, which for our example adds stable poles to each section. The general 
structure of each of the 2nd order (4-augmented) sections would be as shown in 
figure 15. 



22 



Finally, the values for the coefficient of each of the multipliers would 

be as shown in table 3. 

f4l 

Note that the a' ’ coefficient for the first section are all on the 
order of .001 in magnitude. These coefficient could easily be computed to more 
significant figures, although realization of more significant figures in the 
coefficient of an actual filter would require either scaling or the use of a 
floating point number system. 

This example shows how an IIR filter can be realized using high rate pipe- 
line modules, with the ultimate objective of achieving a higher sample rate 
than possible with non-pipel ined multiplier and adders. 



23 



SUMMARY AND CONCLUSIONS 



In this paper we have developed a method for applying pipeline techniques 
to the design of high speed recursive digital filters. Using these techniques, 
recursive digital filters operating at rates hitherto impossible can be 
designed. The general structure of the filter and the method for calculating 
the multiplier coefficients is presented. The stability of the resulting real- 
ization has been investigated and a technique for ascertaining the stability of 
the realization is presented. 

A significant example taken from the literature illustrates the technique 
and demonstrates the existence of a stable, high rate pipeline realization of a 
practically useful filter. 



4 



24 



REFERENCES 



1. Loomis, H. H. Jr., "The Maximum Rate Accumulator," IEEE TEC, EC-15, Number 
4, Aug. 1966, pp. 628-639. 

2. Chen, T. C. , "Overlap and Pipeline Processing," INTRODUCTION TO COMPUTER 
ARCHITECTURE, H Stone Editor, Science Research Associate, Chicago, Chapter 

9. 

3. Cray Research, Inc., CRAY-1 HARDWARE REFERENCE MANUAL, Cray Research 
Publication No. 2240004, Rev E, May 15, 1979. 

4. M. A. Soderstrand and E. L. Fields, "A High Speed Low-Cost Recursive 
Digital Filter Using Residue Number Arthmetic," PROC IEEE, July 1977. 

5. Sinha, B., PIPELINED FINITE STATE MACHINE ARCHITECTURE APPLIED TO DIGITAL 
FILTERS, Ph.D. Dissertation, University of California, Davis, CA, September 
1983. 

6. Nagrath, I. J. and Gopal , M. , CONTROL SYSTEMS ENGINEERING, Hal sted Press, 
John Wiley and Sons, 1982. 

7. Voelcker, H. B. and Hartquest, E. E., "Digital Filtering via Block 
Recursion," IEEE TRANSACTIONS on Audio and Electroacoustics, June 1970. 

8. Oppenheim, A. and Shafer, R. , DIGITAL SIGNAL PROCESSING," Prentice Hall, 
Englewood Cliffs, NJ, 1975. 



25 





Coeff of 


Coeff of 


p 


b<p> 

P+1 


b(P) 

%+2 


0 


^1 


^2 


1 


'>1 " ‘'2 


'>l‘>2 


2 


bj + 2b^b2 


b^b2 + b2 


3 


4 2 2 

bj + 3b^2 + b2 


4 2 2 3 

+ 3b^b2 + b^ 


4 


b^ + 4b^b2 + 3b^b2 


4 2 2 3 

bjb2 + 3b^b2 + b^ 


5 


b^+5bjb2+6b^b2+b2 


b^b2+4b^b2+3b^b2 



Table la. b's for p = 0, 1, ...5. 



P 


“0 


"7^ 

“1 


Jp) 

“2 


Jp) 

“3 


Jp) 

“4 




0 


1 


0 


0 


0 


0 


0 


1 


1 




0 


0 


0 


0 


2 


1 


^1 


»1 *■ ‘‘2 


0 


0 


0 


3 


1 


^1 


^1 ^ ’'2 


b^ + 2b^b2 


0 


0 


4 


1 


^1 




b^ + 2b^b2 


4 2 2 

b^ + 3bp2 + b2 


0 


5 


1 


^1 


^1 ^ *"2 


bj + 2b^b2 


4 2 2 

b^ + 3bp2 + b2 


b^ + 4b^b2 + 3b^b2 



Table lb. q s Tor* p~ 1> ... B. 



26 




Q- 

IT3 


O 


o 


o 


O 


c:: 


CNJ CNi 

-Q 

CO 

+ 

cv 

-Q 

CO ^ 
-Q 

-r 

CD —4 

-Q 

CNJ 

fO 

-1- 




























CNI CNJ 














-Q ^ 












CNJ CNJ 


^ CNJ CNJ 














-Q ^ 












+ 


CO + 


..—V 










CNI 


+ CNJ 


CL 












CNJ ^ 












CNJ ^ 


-Q CNJ ^ 


fT3 


o 


o 


o 


O 


jC 


CO j2 












CO 


-Q CO 












-h 


^ -f- 














4- ^ »— 4 














LT> ^ -Q 












X 


-Q 












CNi 


CNI 












fO 


1— • la 














la + 




























CNI CNJ 












X. 


-Q — X 












CNJ CNJ 


1—4 CNI CNJ 












-O 


^ -C 










X 


-K — X 


CO 4- . — X 










CNJ 


CNJ CNJ 


+ CNI CNi 










-Q 


-Q -Q 


CJ ^ 










f— ^ 


CNJ «— f 1—4 


-Q CNJ ^ ^ 










-C 


-Q -Q 


CO ^ -Q 


.-—V 








CNJ 


CO CNJ 


-Q CO CNJ 


CL 










+ + 


4^r -K + 


^ — LD 








ro ^ 


^ 4 CO ^ 


-4- rr ^ CO 1-4 


<T3 


o 


o 


o 




-Q -Q 


LO ^ -Q -Q 












X. ^ x, 


-Q ^ 










CNJ 


^ CNJ 


1—4 CNJ 










fT3 


»T3 fO 


O ro ro 










+ 


+ + 


la + + 


























CNJ CNJ 


CNI CNJ 












-Q 














+ ^ 












CNJ 


CNI CNJ 


CNJ CNI 








-—V 


-Q 


-Q ^ ^ 


-Q ^ 


»— «v 






CNJ 


^ CNJ 


CNJ ^ CNJ 


CNJ 1— 9 1—9 CNI 


CL 






-Q 


-Q -Q 


-Q ^ -Q 


-C -Q -Q 


' — 






■+• 


CNJ 


CO CNJ + 


CO CNJ -K 


fT3 


o 


o 


CVJ 


+ CNJ ^ 


-K + CNJ ^ 


+ + CNI 1-i 










n 4 -Q 


^ 1-4 CO ^ -a 


^ 1—9 CO —1 jG 










-Q — ' 


-Q -Q 


-Q 








OJ 


CNJ 


CNJ 


CNJ 








<n 




O 1— 9 ^T3 


O 1 — 4 ro 










(T3 + 


«T3 fT3 -♦- 


ra ra + 










+ 




-1- 













S. 













CNJ 


CNJ 


CNI 










-Q ^ 












CNJ 


t— » CNJ 


^ CNJ 


1-^ CNJ 


CL 










-Q 


-Q -Q 


^cn 








CNJ H- 


CNJ 


CNJ -H 


(o 


o 




CNJ ^ -Q 


+ CNJ 8 *-l 


+ CNJ r— 9 1 -H 


-f- CNI 1-4 ^ 






CSJ 


-Q CVJ 


m --4 ^ -Q 


CO ^ ^ ^ 


CO ^ ^ 






<T3 


(T3 


-Q CNJ 


^ CNJ 


-Q ^ — 'CNJ 








-J- 




,—4 ITS 


^ fO 








<T3 


O »T3 + 


O fo H- 


O fO -f 










fT3 + 


fO + 


fa + 




















C\J 


CNJ CNJ 


CNJ CNJ 


CNJ CNI 


CNi CNJ 






<T3 


-Q <T3 




^ (T3 


^ ta 


Q. 




+ 


-J- + 


+ + 


+ + 


+ + 


^CVJ 


C\J 




CNJ *-» ^ 


CNJ »— 1 *— 1 


CNJ i—» •— 9 


CNJ ^ 


tT3 


flj 










-Q -Q 








"x. — ^ f— 4 


Xx— ^ 


' 1— H 


X>^— ^ 1—4 






ro 


o 


O fTS 


O <T3 


o »a 








»T3 + 


»T3 + 


9T3 + 


fa ■4' 
















O*^ 




«T3 


ro 


rrs 


<T3 


fa 


CL 




-K 


-K 


+ 


-K 


+ 




f-H 


» '< 








<— 4 


fT3 


»T3 






X) 










O 


O 


o 


O 


o 






«T3 




8T3 


fa 


fa 


CL 


o 


O 


o 


o 


O 


o 




fT3 


<T3 


»T3 


fa 


ra 


fa 
















CL 


O 


f-H 


CNJ 


cn 




m 



27 



Table 2. Coefficients of eqiiation (17) 




28 



to realize 6th order Butterworth filter. 



High-Speed Recursive Digital Filter Realization 



Figure 1. 
Figure 2. 

Figure 3. 
Figure 4a. 
Figure 4b. 
Figure 5. 
Figure 6. 
Figure 7. 
Figure 8. 
Figure 9. 
Figure 10. 
Figure 11. 
Figure 12. 
Figure 13. 
Figure 14. 

Figure 15. 



LIST OF FIGURES 
Simple FIR Realization 

Realization of A(D)x using 3-stage pipeline multipliers, 2-stage 

pipeline adders and unit delays 

Canonical Realization of a Recursive Filter 

One-stage pipeline multiply-add unit 

One-stage pipeline multiply feeding one-stage add 

kg + k^i stage pipeline multiply-add module 

Heart of the pipeline recursive filter realization 

Non-recursive portion of pipeline filter realization 

Heart of more general pipeline filter realization 

Realization of FIR portion of filter 

Condition for stability of the original filter 

Condition for stability of the augmented filter: p=l 

Condition for stability of the augmented filter: p=2 

Condition for stability of the augmented filter: p=3 

(a) Original transfer function (b) Equivalent augmented transfer 

function 

Sixth-order pipeline realization of second-order recursive filter 



29 



>N 




30 



gure 1. Simple FIR Realization 



X 



+• 




31 



Figure 2. Realization of A(D)x using 3-staqe pipeline ^Itipliers, 2-stage 
pipeline adders and unit delays 




X 



32 



Figure 3. Canonical Realization of a Recursive Filter 




Figure 4a. One-stage pipeline multiply-add unit 




Figure 4b, One-stage pipeline multiply feeding one-stage add 



33 



X 




Figure 5. 




ka + stage pipeline mul ti ply-add module 



34 



D+ - +0 







<3 

O 



35 



Figure 6. Heart of the pipeline recursive filter realization 



X 




5 _ 



Non-recursive portion of pipeline filter realization 



A-l-2k 



! 




3 o 

o 

o 

o 



37 



Figure 8. Heart of more general pipeline filter realization 




X 



38 



Note* figure Illustrates case where p+l is 
integer multiple of k . 



2 

Polynomial* F(z) = z -b^z-b^sQ 
Conditions for Stability* 



1-b,-b2>0 © 

l + b,-b2>0 (D 

|b2|<l <D 




Figure 10. Condition for stability of the original filter 



39 



Polynomial- F(z)= z+b^ = 0 
Conditions for Stability- 



l + b ,>0 © 

1 - b ,>0 ® 




»b 



1 




Figure 11. Condition for stability of the augmented filter: p 



40 



2 2 

Polynomial' F(z)=z +b^z + (b^ + b2) 



Conditions for Stability' 

l + b^+(b^+b2) >0 © 

1 + b +(b^-hb2 >0 ® 

I b + b I < 1 © 

I 2| 




Figure 12. Condition for stability of the augmented filter: p=2 



41 



Polynomial: F(z)= z'’+bi2^+(o^+b2)2+(b^+2b-|b2) 



Conditions for Stability: 

l+b^+(b^+b2)+(b^+2b^b2) > 0 
l-bi+(b^+b2)-(b^+2b^b2) > 0 




Fig. 13 



Stability of augmented filter; p=3 



© © © 0 




(a) 



Hp(z) 




(b) 



Figure 14. (a) Original transfer function (b) Equivalent augmented transfer 

function 



43 





GO 



% 

CT Q 

cn T3 
I 

CD 




Figure 15. Sixth-order pipeline realization of second-order recursive filter 



DISTRIBUTION LIST 



No. Copies 



1. Defense Technical 
Information Center 
Cameron Station 
Alexandria, Virginia 22214 

2. Library, Code 0142 

Naval Postgraduate School 
Monterey, Cal ifornia 93943 

3. Commander 

Attention: CAPT Glover, Code 61 

Naval Electronic Systems Command 
National Center 1 
Washington, D.C. 20360 

4. Dr. Sydney Parker, Code 62Px 

Department of nectrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943 

5. Dr. Donald E. Kirk, Code 62 Ki 

Department of Hectrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943 

6. Dr. Richard Twogood 
Mail Stop L-156 

Lawrence Livermore National Laboratory 
P.O. Box 808 

Livermore, California 94550 

7. Dr. Herschel H. Loomis, Jr., Code 62Lm 
Department of nectrical and Computer Engineering 
Naval Postgraduate School 

Monterey, California 93943 

8. Dr. Michael Soderstrand 

Department of nectrical and Computer Engineering 
University of Cal ifornia 
Davis, California 95616 

9. Dr. Ralph A1 gazi 
Chairman 

Department of nectrical and Computer Engineering 
University of California 
Davis, California 95616 



2 



2 



1 



1 



1 



1 



6 



1 



1 



45 



No. Copies 



10. Dr. Abraham Sheingold, Code 62 1 

Department of □ectrical and Computer Engineering 

Naval Postgraduate School 
Monterey, Cal ifornia 93943 

11. Dr. Bhaskar Sinha 6 

IBM Corporation 

1000 NW 51 Street 

Boca Raton, Florida 33432 

12. Director, National Security Agency 1 

Attention: Mr. Jerry Freidman, W3 

Fort George G. Meade, Maryland 20755 

13. Naval Research Laboratory 1 

Code 9110, PME 106-5 

4555 Overl ook Avenue SW 
Washington, D.C. 20375 
Attention; Mr. Jim Morgan 

14. Naval Research Laboratory 1 

Code 9110, PME 106-5 

4555 Overlook Avenue SW 
Washington, D.C. 20375 
Attention: CDR Richard Rowe 

15. Director, National Security Agency 1 

Attention: Dr. Jeff Friedhoffer, R631 

Fort George G. Meade, Maryland 20755 



46 



f