LIBRARY
MONTEREY. CALIFORNIA 93940
NPS-62-84-061
<(
POSTGRADUATE SCHOOL
Monterey, California
HIGH-SPEED RECURSIVE DIGITAL
FILTER REALIZATION
Herschel H. Loomis, Jr.
Bhaskar Sinha
6 February 1984
public release; distribution unlimited.
onic Systems Command
D.C. 20360
FedDocs
D 208. 14/2
NPS-62-84-061
NAVAL POSTGRADUATE SCHOOL
Monterey, California
Commodore Robert H. Shumaker
Superintendent
David A. Schrady
Provost
This work was supported by the Naval Electronic Systems Command, Washington,
D.C.
Reproduction of all or part of this report is authorized.
This report was prepared by:
ij axTa . LIBRARy
SECURITY CLASSIFICATION OF THIS PAGE 0«f« Entmrmd)
REPORT DOCUMENTATION PAGE
READ INSTRUCTIONS
BEFORE COMPLETING FORM
1. REPORT NUMBER 2. GOVT ACCESSION NO.
NPS-62-84-061
3. RECIPIENT’S CATALOG NUMBER
4. title (and Subtltie)
High-Speed Recursive Digital Filter
Realization
5. TYPE OF REPORT 6 PERIOD COVERED
Technical
6. PERFORMING ORG. REPORT NUMBER
7. AOTHORf*;
Herschel H. Loomis, Jr.
Bhaskar Sinha
8. contract or grant NUMBER('«J
»■ PERFORMING ORGANIZATION NAME ANO ADDRESS
Naval Postgraduate School
Monterey, CA 93943
10. program ELEMENT. PROJECT, TASK
AREA 6 WORK UNIT NUMBERS
62734N
N0003983WRDY078
t1. CONTROLLING OFFICE NAME ANO ADDRESS
Naval □ ectronic Systems Command
Washington, DC 20360
12. REPORT DATE
6 February 1984
13. NUMBER OF pages
14. MONITORING AGENCY NAME & ADD RESS( H dl Karan t from ControUing OUlca)
15. SECURITY CLASS, (ot thl» ttpott)
UNCLASSIFIED
DECLASSIFICATION/ DOWNGRADING
schedule
16. DISTRIBUTION ST AT EmEHT (o( thi a Raport)
Approved for public release; distribution unlimited.
17. DISTRIBUTION STATEMENT (o( tha abatraet antarmd in Block 20» it dlUarant from Raport)
IS. supplementary NOTES
This is a preprint of a paper accepted for publication by CIRCUITS,
SYSTEMS, AND SIGNAL PROCESSING.
19. KEY WORDS ^Continue on ravataa alda it nacaaamry and idantlfy by block numbar)
Digital filters, high rate computation, high-speed logic, pipelined logic,
recursive digital filters.
20. ABSTRACT ^Con(/nu« on ravaraa alda It naeaaamry and Idantlty by block numbar)
Pipeline techniques have been successfully applied to speeding up processing
in both general and special purpose digital computers. Application of these
techniques to non-recursive (FIR) filters has been suggested and is quite
straightforward. Application to recursive (HR) filters has not previously
been shown. In this paper, the technique for applying pipeline techniques
to recursive filters is shown and the advantages and disadvantages of the
technique are discussed. Using these techniques, recursive digital filters
DD I jaN^3 1473 EDITION OF t MOV 68 IS OBSOLETE
5/N 0102- LF- 014- 6601
SECURITY CLASSIFICATION OF THIS PAGE (Whmn Dmtm Snttmd)
security classification of this page fWhan Dmim Bntmrmd)
operating at hitherto impossibly high rates can be designed.
S/N 0102- LF- 014- 6601
SECURITY CLASSIFICATION OF THIS PAGeni7i#n Bntmr^d)
High-Speed Recursive Digital Filter Realization
Herschel H. Loomis, Jr.
Bhaskar Sinha
Pre-print of paper to appear in
CIRCUITS, SYSTEMS, AND SIGNAL PROCESSING
High-Speed Recursive Digital Filter Realization*
Herschel H. Loomis, Jr.**
Bhaskar Sinha***
ABSTRACT
Pipeline techniques have been successfully applied to speeding up
processing in both general and special purpose digital computers. Application
of these techniques to non-recursive (FIR) filters has been suggested and is
quite straightforward. Application to recursive (HR) filters has not
previously been shown. In this paper, the technique for applying pipeline
techniques to recursive filters is shown and the advantages and disadvantages
of the technique are discussed. Using these techniques, recursive digital
filters operating at hitherto impossibly high rates can be designed.
*The basic research for this idea was supported by Department of Electrical and
Computer Engineering, University of California, Davis, Cal ifornia 95616. The
basic idea is the subject of a pending patent application. Subsequent research
was supported by the U.S. Naval Hectronic Systems Command.
**Department of □ectrical and Computer Engineering, Naval Postgraduate School ,
Monterey, Cal ifornia 93943. On leave from the Department of □ectrical and
Computer Engineering, University of California, Davis, California 95616.
***International Business Machines Corporation, Boca Raton, Florida 33432.
High-Speed Recursive Digital Filter Realization
ABSTRACT
Pipeline techniques have been successfully applied to speeding up proc-
essing in both general and special purpose digital computers. Application of
these techniques to non-recursive (FIR) filters has been suggested and is quite
straightforward. Application to recursive (IIR) filters has not previously
been shown. In this paper, the technique for applying pipeline techniques to
recursive filters is shown and the advantages and disadvantages of the tech-
nique are discussed. Using these techniques, recursive digital filters oper-
ating at hitherto impossibly high rates can be designed.
1
INTRODUCTION
Digital filters have an important place in the technology of processing
signals. Because of the sample theorem, the sampling rate which is also the
clock rate of the filter, must be higher than twice the highest frequency
component in the signal to be filtered. Thus the use of digital filters in
real time application involving high frequencies demands high sample rate.
Pipeline techniques are well known [1,2] ways to increase the clock rate
(throughput) of a particular state-of-the-art realization of a logic module.
For example, the Cray I supercomputer has a clock rate of 80 Mhz and requires a
7 stage pipeline multiplier to perform 64-bit floating point multiplication at
the rate of one product every 12.5 ns. [3].
Other workers have investigated residue number techniques to obtain high
rate digital filters [4]. However, difficulties in implementing division by a
constant impose limitations on such real izations , making scaling awkward and
precluding anything resembling floating point.
Pipeline techniques can thus be used to produce multipliers and adders
that operate at the maximum clock rate possible for a given state of the art of
logic devices. For example, TRW makes a non-pipeline 16 bit integer multiply
chip with input and output registers that can be operated at the rate of about
9 MHz. Pipeline techniques could be applied to that design, producing a new
chip that might have three stages and operate at near 40 MHz.
Discrete logic realization of the basic functions of multiplication and
division can yield even better performance. At the extreme we have some of the
integrated circuit ECL lines. Based on the experience gained with the pipeline
supercomputers, it is reasonable to expect that a three stage 12-bit integer
multiply and 1 to 2 stage 16-bit integer adder could be built to operate at
around 80 MHz. Floating point operation could be obtained at the same rate
2
with a cost of perhaps only 2 additional stages for the multiplier and 3 more
for the adder.
What finally limits the clock rate in a pipeline system is the net delay
of all logic and the register delay between successive registers, however
pipeline techniques will yield the highest throughput or clock rate possible
with a given state of the art.
Now it should be clear why we should consider the application of pipeline
techniques.
1) higher clock (sample) rates are possible for a given state of the
art.
2) more complex operations (i.e. floating point) are possible for a given
state of the art and clock rate.
Pipeline techniques can be easily applied to non-recursive systems, that
is, systems without feedback. In the digital filter arena, this would
correspond to non-recursive or Finite Impulse Response (FIR) filters. These
techniques are straight forward and proceed as follows:
An FIR filter can be described by a transfer function in the delay
operator (z"l or D) as in (1)
I - H(z-l) =
+ a.
+• . . + a
-m
m
( 1 )
Figure 1 shows the realization of such a filter using delayless add and
multiply modules as well as unit delays. That realization is unrealistic and
does not capitalize on the advantages of pipeline processors as discussed
above.
Let us suppose that we have a 1 -stage pi pel ine adder and a 3-stage
pi pel ine mul tipi ier.
3
Figure 2 shows a pipeline realization of the transfer function given in
(1). In fact, (1) is not exactly realized because of the delay of the adders.
Instead the output of the filter is y delayed by 7 clock periods. This,
however, is not a serious difficulty.
Thus, pipeline realization of FIR filters is straight forward, costing
only an added delay in the resulting output sequence rate, and perhaps some
additional adders over the realization of figure 1, if the adder has more than
one stage.
What then about doing this for HR filters? Applying pipeline techniques
to recursive calculations is not so strai ghtforward as it is in the non-
recursive case, and was first reported in [1] in connection with accumulator
design.
In the next section we shall consider the design of recursive (HR)
filters. Following that, we discuss the general attributes of the solution and
some of the unsolved problems associated with the technique. Finally, an
example of a sixth order Butterworth filter is treated, to illustrate the
technique.
4
PIPELINE RECURSIVE DESIGN
An order recursive or Infinite Impulse Response (IIR) filter can
be characterized by a ratio of two nth degree polynomials in z“l, the delay
operator, as shown in (2)
y =
ag + 3iZ ^ + ... + amZ
71 ^
1 - bjZ ^ -b2Z ^
-n
X =
A(z)
"ffTzT ""
( 2 )
This representation can be expressed using D, another notation for the
delay operator, as shown in (3). Time is measured in integer steps by the
index i.
y(T)
. A(D)
■ mi
x(i)
(aQ Sj^D + ... + a^^D ) x(i)
(1 - b.D - ... - b d"
1 n
( 3 )
where
D*^x(i) = x(i-k)
Rearranging (3) yields the familiar recursive formulation of the filter
equation, as shown in (4)
y(i) = (ag + a^D + ... + a^o"’) x(i) + (b^D + ... + b^o") y(i) (4)
Equation (4) leads directly to the canonical representation of the filter as
shown in figure 3, assuming the availability of a multiply-add unit with unit
delay (one clock pulse period delay). This realization actually does not yield
y, but in fact produces Dy, y delayed by one clock pulse period. The canonical
form shown in this figure can be realized, but unfortunately because the
multiply-add unit is very complex, the realization will be quite slow. If we
assume 12 bits of significance to the representation of the numbers in the
filter, we find that the unit has two 1-input, 12-bit constant multipliers
5
feeding a three-input by 12-bit adder. All of this complexity forces the clock
pulse period to be very long, and the clock or sampling rate as a consequence
to be low. A typical value or clock rate using state-of-the-art Schottkey
integrated circuit PROMs and adders would be on the order of 10 MHz. A slight
improvement in the rate of operation is possible if we use the basic module
shown in figure 4a. A delay of 1 clock pulse period is produced by the output
register. This module could be built to operate at a clock rate of perhaps 12
MHz, compared with the realization of figure 3.
What is desired is some means to take advantage of higher clock rate pipe-
line realization of the basic multiply-add module, thus producing a higher sam-
pling rate realization of the recursive digital filter.
In order to increase the clock rate of the multiply-add unit, we first use
the two input version, and then insert several registers for intermediate re-
sults in the coefficient multiply and adder sections. For example, a 2-stage
version is shown in Figure 4b that could be made to operate at 16 MHz. The ndt
result is a kp, + k^-stage (register) pipeline multiply-add unit as shown in
figure 5, with kp, stages involved in the coefficient multiply and k^ stages
in the adder.
When we attempt to use the module of figures 4 or 5, we find it impossible
to insert it in the feedback loop, preserving a minimum delay of only one. So,
if we are to use the pipeline module, some other way must be found.
For the purposes of the derivation which follows, we will restrict our
consideration to second-order filters, that is m=2, n=2. This in no way limits
the result, for the process used is not dependent on n, except for the value of
the individual coefficients. On the other hand, higher order filters can also
be realized as the cascade of rn/2"I 2 ^^ order filters, and at most one
filter of lower order.
Now we let n=2 and consider that specific case of
Y
T "
A(z"^)
aQ + a^z”^ + a2z“^
1 -
( 5 )
or represented in terms of the delay operator D, the usual computer
representation of delay.
x( i) _
yTTT
2
Qq + 820
1 - b^D - b20^
( 6 )
Writing this in a different form we have
y(i) = (aQ + a^O + a 2 D^)x(i) + (b^D + b 20 ^)y( i)
( 7 )
which can also be written in difference equation form:
y, . a„x, t ajx,.; t ajX,.^ * (^1-1 * '>2^t-2> <«>
For subsequent development we will use (7); furthermore, we will omit the (i)
notation. We will relate that form to the other forms where appropriate.
Let us first ignore the A(D) polynomial by substituting
x' = A(D)x (9)
into (7) yielding
y = x' + (b^Dy + b20^y) (10)
Delaying (10) and substituting for Dy in (9) we obtain
y = x' + b 2 ^( 0 x' + bjD^y + b 20 ^y) + b 2 D^y
7
or
y = (1 + b^D)x' + (b^ + b 2 )D^y + bjb 2 D^y (11)
(11) Is now a third order difference equation describing the same digital fil-
ter. That is to say, the filter described by (11) has the same transfer char-
acteristic as the one described by (7) or (10). Delaying (10) by and
substituting into (11) yields
y = [1 + bjD + (b^ + b 2 )D^]x'
+ (2bjb2 + b^)D^y + {hj + b^b2)D^ (12)
In general, successively more delayed versions of (10) can be substituted,
raising the order of the difference equation, but, more importantly, raising
also the minimum delay associated with the feedback y terms. The general
higher order difference equation has the following general form, provided we
started from a second-order equation:
y = [1 + D + 4'’’ * ••• 4”’
P+1
P+2
(13)
-th
In fact, if we began with an n order difference equation, we would have
an equation of p+n order resulting from the foregoing process as shown in (14)
y = [1 + D + ... + Dp]x'
(H)
The coefficients ajP^ and b|P^ can be calculated by following the process
described above in detail. For example, it should be clear from (12) that
8
( 15 )
.( 2 ) =
= b.
.( 2 )
(2) - OK K 4. k3 . k(2)
>2 “ 2b^b2 * ^4
2 2
+ b^b2
where the original b coefficients are from (10). We will cover the process in
detail a little later in this section.
Now let us recall what x' is in terms of the original IIR filter as des-
cribed in (7). Substituting x' = A(D)x into (13) yields:
y = [1 + D + ... + nP] [ag + a^O + a20^]x
This equation when multiplied out for the case of p=2 yields (17)
y = + aj^^D + ... + a^^^D^^lx
>(2) n3
+ [bA ^ 0 y + b; ^ D y]
(2) n4
(16)
(17a)
and
w' = [aj^^ + aj^^D + ... + a^^^D^]x
+ w' + b|^^D^ w']
(l^la)
These two equations look very similar now, the only difference being that
the non-recursive parts of (18a) is the non-recursive part of (17a) delayed by
A + 4. If both sides of (17a) are delayed by that amount, (17b) results:
[a^2) * a|2)D . ... . af >D^]x
(17b)
9
Finally, it should be noted that D^‘'’^y in (17b) is the same as w' in
(18a). substituting D^''’^y = w' in (17b) produces (17c):
w' = [a^^^ + aj^^ + ... + aj^^0^]x
+ w' + b|^^D^ w']
(17c)
which is the same as (18a).
What this all means is that the structure of figure 6 will produce an out-
put w' which is simply the desired signal y, delayed by a+4 sampling periods.
This digital filter structure uses pipeline logic units in both the recursive
and non-recursive portions to realize the desired output.
The design still needs to be completed, even though the heart of the
structure has been developed. Figure 7 shows the structure of the non-
recursive portion of the filter to complete the example.
The circuit of figure 7 produces D^Cag + a^H +...+ a 4 D^]x which
corresponds to the D^Cag + ajD +...+ a 40 ^]x term required as the out-
put of the non-recursive part of figure 7. Therefore, the pipel ine digital
filter of figure 7 combined with figure 6 produces that is y, de-
layed by 8 sample periods.
Now we return to the calculation of the and b^^^
coefficients. More general relations than (15) can be developed in difference
equation form with respect to p.
Starting with the p augmented order (p+2 order) difference equation (13)
we can derive the p+1 augmented order equation by substituting for
[(10) delayed by into (13).
10
y =
[1 + ajP^ D + ... + dP]x'
+ b|jP{ x' +(bj dP"^^ y + b 2 dP'^^ y)]
* b<P| y
[1 + ,(P> D t ... + c,<P> dP t b<P> dP*']x’
* (b[,Sl bi - b<P>) dP "2 y . b'Pj bj 0 P" 3 y
From (19) we can see that
„(P+1)
“p+1 .
b';r>
•b'Sl
= b'?l bi * b(P>
“ bp?l bj
Where agP^= 1, b|*^^= bj, and b 2 *^^= b.
(19)
(20a)
(20b)
(20c)
Table 1 shows the values of the coefficients of (13) for the values of p from 0
‘f' ^
through 5 as derived from (20). Furthermore, multiplication of the p*”^
order polynomial
ip> 1
0 1 p
by the original
2
ag + aj^D + a20
yields the polynomial
a
(P)
0
+ a<P>0
+
• • •
+ a
(p)nP+2
p+2^
11
for p
as shown in (16) and (18). Table 2 shows the values of aj^^
from 0 through 5.
The unique structure of this realization, which differs from conventional
realizations of digital filters is contained in the recursive portion, where
the basic feedback loop involving the longest delay passes through p+2 pipeline
nrl
stages, for the realization of a 2”° order filter. In the example devel-
oped, p was 2, causing us to use a 4^ order pipeline filter represen-
tation of the original 2*^*^ order filter.
In general, we assume we have stage pipeline add units and k^ stage
pipeline multipliers. The structures of the ralization of the feedback portion
of the filter would be as shown in figure 8. Since the maximum delay around
the loop must match the p augmented difference equation order, we have
p = 2k^.k^-2 (21)
The FIR portion of the filter is shown in general terms in figure 14.
From this figure, it should be clear that
4 ' ^ ^ ^ "m
and that the over all delay is therefore
4 + 2IC5 = [io 92(I<3)]|C3 t t (22)
STABILITY ISSUES
In the previous section we have seen the general process of realization of
a digital filter transfer function using pipel inemultiply-add units with delay
2. We have also shown a realization using 2-stage multiply-add units. The
essence of the techique is to represent a second-order section by means of a
(2+p) order section of a particular form, where p is determined by the
number of stages in the pipeline multiply and add units.
12
We know from other considerations that the transfer function of the higher
order realization must in fact be the same as the original and therefore must
have the same poles and zeros in the z plane. Thus, the operation which trans-
forms (7) into (17) can be thought of in the following way:
(aQ + a^z'^ + a2z'^) (1 + aJP^z'^ + ... + aJ^^^z'P)
(1 - b^z"'*’ - b2Z^) (1 + + ... + Op^^z"*^)
a(p) + a(p) ,-l + + a(p) ^-P-^
aQ + aj z + ... + ap ^2 ^
1 - b^P^ z'P"^ - b^P^ z'P"^
p+1 p+2
(23)
(24)
Where the coefficients of the a polynomial depend on the original bj and
b 2 of (5) and (7). These a coefficients are governed by (20) and some were
tabulated in Table lb. Thus, we raise the order of the denominator, introduc-
ing p new poles and corresponding cancelling zeros. These new poles correspond
to the roots of
0=1+ D + ... + dP (25)
The roots of a(D) of course are the poles in the z~^ plane and are
the reciprocal of the poles in z plane.
One concern in the realization is that the filter be stable, that is, that
it have an impulse response which decays to zero. A sufficent condition for
the stability of a digital filter with transfer function Hfz”^) is that
the poles of H(z~^) in the z~^ plane lie outside the unit circle,
and that the order of the numerator (-m) be less than or equal in magnitude to
the order (-n) of the denominator. A more familiar sufficient condition is
that the poles of the transfer function of z (H(z)) be inside the unit circle.
13
Now, if we start with a stable filter, we would like to be assured that
the poles of the augmented order filter all be outside the unit circle in the
z~^ plane. We desire this because, even though the added poles are
cancelled by the zeros of (1 + + ... + realization imperfec-
tions will prevent exact cancellation and the augmented filter would be
unstable.
In order to examine the stability question, we will make use of Jury's
stability test [6] as applied to the successively higher degree denominator
polynomials in z as p increases.
We start with the denominator of (5), written as a polynomial in z
P(z) = z^ - b^z - b^ (26)
To insure that the original poles of (5) are inside the unit circle and
hence that the original filter is stable, we apply Jury's test with necessary
cond itions:
P(l) = 1 - bj - b^ > 0 I
(-1)'’P(-1) = 1 + bj - b2 > 0 /
and sufficient condition: \ (27)
I I < 1 J
Hence, for this second-order digital filter to be stable, the values of bj
and b 2 must be in the shaded area as shown in Fig. 10. Note that this is the
case when p = 0, i.e., no augmentation.
Now, assuming that the original filter transfer function (5) is stable,
i.e., (27) is satisfied, it is desirable to determine conditions to assure
14
stability of the new p poles introduced in the system when the transfer
function is p-augmented. For p = 1, the new polynomial introduced in the
numerator and denominator is
F(z) = z + (28)
Applying Jury's test, conditions for stability are
F(l) = 1 + bj > 0 1
(29)
(-1)"f(- 1) = 1 - bj > 0 I
This condition, | bj | < 1, is shown graphically in figure 11. Similarly, for
p = 2, the added polynomial is
F(z) = z^ + bjz + (bj + b^) (30)
Jury's test gives conditions for stable
F(l) = 1 + bj + (bj + b2) > 0
(-1)"f(- 1) = 1 - bj + (bj + b,
and the sufficient condition is
roots. The necessary conditions are
The region defined by conditions (31) is shown graphically in figure 12.
Finally, for p = 3
F(z) = z^ + bjZ^ + (bj + + (bj + 2b^b2)
15
and the conditions for stable poles are (figure 13):
F(l) = 1 + + (bj + b2) + (bj + > 0
(-1)"f(- 1) = 1 - b^ + (b^ + b2) - (bj + Zbjb^) > 0
I bf + 2b^b2 I < 1
I (bj + 2b^b2)^ - 1 I > I bi(bj + 2bjb2) - (b^ + b2) |
This procedure can be continued for higher values of p; in fact general
stability tests based on Jury’s test have been developed [6]. Unfortunately,
those tests are cumbersome to apply.
Instead, let us take another approach.
From figures 10-13 it may be observed that as the value of p increases,
the area of stability (the shaded areas in the figures) tends to increase. It
was seen graphically that as p increased, the stable region came to be closer
and closer to the original shaded area for p = 0, i.e., figure 10. This leads
to the obvious conjecture that all stable filters have a stable augmented
filter for some large enough p.
We will follow the procedure suggested by Voelcker [73 for Block Filters
to show that for a large value of augmentation, p, the pipelined version of the
original filter will always be stable, provided that the original second order
filter is stable. This is a very important and useful conclusion and implies
that an augmented stable realization is always possible if the original trans-
fer function is stable. The proof for this fact follows:
Since the process of augmentation implies multiplying the numerator and
the denominator by the same polynomial, the original transfer function may be
equated to the augmented transfer function as
16
where N(z) is the numerator of the original transfer function. Oenoting the
denominator of the original transfer function as 0(z), since this original fil-
ter is stable, the roots of D(z) are inside the unit circle. Figure 14(a)
shows the original filter where Hp(z) is a pole only function representing
the denominator, (1/D(z)) of the original filter.
The process of augmenting the difference equation suggests building the
filter as shown in figure 14(b) where
and
Thus
(33)
Hence, the nonrecursive transfer function a(z) may be written as
a(z) = Hp(z) - z
-(p+1) G(z)
nrzT
(34)
Since H is a pole only transfer function
P
2 a.
z .
i=l 1
z
17
Taking the inverse transform
V"> " D
(•1
Since initial poles, they are inside the unit cir-
c,e. Hence h^(n) 1s an Infinite se,eence decreasing in ^agnitede. If the
sequence is truncated at the (p + 1) term, the remaining portion of the
sequence
'h'pCn) =
i=l
n 0, 1, •••, p
otherwi se
Taking the z-transform
1^p(z) =^1^p(n)z-'’
n=0
P 2
E
n=0 i=l
E^iEH
i=l n^
2
i=l
fi - (z,./z)P*n
'1 1 - U,/z) j
2 2 , , ,p+l
a.. _ a-Cz-Zz)*"
^ i - Kzjz) " S i - {z./z)
i=l
i=l
V
18
Thus ,
Hp(z) =
Hp(z) -
•(P+1)
(35)
Comparing (34) and (35) and, since 1i'p(z) is a finite impulse response
transfer function, equating 1lp(2) and a(z)
2 a 7
G(z) _ ^ ^i^i
ITTzI' 2^ i - z./z
i=l ’
Thus
G(z)
i=l
a.Zi
P+1
D(z)
(z^./z)
(36)
The intent of this procedure is to prove that, for some high value of p, all
roots of a(z) are inside the unit circle in the z-domain. From (33)
D(z)a(z) = 1 - z"^P'"^^G(z) (37)
Since 0(z) is the denominator of the original transfer function, all roots of
D(z) are inside the unit circle. Hence all roots of a(z) are also inside if,
and only if, all roots of the right hand expression of (37) are inside the unit
circle. To do this, let tilde (~) denote the result of the mapping
z~^ z, i.e., Tiz) = f(z"^) and X(z) is defined as
A(z) = 1 - z^P^^^^(z) =^(z)a(z) (38)
All roots of D(z) are exterior (outside the unit circle). Hence, all roots of
a(z) are exterior if all roots of A(z) are exterior. To prove this, Rouche's
theorem is used which states that: "If f(z) and g(z) are analytic inside and
on a closed contour C, and | g(z) | < j f(z) | on C, then f(z) and f(z) + g(z) have
19
the same number of zeroes inside C. " Clearly, A in (38) is analytic within and
on the unit circle. The constant "1" has no interior zeroes. Hence A(z) will
have no interior zeroes if, on the unit circle.
z
I < 1
(39)
From (4.20),
z<P*‘>G(z)
z=e
(40)
But I Zi I < 1 for all i because these are poles of the original filter. Thus,
for some value of p greater than some critical value, equation (40) will satis-
fied. Note that (40) may be satisfied for smaller value of p also.
From the above discussion, it may be concluded that for some high value of
augmentation, it is always possible to ensure that a(z) has roots inside the
unit circle. Since these roots are the new additional poles of the augmented
system, the new high order transfer function would be stable.
As a result of this fact, the following design method is suggested.
We assume that the designer is given a stable recursive digital filter
transfer function H(z) and a desired sampling (clock) rate of operation, fj.
1. Find k(f, stage pipeline multiply and k^ stage pipeline add units
that will operate at clock frequency f^.
2. Factor H(z) into Ji ()l - 1 if n is odd) 2*^^ order (and if n is
s t
odd, at most one 1 order) transfer functions.
H(z) = H^(z) . H 2 (z) ...Hj^(z)
where 4 = fn/21
3. For each H-j(z), i = 1, 2, ... z. Realize H^(z) as follows:
a. Set p - 2kg + km - 2,
20
b. Determine the polynomial 1 - = Bp(z~^),
- 1 .
c. Solve for the roots of Bp(z ) = 0
(41)
d. If all roots of B ( 2 "^) = 0 are outside the unit cir-
P
cle, go to e, else set p = p + 1 and return to step 35 .
e. H-j(z) can be realized by a stable p augmented order difference
equat ion
, a'P) . a'P>z-l . ... a
'* ' 1 - b'P’z-P-l -
^ °p+l^ °p+2^
(24)
where p 2 2kg + k^, - 2
Because the last result, we know that we can always find a large
enough p so that (24) will be stable.
' In the next section, we will illustrate the method with an example filter
taken from the 1 iterature.
EXAMPLE
We will treat the Butterworth filter considered by Oppenheim and Shafer
[8], This is a sixth order filter whose original transfer function H(s) is
given by
H(s) =
0.20238
( s^+0 . 396S+0 . 5871 ) ( s^+1 . 083s+0 . 5871 ) ( s^+1 . 4802s+0 . 5871 )
(42)
Applying the bi-1 inear transformation, a sixth order difference equation is
obtained that can be factored in three second order filters.
H(z) = H^(z) • W^iz) • H 3 (z)
(43)
21
where
\ _ .0007378 (1 + 2z~^ + z'^)
rl. (Z) - _i _y
^ (1 - 1.2686 z ^ + 0.7051 z
H^u) a -
^ (1 -1.0106 z ^ + 0.3583 z
H (z) = — (1
(1 - 0.9044 z ^ + 0.2155 z '^)
The poles in the z'^plane of these equations are as follows
Pj = 0.8996 ± j.7804
I P J = 1.1909
P, = 1.4103 ± j.8956
1^2
1.6706
P, = 2.0984 ± j.4870
P 3 I = 2.1542
(44a)
(44b)
(44c)
(45a)
(45b)
(45c)
Thus all three of the factoring second order transfer functions are
stabl e.
p augmented order difference equations of the form of (16) were derived
for each of the three sections, for values of p from 1 through 5. and
H 2 were unstable for p=l and stable for all values of p from 2 through 5.
H 3 was stable for all values of p from 1 through 5.
If we had a 2 stage adder and a 2 stage multiplier, (21) would yield a p
of 4, which for our example adds stable poles to each section. The general
structure of each of the 2nd order (4-augmented) sections would be as shown in
figure 15.
22
Finally, the values for the coefficient of each of the multipliers would
be as shown in table 3.
f4l
Note that the a' ’ coefficient for the first section are all on the
order of .001 in magnitude. These coefficient could easily be computed to more
significant figures, although realization of more significant figures in the
coefficient of an actual filter would require either scaling or the use of a
floating point number system.
This example shows how an IIR filter can be realized using high rate pipe-
line modules, with the ultimate objective of achieving a higher sample rate
than possible with non-pipel ined multiplier and adders.
23
SUMMARY AND CONCLUSIONS
In this paper we have developed a method for applying pipeline techniques
to the design of high speed recursive digital filters. Using these techniques,
recursive digital filters operating at rates hitherto impossible can be
designed. The general structure of the filter and the method for calculating
the multiplier coefficients is presented. The stability of the resulting real-
ization has been investigated and a technique for ascertaining the stability of
the realization is presented.
A significant example taken from the literature illustrates the technique
and demonstrates the existence of a stable, high rate pipeline realization of a
practically useful filter.
4
24
REFERENCES
1. Loomis, H. H. Jr., "The Maximum Rate Accumulator," IEEE TEC, EC-15, Number
4, Aug. 1966, pp. 628-639.
2. Chen, T. C. , "Overlap and Pipeline Processing," INTRODUCTION TO COMPUTER
ARCHITECTURE, H Stone Editor, Science Research Associate, Chicago, Chapter
9.
3. Cray Research, Inc., CRAY-1 HARDWARE REFERENCE MANUAL, Cray Research
Publication No. 2240004, Rev E, May 15, 1979.
4. M. A. Soderstrand and E. L. Fields, "A High Speed Low-Cost Recursive
Digital Filter Using Residue Number Arthmetic," PROC IEEE, July 1977.
5. Sinha, B., PIPELINED FINITE STATE MACHINE ARCHITECTURE APPLIED TO DIGITAL
FILTERS, Ph.D. Dissertation, University of California, Davis, CA, September
1983.
6. Nagrath, I. J. and Gopal , M. , CONTROL SYSTEMS ENGINEERING, Hal sted Press,
John Wiley and Sons, 1982.
7. Voelcker, H. B. and Hartquest, E. E., "Digital Filtering via Block
Recursion," IEEE TRANSACTIONS on Audio and Electroacoustics, June 1970.
8. Oppenheim, A. and Shafer, R. , DIGITAL SIGNAL PROCESSING," Prentice Hall,
Englewood Cliffs, NJ, 1975.
25
Coeff of
Coeff of
p
b<p>
P+1
b(P)
%+2
0
^1
^2
1
'>1 " ‘'2
'>l‘>2
2
bj + 2b^b2
b^b2 + b2
3
4 2 2
bj + 3b^2 + b2
4 2 2 3
+ 3b^b2 + b^
4
b^ + 4b^b2 + 3b^b2
4 2 2 3
bjb2 + 3b^b2 + b^
5
b^+5bjb2+6b^b2+b2
b^b2+4b^b2+3b^b2
Table la. b's for p = 0, 1, ...5.
P
“0
"7^
“1
Jp)
“2
Jp)
“3
Jp)
“4
0
1
0
0
0
0
0
1
1
0
0
0
0
2
1
^1
»1 *■ ‘‘2
0
0
0
3
1
^1
^1 ^ ’'2
b^ + 2b^b2
0
0
4
1
^1
b^ + 2b^b2
4 2 2
b^ + 3bp2 + b2
0
5
1
^1
^1 ^ *"2
bj + 2b^b2
4 2 2
b^ + 3bp2 + b2
b^ + 4b^b2 + 3b^b2
Table lb. q s Tor* p~ 1> ... B.
26
Q-
IT3
O
o
o
O
c::
CNJ CNi
-Q
CO
+
cv
-Q
CO ^
-Q
-r
CD —4
-Q
CNJ
fO
-1-
CNI CNJ
-Q ^
CNJ CNJ
^ CNJ CNJ
-Q ^
+
CO +
..—V
CNI
+ CNJ
CL
CNJ ^
CNJ ^
-Q CNJ ^
fT3
o
o
o
O
jC
CO j2
CO
-Q CO
-h
^ -f-
4- ^ »— 4
LT> ^ -Q
X
-Q
CNi
CNI
fO
1— • la
la +
CNI CNJ
X.
-Q — X
CNJ CNJ
1—4 CNI CNJ
-O
^ -C
X
-K — X
CO 4- . — X
CNJ
CNJ CNJ
+ CNI CNi
-Q
-Q -Q
CJ ^
f— ^
CNJ «— f 1—4
-Q CNJ ^ ^
-C
-Q -Q
CO ^ -Q
.-—V
CNJ
CO CNJ
-Q CO CNJ
CL
+ +
4^r -K +
^ — LD
ro ^
^ 4 CO ^
-4- rr ^ CO 1-4
<T3
o
o
o
-Q -Q
LO ^ -Q -Q
X. ^ x,
-Q ^
CNJ
^ CNJ
1—4 CNJ
fT3
»T3 fO
O ro ro
+
+ +
la + +
CNJ CNJ
CNI CNJ
-Q
+ ^
CNJ
CNI CNJ
CNJ CNI
-—V
-Q
-Q ^ ^
-Q ^
»— «v
CNJ
^ CNJ
CNJ ^ CNJ
CNJ 1— 9 1—9 CNI
CL
-Q
-Q -Q
-Q ^ -Q
-C -Q -Q
' —
■+•
CNJ
CO CNJ +
CO CNJ -K
fT3
o
o
CVJ
+ CNJ ^
-K + CNJ ^
+ + CNI 1-i
n 4 -Q
^ 1-4 CO ^ -a
^ 1—9 CO —1 jG
-Q — '
-Q -Q
-Q
OJ
CNJ
CNJ
CNJ
<n
O 1— 9 ^T3
O 1 — 4 ro
(T3 +
«T3 fT3 -♦-
ra ra +
+
-1-
S.
CNJ
CNJ
CNI
-Q ^
CNJ
t— » CNJ
^ CNJ
1-^ CNJ
CL
-Q
-Q -Q
^cn
CNJ H-
CNJ
CNJ -H
(o
o
CNJ ^ -Q
+ CNJ 8 *-l
+ CNJ r— 9 1 -H
-f- CNI 1-4 ^
CSJ
-Q CVJ
m --4 ^ -Q
CO ^ ^ ^
CO ^ ^
<T3
(T3
-Q CNJ
^ CNJ
-Q ^ — 'CNJ
-J-
,—4 ITS
^ fO
<T3
O »T3 +
O fo H-
O fO -f
fT3 +
fO +
fa +
C\J
CNJ CNJ
CNJ CNJ
CNJ CNI
CNi CNJ
<T3
-Q <T3
^ (T3
^ ta
Q.
+
-J- +
+ +
+ +
+ +
^CVJ
C\J
CNJ *-» ^
CNJ »— 1 *— 1
CNJ i—» •— 9
CNJ ^
tT3
flj
-Q -Q
"x. — ^ f— 4
Xx— ^
' 1— H
X>^— ^ 1—4
ro
o
O fTS
O <T3
o »a
»T3 +
»T3 +
9T3 +
fa ■4'
O*^
«T3
ro
rrs
<T3
fa
CL
-K
-K
+
-K
+
f-H
» '<
<— 4
fT3
»T3
X)
O
O
o
O
o
«T3
8T3
fa
fa
CL
o
O
o
o
O
o
fT3
<T3
»T3
fa
ra
fa
CL
O
f-H
CNJ
cn
m
27
Table 2. Coefficients of eqiiation (17)
28
to realize 6th order Butterworth filter.
High-Speed Recursive Digital Filter Realization
Figure 1.
Figure 2.
Figure 3.
Figure 4a.
Figure 4b.
Figure 5.
Figure 6.
Figure 7.
Figure 8.
Figure 9.
Figure 10.
Figure 11.
Figure 12.
Figure 13.
Figure 14.
Figure 15.
LIST OF FIGURES
Simple FIR Realization
Realization of A(D)x using 3-stage pipeline multipliers, 2-stage
pipeline adders and unit delays
Canonical Realization of a Recursive Filter
One-stage pipeline multiply-add unit
One-stage pipeline multiply feeding one-stage add
kg + k^i stage pipeline multiply-add module
Heart of the pipeline recursive filter realization
Non-recursive portion of pipeline filter realization
Heart of more general pipeline filter realization
Realization of FIR portion of filter
Condition for stability of the original filter
Condition for stability of the augmented filter: p=l
Condition for stability of the augmented filter: p=2
Condition for stability of the augmented filter: p=3
(a) Original transfer function (b) Equivalent augmented transfer
function
Sixth-order pipeline realization of second-order recursive filter
29
>N
30
gure 1. Simple FIR Realization
X
+•
31
Figure 2. Realization of A(D)x using 3-staqe pipeline ^Itipliers, 2-stage
pipeline adders and unit delays
X
32
Figure 3. Canonical Realization of a Recursive Filter
Figure 4a. One-stage pipeline multiply-add unit
Figure 4b, One-stage pipeline multiply feeding one-stage add
33
X
Figure 5.
ka + stage pipeline mul ti ply-add module
34
D+ - +0
<3
O
35
Figure 6. Heart of the pipeline recursive filter realization
X
5 _
Non-recursive portion of pipeline filter realization
A-l-2k
!
3 o
o
o
o
37
Figure 8. Heart of more general pipeline filter realization
X
38
Note* figure Illustrates case where p+l is
integer multiple of k .
2
Polynomial* F(z) = z -b^z-b^sQ
Conditions for Stability*
1-b,-b2>0 ©
l + b,-b2>0 (D
|b2|<l <D
Figure 10. Condition for stability of the original filter
39
Polynomial- F(z)= z+b^ = 0
Conditions for Stability-
l + b ,>0 ©
1 - b ,>0 ®
»b
1
Figure 11. Condition for stability of the augmented filter: p
40
2 2
Polynomial' F(z)=z +b^z + (b^ + b2)
Conditions for Stability'
l + b^+(b^+b2) >0 ©
1 + b +(b^-hb2 >0 ®
I b + b I < 1 ©
I 2|
Figure 12. Condition for stability of the augmented filter: p=2
41
Polynomial: F(z)= z'’+bi2^+(o^+b2)2+(b^+2b-|b2)
Conditions for Stability:
l+b^+(b^+b2)+(b^+2b^b2) > 0
l-bi+(b^+b2)-(b^+2b^b2) > 0
Fig. 13
Stability of augmented filter; p=3
© © © 0
(a)
Hp(z)
(b)
Figure 14. (a) Original transfer function (b) Equivalent augmented transfer
function
43
GO
%
CT Q
cn T3
I
CD
Figure 15. Sixth-order pipeline realization of second-order recursive filter
DISTRIBUTION LIST
No. Copies
1. Defense Technical
Information Center
Cameron Station
Alexandria, Virginia 22214
2. Library, Code 0142
Naval Postgraduate School
Monterey, Cal ifornia 93943
3. Commander
Attention: CAPT Glover, Code 61
Naval Electronic Systems Command
National Center 1
Washington, D.C. 20360
4. Dr. Sydney Parker, Code 62Px
Department of nectrical and Computer Engineering
Naval Postgraduate School
Monterey, California 93943
5. Dr. Donald E. Kirk, Code 62 Ki
Department of Hectrical and Computer Engineering
Naval Postgraduate School
Monterey, California 93943
6. Dr. Richard Twogood
Mail Stop L-156
Lawrence Livermore National Laboratory
P.O. Box 808
Livermore, California 94550
7. Dr. Herschel H. Loomis, Jr., Code 62Lm
Department of nectrical and Computer Engineering
Naval Postgraduate School
Monterey, California 93943
8. Dr. Michael Soderstrand
Department of nectrical and Computer Engineering
University of Cal ifornia
Davis, California 95616
9. Dr. Ralph A1 gazi
Chairman
Department of nectrical and Computer Engineering
University of California
Davis, California 95616
2
2
1
1
1
1
6
1
1
45
No. Copies
10. Dr. Abraham Sheingold, Code 62 1
Department of □ectrical and Computer Engineering
Naval Postgraduate School
Monterey, Cal ifornia 93943
11. Dr. Bhaskar Sinha 6
IBM Corporation
1000 NW 51 Street
Boca Raton, Florida 33432
12. Director, National Security Agency 1
Attention: Mr. Jerry Freidman, W3
Fort George G. Meade, Maryland 20755
13. Naval Research Laboratory 1
Code 9110, PME 106-5
4555 Overl ook Avenue SW
Washington, D.C. 20375
Attention; Mr. Jim Morgan
14. Naval Research Laboratory 1
Code 9110, PME 106-5
4555 Overlook Avenue SW
Washington, D.C. 20375
Attention: CDR Richard Rowe
15. Director, National Security Agency 1
Attention: Dr. Jeff Friedhoffer, R631
Fort George G. Meade, Maryland 20755
46
f