DOCUMENT RESUME

ED 036 867                                                CG 005 166

AUTHOR          Lumsdaine, Arthur A.
TITLE           Comments on Professor Glaser's Paper Entitled
                "Evaluation of Instruction and Changing Educational
                Models." Center for the Study of Evaluation of
                Instructional Programs Occasional Report No. 15.
INSTITUTION     California Univ., Los Angeles. Center for the Study
                of Evaluation.
SPONS AGENCY    Office of Education (DHEW), Washington, D.C. Bureau
                of Research.
BUREAU NO       BR-6-1646
PUB DATE        Sep 68
CONTRACT        OEC-4-6-061646-1909
NOTE            11p.; Paper from the Proceedings of the Symposium on
                Problems in the Evaluation of Instruction, Los
                Angeles, California, December 13-15, 1967.
EDRS PRICE      MF-$0.25 HC-$0.65
DESCRIPTORS     Behavioral Objectives, *Critical Reading, Educational
                Programs, *Evaluation, Evaluation Techniques,
                *Instruction, Measurement, *Models, Objectives



ABSTRACT 

Professor Lumsdaine, while in general agreement with Professor Glaser on the importance of behavioral objectives, stresses the need for providing an explicit rationale or foundation for evaluation. Emphasis is placed upon improving the technology of evaluation, creating a better market for assessment data, and establishing standards for the adequacy of such data. Also indicated is the need for different procedures for different purposes of evaluation.

(Author) 












CSE Report No. 15






COMMENTS ON PROFESSOR GLASER'S PAPER ENTITLED
"EVALUATION OF INSTRUCTION AND CHANGING EDUCATIONAL MODELS"



Arthur A. Lumsdaine 



CENTER FOR THE STUDY OF EVALUATION OF INSTRUCTIONAL PROGRAMS

University of California, Los Angeles, September 1968



CO-DIRECTORS 



Merlin C. Wittrock    Erick L. Lindman
ASSOCIATE DIRECTORS 

Marvin C. Alkin Frank Massey, Jr. C. Robert Pace 



The CENTER FOR THE STUDY OF EVALUATION OF INSTRUCTIONAL PROGRAMS is engaged in research that will yield new ideas and new tools capable of analyzing and evaluating instruction. Staff members are creating new ways to evaluate content of curricula, methods of teaching, and the multiple effects of both on students. The CENTER is unique because of its access to Southern California's elementary, secondary, and higher schools of diverse socio-economic levels and cultural backgrounds.






U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE
OFFICE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION POSITION OR POLICY.



COMMENTS ON PROFESSOR GLASER'S PAPER ENTITLED 
"EVALUATION OF INSTRUCTION AND CHANGING EDUCATIONAL MODELS" 



Arthur A. Lumsdaine 
University of Washington 



From the Proceedings of the 

SYMPOSIUM ON PROBLEMS IN THE EVALUATION OF INSTRUCTION 

University of California, Los Angeles 
December, 1967 

M. C. Wittrock, Chairman

Sponsored by the Center for the 
Study of Evaluation of Instructional Programs 



The research and development reported herein was performed pursuant to a contract with the United States Department of Health, Education, and Welfare, Office of Education, under the provisions of the Cooperative Research Program.



CSEIP Occasional Report No. 15, September 1968 
University of California, Los Angeles 



COMMENTS ON PROFESSOR GLASER'S PAPER ENTITLED
"EVALUATION OF INSTRUCTION AND CHANGING EDUCATIONAL MODELS"



Arthur A. Lumsdaine 

In relation to the controversy between Glaser and Stake, I 
want to say that I am on Glaser’s side with respect to the impor- 
tance of behavioral objectives. It is impossible for me to imagine 
how we are going to make progress in evaluation unless we do a 
better job of providing a rationale, a logical foundation for 
what we are measuring. I think that the most important contribution of programmed instruction to date has been less the improvement of instruction per se, as a process of teaching, than its role as an engineering effort emphasizing what is required to derive reasonable, measurable, usable instructional objectives from general statements.

I would enter one caveat here. In seeking what outcomes 
to assess in the process of evaluation, we can concentrate too 
fully on stated objectives and fail to include an assessment of 
the extent to which unexpected outcomes may eventuate. That is, 
the ultimate criterion of whether an educational program is a good 
program is what it accomplishes. It may have had some unintended 
bad effects which need to be ascertained, if possible, though 
they certainly were not among the objectives of the educational 
planner. A program also may have some unexpected good effects. 
Thus, if we confine our assessment of the outcomes of an educa- 
tional program to its effects on its specified objectives, we may 
be overlooking some extremely important effects.







There are several points on which I would like to comment briefly. One problem of prime importance, in addition to improving the technology of evaluation (conceived as the means of ascertaining the outcomes of educational programs), is to try to create a better market for assessment data. As it now stands, there is not much demand for evidence about the effectiveness of specific programs. Educational products are still sold on the basis of unsubstantiated advertising. Those who have tried to change this situation have sometimes become quite discouraged by that realization. It seems rather futile to create data about an educational product and its effectiveness if there is little inclination on the part of those responsible for the purchase or selection of such products to look at the data. There is need for an educational effort directed at the educational administrator or the curriculum supervisor, the person who makes the purchase decisions, to teach him about the usefulness of data and of demonstrable program effectiveness in making educational decisions. This important long-range educational task is not necessarily going to be accomplished by those who are concerned with the technology of evaluation as such.

I also ought to mention my conviction, related to the work of the joint committee of the American Psychological Association and the National Education Association (Lumsdaine, 1965) on which several of us here participated: that as part of a viable technology of program assessment, we also need standards for the adequacy of measurement data. The reason for such standards derives from the following sequence of events. First, you have programs with no evidence of output. Then you say, "Let's have evidence or data about the effectiveness of the programs." But then, in the absence of standards, you get cheap, fallacious kinds of evidence, statistically reported in an impressive manner, perhaps, but technically unsound, i.e., with respect to methods of control.

There emerges a sort of Gresham's Law of data, in which bad data, being easier to obtain and more impressive to report than good data, tend to drive out good data. So, some kinds of standards are needed, such as those which have come to be taken for granted and which will be observed in papers reported in, say, the Journal of Experimental Psychology, but which are far from being safely assumed in the data reported by evaluations of educational programs and materials.

I would also like to emphasize a point on which Marvin Alkin will probably comment further: namely, that if we are going to try to use cost-effectiveness criteria, we have to be able to measure output in cost-translatable terms. Lack of this is the big hang-up, as I see it, in any cost-effectiveness program at the present time. It is very hard to say what the economic significance is of the difference between an achievement score of 128 and one of 212, even if these are translated into normative standard scores. The dependent variables which we characteristically use for measuring the outcomes of education are not easily translated into cost terms.

This may be, however, a fortunate accident, because I think that such measures as test scores are probably not what we really ought to use as measures of educational output, anyway. That remark could be easily misunderstood. Let me see if I can clarify it.






Often the most important competence is not that achieved through an educational procedure immediately after instruction. Rather, what we need to know (and this will become increasingly important as knowledge multiplies) is how well the effects of education enable a person to relearn something that he has forgotten, or to get quickly up to current operational proficiency from a background of prior training. There is just too much for everyone to know to expect that people will have a complete repertoire of competences on tap at all times.

What we need can be described in part as a problem of transfer. It is to begin to assess proficiency not in terms of what the person can do now, at the end of instruction, or of what he retains a week or a month or a semester later, but rather in terms of the amount of educational effort required to bring him up to proficiency from where his education to date has left him (or to bring him back up to it after he has forgotten).

This implies something like a "savings" measure. Although there are many problems in developing and using such measures, they are attractive as measures of the effects of education which have promise of being translatable into cost terms. This is also true, of course, of the measure of instructional time needed to reach a criterion (as opposed to the difference in scores after a fixed time of instruction).
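As an illustration of the kind of index intended here, the following is a minimal sketch of a savings-type measure; the notation and the numerical example are illustrative only and are not taken from the paper or the studies cited.

% Illustrative savings measure (notation and example values are hypothetical).
% T_0: instructional time needed to reach criterion with no prior exposure to the program
% T_1: time needed to reach (or regain) criterion by those who have had the program
\[
  S = \frac{T_0 - T_1}{T_0} \times 100\%
\]
% For instance, if T_0 = 20 hours and T_1 = 5 hours, then S = 75 percent:
% three quarters of the instructional effort is "saved," a quantity that
% translates directly into instructional hours and hence into cost.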

Let me turn briefly to a different point. One thing that is very important to recognize clearly is that there are great differences in the procedures needed for different purposes of evaluation.

The needs of program evaluation (particularly for such purposes as program improvement) are quite different from testing to evaluate the potentialities or achievement level of individuals. To take one example: when we are concerned with the evaluation of programs, we have quite a different sampling task than when we are dealing with the evaluation of individuals. For the evaluation of individuals, to oversimplify a little, we need a small sample of items from a large sample of people. We want to give each person a score but are willing to base each person's figure of merit on a relatively small sample of items. But in the problem of program evaluation, the opposite is true. We now want evidence from a relatively small--adequate but relatively small--number of individuals on all relevant items. This is because we are assessing the program for the purpose of product improvement; so it is very important for us to have fine-grain differential knowledge of the successes and failures of the program on each point it covers, so as to detect its very specific failures and successes.
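To make the contrast concrete, a minimal sketch follows; the item counts and sample sizes are invented for illustration and are not taken from the paper.

# Hypothetical sketch of the two sampling designs contrasted above.
import random

ITEM_POOL = [f"item_{i}" for i in range(400)]         # every item relevant to the program's objectives
POPULATION = [f"student_{i}" for i in range(10_000)]  # the student population of interest

# Evaluating individuals: each person answers the same small sample of items,
# so every person receives a comparable score.
common_items = random.sample(ITEM_POOL, 40)                           # few items ...
individual_design = {person: common_items for person in POPULATION}   # ... many people

# Evaluating the program: a modest but adequate sample of people jointly
# covers all relevant items, so each objective's successes and failures show up.
program_sample = random.sample(POPULATION, 300)                  # few people ...
program_design = {item: program_sample for item in ITEM_POOL}    # ... all items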

I have tried hard to think of something to disagree with Bob Glaser about. One point of partial dissent concerns the alleged lack of relation between intelligence and program effects. There are, in fact, numerous instances in which successful attempts have been made to relate measures of ability to differences in programs. I can think of examples because I have been involved in collecting, analyzing, and reporting such data, in studies in which the effects of instructional devices, such as films, were analyzed with concern for differential effects on individuals of greater and lesser ability, as measured by standard tests of intelligence. Great differences, in fact, were found in the effectiveness of programs as a function of such ability-test scores and the educational level of adults (Hovland, Lumsdaine, and Sheffield, 1949).

These differential effects as a function of stratifying variables, such as educational level, IQ, or other measures of ability, are of the form of interactions talked about as an important outcome of our educational tests. However, there are two kinds of interactions. One kind is as follows: the effects of Program A are greater for Group X than they are for Group Y. Here the difference in the relative effectiveness of two programs, A and B (e.g., color versus black and white films, overt versus implicit response procedures, etc.), is greater for one segment of the population, X, than it is for some other segment, Y. There are many instances where this is the case--where a particular instructional variable demonstrably makes more or less difference for, let's say, brighter students than for less bright students (cf. Hovland, Lumsdaine, and Sheffield, 1949, chapters 8 and 9).

However, the argument for the individualization of instruction rests in part on the assumption that there is a more powerful kind of interaction at work than the kind just described. This would be where not only is the difference between A and B greater for Group X than for Group Y, but A is superior for Group X while B is superior for Group Y. This is a "reversal" kind of interaction, one that demonstrates, as a function of some population characteristic, that one program is better for certain persons whereas other programs are better for other persons.
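A hypothetical numerical illustration of the two patterns may help; these means are invented for clarity and are not data from the studies cited.

% Left: ordinal interaction -- Program A is better for both groups, merely by a
%       larger margin for Group X than for Group Y.
% Right: reversal (disordinal) interaction -- A is better for Group X, but B is
%        better for Group Y.
\[
\begin{array}{l|cc}
                & \mathrm{A} & \mathrm{B} \\ \hline
\mbox{Group X}  & 80 & 70 \\
\mbox{Group Y}  & 75 & 72
\end{array}
\qquad\qquad
\begin{array}{l|cc}
                & \mathrm{A} & \mathrm{B} \\ \hline
\mbox{Group X}  & 80 & 70 \\
\mbox{Group Y}  & 68 & 78
\end{array}
\]
% Only the second pattern, taken by itself, argues for assigning different
% programs to different kinds of learners.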

Now, if you search the literature, both on formal instruction and on attitudes, you will find very few such instances of reversal interactions documented by solid evidence. There are a few of them, and modesty forbids mention of the first that come to mind. But it is interesting that, with all our talk about the importance of tailoring instruction to individual characteristics, we find so few instances of differential effects of this kind--so few cases in which we can conclude that a program tailored to a particular subgroup of individuals, not just to individuals one by one, is differentially effective for them.






REFERENCES 



Hovland, C. I., Lumsdaine, A. A., & Sheffield, F. D. Experiments on mass communication. Princeton, N. J.: Princeton University Press, 1949.

Lumsdaine, A. A. Assessing the effectiveness of instructional programs. In R. Glaser (Ed.), Teaching machines and programmed learning, II: Data and directions. Washington, D. C.: National Education Association, 1965.