Notes on:
Byrne, C. (1976) 'Computerized question
banking systems: 1—the state of the art' in British
Journal of Educational Technology 7(2):
44-64
Dave Harris
[Apparently, this and a more critical part two consist of a report for a national development programme in computer assisted learning]
It's possible to develop question banks of particularly reliable items which can be indexed according to lists of topics and/or levels of difficulty. Subdivisions inside the bank are important: topics might be divided into subtopics and so on [this implies single-task questions, of course, which is the default position for educational technologists]. Knowledge of distributions of answers might be required too, such as the mean and standard deviation for a given student population. All this can be conveniently computerized [which needed saying in 1976].
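[Byrne gives no data layouts, but the structure just described is easy to sketch. A minimal illustration in Python; all field and function names are my invention, not anything in the paper:]

```python
from dataclasses import dataclass

@dataclass
class Item:
    """One banked question; every field name here is hypothetical."""
    text: str
    topic: str          # e.g. "fractions"
    subtopic: str       # e.g. "unlike denominators"
    p_correct: float    # proportion answering correctly (higher = easier)
    mean: float         # mean score for a given student population
    sd: float           # standard deviation for that population

# Index by (topic, subtopic) so tests can be assembled by curriculum area.
bank: dict[tuple[str, str], list[Item]] = {}

def add(item: Item) -> None:
    bank.setdefault((item.topic, item.subtopic), []).append(item)

def select(topic: str, subtopic: str, lo: float, hi: float) -> list[Item]:
    """Pull an area's items whose difficulty falls inside a target band."""
    return [i for i in bank.get((topic, subtopic), [])
            if lo <= i.p_correct <= hi]
```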
One example can be found in the Educational Testing Service in the USA, which has produced national standardized tests since 1966. There is complex indexing, including an 'ability-process dimension', which is a 'pragmatic adaptation of the Bloom taxonomy' (47). Other dimensions include abstract/concrete. Ultimately, the aim is to make these items matchable to the 'individual school's behavioural objectives' (48).
Another example is found in the Classroom Teacher Support System for high school teachers in Los Angeles. There are 8000 multiple-choice questions in history in the bank. There is some subjective appraisal, for example of the difficulty of these items. Routine statistics are gathered for each question: difficulty, discrimination, the frequencies of student response and the 'teacher rejection rate'. The system can even score tests if the teacher wishes it to. In practice, it tends to be used for frequent class tests.
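[These 'routine statistics' are standard classical test theory quantities. A sketch of how difficulty and discrimination might be computed from response records; the upper-lower 27% convention is one common method, and the function names are mine, not the system's:]

```python
def difficulty(item_correct: list[bool]) -> float:
    """Classical item difficulty: proportion of students answering correctly."""
    return sum(item_correct) / len(item_correct)

def discrimination(item_correct: list[bool], total_scores: list[float]) -> float:
    """Upper-lower discrimination index: proportion correct in the top 27%
    of students (by total test score) minus that in the bottom 27%."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, round(0.27 * len(order)))
    low, high = order[:k], order[-k:]
    return (sum(item_correct[i] for i in high) / k
            - sum(item_correct[i] for i in low) / k)
```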
A third example is the Comparative Achievement Monitoring system (CAM), also in the US. This bank uses similar classifiers, but makes sure that every question is 'associated with an objective'; the objectives are supplied too, to help teachers select appropriate tests. Teacher selection solves the problem of matching assessment tasks and principles, but the designers also discuss other functions of assessment, including the extent of discrimination among students, links between formative and summative evaluation, the results of student feedback, and the use of assessment to test the effectiveness of courses. There is a lot of user guidance in this particular bank. Banks in the past have sometimes found it difficult 'to mesh well with developmental conceptions... and the curriculum' (51). CAM trains teachers and offers an evaluation service. It also produces diagrams showing connections between scores and questions.
There are some anticipated problems, including a simplification of the forms of student response, and limits to the types of question. However, one bank, a reading program, offers 55 types of question! Student feedback is available and is gathered in various kinds of detail. The computers can help to design tests, for example by supplying distractors at random [for multiple-choice tests, that is], and can supply concrete numbers for problems in algebra. The computer can provide 'incorrect' solutions based on known wrong student strategies (54) [so you can actually penalize incorrect answers, to help students avoid them, of course]. However, setting up banks and running them is still expensive [a lot cheaper now, I bet]. They actually require sufficiently rich instructional materials [including lots of confusion and error as well as the right approach?]. Teachers will accept them more readily if they are involved from the beginning. Different subjects present different difficulties: these devices are very good for elementary arithmetic, for example. For other subjects, the dangers of trivialization are acknowledged [briefly] (59), although it is claimed that computer generated items are as reliable as the ones made by human instructors [probably right; see below].
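[The generation techniques mentioned above (random distractors, concrete numbers slotted into algebra templates, distractors built from known wrong strategies) are easy to sketch. Everything below, including the particular wrong strategies, is my illustration rather than anything in Byrne:]

```python
import random
from fractions import Fraction

def make_linear_item(rng: random.Random) -> dict:
    """Generate a 'solve ax + b = c' item with distractors derived from
    common wrong strategies (all hypothetical, not from the paper)."""
    a = rng.randint(2, 9)
    x = rng.randint(1, 12)       # the intended answer
    b = rng.randint(1, 20)
    c = a * x + b
    wrong = [
        Fraction(c + b, a),      # added b instead of subtracting it
        Fraction(c - b),         # subtracted b but forgot to divide by a
        Fraction(c, a),          # ignored b entirely
    ]
    options = [Fraction(x), *{w for w in wrong if w != x}]
    rng.shuffle(options)
    return {"stem": f"Solve {a}x + {b} = {c}", "options": options, "answer": x}

item = make_linear_item(random.Random(42))  # reproducible draw of one item
```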
Future developments might include all the routine
work being done by computer, while the more subtle
material is provided by humans. For example,
computers were already used at the University of
Michigan for correcting simple errors of spelling,
or calculating readability [on a journalism
course] (60). This saves time and enables a
tutor to focus on higher skills. [There is a
lovely example of how issues have to be
operationalized so that computers can deal with
them, page 60—poor writing is defined in terms of
'sentence length, the number of polysyllabic
words, readability, the spelling of key words,
names and facts, and the number of clichés…
The variety of sentence structures associated with
dull, indirect and verbose writing include the
overuse of articles, adjectives and passive
verbs'. Presumably this is assembled as an
expert system, and as in all cases like this, I
feel rather ambivalent. It is clearly
reductive of the notion of poor or good writing,
but I suspect it is less reductive than the
working criteria used by ordinary academics when
they mark student work—and it is at least
explicit].
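[As a rough illustration of how such criteria reduce to countable surface features; the regexes, word lists and feature names are my guesses, not the Michigan system's:]

```python
import re

def writing_flags(text: str) -> dict[str, float]:
    """Crude counts of the surface features listed above; a sketch only."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())

    def syllables(w: str) -> int:
        # Vowel groups as a rough proxy for syllable count.
        return max(1, len(re.findall(r"[aeiouy]+", w)))

    polysyllabic = sum(1 for w in words if syllables(w) >= 3)
    articles = sum(1 for w in words if w in {"a", "an", "the"})
    # Naive passive cue: a form of "to be" followed by a word ending in -ed.
    passives = len(re.findall(
        r"\b(?:is|are|was|were|been|being|be)\s+\w+ed\b", text.lower()))
    n = max(1, len(words))
    return {
        "mean_sentence_length": len(words) / max(1, len(sentences)),
        "polysyllabic_ratio": polysyllabic / n,
        "article_ratio": articles / n,
        "passive_cues": passives,
    }
```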
There is also a question bank for trainee medics in England; the costs of its development are difficult to estimate, however.