The assessment problem
Assessment drives education - or to be more precise, summative assessment drives education.
Basic requirements of summative assessment
Summative assessment needs to meet a number of basic requirements (the essential criteria) in order to be viable/cost effective. It must:
- enable comparisons to be made between learners (eg Person X did/is better than Person Y)
- be scalable (ie be able to be applied to huge numbers of learners at a reasonable cost)
- provide concise outputs (ie provide 'results' in a format that makes it easy and quick to make comparisons across learners)
- be credible (people must believe that the comparisons that the assessment enables are meaningful/useful - this does not necessarily equate with assessment validity)
Ideally summative assessment should also meet the following desirable criteria:
- assess things that are relevant
- include a formative assessment element
- be an integral part of learning (ie not place too great an additional load on the learner)
- avoid demotivating learners (eg I failed eleven-plus, I am academically incapable)
The limitations with current assessment
At present we have a range of forms of assessment that meet the first batch of essential criteria quite well, but fail to meet the second batch of desirable criteria adequately.
Exams, for example, clearly meet the comparison, scalability and conciseness criteria and are seen as being credible (despite evidence that they do not actually provide the 'objective measure' that folk would like to claim). However, exams generally fail dismally in relation to the desirable criteria. Exams tend to focus on content rather than process. Exams also tend to adopt a behaviourist or constructivist view of learning (which focuses on what an individual can do in isolation) rather than a socio-cultural view of learning (which focuses on what individuals can do in a social context, including in collaboration with others).
Essays are quite similar to exams, though they are seen as being less credible (there is a greater recognition of their lack of reliability/validity) and are often less scalable (particularly in contexts where feedback is expected to be provided).
Other forms of assessment tend to do better against the desirable criteria, but fail to meet the essential ones. For example, portfolios may perform well in terms of assessing things that are relevant but are not scalable (if you don't believe me then try marking 200 portfolios in one batch - life is too short!).
One of the consequences of all of this is that much learning is not valued, because it is not assessed (in a way that meets the criteria set out above). Other consequences include:
- predefinition of the curriculum - because that helps with making 'credible comparisons'
- a limited focus on those things that current forms of assessment can handle
The problem that we need to crack is how to develop forms of assessment that meet all of the essential and all of the desirable criteria.
Utilising peer assessment may be the way forward. Wikied represents one idea about how this might work.
The use of portfolios - which are often cast as ePortfolios - appears to be seen by many people as a solution to the assessment problem. However, as suggested above, portfolios can be seen to fail to meet the essential criterion of scalability. Is it possible to overcome this problem? Can they become scalable if we simplify the marking process, or make the portfolio a form of continuous assessment that is verified by external examiners at the end?
The use of adaptive testing - Adaptive testing is testing in which the items presented alter according to performance on the preceding item(s). It uses statistical methods to make sure that each item is selected in relation to past performance. The entry level can vary: it can be a set level, from which items then become harder or easier as required, or one pre-selected by the learner. Those who take the assessment are scored on the level of difficulty they reach, not on how many items they get right. Adaptive tests can be used to assess groups or individuals.
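The adaptive idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real adaptive testing system is built: it swaps the statistical item selection for a simple up/down "staircase" rule (one level harder after a correct answer, one level easier after a wrong one), and the function names, the 1 to 10 difficulty scale and the simulated learner are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def run_adaptive_test(
    answers_correctly: Callable[[int], bool],
    start_level: int = 5,   # entry level: fixed, or pre-selected by the learner
    min_level: int = 1,     # assumed difficulty scale of 1 (easy) to 10 (hard)
    max_level: int = 10,
    num_items: int = 8,
) -> Dict[str, object]:
    """Simple staircase rule: harder after a correct answer, easier after a wrong one."""
    level = start_level
    history: List[Tuple[int, bool]] = []
    for _ in range(num_items):
        correct = answers_correctly(level)
        history.append((level, correct))
        if correct:
            level = min(max_level, level + 1)
        else:
            level = max(min_level, level - 1)
    # Score on difficulty, not on the number correct: average the levels of the
    # items presented in the second half, once the test has settled near the
    # learner's level.
    settled = [lvl for lvl, _ in history[num_items // 2:]]
    return {
        "estimated_level": sum(settled) / len(settled),
        "num_correct": sum(1 for _, ok in history if ok),
        "history": history,
    }


def make_learner(ability: int) -> Callable[[int], bool]:
    """Hypothetical learner who answers correctly only when the item level <= ability."""
    return lambda level: level <= ability


strong = run_adaptive_test(make_learner(8))
weak = run_adaptive_test(make_learner(3))
print(strong["estimated_level"], strong["num_correct"])
print(weak["estimated_level"], weak["num_correct"])
```

Both simulated learners sit the same number of items, but the comparison between them rests on the difficulty level each one settles at rather than on a raw count of correct answers. A real adaptive test would replace the staircase rule with a statistical model of item difficulty and learner ability.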
The field of Social Learning Analytics looks as though it offers great opportunities to address the assessment of many 'knowledge age skills'.
A way to assess creativity?
Here is a quote from Cowdroy, R & de Graaff, E (2005) Assessing highly-creative ability, Assessment & Evaluation in Higher Education Vol. 30, No. 5, October 2005, pp. 507–518
"A teacher cannot directly assess conceptualization nor schematization (they are cerebral, abstract, invisible), but can assess a student’s understanding of both the concept and its schemata and their place in the theory, philosophy and literature (of art, music or theatre) (Schooler et al., 1996; Cowdroy, 2000; Ramsden, 2003) as articulated by the student. This approach involves a significant double paradigm-shift, from teacher-derived criteria for examination of a work, to student derived criteria for assessment of the student’s understanding of his or her own concept in terms of the philosophical and theoretical frameworks of the relevant field of creativity (Crick & Cowdroy, 1998). This articulation authenticates the student’s creative activity, and assessment using this present-and-defend approach has therefore been called ‘Authenticative Assessment’ (not to be confused with authentic learning)." (p515)
Should we be assessing creativity???
For Further Consideration
Alternative world view
Of course it could be that the way forward is to change the starting premise.
One could frame the problem as being that assessment and accountability are too closely linked (at least in the UK). If we could break that link then the need to 'teach to the test' would be reduced for teachers.
Alternatively one could look for radically different measures of 'success' - for example, Libby Jared suggested that we might use 'the health of the community' as a measure of the success of a school. The learnometer website (http://www.learnometer.net/metric_feedback.html) provides an interesting set of metrics that seem to reflect some aspects of 'health of the community', including:
- maths performance
- towards 100% literacy
- staff applications
- staff longevity
- lower truancy
- (as in people smiling at each other or is this an acronym/ technical expression? --Mgaved 12:28, 28 January 2007 (GMT))
- I took it to mean people smiling at each other - but your guess is as good as mine. PeterT 13:01, 28 January 2007 (GMT)
- lower disaffection
- lower crime
- lower recidivism in prisons
- higher national income
- greater overseas role
- inward investment by high value companies
- greater residual value of school buildings (not sure about this one PeterT 16:12, 27 January 2007 (GMT))
- increased parental engagement
- kids who critique more
- kids who argue less(!)
- greater collaboration
- applied ingenuity
- engagement in science
- improving health
- social stability
- broad cultural understanding
- social equity
- equality of opportunity
- exam scores (only if we change what exams measure? PeterT 16:12, 27 January 2007 (GMT))
- % of university undergraduates/graduates
Assessment in the corporate world
(See Mark's post in the forum)
The Kirkpatrick model of evaluation for training highlights a number of levels of impact of training:
- Level 1 - learner's reaction to the training
- Level 2 - learner's acquisition of the skills and knowledge
- Level 3 - extent to which learner uses those skills and knowledge in the workplace
- Level 4 - benefits to the organisation
This could translate into:
- What was the learning experience like - as seen from the learner's point of view and also from a learning professional's point of view (eg reflection/feedback/evaluation)
- Did the learner acquire the expected skills and knowledge (eg assessment)
- Did the learner get to use those skills and knowledge in their life (eg CV/passport)
- Did society benefit (tough one)
Cambridge Assessment - formerly University of Cambridge Local Examinations Syndicate (UCLES) - website includes a discussion forum on assessment issues (which you have to register for - follow link to Professional Development). http://www.cambridgeassessment.org.uk/
Critical Issue: Using Technology to Improve Student Achievement - a useful review of some of the issues surrounding assessment, and the use of ICT. http://www.ncrel.org/sdrs/areas/issues/methods/technlgy/te800.htm
The learnometer project - is exploring different metrics to demonstrate the effectiveness of (financial) investments in education. http://www.learnometer.net/