Assessment and the black swan problem

Black swans, by Hans (CC0)

We spend a lot of time trying to work out whether or not pupils have understood what we've taught them, and what they know. But we should also be flipping this approach, and trying to find out what they don't know.

So what do swans have to do with it? If we take the proposition, 'All swans are white', and then try to prove that by observing as many white swans as possible, we will get nowhere fast. We may feel more and more confident that the proposition is correct, but we will never know with absolute certainty.

What does the evidence say?

We might even go so far as to say, 'We have found no evidence that black swans exist, therefore they don't'. But then we would be falling into the trap of believing that absence of evidence is equivalent to evidence of absence.

It's worth bearing this in mind, especially when you observe what pupils do in lessons. Just because they don't display certain kinds of behaviour, such as testing their code or creating a video, that doesn't mean they cannot do it. In other words, absence of evidence is not evidence of absence.

Back to the swans. As I said, you can't be certain that all swans are white just because you've seen lots of white swans. However, you'd only need to see one black swan to know that the original proposition is false.

In the same way, a pupil giving you the correct answers and demonstrating the correct behaviour isn't an absolute guarantee that they have mastered the topic. But one mistake or piece of incorrect reasoning would suffice to show that they haven't.

The conclusion is that we ought to be trying our best to push pupils to the limit of their knowledge, and beyond. As soon as they fail or flounder, you know that the limit of their knowledge is the stage or level just before that.

The Peter Principle

An analogous idea is the Peter Principle, which states that people are promoted to their level of incompetence. The way this works is that people are promoted on the basis of their ability to perform their current role. Sooner or later, in the absence of training or a miracle, they will be promoted to a role in which they can't perform very well. It then becomes clear that their highest level of competence was in the role before their most recent promotion.

You can apply this principle in a school setting. For example, you could create a list of increasingly difficult questions, with the last few being ones you haven't covered in class yet.

Another way of doing this is to simply throw a problem at the students, and see how they deal with it. For example, when I taught Economics, I set my students a deceptively simple problem: How can we get rid of unemployment?

The correct answer, of course, is 'If I knew that, I wouldn't be sitting here studying Economics', but leaving that aside it's a good question because there are innumerable ways of dealing with unemployment, none of which can be said unequivocally to work, which means there is no absolutely right or wrong answer.

Another example is a problem I set when I was teaching computing. The students had to create a system that would enable a shopkeeper to automatically apply discounts or add charges according to differing conditions. We'd never covered anything like this before, so there was an incredibly vibrant atmosphere in the class, with students researching different approaches, testing their ideas in different ways, and experimenting with alternatives.
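To give a flavour of the task, here is a minimal sketch in Python of the kind of solution students might arrive at. The specific rules (a bulk discount, a loyalty discount, a small-order charge) are my invention for illustration; the original task left the conditions open.

```python
# A rough sketch of the shopkeeper problem. The rules below (bulk
# discount, loyalty discount, small-order charge) are invented for
# illustration; the original task left the conditions open.

def final_price(subtotal, quantity, loyalty_member):
    """Apply discounts or charges to a subtotal, depending on conditions."""
    price = subtotal
    if quantity >= 10:        # bulk orders get 10% off
        price *= 0.90
    if loyalty_member:        # loyalty-card holders get a further 5% off
        price *= 0.95
    if subtotal < 5.00:       # very small orders incur a handling charge
        price += 0.50
    return round(price, 2)

print(final_price(20.00, 12, loyalty_member=True))    # 17.1
print(final_price(3.00, 1, loyalty_member=False))     # 3.5
```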

In her excellent book Mathematical Mindsets, Jo Boaler reports on research showing that when students are set an applied maths problem before being shown how to solve it, and given time to work on it, they perform significantly better than students who are taught the theory first and then asked to apply it. The reason given is that the students are more curious, and therefore more primed to learn, when the teacher finally explains how to solve the problem.

Why is the black swan important?

When a pupil gets something wrong, that can reveal a lot about their understanding, or lack of it. For example, in Economics, you can show students a photo of a high street at night, in which the only light apart from the streetlights is the glow emanating from the 7/11 store: everything else is in darkness. 

Ask the students how many shops there are in the photo. The correct answer is one: the 7/11 store. Why? Because a shop is somewhere that selling and buying take place. If the shop is closed, nobody can sell or buy anything. Therefore, strictly speaking, it's not a shop as far as Economics is concerned. If a student insists that there are x shops in the photo, then you can start to explore what they understand a shop to be. In Economics terms, a shop is a market, and a market is where buyers and sellers can come into contact with each other. So the classified pages in a newspaper constitute a market, as does a university noticeboard advertising rooms to let.

In Computing, a good black swan question, I think, is: 'Why is Python a programming language whereas HTML is not?' If a student can't answer that question then you would be right to dig deeper to find out if she has really understood the concept of programming. She might produce fantastically brilliant programs, but she might still not 'get' the idea, in much the same way as someone might produce stunning web pages using a WYSIWYG web editor without understanding what's going on behind the scenes.
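One way of unpacking the answer, as a rough illustration of my own rather than part of any scheme of work: Python can express decisions, repetition and changing state, whereas HTML can only describe the structure of a page.

```python
# Python expresses computation: decisions, repetition, changing state.
total = 0
for n in range(1, 11):
    if n % 2 == 0:        # a decision the language itself makes
        total += n
print(total)              # 30: the sum of the even numbers up to 10

# HTML, by contrast, only describes structure. It has no loops,
# no conditions and no variables; the browser just renders it.
html = "<ul><li>2</li><li>4</li><li>6</li><li>8</li><li>10</li></ul>"
```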

The accidental black swan

Occasionally you can discover conclusive proof that a pupil doesn't 'get' something, despite all the 'evidence' demonstrating that they do. This happened to me with a girl in Year 8 (13 years old) who, I thought, was an absolute whiz at spreadsheets. And believe me, I was one of those teachers who walked around the class, chatted to kids, and asked them to show me what they had done and why, and I still missed the fact that this young lady honestly hadn't understood the point of spreadsheets at all.

How did I find out? One day I saw her using her calculator. I assumed that she was checking her spreadsheet answers, to make sure she'd devised the correct formulae. But when I looked closely, it turned out that she was calculating the answers and then typing them into the spreadsheet. In other words, she hadn't worked out that the whole point of a spreadsheet is to be able to model different scenarios by changing variables. So she didn't understand the idea of variables either. Her using a calculator was a very good, but accidental, black swan.
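The point she had missed can be shown in miniature. A spreadsheet formula is essentially a function of variables: change an input and everything that depends on it recalculates. A rough sketch of that idea in Python, with invented figures:

```python
# What a spreadsheet formula really is: a function of variables.
# Change an input 'cell' and the dependent 'cell' recalculates.
# The figures are invented, purely for illustration.

def monthly_profit(units_sold, unit_price, fixed_costs):
    revenue = units_sold * unit_price
    return revenue - fixed_costs

# Modelling different scenarios is just re-running the formula...
print(monthly_profit(100, 2.50, 180))   # 70.0
print(monthly_profit(120, 2.50, 180))   # 120.0

# ...whereas typing in answers worked out on a calculator throws
# that power away: nothing recalculates when a variable changes.
```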

Conclusions

It's important to devise questions and tasks that allow pupils to demonstrate their lack of understanding or misunderstanding. Once they get the wrong answer or go down the wrong path, a conversation needs to take place, starting with "Why did you decide to do that?"

By doing so you can (hopefully) get beneath the surface and find out what their reasoning was. From that, you can then work out, or at least strongly infer, what their fundamental misconceptions are.

Where does Bloom come in?

In their paper Developing a Computer Science-Specific Learning Taxonomy, Fuller et al. propose a two-dimensional matrix version of Bloom's taxonomy. You can use the model to represent, amongst other things, the scenario in which a pupil creates decent programs without understanding or remembering how to program. They could do this by using trial and error.

So you can imagine that in a classroom situation, you might see the end result, but miss the trial and error process. You would then come to the conclusion that the pupil is good at programming, but you'd be wrong.

Another possibility is that a pupil is brilliant at stating computing concepts, and can explain with great clarity what they are, without being able to channel that knowledge and understanding into writing decent code.

The critic problem

An example I like to use is this: you can't tell whether or not someone is good at computing just because they're good at debugging code. I'm very good at saying what's good or bad about a CGI movie, but I'm not convinced that qualifies me to actually produce one. But equally, a brilliant producer may make a poor critic.

The bottom line

When it comes to assessing ICT or Computing, you need to find out what the pupils don't know. To do so you must:

  • use a variety of methods of assessment, ranging from test papers to practical tasks
  • set open-ended work as far as possible, to give pupils the opportunity to demonstrate the limits of their knowledge, understanding and skills
  • constantly be on the lookout for the black swan
  • be aware of the danger of assuming that more and more evidence that a pupil 'gets' it amounts to certainty that he does.

At the time of writing, my next one-day training course in assessing ICT and Computing takes place on 23rd March in London. For details, please go to the courses page.

To find out what previous participants thought about the course, please go to the Course Testimonials page.