Saturday 30 April 2022

Talking about Statistics. Part Two.

In Part One we found that "on average" is not the same as saying, "generally". So you can't say, "Generally girls are better at languages and boys are better at maths."

My natural aptitude didn't stretch to labelling the axes correctly.

If you take two populations, it's unlikely they will be exactly the same. So there will be a difference in the average. But it doesn't mean that "generally" one group is different to the other. Generally almost everybody is in the overlap.
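To put a rough number on "almost everybody is in the overlap", here is a minimal sketch using Python's standard library. The two groups, their averages of 100 and 103, and the spread of 15 are all made-up figures for illustration, not real test data, and it assumes the scores follow a bell curve.

```python
from statistics import NormalDist

# Hypothetical scores: same spread, averages 0.2 standard deviations apart.
group_a = NormalDist(mu=100.0, sigma=15.0)
group_b = NormalDist(mu=103.0, sigma=15.0)

# Shared area under the two curves (0.0 = no overlap, 1.0 = identical).
overlap = group_a.overlap(group_b)

# Chance that a randomly chosen member of the lower-average group
# outscores a randomly chosen member of the higher-average group.
difference = NormalDist(
    mu=group_a.mean - group_b.mean,
    sigma=(group_a.stdev ** 2 + group_b.stdev ** 2) ** 0.5,
)
a_beats_b = 1.0 - difference.cdf(0.0)

print(f"overlap: {overlap:.0%}, A beats B: {a_beats_b:.0%}")
```

With these invented numbers the two curves share over 90% of their area, and a pupil from the "lower" group outscores a pupil from the "higher" group well over four times in ten. There is a difference in the average, and it tells you next to nothing about any one person.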

In fact, tiny differences, carelessly exaggerated, can have a huge effect on young people and the direction they take. We see it in school with siblings where one gets labelled "the sporty one" or "the clever one" because of some insignificant early success. A small distinction that gets fixed and ends up determining their destiny.

For the Social Mobility Tsar and society in general to peddle supposed differences between boys and girls as being "natural" does have consequences for pupils' image, choices, ambitions and lives. Of course it does. And it's our job to liberate young people from these pressures and limitations.

And it's not just in gender aspirations that the misuse of statistics reaches into our profession and damages pupils' lives.

Here's the same curve, back in the 80s, showing grading in O Level and CSE Languages. As you can see, very few people got As and Bs at O Level. The "average pupil" was barely catered for. And a large number of pupils were not catered for at all.

Today, GCSE grades are pegged to a curve. The number of grades on offer is determined in advance to ensure standards are maintained year on year. If a year group's KS2 SATs results are higher or lower than normal, then the number of each grade can be tweaked. But it's still determined in advance.

Pupils are given predicted grades based on their KS2 English and Maths results. Schools can choose whether to use predicted grades that would put them in the top, say, 25% of schools. And in some ambitious schools these are labelled as "minimum target grades." Whereas if it's a top 25% target grade, it would seem more realistic to tell pupils they have a one in four chance of getting that grade! Of course, across the whole cohort, the predicted grades and the actual grades match up. Because that's how they determine how many grades to give out! But for any one individual pupil, it is not statistically valid to say that their "target" means anything at all.

Pupils are confused by the grades because their teachers can't really tell them what they mean. Especially when pupils learn that their GCSE drama target grade is based on their English and Maths exam 5 years previously. If you believe in "ability" then the KS2 SATs could be taken as a proxy for that. Or if you believe in step by step measurable progress, then you can take the KS2 SATs as a baseline for that. But really there's no justifiable link at the level of the individual pupil. This is where the system starts to bring itself into disrepute.

This came to a head in the 2020 scandal over the algorithm used to determine GCSE and A Level grades. Teachers handed in grades based on their knowledge and assessment of pupils, along with a rank order. Then Ofqual manipulated those grades using an algorithm. Ofqual's statisticians know their stuff. And they have oodles of data on schools and pupils. They could have done a brilliant job.

I imagined something, at the very least, like the following: I hand in my Spanish Centre Assessed Grades. Ofqual look at my school's Spanish pupils for the last few years. Ofqual look at the KS2 baseline of my pupils over the last few years, and see how much progress pupils from different starting points tend to make in Spanish. Ofqual then look at the KS2 profile of the current (2020) cohort, and see if my assessed grades are consistent with the progress you would expect such pupils to make in my school in my subject. This at the very least, if not something more sophisticated and beyond what I could come up with.

They did not. What they did was look at the previous year's grades and my rank order. Then they ignored my assessment of the pupils. And more or less issued them last year's pupils' grades. I could see this with my own pupils. And there was the famous case of the school that had previously entered the whole cohort for a science exam, with pupils scoring grades from 9 to 1. But in 2020 they only entered a small number of pupils for Foundation Tier. Ofqual duly awarded them the previous year's grades, spreading the grades from 9 to 1, even though Foundation candidates couldn't have got higher than a 5.
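To see how that Foundation Tier result can fall out of the approach I've just described, here is a toy sketch. It is emphatically not Ofqual's actual model, just my simplified description turned into a few lines: take last year's grades, take this year's rank order, and hand last year's spread to this year's pupils. The function name, the pupil names and the numbers are all invented for illustration.

```python
def allocate_by_rank(previous_grades, ranked_pupils):
    """Give this year's ranked pupils the shape of last year's grades.

    A toy version of the behaviour described in the post, not the real
    2020 algorithm. ranked_pupils is listed best first.
    """
    ordered = sorted(previous_grades, reverse=True)  # best grade first
    n_prev, n_now = len(ordered), len(ranked_pupils)
    result = {}
    for position, pupil in enumerate(ranked_pupils):
        # Map the pupil's relative rank onto last year's sorted grades.
        index = position * n_prev // n_now
        result[pupil] = ordered[index]
    return result


# Hypothetical data: last year a full cohort scored from 9 down to 1.
last_year = [9, 8, 7, 6, 5, 4, 3, 2, 1]
# This year only three Foundation Tier pupils are entered.
this_year = ["pupil_a", "pupil_b", "pupil_c"]
FOUNDATION_CAP = 5  # highest grade available on the Foundation paper

grades = allocate_by_rank(last_year, this_year)
impossible = [p for p, g in grades.items() if g > FOUNDATION_CAP]

print(grades)
print("Grades the paper could not have awarded:", impossible)
```

Nothing in the allocation ever looks at the teacher's assessed grade or at the tier of entry. The rank order decides who gets which grade, and last year's results decide which grades exist.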

This was not an accident. I said the statisticians at Ofqual know their stuff. And they do. They modelled the algorithm and knew that it meant one in three grades was wrong. So at A Level, with three subjects, one grade per student: wrong. And at GCSE, with nine subjects, 3 grades per pupil: wrong. On average. So if one pupil got all the correct grades, that meant someone else was getting 6 incorrect grades. But they did it. Why? Because Ofqual produce the results they are instructed to produce. And the priority they were given was to avoid "grade inflation." And by issuing the 2019 grades to the 2020 pupils they made absolutely sure they met this brief. The government's priority was to give out the right number of grades. But not necessarily to the right pupils.
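The arithmetic behind those figures is simple enough to write out. The one in three error rate is the figure quoted above; the three A Levels and nine GCSEs per pupil are my assumption about a typical entry, chosen because they are the numbers that make the sums come out as stated.

```python
from fractions import Fraction

ERROR_RATE = Fraction(1, 3)  # "one in three grades was wrong" (from the post)
A_LEVEL_ENTRIES = 3          # assumption: a typical student sits three A Levels
GCSE_ENTRIES = 9             # assumption: a typical pupil sits nine GCSEs

# Average number of wrong grades per pupil.
wrong_a_level = ERROR_RATE * A_LEVEL_ENTRIES
wrong_gcse = ERROR_RATE * GCSE_ENTRIES

# If one pupil of a pair gets no wrong grades, the other has to carry
# both shares for the average across the pair to hold.
lucky_pupil_wrong = 0
unlucky_pupil_wrong = 2 * wrong_gcse - lucky_pupil_wrong

print(wrong_a_level, wrong_gcse, unlucky_pupil_wrong)
```

An average of three wrong grades does not mean everyone gets exactly three. For every pupil who escapes, someone else is carrying their share.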

This brings us to the heart of the destructive role of statistics in exams. Their role in "accountability" for teachers and for schools.

Why could Centre Assessed Grades not be trusted? Because we have a high-stakes system where teachers and schools are judged and ranked by results. A system where teachers are pressured and pass that pressure on to pupils. So that schools can be ranked for messianic saviour politicians to criticise and rescue with their heroic initiatives. In fact, Centre Assessed Grades would be expected to be slightly inflated. Not because each individual pupil would be given grades they didn't deserve, but because the pupils who could have a disaster in an exam (and by dropping to a U would have a disproportionate effect on the class's statistics) would be more likely under teacher assessment to get a grade that reflected their level.

Of course it all went wrong. When the wrong pupils were obviously being given the grades, the government had to U-turn. Which meant grades were inflated. Bringing the system into very public disrepute.

The whole argument for ranking schools is morally bankrupt. The argument is that competition between schools drives up standards. And yet, while standards are supposedly being driven upwards, GCSE results have to be held down to avoid grade "inflation." If the narrative of high stakes accountability and competition is that standards are going up, then holding grades down is a form of real terms grade deflation. Devaluing pupils' achievement. Except it's not really about the pupils.

Everyone knows that KS2 SATs are not a qualification for the pupil. They are a school performance measure. But schools, parents, pupils are all made to feel the pressure of their individual performance. It's dishonest and harmful. Some parents think the "ebacc" is a qualification. And ask for their certificate. It's a manipulative school accountability measure. But it's sold to pupils and parents as something they can gain. It's fraudulent. The thing is, GCSEs are not much different. They are really much more for politicians posturing over school "accountability" than they are for the pupils. But the pupils aren't supposed to know this.

I will return to this in the context of 2022, after a slight diversion into grades in MFL.

Ofqual know that grades in MFL at GCSE and A Level are not aligned with other subjects. But it's not their brief to make sure they are. It's Ofqual's brief to make sure standards in each subject stay the same year on year. Within this brief, they have tried to do what they can to move MFL results more into line with other subjects. They looked at the fact that for an elite subject with a high number of A grades, A Level languages had a dearth of A*s. They looked at the effect of native speakers on A Level grades. At GCSE they looked at how levels matched up with European pupils, in a desperate attempt to find something in their legal brief (international comparability of standards) that would allow them to intervene. In the end they were allowed a small tweak to French and German on the grounds that the grading was causing a crisis. Spanish was deemed not to be in crisis, so although it shares the severe grading, it wasn't changed. The evidence and the statistics take second place to politics and the posturing of standards.

Now to 2022 and GCSEs. Pupils are going to take exams. But something is happening. There are pupils who are quietly and without confrontation just losing the will to do well. There are several reasons we can find for this. Firstly, the current Year 11 were halfway through Year 9 when the pandemic closed schools, and now find themselves taking exams. Exams many pupils may well feel they are not prepared for. Secondly, they have been doing "contingency assessments." These are not like mocks. When pupils do mocks, everyone knows that they haven't finished the course, that they will continue to make progress. Contingency assessments are not like this. When you give a pupil a contingency assessment in February, they know that the grade could end up being their GCSE grade. So that means that a pupil who wants to do well has to revise all their subjects for a crucial GCSE four months early. When they haven't finished the course. All teachers can do is tell pupils not to worry, "It doesn't matter." We are telling our pupils it doesn't matter. We are telling our pupils not to revise. We are setting our pupils exams they can't be ready for. We are breaking our pupils. Or breaking their belief in what we are asking them to do.

The thing is, this always happens. With some pupils. They can see the impossibility or can see clearly that they will be amongst the pupils predestined by the system to fail. But this year it is happening to many more pupils, and pupils across the grade range.

At best, they are saying, "It's OK. I'll just get a grade 5. Why should I work for a 9 anyway?" At worst they are giving up completely. Or are broken by aiming for something they can't achieve. Or something that has been revealed as a fraud, a fiddle, a confidence trick. Political shenanigans, statistical manipulation, high-stakes target culture. All passed on to the pupils, for something that had an illusion of value, which has been shattered by the government's desperate attempts to keep us all fooled.


