Margaret Wu’s denunciation of the intended “like with like” school performance data, in a letter to The Age, looks low-key but marks a very interesting (and very public) fissure in the professional education and research landscape. Readers will be used to outsiders and polemicists attacking every aspect of the education system, but it is very rare for anyone within the research and policy community to put their head above the rampart and make such a public criticism of their colleagues.
Professor Wu is a top-flight psychometrician (that’s a statistician who specialises in educational measurement) who has worked for many years at the Australian Council for Educational Research (ACER) and the University of Melbourne. ACER is the developer of the NAPLAN tests and also provides the psychometric analysis of the results that will end up as the publicly reported data on school performance.
The Professor Barry McGaw she mentions is a former executive director of ACER, a professorial fellow at the University of Melbourne, and chair of the Australian Curriculum, Assessment and Reporting Authority (ACARA), which “owns” the NAPLAN tests and is responsible for their publication.
What Professor Wu is saying is that, in her opinion, the nature and quality of the NAPLAN tests make them unsuitable for the use Julia Gillard wants to put them to (that is, comparing school performance), a use that Professor McGaw, among others, has agreed to implement. In Wu’s words, this multimillion-dollar assessment and reporting process is generating data in which any link between student performance and school performance is “at best a conjecture”.
Wu’s criticism is based on her assessment of the statistical “precision” of the NAPLAN tests. She claims that the large margins of error of the tests mean that comparisons within a “like schools” group (that is, schools with similar student populations) will be meaningless, and that only comparisons between high-performing, wealthy schools and low-performing, poor schools will be statistically valid.
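To see why large margins of error matter, here is a minimal illustrative sketch in Python. The score gap, standard deviation and cohort sizes are invented for the example; they are not drawn from NAPLAN data or from Wu’s analysis. With a small true difference between two “like” schools and typical cohort sizes, the confidence intervals for their mean scores overlap, so the ranking between them tells us nothing reliable.

```python
# Illustrative sketch only: the numbers below are made up for the example,
# not taken from NAPLAN or from Professor Wu's analysis.
import math

def ci_95(mean, sd, n):
    """Approximate 95% confidence interval for a school's mean score."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Two "like" schools: a small apparent gap relative to the spread of scores.
school_a = ci_95(mean=492.0, sd=70.0, n=60)   # 60 students sat the test
school_b = ci_95(mean=498.0, sd=70.0, n=55)   # 55 students sat the test

print(f"School A 95% CI: {school_a[0]:.1f} to {school_a[1]:.1f}")
print(f"School B 95% CI: {school_b[0]:.1f} to {school_b[1]:.1f}")

# Overlapping intervals: the data cannot tell us which school is "better".
overlap = school_a[1] >= school_b[0] and school_b[1] >= school_a[0]
print("Intervals overlap, so the ranking is not statistically meaningful:", overlap)
```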
Wu does not raise the other major academic criticism of the school reporting data — that the variation in NAPLAN scores is greater within schools (i.e. from Class A to Class B) than between schools (i.e. School X v School Y). In other words, in any given school that overall scores higher than another school, there will be classes that perform worse than the best classes in the “worse” school. No doubt this is related to the often-reported trumping effect of individual teacher quality.
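The within-school versus between-school point can be made concrete with a few invented class averages (the figures below are illustrative only, not real NAPLAN results). A school can rank higher overall even though some of its classes fall below the best classes of the lower-ranked school, because the spread between classes inside a school is larger than the gap between the schools’ averages.

```python
# Illustrative sketch with invented class-average scores.
from statistics import mean, pvariance

school_x = {"X1": 510, "X2": 470, "X3": 455}   # class averages (made up)
school_y = {"Y1": 495, "Y2": 460, "Y3": 440}

mean_x, mean_y = mean(school_x.values()), mean(school_y.values())
print(f"School X mean {mean_x:.0f} beats School Y mean {mean_y:.0f}")

# Yet the best class in the "worse" school outscores two classes in the "better" one.
print("Best class in 'worse' School Y:", max(school_y.values()))
print("Weakest class in 'better' School X:", min(school_x.values()))

# Within-school spread dwarfs the between-school gap.
within = mean([pvariance(school_x.values()), pvariance(school_y.values())])
between = pvariance([mean_x, mean_y])
print(f"Average within-school variance: {within:.0f}; between-school variance: {between:.0f}")
```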
It is fair enough for education bureaucrats (for whom the NAPLAN tests were designed) to ignore this local variability, but any parent who chooses School A because it did better than School B on the NAPLAN tests is likely to have their child end up in a class that is worse than the class the child would have ended up in at School B. So much for parent choice.
Compare Wu’s brief but devastating intervention with a lengthy article by Dr Ken Boston in this month’s Teacher magazine (an ACER publication — just to keep it all in the family!), which is an edited extract of a speech he gave at the Australian Primary Principals Association in August.
Boston, who was chief executive of the Qualifications and Curriculum Authority in England (and previously director general of Education in NSW and South Australia and an ACER board member), makes the point that NAPLAN was never designed to provide population statistics but rather individual diagnostic information. He goes on to explain some of the statistical and interpretive weaknesses of the approach and to detail just how damaging the Key Stage tests have been in England. Interestingly, he is a supporter of the OFSTED approach, which was favourably described by Professor Tony Taylor in a recent Crikey article.
Margaret Wu’s public intervention in the debate is timely and devastating for proponents of such an approach. Make no mistake, Margaret Wu is not Angelo Gavrielatos, Mary Bluett or Kevin Donnelly — she is one of a handful of people in the country who really understand this stuff.
It’s fairly obvious why Gillard wants to be seen to be promoting such retrograde steps as school league tables, but why is a career academic with an international reputation, such as Barry McGaw, providing intellectual cover for her?
Do I understand correctly? A university statistics professor thinks that the proposed league tables are the wrong ones. Apparently this is “devastating”.
So let’s leave academic la-la land for a second:
Firstly, does this professor propose a better approach? If not, let’s go with the “bad” league tables.
Secondly, Professor who? I know a Prof who is apparently the leading Australian expert on sexual politics. Shall we seek her permission before we ask someone on a date? Pu-lease.
In explaining that there is a difference in results between classes in the same school, your article claims this is “no doubt” due to “the often-reported trumping effect of individual teacher quality.”
Surely there is doubt. It could be as simple as a school having graded classes within the one year.
The article is correct that Professor Wu IS one of the few people who understand the complexities of statistics in educational measurement. League tables of schools based on forty multiple-choice questions are simply invalid.
Fritz, I’m sorry if this is a dumb question. But I’m really interested in the answer.
Why is it obvious that Julia Gillard wants to be seen to be promoting such retrograde steps as school league tables?
“in any given school that overall scores higher than another school, there will be classes that perform worse than the best classes in the “worse” school. No doubt this is related to the often-reported trumping effect of individual teacher quality.”
There is doubt. It’s widely understood that when dealing with human populations there is typically more variation within a group than between groups. For example, the average wage in NSW is higher than the average wage in Tasmania. Only a fool would extrapolate that everyone in NSW earns more than everyone in Tasmania, or would be surprised that the extrapolation is false.
“It is fair enough for education bureaucrats (for whom the NAPLAN tests were designed) to ignore this local variability, but any parent who chooses School A because it did better than School B on the NAPLAN tests is likely to have their child end up in a class that is worse than the class the child would have ended up in at School B. So much for parent choice.”
Is there any logical basis for this statement? School A has higher outcomes than School B – based on an imperfect comparison, but the only one we currently have – so if I choose School A then I am likely to have my child end up in a class that is ‘worse’ than their corresponding School B class. This is a non sequitur, as are your closing comments.
C-, must try harder.