April 30th, 2020

A rating system tilted against Hub schools

State’s rubric leaves little room for improving student scores

Alain Jehlen

There are 34 Boston Public Schools rated in the bottom 10 percent of the state’s rankings. BANNER PHOTO

State law gives the Commissioner of Elementary and Secondary Education wide discretion in deciding whether to take partial or complete control of a school or school district.

But it certainly helps justify intervention if the commissioner can point out that the school or district has earned bad marks in the state’s supposedly scientific and objective rating system.

Boston has 34 schools (out of about 125) that rank in the bottom 10 percent in the state. BPS as a whole is 14th from the bottom out of 289 districts. Why is it rated so low?

One major reason is that the rating system was designed in a way that almost automatically puts Boston and other urban centers with large numbers of low-income students and recent immigrants at the bottom.

Here’s how it works: The state rates schools and districts mostly according to test scores. But there are two ways to use those scores, and state officials picked the one that makes urban areas look worse.

Two ways to rate schools using test scores: score level vs. growth

When social scientists want to compare how well schools prepare their students for tests — for example, charter schools versus district schools — they use “growth” scores. “Growth” means how fast students are learning the skills that are tested. If a student has a big jump between grade four and grade five, that’s high growth. The school that the student attends probably had something to do with that, although there are other factors.

The other approach is to use the actual score. The score shows the effects of everything that’s happened to a student in that student’s entire life. The student’s current school is only a small part of that.

Say a student comes into a school and gets a 40 on a standardized test. The next year, the student scores a 50, a growth of 10.

At another school, a student comes in scoring 80. A year later, the student scores 81, a growth of one.

Which school was more effective?

Obviously, the school with more growth. But the current state rating system would give the second, slow-growth school a higher grade.
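To make the arithmetic concrete, here is a minimal sketch of the two-school example. The school names are placeholders, and measuring growth as a raw point gain is a simplifying assumption for illustration; the state's actual growth measure may be calculated differently.

```python
# Scores from the two examples above. "School A" and "School B" are
# placeholder names, not real schools.
schools = {
    "School A": {"start": 40, "end": 50},
    "School B": {"start": 80, "end": 81},
}

def growth(scores):
    """Points gained between the two test years (assumed growth measure)."""
    return scores["end"] - scores["start"]

def level(scores):
    """The most recent score, regardless of where the student started."""
    return scores["end"]

# Pick the "best" school under each way of using the scores.
best_by_growth = max(schools, key=lambda name: growth(schools[name]))
best_by_level = max(schools, key=lambda name: level(schools[name]))

print("Best by growth:", best_by_growth)
print("Best by level: ", best_by_level)
```

The same two schools trade places depending on which measure is used, which is the whole dispute in miniature.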

Or imagine a school that has high-scoring, English-fluent students. One year, the school has an influx of bright, fast-learning immigrant students who are still learning English. The school still has high growth scores because the students are learning fast. But the score level plunges because the new students aren’t as fluent in English as last year’s students.

Child poverty and test scores

Other aspects of a child’s life, besides not speaking English, can also lead to low test scores. One is poverty. Students who are poor have extra obstacles to overcome if they are to perform as well as children from homes that are economically comfortable.

Many children from low-income families do manage to break through the barriers and score high. But if you compare the scores of 1,000 low-income children and 1,000 high-income children, the individual variations within each group wash out. The average for high-income children is likely to be higher than the average for low-income children, even though they have the same range of innate potential.
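The "wash out" effect can be sketched with a small simulation. Every number here is invented for illustration: the group averages of 50 and 60, the spread of 15, and the bell-curve shape are all assumptions, not real test data.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented assumption: both groups have the same spread of individual
# variation, but outside obstacles pull the low-income group's scores
# down by 10 points on average.
SPREAD = 15
low_income = [rng.gauss(50, SPREAD) for _ in range(1000)]
high_income = [rng.gauss(60, SPREAD) for _ in range(1000)]

low_avg = statistics.mean(low_income)
high_avg = statistics.mean(high_income)

# Low-income children who score above the high-income average anyway.
breakthroughs = sum(score > high_avg for score in low_income)

print(f"Low-income average:  {low_avg:.1f}")
print(f"High-income average: {high_avg:.1f}")
print(f"Low-income children above the high-income average: {breakthroughs}")
```

Under these assumptions, plenty of individual low-income children outscore the typical high-income child, yet the group averages still come out in the same order every time.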

Growth scores are influenced by poverty, too, but not as much as the score levels because what happens in school has more effect on growth.

Where’s Boston?

The new state audit of the Boston schools breaks down MCAS growth scores for English and math for every grade that has growth scores. Most of the growth scores are a bit below the state average. A few are above.

But in the state ranking system, Boston is in the bottom five percent of districts: 14th from the bottom out of 289.

Why does Boston rate so low?

The reason is the formula used to calculate the rankings. It has gotten more complicated in recent years but most of it is still test scores: one-fourth growth score and three-fourths score level.

That’s like a formula to judge runners’ speed that’s one-fourth how fast they’re moving and three-fourths where along the road they are at the moment.

If that makes no sense to you, it’s because it doesn’t make sense — at least, not if you’re trying to measure what the runner is doing, or in the case of the state rating system, what schools are doing.
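Here is a rough sketch of how a one-fourth, three-fourths weighting plays out. It covers only the test-score part of the formula, and it assumes both inputs sit on the same 0-to-100 scale; the function name and the sample numbers are hypothetical, not taken from the state's rubric.

```python
# Weights described above: one-fourth growth, three-fourths score level.
GROWTH_WEIGHT = 0.25
LEVEL_WEIGHT = 0.75

def composite(growth_score, level_score):
    """Weighted test-score rating (hypothetical helper; assumes both
    inputs are on the same 0-100 scale)."""
    return GROWTH_WEIGHT * growth_score + LEVEL_WEIGHT * level_score

# Hypothetical school whose students learn fast but started far behind.
fast_low = composite(growth_score=80, level_score=30)

# Hypothetical school whose students learn slowly but started far ahead.
slow_high = composite(growth_score=30, level_score=80)

print("Fast growth, low level: ", fast_low)
print("Slow growth, high level:", slow_high)
```

With these weights, the fast-growth school would need a far larger growth advantage than its level deficit to come out ahead, because each point of level counts three times as much as a point of growth.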

If the state officials stuck to growth scores, they would still only be paying attention to a narrow slice of the skills and knowledge that students need to cope with life. Test scores are poor predictors of a student’s future success.

But at least the state would be looking at something that happens in school.

So why would state officials design a system for finding “underperforming schools” that is such a poor measure of school performance?

When pressed, state officials sometimes say schools with students who score low need extra help even if they’re not bad schools, and a low rating triggers extra state help.

But schools could be given extra help without being labeled “underperforming.” For example, Title I, a federal program that started during the War on Poverty, sends money to schools that have large numbers of low-income children without suggesting the schools are bad schools.

The state’s new Student Opportunity Act promises to give much more money to districts with high numbers of low-income students. It does not assume the schools in those districts are underperforming.

Does the “help” fit the problem?

Also, the way the state “helps” schools with low ratings only makes sense if the school itself is the problem. The state (1) takes over decision-making, (2) fires the principal and most of the teachers, (3) suspends the teachers’ union contract and (4) provides extra money, but only for three years.

Three years is supposed to be enough time for a “turnaround.” But if students score low because they are homeless or face other obstacles due to poverty, those problems are unlikely to go away in three years. That helps explain why the record of state intervention is spotty at best. Schools sometimes do better while the money flows and extra staff are hired, but their scores sink back down when the money runs out.

Federal Title I aid keeps flowing as long as the school has low-income students. The money from the new state funding law will also keep coming if the legislature and the governor fund it.

Officials also sometimes argue that growth scores change from year to year more than score levels. That’s true, but would you determine a person’s weight with a tape measure rather than a bathroom scale because the readings on your scale bounce around a bit?

Does it help that a measure is stable if it’s measuring the wrong thing?

Alain Jehlen is editor of Boston Schoolyard News, the blog where this article originally appeared.