Flawed Test Construction and Scoring Problems
More Test Games Played at the Expense of Kids
In June, school districts across the state anxiously awaited the latest versions of the newly crafted Regents exams, exams that have long been regarded as the gold standard of state achievement tests. As teachers opened the booklets of the various mathematics, biology, earth science, ELA (English Language Arts), and other tests, something seemed different. Different indeed! Many of these tests were much easier than older renditions. Furthermore, the scoring process made it very difficult for students to fail the majority of these exams. By July, Peter Simon of the Buffalo News (July 19 and 21 editions) exposed these new tests and their grading.
Recently, all teachers and administrators in the state received a document entitled From Standards to Assessments by State Education Department Deputy Commissioner James Kadamus. This monograph is an attempt to justify the debacle. It doesn’t work! Deputy Commissioner Kadamus tries to tell career teachers that what they experienced in June, based on a lifetime of experience and teaching, just isn’t so! After reading the deputy commissioner’s explanation, were we expected to accept that the exams were NOT easy and that the scaled scoring was NOT overly generous? When the results of these exams are published in the Report Card this winter, the public will draw its own conclusions.
This latest round of Regents exams is an affront to the high standards that the Commissioner of Education and the Board of Regents purport to promote. The Kadamus document is a desperate attempt to feign rigor. The lack of rigor, however, doesn’t make the playing field any fairer. English Language Learners (ELLs), career and technical education (CTE) students, and ALL students who do not prosper by these paper-and-pencil tests, which have no proven validity, are victims.
As if this scenario were not enough to shake confidence in the bureaucrats in Albany, we have a new set of standardized test results in New York State that makes the June Regents experience pale by comparison. Results of the 8th grade ELA and math tests did not arrive in districts until the first week in September, more than three months after students took the assessments. The results are stunning!
Before analyzing the 8th grade ELA assessment, one should examine what recently occurred with the 4th grade ELA given in early February 2001. At the September meeting of the superintendents of Eastern Monroe County, ELA results were discussed. The pattern we saw was blatantly obvious; the 4th grade ELA scores had skyrocketed from the year before! On the ELA exam, which is graded on a 1 through 4 rating scale, with 4 being high, the number of children scoring a 4 increased by statistically implausible numbers. For example, Fairport’s scores of 4 had increased by 35%! Other districts had similar increases.
The 8th grade test results, on the other hand, painted a much different picture. Students in this cohort were reported to be much poorer performers than last year’s 8th graders. Across the state, including high-performing Nassau County schools on Long Island, the scores plummeted! It made no difference if you were in a wealthy or a poor district. Apparently, this particular test demonstrated that ALL 8th graders are doing worse than the class before them in spite of focused efforts to teach to the standards and teach to the test. Scores of 4 in the highest-performing districts in Monroe County dropped by about 12%! Results such as these are statistically improbable. Further analysis of the results is not permitted at this time, as the specific scores were embargoed until the Commissioner’s press conference, which was scheduled for late in September. Now, the embargo has been further extended to mid-October. Would the state like us to believe that miracles are happening in grade 4? At the same time, are we also to believe that our 8th graders (now in 9th grade) should be placed on academic life support?
The state and CTB McGraw-Hill, the company which produces the tests, both claim that the test and the scores were sound. It was the sampling process that was in error, which led to such poor scores. In an October 2 AP story, it was reported that the statewide projection relied on a small sample of randomly selected students from a few schools that were supposed to reflect a cross-section of the entire state. SED officials said this projection was wrong. The story further cited Roseanne DeFabio, the state’s assistant education commissioner for curriculum and instruction and assessment, who said, “As more tests were graded, routine quality control checks made it clear the sampling used to project the overall state performance was in error.”
The fact that the scores were sent to schools in late August strongly contradicts this statement. The usual embargo was in effect to sort out any individual school district errors. Had the anomalies been caught during the grading process as stated by Ms. DeFabio, the scores would never have been sent to schools, and SED and McGraw-Hill would have re-scaled the scores to get their desired outcome at that time. Recent history suggests that something more disturbing may have been going on.
In June, a NYS Assembly subcommittee held hearings in an attempt to understand the alarming dropout rate among English Language Learners (ELLs) since the initiation of New York State’s testing program. Assemblyman Sullivan noted that in New York City and Rochester, a third of the students repeat 9th grade. He pointedly asked Commissioner of Education Richard Mills, “Why do they repeat the 9th grade?” Commissioner Mills replied, “Well, I think one reason is that the middle grades’ curriculum is weak.” The official Assembly record is clear that Assemblymen Sanders, Sullivan, Rivera, and Espaillat did not buy the commissioner’s answer. What lingers so obviously is the commissioner’s allegation that middle-level education is failing. What would support this allegation? Was the delivery of low test scores on the 8th grade ELA assessment three months after the commissioner’s testimony the needed supporting data? As a result of the scores on this test, hundreds of students in the county and thousands across the state, who were doing well in middle school, would have been required to receive Academic Intervention Services (AIS). In some cases, they actually excelled by all other rational measures!
Is it possible that the scores were scaled to sully education in the middle years? Is it possible that scores were affected through either the manipulation or blatant incompetence of the adults mismanaging this testing program? Regardless of spin from the State Education Department, a 35% increase on the 4th grade ELA test and a 12% decrease in the grade 8 ELA scores are simply ridiculous. This is the concern that superintendents in Eastern Monroe County, along with many other high-performing districts across the state, expressed to Deputy Commissioner Kadamus in early September.
As noted in the October 2 AP article, we are told continually that the tests are valid and reliable. Deputy Commissioner Kadamus makes this point again in his From Standards to Assessments memo: “We are confident in the validity of the State Assessments.” I must point out, however, that to date no proof of validity has been proffered. The relationship between the state standards and the assessments is a distant one at best. This has been demonstrated by numerous scholars such as Walter M. Haney and Robert M. Hauser in separate validity studies of the Regents exams (2001). In another study from the University of Wisconsin-Madison, the alignment of assessments and standards across 10 states ranged from 5% to 46%. Further, in a study scheduled for publication next year by the Rand Corporation, a California research group, it was found that 50% to 80% of the improvement in a school’s average test scores from year to year is related to random factors rather than to real gains in learning. Two variables weigh heavily in this formula: (1) a different group of students takes the test each year and (2) a different test is given each year.
The problem the state faced put it in a “Catch-22” situation. If the scores stood as delivered in August, the state was left with no credibility. If the scaling of the test scores were to change, it would be apparent that the state can manipulate the data to get any results it intends. Seventeen days after districts received the scores (September 27), superintendents received an urgent memorandum from Deputy Commissioner Kadamus. He announced that “new” test score reports would be released in the near future.
Two years ago, due to another set of scoring errors by CTB McGraw-Hill, 9,000 New York City school children, who had actually passed the test, were required to go to summer school. Thousands in California went to summer school and in many cases were retained a grade due to this McGraw-Hill mistake. Who is holding SED and McGraw-Hill accountable? Even before the current 8th grade debacle, one would have expected these problems to lead the state to terminate its contract with McGraw-Hill. Perhaps this move would have provided the much-needed boost in confidence in a program riddled with error.
SED has promised a better system in the future, one with fewer mistakes and a better turnaround time. Currently, it takes more than three months to get scores back to schools. State officials have promised test returns in 3 to 4 weeks with future exams. One can only imagine what errors we will experience with a three-week turnaround of scores when we see such enormous errors with the current 3-to-4-month turnaround. In a story in The New York Times on October 3, Thomas J. Lueck cited Alan Ray, a spokesman for SED, who said that the problem would not alter the state’s relationship with CTB. This comes as quite a surprise. But also surprising is the fact that New York State Board of Regents member Charlotte Frank is vice president of CTB McGraw-Hill.
The last shred of credibility has evaporated.
Career and Technical Education students, English Language Learners, Gifted Students, 4th and 8th graders have all been the collateral damage of New York’s testing reform. The large numbers of high school students who would have earned the old version of the Regents diploma get short shrift as well, as the new Regents diploma has the value of Confederate currency.
Kids must not be used as the tools to keep a boat afloat that should never have set sail. Never again should test games be played at the expense of kids.
The New York Times | Op Ed Letter | October 9, 2001