Improving the quality of education delivered through our public schools can not only boost economic growth but also help to narrow income inequality in the U.S. And the best way to improve education is to identify and promote the most talented teachers.

One way of measuring a teacher's effectiveness has been to see how much his or her students' test scores rise. This kind of "value-added" measure is straightforward and can easily be used to weed out bad teachers and promote better ones. Critics complain, however, that this measurement has two potential flaws: Some teachers' scores may rise not because they have performed so well in the classroom but merely because they have better students. And some teachers may push up their students' scores by teaching to the test, rather than giving students the understanding of concepts that pays off in the long run.

Two important pieces of research rebut both of these concerns, suggesting there are significant benefits to be gained from more aggressive use of value-added and other measures to evaluate teachers. The first study, sponsored by the Bill and Melinda Gates Foundation, looked at student selection. In a remarkable feat, the researchers randomly assigned students to about 1,600 different teachers. The random assignment ensured that any observed improvement in the students' test scores was caused by their teachers, not by differences in the students they happened to be given.

The Gates team -- Tom Kane of Harvard University, Daniel McCaffrey and Trey Miller of the Rand Corp., and Douglas Staiger of Dartmouth College -- found, as non-randomized studies had also found, that value-added measures were predictive of student achievement. As they conclude, "our findings suggest that existing measures of teacher effectiveness provide important and useful information on the causal effects that teachers have on their students' outcomes."

The Gates researchers also experimented with various supplements to a purely test-based metric, and found that although the value-added measure did the heavy lifting, student surveys and observational analyses of teaching quality were useful. Interestingly, they found that teacher analysis could be done without having observers make random visits to the classroom; allowing a teacher to submit a self-selected set of videos from the classroom worked just as well, because even the best classes conducted by bad teachers were worse than those from better teachers.

The Gates team also partially addressed the second critique -- that "good" teachers are only teaching to the test -- by examining results from other measures of educational quality. For example, the researchers administered open-ended word problems to test students' understanding of math. The teachers who were predicted to produce achievement gains on state tests produced gains two-thirds as large on the supplemental assessments.

An even more compelling rebuttal of the second critique, however, is found in a December 2011 paper by Raj Chetty and John Friedman of Harvard University and Jonah Rockoff of Columbia University. These researchers assembled a database of 2.5 million students in grades 3 through 8 along with 18 million English and math tests from 1989 through 2009. They then linked that database with income tax returns.

Their paper is fascinating because the researchers assessed how a high value-added teacher can influence students' later earnings and other outcomes. Someone just teaching to the test, without improving the quality of education, wouldn't be expected to have any lasting impact on students' earnings. Yet Chetty and the others found big effects later on in students' lives from having a higher value-added teacher. By the time a student reached age 28, for example, a one-standard-deviation increase in teacher quality in a single grade had raised his or her annual earnings by about 1 percent. Their estimates also suggest that replacing a teacher in the bottom 5 percent of the value-added distribution with an average teacher would boost aggregate lifetime income for the students in that classroom by $250,000. And that would be true for every class in every year of instruction. Exposure to a higher-rated teacher helped students in other ways, too, the researchers found: It increased their chances of attending college, raised their retirement savings rates, and reduced their likelihood of becoming teenage parents.

The bottom line from both of these important studies is that real-time measurements of teachers' effectiveness, based either exclusively or mostly on how much their students' standardized test scores improve, provide useful information that should not be ignored. And there are huge returns for students and the economy as a whole from shedding the teachers who do poorly on these measures, and replacing them with teachers who do better.

As the Gates report demonstrates, it's possible to improve teacher effectiveness metrics. But that shouldn't keep us from using the ones we have now. To help raise future productivity, we should set a clear goal for all school districts: to deny tenure to teachers in the bottom 10 percent of the distribution according to value-added measurements.

