April 28, 2016

Machine Learning and Government

The good

  • Understanding large administrative data becomes better, faster, cheaper, etc.
  • Suits social processes, which are full of non-linearities, interactions, and feedback

The danger

  • Hinders transparency
  • Mathematical models can have unintended consequences

Wisconsin Dropout Early Warning System - DEWS



  • Wisconsin Department of Public Instruction is the state K-12 education agency
  • Historically charged with regulating public schools and reporting on school conditions

Why Public Sector is Different

Many, if not most, of the difficulties we experience in dealing with government agencies arise from the agencies being part of a fragmented and open political system… The central feature of the American constitutional system—the separation of powers—exacerbates many of these problems. The governments of the US were not designed to be efficient or powerful, but to be tolerable and malleable.

~ James Q. Wilson

The Promise of Machine Learning in Education

  • Educators are facing a work environment that is increasingly complex, data rich, and time/labor intensive
  • Performance measures, accountability pressure, and a desire to understand student progression have created a proliferation of data
  • Many schools and school districts are unable to devote resources to data analysis
  • Data is collected, stored, and reported, but often cannot be used to assist educators

DEWS by the Numbers

  • Analyzes over 500,000 historical records of student graduation
  • Selects from over 50 candidate statistical models per grade
  • Provides biannual predictions on over 240,000 current students across four grades to over 1,000 schools
  • Hundreds of users have accessed thousands of individual student reports across nearly every Wisconsin school district

The Process of Building DEWS

  • Identify the problem
  • Assemble a team
  • Demonstrate a prototype
  • Iterate and improve
  • Deploy


It's better to solve the right problem approximately than to solve the wrong problem exactly

~ John Tukey

Strategies for Identification

  • Find a policy-relevant project with engaged leadership
  • Seek out existing solutions and identify their strengths and weaknesses
  • Focus on a space where a solution will be well received
  • Align solution to perceived needs

The Dropout Problem


Agree on Values

  • Set clear expectations about tradeoffs and be honest about the solution
  • Use shared goals to ground work as more people participate in the project
  • Use values to assess technical tradeoffs and resolve them

What do we want from an EWS?

  • Accuracy in identifying students who need assistance and those who do not
  • Timeliness, so that identification leaves room to intervene
  • Transparency in how predictions were made and how students are labeled
  • Reproducibility in the predictions, so they vary with changes in underlying behaviors, not the models
  • Scalability to a diverse array of student and school contexts
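
The accuracy goal above has two halves: catching students who need assistance, and not flagging those who do not. One common way to measure the two separately is sensitivity and specificity. The sketch below is a minimal illustration in Python, assuming binary outcomes and binary flags. The function name and the toy data are hypothetical and are not taken from DEWS.

```python
def identification_rates(actual, flagged):
    """Return (sensitivity, specificity) for binary outcomes and flags.

    actual  -- 1 if the student actually needed assistance, else 0
    flagged -- 1 if the warning system flagged the student, else 0
    """
    if len(actual) != len(flagged):
        raise ValueError("actual and flagged must have the same length")

    pairs = list(zip(actual, flagged))
    tp = sum(1 for a, f in pairs if a == 1 and f == 1)  # needed help, flagged
    fn = sum(1 for a, f in pairs if a == 1 and f == 0)  # needed help, missed
    tn = sum(1 for a, f in pairs if a == 0 and f == 0)  # no need, not flagged
    fp = sum(1 for a, f in pairs if a == 0 and f == 1)  # no need, flagged

    # Share of students needing help who were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Share of students not needing help who were left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data, invented for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
flagged = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = identification_rates(actual, flagged)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both rates keeps the tradeoff visible: flagging more students raises sensitivity but usually lowers specificity, which increases the review workload for school staff.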

Accuracy First

Opportunity cost

  • 1,000 schools receive 240 predictions each on average
  • Each prediction is reviewed by 3-5 staff for ~5 minutes (\(\frac{1}{12}\) of an hour)
  • 3 staff x 240 predictions x \(\frac{1}{12}\) hour = 60 hours
  • 5 staff x 240 predictions x \(\frac{1}{12}\) hour = 100 hours
  • Across 1,000 schools, that's 60,000 to 100,000 hours of annual work
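
The arithmetic above can be checked with a short Python sketch. The figures are the ones on the slide. The function and variable names are made up for illustration.

```python
def review_hours(predictions_per_school, reviewers, minutes_per_review=5):
    """Staff hours one school spends reviewing its predictions."""
    return predictions_per_school * reviewers * minutes_per_review / 60


SCHOOLS = 1_000
PREDICTIONS_PER_SCHOOL = 240

# Low and high estimates per school, for 3 and 5 reviewers per prediction.
low = review_hours(PREDICTIONS_PER_SCHOOL, reviewers=3)
high = review_hours(PREDICTIONS_PER_SCHOOL, reviewers=5)

# Scale the per-school estimates to all schools.
statewide_low = SCHOOLS * low
statewide_high = SCHOOLS * high

print(f"Per school: {low:.0f} to {high:.0f} hours")
print(f"Statewide: {statewide_low:,.0f} to {statewide_high:,.0f} hours")
```

Changing `minutes_per_review` or the number of reviewers shows how sensitive the total is to small changes in per-prediction effort.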

Demonstrate Potential