Categories: Big data and evaluation

Measuring results and impact in the age of big data

Pete York and Michael Bamberger 2021

This introduction draws heavily on contributions from Pete York (BCT Partners) and Veronica Olazabal (The Rockefeller Foundation).

The attached paper makes the case for a convergence between big data and data science, on the one hand, and the field of program evaluation, on the other. This convergence could produce significant benefits for the future of equity-oriented social and economic development in both developing and industrialized countries. However, technical, political, economic, organizational and even philosophical factors have slowed its achievement and the realization of its multiple benefits.

We live in a world that is increasingly dependent on big data and data science in every aspect of our personal lives and our economic, political and social systems. While many of these trends began in industrialized nations, they are expanding at an exponential rate in middle- and low-income countries. For better or worse, more people now have access to cell phones than to potable water.

For a number of reasons discussed in the report, the agencies responsible for evaluating social programs have been slower to adopt data science approaches than their colleagues working in research and planning (see Section 3). Data science and program evaluation are built on different traditions and use different tools and techniques, so working together requires both groups to move out of their comfort zones.

Some of the promising areas where data science can make the greatest potential contributions to evaluation include:

  • Reducing the time and cost of data collection, so that evaluators can focus on the core tasks of defining the key evaluation questions, developing a theoretical framework for the evaluation, and analyzing and interpreting the findings. Many evaluators spend so much time and effort on data collection and analysis that few resources remain for these critical elements of the evaluation process. Freeing up time would also allow evaluators to strengthen data quality (spending more time with the communities being studied, triangulation, ground-truthing, mixed methods); how many evaluation reports lament not having had the time to properly address these questions?
  • Dramatically expanding the kinds of data that can be collected and analyzed. This includes access to Artificial Intelligence (AI), which can identify patterns in huge volumes of many kinds of data; a range of predictive analytics tools; and models and analytical tools that make it possible to evaluate complex programs. Another major advance is the ability to study longitudinal trends, in some cases over periods of as long as 20 years. This makes it possible both to observe historical trends before a program is launched and to track the sustainability of program-induced changes, the maintenance of program infrastructure and the continued delivery of services, all of which are virtually impossible with conventional evaluations that have a defined start and end date.
  • Another very powerful set of tools for evaluation managers and policy-makers is the many kinds of algorithms, using artificial intelligence and data mining, that can process huge volumes of data to improve decision-making and to predict the best treatments for different groups affected by a program. The ability to analyze the factors affecting outcomes for individuals or small groups, and to provide specific real-time recommendations on the best treatment or combination of treatments for each of them, contrasts with conventional evaluation designs that usually only make recommendations on how to improve the average outcome for the whole population (a minimal illustration is sketched after this list). However, many of these algorithms are based on complex predictive models that are not well understood by most users, both because they are complex and because they are proprietary and not usually made available to clients. Consequently, there is a danger that some algorithms produce unintended negative outcomes that clients may not even be aware of.
  • Although it has received less attention, and appears less exciting than the changes described above, one very important development is the ability of AI to combine multiple data sources into a single integrated data platform, making it possible to explore relationships between data sets that could not previously be studied together. The program to combat modern slavery (see Section 2 Box 4) provides an example of the great potential of integrated data platforms.
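
To make the "best treatment" idea in the third bullet concrete, the following is a minimal sketch of one common approach: a "T-learner" that fits a separate outcome model for each treatment arm and recommends, for each individual, the arm with the highest predicted outcome. The data, variable names and model choice are illustrative assumptions, not a description of any particular proprietary system.

```python
# A minimal, illustrative "best treatment" sketch (T-learner). All data
# below are synthetic; a real application would require careful attention
# to bias, validation and transparency.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical data: X = participant characteristics, t = treatment arm
# actually received (0 or 1), y = observed outcome.
X = rng.normal(size=(1000, 5))
t = rng.integers(0, 2, size=1000)
y = X[:, 0] + t * 1.5 * (X[:, 1] > 0) + rng.normal(scale=0.5, size=1000)

# Fit one outcome model per treatment arm.
models = {arm: GradientBoostingRegressor().fit(X[t == arm], y[t == arm])
          for arm in (0, 1)}

# Predict each individual's outcome under every arm and recommend the best.
predicted = np.column_stack([models[arm].predict(X) for arm in (0, 1)])
recommended_arm = predicted.argmax(axis=1)
estimated_gain = predicted[:, 1] - predicted[:, 0]  # individual-level uplift

# A conventional evaluation would typically report only the average effect:
print("average estimated effect:", estimated_gain.mean().round(2))
print("individuals for whom arm 1 is recommended:", recommended_arm.sum())
```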

Most of the discussion in the literature has concerned how data science (seen as the exciting new frontier) can assist evaluators, who are often portrayed as having fallen behind the times with respect to new technology. However, it is important to recognize that data science approaches also have potential weaknesses. This is particularly true because many of them were originally developed in much simpler and less demanding environments, such as marketing analysis and online advertising. In many of these areas, the client is only interested in correlations: an online advertiser wants to know whether changing the font size and color of an ad will lead more users to click on it, or whether men who purchase diapers in the supermarket are also likely to purchase beer. In these cases the client does not need to know why the relationship exists. Because of these limited demands, many data scientists have not had to develop the kinds of theoretical frameworks and theories of change used by most evaluators.

For all of these reasons, when data scientists and app developers venture into the new world of community development, designing complex programs for disadvantaged communities and trying to explain why a program produces certain outcomes for some groups and not for others, there are many lessons they can learn from their evaluation colleagues. Some of these lessons include:

  • greater concern about the quality and validity of data
  • understanding the importance of construct validity (how to interpret indicators extracted from social media, phone call records or satellite images). How can changes in the number of references to hunger or sickness be used as an indicator of changes in short-term poverty levels? What do satellite counts of the proportion of roofs constructed of straw compared to zinc tell us about trends in poverty? (A minimal validation sketch follows this list.)
  • addressing issues of social exclusion and sample bias
  • rethinking the role of theory and the need to base an evaluation on a theory of change
  • the importance of ground-truthing (checking on the ground the hypotheses generated from the analysis of remote, non-reactive data).
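
The construct-validity and ground-truthing points can be illustrated with a simple check: does a satellite-derived indicator track what a household survey actually measures? The following sketch uses entirely synthetic data and hypothetical variable names; it shows the logic of the validation step, not a real analysis.

```python
# A minimal sketch of validating a remote-sensing poverty proxy against
# ground-truth survey data. All numbers are synthetic placeholders.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n_villages = 60

# Hypothetical village-level poverty rates from a household survey (0-1).
survey_poverty = rng.uniform(0.1, 0.8, size=n_villages)

# Hypothetical satellite-derived share of zinc (vs. straw) roofs: assumed
# to fall as poverty rises, plus classification noise.
zinc_roof_share = np.clip(
    0.9 - survey_poverty + rng.normal(scale=0.1, size=n_villages), 0, 1)

r, p = pearsonr(zinc_roof_share, survey_poverty)
print(f"roof proxy vs. surveyed poverty: r = {r:.2f} (p = {p:.3f})")
# Only a strong, stable correlation in ground-truthed samples would justify
# using the roof indicator where survey data are unavailable.
```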

All of these issues are discussed in the attached paper.

https://www.rockefellerfoundation.org/wp-content/uploads/Measuring-results-and-impact-in-the-age-of-big-data-by-York-and-Bamberger-March-2020.pdf

Categories: The broader context

The transformation of evaluation in the 4th industrial revolution [4IR]

Economists and development practitioners acknowledge that the world is entering the age of the 4th Industrial Revolution [4IR]. Beginning in the 1760s, the first revolution was based on the transition from hand production to steam and water power, and the transformation from agriculture to industry; the second, starting around 1870, was driven by railroads, electricity, mechanization and rising productivity; the third, which began in the late 20th century and is still ongoing, was based on the digital revolution, computers and technological innovation. The fourth industrial revolution, which is already underway, is based on networked information systems, automation of manufacturing, large-scale machine-to-machine communication, smart machines, artificial intelligence and automated decision-making.

Despite the radical social, economic, political and technological transformations that 4IR is already introducing, there has been relatively little discussion in the evaluation community of how these dramatic changes will affect the nature of evaluation practice, the new questions that evaluations must address, the revolutionary new evaluation tools and techniques, including artificial intelligence, and the potentially greater role that evaluation can have over the next decades.

While evaluation practice currently has a relatively secure and defined role in the development field (almost all development programs are subject to one or more evaluations), in the future conventional evaluation offices and consulting services will be competing with fundamentally different ways of collecting and using information to monitor and assess the performance of economic, political and social initiatives. For example, many organizations are beginning to use integrated databases, machine learning and artificial intelligence, combined with modeling and simulation technology (such as digital twins), to assess performance.

In 2020 my colleague Pete York and I were invited by the Evaluation Matters journal of the African Development Bank to produce a two-part publication considering the implications of 4IR for evaluation practice in Africa. The two publications are available through the following links.

Part 1 discusses the current nature of development evaluation practice, and why a new evaluation paradigm will be required to address the challenges and opportunities of the 4th industrial revolution in Africa.

https://idev.afdb.org/sites/default/files/Evaluations/2020-07/Acticle%208-The%20culture%20of%20evaluation%20in%20the%20age%20of%20big%20data%20The%20need%20for%20a%20new%20evaluation%20paradigm%20for%20the%204th%20Industrial%20Revolution.pdf

Part 2 examines the transformation of evaluation in response to the 4th Industrial Revolution. It examines why the transition will be disruptive for evaluation practice, while identifying both the challenges and the opportunities. The example of an impact evaluation of a road construction project in Ghana is used to illustrate how the new evaluation methods are already starting to be applied.

https://idev.afdb.org/sites/default/files/documents/files/EM%20Q2-2020-article1-challenges%20and%20opportunities%204th%20industrial%20revolution%28En%29.pdf

Categories: Complexity-responsive evaluation

The importance of a complexity focus in program evaluation

Most development programs are designed and implemented in complex political, socio-cultural, economic and ecological contexts, where outcomes are influenced by many factors over which planners and program managers have very little control. These factors interact in different ways in different project locations. Consequently, a project with a clear design and implementation strategy may produce significantly different outcomes in different locations or at different points in time.

Despite the widespread acknowledgement by evaluators and stakeholders that projects, and project evaluations, are "complex", most evaluations employ designs that implicitly assume the project is "simple", with a clearly defined linear relationship between the project inputs and a limited number of outcomes. For example, most quantitative evaluations adopt a pretest-posttest comparison group design, where outcomes are estimated as the difference in the change for the project and comparison groups over the life of the project. The randomized control trial (RCT) is the best-known design, but a similar logic underlies many other experimental and quasi-experimental designs (propensity score matching, double difference, instrumental variable estimation, regression discontinuity).
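
The underlying logic of these designs can be summarized in the familiar double-difference formula, where $\bar{Y}$ denotes the mean outcome for the project group ($T$) and the comparison group ($C$) at baseline (pre) and endline (post):

$$\widehat{\text{Impact}} = (\bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}) - (\bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}})$$

Written this way, the "simple" assumption is explicit: a single average change, net of the comparison group, is taken to represent the project's effect in every location and for every group.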

The policy and methodological implications of this lack of attention to complexity in the design and evaluation of development projects, programs and policies were discussed in a two-part blog that I prepared for the International Initiative for Impact Evaluation (3ie), published in June 2021.

One of the reasons why many agencies do not address complexity is that much of the complexity literature is very technical, and complexity is consequently considered "too complicated" to understand or measure. Part 1 presents an easy-to-understand framework for mapping complexity along four dimensions, together with a checklist for rating the level of complexity of an intervention on each of these dimensions. I also address the important question: why do so many evaluations ignore complexity?

https://www.3ieimpact.org/blogs/understanding-real-world-complexities-greater-uptake-evaluation-findings

In Part 2 I discuss practical approaches for evaluating complex development interventions. A five-step evaluation methodology is presented, which combines familiar evaluation tools and techniques with the application of tools, such as systems analysis, that are designed to address complexity.

https://3ieimpact.org/blogs/building-complexity-development-evaluations

Some of the key take-aways include:

  1. Most evaluations of development programs largely ignore complexity. This is due in part to a perception by policy makers, managers and many evaluators that complexity is too technical and difficult to incorporate into most evaluations, and in part because most conventional evaluation designs do not address complexity.
  2. When complexity issues are not addressed, an evaluation will often over-estimate the impact of a program.
  3. The complexity map included in Part 1 provides an easily understood framework for identifying and discussing the different dimensions of complexity that can affect how a program is implemented and how it achieves its intended outcomes.
  4. The complexity checklist provides a useful first estimate of the level of complexity (rated from very high to very low) on each of the dimensions; a minimal sketch of this rating logic follows this list.
  5. Finally, it is important to recognize that complexity theory emphasizes that all dimensions of a program are holistically interlinked, and that the effects of a single program component cannot be assessed in isolation from other components and from the broader context within which the program operates. The challenge is to respect the holistic nature of the program while finding ways to simplify it sufficiently to make it possible to evaluate its effects. The present approach seeks to achieve this by "unpacking" the program into individual components, each of which can be assessed separately, and then using systems analysis and other complexity-responsive tools to reassemble the findings of the different components and assess the total impact within the broader context. Different readers may have different opinions on how well these dual considerations are addressed.
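
As a concrete illustration of the checklist idea in point 4, the following sketch rates an intervention on each complexity dimension using a five-point scale and produces a first-cut overall score. The dimension names are illustrative assumptions loosely based on the framework described in Part 1, not an exact reproduction of the checklist.

```python
# A minimal sketch of a complexity checklist: rate each dimension on a
# five-point scale and compute a simple first-cut overall score. The
# dimension names below are illustrative assumptions.
SCALE = {"very low": 1, "low": 2, "medium": 3, "high": 4, "very high": 5}

ratings = {
    "nature of the intervention": "high",
    "institutions and stakeholders": "medium",
    "socio-economic and political context": "very high",
    "nature of the causal pathways": "high",
}

scores = {dim: SCALE[level] for dim, level in ratings.items()}
overall = sum(scores.values()) / len(scores)

for dim, score in scores.items():
    print(f"{dim}: {score}/5")
print(f"first-cut overall complexity score: {overall:.1f}/5")
```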

Categories: Big data and evaluation

Adapting evaluation to the new information technologies

European Evaluation Society (EES) Webinar on “Opportunities and Challenges for Evaluation to Adapt to the New Information Technologies” February 2021

In February 2021 I was invited, together with my colleagues Oscar Garcia (Director of the UNDP Independent Evaluation Office) and Pete York (Chief Data Scientist at BCT Partners), to organize a half-day virtual webinar to review recent experience with the incorporation of new information technology into development evaluation. The four sessions are presented below:

Session 1: A brief introduction to big data and its applications in development evaluation (Michael Bamberger):
https://www.dropbox.com/s/57rs7nuoraukhm9/Bamberger%20Introduction%20to%20big%20data.pptx?dl=0

Session 2: Applications of data analytics to evaluation (Pete York, BCT Partners): https://www.dropbox.com/s/1l6t8clbiua4k3j/EES%20Webinar%20Slides%20York%20011221.pptx?dl=0

Session 3: Opportunities and challenges for evaluation offices to adapt to the new information technologies (Oscar Garcia, Director, UNDP Independent Evaluation Office): https://www.dropbox.com/s/2dkzi6soccr17y1/2021-01-18%20-%20EES%20Webinar%20on%20IT%20Opportunities%20and%20Challenges%20-%20FINAL%20%28reduced%20size%29.pptx?dl=0

Session 4: The turbulent transition to the 4th Industrial Revolution (Michael Bamberger): https://www.dropbox.com/s/3yeoqpj6ckvdhl9/Bamberger%20Session%203%20%20The%20disruptive%20effects%20of%20the%20transition%20to%20the%204th%20industrial%20revolution%20%282%29.pptx?dl=0