Rikers Island social impact bond (SIB) – Success or failure?

There’s been a lot of discussion over the past few weeks as to whether the Rikers Island SIB was a success or failure and what that means for the SIB ‘market’. You can read the lessons in the Huffington Post and analyses from investors and the Urban Institute on the benefits and challenges of this SIB. But I think the success-or-failure discussion fails to recognise the differences in objectives and approaches between SIBs. So I’d like to elaborate on one of these differences: the attitude towards continuous adaptation of the service delivery model. Some SIBs are established to test whether a well-defined program will work with a particular population. Others are established to develop a service delivery model – to meet the needs of a particular population as they are discovered.

1. Testing an evidence-based service delivery model

This is where a service delivery model is rigorously tested to establish whether it delivers outcomes to this particular population under these particular conditions, funded in this particular way. These models are often referred to as ‘evidence-based programs’ that have been rigorously evaluated. The US is further ahead than other countries in the evaluation of social programs, so while these ‘proven’ programs are still in the minority, there are more of them in the US than elsewhere. These SIBs are part of a movement to support and scale programs that have proven effective. They are also part of a drive to more rigorously evaluate social programs, which has resulted in some evaluators attempting to keep all variables constant throughout service delivery.

An evidence-based service delivery model might:

  • be used to test whether a service delivery model that worked with one population will work with another;
  • be implemented faithfully and adhered to;
  • change very little over time – in fact, effort may be made to keep all variables constant (e.g. by prescribing the service delivery model in the contract);
  • have a measurement focus that answers the question ‘was this service model effective with this population?’

“SIBs are a tool to scale proven social interventions. SIBs could fill a critical void: other than market-based approaches, a structured and replicable model for scaling proven solutions has not existed previously. SIBs can give structure to the critical handoff between philanthropy (the risk capital of social innovation) and government (the scale-up capital of social innovation) to bring evidence-based interventions to more people.” (McKinsey (2012) From potential to action: Bringing social impact bonds to the US, p.7).

2. Developing a service delivery model

This is where you do whatever it takes to deliver outcomes, so that the service is constantly evolving. It may include an evidence-based prescriptive service model or a combination of several well-evidenced components, but it is expected to be continuously tested and adjusted. It may be coupled with a flexible budget (e.g. Peterborough and Essex) to pay for variations and additional services that were not initially foreseen. This approach is more prevalent in the UK.

A continuously adjusted service delivery model might:

  • be used to deliver services to populations that have previously not received services, to see whether outcomes could be improved;
  • involve every element of service delivery being continuously analysed and refined in order to achieve better outcomes;
  • continuously evolve – the program keeps adapting to need as needs are uncovered;
  • have a measurement focus that answers the question ‘were outcomes changed for this population?’

Andrew Levitt of Bridges Ventures, the biggest investor in SIBs in the UK, puts it this way: “There is no such thing as a proven intervention. Every intervention can be better and can fail if it’s not implemented properly – it’s so harmful to start with the assumption that it can’t get better.” (Tomkinson (2015) Delivering the Promise of Social Outcomes: The Role of the Performance Analyst, p.18)

Different horses for different courses

Rikers Island, New York City

The New York City SIB was designed to test whether the Adolescent Behavioral Learning Experience (ABLE) program would reduce reoffending among young offenders exiting Rikers Island. Fidelity to the designated service delivery model was prioritised, in order to obtain robust evidence of whether this particular program was effective. WNYC News reported that “Susan Gottesfeld of the Osborne Association, the group that worked with the teens, said teens needed more services – like mental health care, drug treatment and housing assistance – once they left the jail and were living back in their neighbourhoods.”

In a July 28 New York Times article by Eduardo Porter, Elizabeth Gaynes, Chief Executive of the Osborne Association is quoted as saying “All they were testing is whether M.R.T. by itself would make a difference, not whether something you could do in a jail would make a difference,” Ms. Gaynes said. “Even if we could have raised money to do other stuff, we were not allowed to because we were testing M.R.T. alone.”

This is in stark contrast with the approach taken in the Peterborough SIB. Their performance management approach was a continuous process of identifying these additional needs and procuring services to meet them. The Peterborough SIB involved many adjustments to its service over the course of delivery. For example, mental health support was added, providers changed, a decision was made to meet all prisoners at the gate… as needs were identified, the model was adjusted to respond. (For more detail, see Learning as They Go p.22, Nicholls, A., and Tomkinson, E. (2013). Case Study: The Peterborough Pilot Social Impact Bond. Oxford: Saïd Business School, University of Oxford.)

Neither approach is necessarily right or wrong, but we should avoid painting one SIB as a success or failure according to the objectives and approach of another. What I’d like to see is a question for each SIB: ‘What is it you’re trying to learn/test?’ It won’t be the same for every SIB, but making this clear from the start allows for analysis at the end that reflects that learning and moves us forward. As each program finishes, let’s not waste time on ‘Success or failure?’; let’s get stuck into ‘So what? Now what?’

Huge thanks to Alisa Helbitz and Steve Goldberg for their brilliant and constructive feedback on this blog.

The Peterborough Social Impact Bond (SIB) conspiracy

If you think Social Impact Bonds are the biggest thing to hit public policy EVER, then you were probably horrified at the cancellation of the final cohort of the flagship Peterborough SIB. How is it possible? What does it mean?

Since the news was broken in April this year (2014), I’ve had questions from as far afield as Japan and Israel trying to discover the UK Government’s TRUE agenda. More recently, at the SOCAP Conference in San Francisco in August, it was raised again. Eileen Neely from Living Cities, which has provided $1.5 million in loan financing for the Massachusetts Social Impact Bond, was discussing “shut down risk: what happens if one of the parties decide they don’t want to play.”

She said, “In the Peterborough deal in the UK, the government decided that they weren’t going to play any more… so there’s some who say ‘Oh it’s because it wasn’t going well’ and others are saying ‘It was cos it was going too well’ so whichever it is, they decided that they weren’t going to do it, that they weren’t going to go into the next cohort, so what does that mean to the investors?” Eileen made it quite clear “I haven’t talked to any of the participants there, I’m just outside, reading the articles and the blogs …”

I thought it was about time we summarised the evidence for those who continue to ask these questions.

Social impact bond (SIB) research questions

It seems that more and more students around the world are keen to do some research on social impact bonds. Great! But we can do better than ‘Do SIBs work?’ (try defining success first…) or ‘What’s the relationship between financial return and effect size?’ We don’t have the data for questions like this yet, but there are so many other wonderful questions we can be asking. There’s also a lot of data on Twitter and in the media that could be used for interesting studies of stakeholder perception and reaction.

If you are a student researching SIBs and would like to be connected with other students, please use the contact form and include your university, level of study and research topic. I will connect you via email with other students around the world. If anyone else has more questions they’d like on this list, please pop them in the comment box.

Some questions

  • What are the effects of publicly announcing a SIB? How does timing affect whether a SIB gets agreed and how long it takes to agree?
  • What are the key characteristics of SIBs that have been announced but haven’t happened? How is this different from those that have happened? What has been the result of these projects (e.g. funded by other means, directly commissioned)?
  • Beyond risk and return: what are investors in social impact bonds looking for and attracted to? On what factors does the success of a SIB fund-raising effort rest? E.g. SVA’s strong brochure, Westpac’s work with existing customers, Social Finance’s building of closer relationships and confidence with investors.
  • Procurement – how do SIBs challenge the assumptions, processes of procurement? Comparison of jurisdictions. Implications for changes in law.
  • What do you procure for i.e. organisation, idea, full blown proposal or service – advantages and disadvantages from cases around the world.
  • Making responses public – e.g. Illinois for their request for information – does this impede or encourage innovation?
  • Appropriation risk (US only?) – how is this quantified, perceived, legislated against and what are the effects of that on a SIB? Conversely, how might the increased profile of a SIB affect appropriation risk?
  • Unstable governments – what are the implications for contractual partnerships in those countries? What’s the financial cost too?
  • Relationship between payment metric, measurement confidence and timing of payments – perhaps already in existing literature on performance-based contracting.
  • The role of guarantees – how does it change perception of what you’re doing by different stakeholders?
  • Who initiates and drives a SIB – how does this shape perceptions of risks and benefits by different stakeholders?


Alex Nicholls and I published a set of questions in our Case Study: The Peterborough Pilot Social Impact Bond (2013, published by Saïd Business School, University of Oxford) around the often overlooked issue of legacy: what happens when the SIB is over. Not all of these are research questions, but studies that go some way to addressing these broader questions might be useful. The following is quoted directly from the paper.

One key question of a SIB is when and why should it end? Other key questions to consider are set out below from the perspective of each stakeholder typically involved in a SIB:

Government
  • How to institutionalise innovation into future welfare programmes and in the wider social services market?
  • If early prevention is successful, how to maintain and fund preventative services after the SIB ends? Do SIBs need to ‘rollover’ to produce sustainable change?
  • How can SIB outcomes data (likelihood, effect size, cost of delivery, value or savings to tax payer, related externalities/proxy outcomes) drive better commissioning across government?
  • How to achieve key outcomes post SIB?
  • How to continue to grow the social finance market to fund welfare services?
  • How to report on and share SIB learning and data more widely?
  • How to calculate savings from SIB interventions?

Investors
  • How to develop a secondary market exit?
  • How to develop a follow-on SIB investment?
  • How to adjust risk and return dynamically to the availability of new information from SIBs in the market?
  • How to tranche investments in a single SIB according to different risk and return profiles and different personal costs of capital?

Service Providers

  • How to ensure continuity of funding of increased capacity?
  • How to institutionalize SIB performance data?
  • How to build capacity to engage in future SIBs?
  • How to manage on-going collaborative relationships?
  • How to disseminate learning?
  • How to leave a community stronger when a service ends?

Intermediaries
  • How to build a pipeline of SIB deals?
  • How to build capacity in providers so that they are stronger for having worked on a SIB?
  • How to continuously innovate?
  • Where to apply SIBs and develop other models that build upon SIB learning?
  • When are SIBs no longer necessary, if ever?
  • How to build a business model, given high transaction costs?
  • How better to segment the investor market to the real, rather than perceived, risk and return opportunities of SIBs?
  • How to manage the involvement of commercial, rather than purely social, investors in terms of expectations of high returns and the potential for risk dumping?

Service Users

  • How to ensure that a service gap does not arise for current participants and relevant future populations/cohorts?
  • How to avoid worse outcomes in the long term?
  • Will improved outcomes be sustained for those who participated in a SIB?

Fewer criminals or less crime? Frequency v binary measures in criminal justice

The June 2013 interim results released by the Ministry of Justice gave us a chance to examine the relationship between the number of criminals and the number of crimes they commit. The number of criminals is referred to as a binary measure, since offenders can be in only one of two categories: those who reoffend and those who don’t. The number of crimes is referred to as a frequency measure, as it focuses on how many crimes a reoffender commits.

The payments for the Peterborough SIB are based on the frequency measure. Please note that the interim results are not calculated in precisely the same way as the payments for the SIB will be made. [update: the results from the first cohort of the Peterborough SIB were released in August 2014 showing a reduction in offending of 8.4% compared to the matched national comparison group.]

In the period in which the Peterborough SIB delivered services to the first cohort (9 Sept 2010 – 1 July 2012), the frequency of crimes committed over the six months following each prisoner’s release reduced by 6.9% and the proportion of reoffenders by 5.8%. In the same period, there was a national increase in continuing criminals of 5.4%, but an even larger increase of 14.5% in the number of crimes they committed. The current burning issue is not that there are more reoffenders; it is that those who reoffend are reoffending more frequently.

Criminals (binary measure) in this instance are defined as the “Proportion of offenders who commit one or more proven reoffences”. A proven reoffence means “proven by conviction at court or a caution either in those 12 months or in a further 6 months”, rather than simply being arrested or charged.

Crime (frequency measure) in this instance is defined as “Any re-conviction event (sentencing occasion) relating to offences committed in the 12 months following release from prison, and resulting in conviction at court either in those 12 months or in a further 6 months (Note: excludes cautions).”

The two measures are related – you would generally expect more criminals to commit more crimes. But the way reoffending results are measured creates incentives for service providers. If our purpose is to reduce crime and really help those who impose the greatest costs on our society and justice system, we would choose a frequency measure of the number of crimes. If our purpose is to help those who might commit one or two more crimes to abstain from committing any at all, then we would choose a binary measure.

Source of data: NSW Bureau of Crime Statistics and Research
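To make the distinction concrete, here is a minimal sketch of the two measures in Python. The function names and the cohort data are illustrative assumptions, not Ministry of Justice definitions or figures; the point is only to show how the same cohort can look very different under each measure.

```python
# Illustrative sketch: binary vs frequency reoffending measures.
# Data and names are hypothetical, not official statistics.

def binary_measure(reoffence_counts):
    """Proportion of offenders with one or more proven reoffences."""
    reoffenders = sum(1 for n in reoffence_counts if n >= 1)
    return reoffenders / len(reoffence_counts)

def frequency_measure(reoffence_counts):
    """Average number of reconviction events per offender released."""
    return sum(reoffence_counts) / len(reoffence_counts)

# Ten released prisoners: most desist, two reoffend prolifically.
before = [0, 0, 0, 0, 0, 0, 0, 1, 6, 8]
print(binary_measure(before))     # 0.3  (30% reoffend)
print(frequency_measure(before))  # 1.5  (reconvictions per person)

# A service that halves the prolific offenders' reoffending barely
# moves the binary measure but sharply improves the frequency measure.
after = [0, 0, 0, 0, 0, 0, 0, 1, 3, 4]
print(binary_measure(after))      # 0.3  (unchanged)
print(frequency_measure(after))   # 0.8
```

The second cohort illustrates why the choice of metric drives provider incentives: under a binary measure, helping a prolific offender offend less (but still offend) counts for nothing, while under a frequency measure it is exactly where the largest gains lie.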

The effect of the binary measure in practice: Doncaster Prison

A Payment by Results (PbR) pilot was launched in October 2011 at Doncaster Prison to test the impact of a PbR model on reducing reconvictions. The pilot is being delivered by Serco and Catch22 (‘the Alliance’). The impact of the pilot is being assessed using a binary outcome measure, which is the proportion of prison leavers who are convicted of one or more offences in the 12 months following their release. The Alliance chose to withdraw community support for offenders who are reconvicted within the 12 month period post-release as they feel that this does not represent the best use of their resources. Some delivery staff reported frustration that support is withdrawn, undermining the interventions previously undertaken. (Ministry of Justice, Process Evaluation of the HMP Doncaster Payment by Results Pilot: Phase 2 findings.)

I have heard politicians and policy makers argue that the public are more interested in reducing or ‘fixing’ criminals than helping them offend less, and thus the success of our programmes needs to be based on a binary measure. I don’t think it’s that hard to make a case for reducing crime. People can relate to a reduction in aggravated burglaries. Let’s get intentional with the measures we use.

Start a mistakes log

“No one is exempt from the rule that learning occurs through recognition of error.” – Alexander Lowen, Bioenergetics

There are too many lessons we’re missing out on because of our tendency to publish only good results. It’s perfectly understandable to want to promote wins, but publishing mistakes, and what’s been learned from them, may be even more valuable.

Ben Goldacre is crusading against publication bias in evidence-based medicine. He is one of the forces behind http://www.alltrials.net/, an online petition to get all medical trials registered and all results reported. This is important stuff.

But apart from medicine, those of us involved in designing and delivering social programmes continue to repeat the mistakes of the past, because we simply don’t know enough about what has happened. I’m a strong believer in evidence-based policy, but evidence of policy history and why things failed is rarely captured and shared. Might it be possible for us to value mistakes enough to create incentives for their publication?

Curt Rosengren writes in his blog, the genius of mistakes:

You might even try keeping a mistake genius journal. Not a place for you to berate yourself for how many mistakes you make, but a place for you to actively learn from what has happened. Explore the mistake, explore what insights you’ve gained as a result, and summarize those insights into key points.

One organisation that’s created a ‘mistakes genius journal’ is GiveWell in the US, with a section on its website, Our Shortcomings, logging its mistakes and what it has done in response. My opinion of the organisation was heightened by this discovery – this honest recognition and promotion of continuous improvement may have the opposite effect to the one most would expect from publishing mistakes. Yes, we’re all worried about tabloid headlines, but wouldn’t a story be a little less exciting when it’s not a secret ‘uncovered’, but a quote taken straight from the organisation’s own public website? Imagine how wonderful it would be if governments and service providers kept similar logs!

As we try to design new services and financial products to address entrenched problems in this emerging social investment market, it would be really valuable to know what didn’t work out for others and most importantly, what they changed in response.

Allia recently showed an exemplary commitment to learning following the closure of their Future for Children Bond, which was the first opportunity for retail investors to invest a proportion of funds in a social impact bond, but failed to raise sufficient capital.

As a first pilot product, the Future for Children Bond has nevertheless been hugely valuable in assessing the retail market for social investment and generating learning about the steps needed to enable it to grow. These lessons will be used to inform the development of future Allia products and will be shared with the sector, together with policy recommendations, in a report by NPC to be published in May.

So here’s to seeing a whole lot more mistakes logs and lessons learned appearing in the public domain – great PR and enhanced social impact – what is there not to like?

How can people with more information be both more confident and more wrong?

I become truly frustrated when faced with someone who insists they are an expert in something they know absolutely nothing about. As a believer in evidence, my first instinct is to provide them with more information, but perhaps this isn’t always a good idea.

The overconfidence effect is a decision-making bias in which a person’s confidence in a decision is greater than their accuracy in making it. A great article by Hall, Ariss and Todorov (2007), The illusion of knowledge: When more information reduces accuracy and increases confidence, asked participants to predict and bet on basketball games and found that increasing information didn’t have the expected effects. Their findings have implications for settings such as political campaigns, where the decision to provide voters with an abundance of accurate information countering false claims may not have the desired effect. Nyhan and Reifler’s paper When Corrections Fail: The Persistence of Political Misperceptions describes this ‘backfire effect’, in which corrections to mock news articles reinforced belief in the original, incorrect version.

So sometimes holding back on the evidence lesson is a good idea with an ignorant audience. And perhaps if I’m so sure I’m right, I might have it so wrong I should be keeping my mouth shut anyway!