Mixed analytical teams in practice
Innovative methods of data manipulation
The role of experimental designs
Mixed analytical teams in practice
Researcher or statistician - Tracey Budd, Department for Transport
In this presentation, Tracey Budd discussed her own experiences of working in mixed analytical teams in the Home Office and the Department for Transport. The presentation began by breaking down stereotypes often associated with social researchers and statisticians before discussing the benefits of working in mixed analytical teams. Benefits included:
The presentation also considered the implications of moving between professional groups and reviewed the management, departmental and professional issues that arise in developing mixed analytical teams.
R&D tax credits: A cross-disciplinary evaluation experience - Elizabeth Spratt & Alice Dwyer, HM Revenue and Customs
Using the example of a multi-disciplinary evaluation of tax credits for small businesses, this presentation began with an overview of how analysts from across the professions work together in HMRC’s Knowledge, Analysis and Information teams. This example illustrated the value of working in mixed teams and considered a number of factors including:
How a lone social researcher can make a splash in a sea of other analysts - Mike Tibble, Department for Work and Pensions
Drawing on experiences in the DWP’s fraud and error team, Mike Tibble discussed the added value that social research brought to the development of an evidence-based strategy for tackling fraud and error in the benefits system. The involvement of social research brought a number of benefits including:
General discussion: This focused on whether there were enough opportunities to work across the professions, how easy or difficult it is to move between professional groups, and whether such movement is desirable or necessary. The implications of moving towards a ‘generic analyst’ were also considered, in particular those relating to recruitment, support and the added benefits that this might bring.
Innovative methods of data manipulation
Methods to improve the estimation of mortality and life expectancy - Brian Johnson, Office for National Statistics
Brian Johnson from the Office for National Statistics (ONS) presented his paper on ‘Methods to improve the estimation of mortality and life expectancy using the ONS Longitudinal Study’.
The ONS Longitudinal Study (LS) contains linked census and vital event data for one per cent of the population of England and Wales. Information from the 1971, 1981, 1991 and 2001 Censuses has been linked across censuses, together with information on events such as births, deaths and cancer registrations.
Since it uses (anonymised) census data, the LS has the advantage of low levels of attrition. It also allows social indicators such as occupation to be attributed several years before death, rather than relying on information recorded at death registration. Because that information is usually provided by relatives of the deceased, it may ‘promote’ people above their occupational status. Using census data reduces this so-called numerator/denominator bias.
In theory, each member of the LS should have a valid exit (either a death or an embarkation), or they should be recorded at the next census. However, at each census there are LS members who are unaccounted for: they have not been recorded as dying or emigrating, but there are no records linking them to the latest census. This has been an increasing problem at each successive census. These losses are due to three main reasons:
These losses introduce bias, as it is not realistic to assume that they are randomly distributed across social groups. Social class V has proportionately more unrecorded losses than Social class I, and the treatment of this unaccounted-for group can have a significant impact on life expectancy estimates.
There are several options available to deal with this. The first two, to include or to exclude these people in total, are not viable because of the bias issues. The other options involve weighting the data in different ways. One way is to apply 1970s embarkation rates to the 1990s, adjusted by ONS International Passenger Survey data for the two decades. This has the advantage of not needing a new data source, but it does rely on assumptions about the relative accuracy of 1970s and 1990s embarkation data. Alternatively, a new approach is to use recent quality assurance analyses based on health authority de-registration as a proxy for unobserved embarkation. This allows a direct estimate of unrecorded emigration by age and gender, and is flexible in that it can be applied to any social classification on the census. However, it is difficult to test the underlying hypothesis and some embarkations are still unrecorded. Investigation of the potential of this approach continues at ONS.
Linking the Labour Force Survey to the Inter-Departmental Business Register - Matt Fido, Office for National Statistics
Matt Fido, also from the ONS, then presented his paper on linking the Labour Force Survey (LFS) to the Inter-Departmental Business Register (IDBR).
The LFS currently underestimates employment levels within industrial sectors in the UK, particularly the public/private sector split. This is mainly because respondents to the LFS ('employees') have a different perception of company activities from head office respondents to business surveys. The National Statistics Quality Review and the Allsopp Review recommended linking the LFS to the IDBR to improve employment and sectoral estimates, and investigating how the benefits could be transferred to other surveys. This would allow estimates such as productivity by industry, public and private sector employment, skills, and small and medium-sized enterprise statistics to be improved.
A feasibility study, using existing LFS questions with telephone respondents, took place in early 2005. The data was then 'cleaned' using Matchcode to improve the address matching to the IDBR, which produced a matching rate of 35-40% between LFS and IDBR employment records. The lessons learnt from the feasibility study included the importance of the postcode, the problems posed by non-standard employment patterns (e.g. travelling salesmen), legal names differing from trading names, and the need to differentiate between who respondents reported to and who they were paid by.
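The kind of automatic matching described above can be illustrated with a minimal sketch. It is not the Matchcode software or the ONS procedure: the records, field names, postcodes and the helper functions `normalise_postcode`, `normalise_name` and `link` are all hypothetical, and the rule used (exact match on cleaned employer name plus cleaned postcode) is an assumption chosen for simplicity. It does show why a missing postcode, or a trading name that differs from the legal name, lowers the matching rate.

```python
import re

# Company suffixes dropped before comparing names (an illustrative choice).
SUFFIXES = {"ltd", "limited", "plc", "the"}

def normalise_postcode(raw):
    """Upper-case the postcode and strip spaces and punctuation."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())

def normalise_name(raw):
    """Lower-case the name, drop punctuation and common company suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    return " ".join(w for w in words if w not in SUFFIXES)

def link(survey_records, register):
    """Return {survey id: register ref} for records whose cleaned
    (name, postcode) pair is found in the register."""
    index = {}
    for unit in register:
        key = (normalise_name(unit["name"]), normalise_postcode(unit["postcode"]))
        index[key] = unit["ref"]
    matches = {}
    for rec in survey_records:
        postcode = normalise_postcode(rec["postcode"])
        if not postcode:  # without a postcode there is no automatic match
            continue
        key = (normalise_name(rec["employer"]), postcode)
        if key in index:
            matches[rec["id"]] = index[key]
    return matches

# Hypothetical register entries and survey responses (made-up postcodes).
register = [
    {"ref": "R1", "name": "Acme Widgets Ltd", "postcode": "ZZ1 1AA"},
    {"ref": "R2", "name": "Northern Haulage PLC", "postcode": "ZZ2 2BB"},
    {"ref": "R3", "name": "Smith Holdings Limited", "postcode": "ZZ3 3CC"},
]
survey = [
    {"id": 1, "employer": "acme widgets", "postcode": "zz11aa"},
    {"id": 2, "employer": "Northern Haulage", "postcode": "ZZ2 2BB"},
    {"id": 3, "employer": "Smiths", "postcode": "ZZ3 3CC"},   # trading name differs
    {"id": 4, "employer": "Acme Widgets", "postcode": ""},    # postcode missing
]

matches = link(survey, register)
match_rate = len(matches) / len(survey)
print(f"Automatic matching rate: {match_rate:.0%}")
```

In this toy example two of the four survey records link automatically. The other two fail for reasons the feasibility study identified: a trading name that differs from the legal name, and a missing postcode.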
The pilot study in September/October 2005 attempted to address the lessons learnt from the feasibility study with increased interviewer training, improved questionnaire design and tighter coding for responses. The automatic matching rate improved to 50%, the data quality increased, and there was a willingness to provide the required information. Also, early analysis has seen improvements to the public/private split and industrial sector employment estimates. ONS are currently evaluating the study and looking to introduce the necessary changes to the LFS in 2007, which should provide further opportunities to evaluate the project.
The role of experimental designs
Utilising Random Assignment: Employment Retention Advancement Demonstration Operational Challenges - Karl Olsen, Department for Work and Pensions
This research used a Randomised Control Trial (RCT) of the Employment Retention Advancement (ERA) policy to test interventions for Jobcentre Plus.
The trial was run in six districts with three customer groups: i) New Deal lone parents, ii) New Deal 25+, and iii) Working Tax Credits.
Key issues:
Propensity Score Matching - Elizabeth Whiting and Lucy Cuppleditch, Home Office
Propensity Score Matching (PSM) is a matched comparison design: cases in a treatment group are matched with cases in a non-treatment group using a number of characteristics. Each case is given a score based on a combination of these characteristics, and cases with similar scores are then matched so that, as far as possible, the only remaining difference is the impact of treatment.
The researchers used PSM to assess the effectiveness of different sentences. A propensity score is created by generating a predicted probability of being in the treatment group using logistic regression. In this case the model predicted whether an offender was given a custodial sentence. Once the propensity score has been calculated there are three methods available for matching:
The Caliper method was chosen. PSM has some limitations, including:
The researchers also looked at Prolific and other Priority Offenders (PPOs). The advantage with this group is that not all PPOs are on the list; the drawback, however, is that classification varies between areas. The PPO sample was compared with a general offending population and the differences were noted: for example, 50 per cent of the PPO sample had 35 or more previous offences, whereas 50 per cent of the general offending population had 5 or more offences. This confirmed that there were significant differences between the samples and that the differences could be modelled.
Using data from the Police National Computer (PNC), the researchers applied the Caliper method. The advantage of the PNC is that if no match is found the sample can be increased (final sample approximately 200,000). Without this possibility there is a danger of over-matching, i.e. the same case being matched repeatedly.
The model was shown to work on all counts excluding ethnicity. As the PPO scheme is on-going the researchers were also able to use their model for prediction, which again was shown to be accurate.
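The two steps described in this presentation, estimating a propensity score with logistic regression and then matching within a caliper, can be sketched as follows. This is a minimal illustration on synthetic data, not the researchers' implementation. The covariates, the caliper width of 0.05, the helper names `fit_logistic` and `caliper_match`, and the choice of greedy one-to-one matching without replacement are all assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def caliper_match(scores, treated, caliper=0.05):
    """Greedy 1:1 matching without replacement: each treated case takes the
    nearest unused control, but only if their scores differ by <= caliper."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not controls:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)  # a control can be used only once
    return pairs

# Synthetic data (an assumption, not the study's data): two covariates
# scaled to 0-1, with treatment more likely when the first one is high.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
treated = [1 if random.random() < sigmoid(-1.0 + 2.0 * x[0]) else 0 for x in X]

# Step 1: propensity score = predicted probability of being in the treatment group.
w, b = fit_logistic(X, treated)
scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]

# Step 2: match treated cases to controls within the caliper.
pairs = caliper_match(scores, treated)
print(f"{sum(treated)} treated cases, {len(pairs)} matched within the caliper")
```

Treated cases with no control inside the caliper are left unmatched rather than forced into a poor match. Matching without replacement is used here so that no control case is matched more than once; matching with replacement is an alternative with different trade-offs.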
Refugees Integration: Piloting Experimental Design in Evaluating Refugee Integration - Nathanael Bevan, Home Office
This research used Propensity Score Matching (PSM) to evaluate Sunrise. Sunrise is a voluntary pilot being run in four areas of the UK that offers up to 17 hours of caseworker support for participants. The presentation focused on the impact assessment aim, which was to measure the effect that Sunrise has on the housing and employment outcomes of new refugees.
A Randomised Control Trial was dismissed as an alternative approach because it would have led to:
These are problems that, in different circumstances, could be overcome. Early involvement in the formation of a policy and some control over its delivery are crucial.
As no administrative data sources were available, the researcher conducted a literature review, focus groups, and interviews with practitioners to establish the main factors that would predict participation and outcome. The sample (all asylum seekers granted refugee status between January and December 2006) was then sent a postal survey and will be followed up three times at six-month intervals.
There were a number of challenges in using PSM:
The impact of Sunrise will be measured using the following two approaches:
Small Grant Schemes: Scottish Executive Education Department Sponsored Research Programme - Rod Harrison, Scottish Executive
New Ideas Fund - Karen Bathgate and Robert Willis, Welsh Assembly Government
Collaborative Partnership Opportunities - David Ridley, ESRC