British Household Panel Survey (BHPS)

Main uses

The core of BHPS is a panel survey, meaning that the same households are surveyed each year.  Thus, unlike other surveys, it can be used to analyse data either longitudinally (e.g. how does a household's income change over time?) or in terms of persistency (e.g. for households in low income in year X, how long do they remain in low income?).

BHPS's second main use comes from its inclusion of particular subjects which are not in other UK-wide or Great Britain-wide surveys, with examples including consumer durables, participation and risk of mental illness.  However, it is a relatively small survey and should therefore not be used when there are alternative surveys with the required data.



In summary:

The BHPS files in the UK data archive are unusual in two respects:

Finally, note that the individual-level files are for adults only (i.e. there are no records for children).


General issues

As will become apparent from the discussion below, BHPS is a difficult dataset to use and it is easy to make mistakes.  Furthermore, the manual is less helpful than it should be: although comprehensive in scope, its details often do not seem to correspond completely to the datasets themselves.  For both these reasons, it is important to familiarise yourself thoroughly with the dataset before using it.

Which software to use

As each annual dataset contains only around 16,000 records for individuals and 8,000 records for households, it is small enough to be exported into Excel.

Which files to use

Because the dataset contains all the data for all the years that BHPS has been in existence, each file name has to have a prefix to indicate the year to which the file applies.  These prefixes range from 'a' for 1991/92 through to 'o' for 2005/06.
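The prefix rule can be sketched in a few lines of Python.  This is an illustration only: the function names are hypothetical, and it assumes that the prefix is simply prepended to a table name such as hhresp, which should be checked against the actual file names in the archive.

```python
def wave_prefix(start_year: int) -> str:
    """Return the prefix letter for the survey year beginning in start_year."""
    first, last = 1991, 2005  # 'a' = 1991/92 through to 'o' = 2005/06
    if not first <= start_year <= last:
        raise ValueError(f"no BHPS prefix for a survey year starting in {start_year}")
    return chr(ord("a") + start_year - first)


def wave_file(start_year: int, table: str) -> str:
    """Prepend the prefix letter to a table name (assumed naming pattern)."""
    return wave_prefix(start_year) + table


print(wave_prefix(1991))          # a
print(wave_file(2005, "hhresp"))  # ohhresp
```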

For household-level analyses, use the household-level files.  For individual-level analyses use the individual-level files, linking these as necessary to selected household-level files for relevant household-level data.

Because the household-level data for a particular year is spread across a number of files, it is likely that any analysis will need to link some of these files.  This can be done using the household identification number.  Similarly, the individual-level files can be linked together using a combination of the household identification number and person number.

Note that household income is not available for all households, and the list of households in the household income files is somewhat different from that in the other files.
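A minimal sketch of the linking described above, and of a check for households that have no income record, might look as follows.  The field names 'hid' (household identification number), 'pno' (person number), 'region' and 'income' are placeholders rather than the real BHPS variable names, and the rows are invented.

```python
# Invented rows with placeholder field names (not the real BHPS variables).
individuals = [
    {"hid": 101, "pno": 1, "age": 34},
    {"hid": 101, "pno": 2, "age": 36},
    {"hid": 102, "pno": 1, "age": 71},
]
households = [{"hid": 101, "region": "Wales"}, {"hid": 102, "region": "Scotland"}]
incomes = [{"hid": 101, "income": 412.0}]  # household 102 has no income record


def person_key(row):
    """Individuals are identified by household id plus person number."""
    return (row["hid"], row["pno"])


def link_on_household(people, household_rows):
    """Left-join household-level fields onto individual-level rows by 'hid'."""
    by_hid = {row["hid"]: row for row in household_rows}
    return [{**by_hid.get(person["hid"], {}), **person} for person in people]


# The combined key should be unique within an individual-level file.
assert len({person_key(p) for p in individuals}) == len(individuals)

linked = link_on_household(link_on_household(individuals, households), incomes)
missing_income = sorted({p["hid"] for p in linked if "income" not in p})
print(missing_income)  # [102]
```

Using a left join keeps every individual in the result, so households that are absent from the income file show up as missing values rather than silently disappearing from the analysis.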

Which parts of the files to be used

The core of each BHPS dataset is a panel who are surveyed each year.  This is called the 'Essex' sample.  In addition, this core is supplemented each year by additional households in Scotland, Wales and Northern Ireland, the purpose being to provide sample sizes which are sufficient for analysis at the home country level.  Most analyses will simply use the whole sample.

Which weights to use

Unlike most survey datasets, which have a single weight field, BHPS has many weight fields and the issue therefore arises of which to use when.  Furthermore, some of the weights are very different so the choice is important.

The general format of the names for the weight fields is summarised in the table below.


Character: Meaning

First character: the year to which the dataset applies, from 'a' for the first year through to 'o' for the latest year.

Second character:
'l' if it is a longitudinal weight; or
'x' if it is not a longitudinal weight.

Third character:
'e' if it is an individual-level weight for the whole population; or
'r' if it is an individual-level weight but only for the actual respondents to the survey questions (basically the adults only and not the children); or
'h' if it is a household-level weight for the whole population.

Fourth character onwards:
'wtuk1' if the weight is to be used to analyse the data at the UK-wide level and the whole sample is to be used; or
'wtuk2' if the weight is to be used to analyse the data at the home country level and the whole sample is to be used; or
'wght' if the weight is to be used to analyse the data at the UK-wide level but only using the core 'Essex' sample (see above); or
'sw1': it is not clear that one should ever use such a weight; or
'sw2': it is not clear that one should ever use such a weight.

So, for example, if what is wanted is a normal UK-wide analysis at the individual level using the latest year's data, then the weight to use is 'oxewtuk1'.  If the same analysis is to be done for Scotland only, then the weight to use is 'oxewtuk2'.
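The naming rule can be expressed as a small helper.  This is a sketch only: the function name and the option labels ('whole_population', 'home_country' and so on) are invented for illustration, and the function merely assembles a name, so whether a given combination actually exists in the files still needs to be checked.

```python
def weight_field(wave: str, longitudinal: bool, level: str, scope: str) -> str:
    """Assemble a BHPS weight field name from its four parts."""
    levels = {"whole_population": "e", "respondents": "r", "household": "h"}
    scopes = {"uk": "wtuk1", "home_country": "wtuk2", "essex": "wght"}
    if len(wave) != 1 or not "a" <= wave <= "o":
        raise ValueError(f"unknown wave letter: {wave!r}")
    return wave + ("l" if longitudinal else "x") + levels[level] + scopes[scope]


print(weight_field("o", False, "whole_population", "uk"))            # oxewtuk1
print(weight_field("o", False, "whole_population", "home_country"))  # oxewtuk2
```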


Specific issues

Analysis by region

Because the sample size is boosted for Scotland, Wales and Northern Ireland, analyses for these countries can be undertaken.  Because of the small overall size of the survey, however, analysis below the level of England as a whole is not recommended.

Analysis by household income

Such analyses can be undertaken but there are a number of caveats:


Relevant graphs on this website

UK graphs

Indicator: Persistent low income
Graphs: all
Tables: indall, hhsamp and nethh
Comments:

Requires a complicated analysis.

Use indall for the weights, hhsamp for the region, and nethh for household income and the sample origin.  Link these tables using a combination of household identification number and person number.

Use the lewght weights because the analysis concerns longitudinal transitions (hence the l), includes children (hence the e) and uses the original Essex sample only (hence wght rather than one of the UK weights).

For each year and each individual, calculate whether the person was in income poverty or not.  Then link the requisite years together using the person identification number.  Then, for those individuals who were surveyed in each of the years, allocate them to the requisite poverty groups.
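The three steps above can be sketched as follows.  Everything in the example is invented for illustration: the person identifiers, incomes, low-income thresholds and weights are placeholders, and grouping people by the number of years spent in low income is only one possible way of defining the poverty groups.

```python
# Invented data: {wave: {person_id: household_income}}.
income_by_wave = {
    "a": {1: 90.0, 2: 250.0, 3: 80.0},
    "b": {1: 95.0, 2: 240.0, 3: 300.0},
    "c": {1: 85.0, 2: 260.0},  # person 3 was not surveyed in wave 'c'
}
threshold_by_wave = {"a": 100.0, "b": 100.0, "c": 100.0}  # assumed thresholds
longitudinal_weight = {1: 1.5, 2: 0.5}  # placeholder weights


def years_in_low_income(income_by_wave, threshold_by_wave):
    """Count low-income years for people surveyed in every wave."""
    # Step 1: flag each person in each wave as in low income or not.
    flags = {
        wave: {pid: income < threshold_by_wave[wave] for pid, income in people.items()}
        for wave, people in income_by_wave.items()
    }
    # Step 2: keep only the people who appear in all of the waves.
    in_every_wave = set.intersection(*(set(people) for people in flags.values()))
    # Step 3: count the waves in which each remaining person was in low income.
    return {pid: sum(flags[wave][pid] for wave in flags) for pid in sorted(in_every_wave)}


counts = years_in_low_income(income_by_wave, threshold_by_wave)
print(counts)  # {1: 3, 2: 0}

# Weighted share of people who were in low income in all three waves.
persistent = sum(longitudinal_weight[p] for p, n in counts.items() if n == 3)
share = persistent / sum(longitudinal_weight[p] for p in counts)
print(share)  # 0.75
```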

Indicator: Lacking consumer durables
Graphs: all
Tables: hhresp
Comments:

Use the xhwtuk1 weights because the analysis is not longitudinal (hence the x), is at the household level (hence the h) and is for the UK as a whole (hence the 1).

Allocate each record to a household income quintile, calculating the quintile thresholds required to achieve this.
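One way of calculating weighted quintile thresholds and allocating records to quintiles is sketched below.  The function names and figures are illustrative, and the rule used here (a threshold is the first income at which the cumulative weight reaches each fifth of the total) is an assumption, as other conventions for weighted quantiles exist.

```python
def weighted_quintile_thresholds(values, weights):
    """Return four thresholds splitting the weighted sample into fifths."""
    total = sum(weights)
    thresholds, cumulative, target = [], 0.0, 1
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        # A threshold is set once the running weight reaches the next fifth.
        while target <= 4 and cumulative >= target * total / 5:
            thresholds.append(value)
            target += 1
    return thresholds


def quintile(value, thresholds):
    """Return 1 for the poorest fifth through to 5 for the richest fifth."""
    return 1 + sum(value > t for t in thresholds)


incomes = [10, 20, 30, 40, 50, 60]  # invented household incomes
weights = [2, 2, 1, 1, 2, 2]        # invented household weights

cuts = weighted_quintile_thresholds(incomes, weights)
print(cuts)                                 # [10, 20, 40, 50]
print([quintile(v, cuts) for v in incomes])  # [1, 2, 3, 3, 4, 5]
```

In this example each quintile carries a weighted total of 2, even though the third quintile contains two records and the others contain one.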

Indicator: Non-participation
Graphs: first and second
Tables: indresp and hhresp
Comments:

Use the xrwtuk1 weights because the analysis is not longitudinal (hence the x), is for adults only (hence the r) and is for the UK as a whole (hence the 1).

Allocate each record to a household income quintile using the allocations from hhresp and linking the two tables using household identification number.

Use ONS mid-year population estimates as necessary to translate proportions into absolute numbers.
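The translation from proportions into absolute numbers can be sketched as below.  The flags, weights and population figure are placeholders invented for the example; in practice the population would come from the relevant ONS mid-year estimate.

```python
def weighted_proportion(flags, weights):
    """Weighted share of records for which the flag is true."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


def to_absolute(proportion, population):
    """Scale a proportion up to a population count."""
    return round(proportion * population)


flags = [True, False, False, True]  # e.g. whether each adult is a non-participant
weights = [1.0, 2.0, 1.0, 1.0]      # placeholder cross-sectional weights
population = 1_000_000              # placeholder, not a real ONS estimate

share = weighted_proportion(flags, weights)
print(share)                           # 0.4
print(to_absolute(share, population))  # 400000
```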

Scotland, Wales and Northern Ireland graphs

These are effectively a subset of the UK graphs using government region (hhresp) as a filter and, in line with the rules discussed earlier, using the weights with the suffix 2 rather than 1.