Program Library
Programs to get started using the HILDA data
Matching files
These programs show how to match household and responding person files for Wave 1. Program 3 also matches enumerated files.
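As a rough illustration of the matching step (not one of the library programs), the sketch below attaches household-level variables to each responding person by a shared household identifier. The identifier name `ahhrhid`, the other variable names and the tiny data sets are assumptions made for this example only.

```python
# Hypothetical miniature Wave 1 files; real files hold many more variables.
household = [
    {"ahhrhid": "100001", "ahhsize": 2},
    {"ahhrhid": "100002", "ahhsize": 1},
]
rperson = [
    {"xwaveid": "0100001", "ahhrhid": "100001", "aage": 34},
    {"xwaveid": "0100002", "ahhrhid": "100001", "aage": 36},
    {"xwaveid": "0100003", "ahhrhid": "100002", "aage": 71},
]

def match_household(persons, households, key):
    """Attach household variables to each person row (many-to-one match)."""
    by_household = {h[key]: h for h in households}
    matched = []
    for person in persons:
        hh = by_household.get(person[key], {})  # empty if no household record
        matched.append({**hh, **person})        # person values win on name clashes
    return matched

matched = match_household(rperson, household, "ahhrhid")
```

Each person keeps one row, so two people in the same household both receive that household's values.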
Adding partner variables
Some users may want to include variables for a respondent’s partner in their analyses. These programs show how to utilise the partner’s cross-wave identifier _hhpxid to add partner variables onto the responding person file.
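The idea can be sketched as a match of the responding person file onto itself: each respondent's partner identifier is looked up among the respondents' own cross-wave identifiers. This is a minimal sketch, assuming the Wave 1 partner identifier is named `ahhpxid` (the `_hhpxid` variable with an `a` prefix); `awage`, the `p_` prefix and the data are hypothetical.

```python
# Hypothetical Wave 1 responding person rows; "" means no partner recorded.
rperson = [
    {"xwaveid": "0100001", "ahhpxid": "0100002", "awage": 500},
    {"xwaveid": "0100002", "ahhpxid": "0100001", "awage": 650},
    {"xwaveid": "0100003", "ahhpxid": "", "awage": 400},
]

def add_partner_vars(persons, partner_id, variables, prefix="p_"):
    """Copy the listed variables from each person's partner row, if one exists."""
    by_id = {p["xwaveid"]: p for p in persons}
    out = []
    for person in persons:
        partner = by_id.get(person[partner_id])  # None if single or partner did not respond
        row = dict(person)
        for var in variables:
            row[prefix + var] = partner[var] if partner else None
        out.append(row)
    return out

with_partner = add_partner_vars(rperson, "ahhpxid", ["awage"])
```

Partner variables are left missing (`None`) when there is no partner or the partner has no row in the responding person file.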
Creating longitudinal files
You may want to create a wide longitudinal file or a long longitudinal file. In a wide longitudinal file, you would put the variables for each wave next to each other (that is, there is one row of data for each person). In a long longitudinal file, you would remove the wave prefix and stack the information for each wave together (that is, there is a separate row of data for each wave of each person). If you are looking at changes across two specific waves, you may prefer to work with a wide longitudinal file. However, most users will create a long longitudinal file.
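The two layouts can be compared with a small sketch. It assumes that the wave prefix is a single letter (`a` for Wave 1, `b` for Wave 2), that variables starting with `x`, such as `xwaveid`, carry no wave prefix, and that `wage` is a made-up variable; none of this is taken from the library programs themselves.

```python
import string

# Hypothetical per-wave responding person files (wave letter -> rows).
wave_files = {
    "a": [{"xwaveid": "0100001", "awage": 500}],
    "b": [{"xwaveid": "0100001", "bwage": 520},
          {"xwaveid": "0100002", "bwage": 300}],
}

def to_wide(wave_files):
    """One row per person; wave-prefixed variables sit side by side."""
    wide = {}
    for rows in wave_files.values():
        for row in rows:
            wide.setdefault(row["xwaveid"], {}).update(row)
    return list(wide.values())

def to_long(wave_files):
    """One row per person per wave; the wave prefix is removed."""
    long_rows = []
    for letter in sorted(wave_files):
        wave = string.ascii_lowercase.index(letter) + 1  # 'a' -> 1, 'b' -> 2
        for row in wave_files[letter]:
            long_row = {"wave": wave}
            for name, value in row.items():
                # Assumption: names starting with "x" have no wave prefix.
                long_row[name if name.startswith("x") else name[1:]] = value
            long_rows.append(long_row)
    return long_rows

wide = to_wide(wave_files)
long_rows = to_long(wave_files)
```

Here the wide file has two rows (one per person) and the long file has three (one per person per wave responded).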
Generally, you will start with an unbalanced panel (which includes all responding persons or enumerated persons each wave). Sometimes, you will restrict the unbalanced panel to a balanced panel (which includes respondents in both waves in a pair of waves, or all waves in a set of waves).
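Restricting an unbalanced panel to a balanced one amounts to keeping only people observed in every wave of interest. The sketch below assumes a long file with a numeric `wave` column and the `xwaveid` identifier; the `wage` variable and the data are hypothetical.

```python
# Hypothetical unbalanced long file: person 0100002 is missing in wave 1.
long_rows = [
    {"xwaveid": "0100001", "wave": 1, "wage": 500},
    {"xwaveid": "0100001", "wave": 2, "wage": 520},
    {"xwaveid": "0100002", "wave": 2, "wage": 300},
    {"xwaveid": "0100003", "wave": 1, "wage": 410},
    {"xwaveid": "0100003", "wave": 2, "wage": 430},
    {"xwaveid": "0100003", "wave": 3, "wage": 450},
]

def balance(rows, waves):
    """Keep rows for people present in all requested waves, for those waves only."""
    required = set(waves)
    seen = {}
    for row in rows:
        seen.setdefault(row["xwaveid"], set()).add(row["wave"])
    keep = {pid for pid, present in seen.items() if required <= present}
    return [r for r in rows if r["xwaveid"] in keep and r["wave"] in required]

balanced = balance(long_rows, waves=[1, 2])
```

With waves 1 and 2 requested, persons 0100001 and 0100003 are retained (two rows each) and person 0100002 is dropped.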
The following programs show how to create an unbalanced panel of responding persons.
An alternative way to strip the first letter from a variable name is demonstrated in this macro:
- Program 13 (SAS) (originally provided by Bruce Bradbury)
The wide files are created by matching the responding or enumerated files for each wave together using xwaveid. Example programs to create wide files are provided in:
The longitudinal weights on the enumerated and responding persons files are for the full, balanced panel of respondents and enumerated persons from Wave 1 (i.e., across the first two, three, four, etc. waves).
If you are constructing a balanced panel with different specifications, you should find a suitable weight in the longitudinal weights file. Out-of-scope cases (deaths and moves overseas) are treated as acceptable outcomes, so these people have weights applied as well.
Applying weights
The HILDA Survey has a complex sample design, non-response and attrition. These should be taken into account when producing descriptive statistics. The following programs show how this can be done.
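To show what applying a weight does to a simple statistic, here is a minimal sketch of a weighted mean. The variable names `wage` and `weight` and the values are hypothetical, and the sketch covers only the point estimate; standard errors that reflect the sample design need additional methods.

```python
# Hypothetical respondents: the second person represents three times as many people.
rows = [
    {"wage": 100, "weight": 1},
    {"wage": 200, "weight": 3},
]

def weighted_mean(rows, var, weight):
    """Mean of `var` with each row counted in proportion to its weight."""
    total_weight = sum(r[weight] for r in rows)
    return sum(r[var] * r[weight] for r in rows) / total_weight

unweighted = sum(r["wage"] for r in rows) / len(rows)
weighted = weighted_mean(rows, "wage", "weight")
```

The unweighted mean treats both respondents equally, while the weighted mean moves towards the value of the respondent with the larger weight.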
Programs provided by users
- SAS macro and example program for the calculation of standard errors via the Jackknife method. (Provided by Associate Professor Bruce Bradbury in 2009.)
- SAS programs to create a long longitudinal file, repeat code across multiple HILDA Survey waves and strip the first letter from variable names unless it is an X. (Provided by Associate Professor Bruce Bradbury.)
- Stata ado program called hildasetup to create a longitudinal file in long format from the HILDA Survey datasets by extracting variables from the combined files, master file and longitudinal weights file. (Provided by Dr Paco Perales in 2013.)
- PanelWhiz provides a common front-end to a range of panel datasets, including HILDA. Amongst other functions, the program allows users to select vectors of variables for automatic matching and merging in Stata/SE. (Developed by Professor John de New, formerly Haisken-DeNew.)
- R package ‘hildareadR’ for constructing longitudinal HILDA files. Similar to the Stata ado program (hildasetup) above. (Provided by Sebastian Kalucza and Sara Kalucza in 2020.)
Disclaimer
The HILDA Survey team does not take any responsibility for the correctness of these programs.
Contributions
Users of the HILDA Survey data may contribute code to this library if they believe it may be beneficial to other users. To contribute, please send your code to hilda-inquiries@unimelb.edu.au.
Selected derived variable programs
HILDA tax-benefit model
This Stata program implements the HILDA tax-benefit model, which the HILDA Survey team uses to estimate income taxes and family benefits. The program code makes explicit the assumptions, parameters and formulas of this model. Users may want to use the program to recalculate taxes and family benefits themselves.
The program works with both the general and the restricted release of HILDA. Note that, for some observations, the program will produce values that are slightly different from the official values when it is used with the general release. This is because the general release provides less precise information on persons’ dates of birth and because some income variables of the general release are top-coded.