Program Library

Programs to get started using the HILDA data

Matching files

These programs show how to match household and responding person files for Wave 1. Program 3 also matches enumerated files.
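The matching step can be sketched in a few lines. The library's own programs are written for statistical packages; the plain Python below only illustrates the logic of a many-to-one match. The household identifier name `ahhrhid` and the columns `hh_region` and `age` are assumptions made for this example, so check the data dictionary for the actual variable names.

```python
# Toy records standing in for the Wave 1 household and responding person files.
# Column names other than xwaveid are illustrative assumptions.
households = [
    {"ahhrhid": "H1", "hh_region": "NSW"},
    {"ahhrhid": "H2", "hh_region": "VIC"},
]
persons = [
    {"xwaveid": "P1", "ahhrhid": "H1", "age": 34},
    {"xwaveid": "P2", "ahhrhid": "H1", "age": 36},
    {"xwaveid": "P3", "ahhrhid": "H2", "age": 51},
]


def match_household_to_person(persons, households, key):
    """Attach household-level columns to every person row sharing the key."""
    by_household = {h[key]: h for h in households}
    matched = []
    for person in persons:
        row = dict(person)
        # Persons with no matching household keep only their own columns.
        row.update(by_household.get(person[key], {}))
        matched.append(row)
    return matched


matched = match_household_to_person(persons, households, "ahhrhid")
```

Each person row keeps its own columns and gains the household columns, so two people in the same household share the same household values.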

Adding partner variables

Some users may want to include variables for a respondent’s partner in their analyses. These programs show how to use the partner’s cross-wave identifier, _hhpxid, to add partner variables to the responding person file.
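As a rough illustration of the idea, the sketch below looks up each respondent's partner through the partner identifier and copies selected partner values onto the respondent's row. It is plain Python rather than one of the library's programs. The variable `awage`, the `p_` prefix and the use of an empty string for "no partner" are assumptions made for this example.

```python
# Toy Wave 1 responding person file. ahhpxid holds the partner's xwaveid;
# an empty string is assumed here to mean "no partner in the household".
persons = [
    {"xwaveid": "0100001", "ahhpxid": "0100002", "awage": 900},
    {"xwaveid": "0100002", "ahhpxid": "0100001", "awage": 1200},
    {"xwaveid": "0100003", "ahhpxid": "", "awage": 700},
]


def add_partner_variables(persons, partner_id, variables, prefix="p_"):
    """Copy the listed variables from each person's partner onto their row."""
    by_id = {p["xwaveid"]: p for p in persons}
    result = []
    for person in persons:
        row = dict(person)
        partner = by_id.get(person[partner_id])
        for name in variables:
            # Missing partners (or non-responding partners) yield None.
            row[prefix + name] = partner[name] if partner is not None else None
        result.append(row)
    return result


with_partner = add_partner_variables(persons, "ahhpxid", ["awage"])
```

Partner values are None when the partner is not on the file, which also covers partners who did not respond.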

Creating longitudinal files

Users can create a balanced longitudinal file in a number of ways.

Wide file of responding persons: A dataset of people who responded in all waves. Variables for each wave are placed next to each other, creating one row of data per person.
Wide file of enumerated persons: A dataset of people living in responding households in all waves. Variables for each wave are placed next to each other.
Long file of responding persons: A dataset of people who responded in all waves. The information for each wave is stacked together, creating a separate row of data for each wave of information for each person.
Long file of enumerated persons: A dataset of people living in responding households in all waves. The information for each wave is stacked together.

Most users will probably want to restrict the files to only include respondents or people from responding households. A few users may also want to add people who have died or moved out of scope (depending on the research question they are answering).

The following programs show how to create balanced long longitudinal files of responding persons.

The wide files are created by matching the responding or enumerated files for each wave together using xwaveid. An alternative way to strip the first letter from a variable name is demonstrated in this macro:

Program 12 (SAS) (originally provided by Bruce Bradbury)
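The relationship between the wide and long layouts, and the role of stripping the first letter, can be sketched as follows. This is plain Python for illustration, not a translation of the SAS macro. It assumes every column other than xwaveid starts with a one-letter wave prefix (a for Wave 1, b for Wave 2, and so on); the variables `age` and `emp` are hypothetical.

```python
import string


def wide_to_long(wide_rows, n_waves, id_var="xwaveid"):
    """Turn one row per person into one row per person per wave.

    Assumes each non-identifier column name begins with its wave letter.
    The identifier is never stripped, so xwaveid keeps its leading x.
    """
    wave_letters = string.ascii_lowercase[:n_waves]
    long_rows = []
    for row in wide_rows:
        for wave_number, letter in enumerate(wave_letters, start=1):
            long_row = {id_var: row[id_var], "wave": wave_number}
            for name, value in row.items():
                if name != id_var and name.startswith(letter):
                    # Drop the wave letter so names line up across waves.
                    long_row[name[1:]] = value
            long_rows.append(long_row)
    return long_rows


wide = [{"xwaveid": "0100001", "aage": 30, "bage": 31, "aemp": 1, "bemp": 0}]
long_rows = wide_to_long(wide, n_waves=2)
```

The single wide row becomes two long rows, one per wave, each with the unprefixed names `age` and `emp` plus a `wave` column.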

Some users may want to create an unbalanced panel, which takes all respondents or enumerated persons available at each wave (not just those who consistently respond or are consistently in responding households).

Program 13 (Stata) demonstrates how to create either a balanced or unbalanced panel.
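The distinction between the two panel types can be sketched in plain Python. This is not the Stata program itself: it assumes a long-format file with one row per person per wave observed, and a numeric `wave` column, both of which are assumptions made for this example.

```python
from collections import defaultdict


def make_panel(long_rows, n_waves, balanced=True, id_var="xwaveid"):
    """Return a balanced or unbalanced panel from long-format rows.

    Balanced: keep only people observed in every wave 1..n_waves.
    Unbalanced: keep every row that is available.
    """
    if not balanced:
        return list(long_rows)
    waves_seen = defaultdict(set)
    for row in long_rows:
        waves_seen[row[id_var]].add(row["wave"])
    all_waves = set(range(1, n_waves + 1))
    return [row for row in long_rows if waves_seen[row[id_var]] >= all_waves]


rows = [
    {"xwaveid": "P1", "wave": 1},
    {"xwaveid": "P1", "wave": 2},
    {"xwaveid": "P2", "wave": 1},  # P2 is missing in wave 2
]
balanced_panel = make_panel(rows, n_waves=2, balanced=True)
unbalanced_panel = make_panel(rows, n_waves=2, balanced=False)
```

Here the balanced panel keeps only the two rows for P1, while the unbalanced panel keeps all three rows.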

The following programs show how to create wide longitudinal files.

The longitudinal weights on the enumerated and responding persons files are for the full, balanced panel of respondents and enumerated persons from Wave 1 (i.e., across the first two, three, four, etc. waves).

If you are constructing a balanced panel with different specifications, you should find a suitable weight in the longitudinal weights file. Out-of-scope cases (deaths and moves overseas) are treated as acceptable outcomes, so these people are assigned weights as well.

Applying Weights

The HILDA Survey has a complex sample design, non-response and attrition. These should be taken into account when creating descriptive statistics. The following programs show how this can be done.
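As a minimal illustration of why weights matter, the sketch below computes a weighted mean in plain Python. It covers only the point estimate; it does not produce standard errors, which also depend on the sample design. The values and weights are made-up numbers, not HILDA data.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the sum of weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up example: the third respondent represents twice as many people.
values = [10, 20, 30]
weights = [1, 1, 2]
estimate = weighted_mean(values, weights)
unweighted = sum(values) / len(values)
```

With these numbers the weighted estimate is 22.5 while the unweighted mean is 20, because the respondent with the largest weight also has the largest value.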

Programs provided by users

SAS macro and example program for the calculation of standard errors via the Jackknife method. (Provided by Associate Professor Bruce Bradbury in 2009.)
SAS programs to create a long longitudinal file, repeat code across multiple HILDA Survey waves and strip the first letter from variable names unless it is an X. (Provided by Associate Professor Bruce Bradbury.)
Stata ado program called hildasetup to create a longitudinal file in long format from the HILDA Survey datasets by extracting variables from the combined files, master file and longitudinal weights file. (Provided by Dr Paco Perales in 2013.)
PanelWhiz provides a common front-end to a range of panel datasets, including HILDA. Amongst other functions, the program allows users to select vectors of variables for automatic matching and merging in Stata/SE. (Developed by Professor John de New (formerly Haisken-DeNew).)
R package ‘hildareadR’ for constructing longitudinal HILDA files. It is similar to the Stata ado program (hildasetup) above. (Provided by Sebastian Kalucza and Sara Kalucza in 2020.)

Disclaimer

The HILDA Survey team does not take any responsibility for the correctness of these programs.

Contributions

Users of the HILDA Survey data may contribute code to this library if they believe it may be beneficial to other users. To contribute, please send your code to hilda-inquiries@unimelb.edu.au.

Selected derived variable programs

HILDA tax-benefit model

This Stata program implements the HILDA tax-benefit model, which the HILDA Survey team uses to estimate income taxes and family benefits. The program code makes explicit the assumptions, parameters and formulas of this model. Users may want to use the program to recalculate taxes and family benefits themselves.

The program works with both the general and the restricted release of HILDA. Note that, for some observations, the program will produce values that are slightly different from the official values when it is used with the general release. This is because the general release provides less precise information on persons’ dates of birth and because some income variables of the general release are top-coded.