
Table 1 Terminology of variable categorisation.

From: Harmonisation of variables names prior to conducting statistical analyses with multiple datasets: an automated approach

| Terms | Description | Examples |
| --- | --- | --- |
| Dataset | A Stata data file (extension '.dta') containing the data to be analysed. | ke_DHS_41.dta |
| Observation | Each of the subjects for which data, in the form of variables, have been collected. | Observations are numbered from 1 to the total number of observations |
| Variable | Each item of information recorded for each subject in a dataset. | Vaccination with the third dose of DTP |
| Variable name | The name given to the variable, which is used for data management and analyses. | DTP3, hi8, im |
| Variable label | Free text explaining the information contained in the variable. | 'DTP3 vaccination status of the child' |
| Variables of interest | Variables defined by the user, which are to be included in the analyses and which have to be searched for in the datasets. | 'DTP3', if the vaccination status of the third DTP dose will be used in the analyses |
| Candidate variables | Existing variables in the datasets (e.g. surveys) which need to be renamed to the names of the variables of interest in order to be harmonised. | 'im8', 'im15'... (these are variables holding the vaccination status of the third DTP dose) |
| Value | The numerical, logical, date, time or string information for a given variable in an observation. | 1, 2, 9 |
| Value label | Text label attached to each possible value of a variable (in a categorical variable, or for certain values of non-categorical variables). | 1: not vaccinated; 2: vaccinated; 9: unknown |
| [Commands] (*) | Terms and expressions used in Stata to undertake data management or analytical actions. | [display], [regress], [lookfor], [codebook] |
| Do, do file | Files in text format that store commands and that Stata can execute in sequence. | Start.do |
| 'Current' | The term current (applied to variables or datasets) indicates the variables being considered in a programme at run time or the datasets loaded in memory. | No example |

(*) Stata commands are written in brackets throughout this article.
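
To make the terminology concrete, the sketch below models a tiny dataset in Python and harmonises one candidate variable to a variable of interest. It is an illustration only, not the article's Stata programme: the dictionary layout, the helper names `find_candidates` and `rename_variable`, and the `caseid` variable are all assumptions introduced for this example. The search step only loosely mirrors what [lookfor] does in Stata (matching a keyword against variable names and variable labels).

```python
# Hypothetical in-memory stand-in for a dataset; the layout is an assumption.
dataset = {
    # variable name -> variable label
    "variable_labels": {
        "caseid": "Case identifier",  # hypothetical variable
        "im8": "DTP3 vaccination status of the child",
    },
    # variable name -> {value: value label}
    "value_labels": {
        "im8": {1: "not vaccinated", 2: "vaccinated", 9: "unknown"},
    },
    # one dict per observation: variable name -> value
    "observations": [
        {"caseid": 1, "im8": 1},
        {"caseid": 2, "im8": 2},
        {"caseid": 3, "im8": 9},
    ],
}


def find_candidates(variable_labels, keyword):
    """Return names of variables whose name or label contains the keyword.

    The match is case-insensitive.
    """
    key = keyword.lower()
    return [
        name
        for name, label in variable_labels.items()
        if key in name.lower() or key in label.lower()
    ]


def rename_variable(data, old, new):
    """Rename a variable in the labels, value labels and every observation."""
    for section in ("variable_labels", "value_labels"):
        table = data[section]
        if old in table:
            table[new] = table.pop(old)
    for obs in data["observations"]:
        if old in obs:
            obs[new] = obs.pop(old)


# 'DTP3' is the variable of interest; search the current dataset for candidates.
candidates = find_candidates(dataset["variable_labels"], "DTP3")

# Rename only when the match is unambiguous, to avoid name collisions.
if len(candidates) == 1:
    rename_variable(dataset, candidates[0], "DTP3")
```

After the rename, the variable label and value labels remain attached to the harmonised name, so the same analysis code can refer to 'DTP3' in every dataset processed this way.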