From: Preparation of name and address data for record linkage using hidden Markov models
Symbol | Description | Usage | Based on |
---|---|---|---|
LQ | Locality qualifier words | Addresses | Look-up table |
LN | Locality (town, suburb) names | Addresses | Look-up table |
TR | Territory (state, region) names | Addresses | Look-up table |
CR | Country names | Addresses | Look-up table |
IT | Types of institution | Addresses | Look-up table |
IN | Names of institutions | Addresses | Look-up table |
PA | Type of postal address | Addresses | Look-up table |
PC | Postal (zip) codes | Addresses | Look-up table |
UT | Types of housing unit (eg flat, apartment) | Addresses | Look-up table |
WN | Wayfare names | Addresses | Look-up table |
WT | Wayfare types (eg street, road, avenue) | Addresses | Look-up table |
TI | Title words (eg Dr, Prof, Ms) | Names | Look-up table |
SN | Surnames | Names | Look-up table |
GF | Female given names | Names | Look-up table |
GM | Male given names | Names | Look-up table |
PR | Name prefixes | Names | Look-up table |
SP | Name qualifiers (eg aka, also known as) | Names | Look-up table |
BO | "baby of" and similar strings | Names | Look-up table |
NE | "Nee", "born as" or similar | Names | Look-up table |
II | One-letter words (initials) | Names | Coded rule |
ST | Saint names (eg Saint George, San Angelo) | Both | Look-up table |
CO | Comma, semi-colon, colon | Both | Coded rule |
SL | Slash "/" and back-slash "\" | Both | Coded rule |
N4 | Numbers with four digits | Addresses | Coded rule |
NU | Other numbers | Both | Coded rule |
AN | Alphanumeric words | Both | Coded rule |
VB | Brackets, braces, quotes | Both | Coded rule |
RU | Rubbish | Both | Look-up table |
UN | Unknown (none of the above) | Both | Coded rule |
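
The tagging scheme in the table can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LOOKUP` dictionary is a tiny hypothetical stand-in for the real look-up tables, the function names `tokenize` and `tag_tokens` are invented for illustration, and the order in which the rules are tried (look-up tables first, then coded rules) is an assumption. Each token receives a single tag here, and the "Usage" column is honoured only for `N4` (addresses) and `II` (names).

```python
import re

# Hypothetical miniature look-up tables (token -> tag); real tables
# would hold many thousands of entries per tag.
LOOKUP = {
    "dr": "TI", "peter": "GM", "miller": "SN",
    "flat": "UT", "main": "WN", "street": "WT",
    "sydney": "LN", "nsw": "TR",
}

def tokenize(text):
    """Split into alphanumeric words and single punctuation characters."""
    return re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", text)

def tag_token(token, kind):
    """Return one tag for a token; kind is 'name' or 'address'."""
    t = token.lower()
    if t in LOOKUP:                      # look-up table tags
        return LOOKUP[t]
    if t in {",", ";", ":"}:             # coded rules from here on
        return "CO"
    if t in {"/", "\\"}:
        return "SL"
    if t in set("()[]{}\"'"):
        return "VB"
    if t.isdigit():
        if kind == "address" and len(t) == 4:
            return "N4"
        return "NU"
    if kind == "name" and len(t) == 1 and t.isalpha():
        return "II"
    has_digit = any(c.isdigit() for c in t)
    has_alpha = any(c.isalpha() for c in t)
    if t.isalnum() and has_digit and has_alpha:
        return "AN"
    return "UN"                          # none of the above

def tag_tokens(text, kind):
    """Tag every token of a name or address string."""
    return [tag_token(tok, kind) for tok in tokenize(text)]

print(tag_tokens("Flat 2, 42 Main Street, Sydney NSW 2000", "address"))
print(tag_tokens("Dr Peter J Miller", "name"))
```

In a fuller version a token could carry several candidate tags (for example, a word that is both a surname and a locality name), and the resulting tag sequences would then be passed on to the hidden Markov model rather than used directly.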