Unified Data Dictionary
This standardized data dictionary documents the common variables shared across all Orca datasets within G-Trac. These variables have been harmonized to enable cross-study analyses while preserving study-specific details.
Note: Each study dataset also includes study-specific fields documented separately:
- GI-DAMPs Columns
- MUSIC Columns
- Mini-MUSIC Columns
Naming Convention
All variables in Orca datasets follow the snake_case naming convention:
- All characters are lowercase
- Words are separated by underscores
- No spaces or special characters (except underscores)
- Examples: study_id, nhs_bloods_date, hbi_total
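The rules above can be checked mechanically. The following is a minimal sketch, not part of the Orca tooling: the function name `is_snake_case` is hypothetical, and the pattern makes two assumptions beyond the stated rules, namely that digits are allowed after the first letter and that leading, trailing, or doubled underscores are rejected.

```python
import re

# Assumption: a name starts with a lowercase letter, may contain digits,
# and uses single underscores only between words.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(name: str) -> bool:
    """Return True if `name` follows the snake_case convention sketched above."""
    return SNAKE_CASE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["study_id", "nhs_bloods_date", "hbi_total", "StudyId", "study id"]:
        print(candidate, is_snake_case(candidate))
```

A check like this could be run over a dataset's column names before merging studies, so that naming drift is caught early.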
Variable Organization
The sections below group the shared fields into logical themes to help you quickly find related variables across datasets. For study-specific variables, refer to the individual dataset column lists.
Demographics & Visit Context
| Variable | Type | Values | Comments |
| --- | --- | --- | --- |
| study_id | string | | Study-specific participant identifier using prefixes such as GID-, MID-, or MINI-. |
| study_group | string | cd, uc, ibdu, non_ibd, await_dx, hc | Disease classification at the visit: Crohn's disease (cd), ulcerative colitis (uc), IBD unclassified (ibdu), non-IBD control (non_ibd), awaiting diagnosis (await_dx), or healthy control (hc). |
| study_center | string | edinburgh, glasgow, aberdeen, dundee | Recruiting centre recorded for the visit. |
| sex | string | male, female | Participant sex recorded by the study. |
| age | int | | Participant age in years at the time of the visit. |
| age_at_diagnosis | int | | Age in years when inflammatory bowel disease was first diagnosed. |
| date_of_diagnosis | date | YYYY-MM-DD | Calendar date when inflammatory bowel disease was first diagnosed. |
| has_active_symptoms | string | 1 / 0 or yes / no | Indicator of active symptoms at the visit; recorded as 1/0 or yes/no depending on source system. |
| height | int | | Height measured in centimetres. |
| weight | float | | Weight measured in kilograms. |
| bmi | float | | Body mass index calculated as kg/m^2. |
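Two fields in this table need care when used in analysis: has_active_symptoms arrives in two encodings, and bmi depends on height being stored in centimetres. The sketch below shows one possible way to handle both. The function names are hypothetical, and treating anything other than 1/0/yes/no as missing is an assumption, not documented Orca behaviour.

```python
from typing import Optional

# Encodings listed in the dictionary for has_active_symptoms.
_TRUE_CODES = {"1", "yes"}
_FALSE_CODES = {"0", "no"}


def normalize_active_symptoms(raw) -> Optional[bool]:
    """Map 1/0 or yes/no (any case, as str or int) to a bool.

    Assumption: any other value, including None or an empty string,
    is treated as missing and returned as None.
    """
    if raw is None:
        return None
    text = str(raw).strip().lower()
    if text in _TRUE_CODES:
        return True
    if text in _FALSE_CODES:
        return False
    return None


def bmi_from_height_weight(height_cm: float, weight_kg: float) -> float:
    """Compute BMI in kg/m^2 from height in centimetres and weight in kilograms."""
    height_m = height_cm / 100.0  # the dictionary stores height in cm
    return weight_kg / (height_m ** 2)


if __name__ == "__main__":
    print(normalize_active_symptoms("Yes"))
    print(normalize_active_symptoms(0))
    print(round(bmi_from_height_weight(180, 81.0), 1))
```

Recomputing BMI this way can serve as a consistency check against the stored bmi column; it is not meant to replace the stored value.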
Laboratory Values
| Variable | Type | Values | Comments |
| --- | --- | --- | --- |
| nhs_bloods_date | date | YYYY-MM-DD | Date the NHS blood panel was collected. |
| haemoglobin | float | | Haemoglobin concentration (g/L) from NHS bloods. |
| haematocrit | float | | Haematocrit reported as a proportion (L/L). |
| white_cell_count | float | | Total white cell count (x10^9/L) from NHS bloods. |
| neutrophils | float | | Absolute neutrophil count (x10^9/L) from NHS bloods. |
| lymphocytes | float | | Absolute lymphocyte count (x10^9/L) from NHS bloods. |
| monocytes | float | | Absolute monocyte count (x10^9/L) from NHS bloods. |
| basophils | float | | Absolute basophil count (x10^9/L) from NHS bloods. |
| eosinophils | float | | Absolute eosinophil count (x10^9/L) from NHS bloods. |
| platelets | int | | Platelet count (x10^9/L) from NHS bloods. |
| sodium | float | | Serum sodium (mmol/L) from NHS bloods. |
| potassium | float | | Serum potassium (mmol/L) from NHS bloods. |
| urea | float | | Serum urea (mmol/L) from NHS bloods. |
| creatinine | float | | Serum creatinine (umol/L) from NHS bloods. |
| albumin | float | | Serum albumin (g/L) from NHS bloods. |
| crp | float | | C-reactive protein (mg/L) from NHS bloods. |
| calprotectin | string | numeric or threshold string (e.g., <20, >1800, no sample) | Faecal calprotectin result (ug/g) reported as numeric values or qualitative thresholds by the laboratory. |
| calprotectin_date | date | YYYY-MM-DD | Date the faecal calprotectin sample was collected. |
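Because calprotectin is stored as a string, it has to be parsed before numeric analysis. The following sketch separates the number from any < or > qualifier. The names `LabResult` and `parse_threshold_result` are hypothetical, and returning a missing value for non-numeric text such as "no sample" is an assumption about how an analyst might want to treat it.

```python
import re
from typing import NamedTuple, Optional


class LabResult(NamedTuple):
    value: Optional[float]  # numeric part, or None if not parseable
    qualifier: str          # "<", ">", or "" for an exact value


# Optional < or > followed by a non-negative number, e.g. "<20", ">1800", "153.5".
_RESULT_PATTERN = re.compile(r"\s*([<>]?)\s*(\d+(?:\.\d+)?)\s*")


def parse_threshold_result(raw: str) -> LabResult:
    """Split a lab string into (value, qualifier).

    Assumption: text that is not a number or threshold (e.g. "no sample")
    is returned as LabResult(None, "").
    """
    match = _RESULT_PATTERN.fullmatch(raw)
    if match is None:
        return LabResult(None, "")
    qualifier, number = match.groups()
    return LabResult(float(number), qualifier)


if __name__ == "__main__":
    for text in ["<20", ">1800", "153", "no sample"]:
        print(text, parse_threshold_result(text))
```

Keeping the qualifier alongside the value lets a downstream analysis decide how to treat censored results (for example, substituting the threshold itself) instead of silently discarding that information.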
IBD Phenotyping
| Variable | Type | Values | Comments |
| --- | --- | --- | --- |
| baseline_eims_none | int | 1 yes, 0 no | Baseline survey response confirming no extra-intestinal manifestations were reported. |
| baseline_eims_episcleritis | int | 1 yes, 0 no | Baseline record indicating a history of episcleritis as an extra-intestinal manifestation. |
| baseline_eims_erythema_nodosum | int | 1 yes, 0 no | Baseline record indicating erythema nodosum as an extra-intestinal manifestation. |
| baseline_eims_pyoderma | int | 1 yes, 0 no | Baseline record indicating pyoderma gangrenosum as an extra-intestinal manifestation. |
| baseline_eims_sacroileitis | int | 1 yes, 0 no | Baseline record indicating sacroiliitis as an extra-intestinal manifestation. |
| baseline_eims_uveitis | int | 1 yes, 0 no | Baseline record indicating uveitis as an extra-intestinal manifestation. |
| previous_appendicectomy | int | 1 yes, 0 no | History of appendicectomy recorded at baseline. |
| previous_tonsillectomy | int | 1 yes, 0 no, -1000 unknown | History of tonsillectomy recorded at baseline; -1000 is used where the source study captured an explicit unknown response. |
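The -1000 sentinel in previous_tonsillectomy will distort counts and means if it is treated as an ordinary number. One way to decode it is sketched below. The function name `decode_yes_no` is hypothetical, and raising an error on any code other than 1, 0, or -1000 is a design choice made for this example, not documented behaviour.

```python
from typing import Optional

# Sentinel documented for an explicit "unknown" response.
UNKNOWN_SENTINEL = -1000


def decode_yes_no(value: int) -> Optional[bool]:
    """Decode a 1/0/-1000 coded field into True, False, or None (unknown)."""
    if value == UNKNOWN_SENTINEL:
        return None
    if value in (0, 1):
        return bool(value)
    # Assumption: any other code signals a data problem worth surfacing.
    raise ValueError(f"unexpected code: {value!r}")


if __name__ == "__main__":
    print([decode_yes_no(v) for v in (1, 0, -1000)])
```

Decoding at load time keeps the sentinel out of arithmetic, so a proportion of participants with a prior tonsillectomy is computed over known responses only.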
IBD Medications & Monitoring
| Variable | Type | Values | Comments |
| --- | --- | --- | --- |
| sampling_asa | int | 1 yes, 0 no | Aminosalicylates (5-ASA) were being taken at the time of sampling. |
| sampling_ada | int | 1 yes, 0 no | Adalimumab was being taken at the time of sampling. |
| sampling_ifx | int | 1 yes, 0 no | Infliximab was being taken at the time of sampling. |
| sampling_mtx | int | 1 yes, 0 no | Methotrexate was being taken at the time of sampling. |
| sampling_uste | int | 1 yes, 0 no | Ustekinumab was being taken at the time of sampling. |
| sampling_vedo | int | 1 yes, 0 no | Vedolizumab was being taken at the time of sampling. |
| drug_level_date | date | YYYY-MM-DD | Date the therapeutic drug monitoring sample (for biologic drug or antibody levels) was taken. |
| ada_level | string | assay result in ug/mL (numeric or threshold string) | Measured adalimumab concentration from therapeutic drug monitoring; labs often report numeric levels or threshold strings such as >12. |
| ada_antibody | string | assay result (e.g., 5, <10, not_tested) | Anti-adalimumab antibody result captured during therapeutic drug monitoring; qualitative thresholds or numeric titres reported by the lab. |
| ifx_level | string | assay result in ug/mL (numeric or threshold string) | Measured infliximab concentration from therapeutic drug monitoring; labs report numeric values or thresholds such as >12. |
| ifx_antibody | string | assay result (e.g., 5, <10, not_detected) | Anti-infliximab antibody result from therapeutic drug monitoring; values include numeric titres or qualitative thresholds. |
Investigations & Imaging
| Variable | Type | Values | Comments |
| --- | --- | --- | --- |
| endoscopy_date | date | YYYY-MM-DD | Date of the most recent endoscopy associated with the visit. |
| endoscopy_report | string | | Narrative endoscopy findings recorded in clinical notes. |
| pathology_report | string | | Narrative pathology findings (e.g., biopsy results). |
| mri_small_bowel | int | 1 performed, 0 not performed | Indicator that a small bowel MRI was completed for the visit. |
| mri_pelvis | int | 1 performed, 0 not performed | Indicator that a pelvic MRI was completed for the visit. |
| mri_small_bowel_report | string | | Narrative findings from the small bowel MRI report. |