DISCLAIMER: AI AUTOGENERATED README. LAST UPDATE 2025-04-29.
OrcaVault Models
This directory contains the dbt models for the OrcaVault data warehouse. The models are organized into several layers following a modified Data Vault architecture.
Directory Structure
- dcl/ - Data Vault Core Layer
  - Contains the core Data Vault models (Hubs, Links, Satellites)
  - Implements the raw vault pattern for storing historical and current data
  - Includes schema definitions and constraints
- mart/ - Business Data Marts
  - Contains dimensional models for specific business areas
  - Organized by research centers or teams:
    - centre/
    - curation/
    - dawson/
    - grimmond/
    - tothill/
  - Provides business-friendly views of the data
- psa/ - Persistent Staging Area
  - Contains models for staging data from source systems
  - Preserves raw data with minimal transformations
  - Includes models for spreadsheet data (Google LIMS, library tracking)
- ods/ - Operational Data Store
  - Contains source definitions for operational data
- tsa/ - Temporary Staging Area
  - Contains source definitions for temporary staging data
- legacy/ - Legacy Data Sources
  - Contains source definitions for legacy systems
Data Vault Modeling
The Data Vault Core Layer (dcl/) follows the Data Vault 2.0 methodology:
- Hubs (hub_*.sql): Store business keys and their source
- Links (link_*.sql): Store relationships between business entities
- Satellites (sat_*.sql): Store descriptive attributes and their history
- Effectivity Satellites (effsat_*.sql): Track the validity of relationships over time
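To make the roles of hubs, links, and satellites concrete, here is a minimal Python sketch of the row shapes each one typically holds. It is an illustration only, not the OrcaVault implementation: the MD5 hashing, the `||` delimiter, the column names (`hub_hk`, `link_hk`, `hash_diff`, and so on), and the sample keys are all assumptions that do not come from this repository.

```python
import hashlib
from datetime import datetime, timezone
from typing import Sequence

DELIMITER = "||"  # assumption: separator between concatenated key parts


def _md5(text: str) -> str:
    """Return the hex MD5 digest of a string (the hash choice is an assumption)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def hash_key(*parts: str) -> str:
    """Hash trimmed, upper-cased key parts into a fixed-length key."""
    return _md5(DELIMITER.join(p.strip().upper() for p in parts))


def hub_row(business_key: str, record_source: str, load_ts: datetime) -> dict:
    """Hub: one row per business key, plus where and when it was loaded."""
    return {
        "hub_hk": hash_key(business_key),
        "business_key": business_key,
        "record_source": record_source,
        "load_ts": load_ts,
    }


def link_row(hub_hks: Sequence[str], record_source: str, load_ts: datetime) -> dict:
    """Link: a relationship, keyed by the hash of the participating hub keys."""
    return {
        "link_hk": hash_key(*hub_hks),
        "hub_hks": list(hub_hks),
        "record_source": record_source,
        "load_ts": load_ts,
    }


def sat_row(parent_hk: str, attributes: dict, record_source: str, load_ts: datetime) -> dict:
    """Satellite: descriptive attributes plus a hash over their values.

    A new row is only needed when hash_diff differs from the latest stored one.
    """
    values = DELIMITER.join(str(attributes[name]) for name in sorted(attributes))
    return {
        "parent_hk": parent_hk,
        "hash_diff": _md5(values),
        "attributes": dict(attributes),
        "record_source": record_source,
        "load_ts": load_ts,
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical business keys and source name, for illustration only
    library = hub_row("L2100001", "google_lims", now)
    sample = hub_row("S0001", "google_lims", now)
    link = link_row([library["hub_hk"], sample["hub_hk"]], "google_lims", now)
    sat = sat_row(library["hub_hk"], {"assay": "WGS"}, "google_lims", now)
    print(library["hub_hk"], link["link_hk"], sat["hash_diff"])
```

Because the hub key is derived only from the normalised business key, the same entity arriving from two source systems resolves to the same hub row, which is what lets the core layer integrate data across sources.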
Dimensional Modeling
The Business Data Marts (mart/) implement dimensional models:
- Fact tables containing measures and foreign keys to dimensions
- Dimension tables containing descriptive attributes
- Organized by business domain (e.g., fastq, bam, lims, workflow)
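The fact and dimension split can be sketched with an in-memory SQLite database. This is a toy example rather than a model from this repository: the table names (`dim_library`, `fact_fastq`), the columns, and the sample rows are all hypothetical.

```python
import sqlite3


def build_demo_mart() -> sqlite3.Connection:
    """Create a tiny star schema: one dimension table and one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension: descriptive attributes, one row per library
        CREATE TABLE dim_library (
            library_key INTEGER PRIMARY KEY,
            library_id  TEXT,
            assay       TEXT
        );
        -- Fact: measures plus a foreign key to the dimension
        CREATE TABLE fact_fastq (
            fastq_key   INTEGER PRIMARY KEY,
            library_key INTEGER REFERENCES dim_library (library_key),
            read_count  INTEGER
        );
        INSERT INTO dim_library VALUES (1, 'LIB001', 'WGS'), (2, 'LIB002', 'WTS');
        INSERT INTO fact_fastq VALUES (1, 1, 100), (2, 1, 150), (3, 2, 70);
        """
    )
    return conn


def reads_per_assay(conn: sqlite3.Connection) -> dict:
    """Aggregate the fact measure by a dimension attribute."""
    rows = conn.execute(
        """
        SELECT d.assay, SUM(f.read_count)
        FROM fact_fastq AS f
        JOIN dim_library AS d ON d.library_key = f.library_key
        GROUP BY d.assay
        ORDER BY d.assay
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(reads_per_assay(build_demo_mart()))  # {'WGS': 250, 'WTS': 70}
```

The query shows the usual reporting pattern: measures come from the fact table, while grouping and filtering columns come from the dimensions it points to.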
Usage
These models are designed to be built using dbt (data build tool). The models follow a layered architecture where:
- Raw data is loaded into the PSA/TSA layers
- The Data Vault Core Layer (DCL) integrates and historizes the data
- Business Data Marts transform the data into dimensional models for reporting and analysis
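The flow above can be pictured as an ordering over the layer directories. The Python sketch below sorts model paths so that staging models come before core models, which come before marts. It is illustrative only: dbt works out the real build order from the dependencies declared between models, and the file names used here are hypothetical. The ods/ and legacy/ directories are left out because the flow above does not place them.

```python
from pathlib import PurePosixPath

# Assumed ranking, taken from the flow described above: staging first,
# then the Data Vault core layer, then the marts.
LAYER_RANK = {"psa": 0, "tsa": 0, "dcl": 1, "mart": 2}


def layer_of(model_path: str) -> str:
    """Return the top-level layer directory of a path such as 'dcl/hub_library.sql'."""
    return PurePosixPath(model_path).parts[0]


def order_by_layer(model_paths):
    """Sort paths so upstream layers come first; ties are broken alphabetically.

    Raises KeyError for a layer that is not part of the described flow.
    """
    return sorted(model_paths, key=lambda p: (LAYER_RANK[layer_of(p)], p))


if __name__ == "__main__":
    # Hypothetical model files, for illustration only
    models = [
        "mart/centre/dim_library.sql",
        "dcl/hub_library.sql",
        "psa/spreadsheet_library_tracking.sql",
    ]
    for path in order_by_layer(models):
        print(path)
```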