Data dictionary
Every field, profiled and connected
45 fields across 8 tables — data quality flagged inline, and each field linked to the metrics it feeds. Auto-profiled from Caldera Logistics's estate.
shipment_events
10,613 rows · 4 fields
event_id
shipment_id
event_type
event_time
shipments
5,416 rows · 8 fields · 4 DQ
shipment_id
order_id
carrier_id
ship_date
promised_delivery_date
The SLA date a shipment was promised by. Ops measures on-time against THIS date at the shipment grain.
delivered_date
When a shipment was actually delivered, per the TMS scan. The numerator side of the Ops on-time calculation — and 2.9% null, so the number it drives is computed on slightly incomplete data.
status
Shipment lifecycle status — but with five casing variants (Delivered / DELIVERED / delivered / In Transit / in transit) across two systems, so any filter on it silently undercounts unless normalized.
weight_lbs
orders
5,000 rows · 7 fields · 1 DQ
order_id
customer_id
order_date
promised_date
service_level
origin_facility_id
The facility an order ships from — but 4 rows point at a facility id (999) that does not exist. A textbook orphaned reference in a schema with no enforced keys.
status
invoices
4,695 rows · 7 fields · 1 DQ
invoice_id
customer_id
order_id
invoice_date
amount
credit_amount
The credit issued against an invoice. This single field is the fork behind three different fights: Finance subtracts it to call a delivery NOT on-time (their 87.7% OTD), it splits gross vs net revenue, and it shifts which customers count as active. Ops never sees it. One undocumented field, three conflicting numbers.
status
exceptions
157 rows · 5 fields · 1 DQ
exception_id
shipment_id
reason_code
opened_at
resolved_at
When an exception was resolved — null 30.6% of the time, so 'open exception' counts depend on whether you trust the null.
customers
60 rows · 6 fields · 1 DQ
customer_id
name
segment
region
status
created_at
carriers
12 rows · 4 fields
carrier_id
name
mode
scac
facilities
8 rows · 4 fields
facility_id
name
region
type