Choosing the algorithm to parse the Entity Relationships (ERs)¶
There are 2 approaches (or 2 modules) we can use here to let dbterd
look at how the ERs can be recognized between the dbt models:
Test Relationship¶
During the dbt development, the engineers are supposed to add the consistency checking using relationships
test function, or similarly specifying the dbt contraints (they're also the tests behind the scenes), given the below example project with Jaffle Shop.
Let's install the repo:
Setup the environment, and install the deps including dbterd
:
python3 -m venv .env
source .env/bin/activate
pip install -r requirements.txt
pip install dbterd --upgrade
In the order_items.sql
model, we can see 1 sample test:
models:
- name: order_items
columns:
- name: order_item_id
data_tests:
- not_null
- unique
- name: order_id
data_tests:
- relationships: # dbterd looks for all kind of this test
to: ref('orders')
field: order_id
Running dbterd run -enf table
will expose the DBML code as below:
Table "orders" {
...
}
Table "order_items" {
...
}
...
Ref: "order_items"."order_id" > "orders"."order_id"
...
Awesome, job done here 🎉
NO, not yet (maybe!), sometime this module is not going to work perfectly due to:
- Some relationship tests are added from
mart
tostaging
just for ensuring no missing data when moving from a layer to another.- That's why we have the ignore_in_erd metadata config.
- We have the tests done in separate tools already (e.g. Soda), there is no reason to duplicate the (relationship) tests here.
- No problem! Let's still add it with
where: 1=0
or with the dummy relationship tests (see this blogpost)
- No problem! Let's still add it with
Semantic Entities¶
Since dbt v1.6, dbt has supported the Semantic Layer, when implementing this dbt Semantic Layer with Metric Flow (docs), we have the ability to define entities in our semantic modelling, telling metricflow
how to join tables together.
Based on the above, dbterd
can also look for the Semantic Entities (primary
and foreign
) in order to understand the ERs, subsequently produce the ERD code as the 2nd option.
Let's use the above Jaffle Shop project again, here is the sample implemented semantic_models
between order_item
and orders
:
semantic_models:
- name: order_item
...
model: ref('order_items')
entities:
- name: order_item
type: primary
expr: order_item_id
- name: order_id
type: foreign
expr: order_id
...
semantic_models:
- name: orders
...
model: ref('orders')
entities:
- name: order_id
type: primary
Now running dbterd run -enf table
with the environment variable DBTERD_ALGO=semantic
in advance, or we can use the command without it :
The result DBML code will be the same as the 1st option. Voila! 🎉🎉
New module(s)?¶
If you get the idea of having new type of module(s) to parse ERs, feel free to submit yours here or to check Contribution for pulling a request!