Running dbt cli with 1Password ๐Ÿš€

ยท

3 min read

dbt is a popular data transformation tool that allows you to connect to multiple data sources and runs SQL queries for analytics.

They have a CLI version of dbt called dbt core that can be used to run SQL scripts (models) locally or in CI. It is a wonderful tool that allows for collaboration and standardisation in data teams. One of my gripes with dbt is regarding authentication with the data sources. It advocates for storing passwords in a plain text file in the project root folder or someplace in user's home like ~/dbt.

This is risky considering it can contain credentials to connect to data warehouses for all environments. I used op run with --no-masking (to buffer stdout) to directly fetch passwords from 1Password and add them as env variables. Once I got to know about shell plugins, I felt it would be a good match to connect both.

This 1Password dbt shell plugin supports all major data platforms supported by dbt except BigQuery for some form of password-based authentication. Field names are targeted towards sources that support db auth like Postgres.

Pre-requisites

  1. Install dbt-core and corresponding adapter package like dbt-postgres.

  2. dbt project.

  3. data platform to run dbt on.

  4. Install dbt 1Password shell plugin.

Connecting to Redshift

Create a new Database item in your 1Password of type 'PostgreSQL' and save your db credentials.

Fields in 1Password and corresponding env variable

1Password fieldEnvironment variable
hostDBT_HOST
databaseDBT_DB
portDBT_PORT
userDBT_USER
password (required)DBT_PASSWORD

Go to your profiles.yml file for your dbt project and substitute connection details with env_var macro. This allows dbt to fetch connection details from environment variables set by 1Password.

Sample profiles.yml for Redshift

config:
dev_redshift:
  target: local
  outputs:
    local:
      type: redshift
      host: "{{ env_var('DBT_HOST') }}"
      port: "{{ env_var('DBT_PORT') | as_number }}"
      user: "{{ env_var('DBT_USER') }}"
      pass: "{{ env_var('DBT_PASSWORD') }}"
      dbname: "{{ env_var('DBT_DB') }}"
      schema: "dev"

Run this command to initialize the shell plugin in your dbt project root folder. Select the already created database item and apply the plugin to current and child directories.

op init dbt

All dbt runs from that folder will now work with 1Password injecting connection details. No passwords are stored in plain text. ๐ŸŽ‰

๐Ÿ’ก
If you have targets to different environments or sources, create an empty folder for each target in the project root and run init command.

Connecting to Databricks

The plugin is built such that only a password is required. This allows dbt plugin to work with most password-based auth profiles. One of the side effects is that the field name can be a little different, say Databricks uses 'token' for auth, but it corresponds to the password in this plugin.

Sample profiles.yml for Databricks

config:
dev_dbx:
  target: local
  outputs:
    local:
      type: databricks
      host: "{{ env_var('DBT_HOST') }}"
      token: "{{ env_var('DBT_PASSWORD') }}"
      schema: "{{ env_var('DBT_DB') }}"
      http_path: "http/path"

Pull Request to 1Password repo: shell-plugins/pull/328

Future work

  1. Add the option to inject secrets directly in profiles.yml file along with existing env files

  2. Easy switch between targets via dbt args without using empty folders.

  3. Allow single shell plugin to support different connection schemas.

Acknowledgements

  1. 1Password

  2. Hashnode

#1Password #BuildWith1Password

โš 
I'm not associated with dbt or 1password, just a user making my life better with the plugin.
ย