Is there a documented code example anywhere using ...
# integration-dbt
t
Is there a documented code example anywhere using dbt + the ssh tunnel with port forwarding? This thread talks about it, but I cannot figure out what / how the resources are meant to be configured exactly.
My basic setup is as follows, but how do we configure this to use the get_tunnel method from the ssh_resource? I am clearly missing some fundamental knowledge as to how the Dagster resources fit together. I know I want it to be called in the assets, but given I am just returning the assets via load_assets_from_dbt_project, how do we achieve this?
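import os

from dagster import with_resources
from dagster_dbt import dbt_cli_resource, load_assets_from_dbt_project
from dagster_ssh import ssh_resource

# DBT_PROJECT_PATH and DBT_PROFILES are defined elsewhere in the project.
assets = with_resources(
    load_assets_from_dbt_project(
        # project_dir and profiles_dir were swapped in the first draft of this snippet
        project_dir=DBT_PROJECT_PATH,
        profiles_dir=DBT_PROFILES,
        use_build_command=False,
        display_raw_sql=True,
    ),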
    {
        "dbt": dbt_cli_resource.configured(
            {
                "project_dir": DBT_PROJECT_PATH,
                "profiles_dir": DBT_PROFILES,
            },
        ),
        "ssh": ssh_resource.configured(
            {
                "remote_host": os.getenv("SSH_HOST"),
                "remote_port": 22,
                "username": os.getenv("SSH_USER"),
                "key_file": "~/.ssh/dagster_rsa",
            }
        ),
    },
)
dbt profile
host: localhost
port: 5439
user: xxxxx
pass: xxxxx
dbname: xxxxx
For a final bit of context, what I essentially want to be able to do is run a command equivalent to the below, which authenticates with the SSH server and forwards the local ports.
ssh -fN dagster@SSH.SERVER.IP.ADDRESS -i ~/.ssh/dagster_rsa -L 5439:WAREHOUSE.ADDRESS:5439
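For anyone who wants to launch that same command from Python rather than from a shell, here is a minimal sketch that only assembles the argument list. The function name and its parameters are hypothetical, chosen for illustration; nothing here is Dagster API.

```python
import os
from typing import List


def build_ssh_forward_command(
    user: str,
    ssh_host: str,
    key_file: str,
    local_port: int,
    target_host: str,
    target_port: int,
) -> List[str]:
    """Assemble `ssh -fN user@host -i key -L local:target:port` as an argv list.

    Hypothetical helper: it mirrors the shell command above and nothing more.
    """
    return [
        "ssh",
        "-fN",  # -f: go to background after authenticating, -N: run no remote command
        f"{user}@{ssh_host}",
        "-i",
        os.path.expanduser(key_file),  # expand "~" here, since no shell will do it
        "-L",
        f"{local_port}:{target_host}:{target_port}",
    ]
```

The resulting list could then be passed to `subprocess.run(cmd, check=True)` in any environment where an `ssh` binary and the key file are available.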
r
If you set up your configuration as dbt describes: https://docs.getdbt.com/docs/core/connect-data-platform/postgres-setup#profile-configuration, do you still need to run the following?
ssh -fN dagster@SSH.SERVER.IP.ADDRESS -i ~/.ssh/dagster_rsa -L 5439:WAREHOUSE.ADDRESS:5439
I’m confused as to why ssh_resource is required here.
t
Hi @rex, thanks for the reply. This isn’t a dbt configuration issue; it’s a question of how we can get dbt assets to materialize on Dagster Cloud through an SSH tunnel with port forwarding. In our existing orchestration tool this all works fine, but we are trying to PoC Dagster Cloud and finding this a very difficult puzzle to solve. To give some more background:
1. In an initial thread, Owen said that the SSH resource wouldn’t work (as you are alluding to here) and to do this as a post-deployment script. However, this does not work, as the post-deployment script isn’t run in the Cloud instance or on the workers.
2. In another thread, originally about something else, Joe said that the post-deployment script approach won’t work (as we confirmed) and that we need to do this via the SSH resource. But he said that he isn’t sure how we can do this when using load_assets_from_dbt_project.
So we are getting a few mixed messages about how to achieve this in Dagster. Perhaps my questions were not clear. To restate what we are trying to achieve: all I want to be able to do is run dbt materializations on Dagster Cloud serverless, and for that we need SSH tunneling with local port forwarding. Can you advise how we can achieve this in Dagster?
P.S. We have Dagster running fine locally, including with port forwarding. We simply run the ssh command above in the environment before running tasks. Nice and simple, and I know it all works. The question is how we set up the tunnel with port forwarding in Dagster Cloud.
r
Looks like you were able to get this solved with @Tim Castillo. Here’s the solution for posterity. By migrating to @dbt_assets, you can just invoke the SSH tunnel before invoking dbt.
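from dagster import OpExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets
from dagster_ssh import SSHResource

# dbt_manifest_path is defined elsewhere in the project.
# `ssh` is injected by Dagster; this assumes an SSHResource is registered under the key "ssh".
@dbt_assets(manifest=dbt_manifest_path)
def midnite_assets(context: OpExecutionContext, dbt: DbtCliResource, ssh: SSHResource):
    # Forward local port 5439 to the warehouse so the dbt profile's localhost:5439 resolves.
    tunnel = ssh.get_tunnel(
        remote_port=5439,
        remote_host="REMOTE",
        local_port=5439,
    )
    tunnel.start()
    yield from dbt.cli(["build"], context=context).stream()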
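The asset above starts the tunnel but does not show it being stopped once dbt finishes. Below is a minimal sketch of one way to guarantee cleanup, written against a stand-in tunnel object so it runs without Dagster installed. The names `stream_through_tunnel` and `FakeTunnel` are hypothetical; the only assumption about the real tunnel is that it exposes `start()` and `stop()`.

```python
from typing import Callable, Iterable, Iterator, List, Protocol, TypeVar

T = TypeVar("T")


class Tunnel(Protocol):
    """Anything with start() and stop(), which is all this sketch relies on."""

    def start(self) -> None: ...
    def stop(self) -> None: ...


def stream_through_tunnel(
    tunnel: Tunnel, make_events: Callable[[], Iterable[T]]
) -> Iterator[T]:
    """Start the tunnel, yield every event, and stop the tunnel even on failure."""
    tunnel.start()
    try:
        # make_events is called only after the tunnel is up
        yield from make_events()
    finally:
        tunnel.stop()


class FakeTunnel:
    """Stand-in for illustration only; records the calls it receives."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def start(self) -> None:
        self.log.append("start")

    def stop(self) -> None:
        self.log.append("stop")
```

Inside the asset body, the last two lines could then become `yield from stream_through_tunnel(tunnel, lambda: dbt.cli(["build"], context=context).stream())`, so the dbt invocation only begins once the tunnel is open and the tunnel is closed whether the build succeeds or raises.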
t
Yep, we had to migrate to the new dbt_assets to get this up and running. This new Pythonic approach to resources is much simpler. It should help with the learning curve significantly.
So good timing that they are out now.