#ask-community

Will Gunadi

05/04/2022, 7:38 PM
I'm trying to deploy dagster-dbt in a Docker container. I keep getting this error:
Invalid --project-dir flag. Not a dbt project. Missing dbt_project.yml file
The container configuration follows this guide: https://docs.dagster.io/deployment/guides/docker#multi-container-docker-deployment The config I specify inside the repo is this:
{
    "project_dir": "/opt/dagster/app/dbt_dev/bb_dbt",
    "profiles_dir": "/opt/dagster/app/dbt_profiles",
}
When I docker exec into the _user_code container, I can see the above directory just fine and the dbt_project.yml is where it should be. What could be the issue here? Versions: Dagster 0.14.13, dbt 0.20.2

prha

05/04/2022, 7:48 PM
so it’s not a docker mounting issue, since you can see the yml file… cc @owen
(in case there are some dagster-dbt configuration idiosyncrasies)

Will Gunadi

05/04/2022, 7:50 PM
When I execute the command directly inside the container, it seems to be working (I get an error, but a different one).

daniel

05/04/2022, 11:43 PM
Could you show the code that's using that config? That looks right so curious if something else is going on

Will Gunadi

05/05/2022, 12:10 AM
Eventually the config is used here:
def run_dbt(context, import_ts, table_name):
    dbt_result = context.resources.dbt.run(
        models=f'incr.{table_name}',
        vars=f'{{"import_ts":"{import_ts}"}}')
    return dbt_result
What I'm curious about is where this "run" command is executed: in which container, and in what directory?
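One way to answer the "which container, which directory" question empirically is to log a few facts from inside the op just before calling dbt. This is a minimal sketch: `describe_runtime` is a hypothetical helper, not part of dagster or dagster-dbt. It assumes that the hostname is enough to tell containers apart (in Docker it normally defaults to the container ID).

```python
import os
import socket


def describe_runtime(project_dir):
    """Report where this Python process runs and whether it can see the dbt project.

    Hypothetical diagnostic helper; not part of dagster or dagster-dbt.
    """
    project_file = os.path.join(project_dir, "dbt_project.yml")
    return {
        # Inside Docker the hostname usually identifies the container.
        "hostname": socket.gethostname(),
        # Working directory of the process that would launch the dbt CLI.
        "cwd": os.getcwd(),
        # Absolute form of the configured project directory.
        "project_dir": os.path.abspath(project_dir),
        # True only if dbt_project.yml is visible from this process.
        "has_dbt_project_yml": os.path.isfile(project_file),
    }
```

Inside the op, something like `context.log.info(str(describe_runtime("/opt/dagster/app/dbt_dev/bb_dbt")))` would put the result in the run logs, where it can be compared with what `docker exec` shows in the _user_code container.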

daniel

05/05/2022, 12:35 AM
Can you show the part where you use that dict earlier with project_dir in it? How is that fed into the resource so that dbt will use that project dir?

Will Gunadi

05/05/2022, 2:26 PM
This is how the dict above is used:
bb_dbt = dbt_cli_resource.configured(
    {
        "project_dir": "/opt/dagster/app/dbt_dev/bb_dbt",
        "profiles_dir": "/opt/dagster/app/dbt_profiles",
    }
)
And that bb_dbt resource is being used thusly:
@job(resource_defs={
    'dbt': bb_dbt,
    })
def a_job():
    an_op()
That op (an_op) eventually calls this function (same as above):
def run_dbt(context, import_ts, table_name):
    dbt_result = context.resources.dbt.run(
        models=f'incr.{table_name}',
        vars=f'{{"import_ts":"{import_ts}"}}')
    return dbt_result
Also, this is working and tested outside of a docker container.
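A side note on the `vars` argument in run_dbt: the JSON there is built with a hand-escaped f-string. A small sketch of producing the same kind of string with `json.dumps`, which takes care of quoting, is below. `build_dbt_vars` is a hypothetical helper name, and this is unrelated to the project-dir error.

```python
import json


def build_dbt_vars(import_ts):
    """Serialize the dbt vars as a JSON string (hypothetical helper)."""
    # json.dumps escapes quotes and backslashes in the value for us.
    return json.dumps({"import_ts": import_ts})
```

In run_dbt this would be passed as `vars=build_dbt_vars(import_ts)`.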
@daniel forgot to ping you 🙂

daniel

05/05/2022, 2:27 PM
no need to ping, i get slack notifs when a thread I'm on is updated. will take a look
Can you share the full stack trace of the "Invalid --project-dir flag. Not a dbt project. Missing dbt_project.yml file" error?

Will Gunadi

05/05/2022, 8:15 PM
Unfortunately not anymore. I switched to use dbt-rpc and it works.

Megan Beckett

10/27/2022, 9:10 AM
I am also getting this issue when trying to deploy to Dagster Cloud. The full error message is:
dagster_dbt.errors.DagsterDbtCliFatalRuntimeError: Fatal error in the dbt CLI (return code 2): Encountered an error:
Runtime Error
  fatal: Invalid --project-dir flag. Not a dbt project. Missing dbt_project.yml file
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/dbt/main.py", line 129, in main
    results, succeeded = handle_and_check(args)
  File "/usr/local/lib/python3.8/site-packages/dbt/main.py", line 191, in handle_and_check
    task, res = run_from_args(parsed)
  File "/usr/local/lib/python3.8/site-packages/dbt/main.py", line 218, in run_from_args
    task = parsed.cls.from_args(args=parsed)
  File "/usr/local/lib/python3.8/site-packages/dbt/task/base.py", line 184, in from_args
    move_to_nearest_project_dir(args)
  File "/usr/local/lib/python3.8/site-packages/dbt/task/base.py", line 171, in move_to_nearest_project_dir
    nearest_project_dir = get_nearest_project_dir(args)
  File "/usr/local/lib/python3.8/site-packages/dbt/task/base.py", line 150, in get_nearest_project_dir
    raise dbt.exceptions.RuntimeException(
dbt.exceptions.RuntimeException: Runtime Error
  fatal: Invalid --project-dir flag. Not a dbt project. Missing dbt_project.yml file
The dbt_project.yml file is there in the dbt project, and I have configured where to look for the dbt project directory like this. Config file:
from dagster._utils import file_relative_path

DBT_PROJECT_DIR = file_relative_path(__file__, "../../striata_sl_dbt")
DBT_PROFILES_DIR = DBT_PROJECT_DIR + "/config"
And assets:
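from dagster import configured, with_resources
from dagster_dbt import dbt_cli_resource, load_assets_from_dbt_project

from ..config import DBT_PROJECT_DIR, DBT_PROFILES_DIR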


@configured(dbt_cli_resource, config_schema={"target": str})
def custom_dbt_cli_resource(config):
    return {
        "target": config["target"],
        "project-dir": DBT_PROJECT_DIR,
        "profiles-dir": DBT_PROFILES_DIR
    }


dbt_assets = with_resources(
    load_assets_from_dbt_project(
        project_dir=DBT_PROJECT_DIR,
        node_info_to_group_fn=lambda _: "dbt_assets"),
    {
        "dbt": custom_dbt_cli_resource
    },
)
It works when I run everything locally, but as soon as I try to get it onto Dagster Cloud, I get this error. I can't figure out why it can't find the dbt project, or which directory the command is being run from. Any suggestions?
I have a very similar setup to this hooli example, where there is a pipelines sub-folder (dhis2_pipeline) and a dbt project sub-folder (striata_sl_dbt): https://github.com/dagster-io/hooli-data-eng-pipelines The Dockerfile copies both folders across, and on Dagster Cloud I have the code location set as:
location_name: dhis2_pipeline
image: image_on_ecr
code_source:
  package_name: dhis2_pipeline
When I run dagit, the dbt project is found and loaded properly. But on Dagster Cloud, I am getting this issue about not finding the dbt_project.yml.

daniel

10/27/2022, 3:15 PM
Hey Megan - if you run a shell in the Docker image that gets created by your Dockerfile, is the dbt project folder in the place where you would expect? something like
docker run -it <image> /bin/bash
my suspicion would be something in the Dockerfile not moving it over to the place that the dagster code is expecting to find it
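Once inside that shell, a quick way to see where the project actually ended up is to search the image for dbt_project.yml rather than checking one expected path. This is a minimal sketch: `find_dbt_projects` is a hypothetical helper, and the root directory to search is up to whoever runs it.

```python
import os


def find_dbt_projects(root):
    """Return the sorted directories under `root` that contain a dbt_project.yml file."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "dbt_project.yml" in filenames:
            matches.append(dirpath)
    return sorted(matches)
```

Running `print(find_dbt_projects("/"))` in a Python session inside the container lists every candidate, which can then be compared with the --project-dir value that dbt receives.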

Sean Lopp

10/27/2022, 3:18 PM
Also, fwiw, the regular dbt_cli_resource supports a target argument. I doubt that is contributing here. I'd be happy to review the Dockerfile if you don't mind sharing it.

Megan Beckett

10/28/2022, 8:30 AM
Hi @daniel, yes the dbt project is where I expect it to be - it is another folder in the working directory alongside the pipelines folder. The code in the pipelines folder then references the dbt project directory with:
from dagster._utils import file_relative_path

DBT_PROJECT_DIR = file_relative_path(__file__, "../../striata_sl_dbt")
DBT_PROFILES_DIR = DBT_PROJECT_DIR + "/config"
And loads the dbt models as assets with:
@configured(dbt_cli_resource, config_schema={"target": str})
def custom_dbt_cli_resource(config):
    return {
        "target": config["target"],
        "project-dir": DBT_PROJECT_DIR,
        "profiles-dir": DBT_PROFILES_DIR
    }

dbt_assets = with_resources(
    load_assets_from_dbt_project(
        project_dir=DBT_PROJECT_DIR,
        node_info_to_group_fn=lambda _: "dbt_assets"),
    {
        "dbt": custom_dbt_cli_resource
    },
)
This works when running dagit locally and I referenced the hooli project for the organisation of the folders and dbt project. @Sean Lopp, this is the dockerfile - very straightforward...
FROM public.ecr.aws/docker/library/python:3.8

WORKDIR /dagster

RUN apt-get update && apt-get install -y git 

COPY ./requirements_docker.txt /dagster/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /dagster/requirements.txt

COPY ./dhis2_pipeline /dagster/dhis2_pipeline

COPY ./striata_sl_dbt /dagster/striata_sl_dbt

daniel

10/28/2022, 11:52 AM
And just to sanity check given the text of the error message - the folder also contains a dbt_project.yml file when you shell in to the docker image and check the contents of the folder? One thing to double check - i think you posted a workspace.yaml file earlier in the thread, is there also a dagster_cloud.yaml file (the latter is what has to be configured for cloud)? If none of those help, we can try to reproduce on our side by replicating that Dockerfile. We could also jump on a call today and look over your project to see if anything jumps out?

Megan Beckett

10/28/2022, 12:45 PM
Yes, the dbt project folder contains the dbt_project.yml file. I have a workspace.yaml file, but that is only used when running dagit locally. I don't have a dagster_cloud.yaml file though - I didn't know this was needed. I have already deployed this pipeline on Dagster Cloud, which pulls data from an API and inserts into our database. That has all worked up until now without a dagster_cloud.yaml file. It is only now, when I am trying to add the dbt project, that it is breaking and can't find it.
I see this example in the hooli project for the dagster_cloud.yaml file: https://github.com/dagster-io/hooli-data-eng-pipelines/blob/master/dagster_cloud.yaml But, this looks like the config that you specify when adding a code location, which I just do from within the Cloud UI and add these details:
location_name: dhis2_pipeline
image: image_on_ecr
code_source:
  package_name: dhis2_pipeline
I would be very happy to jump on a call if you are available?

daniel

10/28/2022, 2:26 PM
Might be getting a bit late in your timezone but would you possibly be able to do a call in an hour? (8:30AM PST)
One possible clue would be in the logs for the container or task that your agent is spinning up (not the agent container itself). There should be a line like:
2022-10-28 09:30:25 -0500 - dagster.builtin - INFO - Executing command: dbt --no-use-color --log-format json ls --project-dir /Users/dgibson/hooli-data-eng-pipelines/hooli_data_eng/../dbt_project --profiles-dir /Users/dgibson/hooli-data-eng-pipelines/hooli_data_eng/../dbt_project/config --select * --output json
That's the command that's failing; the output might give a clue as to where exactly it's looking.
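To read the effective path out of such a log line without squinting at the `..` segments, a small parser can help. This is a minimal sketch, assuming the line contains the literal text "Executing command: " followed by a shell-style command as in the example above. `extract_flag` and `normalized` are hypothetical helper names.

```python
import posixpath
import shlex


def extract_flag(log_line, flag):
    """Return the value following `flag` in an 'Executing command:' log line, or None."""
    _, found, command = log_line.partition("Executing command: ")
    if not found:
        return None
    tokens = shlex.split(command)
    if flag in tokens:
        position = tokens.index(flag)
        if position + 1 < len(tokens):
            return tokens[position + 1]
    return None


def normalized(path):
    """Collapse '..' segments so the real target directory is easy to read."""
    return posixpath.normpath(path)
```

Applied to the example line, `normalized(extract_flag(line, "--project-dir"))` gives the directory dbt is really being pointed at, which can then be checked inside the container.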

Megan Beckett

10/31/2022, 10:31 AM
Hi Daniel, thanks for this - this was helpful to look at. I have now moved the dbt project inside the pipelines folder, and this seems to work with the relative file paths.
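For context on why the folder's location can matter: as far as I know, `file_relative_path(__file__, rel)` joins `rel` onto the directory of the calling file, so the result depends on how deeply that file is nested. The sketch below mimics that behaviour with plain path functions. The two file locations are hypothetical, since the thread never states where the config module lives, and it does not establish that this was the cause here.

```python
import posixpath


def resolve_relative(dunder_file, relative_path):
    """Join a relative path onto the directory of a source file, then normalize it."""
    return posixpath.normpath(
        posixpath.join(posixpath.dirname(dunder_file), relative_path)
    )


# Hypothetical locations for the module that defines DBT_PROJECT_DIR.
shallow = resolve_relative("/dagster/dhis2_pipeline/config.py", "../../striata_sl_dbt")
deep = resolve_relative("/dagster/dhis2_pipeline/pkg/config.py", "../../striata_sl_dbt")

print(shallow)  # /striata_sl_dbt
print(deep)     # /dagster/striata_sl_dbt
```

Keeping the dbt project inside the package means the relative path never has to climb above the package root, so it resolves the same way wherever the package is copied.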