Daniel Rodman

10/13/2021, 1:37 AM
Hi… many thanks in advance… not really sure what I’m doing wrong (but probably something super obvious)… I’m getting this error in dagit when I try to run a pipeline (I can see the pipeline, I just can’t run it):
dagster.core.errors.DagsterInvariantViolationError: repo not found at module scope in file /opt/dagster/app/repository.py.

  File "/usr/local/lib/python3.7/site-packages/dagster/grpc/impl.py", line 75, in core_execute_run
    recon_pipeline.get_definition()
  File "/usr/local/lib/python3.7/site-packages/dagster/core/definitions/reconstructable.py", line 110, in get_definition
    defn = self.repository.get_definition().get_pipeline(self.pipeline_name)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/definitions/reconstructable.py", line 46, in get_definition
    return repository_def_from_pointer(self.pointer)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/definitions/reconstructable.py", line 518, in repository_def_from_pointer
    target = def_from_pointer(pointer)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/definitions/reconstructable.py", line 460, in def_from_pointer
    target = pointer.load_target()
  File "/usr/local/lib/python3.7/site-packages/dagster/core/code_pointer.py", line 233, in load_target
    name=self.fn_name, file=self.python_file
For reference… I have 4 separate repositories. They are all configured in my workspace.yaml:
load_from:
  # Each entry here corresponds to a service in the docker-compose file that exposes pipelines.
  - grpc_server:
      host: dagster_pipelines
      port: 4000
      location_name: "dagster_pipelines"
  - grpc_server:
      host: dagster_experiment
      port: 4001
      location_name: "dagster_experiment"
  - grpc_server:
      host: dagster_dbt
      port: 4002
      location_name: "dagster_dbt"
  - grpc_server:
      host: dagster_data_app
      port: 4003
      location_name: "dagster_data_app"
I’m trying to organize these repositories as dagster projects. Each project has its own associated dockerized image. This is an example of how the files are organized:
├── projects
│   ├── dbt
│   │   ├── dbt
│   │   │   ├── pipelines
│   │   │   ├── schedules
│   │   │   ├── sensors
│   │   │   ├── solids
│   │   │   └── repository.py
I’m able to shell into the container and run pipelines. However, in dagit I’m getting the above error. (Perhaps it’s related to my Docker setup?) Thanks again.

daniel

10/13/2021, 2:36 AM
hi daniel - could you share what the entrypoints are for your gRPC server containers?

Daniel Rodman

10/13/2021, 3:06 AM
hi Daniel… I’m using this in all of my dockerfiles:
CMD ["dagster", "api", "grpc", "-h", "0.0.0.0", "-p", "4003", "-f", "repository.py"]

daniel

10/13/2021, 3:44 AM
Hm, that’s very strange… is sharing the contents of repository.py an option, here or over DM? Strange that it would be able to find it in dagit but not when executing the run. (I’m assuming from the report that you’re using the default run launcher?)

Daniel Rodman

10/13/2021, 4:20 AM
Yes… I can share the contents of the repository. It’s just the code that you would get from running dagster new-project dbt:
from dagster import repository

from dbt.pipelines.my_pipeline import my_pipeline
from dbt.schedules.my_hourly_schedule import my_hourly_schedule
from dbt.sensors.my_sensor import my_sensor


@repository
def dbt():
    """
    The repository definition for this dbt Dagster repository.

    For hints on building your Dagster repository, see our documentation overview on Repositories:
    <https://docs.dagster.io/overview/repositories-workspaces/repositories>
    """
    pipelines = [my_pipeline]
    schedules = [my_hourly_schedule]
    sensors = [my_sensor]

    return pipelines + schedules + sensors
Ah… I did modify the run_launcher. I’m a little hazy on how I should configure it for multiple repositories (I had to omit part of the filepath):
run_launcher:
  module: dagster_docker
  class: DockerRunLauncher
  config:
    env_vars:
      - DAGSTER_POSTGRES_USER
      - DAGSTER_POSTGRES_PASSWORD
      - DAGSTER_POSTGRES_DB
    network: dagster_network
    container_kwargs:
      auto_remove: true
      volumes:
        - /<OMITTED_FILEPATH>/repo.py:/opt/dagster/app/repo.py
I feel like this configuration is a little wonky. The volume points to a different repository. However, I have 4 repositories, and it’s only affecting the two repositories I configured as dagster projects. I had initially added the volume because I didn’t want to have to refresh the container every time I updated the code, but I never removed it. This doesn’t seem right, does it?

daniel

10/13/2021, 2:21 PM
Hmmmm, I do think that the volumes arg is likely contributing to the problem, but this error message looks like you tried to launch a pipeline defined in repo.py, not in the dbt repository. Are you completely certain that you launched my_pipeline() in the dbt repository?
It says "repo not found at module scope in file": here, repo is the name of the repository, and what should be happening is that it’s trying to find the code for the pipeline that you just launched.
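To illustrate what "not found at module scope" means: the loader is given a Python file and a name, and it looks for that name as a top-level attribute of the file. Below is a minimal sketch of that kind of lookup using only the standard library. It is not Dagster's implementation; has_module_scope_name and the temporary repository.py are hypothetical stand-ins.

```python
import importlib.util
import tempfile
from pathlib import Path


def has_module_scope_name(python_file: str, name: str) -> bool:
    """Return True if `name` is defined at the top level of `python_file`.

    Hypothetical helper for illustration only; not part of Dagster.
    """
    spec = importlib.util.spec_from_file_location("_candidate_module", python_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the file, as an import would
    return hasattr(module, name)


# Stand-in for a repository.py that defines `dbt` but not `repo`.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "repository.py"
    path.write_text("def dbt():\n    return []\n")
    found_dbt = has_module_scope_name(str(path), "dbt")
    found_repo = has_module_scope_name(str(path), "repo")

print(found_dbt, found_repo)  # prints: True False
```

Asking for repo in a file that only defines dbt fails in the same way as the error above, which is why the name in the message is a useful clue about which repository the run was launched against.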

Daniel Rodman

10/13/2021, 4:15 PM
Thanks so much… I did remove the volume. I think I may have also misconfigured my Docker container: it doesn’t look like I was copying the project directory into the container 😔. Anyway, this was super helpful, and things seem to be working! Thanks again!
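One way to catch a project directory that was never copied into the image is to run a small check inside the container and list which expected paths are absent. The sketch below assumes the package layout from the directory tree earlier in the thread; missing_paths is a hypothetical helper, and the temporary directory stands in for the package directory inside the container.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_paths(package_dir: str, expected: List[str]) -> List[str]:
    """Return the entries of `expected` that do not exist under `package_dir`.

    Hypothetical helper for illustration only.
    """
    root = Path(package_dir)
    return [rel for rel in expected if not (root / rel).exists()]


# Entries taken from the directory tree shared earlier in the thread.
EXPECTED = ["pipelines", "schedules", "sensors", "solids", "repository.py"]

# Simulate an image where only repository.py was copied over.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "repository.py").write_text("# placeholder\n")
    missing = missing_paths(tmp, EXPECTED)

print(missing)  # prints: ['pipelines', 'schedules', 'sensors', 'solids']
```

In a real container, the first argument would be the directory the image is supposed to contain the package in; an empty result means every expected entry is present.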