Vrushank Kenkre

12/07/2022, 10:53 AM
Hello, I am trying to set up a data pipeline on an AWS EMR cluster using Dagster. I am using with_pyspark_emr to do this, but when I launch Dagit and run the job, I run into module import errors:
Traceback (most recent call last):
File "/home/hadoop/.local/lib/python3.7/site-packages/dagster/_core/", line 138, in load_python_module
return importlib.import_module(module_name)
File "/usr/lib64/python3.7/importlib/", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'with_pyspark_emr'
`dagster._core.errors.DagsterImportError: Encountered ImportError:
No module named 'with_pyspark_emr'
while importing module with_pyspark_emr. Local modules were resolved using the working directory
. If another working directory should be used, please explicitly specify the appropriate path using the
for CLI based targets or the
configuration option for workspace targets.` I have set up Dagster on a dev EC2 machine and am trying to run the job on EMR. The module with_pyspark_emr is present in
, but I am not able to figure out what the issue is. Can someone please help?
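For reference, here is a minimal sketch of the behavior the error message seems to describe. It is plain Python, not Dagster's own loading code: a module can only be imported by name if the directory that contains it is on sys.path (for example, because it is the working directory). The module name with_pyspark_emr_demo and the temporary directory are hypothetical stand-ins, not the real project.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for the project: a temp directory holding a module
# named with_pyspark_emr_demo (an assumption, not the real example code).
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "with_pyspark_emr_demo.py"), "w") as f:
    f.write("VALUE = 42\n")


def try_import(name):
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


# The directory is not on sys.path yet, so the import fails, which matches
# the ModuleNotFoundError in the traceback above.
before = try_import("with_pyspark_emr_demo")

# Once the directory containing the module is on sys.path, the import works.
sys.path.insert(0, project_dir)
after = try_import("with_pyspark_emr_demo")

print(before, after)
```

So the thing to check would be whether the process that loads the code is started from (or pointed at) the directory that actually contains the with_pyspark_emr module.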