HI i AM following along the dagster crash course o...
# ask-community
s
Hi, I am following along with the Dagster crash course on YouTube. I am using Windows and a conda environment. When I run dagit to see the asset flow, it shows an error. This is the error I am getting. How do I fix it?

(serverless_project) PS E:\python\airflow\dagster\my-dagster-project> dagit
2023-04-24 20:54:18 +0100 - dagit - INFO - Using temporary directory E:\python\airflow\dagster\my-dagster-project\tmpydk4zjem for storage. This will be removed when dagit exits.
2023-04-24 20:54:18 +0100 - dagit - INFO - To persist information across sessions, set the environment variable DAGSTER_HOME to a directory to use.
2023-04-24 20:54:24 +0100 - dagster - INFO - Started Dagster code server for module my_dagster_project on port 52261 in process 22328
C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\workspace\context.py:591: UserWarning: Error loading repository location my_dagster_project:dagster._core.errors.DagsterInvalidDefinitionError: Input asset '["github_stargazers"]' for asset '["github_stargazers_by_week"]' is not produced by any of the provided asset ops and is not one of the provided sources

Stack Trace:
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_grpc\server.py", line 266, in __init__
    self._loaded_repositories: Optional[LoadedRepositories] = LoadedRepositories(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_grpc\server.py", line 115, in __init__
    loadable_targets = get_loadable_targets(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_grpc\utils.py", line 47, in get_loadable_targets
    else loadable_targets_from_python_module(module_name, working_directory)
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\workspace\autodiscovery.py", line 36, in loadable_targets_from_python_module
    module = load_python_module(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\code_pointer.py", line 135, in load_python_module
    return importlib.import_module(module_name)
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "E:\python\airflow\dagster\my-dagster-project\my_dagster_project\__init__.py", line 7, in <module>
    defs = Definitions(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\definitions_class.py", line 369, in __init__
    self._created_pending_or_normal_repo = _create_repository_using_definitions_args(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\definitions_class.py", line 252, in _create_repository_using_definitions_args
    def created_repo():
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\decorators\repository_decorator.py", line 154, in __call__
    else CachingRepositoryData.from_list(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\repository_definition\repository_data.py", line 398, in from_list
    return build_caching_repository_data_from_list(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\repository_definition\repository_data_builder.py", line 205, in build_caching_repository_data_from_list
    for job_def in get_base_asset_jobs(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\assets_job.py", line 75, in get_base_asset_jobs
    build_assets_job(
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\assets_job.py", line 180, in build_assets_job
    resolved_asset_deps = ResolvedAssetDependencies(assets, resolved_source_assets)
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\resolved_asset_deps.py", line 22, in __init__
    self._deps_by_assets_def_id = resolve_assets_def_deps(assets_defs, source_assets)
  File "C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\definitions\resolved_asset_deps.py", line 105, in resolve_assets_def_deps
    raise DagsterInvalidDefinitionError(

  warnings.warn(
C:\Users\Acer\miniconda3\envs\serverless_project\lib\site-packages\dagster\_core\execution\compute_logs.py:48: UserWarning: WARNING: Compute log capture is disabled for the current environment. Set the environment variable `PYTHONLEGACYWINDOWSSTDIO` to enable.
  warnings.warn(WIN_PY36_COMPUTE_LOG_DISABLED_MSG)
2023-04-24 20:54:24 +0100 - dagit - INFO - Serving dagit on http://127.0.0.1:3000 in process 20768
z
Is your `github_stargazers` asset being passed into your `@repo`/`Definitions`?
s
No.
z
I would try adding it to `@repo` or `Definitions`. Dagster only really "sees" assets that are in your `@repo`/`Definitions`, so if an asset in a pipeline is missing from there, Dagster won't know what to use as an input for any downstream assets.
s
I have this __init__.py:

    from dagster import Definitions, load_assets_from_modules

    from . import assets

    all_assets = load_assets_from_modules([assets])

    defs = Definitions(
        assets=all_assets,
    )

I am new here and just starting out. I do not really understand this repo/Definitions jargon. How do I add it, and to what? I was looking for a tutorial to learn this. The one I found seems buggy and very old.
z
Sure, no worries, welcome! Check out this doc for a better understanding of Definitions. Have you tried the Dagster intro tutorial on their docs page here? I'd expect it to be a bit more up to date, and it may help nail down some of the initial concepts for you.
s
I tried everything, bro. When I scaffold the project, I do not see any repository folder in the structure. The tutorial does not discuss the case where there is no repository, or what I should do when I do not see one.
z
What does your `assets` module look like? `@repo` is an older function. If you're using `Definitions` in your init file, then you don't need to worry about anything repository-related.
❤️ 1
s
    from dagster import asset
    from github import Github
    import pandas as pd
    from datetime import timedelta
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor
    import pickle
    import jupytext
    from github import InputFileContent

    ACCESS_TOKEN = "blablablabla"

    @asset
    def github_startgazers():
        return list(Github(ACCESS_TOKEN).get_repo("dagster-io/dagster").get_stargazers_with_dates())

    @asset
    def github_stargazers_by_week(github_stargazers):
        df = pd.DataFrame(
            [
                {
                    "users": stargazer.user.login,
                    "week": stargazer.starred_at.date() + timedelta(days=6 - stargazer.starred_at.weekday())
                }
                for stargazer in github_startgazers
            ]
        )
        return df.groupby("week").count().sort_values(by="week")

    @asset
    def github_stars_notebook(github_stargazers_by_week):
        markdown = f"""
    ## github stars

    ```python
    import pickle
    github_stargazers_by_week = pickle.loads({pickle.dumps(github_stargazers_by_week)!r})
    ```

    ## github starts by week, last 52 weeks

    ```python
    github_stargazers_by_week.tail(52).reset_index().plot.bar(x="week", y="users")
    ```
        """
        nb = jupytext.reads(markdown, "md")
        ExecutePreprocessor().preprocess(nb)
        return nb.format.writes(nb)

    @asset
    def github_stars_notebook_gist(context, github_stars_notebook):
        gist = (
            Github(ACCESS_TOKEN)
            .get_user()
            .create_gist(
                public=False,
                files={
                    "github_stars.ipynb": InputFileContent(github_stars_notebook)
                }
            )
        )
        context.log.info(f"Notebook created at {gist.html_url}")
        return gist.html_url

This is what my module looks like. I was getting this warning as well: "WARNING: Compute log capture is disabled for the current environment. Set the environment variable `PYTHONLEGACYWINDOWSSTDIO` to enable."
z
Very strange, I don't see anything jumping out as obviously wrong. I'll try to reproduce your error in a little bit when I get some free time.
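One thing that may be worth checking in the pasted module: the first asset function is named `github_startgazers` (with an extra "t"), while the downstream asset takes a parameter named `github_stargazers`. By default, Dagster names an asset after its function and matches parameter names to upstream asset names, so a mismatch like this would produce the "is not produced by any of the provided asset ops" error. The sketch below is a plain-Python illustration of that name matching, not Dagster's actual implementation; `find_unresolved_inputs` is a hypothetical helper, and the two functions are stand-ins for the real assets.

```python
import inspect


def github_startgazers():  # extra "t", as in the pasted module
    return []


def github_stargazers_by_week(github_stargazers):
    return github_stargazers


def find_unresolved_inputs(asset_fns):
    """Report parameters that no function in asset_fns produces by name.

    Hypothetical helper: it mimics the idea of matching parameter names
    to function names, and skips the special `context` parameter.
    """
    produced = {fn.__name__ for fn in asset_fns}
    missing = {}
    for fn in asset_fns:
        for param in inspect.signature(fn).parameters:
            if param != "context" and param not in produced:
                missing.setdefault(fn.__name__, []).append(param)
    return missing


unresolved = find_unresolved_inputs([github_startgazers, github_stargazers_by_week])
print(unresolved)  # {'github_stargazers_by_week': ['github_stargazers']}
```

If this is the cause, renaming the function to `github_stargazers` and iterating over the `github_stargazers` parameter inside `github_stargazers_by_week` should let the definitions load. Separately, `nb.format.writes(nb)` in the notebook asset was probably meant to be `nbformat.writes(nb)`.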
❤️ 1
s
Thanks, I would really appreciate that. I tried to follow along with this course:

https://www.youtube.com/watch?v=ZmUjf3gL1VU&t=28s

I was trying to learn this so that I can help my team adopt dagster. Thanks.
z
in the meantime try out the Dagster tutorial on their website that I linked earlier
❤️ 1
s
Yes, I am trying that out now. Thanks a lot, man. Also, if possible, I would love to know how to fix this warning: "WARNING: Compute log capture is disabled for the current environment. Set the environment variable `PYTHONLEGACYWINDOWSSTDIO` to enable."
c
did you set that environment variable?
that’s just a warning tho - it shouldn’t affect the actual execution at all
s
set PYTHONLEGACYWINDOWSSTDIO=enable: should I enable it like that? Or is there a JSON file hidden somewhere that I have to play around with?
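For what it's worth, `set NAME=value` is cmd.exe syntax. The prompt earlier in the thread is PowerShell, where the session-level equivalent would be `$env:PYTHONLEGACYWINDOWSSTDIO = "1"`. No JSON file should be involved, since this is an ordinary environment variable that Python treats as enabled whenever it is set to a non-empty string. A minimal Python sketch of the same idea follows: it sets the variable for a child process and checks that the child sees it. The value `"1"` and the helper name `env_with_legacy_stdio` are assumptions for illustration, and the probe command is a stand-in for whatever you would actually launch (for example `dagit`).

```python
import os
import subprocess
import sys


def env_with_legacy_stdio(base_env=None):
    """Return a copy of the environment with PYTHONLEGACYWINDOWSSTDIO set.

    Hypothetical helper; any non-empty value counts as "set", "1" is
    just a convention.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONLEGACYWINDOWSSTDIO"] = "1"
    return env


child_env = env_with_legacy_stdio()

# Stand-in for the real command: a child Python process that reports
# what it sees for the variable.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('PYTHONLEGACYWINDOWSSTDIO'))",
]
result = subprocess.run(probe, env=child_env, capture_output=True, text=True, check=True)
seen_by_child = result.stdout.strip()
print(seen_by_child)
```

The variable only needs to be present in the environment of the process that starts Dagster, so setting it in the shell session before running `dagit` achieves the same thing without any wrapper script.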
c