#ask-community

Sean Lindo

08/04/2022, 11:21 PM
I've had some success using ops and jobs to fetch a file from S3, run some transformations, and then save it to a database. I'm attempting to upgrade this example by creating a job that uses the concepts described in Software-Defined Assets, and I’m having trouble getting something to materialize successfully. I'm trying to define a source asset that should load data from an IO Manager that I define. Following that, I'd like to run a number of transformation steps and also some steps that don't produce an asset (such as posting a message to Slack). In the attached gist, I get a GraphQL error when materializing “raw_users”. So I have a few questions:
• How do I pass a context to this custom IO Manager so the source asset materializes successfully?
• How can I mix non-asset-producing “steps” into this job?
Referenced gist: https://gist.github.com/seanlindo/a096af16dc34df4f4e6f31be3c2c5bae

owen

08/04/2022, 11:33 PM
Hi @Sean Lindo! Thanks for sharing the gist. Do you mind sharing the GraphQL error that you're getting? One issue I see with the custom IOManager you're using is that it's storing the outputs in memory. By default, each step (and each asset) runs in a separate process, so memory is not shared between these steps (meaning I would expect a "key not found" type of error). The mental model is that each step will initialize a new instance of whichever IOManager it is configured to use. To get around this issue, I'd recommend basing your custom IO manager on the fs_io_manager (this is the one that's used by default).
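The point about memory not being shared can be illustrated without Dagster at all. The following is only a plain-Python sketch: `InMemoryStore`, `FileStore`, `run_two_steps`, and `demo` are hypothetical names, the classes merely mimic the `handle_output` / `load_input` shape of an IO manager, and a "separate process" is simulated by creating a fresh instance for each step. It is not the fs_io_manager implementation.

```python
import pickle
import tempfile
from pathlib import Path


class InMemoryStore:
    """Keeps outputs in a dict that lives only as long as this instance."""

    def __init__(self):
        self._values = {}

    def handle_output(self, key, obj):
        self._values[key] = obj

    def load_input(self, key):
        return self._values[key]


class FileStore:
    """Pickles each output to disk so that any later instance can read it back."""

    def __init__(self, base_dir):
        self._base_dir = Path(base_dir)

    def _path(self, key):
        return self._base_dir / f"{key}.pickle"

    def handle_output(self, key, obj):
        with open(self._path(key), "wb") as f:
            pickle.dump(obj, f)

    def load_input(self, key):
        with open(self._path(key), "rb") as f:
            return pickle.load(f)


def run_two_steps(make_store):
    """Write in one 'step' and read in the next, each with a fresh store instance."""
    make_store().handle_output("raw_users", [{"id": 1}])
    return make_store().load_input("raw_users")


def demo():
    """Return (did the in-memory store work?, value loaded from the file store)."""
    try:
        run_two_steps(InMemoryStore)
        in_memory_ok = True
    except KeyError:
        # The second instance starts with an empty dict, so the key is missing.
        in_memory_ok = False
    with tempfile.TemporaryDirectory() as tmp:
        loaded = run_two_steps(lambda: FileStore(tmp))
    return in_memory_ok, loaded
```

The in-memory variant fails with a `KeyError` on the second step, which matches the "key not found" type of error described above, while the file-backed variant survives the hand-off because its state lives on disk rather than in the instance.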
For your second question (non-asset producing steps), you can create assets out of graphs of ops (so the ops inside the graph do not need to be assets themselves): https://docs.dagster.io/concepts/assets/software-defined-assets#graph-backed-assets

Sean Lindo

08/04/2022, 11:54 PM
Hi Owen, here is the attached error message I’m getting. I pulled the idea for the manager from here: https://docs.dagster.io/concepts/io-management/io-managers#testing-an-io-manager
Here is the actual error from the console:
An error occurred while resolving field AssetNode.requiredResources
Traceback (most recent call last):
  File "/Users/lindo/.local/share/virtualenvs/dagster-G46_BIBp/lib/python3.8/site-packages/graphql/execution/executor.py", line 452, in resolve_or_error
    return executor.execute(resolve_fn, source, info, **args)
  File "/Users/lindo/.local/share/virtualenvs/dagster-G46_BIBp/lib/python3.8/site-packages/graphql/execution/executors/sync.py", line 16, in execute
    return fn(*args, **kwargs)
  File "/Users/lindo/.local/share/virtualenvs/dagster-G46_BIBp/lib/python3.8/site-packages/dagster_graphql/schema/asset_graph.py", line 501, in resolve_required_resources
    node_def_snap = self.get_node_definition_snap()
  File "/Users/lindo/.local/share/virtualenvs/dagster-G46_BIBp/lib/python3.8/site-packages/dagster_graphql/schema/asset_graph.py", line 214, in get_node_definition_snap
    return check.not_none(self._node_definition_snap)  # type: ignore
  File "/Users/lindo/.local/share/virtualenvs/dagster-G46_BIBp/lib/python3.8/site-packages/dagster/_check/__init__.py", line 968, in not_none
    raise CheckError(f"Expected non-None value: {additional_message}")
dagster._check.CheckError: Expected non-None value: None

owen

08/05/2022, 12:00 AM
Ah, what version of dagster/dagit are you on? I believe this was a bug in 0.15.6 or 0.15.7, but it should be fixed in 0.15.8.
And yeah we probably shouldn't have that IOManager example in the docs, it's pretty misleading. That setup only works if you use the in_process_executor (which is not the case by default)

Sean Lindo

08/05/2022, 12:02 AM
It looks like I’m on 0.15.8

owen

08/05/2022, 12:03 AM
got it, let me see if I can replicate

Sean Lindo

08/05/2022, 12:03 AM
thank you!

owen

08/05/2022, 12:10 AM
hm I wasn't able to replicate the issue -- tried running that exact code with dagster==0.15.8 and dagit==0.15.8. When did this error show up? Was it when you clicked on the asset, when you hit "Materialize", or something else?

Sean Lindo

08/05/2022, 12:12 AM
this occurs when I check the asset named “raw_users” and then click on Materialize

owen

08/05/2022, 12:19 AM
ohh interesting -- I was able to replicate that, thank you!
so that definitely shouldn't cause that error, but it also isn't really possible to materialize raw_users
SourceAssets are assets which you know exist, but don't have an associated asset definition to execute

Sean Lindo

08/05/2022, 12:21 AM
what if I simplified this even further and return None in handle_output and a static Pandas dataframe in load_input?

owen

08/05/2022, 12:21 AM
So you can read from source assets, but you can't write to them. Seems like a glitch in the UI that it even attempts to materialize raw_users, I'll file an issue

Sean Lindo

08/05/2022, 12:22 AM
I may be misunderstanding what materializing means in this context

owen

08/05/2022, 12:22 AM
materializing "upstream_asset" (which might be better named as "downstream_asset", as it's downstream of "raw_users") will invoke the io manager to load the contents of "raw_users"
but materializing means "refresh the contents of", so loading the contents of "raw_users" doesn't refresh the contents there (it just reads them)
if that makes sense

Sean Lindo

08/05/2022, 12:25 AM
Works as expected. I updated the manager to just return a static dataframe, and materializing the misnamed downstream asset works with no error.

owen

08/05/2022, 12:25 AM
got it yeah, that makes sense. The error will only pop up if you click the "raw_users" label, then (with the sidebar open) click "Materialize"

Sean Lindo

08/05/2022, 12:26 AM
the real use case for the “raw_users” object will be fetching a CSV from S3

owen

08/05/2022, 12:29 AM
I see -- I think what you have is in the right shape 🙂👍. Thanks for the report, I filed an issue here: https://github.com/dagster-io/dagster/issues/9230

Sean Lindo

08/05/2022, 12:30 AM
Thank you!
Really appreciate the help, I'm unblocked now :)

owen

08/05/2022, 12:32 AM
great!