#announcements

King Chung Huang

07/29/2019, 7:52 PM
Is there an example of how to use the s3_resource from dagster_aws?

King Chung Huang

07/29/2019, 8:46 PM
Thanks! How does system_storage or file_manager come into play in this?

alex

07/29/2019, 9:01 PM
so you can have a system_storage with required_resource_keys including s3 and then use the s3_resource to provide that resource. You can see in the airline demo that the resource is provided under the key s3: https://github.com/dagster-io/dagster/blob/master/examples/dagster_examples/airline_demo/pipelines.py#L53

King Chung Huang

07/29/2019, 9:13 PM
Ok. I’ve been looking over that and other pieces of code, but I’m having a really tough time understanding how to actually use it. The ObjectStore APIs also look interesting, but I’m even more lost on how that hooks in. 🙂

alex

07/29/2019, 9:13 PM
ya you've certainly bumped into one of the less polished corners of the system

King Chung Huang

07/29/2019, 9:14 PM
I’m not sure if I missed something in the tutorial, but I feel like I didn’t learn anything about dealing with files and objects.

alex

07/29/2019, 9:15 PM
That's good feedback, we could certainly do better. What exactly is your end goal?

King Chung Huang

07/29/2019, 9:15 PM
Generally speaking, I want to read an object from S3, do some text processing, and write a new object to S3.
In my current non-Dagster process, I also have a cache of the S3 bucket on a local filesystem.
It feels like Dagster has all the pieces to deal with reading/writing S3 objects and handling a local cache, but I’m struggling with how to read an object and use FileHandles.
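A minimal sketch of the goal described above (read an object, keep a local cache, process the text), written in plain Python rather than with Dagster's APIs. The names fetch_with_cache, process_text, and fake_download are hypothetical and chosen for illustration only. With boto3, the download callable could wrap client.get_object(Bucket=..., Key=key)["Body"].read().

```python
import tempfile
from pathlib import Path
from typing import Callable


def fetch_with_cache(key: str, cache_dir: Path, download: Callable[[str], bytes]) -> bytes:
    """Return the bytes for `key`, downloading only when no cached copy exists."""
    cached = cache_dir / key
    if cached.exists():
        return cached.read_bytes()
    data = download(key)
    # Object keys may contain "/", so create any intermediate directories.
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return data


def process_text(raw: bytes) -> bytes:
    """Placeholder text processing (an assumption): upper-case the decoded text."""
    return raw.decode("utf-8").upper().encode("utf-8")


# Stand-in for the bucket: a dict plus a downloader that records each call.
fake_bucket = {"notes/a.txt": b"hello dagster"}
calls = []


def fake_download(key: str) -> bytes:
    calls.append(key)
    return fake_bucket[key]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    first = fetch_with_cache("notes/a.txt", cache, fake_download)
    second = fetch_with_cache("notes/a.txt", cache, fake_download)  # served from the cache
    result = process_text(second)
```

Passing the downloader in as an argument keeps the caching logic independent of how the object is fetched, so the same function can sit behind a real client or a stand-in.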

alex

07/29/2019, 9:19 PM
So, I think for now you can ignore the system_storage and file_manager since those are hooks for controlling how Dagster internally handles the intermediates. I think the airline demo is probably the best example to follow.

King Chung Huang

07/29/2019, 9:19 PM
So far, I’ve successfully gotten some existing code to run that ignores resources, FileHandles, etc, and just directly handles files and objects. But, I’m trying (unsuccessfully) to use Dagster’s APIs.
Ok. I might’ve gone down a rabbit hole there. I’ll take another stab based on the airline demo.
I actually ignored the airline demo because the README says it’s out of date.

alex

07/29/2019, 9:21 PM
the documentation for the airline demo needs to be rewritten but the code itself is in decent shape

King Chung Huang

07/29/2019, 9:21 PM
Ah, ok!

alex

07/29/2019, 9:21 PM
the README i think is just referring to itself being out of date - language we should improve (or just do the rewrite)
FileHandle is just a string - so there is not a lot of value to using that instead of strings
the value of using the s3 resource instead of calling S3 directly is that you could provide a mock one in test mode or vary some other properties in different modes
if you are not doing any of that yet then you can proceed with your current implementation of doing everything directly in the solid
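The point about providing a mock in test mode can be sketched in plain Python. This is not Dagster's resource API. It only illustrates the idea that code written against an injected client can run against an in-memory stand-in. InMemoryS3 and transform_object are hypothetical names; the stand-in mimics only the get_object and put_object keyword arguments of a boto3 S3 client.

```python
import io


class InMemoryS3:
    """Test double mimicking the two boto3 S3 client calls used below."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        # boto3 returns a dict whose "Body" is a readable stream.
        return {"Body": io.BytesIO(self.objects[(Bucket, Key)])}


def transform_object(s3, bucket, src_key, dst_key):
    """Read one object, normalize its text, and write the result under a new key."""
    raw = s3.get_object(Bucket=bucket, Key=src_key)["Body"].read()
    cleaned = raw.decode("utf-8").strip().lower().encode("utf-8")
    s3.put_object(Bucket=bucket, Key=dst_key, Body=cleaned)
    return len(cleaned)


# In a test, hand the function the stand-in instead of a real client.
s3 = InMemoryS3()
s3.put_object(Bucket="demo-bucket", Key="in.txt", Body=b"  Hello S3  ")
written = transform_object(s3, "demo-bucket", "in.txt", "out.txt")
```

Because transform_object only relies on those two calls, a real boto3.client("s3") could presumably be passed in its place without changing the function.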

King Chung Huang

07/29/2019, 9:26 PM
K.
Thanks for the advice.