Nicolas May
07/05/2022, 7:45 PM
[...] `@asset` to a GCS bucket? Thx!
prha
07/05/2022, 7:52 PM
[...] `s3_pickle_io_manager` and `s3_resource`, you would use the `gcs_pickle_io_manager` and the `gcs_resource`: https://docs.dagster.io/concepts/io-management/io-managers#using-an-io-manager
You can read more about using GCS for IO management here: https://docs.dagster.io/deployment/guides/gcp#using-gcs-for-io-management
Nicolas May
07/05/2022, 8:17 PM
@asset
def rides():
    response = requests.get("https://docs.dagster.io/assets/cereal.csv")
    lines = response.text.split("\n")
    return [row for row in csv.DictReader(lines)]

assets_with_io_manager = with_resources(
    [rides],
    resource_defs={
        "io_manager": gcs_pickle_io_manager,
        "gcs": gcs_resource
    },
)
but it only writes locally
prha
07/05/2022, 8:44 PM
Nicolas May
07/05/2022, 8:45 PM
resources:
  io_manager:
    config:
      gcs_bucket: my-cool-bucket
      gcs_prefix: good/prefix-for-files-
prha
07/05/2022, 8:46 PM
assets_with_io_manager = with_resources(
    [rides],
    resource_defs={
        "io_manager": gcs_pickle_io_manager.configured({"gcs_bucket": "my_custom_bucket_name"}),
        "gcs": gcs_resource,
    },
)
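For comparison, the dict passed to `.configured(...)` carries the same keys that run config nests under `resources`, then `io_manager`, then `config`. A minimal plain-Python sketch of that correspondence, using the placeholder bucket and prefix from the YAML above (no Dagster import is involved, and the variable names are only illustrative):

```python
# Run config as it would be supplied at launch (mirrors the YAML in this thread).
run_config = {
    "resources": {
        "io_manager": {
            "config": {
                "gcs_bucket": "my-cool-bucket",
                "gcs_prefix": "good/prefix-for-files-",
            }
        }
    }
}

# The dict handed to `.configured(...)` is the part nested under
# resources -> io_manager -> config. `configured_arg` is an illustrative name.
configured_arg = run_config["resources"]["io_manager"]["config"]
print(configured_arg)
```

Either route supplies the same values: `.configured` fixes them in code, while run config supplies them at launch.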
Nicolas May
07/05/2022, 8:49 PM
[...] `gcs_pickle_io_manager.configured(...)` approach... that also just results in local write
[...] `@job` decorator:
@job(
    resource_defs={
        "gcs": gcs_resource,
        "io_manager": gcs_pickle_io_manager,
    },
    config={
        "resources": {
            "io_manager": {
                "config": {
                    "gcs_bucket": "my-bucket",
                    "gcs_prefix": "my-prefix",
                }
            }
        }
    }
)
def gcs_job():
    gcs_op()
Why is writing an asset any different? And what is the difference?
prha
07/05/2022, 9:01 PM
[...] `rides` asset is getting picked up by the repository correctly? When I ran your snippet, I got an error when the bucket wasn’t configured. The GCS IO manager has no behavior for falling back on local writes, so I suspect that something isn’t getting picked up correctly. Can we rename `rides` just to sanity check?
Nicolas May
07/05/2022, 9:06 PM
[...] `rides` to `cereal`... same local write
prha
07/05/2022, 9:18 PM
[...] `0.15.0`?
Nicolas May
07/05/2022, 9:30 PM
[...] `with_resources()` approach, only `io_manager` initializes:
Starting initialization of resources [io_manager].
... but I get
Starting initialization of resources [gcs, io_manager].
... when defining the resource in the `@asset` decorator and it works:
@asset(
    resource_defs={
        "io_manager": gcs_pickle_io_manager.configured(
            {
                "gcs_bucket": "my-bucket",
                "gcs_prefix": "my-prefix",
            }
        ),
        "gcs": gcs_resource,
    },
)
def cereal():
    response = requests.get("https://docs.dagster.io/assets/cereal.csv")
    lines = response.text.split("\n")
    return [row for row in csv.DictReader(lines)]
prha
07/05/2022, 9:40 PM
[...] `with_resources`. cc @chris to take a look
Nicolas May
07/05/2022, 9:48 PM
Oliver
07/06/2022, 1:17 AM
Nicolas May
07/06/2022, 12:37 PM
chris
07/06/2022, 8:54 PM
[...] `with_resources` into the repository?
Nicolas May
07/06/2022, 8:56 PM
[...] `with_resources` in the repo, but it errors... Says what's included in the repo must be a list of assets, jobs, etc.
chris
07/06/2022, 8:56 PM
@repository
def the_repo():
    return [*with_resources(...)]
(`with_resources` produces a sequence, so you need to unpack it)
Nicolas May
07/06/2022, 8:57 PM
chris
07/06/2022, 8:57 PM
Nicolas May
07/06/2022, 8:57 PM
chris
07/06/2022, 8:58 PM
`with_resources` constructs a new asset definition
Nicolas May
07/07/2022, 4:21 PM
[...] `with_resources` asset definition in the repo solved my problem... Thanks again!
chris
07/07/2022, 5:25 PM
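The fix discussed in this thread comes down to two points: `with_resources` returns new definitions rather than modifying the ones passed in, and the returned sequence has to be unpacked into the repository's list. The sketch below is a plain-Python analogy of that behavior, not Dagster code: `FakeAsset` and `fake_with_resources` are hypothetical stand-ins invented for illustration, and the resource values are just strings.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class FakeAsset:
    """Hypothetical stand-in for an asset definition (not a Dagster class)."""
    name: str
    resource_defs: Dict[str, str] = field(default_factory=dict)


def fake_with_resources(
    assets: Sequence[FakeAsset], resource_defs: Dict[str, str]
) -> List[FakeAsset]:
    """Return copies with resources attached; the inputs are left untouched."""
    return [
        replace(a, resource_defs={**a.resource_defs, **resource_defs})
        for a in assets
    ]


cereal = FakeAsset("cereal")
bound = fake_with_resources(
    [cereal],
    {"io_manager": "gcs_pickle_io_manager", "gcs": "gcs_resource"},
)

# The original still carries no resources, so a repository that lists
# `cereal` itself would never see the GCS IO manager.
print(cereal.resource_defs)
print(bound[0].resource_defs)

# Unpack the returned sequence so the repository list stays flat.
repo_contents = [*bound]  # flat list of definitions
nested = [bound]          # a list inside a list, the shape that errors
```

In this analogy, returning `repo_contents` corresponds to `return [*with_resources(...)]` in the repository function, while `nested` corresponds to passing the un-unpacked result.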