Robert Wade
04/05/2023, 5:14 PM
@asset(config_schema={"first_foo": str}, ...)
def first_asset(...):
    ...

@asset(config_schema={"second_foo": str}, ...)
def second_asset(first_asset):
    ...

my_job = define_asset_job("my_job", selection=["*second_asset"], config=<config for first and second asset>, ...)
my_sched = build_schedule_from_partitioned_job(my_job, ...)
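For context, the config I pass to the job is shaped roughly like this (the keys and values here are just placeholders):

# Rough shape of the run config handed to define_asset_job above;
# the keys under "ops" match the asset names, the values are placeholders.
my_job_config = {
    "ops": {
        "first_asset": {"config": {"first_foo": "some value"}},
        "second_asset": {"config": {"second_foo": "another value"}},
    }
}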
All of this works great. Now let’s imagine that things in the data engineering world change: specifically, my first_asset now has a dependency on an asset from a different code location (perhaps built by a different team).
other_asset = SourceAsset(key=AssetKey("another_teams_asset"))

@asset(config_schema={"first_foo": str}, ...)
def first_asset(another_teams_asset):
    ...
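The SourceAsset is registered alongside my own assets so the dependency can resolve, roughly this shape (a sketch assuming a Definitions-based code location, reusing the names from the snippets above):

from dagster import Definitions

# Sketch: the SourceAsset sits next to the downstream assets in this code
# location so Dagster can resolve the cross-location dependency.
defs = Definitions(
    assets=[other_asset, first_asset, second_asset],
    jobs=[my_job],
    schedules=[my_sched],
)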
This now requires me to update my job and include config for another_teams_asset. Now let’s imagine that the other team goes through a variety of iterations and another_teams_asset suddenly has a variety of assets that it depends on. Am I expected to monitor all of these iterations/changes and continue to update the config for my job?
sandy
04/05/2023, 7:06 PM
line 1
line 2
Robert Wade
04/05/2023, 7:07 PM
sandy
04/05/2023, 7:07 PM
Robert Wade
04/05/2023, 7:09 PM
sandy
04/06/2023, 5:23 PM
Robert Wade
04/06/2023, 5:46 PM
sandy
04/06/2023, 5:51 PM
"However, what I have also uncovered is that if the upstream dependency DOES exist, it still fails."
What error are you seeing in this case?
Robert Wade
04/06/2023, 5:54 PM
sandy
04/06/2023, 5:54 PM
Robert Wade
04/06/2023, 5:54 PM
sandy
04/06/2023, 5:55 PM
Do you have a partitions_def on the SourceAsset in the downstream repo? If not, adding one might fix the problem.
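Something along these lines (the daily partitioning here is just illustrative; it would need to match how another_teams_asset is actually partitioned in the other code location):

from dagster import AssetKey, DailyPartitionsDefinition, SourceAsset

# Hypothetical partitions definition -- mirror whatever scheme the
# upstream team actually uses for another_teams_asset.
upstream_partitions = DailyPartitionsDefinition(start_date="2023-01-01")

other_asset = SourceAsset(
    key=AssetKey("another_teams_asset"),
    partitions_def=upstream_partitions,
)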
Robert Wade
04/06/2023, 5:55 PM
sandy
04/06/2023, 8:46 PM