# integration-airbyte
d
Is there a best practice to migrate from loading assets using `load_assets_from_airbyte_instance` to `load_assets_from_connections`? I tried to match all source/dest/connection settings, but I'm still seeing `ValueError: Airbyte connections are not in sync with provided configuration` AFTER dagster-airbyte applies that reconciler.
b
Hi Dusty, that’s odd - are you able to see a diff running `dagster-airbyte check`?
d
Yea, I can run through the process again and document it, as it’s just our dev environment. I reverted it but can pick this back up after lunch.
d
One thing I did notice is that when I turned off streams in Airbyte, the asset definitions weren’t reloading in Dagster, and when I went to materialize those assets in Dagster it throws:

```
op 'airbyte_sync_63c3b' did not fire outputs

dagster._core.errors.DagsterStepOutputNotFoundError: Core compute for op "airbyte_sync_63c3b" did not return an output for non-optional output "foo"
```

A no-op diff for the `load_assets_from_airbyte_instance` method alleviates it.
Starting the process now of loading assets from `load_assets_from_connections` for a connection that already existed.
Looks like the above was caused by the assets not being reloaded unless I did a no-op deploy, but once I aligned the stream_config with the existing Airbyte assets, I get this in the deployment:

```
TypeError: reduce() of empty sequence with no initial value
```

```
File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/server.py", line 245, in __init__
    self._container_image,
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/server.py", line 120, in __init__
    repo_def = recon_repo.get_definition()
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 117, in get_definition
    return repository_def_from_pointer(self.pointer, self.repository_load_data)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 787, in repository_def_from_pointer
    repo_def = repository_def_from_target_def(target, repository_load_data)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/reconstruct.py", line 776, in repository_def_from_target_def
    return target.compute_repository_definition()
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/repository_definition.py", line 1549, in compute_repository_definition
    return self._get_repository_definition(repository_load_data)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/repository_definition.py", line 1529, in _get_repository_definition
    default_logger_defs=self._default_logger_defs,
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/repository_definition.py", line 857, in from_list
    default_executor_def=default_executor_def,
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/unresolved_asset_job_definition.py", line 159, in resolve
    asset_selection=self.selection.resolve([*assets, *source_assets]),
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/asset_selection.py", line 157, in resolve
    return self.resolve_inner(asset_graph)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/definitions/asset_selection.py", line 230, in resolve_inner
    for asset_key in selection
```
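(For reference: this `TypeError` is the generic error `functools.reduce` raises when given an empty sequence and no initializer; seeing it here in `asset_selection.py` suggests the asset selection resolved to zero asset keys. A minimal reproduction of just the Python-level error:)

```python
from functools import reduce

# reduce() over an empty sequence with no initial value raises TypeError,
# which is what an asset selection resolving to zero keys can surface as.
try:
    reduce(lambda a, b: a | b, [])
except TypeError as e:
    print(e)  # reduce() of empty sequence with no initial value
```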
b
Hmm, it looks like the asset job is resolving no assets. What’s the asset job that’s triggering this?
d
The definition is:

```python
airbyte_config_assets = load_assets_from_connections(
    airbyte=airbyte,
    connections=[payments_to_warehouse],
    key_prefix="payments_backend",
)
```
b
is there a job in your repository that relies on these assets?
I’m wondering if maybe the asset names or prefix are mismatched and it’s causing a job to resolve as empty
d
Good thought - let me check
We do define an asset job that has a group name value that matches this line from the previous `load_assets_from_airbyte_instance` call:

```python
#     connection_to_group_fn=lambda group_name: "payments_backend_replication",
```

It would seem that it’s defining an asset job based on that `connection_to_group_fn` group_name value, eh?
b
Yup, I think you’ll want `connection_to_group_fn` on the `load_assets_from_connections` call too? I think what’s happening is that the job is not finding the new assets because they’re missing the group name
d
Good catch, let me add that in and take it to dev
Looks like that did the trick, thank you Ben.
I think my original issue had to do with the asset reloading, which was alleviated by no-op’ing the asset definition
b
Yeah, that’s very odd, something another user encountered recently
We are hoping to figure out why that data seems to be cached even after reload
d
Yea, probably my colleague, Guy. We’re just causing all sorts of problems. I’ll leave y’all alone for the rest of the year. Happy holidays!
b
thank you for the detailed write-up! this should make it easier for us to try to replicate on our end 🙏 happy holidays!
d
Of course. I’m wondering if it’s related to our infrastructure/K8s, but I’ll try not to think about it until after Xmas!
FWIW, our instance of Dagster is deployed on K8s via Helm Chart
@ben would it make sense for me to open an issue for the above?
b
That would be great, I will have some time to try to replicate this today+tomorrow
d
sounds good - should be able to write it up tonight
Opened the issue here. Happy to help provide any other info needed to reproduce.