#dagster-feedback

Binoy Shah

08/11/2022, 6:07 PM
As per https://docs.dagster.io/_apidocs/assets#dagster.SourceAsset
class dagster.SourceAsset(key, metadata=None, io_manager_key=None, io_manager_def=None, description=None, partitions_def=None, _metadata_entries=None, group_name=None, resource_defs=None)

    A SourceAsset represents an asset that will be loaded by (but not updated by) Dagster.
What does “asset will be loaded by (but not updated by) Dagster” mean? For example, when Dagster builds assets from Airbyte (https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-airbyte/dagster_airbyte/asset_defs.py), it “appears” as though Airbyte is updating the asset and Dagster is loading it, yet SourceAssets were not used for this. What would be a more realistic example of a SourceAsset that is loaded by Dagster but not updated? And by “loading”, does it only mean displaying it in the Dagit UI?

claire

08/11/2022, 6:32 PM
Hi Binoy. One example of a source asset would be a file--you might want to read a CSV in, and update assets based on that CSV, but you wouldn't write back to that CSV. The Software Defined Assets with Pandas and Pyspark guide features an example of a source asset that reads from a CSV.
Another example I can name is that you might have multiple Dagster repositories separated by business domains. If you wanted one of your assets to depend on an asset outside of your business domain's repository, you would define that external asset as a source asset because you wouldn't write back to it.

Binoy Shah

08/11/2022, 7:24 PM
Thanks Claire, that clarifies it

sandy

08/11/2022, 9:02 PM
@Binoy Shah - I agree with you that the doc here is a little unclear - if you have any suggestions on how to improve it, we would be receptive

Binoy Shah

08/11/2022, 9:16 PM
Here I think the answer given by @claire was good enough.
1. Such real-life scenarios should be part of the official documentation.
2. Each API should have example usage. You are already following TDD, so the test cases for the APIs and API helper methods should be consistently attached to the docs.
On a side note, I think Dagster introduces a lot of new “Concepts” to the world of data orchestration. I’ve watched your last 2 presentations, @sandy, and although I would like to switch my mental paradigm to asset-based workflows, Dagster has become “Too Rich” a library and framework, and people’s collective skillset has not caught up. I have very often searched Dagster’s official YouTube channel for tutorials to get clarity on the APIs and concepts presented in the Dagster core library.
Using such indicators for important and cautionary aspects of the documentation might also help in separating the different levels of narrative.

sandy

08/11/2022, 9:22 PM
thanks - that's very helpful feedback