Hi, I'm having trouble in configuring runs spawne...
# ask-community
m
Hi, I'm having trouble configuring runs spawned by the reconciliation sensor and by the automaterialization daemon. How can I specify a run config in those cases? I didn't find anything in the docs or here in the Slack channel.
c
m
Thanks, is it better to discuss the matter on GitHub? I'm also adding @Christopher Tee to the discussion, who asked a similar question here on the Slack channel a month ago and got no answer: https://dagster.slack.com/archives/C01U954MEER/p1678783501272329?thread_ts=1678783501.272329&cid=C01U954MEER
s
@Marco Jacopo Ferrarotti - are you able to explain why you want to use config in this situation in a little more detail? This isn't currently possible, but understanding your use case would help us figure out the best way to add support for it.
m
Hi @sandy, one of the first assets in the pipeline I'm developing is a bronze ingestion table where I ingest a large number of semi-structured files from an SFTP server. I modeled the connection to the SFTP server as a resource, and the asset takes the path to the folder containing the files on that SFTP server as a config parameter. I didn't want to embed the path in the resource, since that would prevent the same resource from being used by other assets that need a connection to the same SFTP server but on different paths. I'd like to be able to automaterialize the asset either with a reconciliation sensor or with a policy. Since both of those trigger a Run, I'd expect both to let me specify a RunConfig, exactly like I would with a RunRequest in a sensor or in a schedule. I'm open to discussing moving the config out of the asset and into the resource. But I strongly think that if Dagster allows Software Defined Assets to be defined with configs, then it should also support those configs in automaterialization policies.
s
Why not just hardcode the path inside the asset itself? E.g.
@asset
def load_from_sftp(sftp_resource: SFTPResource):
    loaded_data = sftp_resource.read_from_folder("/a/b/c")
    ...
Btw I'm not suggesting this is necessarily the best way to do this, just trying to get a deeper understanding of the shape of what you're trying to accomplish.
m
I don't necessarily like to embed that kind of thing in the function code. That path might change, and it is certainly different between the production and dev environments. So a config would be more flexible in my opinion.
s
Got it - makes sense. Do you have thoughts on where that config would ideally live? In YAML files in your git repository? A Python dictionary that you pass in when constructing your Definitions object?
m
@sandy sorry for the delayed answer. Going back to the discussion: for now, that configuration lives in my code as a static dict that I import and use in all the assets that should receive a configuration. Something along these lines:
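# config.py
import os

# Per-environment asset settings; the environment is picked once, at import time,
# from DAGSTER_ENV (defaulting to "dev").
assets_config = {
  "dev": {
    "AwesomeAsset": {
      "base_path": "/this/is/for/dev"
    }
  },

  "prod": {
    "AwesomeAsset": {
      "base_path": "/this/is/for/prod/"
    }
  },
}[os.getenv("DAGSTER_ENV", "dev")]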
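# AwesomeAsset.py
from dagster import asset

from .config import assets_config

# SFTPResource is my own resource class, defined elsewhere in the project.
@asset(...)
def AwesomeAsset(sftp_resource: SFTPResource):
  # Look up the environment-specific folder instead of hardcoding it here.
  loaded_data = sftp_resource.read_from_folder(assets_config["AwesomeAsset"]["base_path"])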
Ideally, it would be nice to state the asset config explicitly with the Pythonic API and then have something like the legacy configured API to pass a configured version of the asset to the Definitions object.
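For readers following along, the static-dict workaround in the message above could be wrapped in a small helper. This is only a minimal sketch that deliberately avoids any Dagster API: `AssetConfig`, `ASSETS_CONFIG`, and `get_asset_config` are hypothetical names that do not appear in the thread, and the only setting taken from it is `base_path`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetConfig:
    """Hypothetical per-asset settings; only base_path comes from the thread."""

    base_path: str


# Same shape as the dict in the thread: environment -> asset name -> settings.
ASSETS_CONFIG = {
    "dev": {"AwesomeAsset": AssetConfig(base_path="/this/is/for/dev")},
    "prod": {"AwesomeAsset": AssetConfig(base_path="/this/is/for/prod/")},
}


def get_asset_config(asset_name, env=None):
    """Return the settings for asset_name in the given (or current) environment."""
    # Fall back to DAGSTER_ENV, defaulting to "dev" as in the original snippet.
    env = env or os.getenv("DAGSTER_ENV", "dev")
    try:
        return ASSETS_CONFIG[env][asset_name]
    except KeyError:
        # Fail with a message that names both the asset and the environment.
        raise KeyError(
            f"no config for asset {asset_name!r} in environment {env!r}"
        ) from None
```

Inside the asset body, the lookup would then read `get_asset_config("AwesomeAsset").base_path`. Resolving the environment at call time rather than at import time is a design choice of this sketch, not something stated in the thread; it makes it easier to override the environment in tests.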
s
Got it - that makes sense - here's a GitHub issue where we're tracking this: https://github.com/dagster-io/dagster/issues/12099