#ask-community

Harpal

07/20/2022, 11:26 AM
What’s up team Dagster! 🎉 I would like to move from using environment variables to run dbt commands in my `AssetGroup()` and `@multi_asset` methods to using `config_schema`, like with `@op`s. The purpose of this is so that I can use and vary these parameters in the Dagit UI. Is there another way to pass parameters from the Dagit UI that can be used in the `AssetGroup()` and `@multi_asset` methods? See the comment section for my env vars in a code snippet 😄
Notice how the ENV_VARS below are used in the `multi_asset` and `AssetGroup` scopes. How can I supply variables from the Dagit UI that can be used in both the `multi_asset` scope (line 46) AND the `AssetGroup` scope (line 63)?
DATASET_TYPE = os.environ.get("DATASET_TYPE", "hold")  # can be "hold" or "arb"
TEST_SPLIT = os.environ.get("TEST_SPLIT", 0.1)
TRAIN_SPLIT = os.environ.get("TRAIN_SPLIT", 0.8)
DIR_ULID = str(ULID())
DBT_PROJECT_DIR = "./dbt"
DBT_PROFILE_DIR = "./dbt/config"
It looks like the main thing stopping us from using a `config_schema` is that `AssetGroup()` is of a different scope from `multi_asset`. Would it be possible to run the above job with configurable `"vars"` in the Dagit UI (and push certain files to GCS) if we stopped using `AssetGroup()`? What would be the cons of dropping it? Editing these files in the Dagit UI would be a super nice feature 🙂
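One detail worth noting about the environment-variable snippet above: `os.environ.get` returns a string whenever the variable is set, so `TEST_SPLIT` would be the float `0.1` by default but the string `"0.25"` when overridden. A minimal sketch of typed parsing, which is roughly what a `config_schema` would enforce for you, could look like this. The helper names `env_float` and `env_choice` are hypothetical and not part of Dagster or dbt.

```python
import os


def env_float(name: str, default: float) -> float:
    """Read an environment variable as a float, falling back to `default` if unset."""
    raw = os.environ.get(name)
    return default if raw is None else float(raw)


def env_choice(name: str, default: str, allowed: tuple) -> str:
    """Read an environment variable and reject values outside `allowed`."""
    value = os.environ.get(name, default)
    if value not in allowed:
        raise ValueError(f"{name} must be one of {allowed}, got {value!r}")
    return value


# Same names and defaults as in the snippet above, but with consistent types.
DATASET_TYPE = env_choice("DATASET_TYPE", "hold", ("hold", "arb"))
TEST_SPLIT = env_float("TEST_SPLIT", 0.1)
TRAIN_SPLIT = env_float("TRAIN_SPLIT", 0.8)
```

With this in place, a typo such as `DATASET_TYPE=abr` fails at import time instead of silently selecting nothing in dbt.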

owen

07/20/2022, 4:44 PM
Hi again @Harpal! In 0.15.0, we moved away from `AssetGroup()` as the main way to build asset-related jobs. I believe this would also work with AssetGroups, but I think the code samples will be a bit cleaner using `define_asset_job`: https://docs.dagster.io/concepts/ops-jobs-graphs/jobs#from-software-defined-assets, so I'm going to use that for the example. The API docs for that function are here: https://docs.dagster.io/_apidocs/assets#dagster.define_asset_job, and one of the parameters that it accepts is `config` (just like traditional op-based jobs). It seems to me that the main change here is having all of your dbt assets available to you at once (rather than only loading "arb" or "hold" at one time). Then, you could have two separate asset jobs (one for rematerializing arb, and another for rematerializing hold). From there, you could choose which job you wanted to run in the UI.
from dagster import repository, define_asset_job, AssetSelection, with_resources
from dagster_dbt import load_assets_from_dbt_manifest

# ...

dbt_hold_assets = load_assets_from_dbt_manifest(
    manifest_json=manifest_json,
    select="tag:hold",
)

csv_hold_assets = csv_assets_for_dbt_assets(dbt_hold_assets)

dbt_arb_assets = load_assets_from_dbt_manifest(
    manifest_json=manifest_json,
    select="tag:arb",  # select the arb-tagged models here, not hold
)

csv_arb_assets = csv_assets_for_dbt_assets(dbt_arb_assets)

hold_job = define_asset_job("hold", AssetSelection.assets(*(dbt_hold_assets + csv_hold_assets)))
arb_job = define_asset_job("arb", AssetSelection.assets(*(dbt_arb_assets + csv_arb_assets)))

all_assets = with_resources(
    dbt_hold_assets + csv_hold_assets + dbt_arb_assets + csv_arb_assets,
    resource_defs=...,  # your resource defs
)


@repository
def my_repo():
    return [*all_assets, hold_job, arb_job]
You'll still need to handle configuring the dbt CLI resource through the UI. That can be done by leaving the dbt CLI resource config blank and setting it all in the launchpad, or, if you want to avoid retyping the dbt project / profile settings every time, you could apply a config mapping to the resource (see: https://dagster.slack.com/archives/C01U954MEER/p1657815567573089?thread_ts=1657669495.255529&cid=C01U954MEER for an example)
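To make the config-mapping idea concrete, here is a minimal sketch of the mapping step as plain Python: a function that takes the short config typed into the launchpad and expands it into a full dbt CLI resource config. Several things here are assumptions rather than anything confirmed in this thread: the config keys `project_dir`, `profiles_dir`, and `vars` (as I recall them from `dbt_cli_resource` in dagster-dbt), the resource key `dbt`, and the example var name `dataset_type`. The function names are hypothetical.

```python
# Paths taken from the original snippet earlier in the thread.
DBT_PROJECT_DIR = "./dbt"
DBT_PROFILE_DIR = "./dbt/config"


def dbt_resource_config(short_config: dict) -> dict:
    """Expand the short launchpad config into a full dbt CLI resource config.

    Only "vars" comes from the user; project and profile paths are filled in
    so they do not have to be retyped for every run.
    """
    return {
        "project_dir": DBT_PROJECT_DIR,
        "profiles_dir": DBT_PROFILE_DIR,
        "vars": dict(short_config.get("vars", {})),
    }


def launchpad_run_config(dbt_vars: dict) -> dict:
    """Build the run config a user would enter once the mapping is applied.

    Assumes the dbt resource is registered under the key "dbt".
    """
    return {"resources": {"dbt": {"config": {"vars": dict(dbt_vars)}}}}


# Hypothetical var name, for illustration only.
example = launchpad_run_config({"dataset_type": "arb"})
expanded = dbt_resource_config(example["resources"]["dbt"]["config"])
```

In Dagster itself, a function shaped like `dbt_resource_config` is the kind of thing you would wrap with `configured` around the dbt CLI resource, declaring a config schema that only exposes `vars`. Check the linked thread and the API docs for the exact form in your version.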