Powered by Linen
dagster-support
  • s

    Stefan Adelbert

    11/04/2022, 3:58 AM
    Suggestions for a custom schedule: I have a requirement to schedule a job once a fortnight, but the fortnights need to be aligned with an organisation's specific pay periods. I am already determining the start and end dates of the most recent pay period using
    tuple(rrule(WEEKLY, interval=2, dtstart=datetime.date(2022,7,4), until=date))[-2:]
    where 4 July 2022 is a known historic pay period start date. I'm considering a sensor which runs once a day and checks whether "today" is the Tuesday after the most recent period. If so, it yields a RunRequest. Any other ideas?
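A minimal sketch of the daily check described above, using only the standard library. It assumes that pay periods are exactly 14 days long, that they start on the Monday 4 July 2022 anchor from the message, and that "the Tuesday after the most recent period" means two days after the period's closing Sunday. The helper names are hypothetical.

```python
import datetime

# Known historic pay-period start, taken from the message above (a Monday).
ANCHOR = datetime.date(2022, 7, 4)
PERIOD_DAYS = 14


def most_recent_completed_period(today, anchor=ANCHOR):
    """Return (start, end) of the latest fully elapsed pay period, or None."""
    elapsed = (today - anchor).days
    completed = elapsed // PERIOD_DAYS  # number of full periods before today
    if completed < 1:
        return None
    start = anchor + datetime.timedelta(days=(completed - 1) * PERIOD_DAYS)
    end = start + datetime.timedelta(days=PERIOD_DAYS - 1)
    return start, end


def is_run_day(today, anchor=ANCHOR):
    """True when today is the Tuesday right after a completed pay period."""
    period = most_recent_completed_period(today, anchor)
    if period is None:
        return False
    _, end = period  # periods end on a Sunday, so Tuesday is end + 2 days
    return today == end + datetime.timedelta(days=2)
```

Inside a once-a-day sensor, the body could yield a RunRequest only when `is_run_day(datetime.date.today())` is true. A weekly Tuesday cron schedule that applies the same check before launching would probably work as well.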
    :dagster-bot-resolve: 1
    s
    • 2
    • 5
  • s

    saravan kumar

    11/04/2022, 6:24 AM
    In the resource, when I use os.getenv(""), the DB connection works well. I tried to move the details to run_config as per the documentation, but the values just come through as given instead of being read from the environment.
    @job(resource_defs={"db_session": db_session})
    def job_sample_db_test():
        run_sample_db_test()
    
    
    job_sample_db_test.execute_in_process(
        run_config={
            "resources": {
                "db_session": {
                    "config": {
                        "DB_USER": {"env": "DB_USER"},
                        "DB_PASSWORD": {"env": "DB_PASSWORD"},
                    }
                }
            }
        }
    )
    @resource
    @contextmanager
    def met_db_session(context):
        try:
            user = context.resource_config["DB_USER"]
            password = context.resource_config["DB_PASSWORD"]
            db = os.getenv("DB_DB", "")
            host = context.resource_config["DB_HOST"]
            port = int(os.getenv("DB_PORT", 54332))
            db_connection = create_engine(f'postgresql://{user}:{password}@{host}:{port}/{db}')
            session_maker = sessionmaker(bind=db_connection)
            session = session_maker()
            yield session
        finally:
            db_connection.close()
    Whatever uses os.getenv gets the proper values, but context.resource_config just gets {"env": "DB_USER"} without actually reading DB_USER from the environment. What am I missing? The secrets are Kubernetes secrets and are available via os.getenv. I am just looking for a resource that gives me a SQLAlchemy DB connection, so if there is a better way, I am down for it.
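As far as I know, Dagster only resolves the {"env": ...} form when the field is declared with a source type such as StringSource in the resource's config_schema, and the resource posted above declares no schema, so the dict arrives untouched. The sketch below is not Dagster code: it is a standard-library illustration of what that resolution amounts to, with a hypothetical helper name, and it could also serve as a manual workaround inside the resource.

```python
import os


def resolve_env_entries(config):
    """Replace every {"env": "NAME"} value with os.environ["NAME"]."""
    resolved = {}
    for key, value in config.items():
        if isinstance(value, dict) and set(value) == {"env"}:
            # Raises KeyError if the variable is missing, which surfaces
            # misconfiguration early instead of passing an empty string on.
            resolved[key] = os.environ[value["env"]]
        else:
            resolved[key] = value
    return resolved
```

Called as `resolve_env_entries(context.resource_config)`, this would turn the config from the run above into plain strings before the connection URL is built.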
    :dagster-bot-resolve: 1
    z
    s
    • 3
    • 20
  • l

    Levan

    11/04/2022, 7:57 AM
    Hi! Is there any way I can separate asset_name and table_name for an asset definition with io_manager? I’ve got 2 tables with the same name in different db/schema, and defining them in different Dagster repos raises an error about duplicated asset names.
    :dagster-bot-responded-by-community: 1
    v
    s
    • 3
    • 3
  • d

    Deepa Vasant

    11/04/2022, 8:59 AM
    Hello all, can we configure the Dagit UI for different roles and users, with a different workspace for each of them?
    :dagster-bot-responded-by-community: 1
    :dagster-bot-resolve: 1
    v
    • 2
    • 2
  • d

    Davi

    11/04/2022, 9:20 AM
    Hello everyone, is there an equivalent in Dagster to Airflow's PythonVirtualenvOperator? Thanks!
    :dagster-bot-resolve: 1
    j
    • 2
    • 2
  • j

    Joshua Smart-Olufemi

    11/04/2022, 10:26 AM
    Hi there. I am following the initial Dagster tutorials and I am at the Assets without Arguments and Return Values section. I ran through the whole section in VS Code yesterday and it worked fine. I tried doing the same thing again this morning, but I get an error from the Dagster webpage that says
    FileNotFoundError: [WinError 2] The system cannot find the file specified: 'serial_asset_graph.py'
    File "C:\Users\josh\Desktop\toggle assignment\assignment\venv\lib\site-packages\dagster\_grpc\server.py", line 230, in __init__
    self._loaded_repositories: Optional[LoadedRepositories] = LoadedRepositories(
    File "C:\Users\josh\Desktop\toggle assignment\assignment\venv\lib\site-packages\dagster\_grpc\server.py", line 104, in __init__
    loadable_targets = get_loadable_targets(
    File "C:\Users\josh\Desktop\toggle assignment\assignment\venv\lib\site-packages\dagster\_grpc\utils.py", line 33, in get_loadable_targets
    else loadable_targets_from_python_file(python_file, working_directory)
    File "C:\Users\josh\Desktop\toggle assignment\assignment\venv\lib\site-packages\dagster\_core\workspace\autodiscovery.py", line 27, in loadable_targets_from_python_file
    loaded_module = load_python_file(python_file, working_directory)
    File "C:\Users\josh\Desktop\toggle assignment\assignment\venv\lib\site-packages\dagster\_core\code_pointer.py", line 75, in load_python_file
    os.stat(python_file)
    I'm a bit spun because I have made no changes to my setup. I would love some help in debugging this
    s
    • 2
    • 1
  • m

    Matthew Karas

    11/04/2022, 12:23 PM
    I looked in the Dagster issues and did not see this, but it seems like the Dagster logger eats exception groups and does not log all the information in the exception group. This is in reference to a validation error in cattrs: cattrs got bad data and raised an exception group (see https://cattrs.readthedocs.io/en/latest/validation.html). I get the exception, but not the validation error itself, which is in the "sub-exception".
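Until the logger handles groups itself, one workaround is to flatten the group before logging it. The sketch below assumes only that the exception follows the PEP 654 convention of exposing its children through an `exceptions` attribute, which the built-in ExceptionGroup does. The function names are hypothetical and nothing here is Dagster or cattrs API.

```python
def leaf_exceptions(exc):
    """Yield the non-group exceptions nested inside exc, depth first."""
    # ExceptionGroup (PEP 654) and its backports expose sub-exceptions
    # through an `exceptions` attribute; plain exceptions do not have it.
    children = getattr(exc, "exceptions", None)
    if not children:
        yield exc
        return
    for child in children:
        yield from leaf_exceptions(child)


def describe(exc):
    """Build one log-friendly string that includes every sub-exception."""
    return "; ".join(f"{type(e).__name__}: {e}" for e in leaf_exceptions(exc))
```

Catching the error around the structuring call and passing `describe(err)` to `context.log.error` would then put the nested validation messages into the run log.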
    :dagster-bot-resolve-to-issue: 1
    s
    d
    • 3
    • 2
  • j

    Jordan

    11/04/2022, 12:24 PM
    Hi! A sensor sometimes fails, for example if there is an error in the execution of the code, if there are too many requests to be triggered in one tick, or for various other reasons. So I try to monitor this by sending a notification and, above all, by switching off the sensor so that I don't receive a notification for each tick.
    @sensor(
        name='my_sensor',
        job=my_job,
        minimum_interval_seconds=30,
    )
    def my_sensor(context: SensorEvaluationContext):
        try:
            since_key = context.cursor or None
    
            new_s3_keys = get_s3_keys(
                bucket=bucket,
                prefix=prefix_path,
                since_key=since_key,
            )
    
            if not new_s3_keys:
                return SkipReason("No new s3 files")
    
            last_key = new_s3_keys[-1]
    
            if …:
                context.update_cursor(last_key)
                return [RunRequest(tags=None, run_key=file) for file in new_s3_keys]
    
            context.update_cursor(last_key)   
            return SkipReason("No file corresponds")
    
        except Exception as err:
            send_notification(err)
            stop_this_sensor()
    How do I use DagsterInstance.stop_sensor at this point in the code and get instigator_origin_id and selector_id? Thanks in advance
    s
    d
    o
    • 4
    • 6
  • b

    Bojan

    11/04/2022, 2:11 PM
    👋 Heya folks, I’m working through Snowflake and Dagster SDAs and I’ve got a question. It’s highly likely that I’m failing to understand something, but when I want to read in a table that already exists in Snowflake, how do I use that asset later on? The example gives the following:
    @asset(
        key_prefix=["my_schema"]  # will be used as the schema in snowflake
    )
    def my_table() -> pd.DataFrame:  # the name of the asset will be the table name
        ...
    :dagster-bot-responded-by-community: 1
    :dagster-bot-resolve: 1
    j
    • 2
    • 14
  • b

    Bojan

    11/04/2022, 2:11 PM
    But how do I actually return that table?
    :dagster-bot-not-a-thread: 1
  • m

    Manish Khatri

    11/04/2022, 3:06 PM
    Hello Dagster team :wave_anim: we currently have all our dbt models loaded as assets with the help of the fantastic function load_assets_from_dbt_project(…). All the models + lineage show up in Dagit, which is superb :partydagster:. If I wanted to define some of the parent assets (like the one arrowed in the attached picture), how can I do this as a SourceAsset with a TableSchema.from_name_type_dict(…) so we can document the columns and have this linked to the stg_snowflake_query_history dbt asset in the picture? Is this possible?
    b
    s
    • 3
    • 5
  • s

    Sireesha Kuchimanchi

    11/04/2022, 3:42 PM
    Hi everyone!!
    :dagster-bot-not-a-thread: 1
  • s

    Sireesha Kuchimanchi

    11/04/2022, 3:49 PM
    I am working with Dagster and Snowflake. I have created an ETL pipeline with Dagster and extracted data from a CSV file. Now I am trying to populate the extracted data into the respective Snowflake tables. So with what extension should I save the file, and where should I save it? And what is the command to run the Snowflake file?
    r
    • 2
    • 3
  • s

    stefan hansan

    11/04/2022, 4:09 PM
    Hi! I can't seem to find much information about the required resource key, 'requests_session'. I am using it in the following way:
    session = context.resources.request_session
    session.post(url, data={'foo': 'bar'})
    Would anyone know where this data object resides in the POST request itself? I am trying to unwrap it on the backend of the URL I am calling, but I just can't figure out where it lives. Thank you!
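Assuming the resource hands back a standard requests.Session, passing a dict as data= form-encodes it into the request body (Content-Type application/x-www-form-urlencoded), not into the query string and not as JSON. The sketch below reproduces that encoding with the standard library and shows one way a backend could decode it. `parse_form_body` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode

# What the client puts in the request body for data={'foo': 'bar'}.
payload = {"foo": "bar"}
body = urlencode(payload)  # 'foo=bar'
content_type = "application/x-www-form-urlencoded"


def parse_form_body(raw_body):
    """Server side: decode a form-encoded body into a flat dict."""
    # parse_qs returns a list per key, so keep the first value of each.
    return {key: values[0] for key, values in parse_qs(raw_body).items()}
```

Most web frameworks expose this as "form" data on the request object. If the backend expects JSON instead, requests offers the json= parameter in place of data=.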
    :dagster-bot-resolve: 1
    s
    • 2
    • 2
  • s

    Salvador Ribolzi

    11/04/2022, 4:28 PM
    Hi! We are running Dagster Cloud on a Windows server with the local runner, and calling os.getenv('var') returns nothing even though the var is set for all users. Any idea what might be happening? This also causes s3_resource to fail with no credentials (we both ran aws configure and added the key/secret as env vars). We are a bit limited permissions-wise, so accessing the Windows user that's running Dagster is not possible for us (though IT is helping with that).
    d
    • 2
    • 18
  • a

    Aaron Hoffer

    11/04/2022, 7:10 PM
    I recently upgraded to v1.0.16 and am getting the following error, even though I’m running in k8s with a Postgres backend for jobs/schedules/etc.:
    sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread.
    My jobs are using in-memory output storage, and I did notice this recent change (https://github.com/dagster-io/dagster/pull/10154). The exceptions are happening in the pods generated for the Kubernetes jobs. The jobs seem to complete fine, but I’m getting at least one exception per run.
    p
    i
    +2
    • 5
    • 14
  • v

    Vineet Balachandran

    11/04/2022, 7:31 PM
    Hello, is there a way to pass the cron_schedule for a job through an environment variable? Basically, I am trying to change the schedule for a job without needing to deploy the code.
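One way this could look, sketched with the standard library only: read the cron string from the environment when the schedule is defined. The variable name MY_JOB_CRON, the default value, and the helper are all hypothetical. The value is read when the code is loaded, so a changed variable presumably takes effect only after the code location is reloaded or restarted.

```python
import os

DEFAULT_CRON = "0 6 * * *"  # hypothetical fallback: every day at 06:00


def cron_from_env(var_name="MY_JOB_CRON", default=DEFAULT_CRON):
    """Read a five-field cron string from the environment."""
    value = os.getenv(var_name, default).strip()
    if len(value.split()) != 5:
        raise ValueError(f"{var_name} is not a five-field cron string: {value!r}")
    return value


# The result would then be passed as the cron_schedule argument, e.g.
# ScheduleDefinition(job=my_job, cron_schedule=cron_from_env())
```

The field-count check is only a cheap sanity test so that a typo in the variable fails loudly at load time. It does not validate the cron expression itself.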
    :dagster-bot-resolve: 1
    d
    • 2
    • 2
  • n

    Nicolas May

    11/04/2022, 7:52 PM
    Is it possible to have two different versions of dagster.yaml? Like one for local dev/test/debug and one for prod deployment? I see there can be different workspace.yaml-like files thanks to the --workspace CLI option (https://docs.dagster.io/_apidocs/cli#cmdoption-dagster-daemon-run-w).
    :dagster-bot-responded-by-community: 1
    :dagster-bot-resolve: 1
    z
    s
    • 3
    • 8
  • j

    Jack Yin

    11/04/2022, 8:22 PM
    Hey y’all, I’m trying to build an asset sensor for a partitioned asset. As far as I can tell, I will likely end up using run_request_for_partition to kick off the downstream partitioned jobs, but I’m not sure how to get the partition of the asset that the sensor is looking at.
    :dagster-bot-resolve-to-issue: 1
    s
    d
    • 3
    • 29
  • j

    Jack Yin

    11/04/2022, 8:22 PM
    Do I have to use the MultiAssetSensor?
    :dagster-bot-not-a-thread: 1
  • s

    Selene Hines

    11/04/2022, 8:23 PM
    Hey, for Dagster logging, does the dagster_handler_config part of the dagster.yaml setup propagate to all loggers Dagster manages, or just the context.log() logger? E.g. if I have attached the root logger to the managed Python loggers Dagster handles, will a handler/format config apply to it as well?
    :dagster-bot-resolve: 1
    s
    • 2
    • 2
  • k

    kyle

    11/04/2022, 10:50 PM
    I don’t understand how I could define a SourceAsset as a file on S3 and then use it as input to a downstream asset. In the Dagster example assets_pandas_pyspark, the source asset exists as a local file. In my case I just want to define it as a file on S3.
    sfo_q2_weather_sample = SourceAsset(
        key=AssetKey("sfo_q2_weather_sample"),
        description="Weather samples, taken every five minutes at SFO",
        metadata={"format": "csv"},
    )
    s
    • 2
    • 2
  • m

    Maksym Domariev

    11/05/2022, 4:10 AM
    Hi, I'm trying to use the dask executor. Local execution works fine, but when I tried to push it to an existing server, I got this issue:
    dagster._core.errors.DagsterImportError: Encountered ImportError: No module named 'dask_sample_project' while importing module dask_sample_project. Local modules were resolved using the working directory /Users/<my username>/workspace/flow/nlp/dask_sample_project. If another working directory should be used, please explicitly specify the appropriate path using the -d or --working-directory for CLI based targets or the working_directory configuration option for workspace targets.
    The root of the project is correct. Any tips?
    s
    • 2
    • 2
  • f

    fahad anwaar

    11/05/2022, 6:20 AM
    Hi all! I hope you guys are fine. What is the best approach to using sqlalchemy? I have checked the dagster-sqlalchemy library, but I’m looking for a more custom approach to using sqlalchemy. Please share any articles. Thanks
    :dagster-bot-resolve: 1
    a
    s
    • 3
    • 2
  • q

    Qwame

    11/05/2022, 9:23 AM
    Hi, I've been playing with IOManagers and I have a question. How do I access, in load_input, the metadata values added to the OutputContext? For example:
    def handle_output(self, context: OutputContext, obj: Dict) -> None:
        ...

        context.add_output_metadata({"value_to_be_accessed": 'I want to access this value'})

    def load_input(self, context: InputContext) -> str:
        context.log.info(context.upstream_output.get_logged_metadata_entries())
    I have tried context.upstream_output.metadata and context.upstream_output.get_logged_metadata_entries() and have had no success. How do I do this?
    d
    y
    • 3
    • 4
  • b

    Bojan

    11/05/2022, 1:13 PM
    Heya folks, I’m still stumped by this, and it's highly likely I’m doing something wrong. Essentially I’m trying to
    Create an asset to represent the table and pass it downstream. The IO manager for your table asset will then be responsible for how the table is presented to the downstream asset.
    I’m using Snowflake’s IO manager from https://docs.dagster.io/_apidocs/libraries/dagster-snowflake. What I’m having issues with is essentially “materializing” from an existing table in Snowflake, i.e. I have a table my_table in a schema my_schema. It’s not clear to me how to load this table as an asset and then use it downstream in other assets.
    @asset(
        key_prefix=["my_schema"]  # will be used as the schema in snowflake
    )
    def my_table() -> pd.DataFrame:  
        return ???
        ...
    Is this even possible?
    :dagster-bot-resolve: 1
    s
    • 2
    • 2
  • s

    Sireesha Kuchimanchi

    11/05/2022, 2:37 PM
    Hey everyone!
    :dagster-bot-not-a-thread: 1
  • s

    Sireesha Kuchimanchi

    11/05/2022, 2:37 PM
    I have used snowflake io manager but I am still getting errors. I will provide the screenshots of the code and the error I am getting below.
    :dagster-bot-not-a-thread: 1
  • s

    Sireesha Kuchimanchi

    11/05/2022, 2:38 PM
    This is the code
    :dagster-bot-not-a-thread: 1
  • s

    Sireesha Kuchimanchi

    11/05/2022, 2:39 PM
    This is the error I am getting.
    :dagster-bot-not-a-thread: 1
    k
    • 2
    • 2
k

Kyle Gobel

11/05/2022, 4:58 PM
The errors don't really seem related to the code you posted. Just guessing, but it looks like you need to install dagster-snowflake:
pip install dagster-snowflake
(I think), and then maybe you'll get some better error messages.
s

Sireesha Kuchimanchi

11/06/2022, 1:44 PM
Sure I'll try that