Steven Schmatz
02/22/2024, 11:23 PM

Scout
02/22/2024, 11:23 PM
```python
from dagster import RunRequest, SkipReason, sensor
from dagster_aws.s3.sensor import get_s3_keys


@sensor(job=log_file_job)  # log_file_job is assumed to be defined elsewhere
def my_s3_sensor(context):
    # Resume from the last key seen; None on the sensor's first tick.
    since_key = context.cursor or None
    new_s3_keys = get_s3_keys("my_s3_rfp_bucket", since_key=since_key)
    if not new_s3_keys:
        return SkipReason("No new s3 files found for bucket my_s3_rfp_bucket.")
    last_key = new_s3_keys[-1]
    # One run per new key; run_key deduplicates requests across ticks.
    run_requests = [RunRequest(run_key=s3_key, run_config={}) for s3_key in new_s3_keys]
    context.update_cursor(last_key)
    return run_requests
```
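For context, any object that lands in the bucket is returned by `get_s3_keys` on the sensor's next tick. Here is a minimal sketch of such an upload with boto3; the key name and file contents are made up for illustration:

```python
import boto3

# Hypothetical upload that my_s3_sensor above would pick up on its next tick.
s3 = boto3.client("s3")
s3.put_object(
    Bucket="my_s3_rfp_bucket",   # bucket watched by the sensor
    Key="rfps/example.pdf",      # becomes the RunRequest's run_key
    Body=b"example file contents",
)
```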
For monitoring job progress, you can use Dagster's built-in UI, Dagit, which provides visibility into job runs, including their status and logs, and serves as a central place for teams to monitor and interact with their data pipelines. If a team needs additional functionality, such as uploading files to trigger runs, Dagster can be paired with other tools: for example, a Retool interface could upload files to S3, with a sensor like the one above detecting those files and launching the appropriate jobs.

For more advanced monitoring or custom interactions, teams can also use Dagster's GraphQL API to interact with a Dagster instance programmatically, for example to build custom dashboards or integrate with other monitoring tools. Ultimately, the choice of tools and approaches depends on the team's requirements, existing infrastructure, and preferences; Dagster is designed to be flexible and extensible, so it can integrate with a wide range of tools and workflows.
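As one illustration of the GraphQL route, the `dagster-graphql` package ships a Python client. A minimal sketch, assuming a Dagster webserver reachable on localhost:3000 and reusing the `log_file_job` name from the sensor example (both are assumptions, not details from this thread):

```python
from dagster_graphql import DagsterGraphQLClient, DagsterGraphQLClientError

# Assumes a Dagster webserver running at localhost:3000.
client = DagsterGraphQLClient("localhost", port_number=3000)

try:
    # Launch a run of the (assumed) log_file_job, then poll its status.
    run_id = client.submit_job_execution("log_file_job", run_config={})
    status = client.get_run_status(run_id)  # returns a DagsterRunStatus enum
    print(f"Run {run_id} is {status}")
except DagsterGraphQLClientError as exc:
    print(f"GraphQL request failed: {exc}")
```

The same queries can be issued from any GraphQL client, which is what makes this a reasonable foundation for custom dashboards or external monitoring integrations.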