random
  • m

    Martin Bach

    05/04/2022, 1:26 PM
    At Restack we created a small video about the story of Dagster for you :dagster-spin: I hope you will like it 😊

    https://youtu.be/Rn0miOSmD9E

    🙌 1
    🙏 6
    🙏🏽 1
    s
    • 2
    • 1
  • d

    Daniel Eduardo Portugal Revilla

    05/09/2022, 2:08 PM
    Don’t forget to register for the Beam College event, totally free 😄 🐝 https://www.linkedin.com/posts/daniel-portugal_season-2022-activity-6929430204679434242-IZI8
  • a

    Anoop Sharma

    05/18/2022, 1:39 PM
    Hi #random, I have written a module which can help in building dagster ops, graphs and repositories in an automated way using a config file. I have used it now in a couple of my deployments and found it helpful. Sharing it here in case you feel it can be used in your project as well. It covers only a few basic features as of now, but I believe you can extend it as per your case. https://github.com/asanoop24/dagger/tree/master
    a
    • 2
    • 3
  • d

    David Lakomski

    05/19/2022, 4:34 PM
    From: Airflow, Prefect, and Dagster: An Inside Look. Quote: "Dagster takes a first-principles approach to data engineering. It is built with the full development lifecycle in mind, from development, to deployment, to monitoring and observability. Prefect, on the other hand, adheres to a philosophy of negative engineering, built on the assumption that the user knows how to code and makes it as simple as possible to take that code and built it into a distributed pipeline, backed by its scheduling and orchestration engine." Newbie comment: I really agree with it. I find the separation of concerns principle is strong in Dagster :baby-yoda: (and I've already learned a lot just by losing myself in the documentation and people's code)
    👀 1
  • r

    Rubén Lopez Lozoya

    05/20/2022, 1:39 PM
    Hey, has anyone ever used Dagster in conjunction with TimescaleDB?
    d
    • 2
    • 1
  • g

    George Pearse

    05/24/2022, 4:24 PM
    Just read this article and loved it https://www.ethanrosenthal.com/2022/05/10/database-bundling/ (unfortunately they have a go at the unbundling airflow article which I thought was pretty good). In my mind it's basically arguing that the analytics stack has MLOps covered, provided someone comes up with a little more guidance / config in terms of how to build on top of it (like a DBT for Model monitoring instead of Data Warehouses)
    ❤️ 1
  • t

    Tomas Vykruta (EIQ)

    05/25/2022, 1:46 PM
    Hey folks, we’re looking for a data analytics framework that can be embedded in a web app. Who are the hottest startups out there doing this? We’re evaluating LOOKER and Metabase, looking for more options. Our data lives inside BQ.
    g
    • 2
    • 1
  • s

    sanjay hora

    05/28/2022, 7:17 AM
    👋 Hello, team! I'm here to learn more about the scheduling aspects of Dagster from the community
    ❤️ 1
  • r

    raaid

    06/01/2022, 12:41 PM
    Not sure if anyone has run into this, but: I have one postgres instance with multiple dbs (one for my application code, one for Dagster). Been trying to prevent it from ballooning on storage space, and didn't realize that the event_logs table can get so very large (it's half my storage). So if you happen to have something eating storage and can't pinpoint it, couldn't hurt to check that table!
    👍 2
    f
    r
    • 3
    • 3
  • d

    Denis Maciel

    06/01/2022, 1:31 PM
    Hey folks, we are just getting started with Dagster and we are wondering if we should have two separate deployments of Dagster (dev and prod) or a single deployment where you would configure the jobs differently depending if they're running on dev or prod. Do you have any strong opinions about these approaches?
    👍 1
    i
    z
    +2
    • 5
    • 8
  • a

    Adrien Ruault

    06/07/2022, 12:04 PM
    🧚 :dancing: 🕺 :dancing: 💃 :dancing: 🧚‍♀️ Hello guys,
    My project's current status: I am working on the development of an ML product. So far we have used DVC to define our ML pipeline, including preprocessing steps, training, baseline computations, and so on. We are working with GCP, and so far we have been running our pipeline using Vertex AI training jobs.
    Where I need your help: Now we want to move to another DAG framework that allows us to run the different steps on different machines (with or without GPU) and that is more integrated with GCP. I have always considered Kubeflow a strong option, as it is well integrated with GCP and specifically Vertex AI. I recently saw a talk at PyCon Italy about Dagster and felt it could compete with Kubeflow, but I still can't figure out whether Dagster is better. One specific question I don't have the answer to is whether there is an easy way to run Dagster ops on Cloud Run or Vertex AI.
    Thanks a lot for your help 🙂 Your project looks really cool!!! (@Andrea Giardini I saw you at PyCon Italy and contacted you on LinkedIn; I thought it would make sense to ask my question here 🙂) 🧚 :dancing: 🕺 :dancing: 💃 :dancing: 🧚‍♀️
    s
    a
    +2
    • 5
    • 10
  • j

    Jeffrey Kaditz

    06/17/2022, 8:57 PM
    Hey, I'm new to Dagster and was curious if there were any guides on how to write a simple abstraction layer to easily port existing map/reduce code running in MRJob to Dagster. My impression is that Dagster supports a more general computation paradigm than map/reduce, so this should be possible. Any guidance is greatly appreciated.
    d
    • 2
    • 2
  • v

    Veer

    06/18/2022, 12:53 PM
    Hi all, I work as a database administrator and my Python knowledge is very limited. I know control flow and am practicing functions now. In order to work with Dagster, which Python concepts should I learn and practice?
    s
    d
    • 3
    • 2
  • c

    Charles

    06/27/2022, 12:43 PM
    Any Dagster experts here want to do a couple of hours of consultancy to help us hit the ground running, especially on the DevOps side? DM me
  • g

    George Pearse

    06/27/2022, 3:33 PM
    I was going to start digging into feature stores, but having just had a peek at Feast's I've realised that the API and functionality can be very similar to Software Defined Assets in Dagster (maybe with a more column-centric, rather than table / asset-centric, view). Would I get much in the way of additional benefits from adopting one?
    s
    c
    • 3
    • 7
  • g

    George Pearse

    06/29/2022, 3:47 PM
    I wrote a bunch of pipelines to migrate data from Mongo (App DB) to Postgres (temporary budget data warehouse until we upgrade to something like Snowflake). Realised too late that what I was really doing was defining Airbyte Sources https://docs.airbyte.com/connector-development/tutorials/building-a-python-source and Destinations, which are already well defined and robust. Our main Software Engineer is touchy about App DB query load. I'd imagine that Airbyte will be just as efficient as anything I've written, if not more so? My Mongo queries retrieve all the data in a collection since a timestamp from just before the previous pipeline execution, if that makes sense, and then upsert on a key so a tiny bit of record overlap doesn't matter. Basically my pipelines can be a little buggy, should I just migrate? @Stephen Bailey I do use copy_expert for my backfills, which makes them speedy; not sure if Airbyte would do that. But I guess the fact that Airbyte takes everything and loads it into JSON means that I wouldn't need to run a backfill because there's no transformation logic to get wrong or change. Not posting on Airbyte Slack because Airbyte devs will obviously tell me to use Airbyte.
    s
    i
    • 3
    • 6
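The incremental pattern described in the message above (pull everything changed since shortly before the last run, then upsert on a key so that overlapping records are harmless) can be sketched roughly as follows. This is an illustration only, not the author's pipeline: it uses the standard-library `sqlite3` module as a stand-in for Postgres, and the `records` table, its columns, and the `sync_since` helper are all hypothetical names.

```python
import sqlite3

def sync_since(source_rows, conn, since):
    """Upsert every source row updated at or after `since`; return how many were sent."""
    changed = [r for r in source_rows if r["updated_at"] >= since]
    conn.executemany(
        """
        INSERT INTO records (id, payload, updated_at)
        VALUES (:id, :payload, :updated_at)
        ON CONFLICT(id) DO UPDATE SET
            payload = excluded.payload,
            updated_at = excluded.updated_at
        """,
        changed,
    )
    conn.commit()
    return len(changed)

# Hypothetical target table, keyed on `id` so that re-sent rows overwrite themselves.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id TEXT PRIMARY KEY, payload TEXT, updated_at INTEGER)"
)

source = [
    {"id": "a", "payload": "v1", "updated_at": 10},
    {"id": "b", "payload": "v1", "updated_at": 20},
]
first = sync_since(source, conn, since=0)

# Record "b" changes; the next window deliberately overlaps the previous run,
# so "a" is sent again unchanged and simply overwrites itself.
source[1] = {"id": "b", "payload": "v2", "updated_at": 30}
second = sync_since(source, conn, since=10)

rows = conn.execute("SELECT id, payload FROM records ORDER BY id").fetchall()
```

Because the write is keyed on `id`, re-sending a row from the overlapping window leaves the table unchanged, which is why a small overlap in the timestamp filter is safe.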
  • b

    Binoy Shah

    07/06/2022, 1:57 PM
    Does this matrix make sense for comparison or does it sound too biased ?
    d
    i
    • 3
    • 5
  • t

    Tobias Macey

    07/07/2022, 10:01 PM
    I officially launched my new show, The Machine Learning Podcast! The second episode was published earlier this week. For folks who don't know, I also run the Data Engineering Podcast
    🎉 8
    :partydagster: 1
    z
    s
    • 3
    • 2
  • j

    Jay Jackson

    07/11/2022, 9:19 PM
    Hi, I was curious if anyone here has experience with recommendation engines and what a normal amount of memory consumption is? We've built a recommendation engine that provides recommendations, but it's consuming a massive amount of memory, most likely due to reading relevant data into memory via Pandas (read_csv). We're going to take another look and try to refactor this, but I wanted to know if this was normal and if I should consider scaling the machine instead. Currently we're maxing out a 64 GB RAM machine 😞
    d
    g
    +3
    • 6
    • 10
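One common way to cut memory use in a case like the one above is to stream the file and keep only a small running aggregate, rather than loading everything at once. The sketch below uses only the standard-library `csv` module as a stand-in; the column names and the `count_by_user` helper are hypothetical. pandas' `read_csv` offers `usecols`, `dtype`, and `chunksize` parameters aimed at the same problem.

```python
import csv
import io
from collections import Counter

def count_by_user(lines):
    """Stream CSV rows one at a time, keeping only a running count per user."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row["user_id"]] += 1
    return counts

# Hypothetical interaction data; in practice `lines` would be an open file handle.
sample = io.StringIO("user_id,item_id\nu1,i1\nu2,i1\nu1,i2\n")
counts = count_by_user(sample)
```

Whether this helps depends on whether the recommendation logic really needs all rows in memory at once; if it does, narrowing columns and dtypes is usually the first thing to try.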
  • u

    user

    07/12/2022, 11:20 PM
    We've been having a spirited debate about what we should call the role between Admin and Viewer, currently called Editor in Dagster. Which do you like best?
    d
    • 1
    • 1
  • d

    Dagster Jarred

    07/12/2022, 11:22 PM
    hey friends, looking for input from the community on this topic above, if you’re interested in helping us name our roles more effectively 🙂
  • g

    George Pearse

    07/15/2022, 7:44 AM
    People keep saying things to this effect, I'm assuming Dagster is in the same camp as Prefect, but I have no idea what this difference in design choice is, and feel like I really should. Edit: I don't have much experience with Airflow. I'm assuming this is an architectural decision (how work is divided on k8s?) and not just ergonomics.
    s
    • 2
    • 2
  • m

    Mark Fickett

    07/20/2022, 5:14 PM
    Does anyone happen to have a Pulumi snippet for spinning up AWS EKS + Dagster agent (+ fluentbit) for Dagster Cloud?
  • p

    Pete Fein

    07/25/2022, 5:04 PM
    I’m teaching a free “Learn All Of Data Engineering In 3 Hours” workshop on Thursday 7/28 at 4:30 ET as part of the Operational Analytics Summer Community Days conference https://www.operationalanalytics.club/summer-community-days
    👀 1
  • a

    Akshay Verma

    07/25/2022, 8:45 PM
    In the docs, for deployment, there are separate sections for "Deploying Dagster to AWS" and "Deploying Dagster to GCP". The introductions to these sections mention storing runs and events and IO manager capabilities using the cloud provider. Is there something similar for Azure as well?
  • j

    Jonny Mills

    07/27/2022, 6:49 AM
    [RESOLVED] Our team has been debating whether using f-string formatting is a best practice or not when logging, specifically in our repository that uses Dagster. There does not seem to be a clear consensus in the Python community, and the official Python docs for Python 3.9 (which is what we use) https://docs.python.org/3.9/library/logging.html# don’t explicitly say either way. This Stack Overflow post says f-strings are less optimized, but other people still use them: https://stackoverflow.com/questions/54367975/python-3-7-logging-f-strings-vs. I saw the Dagster codebase itself using f-string formatting too. I’d like to switch over to f-string formatting as well, but want to be able to back up why it won’t cause any issues. Any thoughts?
    n
    • 2
    • 1
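On the f-string logging question above, here is a minimal sketch of the difference usually under discussion: with %-style arguments the `logging` module defers string formatting until a record is actually going to be emitted, whereas an f-string is formatted before the logging call is even made. The `Expensive` class and the logger name are hypothetical, used only to count how often formatting happens.

```python
import logging

class Expensive:
    """Hypothetical object that counts how often it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive-value"

logger = logging.getLogger("fstring_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)               # DEBUG records are filtered out
logger.addHandler(logging.NullHandler())
logger.propagate = False

lazy = Expensive()
eager = Expensive()

logger.debug("value=%s", lazy)   # filtered by level, so never formatted
logger.debug(f"value={eager}")   # formatted before the call, then discarded
```

After this runs, `lazy.renders` is 0 and `eager.renders` is 1. The cost only matters when the message is filtered out or the value is expensive to render; for cheap values at enabled levels the two forms do the same work.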
  • c

    Charlie Bini

    07/27/2022, 5:02 PM
    ayyyyy thanks for the shoutout @schrockn https://podcasts.google.com/feed/aHR0cHM6Ly93d3cuZGF0YWVuZ2luZWVyaW5ncG9kY2FzdC5jb20vZ[…]l=en&ved=2ahUKEwjoj_2qxJn5AhXUM1kFHV6CD7MQieUEegQIAhAI&ep=6
    :clapping-all: 2
    :partydagster: 3
    👍 2
    z
    • 2
    • 2
  • f

    Fraser Marlow

    07/29/2022, 8:22 PM
    Ahhhhhh… this was nice https://twitter.com/stkbailey/status/1553086006055866369?s=20&t=zwW_KgHTMcssY7G-8x_82w
    :big-dag-eyes: 3
    :dagster-evolution: 2
  • s

    Saul Burgos

    08/01/2022, 9:55 PM
    Hi, besides the official documentation website, where can I learn more about Dagster (good practices, etc.)? A quick search on Google does not give me much info
    z
    y
    • 3
    • 5
  • s

    Sterling Paramore

    08/01/2022, 11:49 PM
    Important message from the future:
    😂 7
    :next-level-daggy: 1
    :rainbow-daggy: 1
    f
    • 2
    • 1
Fraser Marlow

08/02/2022, 4:24 PM
The rip in the space-time continuum has been repaired. Thanks for the heads-up.