# dagster-feedback
m
The Dagster tutorials feel rather...incomplete and rushed. They tell you how to define assets and jobs, but end there. There's no explanation of what a repository is, what a Dagster project is, or what a resource is. There's also not really a high-level tutorial that takes you through what a Dagster project is at a high level or how these things fit together. There's the 'fully-featured project' guide, but it's not detailed enough and rushes through things too much (in my opinion). Sure, everything is described in extensive detail on the Concepts pages, but that's the problem: there's too much detail on the Concepts pages. Not to mention that the first concept is "Software-Defined Assets" (which the tutorial did cover) but that page then references all sorts of other concepts (repositories, resources, configs) that I never learned about in the tutorial, so how am I supposed to make sense of any of it? What is the intended next step for a new user after they finish the tutorials section? Jump to the concepts page, where you're bombarded with other concepts you have no understanding of? Go to the fully featured project and try and cobble together how the pieces fit together by looking at the code? See my comment in the thread for suggestions on what I think might be a better set up.
👍 2
As low-hanging fruit, just having an obvious answer to the above question would be great. It'd be helpful to have callouts to other resources at the end of the tutorials section. "Now that you know assets and ops and jobs, this is what you need to know next." (The assets tutorial does this, but the ops and jobs one does not. Additionally, the conclusion of the asset section doesn't even talk about repositories or projects, which feels like a missed opportunity.)
In general, though, I'd suggest significantly expanding the tutorials section. The content that's there is great, but as I mentioned above, it just feels like a disjointed island. Once I know about assets and ops and jobs, it'd be good to have more tutorials that introduce me to other related topics. I've listed them below, alongside a summary of what the learning outcome of that section would be:
• graphs -- now that you know about assets and jobs, here's how to think about interconnecting them (the assets page does cover this somewhat, so maybe it's redundant)
• schedules and sensors -- how do you trigger and orchestrate these graphs/assets/ops/jobs that you've defined
• repositories -- now that you have a few assets and sensors, how can you group them together?
• projects -- what's the big picture view of how all these things come together
And maybe the 'fully-featured project' (or a simpler version of it) can be what the tutorial builds towards.
I'd recommend looking at Prefect's docs for inspiration. They have a separate Concepts section that does a deep dive into each topic, but for many of the foundational concepts, they also have a tutorial page that provides a more succinct introduction to that concept. That may feel redundant, but it's helpful to have a place where I can just get a high-level overview of a topic and maybe a couple of simple examples before learning about every nook and cranny.
I finished going through Prefect's tutorials before coming to Dagster (I'm comparing the two products to decide which one we should use) and I never felt confused when reading Prefect's tutorials or unsure about what I needed to do next. On the other hand, as soon as I finished going through the tutorials section in Dagster's docs, I was immediately confused about where to go from here. Most importantly, I felt like I could actually create a Prefect project on my own (because I was introduced to the relevant components with some simple examples of how they connect) after finishing their tutorial, but Dagster's tutorial doesn't even discuss the concept of a project so I did not have the same feeling here. (Getting Started does go through creating a project, but doesn't explain why I need a project or what it is, just how to quickly get a scaffolding.)
👍 2
q
As someone who's tried and followed Airflow/Prefect/Dagster over the last 3 years, I think you're looking at Dagster through the lens of Prefect and that's keeping you from seeing Dagster for what it is. I agree 100% the docs could use some improvement, but I don't think the tutorials section can give you everything about Dagster. Unlike Prefect, Dagster has so many moving parts. The docs are better now and they will get better. Prefect, on the other hand, is not like Dagster. I'd recommend you ask yourself what you want to do and then move from there. You'd have to pick and choose. Do you want to go with assets, or do you want to go with ops, jobs, and graphs? I'd recommend that, as a beginner, you start with ops, graphs, and jobs. Leave assets out, and when you are comfortable, introduce assets. This will help you understand the concepts. There are some who use ops and never use assets and they still get what they want from Dagster. When you want the added benefits of assets, you ask how you can convert your ops, graphs, etc. to assets. How do you define jobs and add schedules when you switch to jobs? This will make navigating the docs easier. But starting with assets and trying to bring in ops, etc. can be very confusing for a beginner, especially when they are trying to view things through the lens of another tool like Prefect. I think Prefect = ops, graphs, jobs, schedules, although Prefect approaches these differently.
m
Thank you for the reply! I appreciate your thoughts and I've more or less already decided to go with Dagster, largely because what my team and boss needed out of an orchestrator (when was this table updated, why is it out-of-date, and how can I fix that?) aligns quite well with Dagster's emphasis on assets vs. tasks/ops and flows/jobs. With that said, I don't think I'm confusing Prefect with Dagster. I understand their conceptual differences (or at least I think I do), and my criticisms here are not really about Dagster's data model/abstractions, but rather the difficulty of actually learning them as a first-time user who's never used Airflow or Prefect or Dagster before. The tl;dr of my comments above is that the learning path for a new user is not (in my opinion) very well defined and that the tutorials section should have more content, with summarised and simpler versions of the concepts pages for the most important components, because the concepts pages are too in-depth and off-handedly refer to other concepts that the new user just does not understand. (In this respect -- a simple tutorial page and an in-depth concepts page for the same topic/concept -- Prefect's docs are, in my opinion, much more approachable.) Because Dagster is admittedly a more complex product (more moving parts, as you said), it's all the more important that there is a well-defined learning path and that the user doesn't just feel so overwhelmed that they move on to another product with easier-to-understand docs.
👍 1
s
@Muhammad Jarir Kanji I agree with you on the challenges in the learning path, although I would not say it's due to being incomplete or rushed. My personal take is that learning the project -- especially as the chief implementer -- is like learning 3D underwater chess, although Dagster Cloud has made the actual startup cost significantly lower in the past 9 months. There are now more product-led opinions on how to handle secrets (environment variables), CI/CD (out-of-the-box actions + branch deployments), and the majority of pipelines (assets). There is even a `dagster create project` command, I think. However, the framework is still very open-ended in terms of how you can set up your repositories, deployments, etc., and there are tons of tutorials / pages / etc. I also agree that there needs to be a single entrypoint for new users. Personally, I'd recommend you start with assets, and only use ops/graphs for your more complicated use cases. I think they're simpler to reason about, more intuitive for other users in the org, and have some nice cataloging benefits. I did this example project a while ago that shows both how we set up most of our repos, and also how you might port over something like the dbt Jaffle Shop example project into the asset framework. Might be useful!
❤️ 1
Also, I agree with you specifically on the idea of a "project". Dagster has "repository", "package", "code location" and "deployment", all of which have some overlap semantically. I know that the team is actively working on simplifying this ontology to improve this experience, though.
❤️ 1
e
This is fantastic feedback - thank you, all. Along with working on simplifying some concepts (specifically 'repository', 'code location,' etc.) we've also been giving a lot of thought to flattening the learning curve for beginners. This includes some changes to improve the asset tutorial, which will make the example more real-world and comprehensive. It should also act as more of a jumping-off point to other concepts, hopefully reducing (or eliminating altogether!) the feeling of not knowing where to go next. (We're also assessing the ops/jobs tutorial, but just tackling assets first.) We hope to release the updated tutorial in the next few weeks. I (and the rest of the team) have also been looking at the Concept pages and thinking about how to improve them. Your feedback about them feeling overwhelming and a bit all over the place is well taken, and something I also thought when I first started learning Dagster. I believe the original thinking for the Concept pages was to put everything in one place to keep info together and make on-page search easy, but over time it may have gotten a bit unwieldy. Seems like it's a lot to get through and it's not always easy to connect the dots.
❤️ 1
p
also wanted to put in a plug for a blog post that attempts to address some of these issues: https://dagster.io/blog/dagster-crash-course-oct-2022 please let me know if you have any feedback on it!
❤️ 4
m
Hi @Stephen Bailey! That's a very cool project! I'll definitely check it out in more detail. And I definitely agree that starting off with assets is actually more amenable for beginners than ops and jobs for all the reasons you outlined above. It's also what the focus of this project seems to be in the future.
@erin Awesome! The more I'm learning about it, the more I'm liking the project, so it's good to know that you're also focusing on improving the new user experience alongside adding new features. I'll also add that I actually like the concept of a Concepts page (I appreciated it both here and in Prefect's docs) as a compendium of everything related to a given topic that's also more human-readable than just reading through docstrings and function arguments in the API reference. It's more that before I start to read the Concepts page and learn about all the nuts and bolts, there's a clear need for me to get a high-level lay of the land first.
@Pete Hunt I was actually just about to add a comment referring to that blog post (which I discovered over the weekend) as an example of what a more beginner-friendly layout and flow for a new user might be: building a fully-functional project that gradually introduces concepts like schedules, resources, and configs as needed.