Reposting this feedback from this thread as per ...
# dagster-feedback
c
Reposting this feedback from this thread as per daniel's request. Basically, I'm having a really hard time learning through the docs how to use several Python environments in order to isolate Python dependencies among my organization's teams.

I feel like I'm piecing together random bits from various parts of the docs regarding workspaces. For example, location_name and executable_path are only mentioned on this page (the Workspace Files docs page) once I click the tab you linked for "Loading multiple repositories," which I initially did NOT click because my thought was "well, this isn't relevant to me, since I don't plan to use repositories: the docs have been telling me to use the more recent Definitions concept." [note: daniel just opened a PR changing this language]

My point is I'm having a really hard time figuring out how to do something that Dagster claims it is good at: isolating dependencies across projects/teams. So:

1. Do you know of any place that has more verbose docs on workspaces and all the possible fields I can specify? Or any really good examples I can parse through?
2. If I may suggest: it feels like this is a blind spot in the docs. Surely I'm not the first person who was using Airflow, had incredible headaches with dependencies, saw that Dagster should resolve them, and then struggled to find out how to do so in the docs. I might even argue that this topic (how to set up Dagster to handle many different dependency requirements) deserves its own docs page. For example: I assume I also need to define those venvs in my build somehow? Such a thing plus an example feels worthy of docs.
🙏 2
➕ 4
🤖 1
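For readers with the same question, a minimal sketch of a workspace.yaml that gives each team its own code location and its own Python interpreter might look like the following. The team names, file paths, and virtualenv locations are hypothetical; location_name and executable_path are the fields mentioned above, and each virtualenv is assumed to already exist with dagster and that team's dependencies installed.

```yaml
# workspace.yaml (sketch; all names and paths below are hypothetical)
load_from:
  # Team A's definitions, loaded with Team A's interpreter
  - python_file:
      relative_path: team_a/definitions.py
      location_name: team_a
      executable_path: venvs/team_a/bin/python
  # Team B's definitions, loaded with a separate virtualenv
  - python_file:
      relative_path: team_b/definitions.py
      location_name: team_b
      executable_path: venvs/team_b/bin/python
```

Each entry is loaded as a separate code location with its own interpreter, so the two teams could pin conflicting versions of the same package.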
c
FWIW, I found the structure of the quickstart_snowflake example project most helpful. I also found the docs disjointed, but managed to get things working by browsing through all of the examples. The docs definitely don't consistently reflect the recent suggestion to not use repositories.
c
thank you, I will take a look at the snowflake (and other!) examples rather than focusing so much on the docs
s
@Conor Ryan thanks for the feedback, we'll work to improve the docs here and I agree with all your pain points. In the meantime, here is an example that might help: https://github.com/slopp/dagteam
m
We were looking for a way to have teams deploy from separate git repos and ran into the same gaps. I haven't been able to test this yet, but this is what I put together from various examples:
# download helm and create local values.yaml
kubectl config set-context dagster --namespace dagster --cluster docker-desktop --user=docker-desktop
kubectl config use-context dagster
helm repo add dagster https://dagster-io.github.io/helm

helm show values dagster/dagster > dagster_values.yaml
vim dagster_values.yaml
# modify the dagster config here
# see additions/modifications in separate codeblock below
# ... 

helm show values dagster/dagster-user-deployments > user_deployments_values.yaml
vim user_deployments_values.yaml
# modify the user deployments config here
# see additions/modifications in separate codeblock below
# ... 

# create workspace configmap
vim workspace.yaml
# modify user workspace.yaml here
# see additions/modifications in separate codeblock below
# ...

# apply configmap
kubectl apply -f workspace.yaml

# deploy dagster and dagster-user-deployments
helm --namespace=dagster upgrade --install dagster dagster/dagster --values=dagster_values.yaml
helm --namespace=dagster upgrade --install user-code dagster/dagster-user-deployments -f user_deployments_values.yaml
Below are the modifications for the config files mentioned above:
# in dagster_values.yaml

dagster-user-deployments:
  enableSubchart: false
# ...

dagit:
  workspace:
    enabled: true
    servers: []
    externalConfigmap: "dagster-workspace"
# in user_deployments_values.yaml

# add a new deployment for each repo location you want to deploy to
deployments:
  - name: "data-engineering-etls"
    image:
      repository: "github.com/data_engineering/etls"
      tag: latest
      pullPolicy: Always
    dagsterApiGrpcArgs:
      - "-f"
      - "/data_engineering/etls/repo.py"
    port: 3030
    resources:
      requests:
        cpu: "250m"
        memory: "500Mi"
      limits:
        cpu: "250m"
        memory: "500Mi"
# in workspace.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: dagster-workspace
  namespace: dagster
data:
  workspace.yaml: |
    load_from:
      - grpc_server:
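          # host is assumed to be the Service created for the matching deployment name in user_deployments_values.yaml
          location_name: "data-engineering-etls"
          host: "data-engineering-etls"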
          port: 3030
I know there are probably some issues I haven't worked out, but would this generally work to have a single instance of Dagster with multiple projects/contributing teams as long as the teams deploy their code to the image repo locations defined in user_deployments_values.yaml? A key feature being that you would only have to update the dagster instance when you added a new repo location.
👍 1
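To illustrate the point about only touching the Dagster instance when a location is added, here is a sketch of how the workspace ConfigMap might look once a second team is onboarded. The second entry ("analytics-reports") is hypothetical, and the sketch assumes each host value resolves to the Kubernetes Service for the deployment of the same name in user_deployments_values.yaml.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dagster-workspace
  namespace: dagster
data:
  workspace.yaml: |
    load_from:
      # location for the data-engineering-etls deployment
      - grpc_server:
          location_name: "data-engineering-etls"
          host: "data-engineering-etls"
          port: 3030
      # hypothetical second team, added later
      - grpc_server:
          location_name: "analytics-reports"
          host: "analytics-reports"
          port: 3030
```

Under this layout, teams push new image tags to their own repositories, and the shared configuration changes only when an entry is added here and in user_deployments_values.yaml.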
e
Thanks for this very detailed feedback @Conor Ryan - I definitely agree that this is a blind spot in the docs. This isn't the first time this topic has come up in the past few weeks, and it is absolutely worthy of attention. I'm adding this to my short list of things to discuss with my team for upcoming docs work.
❤️ 2
c
Very helpful - thanks all for your attention & feedback
g
@Mandi Alexander yes, this will work. This is what we are doing in our org but with plain k8s configs, not using dagster helm charts.
🙏 1
There are lots of blind spots in the docs and it is really a big pain. Sometimes you learn Dagster from its source code, not the docs 😄 But I am very thankful to the core devs and the community, who answer our questions extremely fast. I've never seen a community offer support that fast and effective. I also encourage everybody to first search Slack for your question's keywords, because Slack's history is already a parallel set of docs. I assume the docs are inconsistent and insufficient because of Dagster's high development pace, and this is kind of a good sign 😄 Long live Dagster and free software! :D
💯 1
@erin with your permission: the repository/Definitions topic caused confusion in our team, and I'd like to bring your attention to this discussion. From what I've seen, it is not reflected in the official docs. https://github.com/dagster-io/dagster/discussions/10772
e
@Grigorii Kushnir Could you elaborate a bit more on this: "not reflected in the official docs"? Are there examples that weren't updated, or concepts not explained well, etc.?
g
@erin For example, in our org we had a project that used repositories; when we started a new one, we discovered there are Definitions as well. Our questions were: Should we migrate the old project to Definitions? What are the benefits of using Definitions and, respectively, the downsides of using repositories? What is the motivation behind the transition from repositories to Definitions? This is covered in the GitHub discussion, but not in the docs. The docs (https://docs.dagster.io/_apidocs/definitions#dagster.Definitions and https://docs.dagster.io/concepts/repositories-workspaces/repositories) only say that there has been a transition and that repositories will continue to work. Will repositories keep getting the same features as Definitions, or will Definitions progress while repositories stay where they are?
The github discussion actually answered our questions.
e
Thanks for the additional detail! Out of curiosity, did you and your team happen to see this documentation? https://docs.dagster.io/concepts/code-locations I agree there are gaps in the content, especially around the future of repositories and incremental migration to Definitions
g
@erin yes, sure, almost all docs have been read. We're making use of code locations and actually implemented the suggested architecture: https://docs.dagster.io/deployment/overview
Thank you for your great work!
daggy love 1
t
Tip that we should probably be more vocal about: You can search the entire Slack history (more than the current 90+ days) via https://discuss.dagster.io/
👍 1
thank you box 1