k

Kevin Martin

09/07/2022, 12:21 PM
Hey all! I set up a serverless deployment, but ran into issues where it seemed like the Github connection was not finalizing, and whenever I was routed back to the deployment setup screen, the button to set up Github was still in place. When I looked a couple days later, it was there, but I couldn’t set up the sample project - I think either because I didn’t give permission beyond one (empty) repository, or because the account I connected with wasn’t an organization owner. Do either of those make sense? I did get it working using a completely different Github account, but is there a way to back that out and reconfigure the deployment with the initial account the right way? (I have another question, but I’ll post that separately.)
p

Pete Hunt

09/07/2022, 12:24 PM
hmm that’s odd
you should be able to reset your state by uninstalling the app from your GH account
k

Kevin Martin

09/07/2022, 12:28 PM
Thanks, Pete. Should that have worked even if I’m not a Github account owner (so long as I have permissions to create new repos)? I don’t know why the lag, but it may be that my issue in setting up the example code into a new repo was just because I had only given permission on one existing repo. I’d thought we’d be able to deploy into that, but I guess I was wrong.
p

Pete Hunt

09/07/2022, 12:30 PM
so you installed the app to an existing empty repo, and then you picked that repo as part of the setup process?
k

Kevin Martin

09/07/2022, 12:31 PM
Right. And it didn’t like that it was an existing repo, and I didn’t have permissions at that point to create a new one, since it was only authorized for the one.
p

Pete Hunt

09/07/2022, 12:36 PM
i wonder if you’re hitting a bug. that flow was designed for people bringing existing dagster projects and i wonder if there is a bug when dealing with empty repos. @ben is on the us west coast and will wake up in a few hours and can take a look
k

Kevin Martin

09/07/2022, 12:38 PM
Yeah. I get that now. I’m pretty new to this, so still working to understand the flows. Not sure that it’s a bug - I just did it wrong. 😀
p

Pete Hunt

09/07/2022, 12:39 PM
here are two ideas i have for making progress:
1. you can grant access to all repos. i think most people just do this.
2. if my suspicion about the empty repo is correct then you can try manually scaffolding it with dagster project from-example
https://docs.dagster.io/dagster-cloud/deployment/serverless#without-github-gitlab-bitbucket-or-local-development
i think it’s a bug if it’s confusing. there are a lot of states users can be in during the GH auth process. we tried to test every one but some fail more elegantly than others (for reasons outside of our control sometimes)
k

Kevin Martin

09/07/2022, 12:41 PM
OK. I have it working against another Github account right now, but will try one of those things if we do disconnect the installation and start over with the original Github account, which may well happen. Well, if I’m going to break it, I’m glad it’s at least interesting.
p

Pete Hunt

09/07/2022, 1:07 PM
oh btw, github had some intermittent downtime yesterday and today. there’s a strong chance that what you described could be caused by that
k

Kevin Martin

09/07/2022, 2:50 PM
OK. That might be the case.
I did revoke the access and uninstall the app in Github. How do I set it up again now? I’m getting GraphQL errors on the cloud site, but it looks like it’s still showing the first couple of set-up steps (including Github integration) as complete.
b

ben

09/07/2022, 2:56 PM
Hi Kevin, I can go ahead and reset this step on our end and see if that helps. What’s your organization name?
k

Kevin Martin

09/07/2022, 2:56 PM
TheGuarantors
b

ben

09/07/2022, 2:57 PM
Alright, I’ve reset the step, it should now show as incomplete and hopefully those errors won’t show up. You can try reconnecting to the same org. In the meantime I can do some digging on our end
k

Kevin Martin

09/08/2022, 12:43 PM
Thanks, @ben. I’m working with my engineering team on some appropriate account and credential set-up on our side before we try this again. Are you able to give us a clean slate in the cloud - remove our current workspace and any existing Github connections?
b

ben

09/08/2022, 8:01 PM
Yes, you should be reset back to the “Deploy your code” step now
k

Kevin Martin

09/09/2022, 11:49 AM
Thank you, @ben.
I went through the connection process to Github this morning, but afterwards, back on the set-up page, I still see the “Choose account or organization” button. If I click it again, I see the org I set up, with an option to “Cancel Request”. Is there a workflow in Github or internally that this might be waiting on?
Well. Now the next step in deployment is available - to clone the starter project. But when I specify a repo name and click on “Clone and Deploy”, I get an “Internal Server Error” message. Also, trying to use “Configure Installation” on my Github connection gets me a 404 error.
I was able to get to a page in a round-about way at: https://github.com/apps/dagster-cloud/installations/29145624. But that shows only permissions on selected repositories (with none selected). I selected all repositories in the setup, and the link that is being pointed to in the UI (where I get the 404) is different than above: https://github.com/apps/dagster-cloud/installations/28843402.
@Pete Hunt or @ben Should I be opening a new thread for this? Was trying to keep everything together. Upshot is we have an install now with “All repository” permissions, but the ID we see in our URL is 29145624. The one in the “Configure” URL on the Dagster cloud site had 28843402.
The URL from the cloud site doesn’t exist in our GH org. I don’t know how that got crossed up.
This seems to be causing the issue that we can’t create the starter project in GH.
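For anyone comparing the two links by hand, here is a minimal Python sketch of the check. The helper name installation_id is made up for illustration, and the two URLs are the ones quoted above. It only compares the trailing IDs and says nothing about which side is correct.

```python
from urllib.parse import urlparse


def installation_id(url: str) -> str:
    """Return the trailing ID from a GitHub App installation URL (hypothetical helper)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path shape: apps/<app-slug>/installations/<id>
    if len(parts) != 4 or parts[0] != "apps" or parts[2] != "installations":
        raise ValueError(f"not an installation URL: {url}")
    return parts[3]


# The link reachable from the GitHub org vs. the link shown in the Dagster Cloud UI.
github_side = "https://github.com/apps/dagster-cloud/installations/29145624"
dagster_side = "https://github.com/apps/dagster-cloud/installations/28843402"

ids_match = installation_id(github_side) == installation_id(dagster_side)
print(ids_match)
```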
b

ben

09/12/2022, 3:14 PM
It seems very odd that there are two installation IDs. I can go ahead and clear the installation associated with the organization to see if that helps, which should let you reselect a GitHub org.
k

Kevin Martin

09/12/2022, 3:16 PM
We can try that, but I think that’s what you did previously, though. It’s worth a shot.
b

ben

09/12/2022, 3:16 PM
Let me know what you see in the selection flow and if you end up getting a different installation ID
k

Kevin Martin

09/12/2022, 3:18 PM
It looks like IDs are aligned now. However, I am still getting an “Internal Server Error” when I try to create the starter repo.
b

ben

09/12/2022, 3:18 PM
Let me track down that error on our side
k

Kevin Martin

09/12/2022, 3:18 PM
Internal Server Error (Trace ID: 8537424951930906481)
b

ben

09/12/2022, 3:23 PM
It looks like we’re getting a 422, a validation error, back from the GH API. It could be that the repo name matches an existing repo, or is perhaps invalid?
k

Kevin Martin

09/12/2022, 3:25 PM
I just had “dagster_cloud”. Just tried it again without the underscore. Neither exists yet. That doesn’t look like the problem.
On the GH side, I see the app, and I see it marked for “All repositories.” Is there anything else I should see out there?
b

ben

09/12/2022, 3:26 PM
That should be all you need. Let me see if I can replicate the issue on my end
It looks like this might be a permissions issue; something we probably need to patch on our end. Do you happen to know if your user account has permissions to create a repo in the org?
For context; some GH API interactions can only be accomplished when tied to a user, while others occur using the GH app installation without a user attached to the action. I think what may be happening is that the repository creation is being tied to your user account which may not have perms to create a repo in the org. We should be performing the action without a user affiliated instead.
k

Kevin Martin

09/12/2022, 3:48 PM
We just created a separate service account for this, but it should have repo-creation permissions. I can check to be sure.
Yes, I was just able to create a repo using that account.
Could two-factor auth be causing an issue? I do have an access token associated with that account if there’s a way to set that up in the cloud.
b

ben

09/12/2022, 3:55 PM
I don’t think that should be an issue since we’re using a token specifically for API use. What I’m seeing on our end is
github3.exceptions.ForbiddenError: 403 You need admin access to the organization before adding a repository to it
for the dagster-tg account trying to create a repo in the TheGuarantors organization
k

Kevin Martin

09/12/2022, 3:56 PM
OK. Can you reset us again? I want to try this from scratch.
b

ben

09/12/2022, 3:58 PM
Yes; you should be reset now
k

Kevin Martin

09/12/2022, 4:13 PM
I relinked and see the same thing. Do you know specific permissions that the account should have in GH?
b

ben

09/12/2022, 4:32 PM
I believe it’s just repository creation in the organization that should be needed to get past that step
but it sounds like the service user does have that permission, at least when interacting through the UI?
k

Kevin Martin

09/12/2022, 4:33 PM
Yes, that’s true.
b

ben

09/12/2022, 4:49 PM
This is tricky. I’m going to try to set up a test org and mirror this to see if I can replicate it myself. Under the hood, the UI will set up a clone of the following repository and configure it for your organization. You can bypass that step by doing so manually in the meantime if you’d prefer (https://github.com/dagster-io/dagster-cloud-serverless-quickstart/): just fork the repository & set up the requisite secrets for the GitHub Actions.
Very sorry you’re running into this, it’s a strange issue & one we haven’t seen before.
k

Kevin Martin

09/12/2022, 4:50 PM
It’s OK. Just hoping we can understand and fix it... and keep it from being an issue for others.
Hmmm…I just tried to fork that repo over to our account, and I got an error message:
You cannot fork this repository to the selected destination due to a policy.
Could this be what’s interfering with your code, too?
b

ben

09/12/2022, 5:00 PM
Aha, that sounds promising
k

Kevin Martin

09/12/2022, 5:00 PM
Is the code essentially doing a fork as well? Or creating a repo then putting the files into it?
b

ben

09/12/2022, 5:03 PM
No; it creates a repo and then pushes the files
but the code that’s failing is the repo creation itself; not the subsequent push it looks like
still, I wonder if it’s the same policy that’s restricting automation creating the repo
k

Kevin Martin

09/12/2022, 5:04 PM
That’s possible. I’ll run this by our engineering team.
Hey @ben, are there any needs for more elevated permissions than normal repo-interaction activities (pulls, pushes, PRs), once we get past the initial repo creation? I’m thinking of seeing if I can get extra permissions added for this first step on our side, then tamp them back down. If that doesn’t work, I can try to clone the repo instead.
b

ben

09/13/2022, 2:48 PM
Hey @Kevin Martin, the user will also require perms to write GitHub Actions secrets & to modify GH Actions Workflow files on the repo.
k

Kevin Martin

09/13/2022, 2:51 PM
We just gave owner in the org, and still have the same error.
So it pretty much has all permissions on our side.
b

ben

09/13/2022, 3:04 PM
Hmm, and you are unable to use the template/fork still?
k

Kevin Martin

09/13/2022, 3:04 PM
So. Here’s another step we did. The connection I set up in the site to our Github was done using an account created for this purpose - service-dagster@theguarantors.com. Even with full privileges on our side, that wasn’t working. We added that account to the cloud site as an org admin and logged in with that - and the connection did not show up.
Just the “Connect to Github” button.
Any org admin should see the same things, right? So it’s odd that I see the connection that was set up and that account doesn’t.
b

ben

09/13/2022, 3:09 PM
The view there is tied to the user who has connected to GitHub, since some of the API requests need to be tied to a user, so there are some cases where there will be a discrepancy between two admin users
k

Kevin Martin

09/13/2022, 3:10 PM
Is it a problem then that the connection was set up with a different account than the one I’m logged into Dagster with?
If it’s the account in the connection that matters, then everyone should see the same thing…
b

ben

09/13/2022, 3:11 PM
As long as your Dagster user is connected with the proper GitHub account once you’ve clicked “Connect to Github” that shouldn’t be the problem
I’m curious if the manual “use this template” flow will work now that the user has elevated permissions, or if there is still an error there
k

Kevin Martin

09/13/2022, 3:13 PM
My engineering team is asking if it’s possible to set up a call at this point and troubleshoot.
We could potentially get it working manually, but there’s enough strangeness going on here to not be confident about other potential issues going forward. It’d be great to get to the bottom of this.
b

ben

09/13/2022, 3:16 PM
Sure, when works best for folks on your end?
I have availability after 2 PM PST today, 10 AM - 1 PM PST tomorrow, 8-9 AM or after 9:30 AM PST Thursday.
k

Kevin Martin

09/13/2022, 3:33 PM
Thanks. Let me run those times by them.
Is 12:00 PST, 3:00 EST tomorrow OK? You can send me an invite at kevin.martin@theguarantors.com.
b

ben

09/13/2022, 4:04 PM
Yes; I’ve gone ahead and sent an invitation
k

Kevin Martin

09/13/2022, 5:10 PM
Thanks, @ben.
Hey @ben, thanks for helping us to get up and running. Now I move on to struggling through how everything connects together. The first thing I’m having issues with is how to pass secrets. I have a test (fake) secret defined in Github, and I’m trying to extend the cereals example code to get that secret value into an environment variable and use it within one of the sample asset methods just to write it out to a context log. I don’t see where I put a run_config - all the examples I find so far in the docs pass the run config as part of the execute_in_process() method to run inline. But the example code loads the assets into a repository using load_assets_from_package_module() and the actual run is invoked online. I think I understand the part of adding config_schema to the asset decorator, but where does the code for the run_config go that passes in the config value? Or am I thinking about this completely wrong?
Thanks. What about getting that value from an environment variable that was passed in from GH secrets?
Where do I register the secrets in the code? I added a “SUPER_SECRET” key in Github, added it to the env: sections of both the deployment and branch_deployment Github workflow config files, and added the configuration in the Materialize UI as you pointed out. But I’m getting an error saying that there is no such environment variable set.
ops:
  nabisco_cereals:
    config:
      super_secret:
        env: SUPER_SECRET
dagster._core.errors.DagsterInvalidConfigError: Error in config for job
Error 1: Post processing at path root:ops:nabisco_cereals:config:super_secret of original value {'env': 'SUPER_SECRET'} failed:
dagster._config.errors.PostProcessingError: You have attempted to fetch the environment variable "SUPER_SECRET" which is not set. In order for this execution to succeed it must be set in this environment.
Stack Trace:
File "/usr/local/lib/python3.8/site-packages/dagster/_config/post_process.py", line 79, in _post_process
new_value = context.config_type.post_process(config_value)
File "/usr/local/lib/python3.8/site-packages/dagster/_config/source.py", line 42, in post_process
return str(_ensure_env_variable(cfg))
File "/usr/local/lib/python3.8/site-packages/dagster/_config/source.py", line 16, in _ensure_env_variable
raise PostProcessingError(
I think this may be because the changes to the deployment workflows I made were in the branch. I assume the workflow files actually pulled for the GH actions are the ones in main. Trying out some changes now.
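To make the error above concrete, here is a plain-Python sketch of what an {'env': 'SUPER_SECRET'} config entry appears to ask for: a lookup in the environment of the process that executes the run. This is not Dagster's code. The function resolve_config_value and the exception MissingEnvVarError are hypothetical stand-ins that mimic the behavior suggested by the traceback.

```python
import os


class MissingEnvVarError(Exception):
    """Hypothetical stand-in for the PostProcessingError seen in the traceback."""


def resolve_config_value(value, environ=os.environ):
    """Resolve a config entry; {"env": NAME} is read from the given environment."""
    if isinstance(value, dict) and set(value) == {"env"}:
        name = value["env"]
        if name not in environ:
            # Same failure mode as the error above: the variable is not set
            # in the environment where the lookup happens.
            raise MissingEnvVarError(f'environment variable "{name}" is not set')
        return str(environ[name])
    # Literal values pass through unchanged.
    return value


run_config = {
    "ops": {"nabisco_cereals": {"config": {"super_secret": {"env": "SUPER_SECRET"}}}}
}
entry = run_config["ops"]["nabisco_cereals"]["config"]["super_secret"]

# With the variable present, the lookup succeeds.
print(resolve_config_value(entry, {"SUPER_SECRET": "fake-value"}))
```

The point of the sketch is that the value has to exist wherever the run executes. Setting it only on the machine that builds or deploys the code would still leave the lookup empty.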
b

ben

09/20/2022, 6:59 PM
Can you share the snippet from the workflow file? My guess is somehow it’s not getting passed to the build step
k

Kevin Martin

09/20/2022, 7:04 PM
Here’s the branch_deployments one as I have it right now:
name: Serverless Branch Deployments
on:
  pull_request:
    types: [opened, synchronize, reopened, closed]
concurrency:
  # Cancel in-progress runs on same branch
  group: ${{ github.ref }}
  cancel-in-progress: true

env:
  DAGSTER_CLOUD_URL: "http://dagster.cloud/theguarantors"
  #SUPER_SECRET: ${{ secrets.SUPER_SECRET }}

jobs:
  parse_workspace:
    runs-on: ubuntu-latest
    outputs:
      build_info: ${{ steps.parse-workspace.outputs.build_info }}
    steps:
      - uses: actions/checkout@v3
      - name: Parse cloud workspace
        id: parse-workspace
        uses: dagster-io/dagster-cloud-action/actions/utils/parse_workspace@v0.1
        with:
          dagster_cloud_file: dagster_cloud.yaml

  dagster_cloud_build_push:
    runs-on: ubuntu-latest
    needs: parse_workspace
    name: Dagster Serverless Deploy
    strategy:
      fail-fast: false
      matrix:
        location: ${{ fromJSON(needs.parse_workspace.outputs.build_info) }}
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          ref: ${{ github.head_ref }}
      - name: Build and deploy to Dagster Cloud serverless
        uses: dagster-io/dagster-cloud-action/actions/serverless_branch_deploy@v0.1
        with:
          dagster_cloud_api_token: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
          location: ${{ toJson(matrix.location) }}
          # Uncomment to pass through Github Action secrets as a JSON string of key-value pairs
          # env_vars: ${{ toJson(secrets) }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SUPER_SECRET: ${{ secrets.SUPER_SECRET }}
And here’s the main deploy script:
name: Serverless Prod Deployment
on:
  push:
    branches:
      - "main"
      - "master"
concurrency:
  # Cancel in-progress deploys to main branch
  group: ${{ github.ref }}
  cancel-in-progress: true
env:
  DAGSTER_CLOUD_URL: "http://dagster.cloud/theguarantors"
  DAGSTER_CLOUD_API_TOKEN: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
#  SUPER_SECRET: ${{ secrets.SUPER_SECRET }}

jobs:
  parse_workspace:
    runs-on: ubuntu-latest
    outputs:
      build_info: ${{ steps.parse-workspace.outputs.build_info }}
    steps:
      - uses: actions/checkout@v3
      - name: Parse cloud workspace
        id: parse-workspace
        uses: dagster-io/dagster-cloud-action/actions/utils/parse_workspace@v0.1
        with:
          dagster_cloud_file: dagster_cloud.yaml

  dagster_cloud_build_push:
    runs-on: ubuntu-latest
    needs: parse_workspace
    name: Dagster Serverless Deploy
    strategy:
      fail-fast: false
      matrix:
        location: ${{ fromJSON(needs.parse_workspace.outputs.build_info) }}
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          ref: ${{ github.head_ref }}
      - name: Build and deploy to Dagster Cloud serverless
        uses: dagster-io/dagster-cloud-action/actions/serverless_prod_deploy@v0.1
        with:
          dagster_cloud_api_token: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
          location: ${{ toJson(matrix.location) }}
          # Uncomment to pass through Github Action secrets as a JSON string of key-value pairs
          # env_vars: ${{ toJson(secrets) }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          #  SUPER_SECRET: ${{ secrets.SUPER_SECRET }}
b

ben

09/20/2022, 7:04 PM
ah, yeah, you’ll need to set up the following
# Uncomment to pass through Github Action secrets as a JSON string of key-value pairs
# env_vars: ${{ toJson(secrets) }}
(you can also manually specify k/v pairs there instead of all secrets)
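As a rough illustration of what "a JSON string of key-value pairs" limited to chosen secrets could look like, here is a small Python sketch. The helper build_env_vars and the sample values are hypothetical. The thread only confirms that env_vars takes a JSON string of key-value pairs, so any further detail about the accepted format is an assumption.

```python
import json


def build_env_vars(secrets: dict, wanted: list) -> str:
    """Serialize only the named secrets as a JSON object string (hypothetical helper)."""
    missing = [key for key in wanted if key not in secrets]
    if missing:
        raise KeyError(f"missing secrets: {missing}")
    return json.dumps({key: secrets[key] for key in wanted})


# Placeholder values standing in for the repository's secrets.
all_secrets = {"SUPER_SECRET": "fake-value", "DAGSTER_CLOUD_API_TOKEN": "placeholder"}

# Pass through just the one secret the asset needs instead of everything.
env_vars = build_env_vars(all_secrets, ["SUPER_SECRET"])
print(env_vars)
```

Listing keys explicitly keeps unrelated secrets, such as the API token, out of the deployed code's environment.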
k

Kevin Martin

09/20/2022, 7:15 PM
Thanks. I’ll try that.
OK. That works now. Thanks!
b

ben

09/20/2022, 8:13 PM
Great! let us know if you run into anything else