
Mycchaka Kleinbort

05/17/2023, 3:36 PM
Hi all, I'd like to access the Dagster DAG from outside Dagster to do some analysis. Where can I find it? I imagine some file is created during the "reload code" action that stores which assets each asset depends on.

Tim Castillo

05/17/2023, 4:55 PM
It's stored in the database that Dagster uses. The cleanest interface for grabbing the data you're looking for is the GraphQL API that comes with every Dagster instance. Here's a rudimentary GraphQL query that returns every asset, its dependencies, and the assets that depend on it:
query GetLineage {
  assetNodes {
    assetKey {
      path
    }
    dependedBy {
      asset {
        assetKey {
          path
        }
      }
    }
    dependencies {
      asset {
        assetKey {
          path
        }
      }
    }
  }
}
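For analysis outside Dagster, the nested result of this query can be flattened into a plain edge list. The following is a minimal sketch, not an official helper: it assumes the response has the standard GraphQL shape implied by the query (`data.assetNodes`, each entry with `assetKey.path`, `dependencies`, and `dependedBy`). The function names and the sample asset names (`raw`, `clean`, `warehouse/report`) are hypothetical.

```python
from typing import Any, Dict, Set, Tuple


def asset_name(node: Dict[str, Any]) -> str:
    """Join the asset key path segments into one readable name."""
    return "/".join(node["assetKey"]["path"])


def lineage_edges(response: Dict[str, Any]) -> Set[Tuple[str, str]]:
    """Return (upstream, downstream) pairs from a GetLineage response."""
    edges: Set[Tuple[str, str]] = set()
    for node in response["data"]["assetNodes"]:
        name = asset_name(node)
        # Each entry in "dependencies" is an asset this node depends on.
        for dep in node.get("dependencies", []):
            edges.add((asset_name(dep["asset"]), name))
        # Each entry in "dependedBy" is an asset that depends on this node.
        for child in node.get("dependedBy", []):
            edges.add((name, asset_name(child["asset"])))
    return edges


# Hand-written sample in the assumed response shape (asset names are made up).
sample_response = {
    "data": {
        "assetNodes": [
            {
                "assetKey": {"path": ["raw"]},
                "dependedBy": [{"asset": {"assetKey": {"path": ["clean"]}}}],
                "dependencies": [],
            },
            {
                "assetKey": {"path": ["clean"]},
                "dependedBy": [
                    {"asset": {"assetKey": {"path": ["warehouse", "report"]}}}
                ],
                "dependencies": [{"asset": {"assetKey": {"path": ["raw"]}}}],
            },
            {
                "assetKey": {"path": ["warehouse", "report"]},
                "dependedBy": [],
                "dependencies": [{"asset": {"assetKey": {"path": ["clean"]}}}],
            },
        ]
    }
}

edges = lineage_edges(sample_response)
```

Because both `dependencies` and `dependedBy` describe the same relationships from opposite ends, collecting the pairs in a set keeps each edge once.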

Mycchaka Kleinbort

05/17/2023, 5:04 PM
Thank you. I'm new to the GraphQL API. Do you have a Python snippet to pull it?

Moody Edghaim

05/17/2023, 5:12 PM
The Dagster docs have a pretty good resource on the GraphQL API: https://docs.dagster.io/concepts/dagit/graphql
which should lead you to this

Tim Castillo

05/17/2023, 5:13 PM
Here's a gist I keep on how to query a GraphQL API in Python https://gist.github.com/tacastillo/c4a7a1b55028fcdda1cc07961bd40ba8

Mycchaka Kleinbort

05/17/2023, 5:16 PM
@Moody Edghaim, I got as far as:
from dagster_graphql import DagsterGraphQLClient
import dagster 
import warnings

warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning)
gql_client = DagsterGraphQLClient('localhost', port_number=3000)

q = '''
query GetLineage {
  assetNodes {
    assetKey {
      path
    }
    dependedBy {
      asset {
        assetKey {
          path
        }
      }
    }
    dependencies {
      asset {
        assetKey {
          path
        }
      }
    }
  }
}
'''
I'm just trying to figure out how to run the query.

Tim Castillo

05/17/2023, 5:19 PM
tbh, someone from my own team can correct me, but I think the DagsterGraphQLClient is a wrapper that runs specific pre-made queries. If you'd like to query the endpoint on your own outside of Dagster (such as in another Python script or a notebook), you can copy and paste the Python code from the gist I sent you and install its dependencies.
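As a rough illustration of querying the endpoint directly, here is a minimal sketch that uses only the Python standard library. This is not the code from the gist. It assumes the GraphQL endpoint is served at `/graphql` on the same host and port as the web UI (`localhost:3000`, as in the snippet above); adjust the URL for your deployment. The function names `build_request` and `run_query` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumption: the Dagster webserver runs locally on port 3000.
GRAPHQL_URL = "http://localhost:3000/graphql"


def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build a POST request carrying the GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, url: str = GRAPHQL_URL) -> Dict[str, Any]:
    """Send the query and return the "data" part of the response."""
    with urllib.request.urlopen(build_request(query, url), timeout=30) as resp:
        payload = json.load(resp)
    # GraphQL servers report query problems in an "errors" list.
    if payload.get("errors"):
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    return payload["data"]


# Usage, with a running Dagster instance and the query string q from above:
#   data = run_query(q)
#   for node in data["assetNodes"]:
#       print(node["assetKey"]["path"])
```

The request is built separately from the network call so the payload can be inspected without a running instance.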

Mycchaka Kleinbort

05/18/2023, 8:58 AM
Thank you, I succeeded