Marko Obert
03/12/2024, 7:33 AM
Scout
03/12/2024, 7:33 AM
[...] `EnvVar` class provided by Dagster. This class allows you to reference environment variables in your Dagster code, and when used, the actual values of these variables are not displayed in the UI, such as the Launchpad. Here's an example from the documentation on how to use `EnvVar` to configure a resource with credentials that are sourced from environment variables:

```python
from dagster import EnvVar, Definitions, ConfigurableResource


class CredentialsResource(ConfigurableResource):
    username: str
    password: str


defs = Definitions(
    assets=...,
    resources={
        "credentials": CredentialsResource(
            username=EnvVar("MY_USERNAME"),
            password=EnvVar("MY_PASSWORD"),
        )
    },
)
```
In this example, `MY_USERNAME` and `MY_PASSWORD` are the names of the environment variables that store the actual username and password. By using `EnvVar`, you instruct Dagster to retrieve the values at runtime, and these values will not be displayed in the UI, such as the Launchpad. This approach has several benefits:
- Improved observability: The UI will display information about configuration values sourced from environment variables.
- Secret values are hidden in the UI: Secret values are hidden in the Launchpad, Resources page, and other places where configuration is displayed.
- Simplified testing: Because you can provide string values directly to configuration rather than environment variables, testing may be easier.

For more information on using environment variables with Dagster and hiding sensitive information in the UI, you can refer to the [Environment variables guide](https://docs.dagster.io/guides/dagster/using-environment-variables-and-secrets) in the documentation.
Marko Obert
03/12/2024, 7:36 AM
Marko Obert
03/12/2024, 7:37 AM
Scout
03/12/2024, 7:37 AM
[...] `EnvVar` class, you can consider the following alternatives:
1. Using a `.env` File for Local Development: Dagster supports loading environment variables from a `.env` file for local development. This file contains environment variables but is normally kept out of source control and used only locally, which keeps sensitive information out of your codebase. However, the variables defined in the `.env` file are still environment variables; they are just not explicitly declared in your system's environment.
2. Secrets Managers: For cloud or production environments, you can use a secrets manager to store sensitive information. Dagster can be used with secrets management systems such as AWS Secrets Manager, GCP Secret Manager, or HashiCorp Vault. You can retrieve secrets at runtime within your resources or ops without exposing them in your code or UI.
3. Dynamically Loading Secrets: You can write custom code to dynamically load secrets from a secure location at runtime. This could be a secure file on disk, a database, or a custom secrets service. You would access these secrets programmatically within your resources or ops without using environment variables.
4. Using Configuration Files: You can use configuration files that are not checked into source control to store sensitive information. These files can be read at runtime to load the necessary credentials. Ensure that these files are properly secured and are only accessible by the necessary processes or users.
5. Runtime Configuration: If you have a mechanism to inject configuration at runtime, such as through a deployment pipeline or orchestration system, you can pass sensitive information directly to Dagster at execution time without storing it in the codebase.

While these methods can help hide credentials from the Launchpad and source control, you should always follow best practices for managing sensitive information, including secure storage, access controls, and audit logging. For specific implementations and best practices regarding secrets management in Dagster, refer to the official Dagster documentation or seek guidance from the Dagster community or support channels.
Marko Obert
03/12/2024, 7:43 AM
Scout
03/12/2024, 7:43 AM
[...]
```bash
pip install msal
```
2. Create a Custom Resource:
Define a custom resource in Dagster that encapsulates the MSAL authentication logic. This resource will handle acquiring and refreshing tokens as needed.
```python
from dagster import resource, Field
import msal


@resource({
    "client_id": Field(str, description="The Azure AD client ID"),
    "authority": Field(str, description="The authority URL for Azure AD"),
    "client_secret": Field(str, is_required=False, description="The client secret for confidential clients"),
    "tenant_id": Field(str, description="The Azure AD tenant ID"),
})
def msal_auth_resource(context):
    client_id = context.resource_config["client_id"]
    authority = context.resource_config["authority"]
    # client_secret is optional in the schema, so use .get() to avoid a KeyError when it is omitted
    client_secret = context.resource_config.get("client_secret")
    tenant_id = context.resource_config["tenant_id"]

    # Create a confidential client application; the authority URL contains a
    # "{0}" placeholder that is filled with the tenant ID
    app = msal.ConfidentialClientApplication(
        client_id,
        authority=authority.format(tenant_id),
        client_credential=client_secret,
    )

    # The resource can now use 'app' to acquire tokens and make authenticated calls
    return app
```
3. Use the Resource in Your Pipeline:
Utilize the custom MSAL resource in your pipeline to authenticate and make API calls to Microsoft services.
```python
from dagster import op


@op(required_resource_keys={"msal_auth"})
def my_op(context):
    msal_app = context.resources.msal_auth
    # Acquire a token for the application itself
    token_response = msal_app.acquire_token_for_client(
        scopes=["https://graph.microsoft.com/.default"]
    )
    # On success, token_response contains "access_token";
    # use the token to make API calls
```
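As a rough sketch of the "use the token to make API calls" step, the helper below turns the dictionary returned by `acquire_token_for_client` into an authenticated request object using only the standard library. The function name `build_graph_request` and the example URL are assumptions made for illustration and are not part of Dagster or MSAL; the sketch relies on MSAL results carrying `access_token` on success and `error` / `error_description` on failure.

```python
import urllib.request


def build_graph_request(token_response: dict, url: str) -> urllib.request.Request:
    """Build a request carrying the bearer token from an MSAL result dict.

    Hypothetical helper for illustration; it does not send the request.
    """
    if "access_token" not in token_response:
        # MSAL reports failures as a dict with "error" / "error_description"
        raise RuntimeError(
            "Token acquisition failed: "
            f"{token_response.get('error')}: {token_response.get('error_description')}"
        )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token_response['access_token']}"},
    )


# Example with a placeholder token; nothing is sent over the network here.
request = build_graph_request(
    {"access_token": "placeholder-token"},
    "https://graph.microsoft.com/v1.0/users",
)
```

Inside the op, the resulting request could then be sent with `urllib.request.urlopen(request)` or replaced by the HTTP client you already use.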
4. Configure the Resource:
When you run your pipeline, you'll need to provide the necessary configuration for the MSAL resource, such as the client ID, authority, client secret, and tenant ID.
```yaml
resources:
  msal_auth:
    config:
      client_id: "your-client-id"
      authority: "https://login.microsoftonline.com/{0}"
      client_secret: "your-client-secret"
      tenant_id: "your-tenant-id"
```
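A config like the one above writes the client secret directly into the run config, where it would be visible in the Launchpad. One possible alternative, sketched here as the equivalent Python `run_config` dictionary, is Dagster's `{"env": ...}` config form, which asks Dagster to read the value from an environment variable when the resource is initialized. This assumes the `client_secret` field is declared as `Field(StringSource, ...)` instead of `Field(str, ...)`; the variable name `AZURE_CLIENT_SECRET` is made up for illustration.

```python
# Run config mirroring the YAML structure, except that the secret is referenced
# by environment variable name instead of being written out.
# Assumption: the resource declares "client_secret": Field(StringSource, ...)
run_config = {
    "resources": {
        "msal_auth": {
            "config": {
                "client_id": "your-client-id",
                "authority": "https://login.microsoftonline.com/{0}",
                # Hypothetical variable name; only the name appears in the config
                "client_secret": {"env": "AZURE_CLIENT_SECRET"},
                "tenant_id": "your-tenant-id",
            }
        }
    }
}
```

With this form, the stored run config contains only the variable name, not the secret itself.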
5. Handle Token Caching and Renewal:
MSAL supports token cac[...]
Scout
03/12/2024, 7:44 AM
[...] `ConfidentialClientApplication` with a token cache. Please note that the above code is a simplified example to illustrate the concept. In a production environment, you should handle secrets such as the client secret securely, potentially using Dagster's secret management tools or environment variables. For detailed guidance on using MSAL and implementing token caching and renewal, refer to the [MSAL Python documentation](https://github.com/AzureAD/microsoft-authentication-library-for-python). Always ensure that you are following security best practices when handling authentication and managing sensitive information.
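As a generic illustration of token caching and renewal (not MSAL's built-in cache), the sketch below keeps the last token and fetches a new one shortly before it expires. The class name `CachedTokenProvider` and its parameters are hypothetical; `fetch` stands in for a call such as `lambda: msal_app.acquire_token_for_client(scopes=[...])`, and the sketch assumes a successful result carries `access_token` and `expires_in` (in seconds).

```python
import time
from typing import Callable, Optional


class CachedTokenProvider:
    """Reuse a token until shortly before it expires, then fetch a new one."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        skew_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # callable returning a token result dict
        self._skew = skew_seconds    # renew this many seconds before expiry
        self._clock = clock          # injectable clock, useful for testing
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            result = self._fetch()
            if "access_token" not in result:
                raise RuntimeError(f"Token acquisition failed: {result.get('error')}")
            self._token = result["access_token"]
            # A missing "expires_in" is treated as 0, forcing a fetch next time
            self._expires_at = now + float(result.get("expires_in", 0))
        return self._token
```

A resource could hold one such provider per process so that ops call `get_token()` instead of requesting a fresh token on every run step.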