Frank Dekervel
02/21/2022, 5:02 PM
jonvet
02/21/2022, 5:40 PM
dagit
that I should report it here
Operation name: InstanceSensorsQuery
Message: Attempted to deserialize class "InstigatorState" which is not in the whitelist. This error can occur due to version skew, verify processes are running expected versions.
Descent path: <root:dict>
Path: ["repositoriesOrError","nodes",0,"sensors"]
Locations: [{"line":17,"column":9}]
Stack Trace:
File "/usr/local/lib/python3.7/site-packages/graphql/execution/executor.py", line 452, in resolve_or_error
return executor.execute(resolve_fn, source, info, **args)
File "/usr/local/lib/python3.7/site-packages/graphql/execution/executors/sync.py", line 16, in execute
return fn(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/schema/external.py", line 198, in resolve_sensors
[GrapheneSensor(graphene_info, sensor) for sensor in sensors],
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/schema/external.py", line 198, in <listcomp>
[GrapheneSensor(graphene_info, sensor) for sensor in sensors],
File "/usr/local/lib/python3.7/site-packages/dagster_graphql/schema/sensors.py", line 55, in __init__
self._external_sensor.get_external_origin_id()
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 1689, in get_job_state
return self._schedule_storage.get_job_state(job_origin_id)
File "/usr/local/lib/python3.7/site-packages/dagster/core/storage/schedules/sql_schedule_storage.py", line 66, in get_job_state
return self._deserialize_rows(rows[:1])[0] if len(rows) else None
File "/usr/local/lib/python3.7/site-packages/dagster/core/storage/schedules/sql_schedule_storage.py", line 37, in _deserialize_rows
return list(map(lambda r: deserialize_json_to_dagster_namedtuple(r[0]), rows))
File "/usr/local/lib/python3.7/site-packages/dagster/core/storage/schedules/sql_schedule_storage.py", line 37, in <lambda>
return list(map(lambda r: deserialize_json_to_dagster_namedtuple(r[0]), rows))
File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 371, in deserialize_json_to_dagster_namedtuple
check.str_param(json_str, "json_str"), whitelist_map=_WHITELIST_MAP
File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 399, in _deserialize_json
return unpack_inner_value(value, whitelist_map=whitelist_map, descent_path=_root(value))
File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 430, in unpack_inner_value
f'Attempted to deserialize class "{klass_name}" which is not in the whitelist. '
geoHeil
02/21/2022, 8:55 PM
Marc Keeling
02/21/2022, 11:16 PM
Keith Devens
02/21/2022, 11:45 PM
root_run_id in Dagster. That worked fine locally, but I'm now testing our deployment in ECS and realized that if a job fails, its container will disappear, and the temporary data with it.
So, is there any way to persist temporary data (that can be picked up when a job is resumed) aside from returning it from the op?
Billie Thompson
02/22/2022, 8:42 AM
Mykola Palamarchuk
02/22/2022, 10:06 AM
Benoit Perigaud
02/22/2022, 10:11 AM
assets = load_assets_from_dbt_manifest(
    json.load(open(os.path.join(DBT_PROJECT_DIR, "target", "manifest.json")))
)
build_dbt_models = build_assets_job("build_dbt_models", assets=assets, resource_defs=DEV_RESOURCES)
When I try to load the job though, I get errors whenever my dbt project contains seeds and snapshots. I then modified the file dagster_dbt/asset_defs.py to replace if node_info["resource_type"] == "model" with if node_info["resource_type"] in ["model", "snapshot", "seed"]. I can now load my dagster job properly, but I still get some errors and unexpected behaviors when trying to materialize my dbt project:
1. I get errors that my seeds are not materialized (Core compute for op "dbt_project" did not return an output for non-optional output "raw_customers"), most likely because only a dbt run is done, and not a dbt seed.
2. If I go to the assets view and want to materialize only one of them, dagster still does a dbt run --select asset1 asset2 asset3 ..., not filtering down to the model I asked to materialize.
3. If I recall correctly, in previous versions running a dbt asset-based job showed me all the different dbt models in the “Timed View”; now it only shows me one task, “dbt_project”.
Is what I am describing above the expected behaviour, or am I missing something?
George Pearse
02/22/2022, 11:47 AM
Qumber Ali
02/22/2022, 1:17 PM
Alex Kerney
02/22/2022, 4:48 PM
**kwargs in ops?
I have a few ops that render Jinja2 templates and need to pass information to the templates. It would be nice to generalize the ops so that I can use them in other pipelines and just pass different info into the template, but if I try to pass keyword arguments in, I get a DagsterInvalidDefinitionError.
dagster.core.errors.DagsterInvalidDefinitionError: Invalid dependencies: solid "update_templates" does not have input "latest_file". Available inputs: ['template_config', 'kwargs', 'start_after']
I’ve tried with an op like this:
@op(
    ins={
        "template_config": In(TemplateConfig),
        "kwargs": In(Optional[dict]),
        "start_after": In(Nothing),
    }
)
def update_templates(
    context: OpExecutionContext,
    template_config: TemplateConfig,
    **kwargs,
):
    ...
In a graph similar to
@graph
def generate_datasets():
    dataset_path = download_dataset()
    template_config = template_and_destination()
    update_templates(template_config, dataset_path=dataset_path)
George Pearse
02/22/2022, 5:28 PM
Lee Littlejohn
02/22/2022, 5:59 PM
Op names like outer_graph.inner_graph.op_1 and outer_graph.inner_graph.op_2 are truncated based on length (to keep the display less messy in Dagit) to outer_graph.inne… and outer_graph.inne…, which becomes very visually unclear with lots of nesting, aliases, or DynamicOutputs.
Hebo Yang
02/22/2022, 7:12 PM
mrdavidlaing
02/22/2022, 8:06 PM
(on the instance/overview page)
Is this a known issue, or something to do with our specific set of dependencies?
AttributeError: 'Select' object has no attribute 'subquery'
Traceback (most recent call last):
...snip...
File "/Users/dlaing/workspace/tanzu-dm/.venv/lib/python3.7/site-packages/dagster/core/storage/runs/sql_run_storage.py", line 312, in _runs_query
subquery = base_query.subquery()
graphql.error.located_error.GraphQLLocatedError: 'Select' object has no attribute 'subquery'
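Select.subquery() only exists in SQLAlchemy 1.4 and later; with an older SQLAlchemy pinned in the environment, recent Dagster versions raise exactly this AttributeError. A hedged guess at the fix is to align the dependency pin, e.g. in requirements.txt:

```
# sketch: Dagster's run storage calls Select.subquery(), added in SQLAlchemy 1.4
sqlalchemy>=1.4
```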
geoHeil
02/22/2022, 8:27 PM
timo nicolas
02/22/2022, 8:51 PM
Danny Jackowitz
02/22/2022, 9:46 PM
How do I use the json_console_logger instead of the colored_console_logger? I assume this must be a one-line addition to a configuration file somewhere (dagster.yaml?) but I'm not finding any references as to how exactly to make such a change. The end goal is to be able to route stdout/stderr from these processes into Datadog and have them parsed sensibly, particularly to support multi-line logs (e.g. backtraces).
Alexander Vandenberg-Rodes
02/23/2022, 3:11 AM
After running dagster instance migrate, I'm seeing a strange bug:
• the Assets page is completely blank, no matter which projects I select
• but navigating from a job/pipeline run to the asset materialization events shows the assets do exist
Huib Keemink
02/23/2022, 10:38 AM
Robert Balaban
02/23/2022, 2:03 PM
resource "helm_release" "dagster" {
  ...
  set {
    name  = "dagster-user-deployments.deployments[0].envSecrets[0].name"
    value = kubernetes_secret.dagster-aws-secret.metadata[0].name
    type  = "string"
  }
}

resource "kubernetes_secret" "dagster-aws-secret" {
  metadata {
    name      = "dagster-aws-secret"
    namespace = var.namespace
  }
  data = {
    AWS_ACCESS_KEY_ID     = var.AWS_ACCESS_KEY_ID
    AWS_SECRET_ACCESS_KEY = var.AWS_SECRET_ACCESS_KEY
  }
}
Cheers!
After some poking around, it turns out that the creds need to go into runLauncher, which makes sense. Modifying the terraform file like this:
set {
  name  = "runLauncher.config.k8sRunLauncher.envSecrets[0].name"
  value = kubernetes_secret.dagster-aws-secret.metadata[0].name
  type  = "string"
}
Seems to do the trick.
Hemanth Bellala
02/23/2022, 3:12 PM
2022-02-23 10:11:05 -0500 - dagit - INFO - Serving dagit on http://0.0.0.0:3000 in process 15370
WARNING: You must pass the application as an import string to enable 'reload' or 'workers'.
It starts, then stops again. Has anyone faced this issue?
Ashish Khaitan
02/23/2022, 6:44 PM
Anoop Sharma
02/23/2022, 7:22 PM
Basarat Aleem Mohammed
02/24/2022, 4:25 AM
Gijs
02/24/2022, 7:29 AM
Huib Keemink
02/24/2022, 8:11 AM
{'error_code': 'RESOURCE_DOES_NOT_EXIST', 'message': 'No file or directory exists on path /tmp/dagster_staging/XXX/XXX/stdout.'}
Irven Aelbrecht
02/24/2022, 8:35 AM
You can define an async op, but you can't define an async graph? At least I didn't see how I could use DynamicOutput together with an async op.
We also use async clients, for instance aiohttp.ClientSession, which is generally used via an async context manager. I want to define this as a resource (I saw it is done this way for non-async clients in the hackernews example). How should I do this in dagster?
Sundara Moorthy
02/24/2022, 10:02 AMdagster-user-deployments:
enabled: true
deployments:
- name: "k8s-example-user-code-1"
image:
repository: "<http://docker.io/dagster/user-code-example|docker.io/dagster/user-code-example>"
tag: latest
pullPolicy: Always
dagsterApiGrpcArgs:
- "--python-file"
- "/example_project/example_repo/repo.py"
port: 3030
Can I use a Spark image in place of docker.io/dagster/user-code-example?
Irven Aelbrecht
02/24/2022, 1:20 PM
AttributeError: 'SensorEvaluationContext' object has no attribute 'log'
I hacked this in my test with
context = dagster.build_sensor_context()
context.log = logging.Logger('test')
but running the sensor in dagit gave the above error 🙂