Muhammad Jarir Kanji
11/19/2022, 1:33 PMAdam Ma'ruf
11/19/2022, 3:57 PMReceived unexpected config entry "inputs" at path root:my_op. Expected: "{ config?: Any outputs?:...
----
Situations where we run into needing to do this: The upstream job runs ops that define the train, test, and eval datasets for a supervised learning task. The downstream job runs ops that engineer features and train models . The upstream job and downstream job don't change (or run) at the same frequency. The upstream job that defines the learning task and datasplits runs infrequently. The downstream job runs frequently as the ML engineer iterates on the model itself, and multiple runs should re-use the outputs of a single run of the upstream job.Eugenio Nurrito
11/19/2022, 6:14 PMJohn Sears
11/19/2022, 10:37 PMload_input
method, what is the right way to check if the upstream and downstream assets share a partition definition?David Jayatillake
11/20/2022, 12:40 AMFilip Radovic
11/20/2022, 4:02 PMStefan Adelbert
11/21/2022, 2:07 AMdocker compose
to orchestrate dagster components in both development and production environments. It's working really well, but I need to plan for a future where I need to scale up run more workers (user codes repositories) on other machines. I'm aware of running dagster on Kubernetes using Helm (https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm), but I'm wary of the learning curve.
Does anyone have experience getting dagster running on multiple nodes/instances/machines using docker swarm
?Mykola Palamarchuk
11/21/2022, 2:45 AMAverell
11/21/2022, 8:22 AMdefine_asset_job
, and got the exception below. That was because I used two different partitions definitions for the job and for the asset. What I don't understand is, if the same partitions_def is needed for the job and all of the assets it's materializing, then why do we need to provide the partitions_def when defining the job? Why not implicitly read that partitions_def from the assets?
Thanks!
Invariant failed. Description: Assets defined for node 'my_asset' have a partitions_def of Daily, starting 2020-01-01 UTC. End offsetted by 1 partition., but job 'my_job' has non-matching partitions_def of Daily, starting 2020-01-01 UTC..
shailesh
11/21/2022, 9:37 AMOperation name: LaunchAssetLoaderQuery
Message: Expected non-None value: None
Path: ["assetNodes",0,"requiredResources"]
Locations: [{"line":59,"column":3}]
Son Giang
11/21/2022, 10:39 AMbuild_asset_reconciliation_sensor
with dbt_assets
but run into this error:
dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server
The above exception was caused by the following exception:
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "{"created":"@1669027129.186794000","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
>
I run dagit and dagster-daemon locally.Giuseppe Savino
11/21/2022, 2:03 PMJun Jo
11/21/2022, 4:08 PMpip install dagster
, I get Defaulting to user installation because normal site-packages is not writeable
.
Then when I write a python file with from dagster import asset
it runs fine.
But when I try to use the dagster
command in Terminal, it shows: zsh: command not found: dagster
Does anyone have the same issue and resolve it in the past?Daniel Katz
11/21/2022, 4:08 PM__ASSET_JOB
which the asset reconciliation sensor and adhoc materializations kick off? What id like to be able to do is have the __ASSET_JOB
executor be the multiprocessing one (which I think happens by default now) when running dagster locally, but use the k8s job executor when running in the cloud.Viacheslav Nefedov
11/21/2022, 5:11 PMWonjae Lee
11/21/2022, 6:01 PMnickvazz
11/21/2022, 8:05 PMasset group
to have a partition?Kyle Gobel
11/21/2022, 9:03 PM<http://myregistry.com/user-deployment:v0.2.0|myregistry.com/user-deployment:v0.2.0>
but dagit/daemon is launching the new jobs in <http://myregistry.com/user-deployment:v0.1.0|myregistry.com/user-deployment:v0.1.0>
. Can I get some pointers on what config may be wrong, or how dagster decides what image to launch for the k8srunlauncher ?André Augusto
11/21/2022, 9:31 PMjob
level? I don’t want to define them at the asset-level, since it is easier (and more DRY) to define them at only the job level.
The following code (almost the same as the Parttioned Asset Jobs page) yields no partitions in dagit (v.1.1.2)
from dagster import AssetSelection, HourlyPartitionsDefinition, asset, define_asset_job
hourly_partitions_def = HourlyPartitionsDefinition(start_date="2022-05-31-00:00")
@asset
def asset1():
...
@asset
def asset2():
...
partitioned_asset_job = define_asset_job(
name="asset_1_and_2_job",
selection=AssetSelection.assets(asset1, asset2),
partitions_def=hourly_partitions_def,
)
Felix Ruess
11/21/2022, 9:33 PMOliver
11/21/2022, 10:31 PMvalidation_loader = repro.load_asset_value(
AssetKey("validate_loader"),
partition_key='dev_medium'
)
but I get
IsADirectoryError: [Errno 21] Is a directory: '/home/ubuntu/.dagster/storage/validate_loader'
looks like it's going through this code path
upath_io_manager.py:146-149
if not context.has_asset_partitions:
# we are dealing with a non-partitioned asset
path = self._get_path(context)
return self._load_single_input(path, context)
I tried with 1.0.16
and 1.1.2
ljx
11/22/2022, 4:05 AMsubscription($runId:ID!){
pipelineRunLogs(runId:$runId){
...on PipelineRunLogsSubscriptionSuccess{
__typename
run{
runId
pipelineName
status
stats{
...on RunStatsSnapshot{
enqueuedTime
launchTime
startTime
endTime
stepsFailed
stepsSucceeded
}
...on PythonError{
message
}
}
}
}
...on PipelineRunLogsSubscriptionFailure{
__typename
message
missingRunId
}
}
}
Any help matters! Thank youĐinh Đức Dương
11/22/2022, 4:58 AMFrédéric Kaczynski
11/22/2022, 10:02 AM@asset
def train():
model = get_tensorflow_model()
tf_model.fit()
return model
This fails because TF can't easily be picklized. Though, there is a method to save a TF model inside a file (tf_model.save("path_to_file")
). Is there a way to tell Dagster to save this file and treat it as the output?
While it's nice that Dagster takes care of the serialization, it'd be nice to have an escape hatch in case an asset is not properly supported by Dagster.
Thanks in advance!geoHeil
11/22/2022, 11:27 AMgeoHeil
11/22/2022, 11:42 AMAverell
11/22/2022, 12:25 PMEduardo Pereira
11/22/2022, 12:35 PMdagster._core.errors.DagsterInvalidConfigError: Error in config for job
Error 1: Missing required config entry "resources" at the root. Sample config for missing entry:
I know I missed something, but can't find anywhere, any advise?shailesh
11/22/2022, 1:03 PMAirton Neto
11/22/2022, 1:10 PMdefine_asset_job
returns UnresolvedAssetJobDefinition
?
I cant call .execute_in_process()
to test it. Should I use resolve
method before calling it?