Beau Hartshorne
05/02/2020, 5:40 PM
Beau Hartshorne
05/02/2020, 5:48 PM
sashank
05/02/2020, 9:51 PM
Ilya Tyutin
05/03/2020, 7:51 PM
Ilya Tyutin
05/03/2020, 8:03 PM
Ilya Tyutin
05/03/2020, 8:07 PM
dagster.input_hydration_config (what a name, btw 😃). I wanted to build an object from a dictionary. The documentation states:
config_cls (Any) – The type of the config data expected by the decorated function. Users should provide one of the built-in types, or a composite constructed using Selector() or Permissive().
Dict is one of the listed built-in types, yet it's not a subclass of ConfigType, so it fails. So it's kind of confusing behaviour.
sephi
05/04/2020, 7:39 AM
ezechiel syx
05/04/2020, 2:23 PM
Cris
05/04/2020, 2:34 PM
matas
05/05/2020, 2:13 PM
addy
05/05/2020, 3:04 PM
s3://bucket/prefix/step.result, I'm also using dagster-pandas and TypeCheck to create summary statistics on dataframes (and other types). I would like to send these to S3 as well, so I have all of my intermediate results and event_specific_data in one place. I suppose I could do this with materializations, but I'm already automatically materializing them and already creating the EventMetadataEntry in the type checks, so it would feel hacky to re-add them to a materialization.
Loving dagster so far, and it's crazy how fast you guys are improving it. I started using it about 3 weeks ago and every day it feels a little more awesome to use.
Tobias Macey
05/05/2020, 3:55 PM
output_defs=[OutputDefinition(name='foo_out')] and then later in the function I yield Output(results, 'boo_out'), it would fail the linting. That would allow for real-time feedback while editing, without having to constantly flip between the beginning and end of the function to make sure you've got everything matched up, especially when refactoring.
Tobias Macey
05/05/2020, 8:36 PM
Tobias Macey
05/05/2020, 8:37 PM
Cris
05/06/2020, 5:10 AM
Ben Sully
05/06/2020, 7:40 AM
Noah Trueblood
05/06/2020, 5:49 PM
import dill
from dagster import (
    InputDefinition,
    OutputDefinition,
    execute_solid,
    lambda_solid,
    SerializationStrategy,
    usable_as_dagster_type,
    List,
)

# Module-level flag, flipped when the custom strategy's serialize() is called.
dilled = False


class DillSerializationStrategy(SerializationStrategy):
    """Dill it."""

    def __init__(self, name='dill'):
        super(DillSerializationStrategy, self).__init__(name)

    def serialize(self, value, write_file_obj):
        global dilled
        dilled = True
        dill.dump(value, write_file_obj)

    def deserialize(self, read_file_obj):
        return dill.load(read_file_obj)


@usable_as_dagster_type(
    serialization_strategy=DillSerializationStrategy(),
)
class SomeDagsterType:
    def __init__(self):
        self.x = 1


def test_serialization():
    @lambda_solid(
        name='ingest',
        input_defs=[
            InputDefinition(name='a', dagster_type=List[SomeDagsterType])
        ],
        output_def=OutputDefinition(name='result', dagster_type=List[SomeDagsterType]),
    )
    def ingest(a):
        return a

    result = execute_solid(
        solid_def=ingest,
        input_values={'a': [SomeDagsterType()]},
        environment_dict={
            'storage': {
                'filesystem': {},
            },
        },
    )
    v = result.output_value()
    # Passes only if the dill strategy was used to write the output.
    assert dilled
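To check a strategy like the one above independently of execute_solid and filesystem storage, one option is to round-trip a value through an in-memory buffer. The following is a minimal sketch under stated assumptions: `pickle` stands in for `dill` so that only the standard library is needed, and `PickleStrategy` and `round_trip` are hypothetical names that mirror the serialize/deserialize method shape shown above rather than any dagster API.

```python
import io
import pickle


class PickleStrategy:
    """Stand-in with the same serialize/deserialize shape as the strategy above.

    Assumption: pickle replaces dill so the sketch has no third-party dependency.
    """

    def serialize(self, value, write_file_obj):
        pickle.dump(value, write_file_obj)

    def deserialize(self, read_file_obj):
        return pickle.load(read_file_obj)


def round_trip(strategy, value):
    """Serialize value into an in-memory buffer, then read it back."""
    buffer = io.BytesIO()
    strategy.serialize(value, buffer)
    buffer.seek(0)  # rewind so deserialize reads from the start
    return strategy.deserialize(buffer)


original = [{"x": 1}, {"x": 2}]
restored = round_trip(PickleStrategy(), original)
print(restored == original)
```

If the round trip succeeds in isolation but the flag in the test is never set, the strategy itself is probably fine and the question becomes whether the storage layer calls it for the wrapped List type.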
Tobias Macey
05/06/2020, 6:14 PM
file-manager context object, I went looking for a reference in the docs to all of the attributes that are attached to the passed-in SolidExecutionContext by default, but didn't find one. That might be a useful addition to the docs if it isn't just a matter of me overlooking something.
sephi
05/07/2020, 5:12 AM
dagster.yaml configuration.
From the instance documentation I understand that it is read from $DAGSTER_HOME.
If dagster.yaml logically holds per-project configuration while the $DAGSTER_HOME environment variable is per user, what is the correct way to use dagster.yaml?
Where should it be saved, and how can it be used per project?
Ilya Tyutin
05/07/2020, 8:18 AM
No module named 'dagster_aws.emr.emr_pyspark_step_launcher'
I have the dagster_aws package installed, version 0.7.9.
In the screenshot you can see that emr_pyspark_step_launcher is missing here, but in the repo that file exists: https://github.com/dagster-io/dagster/tree/master/python_modules/libraries/dagster-aws/dagster_aws/emr
user
05/08/2020, 12:38 AM
yuhan
05/08/2020, 12:42 AM
Cris
05/08/2020, 1:19 AM
matas
05/08/2020, 10:21 AM
dagster.check.CheckError: Invariant failed. Description: Must use S3 or GCS storage with non-local Celery broker: amqp://guest:guest@cube_rabbitmq:5672// and backend: rpc://
Any ideas?
matas
05/08/2020, 1:48 PM
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I tried to run dagster-aws init inside the container, but it requires me to provide an AWS region, which is obviously irrelevant to me:
botocore.exceptions.NoRegionError: You must specify a region.
How can I provide S3 connection creds in this case?
schrockn
05/08/2020, 4:10 PM
user
05/08/2020, 11:43 PM
nate
05/08/2020, 11:53 PM
schrockn
05/09/2020, 12:02 AM
matas
05/09/2020, 9:01 AM