Paul Wyatt
07/31/2020, 5:39 PM
I have a solid datasplit_y with an argument id_cols that takes a list of strings. All of the below have failed:
1
datasplit_y:
  inputs:
    id_cols:
      - user_id
      - program_day
2
datasplit_y:
  inputs:
    id_cols:
      value:
        - user_id
        - program_day
3
datasplit_y:
  inputs:
    id_cols: [user_id, program_day]
4
datasplit_y:
  inputs:
    id_cols: ['user_id', 'program_day']
5
datasplit_y:
  inputs:
    id_cols:
      value: ['user_id', 'program_day']
6
datasplit_y:
  inputs:
    id_cols:
      value: ['user_id', 'program_day']
7
datasplit_y:
  inputs:
    id_cols:
      value: "['user_id', 'program_day']"
8
datasplit_y:
  inputs:
    id_cols:
      value: "[user_id, program_day]"
Clearly I'm throwing Hail Marys at this point. Any advice?
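(For context, a minimal sketch of the kind of solid being configured here, assuming id_cols is declared as a dagster List[String] input; the real datasplit_y isn't shown in the thread and presumably also takes the dataset being split.)

from dagster import InputDefinition, List, String, solid

# Hypothetical reconstruction of the solid whose id_cols input the run config
# above is trying to populate; the body is just a placeholder.
@solid(input_defs=[InputDefinition('id_cols', List[String])])
def datasplit_y(context, id_cols):
    context.log.info('splitting on id columns: {}'.format(id_cols))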
John Mav
07/31/2020, 10:59 PM
Zach
08/01/2020, 3:57 PM
matas
08/02/2020, 3:24 PM
categories:
  home:
    - {source: foo, mult: 1}
  business:
    - {source: bar, mult: 2}
    - {source: baz, mult: 3}
...
Here I have a dictionary with different keys (which I want to use as meaningful values - home, business, etc.) and an array of structured values (each one is {source: str, mult: int}).
I see 3 ways now, all of them quite cumbersome:
• just use a Permissive for the whole config - I would lose validation at the last leaf level, which is a pity
• change the shape of the config to
categories:
  - name: home
    sources:
      - {source: foo, mult: 1}
  - name: business
    sources:
      - {source: bar, mult: 2}
      - {source: baz, mult: 3}
...
but it looks ugly to add the unnecessary 'name' and 'sources' keys (a typed config schema for this shape is sketched below)
• create my own DagsterType for the whole config - that looks a little like overkill here.
So am I missing something? Is there a better way, or what would be more "dagsterish"?
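(For the second option, a sketch of what the typed schema could look like, using Dagster's Shape and Array config types; the solid name load_categories is hypothetical.)

from dagster import Array, Shape, solid

# Sketch of a config schema for the reshaped, list-of-categories layout from
# the second option above; every leaf is validated as {source: str, mult: int}.
@solid(config_schema={
    'categories': Array(
        Shape({
            'name': str,
            'sources': Array(Shape({'source': str, 'mult': int})),
        })
    )
})
def load_categories(context):
    for category in context.solid_config['categories']:
        context.log.info('{}: {} sources'.format(category['name'], len(category['sources'])))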
David
08/02/2020, 6:07 PM
Dagster assets (numeric values).
https://dev.to/sephib/simple-pipeline-monitoring-dashboard-386p
Sina Samangooei
08/04/2020, 10:49 AM
(A -> D, B -> D, C -> D) into (A -> B, B -> C, C -> D), but that seems weird.
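(A minimal sketch of the fan-in shape being described, with hypothetical stub solids a, b, c, d; the chained alternative would instead thread each output into the next solid.)

from dagster import pipeline, solid

# Hypothetical solids illustrating the fan-in wiring (A -> D, B -> D, C -> D).
@solid
def a(_):
    return 1

@solid
def b(_):
    return 2

@solid
def c(_):
    return 3

@solid
def d(_, from_a, from_b, from_c):
    return from_a + from_b + from_c

@pipeline
def fan_in_pipeline():
    # A -> D, B -> D, C -> D; the chained rewiring would be d(c(b(a()))) with
    # b, c, d each taking a single upstream input instead.
    d(a(), b(), c())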
gatesma
08/04/2020, 2:11 PM
gatesma
08/04/2020, 2:13 PM
Cris
08/04/2020, 7:55 PM
borgdrone7
08/05/2020, 5:54 PM
Paul Wyatt
08/05/2020, 6:40 PM
matas
08/06/2020, 10:04 AM
Travis Cline
08/06/2020, 6:02 PM
sashank
08/06/2020, 6:02 PM
sashank
08/06/2020, 6:02 PM
Noah K
08/06/2020, 8:54 PM
Noah K
08/06/2020, 9:00 PM
user
08/06/2020, 9:32 PM
sashank
08/06/2020, 9:37 PM
schrockn
08/06/2020, 9:41 PM
schrockn
08/06/2020, 9:41 PM
schrockn
08/06/2020, 9:41 PM
@pipeline
* Support customized Postgres versions in Helm
* Helm support for liveness checks for Celery workers and Flower in Celery on Kubernetes
schrockn
08/06/2020, 9:43 PM
from some_aws_library import s3_session
# interface to s3_session:
# @resource(config_schema={'region': str, 'use_unsigned_session': bool})
# def s3_session(_init_context):

# bake all configuration with code:
east_unsigned_s3_session = s3_session.configured(
    {'region': 'us-east-1', 'use_unsigned_session': False}
)

# or allow your users to configure only the region:
@configured(s3_session, config_schema={'region': str})
def unsigned_s3_session(config):
    return {'region': config['region'], 'use_unsigned_session': False}
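(A sketch of how the pre-configured resource might be wired into a pipeline, assuming the usual ModeDefinition/required_resource_keys APIs; the solid and pipeline names here are made up, and what the resource object exposes depends entirely on some_aws_library.)

from dagster import ModeDefinition, execute_pipeline, pipeline, solid

@solid(required_resource_keys={'s3'})
def use_s3(context):
    # the resource is whatever s3_session yields; just log it here
    context.log.info(repr(context.resources.s3))

@pipeline(mode_defs=[ModeDefinition(resource_defs={'s3': east_unsigned_s3_session})])
def s3_pipeline():
    use_s3()

# no run config is needed for the resource: the region is already baked in
# execute_pipeline(s3_pipeline)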
Configured can also be applied to solid definitions. This allows one to apply configuration mapping without having to introduce a composite_solid layer:
@solid(config_schema={'age': int, 'name': str})
def introduce(context):
    return "{name} is {age} years old".format(**context.solid_config)

introduce_aj = configured(introduce, name="introduce_aj")({'age': 20, 'name': "AJ"})

@pipeline
def return_int_pipeline():
    introduce_aj()
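(Usage note, as a sketch: because introduce_aj has its config baked in via configured, the pipeline can be executed without any solid config.)

from dagster import execute_pipeline

# no run config for introduce_aj is required; its age/name were baked in above
result = execute_pipeline(return_int_pipeline)
assert result.success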
schrockn
08/06/2020, 9:43 PM
@failure_hook(required_resource_keys={'slack'})
def slack_on_failure(context):
    message = 'Solid {} failed'.format(context.solid.name)
    context.resources.slack.send_message(message)

@slack_on_failure
@pipeline
def a_pipeline():
    b(a())  # will fire if either a or b fails

@pipeline
def selective_notif():
    b.with_hooks({slack_on_failure})(a())  # only fires when b fails
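(A sketch of stub definitions to exercise the hook example locally; the fake slack resource and the solids a and b are all hypothetical, and a_pipeline would additionally need a mode providing the 'slack' resource key.)

from dagster import ModeDefinition, execute_pipeline, resource, solid

class FakeSlack:
    def send_message(self, message):
        print('would post to slack: {}'.format(message))

@resource
def fake_slack(_init_context):
    return FakeSlack()

@solid
def a(_):
    return 1

@solid
def b(_, x):
    raise Exception('boom')  # force the failure hook to fire

slack_mode = ModeDefinition(resource_defs={'slack': fake_slack})
# with @pipeline(mode_defs=[slack_mode]) on a_pipeline above:
# execute_pipeline(a_pipeline, raise_on_error=False)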
schrockn
08/06/2020, 9:44 PM
CronJob objects. This decouples the scheduler from Dagit, enabling you to run jobs on a schedule without requiring cron on the Dagit node, and enabling you to spin up multiple Dagit pods.
* An instance-level view of all the schedules that have been reconciled with the Scheduler.
* The ability to turn on and off any schedule within your currently loaded workspace from one page.
* View the state of, and turn off, all schedules, regardless of whether they are defined in the currently loaded workspace.
schrockn
08/06/2020, 9:44 PM
Cris
08/07/2020, 5:17 PM
user
08/07/2020, 11:30 PM
johann
08/07/2020, 11:32 PM
Muthu
08/10/2020, 3:49 PM
@solid(required_resource_keys={'vars'})
def clean_config_json(context, config_json=None, status=None):
    """Clean the dataset configuration"""
    pass