Andrey Alekseev
06/08/2020, 8:24 AM
Daniel Olausson
06/08/2020, 2:46 PM
Does dagster pipeline execute return with a non-zero exit code when the pipeline fails?
max
06/08/2020, 6:52 PM
Muthu
06/09/2020, 5:28 AM
Error: Invalid value for '--repository-yaml' / '-y': Path 'None' does not exist.
Scheduler definition
def imdex_schedules():
    print("Scheduler function")
    return [
        ScheduleDefinition(
            name='DC_Log_Cleaner',
            cron_schedule='2 * * * *',
            pipeline_name='my_pipe',
            environment_dict=basic_config,
        )
    ]
sephi
06/09/2020, 11:43 AM
bash_command_solid inside a composite_solid, and need to set a dependency between the output of the bash_command and another solid in the pipeline.
We are not able to get an output from the bash_command_solid.
What would be the correct approach for such a task?
While researching this issue, I'm trying to understand the bash_script_solid test example:
can you get the return from the bash script?
what does the following line do in the pipeline?
a()
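For reference, a minimal plain-Python sketch of the underlying idea: run the script, capture its stdout, and return it so that a downstream step can depend on the value. It uses only the standard library's subprocess module, not dagster_bash, and assumes bash is on the PATH; run_bash and count_words are hypothetical names, and in a pipeline each function body would live inside its own solid.

```python
import subprocess


def run_bash(script: str) -> str:
    """Run a bash snippet and return its stdout without surrounding whitespace."""
    completed = subprocess.run(
        ["bash", "-c", script],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()


def count_words(text: str) -> int:
    """Hypothetical downstream step that depends on the script's output."""
    return len(text.split())


output = run_bash("echo hello from bash")
word_count = count_words(output)
```

Because run_bash returns the captured text, the dependency between the two steps is just an ordinary value being passed along.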
Tobias Macey
06/09/2020, 12:59 PM
execute function in the utils module of dagster_bash. You can see what I did here: https://github.com/mitodl/ol-data-pipelines/blob/master/ol_data_pipelines/edx/solids.py#L476-L480
sephi
06/09/2020, 1:02 PM
execute take a script? I'll try and check it out
timo
06/09/2020, 2:10 PM
Auster Cid
06/09/2020, 2:48 PM
alir
06/09/2020, 4:24 PM
dhume
06/09/2020, 6:14 PM
Ken
06/09/2020, 11:45 PM
Joost at geronimo.ai
06/10/2020, 9:28 AM
borgdrone7
06/10/2020, 12:34 PM
borgdrone7
06/10/2020, 12:34 PM
borgdrone7
06/10/2020, 12:34 PM
borgdrone7
06/10/2020, 12:35 PM
Joseph Sayad
06/10/2020, 3:10 PM
tracemalloc and memory_profiler in my solids to get a better understanding of their memory consumption
Kevin
06/11/2020, 5:29 AM
user
06/11/2020, 11:17 PM
max
06/11/2020, 11:19 PM
CHANGES.md for a full list of changes, and the 080_MIGRATION.md migration guide for details on updating 0.7.x code to work with 0.8.0. We'll be hosting a webinar on Tuesday morning (9am Pacific/noon Eastern/6pm CET) to present some of these changes and hopefully invite some discussion with the community -- DM me if you'd like to be added to the calendar invite. I'll post summaries of the major changes below for comments and discussion in threads here.
max
06/11/2020, 11:20 PM
workspace.yaml, in order to support this new architecture. The workspace yaml encodes what repositories to load and their location, and supersedes the repository.yaml file and associated machinery.
As a consequence, Dagster internals are now stricter about how pipelines are loaded. If you have written scripts or tests in which a pipeline is defined and then passed across a process boundary (e.g., using the multiprocess_executor or dagstermill), you may now need to wrap the pipeline in the reconstructable utility function for it to be reconstructed across the process boundary.
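To illustrate why a wrapper is needed at all: an in-memory definition generally cannot be handed to another process as-is, so what crosses the boundary is a description of where to find it, and the receiving side rebuilds it from that description. The sketch below shows this general idea with the standard library only. The names describe and rebuild are hypothetical, collections.OrderedDict merely stands in for a pipeline factory, and none of this is Dagster's actual implementation of reconstructable.

```python
import importlib
import json


def describe(module_name: str, attr_name: str) -> str:
    """Serialize a pointer to a module-level callable, not the object itself."""
    return json.dumps({"module": module_name, "attr": attr_name})


def rebuild(payload: str):
    """On the receiving side, re-import the module and call the factory."""
    pointer = json.loads(payload)
    module = importlib.import_module(pointer["module"])
    factory = getattr(module, pointer["attr"])
    return factory()


# Stand-in for a pipeline factory: any importable, zero-argument callable.
payload = describe("collections", "OrderedDict")
rebuilt = rebuild(payload)
```

A scheme like this only works when the target can be imported by name, which is one reason definitions created interactively or inside another function body tend to be hard to reconstruct in a different process.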
In addition, rather than instantiate the RepositoryDefinition class directly, users should now prefer the @repository decorator. As part of this change, the @scheduler and @repository_partitions decorators have been removed, and their functionality subsumed under @repository.
max
06/11/2020, 11:21 PM
max
06/11/2020, 11:21 PM
max
06/11/2020, 11:22 PM
RunLauncher configured on the Dagster instance, if one is configured. Additionally, run launchers can now support termination of previously launched runs. If you have written your own run launcher, you may want to update it to support termination. Note also that as of 0.7.9, the semantics of RunLauncher.launch_run have changed; this method now takes the run_id of an existing run and should no longer attempt to create the run in the instance.
max
06/11/2020, 11:22 PM
max
06/11/2020, 11:22 PM
max
06/11/2020, 11:22 PM
StepLauncher abstraction that uses the resource system to allow individual execution steps to be run in separate processes (and thus on separate execution substrates). This has made extensive improvements to our PySpark support possible, including the option to execute individual PySpark steps on EMR using the EmrPySparkStepLauncher and on Databricks using the DatabricksPySparkStepLauncher. The emr_pyspark example demonstrates how to use a step launcher.
max
06/11/2020, 11:23 PM
run_config, and the previous environment_dict argument to APIs such as execute_pipeline is now deprecated. We renamed this argument to focus attention on the configuration of the run being launched or executed, rather than on an ambiguous "environment". We've also renamed the config argument on all definitions to config_schema, which should reduce ambiguity between the configuration schema and the value being passed in some particular case. We've also consolidated and improved documentation of the valid types for a config schema.
max
06/11/2020, 11:23 PM
simple_lakehouse example gives a taste of what it's like to program in Lakehouse. We'd love feedback on whether this model is helpful!