Marco
06/14/2021, 4:23 PM

alex
06/14/2021, 4:27 PM
Each load_from entry in a workspace will lead to separate grpc server subprocesses spawned by dagit and by the daemon if they are all being run on the same machine. If separate python environments are not needed, consolidating loading is recommended.
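A minimal sketch of what that consolidation could look like (file, pipeline, and repository names here are made up): both repositories live in one module, so a single load_from entry loads them together instead of spawning a grpc subprocess per entry.

    # repos.py (hypothetical) -- one module exposing several repositories;
    # point a single workspace load_from entry (e.g. python_file: repos.py)
    # at this file so dagit and the daemon each spawn only one subprocess.
    from dagster import pipeline, repository, solid

    @solid
    def say_hello(context):
        context.log.info("hello")

    @pipeline
    def hello_pipeline():
        say_hello()

    @repository
    def repo_one():
        return [hello_pipeline]

    @repository
    def repo_two():
        return [hello_pipeline]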
* Manually managing your grpc servers is another approach to more efficiently host dagit and the daemon on the same machine: https://docs.dagster.io/concepts/repositories-workspaces/workspaces#running-your-own-grpc-server

Marco
As for the specific grpc bit: do I need one port per repo then? How can I then link dagit to several ports?

alex
If you are going to manually spawn multiple grpc servers instead of consolidating your loading, you will need to use a workspace.yaml to provide the multiple load_from entries with different ports.
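If you do run your own servers, a sketch of what the pieces might look like, following the docs page linked above (ports, file names, and location names are illustrative):

    # start one grpc server per repository, e.g.:
    #   dagster api grpc --python-file repo_one.py --host localhost --port 4266
    #   dagster api grpc --python-file repo_two.py --host localhost --port 4267

    # workspace.yaml -- dagit and the daemon then connect to both ports
    load_from:
      - grpc_server:
          host: localhost
          port: 4266
          location_name: "repo_one"
      - grpc_server:
          host: localhost
          port: 4267
          location_name: "repo_two"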
prha
06/14/2021, 4:52 PM

    from dagster import solid

    CHUNK_SIZE = 100  # page size for each run-record fetch (value is illustrative)

    @solid
    def prune_old_runs(context, prune_threshold_datetime):
        has_more = True
        while has_more:
            # fetch the oldest remaining runs, one page at a time
            run_records = context.instance.get_run_records(
                order_by="create_timestamp", ascending=True, limit=CHUNK_SIZE
            )
            has_more = len(run_records) == CHUNK_SIZE
            for record in run_records:
                if record.create_timestamp > prune_threshold_datetime:
                    has_more = False  # everything left is newer than the cutoff
                    break
                # deletes the run along with its event log records
                context.instance.delete_run(record.pipeline_run.run_id)
Disclaimer: this will wipe the event log records for runs older than the given cutoff, regardless of their run status. You may want to add extra filters / conditions to preserve certain types of runs (in-progress runs, the most recent run for a partition, etc.). This might also affect the asset history of any assets materialized in the wiped runs.
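As one sketch of such a filter (not from the thread; it assumes the 0.11-era import path dagster.core.storage.pipeline_run and that PipelineRunsFilter accepts a statuses list), the query can be restricted to finished runs so queued or in-progress runs are never deleted:

    from dagster import solid
    from dagster.core.storage.pipeline_run import PipelineRunStatus, PipelineRunsFilter

    CHUNK_SIZE = 100  # illustrative page size

    # only runs that have reached a terminal status are eligible for pruning
    FINISHED = PipelineRunsFilter(
        statuses=[
            PipelineRunStatus.SUCCESS,
            PipelineRunStatus.FAILURE,
            PipelineRunStatus.CANCELED,
        ]
    )

    @solid
    def prune_old_finished_runs(context, prune_threshold_datetime):
        has_more = True
        while has_more:
            run_records = context.instance.get_run_records(
                filters=FINISHED,
                order_by="create_timestamp",
                ascending=True,
                limit=CHUNK_SIZE,
            )
            has_more = len(run_records) == CHUNK_SIZE
            for record in run_records:
                if record.create_timestamp > prune_threshold_datetime:
                    has_more = False  # remaining runs are newer than the cutoff
                    break
                context.instance.delete_run(record.pipeline_run.run_id)

Filtering at the query level (rather than skipping records inside the loop) keeps the pagination sound: skipped records would otherwise be re-fetched on every pass.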
Marco
06/14/2021, 4:54 PM

alex
06/14/2021, 4:58 PM
get_run_records is brand new and may be renamed as part of its formal support & docs, likely by this week's release.

Marco
06/15/2021, 6:09 PM