# announcements
m
The pipeline in dagit shows it succeeded. However, it looks like it is executing one of the solids forever. I checked the stderr (clicking View Raw Step Output) and all the steps are done; there is just a lot of processing in this solid. Look below. Am I missing something here... other than breaking up the clean_leads solid?
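For illustration only, here is a minimal sketch of what breaking a monolithic clean_leads solid into smaller solids could look like, using the legacy @solid / @pipeline API that matches the 2020-era logs above; the solid names, columns, and sample data are invented for the example and are not the poster's actual pipeline code.

```python
import pandas as pd
from dagster import pipeline, solid


@solid
def load_leads(context):
    # Stand-in for however the real pipeline produces the `dataframe` input.
    return pd.DataFrame([{"name": "Acme Stores", "url": "acme.example"}])


@solid
def normalize_leads(context, leads):
    # One concern per solid: clean and de-duplicate urls, lower-case names.
    context.log.info("Clean up and de-duplicate urls")
    leads = leads.drop_duplicates(subset=["url"]).copy()
    leads["name_lower"] = leads["name"].str.lower()
    return leads


@solid
def filter_new_leads(context, leads):
    # Stand-in for the `previously_processed` input in the real pipeline.
    previously_processed = ["old-lead.example"]
    context.log.info("Get those companies that are new")
    return leads[~leads["url"].isin(previously_processed)]


@pipeline
def new_lead_pipeline_sketch():
    filter_new_leads(normalize_leads(load_leads()))
```

Split this way, each stage shows up as its own step in dagit, so a long-running part of the cleanup is easier to spot and retry on its own.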
a
hmm - if there is nothing too sensitive in the dagit logs, you could send over a debug dump and i could take a look.
there is a download option in the ... on the right of the page that lists all the runs
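(Depending on the Dagster version in use, a `dagster debug export <run_id> <output_file>` CLI command can also produce the same debug dump; if that is not available in your release, the ... download in the run list is the way to go.)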
m
2020-11-13 14:28:53 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - STEP_START - Started execution of step "clean_leads.compute".
2020-11-13 14:28:53 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - OBJECT_STORE_OPERATION - Retrieved intermediate object for input dataframe in memory object store using pickle.
2020-11-13 14:28:54 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - OBJECT_STORE_OPERATION - Retrieved intermediate object for input previously_processed in memory object store using pickle.
2020-11-13 14:28:54 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - STEP_INPUT - Got input "dataframe" of type "dict". (Type check passed).
2020-11-13 14:28:54 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - STEP_INPUT - Got input "previously_processed" of type "[Any]". (Type check passed).
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Total number of leads to clean: 19. Previously processed: [....list of urls....]
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Clean up xxx.xx url
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Remove xxx.xx duplicate url
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Change the name to lower case
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Got unique businesses
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Change the columns to col_name and count
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Merge using name_lower
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Get where count is in [1,2,3]
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Filter out when review count <= 2
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Filter out stores based on bad company name
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Get companies with 0 bad name count
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Mark those companies which are already processed
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Get those companies that are new
2020-11-13 14:28:54 - dagster - INFO - system - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - clean_leads.compute - Assign the source to lead source detail
2020-11-13 14:28:54 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - STEP_OUTPUT - Yielded output "result" of type "Any". (Type check passed).
2020-11-13 14:28:54 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - OBJECT_STORE_OPERATION - Stored intermediate object for output result in memory object store using pickle.
2020-11-13 14:28:54 - dagster - DEBUG - new_lead_pipeline - 9f75f314-9bc5-4a98-86b5-282eb295e9a9 - 754 - clean_leads.compute - STEP_SUCCESS - Finished execution of step "clean_leads.compute" in 578ms.
a
These are the raw logs, I presume? Dagit uses the database as the source of truth; what is your deployment setup?
m
Correct. It is deployed on k8s on GCP.
Ha...I think this could be an issue with my cluster.
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/gevent/_socket3.py", line 515, in send
    return self._sock.send(data, flags)
BlockingIOError: [Errno 11] Resource temporarily unavailable
a
Have you refreshed the dagit page?
👍 1
m
I re-ran it, and every time I refresh, I get the logs populated. Thanks.