dansasbu
08/27/2021, 2:44 PM
alex
08/27/2021, 3:07 PM
"(0.13.x) doesn’t work with Dagster"
What exactly are you seeing? We are pinned against a few specific versions that had incompatibilities, but the newest should work: https://github.com/dagster-io/dagster/blame/master/python_modules/libraries/dagster-ge/setup.py#L38
dansasbu
08/27/2021, 3:24 PM
AttributeError: 'DataContextConfig' object has no attribute 'validation_operators'
If I add
validation_operator_name='action_list_operator'
to the arguments of the function ge_validation_solid_factory, I receive this one
AttributeError: 'Datasource' object has no attribute 'get_batch'
I see: for some reason, the function _get_data_context_version in data_context.py is returning None instead of 'v3'. Since it is None, it tries to use ._get_batch_v2.
alex
08/27/2021, 3:35 PM
owen
08/27/2021, 4:03 PM
dansasbu
08/27/2021, 4:06 PM
owen
08/27/2021, 4:20 PM
_get_data_context_version is just basing the decision to use v2 or v3 on how get_batch() was called (which is determined entirely by the dagster-ge code, not the config yaml). I'll work on putting out a change that allows you to choose which config version to run your validation against (which will just take the form of a parameter on the solid factory).
dansasbu
09/09/2021, 2:32 PM
ge_validation_solid_factory would need to be rewritten?
As part of the new modular expectations API in Great Expectations, Validation Operators are evolving into Checkpoints. At some point in the future Validation Operators will be fully deprecated.
https://docs.greatexpectations.io/docs/reference/checkpoints_and_actions
owen
09/09/2021, 3:44 PM
dansasbu
09/10/2021, 4:21 PM
@solid(tags={"kind": "ge"})
def run_ge_validation(_context,
                      ge_context,
                      df,
                      datasource_name,
                      data_connector_name,
                      data_asset_name,
                      expectation_suite_name,
                      ):
    checkpoint = SimpleCheckpoint(name='ge_checkpoint',
                                  data_context=ge_context,
                                  batch_request={'datasource_name': datasource_name,
                                                 'data_connector_name': data_connector_name,
                                                 'data_asset_name': data_asset_name,
                                                 'batch_identifiers': {
                                                     'default_identifier_name': 'default_identifier'}})
    results = checkpoint.run(run_name=f'{expectation_suite_name} run',
                             validations=[
                                 {
                                     "batch_request": {'runtime_parameters': {'batch_data': df}},
                                     "expectation_suite_name": expectation_suite_name,
                                 }
                             ])
    yield Output(results.to_json_dict())
Do you know how I can render the results the way you do for the validation operator? Thanks!
owen
09/10/2021, 7:05 PM
validation_results_page_renderer = ValidationResultsPageRenderer(run_info_at_end=True)
rendered_document_content_list = validation_results_page_renderer.render(
    validation_results=results
)
md_str = "".join(DefaultMarkdownPageView().render(rendered_document_content_list))
meta_stats = EventMetadataEntry.md(md_str=md_str, label="Expectation Results")
yield ExpectationResult(
    success=bool(results["success"]),
    metadata_entries=[meta_stats],
)
yield Output(results.to_json_dict())
should work!
dansasbu
09/10/2021, 8:18 PM
@solid(tags={"kind": "ge"})
def run_ge_validation(_context,
                      ge_context,
                      df,
                      datasource_name,
                      data_connector_name,
                      data_asset_name,
                      expectation_suite_name,
                      ):
    checkpoint = SimpleCheckpoint(name='ge_checkpoint',
                                  data_context=ge_context,
                                  batch_request={'datasource_name': datasource_name,
                                                 'data_connector_name': data_connector_name,
                                                 'data_asset_name': data_asset_name,
                                                 'batch_identifiers': {
                                                     'default_identifier_name': 'default_identifier'}})
    results = checkpoint.run(run_name=f'{expectation_suite_name} run',
                             validations=[
                                 {
                                     "batch_request": {'runtime_parameters': {'batch_data': df}},
                                     "expectation_suite_name": expectation_suite_name,
                                 }
                             ])
    validation_results_page_renderer = ValidationResultsPageRenderer(run_info_at_end=True)
    rendered_document_content_list = validation_results_page_renderer.render(
        validation_results=results
    )
    md_str = "".join(DefaultMarkdownPageView().render(rendered_document_content_list))
    meta_stats = EventMetadataEntry.md(md_str=md_str, label="Expectation Results")
    yield ExpectationResult(
        success=bool(results["success"]),
        metadata_entries=[meta_stats],
    )
    yield Output(results.to_json_dict())
But I receive this error:
AttributeError: 'CheckpointResult' object has no attribute 'meta'
  File "C:\Users\s2795861\Documents\envs\caps\lib\site-packages\dagster\core\execution\plan\utils.py", line 42, in solid_execution_error_boundary
    yield
  File "C:\Users\s2795861\Documents\envs\caps\lib\site-packages\dagster\utils\__init__.py", line 383, in iterate_with_context
    next_output = next(iterator)
  File "c:\users\s2795861\documents\caps\caps\CAPS\solids\solids_common_validation.py", line 176, in run_ge_validation
    rendered_document_content_list = validation_results_page_renderer.render(
  File "C:\Users\s2795861\Documents\envs\caps\lib\site-packages\great_expectations\render\renderer\page_renderer.py", line 84, in render
    run_id = validation_results.meta["run_id"]
Looks like checkpoints store that metadata in a different way.
owen
09/10/2021, 8:21 PM
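A possible way around the error above, sketched under assumptions and not taken from this thread: the traceback shows the page renderer reading validation_results.meta, which the CheckpointResult does not have. Assuming the CheckpointResult exposes a run_results dict whose values each hold a "validation_result" entry, each of those results could be rendered on its own. The helper names iter_validation_results and render_each are hypothetical, and the stand-in objects below only mimic that assumed shape so the sketch runs without Great Expectations installed.

```python
from types import SimpleNamespace


def iter_validation_results(checkpoint_result):
    """Yield the validation result stored in each entry of run_results.

    Assumption: run_results is a dict whose values are dicts with a
    "validation_result" key.
    """
    for run_result in checkpoint_result.run_results.values():
        yield run_result["validation_result"]


def render_each(checkpoint_result, render_one):
    """Apply render_one to every validation result and join the pieces."""
    return "\n".join(
        render_one(vr) for vr in iter_validation_results(checkpoint_result)
    )


# Stand-in objects that only mimic the assumed shape (not real GE classes).
fake_result = SimpleNamespace(
    run_results={
        "batch-a": {
            "validation_result": SimpleNamespace(meta={"run_id": "run-1"}, success=True)
        },
        "batch-b": {
            "validation_result": SimpleNamespace(meta={"run_id": "run-2"}, success=False)
        },
    }
)

# A trivial renderer for illustration; each item it receives has a .meta attribute.
summary = render_each(fake_result, lambda vr: f"{vr.meta['run_id']}: {vr.success}")
print(summary)
```

In the solid, render_one would wrap the ValidationResultsPageRenderer and DefaultMarkdownPageView calls from the snippet above, applied to one validation result at a time instead of to the whole checkpoint result.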