Eduardo Santizo
03/24/2021, 9:27 PM
yuhan
03/24/2021, 9:52 PM
Eduardo Santizo
03/24/2021, 9:56 PM
max
03/24/2021, 10:11 PM
Eduardo Santizo
03/24/2021, 11:24 PM
max
03/25/2021, 12:31 AM
pip uninstall great-expectations && pip install great-expectations==0.12.10
should do it
Eduardo Santizo
03/25/2021, 12:52 AM
max
03/25/2021, 1:11 AM
Jeff Hulbert
03/25/2021, 12:51 PM
Eduardo Santizo
03/25/2021, 3:05 PM
Jeff Hulbert
03/25/2021, 8:29 PM
from dagster import Field, ModeDefinition, check, composite_solid, lambda_solid, solid
from dagster.utils import file_relative_path
from dagster_ge import ge_validation_solid_factory
from dagster_ge.factory import ge_data_context
# Assumption: DataFrame is the dagster_pandas type; the original snippet did not show this import.
from dagster_pandas import DataFrame

# load_config, download_table_solid and load_df_to_db_solid are defined elsewhere in the project.

GE_PROJECT_DIR = file_relative_path(__file__, "../great_expectations")
GE_DATASOURCE_NAME = "pandasdata"

# Point the GE data context resource at the project's great_expectations folder
ge_data_context_conf = ge_data_context.configured({"ge_root_dir": GE_PROJECT_DIR})

local_mode = ModeDefinition(
    name="local",
    resource_defs={
        "ge_data_context": ge_data_context_conf,
    },
)


@lambda_solid
def continue_if_validated(df: DataFrame, expectation) -> DataFrame:
    # Pass the dataframe through only when the GE validation succeeded
    if expectation["success"]:
        return df
    else:
        raise ValueError("GE Validation for dataset failed, see previous step")


def load_table_factory(table_name):
    check.str_param(table_name, "table_name")

    @composite_solid(
        name=f"load_table_{table_name}",
        config_schema={
            "table_name": Field(
                str,
                is_required=False,
                default_value=table_name,
            ),
        },
        config_fn=load_config,
    )
    def load_table_solid():
        """Download table, validate if great expectation suite exists, load it to database"""
        ge_suite_name = f"{table_name}.fail"
        validate_solid = ge_validation_solid_factory(
            name=f"validate_{table_name}",
            datasource_name=GE_DATASOURCE_NAME,
            suite_name=ge_suite_name,
        )
        # Alias the gate so each table gets a uniquely named solid
        continue_load_solid = continue_if_validated.alias(
            f"continue_if_validated_{table_name}"
        )
        df = download_table_solid()
        return load_df_to_db_solid(continue_load_solid(df, validate_solid(df)))

    return load_table_solid
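The gating step above can also be tried on its own, without Dagster. This is only a minimal sketch: it assumes the validation result is dict-like with a "success" key, as in the snippet above, and gate_on_validation and the sample rows are hypothetical names used for illustration.

```python
from typing import Any, Mapping


def gate_on_validation(data: Any, validation_result: Mapping[str, Any]) -> Any:
    """Return the data unchanged if validation succeeded, otherwise raise."""
    # Assumption: the result exposes a boolean under the "success" key.
    if validation_result.get("success"):
        return data
    raise ValueError("GE Validation for dataset failed, see previous step")


# Hypothetical sample data standing in for the downloaded table
rows = [{"id": 1}, {"id": 2}]

# A successful validation passes the data straight through
passed = gate_on_validation(rows, {"success": True})

# A failed validation stops the load with an error
try:
    gate_on_validation(rows, {"success": False})
    failed_message = None
except ValueError as exc:
    failed_message = str(exc)
```

Raising instead of returning a flag means the downstream load step never runs on unvalidated data.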
Eduardo Santizo
03/25/2021, 9:43 PM
from dagster import ModeDefinition, pipeline
from dagster.utils import file_relative_path
from dagster_ge import ge_validation_solid_factory
from dagster_ge.factory import ge_data_context

# retrieve_CAMP_ids, scrape_CAMPs and validate_and_save are solids defined elsewhere.

# Path to the great expectations folder in the local directory
ge_project_dir = file_relative_path(__file__, "./great_expectations")

# Data source name in the expectation suite
ge_datasource_name = "test_datasource"

# Data context configuration of the root directory
ge_data_context_conf = ge_data_context.configured({"ge_root_dir": ge_project_dir})

# Basic mode definition for the great expectations data context configuration
basic_mode = ModeDefinition(
    name="basic",
    resource_defs={
        "ge_data_context": ge_data_context_conf,
    },
)

# Definition of the great expectations validation solid
ge_validate_CAMPR3 = ge_validation_solid_factory(
    name="ge_validation_solid",
    datasource_name=ge_datasource_name,
    suite_name="test_suite",
)


# Pipeline definition
@pipeline(
    # The following lines are needed for the great expectations integration
    mode_defs=[basic_mode],
)
def CAMPR3_pipeline():
    ids = retrieve_CAMP_ids()
    df = scrape_CAMPs(ids)
    validate_and_save(df, ge_validate_CAMPR3(df))
This is all the code related to the great expectations integration, but I can't seem to find the cause of the error related to the "checkpoint_store_error"
max
03/26/2021, 4:26 PM
Does checkpoint_store_name appear in your validation suite?
Eduardo Santizo
03/26/2021, 4:45 PM
max
03/26/2021, 6:11 PM
Dagster Bot
03/26/2021, 6:11 PM
dansasbu
08/27/2021, 1:56 AM
AttributeError: 'DataContextConfig' object has no attribute 'validation_operators'
I have:
great-expectations 0.13.30
dagster 0.12.4
dagster-ge 0.12.4
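To confirm which versions are actually active in the environment, the installed package metadata can be queried directly. This is a minimal sketch, assuming Python 3.8+ (for importlib.metadata); installed_version and minor_series are hypothetical helper names, not part of any of these libraries.

```python
from importlib import metadata
from typing import Optional

# Package names as listed above
PACKAGES = ("great-expectations", "dagster", "dagster-ge")


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def minor_series(version: str) -> str:
    """Reduce a version such as '0.13.30' to its minor series, '0.13'."""
    return ".".join(version.split(".")[:2])


for name in PACKAGES:
    version = installed_version(name)
    if version is None:
        print(f"{name}: not installed")
    else:
        print(f"{name}: {version} (series {minor_series(version)})")
```

Comparing the reported series for great-expectations against the one dagster-ge was written for makes a mismatch easy to spot before running a pipeline.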
It seems that GE 0.12.x doesn't have the V3 API:
great_expectations --v3-api init
Usage: great_expectations [OPTIONS] COMMAND [ARGS]...
Try 'great_expectations --help' for help.
Error: no such option: --v3-api
This is an issue for anyone following the GE tutorial.
@max