Stephen Bailey

10/13/2022, 5:27 PM
Would love to have a simple function that would "smart-cast" dictionaries into metadata for assets. We have several flows where we produce a dictionary of metadata about an object, and then log it, like so:
import requests
from dagster import asset

@asset
def my_asset(context):
    # code to actually create the asset
    object_id = do_something()

    # code to do a /describe call from an api
    details = requests.get(f"some_url/{object_id}").json()
    context.add_output_metadata(details)

    return object_id
However, we frequently hit errors in the ops this way, especially around array values:
dagster._core.errors.DagsterInvalidMetadata: Could not resolve the metadata value for "public_access_id" to a known type. Its type was <class 'NoneType'>. Consider wrapping the value with the appropriate MetadataValue type.
Would be great to have a sanitize_metadata argument that by default drops unparseable values, or something.
Given that it's an issue with the metadata and not the actual code being executed, it's annoying when it hard-fails the op.
geoHeil

10/13/2022, 8:20 PM
Or it even bricks the Dagit UI, as JavaScript is less lenient than Python with regard to JSON-like data.
Stephen Bailey

10/14/2022, 11:58 AM
here's my simplistic workaround right now:
import logging

logger = logging.getLogger(__name__)

def sanitize_metadata_dict(metadata_dict: dict) -> dict:
    """Sanitize a raw Python dict so that it can be cast to Dagster metadata."""
    sanitized = {}
    for k, v in metadata_dict.items():
        if isinstance(v, (str, int, float)):
            # Strings and numbers are kept as-is
            sanitized[k] = v
        elif isinstance(v, dict):
            # Nested dicts are sanitized recursively
            sanitized[k] = sanitize_metadata_dict(v)
        else:
            # Everything else (None, lists, ...) is dropped
            logger.debug("Noncompliant metadata value for key %r (%s). Removing.", k, type(v))
    return sanitized
chris

10/18/2022, 9:35 PM
Perhaps doesn't solve the general annoyance, but we've been considering adding a NoneableMetadataValue that could help solve the particular issue with None. Added an issue for this: https://github.com/dagster-io/dagster/issues/10080
Stephen Bailey

10/19/2022, 1:17 PM
awesome, yeah, that would help. The one I most frequently hit is with list, would be great to handle that too
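For the list case raised above, one possible stopgap is to extend the sanitizer so that lists are JSON-encoded into strings instead of being dropped. This is only a sketch in plain Python: the name sanitize_metadata_with_lists is hypothetical, and it assumes that a JSON string is an acceptable stand-in for the original list in the metadata. None and any other unsupported values are still dropped, as in the workaround earlier in the thread.

```python
import json
from typing import Any, Dict


def sanitize_metadata_with_lists(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: keep scalars, recurse into dicts, JSON-encode lists."""
    sanitized: Dict[str, Any] = {}
    for key, value in raw.items():
        if isinstance(value, (str, int, float)):
            # Strings and numbers pass through unchanged
            sanitized[key] = value
        elif isinstance(value, dict):
            # Nested dicts are sanitized recursively
            sanitized[key] = sanitize_metadata_with_lists(value)
        elif isinstance(value, (list, tuple)):
            # Assumption: a JSON string is good enough to represent the list
            try:
                sanitized[key] = json.dumps(value)
            except (TypeError, ValueError):
                # The list holds something json cannot encode; drop it
                continue
        # None and any other type fall through and are dropped
    return sanitized


example = sanitize_metadata_with_lists(
    {
        "name": "x",
        "tags": ["a", "b"],
        "public_access_id": None,
        "nested": {"n": 1, "bad": None},
        "handles": [object()],
    }
)
```

The result could then be passed to context.add_output_metadata in place of the raw dict, since every remaining value is a string, a number, or a nested dict of those.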