#ask-community
geoHeil

05/26/2022, 12:46 PM
I think I broke dagit for some specific dataset/metadata which was logged as an asset materialization: SyntaxError: JSON.parse: unexpected character at line 1 column 950 of the JSON data. This is the error message the browser shows me, but dagit completely vanishes/goes white (only for this specific asset, though).
The reason was: I was logging some JSON which contained empty keys and empty values (accidentally).
sean

05/26/2022, 12:53 PM
Thanks for this report. It looks like you’ve got a JSON metadata entry somewhere that is invalid JSON; we should be catching this server-side. Do you have any idea what that JSON value might be?
geoHeil

05/26/2022, 12:53 PM
still it would be nice if dagit does not vanish
😉 race condition
sean

05/26/2022, 12:53 PM
Ah good ok
Yeah I’ll create an issue for this, we should catch this gracefully.
geoHeil

05/26/2022, 12:54 PM
agreed - it would be nice if you could catch this
thanks
Please also share it here so I can track it.
sean

05/26/2022, 12:57 PM
when you say empty keys/values do you mean undefined, null, or empty string?
geoHeil

05/26/2022, 12:57 PM
empty (for both)
but it would be cool if you would catch the more generic parsing failure as well.
sean

05/26/2022, 12:58 PM
The error you posted is a parse failure, but empty key/value pairs are valid JSON:
JSON.parse(JSON.stringify({ [""]: "" }))
geoHeil

05/26/2022, 12:59 PM
No, the JSON overall was more complex. Let me give you a more specific example in a minute:
import pandas as pd

df = pd.DataFrame({'foo':[1,2,3], 'bar':[4,5,6]})
m_derived = df.dtypes.astype('str').rename('dtype').to_frame()
d_c = 'description'

m_derived.loc['foo', d_c] = 'Foo description'
# notice bar is missing accidentally
m_derived.loc['', d_c] = ''

display(m_derived)
column_details = m_derived.reset_index().rename(columns={'index':'column_name'}).to_dict(orient='records')
column_details

m_res = {
    'source_upstream': {
        'name': 'my_dummy_name',
        'column_details': column_details,
    },
}
m_res

# Output:
{'source_upstream': {'name': 'my_dummy_name',
  'column_details': [{'column_name': 'foo',
    'dtype': 'int64',
    'description': 'Foo description'},
   {'column_name': 'bar', 'dtype': 'int64', 'description': nan},
   {'column_name': '', 'dtype': nan, 'description': ''}]}}
sean

05/26/2022, 1:02 PM
Thanks, pretty sure the issue here is the NaN, not the empty strings.
geoHeil

05/26/2022, 1:02 PM
I had logged the m_res
sean

05/26/2022, 1:04 PM
IIRC Python treats NaN as valid JSON even though it technically isn’t
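For reference, the behavior described here can be reproduced with the standard library alone. This is a minimal sketch; the `payload` dict is a made-up stand-in for one of the `column_details` records above, not the actual logged metadata.

```python
import json
import math

# Illustrative record (assumption): mirrors the 'bar' row from the example above.
payload = {"column_name": "bar", "dtype": "int64", "description": float("nan")}

# By default json.dumps writes the bare token NaN, which is not valid JSON.
encoded = json.dumps(payload)

# Python's own parser accepts that token back, so the problem stays hidden
# until a strict parser (such as the browser's JSON.parse) reads the string.
decoded = json.loads(encoded)

# allow_nan=False makes the serializer strict: it raises ValueError instead.
try:
    json.dumps(payload, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

print(encoded)
print(type(strict_error).__name__)
```

Passing `allow_nan=False` when serializing is one way to surface the problem in Python rather than in the browser.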
geoHeil

05/26/2022, 1:04 PM
correct
sean

05/26/2022, 1:06 PM
ok so it looks like we need to figure out how to handle NaNs in JSON values server-side
@Dagster Bot issue Handle NaN in JSON metadata values on server
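Until something like that exists server-side, one possible workaround on the user side is to replace non-finite floats with None before logging the metadata. The sketch below is only an illustration: `replace_non_finite` is a hypothetical helper, not a Dagster API, and the `m_res` literal is retyped from the example above.

```python
import json
import math


def replace_non_finite(value):
    """Recursively replace NaN and +/-inf floats with None (hypothetical helper)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: replace_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [replace_non_finite(item) for item in value]
    return value


nan = float("nan")
# Same shape as the m_res dict from the example above.
m_res = {
    "source_upstream": {
        "name": "my_dummy_name",
        "column_details": [
            {"column_name": "foo", "dtype": "int64", "description": "Foo description"},
            {"column_name": "bar", "dtype": "int64", "description": nan},
            {"column_name": "", "dtype": nan, "description": ""},
        ],
    },
}

clean = replace_non_finite(m_res)

# allow_nan=False would raise ValueError if any NaN had survived the cleanup.
encoded = json.dumps(clean, allow_nan=False)
print(encoded)
```

The empty-string key and value pass through unchanged, which matches the point made earlier in the thread that they are valid JSON; only the NaN entries become null.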
Dagster Bot

05/26/2022, 1:07 PM