# dagster-releases

Colton Padden

02/22/2024, 10:16 PM
🚀 Weekly Release Highlights: 1.6.6 and 0.22.6 🚀
• 🔍 In Dagster Cloud, a new feature flag lets you enable an overhauled asset overview page showing asset health, properties, and column schema.
• 🐍 Dagster officially supports Python 3.12!
• 🐻‍❄️ dagster-polars has been added as an integration. Thanks @Daniel Gafni!
• dbt @dbt_assets now supports loading projects with semantic models or model versions.
• greenlight You can now include FreshnessPolicys on observable source assets. [Experimental]
What was your favorite change in this release? React to this message with its emoji!
🔍 5
dbt 4
greenlight 4
🐍 10
🐻‍❄️ 8

Daniel Gafni

02/22/2024, 10:46 PM
The new asset page is really good! Big 👍 for displaying asset config there. The searchable fields (and table schema fields) are useful too.
dagster yay 1
The only problem I see with it is the interaction with partitioned assets. As I understand it, the metadata of a single partition (the most recently materialized one) is displayed as the asset metadata. This can cause some confusion.
Also, can you please clarify what "column schema" is? It's empty for me even tho I have table/schema metadata logged.
Also, upstream/downstream depth controls seem to be missing in the Lineage view?
👍 1

David Gasquez

02/23/2024, 10:09 AM
What a cool release! I thought `dagster-polars` was already an integration though! Might have confused it with the DuckDB Polars one

Daniel Gafni

02/23/2024, 10:18 AM
The wording might be a bit unclear. What happened is that dagster-polars got merged into the main Dagster repo. Previously it was maintained separately.
dagster yay 3

josh

02/23/2024, 4:54 PM
> The only problem I see with it is the interaction with partitioned assets.
Thanks @Daniel Gafni! How would you like to see partition metadata handled here? Should we try to merge all partitions in some way or would you rather just be able to filter by a single partition?

sandy

02/23/2024, 4:56 PM
hey @Daniel Gafni - currently you need to log it using the "columns" metadata key

Daniel Gafni

02/23/2024, 5:30 PM
Oh, I’ve been using “table”. May I ask if this convention is final? I understand logging just the schema to “columns”, but a table with some records is no longer just columns. Perhaps Dagster could auto-detect whether a table schema or a table has been logged instead of relying just on the “columns” key?

sandy

02/23/2024, 9:32 PM
what exactly is contained in that table value? is it a `TableMetadataValue`?

Daniel Gafni

02/23/2024, 10:31 PM
Yes, and it has schema attached

sandy

02/23/2024, 10:34 PM
we don't want to assume that any `TableMetadataValue` (or `TableSchema` for that matter) included in the metadata contains the schema for the asset. e.g. a `TableMetadataValue` might include a data sample that excludes fields with private information. or it might be the result of the Pandas `describe` invocation, i.e. contains a table of statistics about the asset

Daniel Gafni

02/23/2024, 10:53 PM
Alright, so all I have to do is change the metadata key from “table” to “columns”? P.S. excellent idea about logging stats as a table. I’ve been using json previously.

sandy

03/04/2024, 5:54 PM
Right - and make sure the value is a `TableSchema`

Daniel Gafni

03/04/2024, 7:28 PM
Oh, I’ve actually logged it as a table… Can Dagster also support this case? I don’t see why I should log the schema twice: once as part of the table and once as a separate metadata value.

sandy

03/05/2024, 12:05 AM
I will raise that with the team, but the current answer is basically that a metadata named "columns" expects a set of columns
👍 1

Daniel Gafni

03/14/2024, 8:17 PM
Hey @sandy, did you guys come up with anything? Should Dagster also recognize a logged table and not just a schema?