Baris Cekic
08/24/2022, 8:24 AM
assets with pyspark + deltalake. I am trying to understand how I can materialize assets as deltalake tables to s3.
Zach
08/24/2022, 4:09 PM
obj.write.format("delta").save(path)
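A minimal sketch of how the one-liner above might be wrapped for reuse, for example from the `handle_output` method of a custom Dagster IO manager. The helper names `delta_table_path` and `write_delta`, the bucket and prefix values, and the `s3a://` scheme (Hadoop S3A connector) are assumptions for illustration and do not come from the thread. The sketch also assumes the Spark session already has the Delta Lake package and S3 credentials configured, which is not shown here.

```python
def delta_table_path(bucket, prefix, asset_key_parts):
    """Build an s3a:// URI for one asset's Delta table (hypothetical layout)."""
    # Drop empty segments and stray slashes so the path has no "//" inside it.
    parts = [p.strip("/") for p in [prefix, *asset_key_parts] if p and p.strip("/")]
    return "s3a://" + "/".join([bucket, *parts])


def write_delta(df, path, mode="overwrite"):
    """Write a Spark DataFrame to `path` in Delta format.

    `df` is expected to be a pyspark.sql.DataFrame; `mode` is passed to
    DataFrameWriter.mode ("overwrite", "append", ...).
    """
    df.write.format("delta").mode(mode).save(path)
    return path
```

With a real DataFrame this could be called as `write_delta(df, delta_table_path("my-bucket", "lake", ["people"]))`. Overwrite mode is only a guess at a sensible default for re-materializing an asset; append may fit other cases better.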
Baris Cekic
08/24/2022, 4:34 PM
spark now
https://dagster.slack.com/archives/C01U954MEER/p1661339761830559
Zach
08/24/2022, 4:37 PM
Baris Cekic
08/24/2022, 4:39 PM
2022-08-24 11:52:39 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 1 - ENGINE_EVENT - Executing steps using multiprocess executor: parent process (pid: 1)
2022-08-24 11:52:39 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 1 - make_people - STEP_WORKER_STARTING - Launching subprocess for "make_people".
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/lib/python3.7/site-packages/pyspark/jars/spark-unsafe_2.12-3.2.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/08/24 11:53:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/08/24 11:53:14 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2022-08-24 11:53:18 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - STEP_WORKER_STARTED - Executing step "make_people" in subprocess.
2022-08-24 11:53:18 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - RESOURCE_INIT_STARTED - Starting initialization of resources [io_manager].
2022-08-24 11:53:18 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - RESOURCE_INIT_SUCCESS - Finished initialization of resources [io_manager].
2022-08-24 11:53:18 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - LOGS_CAPTURED - Started capturing logs for step: make_people.
2022-08-24 11:53:18 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - STEP_START - Started execution of step "make_people".
2022-08-24 11:53:44 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - STEP_OUTPUT - Yielded output "result" of type "DataFrame". (Type check passed).
[Stage 0:> (0 + 0) / 1]
[Stage 0:> (0 + 1) / 1]
2022-08-24 11:54:08 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - HANDLED_OUTPUT - Handled output "result" using IO manager "io_manager"
2022-08-24 11:54:08 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 113 - make_people - STEP_SUCCESS - Finished execution of step "make_people" in 50.23s.
2022-08-24 11:54:09 +0000 - dagster - DEBUG - make_and_filter_data - 93eb470b-5ce0-4d38-b1dd-57e308cb9db0 - 1 - filter_over_50 - STEP_WORKER_STARTING - Launching subprocess for "filter_over_50".
22/08/24 11:54:10 WARN AbstractConnector:
java.io.IOException: Thread signal failed
at java.base/sun.nio.ch.NativeThread.signal(Native Method)
at java.base/sun.nio.ch.ServerSocketChannelImpl.implCloseSelectableChannel(ServerSocketChannelImpl.java:365)
at java.base/java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:242)
at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:112)
at org.sparkproject.jetty.server.ServerConnector.close(ServerConnector.java:371)
at org.sparkproject.jetty.server.AbstractNetworkConnector.shutdown(AbstractNetworkConnector.java:104)
at org.sparkproject.jetty.server.Server.doStop(Server.java:444)
at org.sparkproject.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:94)
Zach
08/24/2022, 4:42 PM
Baris Cekic
08/24/2022, 5:00 PM
Zach
08/24/2022, 5:08 PM
Baris Cekic
08/24/2022, 5:10 PM
Zach
08/24/2022, 6:09 PM
Baris Cekic
08/25/2022, 6:18 PM
Failed to connect to dagster-run-66fd6f2a-84d1-4edc-816a-07a1e56813a1-7527w:34517
Zach
08/25/2022, 6:31 PM
Baris Cekic
08/25/2022, 7:53 PM