
Félix Tremblay

12/04/2022, 3:57 PM
Hello 👋, we are using Dagster Serverless, and we are having difficulties with the bandwidth limitation (maximum of 100 GB of bandwidth per day). We need to build a daily ETL process that involves downloading and uploading about 70GB of data each day. The problem is that we will also have to backfill more than 2 years of historical data (about 60TB in total). Is there a chance the bandwidth limit could be increased so we can perform the backfill using Dagster Serverless?

Max Wong

12/05/2022, 6:57 AM
this doesn't really answer your question, but I think [task compute] should be separated from [dagster instance]. Doing the computation on the Dagster instance can lead to OOMs if it isn't optimized. Plus, running tasks outside of Dagster means you don't have to deal with scaling the Dagster instance to handle more load

jordan

12/05/2022, 2:57 PM
Similarly, we'd recommend using Dagster to orchestrate computation that's happening within your own cloud. We consider Dagster Cloud Serverless to be a great choice primarily for orchestration and light computation: https://docs.dagster.io/dagster-cloud/deployment/serverless#when-to-choose-serverless But moving 60TB of data is beyond what we currently plan to support for it.
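A minimal sketch of that pattern, assuming the heavy transfer is packaged as a job in your own AWS account (the AWS Batch setup, the queue name `etl-queue`, and the job definition `copy-partition` are hypothetical, not from this thread): the partitioned asset only submits the job, so the ~70GB per day moves inside your own cloud rather than through the Serverless runtime.

```python
import boto3
from dagster import DailyPartitionsDefinition, OpExecutionContext, asset

daily = DailyPartitionsDefinition(start_date="2020-12-01")

@asset(partitions_def=daily)
def raw_data(context: OpExecutionContext) -> None:
    """Kick off the heavy transfer in our own AWS account."""
    batch = boto3.client("batch")
    # The actual download/upload runs inside the Batch container, so the
    # bytes never pass through the Serverless runtime; only this small
    # API call counts against Dagster's bandwidth limit.
    resp = batch.submit_job(
        jobName=f"copy-{context.partition_key}",
        jobQueue="etl-queue",            # hypothetical Batch queue
        jobDefinition="copy-partition",  # hypothetical job definition
        parameters={"date": context.partition_key},
    )
    context.log.info(f"submitted AWS Batch job {resp['jobId']}")
```

With this shape, backfilling the 2+ years of daily partitions just fans out Batch submissions instead of data transfers, which is what keeps it within Serverless limits.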

Félix Tremblay

12/08/2022, 9:25 PM
thanks @jordan, you're right. For the heavy data movement we should instead use a data integration tool such as Hevo or Airbyte