# dagster-support

Lindsay S

01/09/2023, 3:38 PM
Good morning! I have been testing out Dagster deployed on EC2 (trying to switch to ECS, but EC2 for now). I have a very small cluster for testing, but I keep running into issues with running out of space, and I think it is because of log files being written (I am using Postgres for metadata, and right now I am only materializing assets from an API to S3 and reading in one Snowflake table as a dependency with the I/O manager). The problem I am having is that when I add `dagster_aws.s3.compute_log_manager` to my `dagster.yaml` file and restart my Docker containers, Dagit immediately crashes and the daemon crashes and tries to restart; this happens both locally and in the cloud. Do I need to do anything additional to make this work?
```yaml
compute_logs:
  module: dagster_aws.s3.compute_log_manager
  class: S3ComputeLogManager
  config:
    bucket: "my-bucket-name"
    local_dir: "/tmp/compute-logs-stg"
    prefix: "dagster-compute-logs"
    skip_empty_files: false
    upload_interval: 30
```
Here is the error I get (please let me know if this is something other than what I think it is, i.e., log files filling the disk):
```
OSError: [Errno 28] No space left on device: '/opt/dagster/dagster_home/storage/partitioned-asset-name/2022-12-06'
 File "/usr/local/lib/python3.7/site-packages/dagster/_core/execution/plan/utils.py", line 52, in solid_execution_error_boundary
  yield
 File "/usr/local/lib/python3.7/site-packages/dagster/_utils/__init__.py", line 460, in iterate_with_context
  next_output = next(iterator)
 File "/usr/local/lib/python3.7/site-packages/dagster/_core/execution/plan/execute_step.py", line 626, in _gen_fn
  gen_output = output_manager.handle_output(output_context, output.value)
 File "/usr/local/lib/python3.7/site-packages/dagster/_core/storage/upath_io_manager.py", line 231, in handle_output
  self.dump_to_path(context=context, obj=obj, path=path)
 File "/usr/local/lib/python3.7/site-packages/dagster/_core/storage/fs_io_manager.py", line 148, in dump_to_path
  with path.open("wb") as file:
 File "/usr/local/lib/python3.7/pathlib.py", line 1208, in open
  opener=self._opener)
 File "/usr/local/lib/python3.7/pathlib.py", line 1063, in _opener
  return self._accessor.open(self, flags, mode)
```

daniel

01/09/2023, 7:53 PM
Hi Lindsay - I think this is actually coming from your IO manager: it looks like the outputs from your assets/ops are what's filling up the filesystem. Is switching to the S3 IO manager an option here? https://docs.dagster.io/deployment/guides/aws#using-s3-for-io-management
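A minimal sketch of what that swap could look like, assuming the `s3_pickle_io_manager` and `s3_resource` from `dagster_aws.s3`; the asset name, bucket, and prefix below are placeholders:

```python
from dagster import Definitions, asset
from dagster_aws.s3 import s3_pickle_io_manager, s3_resource


@asset
def my_api_asset():
    # Whatever this returns is handed to "io_manager", which now pickles it
    # to S3 instead of the local filesystem.
    return {"example": "payload"}


defs = Definitions(
    assets=[my_api_asset],
    resources={
        # Outputs land under s3://my-bucket-name/dagster-io/... (placeholder names).
        "io_manager": s3_pickle_io_manager.configured(
            {"s3_bucket": "my-bucket-name", "s3_prefix": "dagster-io"}
        ),
        # The S3 IO manager uses this resource for its boto3 client.
        "s3": s3_resource,
    },
)
```

With that binding in place, asset outputs are pickled to S3 under the configured prefix rather than to `$DAGSTER_HOME/storage`, so large outputs stop consuming local disk.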

Lindsay S

01/09/2023, 10:15 PM
Hi Daniel, that makes sense. I wonder what the difference is between the S3 IO manager and using boto to write to S3? I actually have been meaning to refactor and use the built-in I/O manager, but I am just curious whether this is really the problem or whether I might simply not have enough disk space. Does the built-in manager do something under the hood to avoid running out of space?

daniel

01/09/2023, 10:16 PM
If you have large outputs from your assets/ops, the built-in fs_io_manager (which I see in that stack trace) pickles them and writes them to disk. It's possible that other things filled up the disk originally, but it's the writing of outputs that is hitting the disk errors now. I'm not aware of anything the built-in IO manager does to avoid running out of space, but it's a lot harder to run out of space in S3.
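To make the contrast concrete, here is a rough sketch of both approaches; the asset names, bucket, key, and `build_payload` helper are invented for illustration:

```python
import boto3
from dagster import asset


def build_payload() -> bytes:
    # Stand-in for the real API extraction / transformation logic.
    return b"x" * 10_000_000


@asset
def handled_by_io_manager() -> bytes:
    # Returning the value hands it to the bound IO manager. With the default
    # fs_io_manager, it gets pickled under $DAGSTER_HOME/storage/..., which is
    # the path shown in the "No space left on device" traceback above.
    return build_payload()


@asset
def written_directly_with_boto3():
    # Uploading inside the asset body keeps the heavy bytes out of the IO
    # manager entirely; nothing large is written to the local filesystem.
    boto3.client("s3").put_object(
        Bucket="my-bucket-name",           # placeholder bucket
        Key="dagster-output/payload.bin",  # placeholder key
        Body=build_payload(),
    )
```

Either way the heavy bytes end up in S3; the IO-manager route just lets Dagster handle serialization, path layout, and loading the value into downstream assets, whereas the boto3 route leaves all of that to the asset body.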

Lindsay S

01/09/2023, 10:20 PM
I see. Right now I am chunking the job and writing to S3 in batches, but I guess I'm testing on such a tiny system that it's just too small. Thanks, this makes sense.