#random

Alex Remedios

01/04/2021, 10:45 AM
Happy new year all, just wanted to drop in my xmas project which may be relevant to everyone containerising their pipelines… Whaler is a visual disk-usage analyser for docker images. It lets you see all the junk you may be inadvertently hauling around in your image, which may be costing you time and money.
celebrate 2
👍 2

mrdavidlaing

01/04/2021, 11:11 AM
Awesome! Can you post a screenshot of a dagster image?

Alex Remedios

01/04/2021, 12:50 PM
🙂 @mrdavidlaing
FROM python:3.7-slim
RUN pip install dagster

docker build . -t dag
whaler --image dag /
You can find a 14MB pip cache chilling in there amongst other goodies — obviously this is not dagster’s fault though.
🤔 1
🦜 1
thankyou 1
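
A minimal sketch of how the pip cache mentioned above could be kept out of the image, assuming the same two-line Dockerfile from this thread. `--no-cache-dir` is a standard pip option; how much space it actually saves for this image has not been measured here.

```dockerfile
FROM python:3.7-slim
# --no-cache-dir asks pip not to store its download/wheel cache,
# so the cache never ends up in this image layer.
RUN pip install --no-cache-dir dagster
```

Rebuilding with the same `docker build . -t dag` and re-running `whaler --image dag /` should show whether the cache directory is gone.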

mrdavidlaing

01/05/2021, 11:42 PM
Looks like there is a 34MB pip cache in the default dagster image -
whaler --image='dagster/k8s-example:latest'
@Alex Remedios Would you be interested in collaborating on building a “lighter” dagster image? If we found something workable perhaps we could submit a PR upstream

Alex Remedios

01/06/2021, 11:17 AM
sure, that sounds like a good challenge. What base image and version are you using?

mrdavidlaing

01/06/2021, 11:49 AM
I think the dagster/k8s-example image starts with python:3.7-slim
I’m wondering whether a good place to start is using the Docker Builder Pattern to run all the “intermediate” steps, ending with one that deletes a bunch of unnecessary files (as revealed by whaler).
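
One way the builder pattern described above might look, expressed as a Docker multi-stage build. This is only a sketch under assumptions: the `/opt/venv` path, the `builder` stage name, and the `__pycache__` cleanup are illustrative choices, not the list of deletions discussed in this thread and not what the dagster/k8s-example image does.

```dockerfile
# Stage 1 (hypothetical name "builder"): do all intermediate work here.
FROM python:3.7-slim AS builder
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Install without keeping pip's cache.
RUN pip install --no-cache-dir dagster
# Example cleanup step: drop compiled bytecode directories.
# Replace with whatever deletions whaler shows to be worthwhile.
RUN find /opt/venv -type d -name "__pycache__" -prune -exec rm -rf {} +

# Stage 2: start from a clean base and copy only the finished virtualenv.
FROM python:3.7-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```

The deletions run in the builder stage on purpose: files removed by a later `RUN` in the final stage would still occupy space in the earlier layer, whereas files removed before the `COPY --from` never reach the final image.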

Alex Remedios

01/06/2021, 11:55 AM
yes that is a must
i’m compiling a list of possible deletions..
do you have some test cases? I actually haven’t used dagster for anything substantial yet. edit: it may be easier starting from nothing, using distroless

mrdavidlaing

01/06/2021, 12:11 PM
Nothing comprehensive. Idea: Let’s start a GitHub issue and pull in some of the dagster team for advice. Perhaps we can leverage some of the existing build/test infrastructure via a PR?

Alex Remedios

01/06/2021, 12:12 PM
sure i’ll cut one @mrdavidlaing https://github.com/dagster-io/dagster/pull/3501

mrdavidlaing

01/10/2021, 6:44 PM
thankyou Let’s move the conversation over to the PR