# random
a
Happy new year all, just wanted to drop in my xmas project which may be relevant to everyone containerising their pipelines… Whaler is a visual disk-usage analyser for docker images. It lets you see all the junk you may be inadvertently hauling around in your image, which may be costing you time and money.
m
Awesome! Can you post a screenshot of a dagster image?
a
🙂 @mrdavidlaing
FROM python:3.7-slim
RUN pip install dagster
docker build . -t dag
whaler --image dag /
You can find a 14MB pip cache chilling in there amongst other goodies — obviously this is not dagster’s fault though.
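A minimal sketch of one way to keep that cache out of the image, assuming pip's standard `--no-cache-dir` option is acceptable for the build (the flag is an addition here, not part of the snippet above):

```dockerfile
# Same base image as the example above
FROM python:3.7-slim
# --no-cache-dir tells pip not to store downloaded wheels in its cache,
# so the cache never lands in the image layer
RUN pip install --no-cache-dir dagster
```

Rebuilding with `docker build . -t dag` and re-running `whaler --image dag /` should show whether the cache is gone.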
m
Looks like there is a 34MB pip cache in the default dagster image -
whaler --image='dagster/k8s-example:latest'
@Alex Remedios Would you be interested in collaborating on building a “lighter” dagster image? If we found something workable perhaps we could submit a PR upstream
a
sure, that sounds like a good challenge. What base image and version are you using?
m
I think the dagster/k8s-example image starts with python:3.7-slim
I’m wondering whether a good place to start is by using the Docker Builder Pattern to run all the “intermediate” steps; ending with one that deletes a bunch of unnecessary files (as revealed by whaler)
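A rough sketch of what that could look like as a multi-stage build. The stage name `builder`, the `/opt/venv` path and the `__pycache__` cleanup are illustrative assumptions, not taken from the actual dagster/k8s-example Dockerfile:

```dockerfile
# Build stage: run all the "intermediate" steps here
FROM python:3.7-slim AS builder
# Install into an isolated virtualenv (hypothetical path) so it can be copied as one unit
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --no-cache-dir dagster
# Example cleanup only; the real deletion list would come from what whaler reveals
RUN find /opt/venv -type d -name '__pycache__' -prune -exec rm -rf {} +

# Final stage: start clean and copy over only the finished virtualenv
FROM python:3.7-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```

The reason for copying into a fresh stage is that image layers are additive: deleting files in a later `RUN` step of a single-stage build hides them but does not shrink the image, whereas the final stage here only contains what was explicitly copied in.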
a
yes that is a must
i’m compiling a list of possible deletions..
do you have some test cases? I actually haven’t used dagster for anything substantial yet. e: it may be easier starting from nothing, using distroless
m
Nothing comprehensive. Idea: Let’s start a GitHub issue and pull in some of the dagster team for advice. Perhaps we can leverage some of the existing build/test infrastructure via a PR?
a
sure i’ll cut one @mrdavidlaing https://github.com/dagster-io/dagster/pull/3501
m
thankyou Let’s move the conversation over to the PR