Oren Lederman
01/12/2023, 5:10 AM
Andrea Giardini
01/12/2023, 6:08 AM
Oren Lederman
01/12/2023, 7:21 PM
daniel
01/13/2023, 2:47 AM
def docker_sidecar_k8s_config_tag(timeout: int = 60 * 60 * 4):
    """
    k8s config that sets up a sidecar container.

    Most of the complexity here is due to the poor support for cleaning up after sidecar
    containers in k8s when the main container has finished - we write to a file on process
    termination, and have logic in the sidecar container to shut down the sidecar as soon as
    that file exists.

    See <https://github.com/kubernetes/kubernetes/issues/25908#issuecomment-252089871>
    where this strategy was recommended by one of the k8s maintainers in 2016; no significant
    progress seems to have been made on this issue since then.
    """
    return {
        "container_config": {
            "volume_mounts": [
                {
                    # polled in the entrypoint to know when to shut down the container
                    "name": "sidecar-storage",
                    "mount_path": "/usr/share/pod",
                }
            ],
        },
        "pod_spec_config": {
            "containers": [
                {
                    "name": "sidecar",
                    "image": "sidecar_image_here",
                    "volume_mounts": [
                        {
                            "name": "sidecar-storage",
                            # polled in the entrypoint to know when to shut down the container
                            "mount_path": "/usr/share/pod",
                        }
                    ],
                    "command": ["/bin/sh", "-c"],
                    "args": [
                        f"""dockerd-entrypoint.sh &
                        sleep_interval=5
                        timeout={timeout}
                        i=0
                        while [ $i -le $(( timeout/$sleep_interval )) ]; do
                            if test -f /usr/share/pod/done; then
                                echo "Dagster pod finished, exiting"
                                exit 0
                            fi
                            echo 'Waiting for the dagster pod to finish...'
                            sleep $sleep_interval
                            i=$(( i + 1 ))
                        done
                        echo "Timed out waiting for Dagster pod to finish"
                        exit 1"""
                    ],
                }
            ],
            "volumes": [{"name": "sidecar-storage", "empty_dir": {}}],
        },
    }
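To make the timeout arithmetic in the sidecar's shell loop concrete, here is a small sketch in Python. The helper name `poll_budget` is hypothetical and not part of the snippet above; it only mirrors the loop condition `[ $i -le $(( timeout/$sleep_interval )) ]`, assuming shell integer division truncates and `i` starts at 0.

```python
from __future__ import annotations


def poll_budget(timeout: int = 60 * 60 * 4, sleep_interval: int = 5) -> tuple[int, int]:
    """Return (number_of_checks, worst_case_wait_seconds) for the sidecar loop.

    The loop bound is inclusive (-le), so the body runs
    floor(timeout / sleep_interval) + 1 times, sleeping once per
    iteration in which the done file is still absent.
    """
    checks = timeout // sleep_interval + 1
    worst_case_wait = checks * sleep_interval
    return checks, worst_case_wait


print(poll_budget())  # (2881, 14405)
```

With the defaults, the sidecar can therefore outlive the nominal four-hour timeout by one sleep interval before it exits with status 1.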
Then:
from dagster import job

@job(
    tags={"dagster-k8s/config": docker_sidecar_k8s_config_tag()},
)
def my_job():  # placeholder job name
    ...
Then:
import atexit
import os
from pathlib import Path

def signal_sidecar_finished() -> None:
    """Register this via atexit.register to ensure that a sidecar knows that the main process has finished."""
    if os.path.exists("/usr/share/pod"):
        print("Signaling that the dagster process has finished")
        Path("/usr/share/pod/done").touch()
    else:
        print("No /usr/share/pod folder on dagster process cleanup")

atexit.register(signal_sidecar_finished)
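The done-file handshake can be tried locally without Kubernetes. The following is a minimal sketch, not the code from this thread: `signal_finished`, `wait_for_done`, and `demo` are hypothetical names, a temporary directory stands in for the shared `/usr/share/pod` volume, and a timer thread stands in for the main Dagster process exiting.

```python
import tempfile
import threading
import time
from pathlib import Path


def signal_finished(pod_dir: Path) -> bool:
    """Touch <pod_dir>/done if the shared directory exists; report whether it was written."""
    if pod_dir.exists():
        (pod_dir / "done").touch()
        return True
    return False


def wait_for_done(pod_dir: Path, timeout: float, sleep_interval: float) -> int:
    """Python stand-in for the sidecar's shell loop: 0 once the done file exists, 1 on timeout."""
    i = 0
    while i <= int(timeout / sleep_interval):
        if (pod_dir / "done").is_file():
            return 0
        time.sleep(sleep_interval)
        i += 1
    return 1


def demo() -> int:
    """Run both sides of the handshake against a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        pod_dir = Path(tmp)
        # Simulate the main process finishing shortly after the sidecar starts polling.
        threading.Timer(0.05, signal_finished, args=(pod_dir,)).start()
        return wait_for_done(pod_dir, timeout=2.0, sleep_interval=0.01)


if __name__ == "__main__":
    print("sidecar exit code:", demo())
```

Passing the directory as a parameter, rather than hard-coding `/usr/share/pod`, is only there to make the sketch testable outside a pod.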
daniel
01/13/2023, 2:47 AM
Oren Lederman
01/13/2023, 4:10 AM
daniel
01/13/2023, 4:14 AM
daniel
01/13/2023, 4:14 AM
Oren Lederman
01/13/2023, 4:17 AM
daniel
01/13/2023, 4:40 AM
Andrea Giardini
01/13/2023, 7:50 AM
"Yeah, I was hoping to avoid doing that"
Why is that, if I can ask? I haven’t had any issues with this solution whenever I’ve used it in the past.
Oren Lederman
01/14/2023, 4:07 PM
Oren Lederman
01/14/2023, 4:08 PM
Andrea Giardini
01/14/2023, 5:03 PM
"Yes, it can autoscale, but sidecars scale linearly with the number of pods and are available right when the pods need them."
I actually think a deployment scales much better than sidecars in GKE. Every sidecar proxy opens a certain number of management connections, and (at least in my experience) you can run out of connections pretty fast with the sidecar pattern, since every sidecar always keeps some of them open.
"Yes, it can autoscale, but sidecars scale linearly with the number of pods and are available right when the pods need them."
That’s something that needs to be solved with NetworkPolicies.
Oren Lederman
01/14/2023, 6:18 PM
Oren Lederman
01/24/2023, 12:31 AM
Oren Lederman
01/26/2023, 6:27 PM
signal_sidecar_finished gets called after the first step is completed (based on the timestamps in the log). Can you think of a reason why this could happen?
daniel
01/26/2023, 6:27 PM
daniel
01/26/2023, 6:29 PM
Oren Lederman
01/26/2023, 7:02 PM
Andrea Giardini
01/26/2023, 7:07 PM
Oren Lederman
01/26/2023, 7:34 PM
istio, for example) that we might need in the near future, so it’s an opportunity to test whether this solution would work.