# deployment-kubernetes
m
Any advice on using fluentd/bit to upload JSON-formatted application logs from user code in k8s? The standard would be a daemonset, but that pulls from the container's stdout, which is a bit noisy. Our current (on-prem) setup is to have a separate fluentd process that tails a JSON-formatted log file. Is there support for the k8s agent to start multi-container pods for user code (I don't see anything in its Helm values)? Is it a terrible idea to just also run fluentbit in our application image?
d
you can use these tags to have multi-container pods for jobs: https://docs.dagster.io/deployment/guides/kubernetes/customizing-your-deployment#job-or-op-kubernetes-configuration Supporting that for every pod that the agent spins up, including the code servers, would be the same change that would enable tolerations
ty thankyou 1
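For reference, a minimal sketch of what that `dagster-k8s/config` tag can carry, rendered here as YAML for readability; in a job definition it would be the equivalent Python dict passed via `tags=`. The sidecar name, image, and volume are placeholders, and the snake_case keys follow the Kubernetes Python client fields used in the linked docs:

```yaml
# Sketch of a "dagster-k8s/config" tag value (shown as YAML; passed as a dict in Python).
dagster-k8s/config:
  pod_spec_config:
    # Extra containers become sidecars next to the user-code container.
    containers:
      - name: log-forwarder            # placeholder sidecar
        image: fluent/fluentd:v1.16-1  # illustrative image/tag
        volume_mounts:
          - name: shared-logs
            mount_path: /var/log/app
    volumes:
      - name: shared-logs              # placeholder shared volume
        empty_dir: {}
  container_config:
    # Mounts added to the user-code container itself.
    volume_mounts:
      - name: shared-logs
        mount_path: /var/log/app
```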
m
Coming back to this, I think you can actually configure a fluentd daemonset to use the tail plugin to read from a file path just fine. Then so long as the fluentd daemonset and the Dagster workspace pod have a shared volume to read/write from/to, that should work smoothly. (Not sure if a local or persistent volume makes a difference.) So I think the only thing needed from Dagster is specifying a volume for the pod via the agent's `workspace.volumes`. I'm not sure how that differs from volumes in the `@job` tags' `pod_spec_config`, but I like dealing with this in the Helm / Kubernetes arena rather than the `@job` arena if they're equivalent.
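A sketch of that idea in the agent's Helm values, assuming the chart exposes `workspace.volumes` / `workspace.volumeMounts` as standard Kubernetes volume definitions applied to the pods it launches; the names and paths here are placeholders:

```yaml
# Agent values.yaml (sketch; assumes workspace.volumes / workspace.volumeMounts)
workspace:
  volumes:
    - name: app-logs               # placeholder
      hostPath:
        path: /var/log/app         # node path the fluentd daemonset would also tail
        type: DirectoryOrCreate
  volumeMounts:
    - name: app-logs
      mountPath: /var/log/app      # where user code writes its JSON log file
```

A persistentVolumeClaim would slot into the same place if a hostPath turns out not to work across nodes.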
Following up again, how would I provide a `ConfigMap` to the job? I'm trying to follow this example for a fluentd sidecar. I think I can use the `@graph.to_job(tags=...)` to specify the additional `containers` and `volumes` for the pod, and the additional `volumeMount` for the user code container, but I'm not sure where a `ConfigMap` fits in. Here's what I have so far. Specifically, I'm looking for how to add in `fluentd.conf` data in the `configvol` volume, based on the `admin/logging/fluentd-sidecar-config.yaml` in the example.
Maybe what I need to do is separately publish the `ConfigMap` into my cluster, and then just reference it by name from my `@job`. It seems a little odd to have my `@job` know about my EKS cluster config, but also makes sense that `etcd` contents would not be defined on the job.
d
I think that's right - Dagster doesn't currently provide any functionality for creating additional k8s resources like configmaps for you
👍🏻 1
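A sketch of that split, with a hypothetical ConfigMap name and a minimal fluentd tail source along the lines of the Kubernetes sidecar example; the path and tag are placeholders:

```yaml
# Published to the cluster separately (e.g. kubectl apply -f), outside of Dagster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-sidecar-config       # hypothetical name
data:
  fluentd.conf: |
    <source>
      @type tail
      path /var/log/app/*.json
      pos_file /var/log/app/fluentd.pos
      tag app.*
      <parse>
        @type json
      </parse>
    </source>
```

The job side then only needs the name: the `configvol` entry under `pod_spec_config.volumes` would reference it with a `config_map` source pointing at `fluentd-sidecar-config`, so the `@job` never has to carry the config contents themselves.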
m
For posterity, here's what I got working:
• No extra tags on the Dagster `@job`. According to the documentation, these only affect the k8s run launcher / job-per-pod mode, and not the job-per-step mode, which is what I'm using.
• `volumes` and `volumeMounts` in the agent's Helm chart. These did add the host volume mount I needed, mapping `/var/log` into my step pods. So then my application code can write out a JSON-formatted log file that `fluent-bit` can see.
• `fluent-bit` as a daemonset. By default this maps `/var/log` into the fluent-bit container, so I didn't need to define extra volumes on that side. And the `fluent-bit` Helm chart has options for its config file, which it then publishes as a ConfigMap, so I didn't need to separately set up a ConfigMap.
All in all, pretty concise and nicely organized; for the most part the application code remains agnostic of the deployment configuration.
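For completeness, a sketch of the fluent-bit side under the upstream `fluent/fluent-bit` Helm chart, where the pipeline sections are supplied as strings under `config.*` and rendered into the chart's ConfigMap; the path, parser, and output are placeholders rather than a tested config:

```yaml
# fluent-bit values.yaml (sketch; assumes the chart's config.inputs / config.outputs keys)
config:
  inputs: |
    # Tail the JSON log file written by the step pods (path is a placeholder)
    [INPUT]
        Name    tail
        Path    /var/log/app/*.json
        Parser  json
        Tag     app.*
  outputs: |
    # Stand-in output; point this at your real log destination
    [OUTPUT]
        Name    stdout
        Match   app.*
```

The agent-side values are essentially the `workspace.volumes` sketch above, pointed at `/var/log`.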