# deployment-kubernetes
l
Hi @daniel, I have a few questions regarding k8s_job_op:
1. What unique benefits of Dagster will we lose, since it is just running an arbitrary image (link post https://dagster.slack.com/archives/C02LJ7G0LAZ/p1674240860046369?thread_ts=1674236799.847639&cid=C02LJ7G0LAZ)? Can you please explain more on this?
2. How do we pass parameters to, and receive output from, a k8s_job_op op definition?
3. Can this k8s_job_op only be used with K8sRunLauncher, not with CeleryK8sRunLauncher?
4. How will assets be written and consumed downstream?
5. How do we share config maps or environment values from the repository image with the k8s_job_op image?
6. Will the ops, graph, and job below work? @here
@op
def step1():
   return True

first_op = k8s_job_op.configured(
    {
        "image": "busybox",
        "command": ["/bin/sh", "-c"],
        "args": ["echo HELLO"],
    },
    name="first_op",
)


@graph
def my_graph():
    bool_val = step1()
    first_op()

How can we pass bool_val to first_op, and how do we get output from first_op?

@job
def full_job():
    my_graph()
j
You can find answers to most of your questions about k8s_job_op here - particularly the ones related to configuration: https://docs.dagster.io/_apidocs/libraries/dagster-k8s#dagster_k8s.k8s_job_op I’d say it’s a fairly specialized use case - in general, Dagster is written to cleanly separate business logic from environment. The same ops/assets can run locally, in K8s, on ECS, in Docker, etc. Whereas with the k8s_job_op, you can only launch the op in K8s. Most people reach for it if they’re coming from Airflow and are unfamiliar with Dagster’s core abstractions, or if they have some code they need to run that isn’t in Python. For the former, we just hosted an Airflow migration event that I’d encourage you to watch:

https://www.youtube.com/watch?v=VR98jpxREts

For the latter, dagster-shell is another way you can achieve that without explicitly binding your logic to its runtime environment: https://docs.dagster.io/_apidocs/libraries/dagster-shell
l
Hi @jordan, thank you for the reply. Can you please explain more on the unique benefits of Dagster we will lose, since it’s just running an arbitrary image? (Are they assets and the linkage between different ops?) This is based on @daniel’s comment.
So as per your reply, it just runs a shell script or Python script in an isolated manner, and we should not use this to run Dagster code inside a different image? Correct me if I am wrong.
j
Dagster is built to let you write business logic in pure Python - abstracted away from where it’ll run. When it comes time to deploy, you have many options for where that business logic can run: https://docs.dagster.io/deployment The k8s_job_op breaks that model by explicitly binding your business logic to K8s. It means you’d be unable to do things like run your code locally without also having a K8s cluster to run it against. You’d also be unable to take full advantage of many of Dagster’s core APIs - for example, you’d have to use quite a bit of indirection to pass a Dagster resource through to whatever container you’re orchestrating. In general, it’s an anti-pattern for using Dagster. You’d use it if you have a container you need to orchestrate as part of your job that isn’t otherwise able to be modeled with Dagster. But if you’re just looking for a way to run Dagster jobs on K8s, it’s extra complexity - Dagster already handles that for you.
f
i use it to run a spark-on-kubernetes job from within a dagster pipeline. you can’t do that from within a normal dagster op (even if you are using pyspark) because spark needs a special container image, service account, etc., so the only thing dagster can do here is fire up the spark master pod, then monitor that it doesn’t crash. it can’t really see the DAG that is executed inside spark.