Powered by Linen
announcements
  • s

    SamSinayoko

    06/01/2020, 7:16 AM
    Hi, I've been struggling to get the stocks backfill example running:
    (dagster) $ DAGSTER_HOME=$PWD dagster pipeline backfill  -y repository.yaml
    Select a pipeline to backfill: compute_total_stock_volume: compute_total_stock_volume
    
         Pipeline: compute_total_stock_volume
    Partition set: stock_data_partitions_set
       Partitions: 
                    2018-01-01           2018-02-01           2018-03-01           2018-04-01           2018-05-01           2018-06-01           2018-07-01           2018-08-01
                    2018-09-01           2018-10-01           2018-11-01           2018-12-01
    
    Do you want to proceed with the backfill (12 partitions)? [y/N]: y
    Launching runs... 
    Traceback (most recent call last):
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/bin/dagster", line 8, in <module>
        sys.exit(main())
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/dagster/cli/__init__.py", line 38, in main
        cli(obj={})  # pylint:disable=E1123
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/dagster/cli/pipeline.py", line 599, in pipeline_backfill_command
        execute_backfill_command(kwargs, click.echo)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/dagster/cli/pipeline.py", line 693, in execute_backfill_command
        instance.launch_run(run.run_id)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/dagster/core/instance/__init__.py", line 801, in launch_run
        return self._run_launcher.launch_run(self, run)
      File "/Users/sinayoks/apps/miniconda3/envs/dagster/lib/python3.6/site-packages/dagster_graphql/launcher/__init__.py", line 97, in launch_run
        cls=self.__class__.__name__, address=self._address, result=result
    dagster.core.errors.DagsterLaunchFailedError: Failed to launch run with RemoteDagitRunLauncher targeting <http://127.0.0.1:3000>:
    {'__typename': 'PipelineRunNotFoundError'}
    So it looks like the job is creating a run but Dagit isn't able to find it. Any idea what I'm doing wrong here?
  • r

    Ryan Tam

    06/01/2020, 7:29 PM
    Hello! I am quite new to Dagster and have been experimenting with it lately. Perhaps I am missing something: I am trying to find out whether Dagster supports resource management natively to limit parallelism during execution (i.e. similar to https://luigi.readthedocs.io/en/stable/api/luigi.task.html?highlight=resources#luigi.task.Task.resources), but I haven't found anything relevant on the documentation site (other than https://docs.dagster.io/docs/deploying/dask#managing-compute-resources-with-dask which forces me to use Dask, which I am avoiding for now). Would any kind soul point me in the right direction or share a Dagster recipe for resource management? Thanks a bunch 🙏
  • m

    Muthu

    06/01/2020, 7:59 PM
    hi… is there any option to run the dagit UI with a custom base_url instead of the simple / … i want to run dagit like: <http://localhost:3000/engine>
  • b

    Binh Pham

    06/01/2020, 8:00 PM
    Are there any guidelines for when something should be a solid config vs. a solid input?
  • t

    Travis Cline

    06/01/2020, 10:07 PM
    Curious how folks are approaching data quality checks and relatedly (hopefully) generating/simulating test data.
  • t

    Travis Cline

    06/01/2020, 10:12 PM
    it'd be interesting to have hypothesis annotations for inputs and outputs on solids
  • t

    Travis Cline

    06/01/2020, 10:12 PM
    and perhaps have some initial population of those based on observed data flowing through a pipeline
  • b

    Binh Pham

    06/02/2020, 11:39 PM
    Hi, I want to execute some code when there are any failures. Specifically, I'm trying to log failure information to a table. What would be the best approach? Currently this is how I'm envisioning it.
  • s

    sephi

    06/03/2020, 6:53 AM
    Hi, When running a composite_solid we need to output 2 objects. if we run with
    @composite_solid(
        output_defs=[
            OutputDefinition(name='path', dagster_type=String),
            OutputDefinition(name='dataframe', dagster_type=DataFrame),
        ]
    )
    def my_solid():
        ...
        path = solid_return_path()
        dataframe = solid_return_dataframe()
        yield Output(path, 'path')
        yield Output(dataframe, 'dataframe')
    we get the following error:
    dagster.core.errors.DagsterInvalidDefinitionError: @composite_solid my_solid returned problematic value of type <class 'generator'>. Expected return value from invoked solid or dict mapping output name to return values from invoked solids.
    If we change it to :
    return path, dataframe
    the pipeline runs. Any suggestions?
  • s

    Sam Rausser

    06/04/2020, 2:22 AM
    BlockingIOError: [Errno 11] Resource temporarily unavailable
  • s

    Sam Rausser

    06/04/2020, 2:23 AM
    RuntimeError: can't start new thread
  • s

    Sam Rausser

    06/04/2020, 2:23 AM
    I've been getting a bunch of these errors. does anyone know what might be causing them?
  • m

    max

    06/04/2020, 2:28 AM
    hm, it looks to me like you might be running out of a system resource like file descriptors... @prha does this ring a bell
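    One quick way to test the "running out of a system resource" guess is to print the per-process limits from the same environment that runs the pipelines. The sketch below uses only Python's standard `resource` module (Unix only); the helper names `describe_limits` and `format_limit` are made up for illustration, and whether either limit is actually behind the errors above is an assumption.

```python
import resource
import threading


def describe_limits():
    """Return (soft, hard) limits for resources that may be involved in these errors."""
    names = {"open_files": resource.RLIMIT_NOFILE}
    # RLIMIT_NPROC is not defined on every platform, so look it up defensively.
    # On Linux it caps processes and threads per user; elsewhere it may differ.
    if hasattr(resource, "RLIMIT_NPROC"):
        names["processes"] = resource.RLIMIT_NPROC
    return {label: resource.getrlimit(const) for label, const in names.items()}


def format_limit(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for label, (soft, hard) in describe_limits().items():
        print(f"{label}: soft={format_limit(soft)} hard={format_limit(hard)}")
    # Number of threads currently alive in this Python process.
    print(f"live threads in this process: {threading.active_count()}")
```

    If a soft limit turns out to be low, comparing it with the number of open files and threads at the time of the failure would show whether it is the likely culprit.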
  • a

    anderson

    06/04/2020, 1:30 PM
    I haven't found it in the docs yet (if it's there, lmk and I'll keep looking instead of asking here), but is it possible to run pipelines inside a virtualenv (kind of like Airflow's virtualenv operator)? Or is there a solution to support different envs in the same deployment, using venv or something else?
  • s

    Sam Rausser

    06/04/2020, 2:38 PM
    Untitled
  • s

    Sam Rausser

    06/04/2020, 2:39 PM
    @max still getting resource unavailable
    compute_logs:
      module: dagster.core.storage.local_compute_log_manager
      class: NoOpComputeLogManager
      config:
        base_dir: /tmp/datafarm
  • u

    user

    06/04/2020, 11:33 PM
    prha just published a new version: 0.7.16.
  • p

    prha

    06/04/2020, 11:40 PM
    0_7_16_Release_Notes
    :partydagster: 4
  • s

    Sam Rausser

    06/05/2020, 12:23 AM
    running 0.7.16
  • s

    Sam Rausser

    06/05/2020, 12:23 AM
    Untitled
  • s

    Sam Rausser

    06/05/2020, 12:26 AM
    dagster.yaml
  • s

    Sam Rausser

    06/05/2020, 12:26 AM
    what did i do wrong?
  • s

    Sam Rausser

    06/05/2020, 1:35 AM
    new errors
  • s

    Sam Rausser

    06/05/2020, 1:35 AM
    Untitled
  • s

    Sam Rausser

    06/05/2020, 1:36 AM
    why is it running cron code?
  • a

    Andrey Alekseev

    06/05/2020, 5:19 PM
    Hi! Sorry for dumb question 😅 Is it possible to get a config structure (dict or yaml) of existing solid? I guess it is, cuz dagit autocompletes and checks config, but I could not find how it happens.
  • m

    Muthu

    06/05/2020, 7:45 PM
    hi getting this error after update to 0.7.16
    sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column "snapshot_id" of relation "runs" does not exist
    LINE 1: INSERT INTO runs (run_id, snapshot_id, pipeline_name, status...
  • v

    VK

    06/06/2020, 6:55 AM
    🙋 Roll call! Who else is here?
  • v

    VK

    06/06/2020, 6:56 AM
    Guys, I could not find it in the docs, but maybe you can point me to it. I am looking for a way to store execution status on S3 without AWS RDS. Is there such a possibility?
  • w

    wbonelli

    06/06/2020, 7:22 PM
    Is there a way to configure execute_pipeline_iterator to send logs to the event stream?
s

schrockn

06/06/2020, 7:25 PM
there is not unfortunately
we do this at the graphql layer with a relatively intricate interaction with gevent and graphql subscriptions
what are you trying to do?
w

wbonelli

06/06/2020, 7:46 PM
ah ok, thanks. I wanted to bubble certain events (pipeline init/start/failure) and logs up to another application as they come in. I think I'll just write a custom logger
a

alex

06/06/2020, 7:59 PM
instance=DagsterInstance.get() Might be what you are looking for, depends on what exactly you mean
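For the custom-logger route wbonelli mentions, a minimal sketch of the forwarding part might look like the following. It uses only Python's standard `logging` and `queue` modules; `ForwardingHandler` and `build_forwarding_logger` are hypothetical names, the queue stands in for whatever transport the other application uses, and wiring the logger into a Dagster pipeline's logger configuration is not shown.

```python
import logging
import queue


class ForwardingHandler(logging.Handler):
    """Hypothetical handler: pushes each log record onto a queue as a small dict."""

    def __init__(self, sink, level=logging.INFO):
        super().__init__(level)
        self.sink = sink

    def emit(self, record):
        try:
            self.sink.put_nowait(
                {
                    "level": record.levelname,
                    "logger": record.name,
                    "message": record.getMessage(),
                }
            )
        except Exception:
            # Follow the logging convention: never let a handler crash the caller.
            self.handleError(record)


def build_forwarding_logger(name, sink):
    """Create a logger whose INFO-and-above records are forwarded to `sink`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep forwarded records out of the root logger
    logger.addHandler(ForwardingHandler(sink))
    return logger


if __name__ == "__main__":
    events = queue.Queue()
    log = build_forwarding_logger("pipeline_events", events)
    log.info("pipeline start")
    log.error("pipeline failure: %s", "boom")
    # Another thread or application would normally drain the queue.
    while not events.empty():
        print(events.get())
```

A queue keeps the handler non-blocking, so a slow consumer does not stall pipeline execution; a real integration would replace it with the actual channel to the other application.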