For writing dagster tests, is there a good way to ...
# ask-community
d
For writing dagster tests, is there a good way to do this without causing the package to have to be imported (or otherwise take a long time)? What we are seeing is that in order for dagster to "understand" the package, all the repos in the package have to get loaded at package import. This also means that when we want to run a test, the whole package has to get imported, which is slow in our case. Is there something we are missing that can either make package import fast, or not import the whole package for a test that only tests a given job?
o
hi @Daniel Mosesson! How slow is the import, and are you doing anything fancy to construct these repositories (like programmatically generating a ton of jobs?). A shot in the dark solution is that you can have some conditional logic in these repos, based on an env var or something like that to avoid the expensive construction, but I'm interested to know more about your setup before I endorse that solution
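A minimal sketch of the env-var idea above, using plain Python stand-ins rather than real dagster objects. The variable name `SKIP_EXPENSIVE_REPOS` and both function names are hypothetical, picked only for illustration:

```python
import os


def expensive_job_definitions():
    # Hypothetical stand-in for programmatically generating many jobs.
    return [f"generated_job_{i}" for i in range(1000)]


def build_repository_contents(env=None):
    """Return the definitions to load, skipping the expensive ones on request."""
    env = os.environ if env is None else env
    # SKIP_EXPENSIVE_REPOS is an assumed variable name, not a dagster setting.
    if env.get("SKIP_EXPENSIVE_REPOS") == "1":
        return ["core_job"]
    return ["core_job", *expensive_job_definitions()]
```

A test run could set `SKIP_EXPENSIVE_REPOS=1` before importing the package so that only the cheap definitions get built. Note that this only helps when the construction itself is the slow part, not the imports.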
d
It's about thirty seconds. I looked at the flamegraph, and almost all the time is spent in importlib, mostly in _find_and_load. I don't really see where dagster is, or even where my code is, on the graph at all
o
Got it, that definitely seems way out of the ordinary. How large is your package, and are you using any functions like `load_assets_from_package_module` or `load_assets_from_dbt_project`? And did this import time increase significantly recently, or has it always been about this slow? My instinct is that there should definitely be some way to make package import faster.
d
Sorry for the slow responses, have a few other things going on. The package is pretty small, maybe ~40 files and 1K SLOC or so (not including dependencies, no idea offhand how big those are). I am using `load_assets_from_package_modules` (not sure if the `s` is important there), but timing that shows it's pretty fast (< 1 second). The only thing I have found so far is that `from dagster import Field` takes 9 seconds, which is surprising. A small amount of investigating shows that it is almost entirely from `dagster._check`, which may or may not help. Package import time has been creeping up over time, and we are currently on dagster 0.15.7.
Just to be clear, this isn't a huge issue for us, we just noticed because one of our machines was having other issues that caused a 6 minute import time and from there we started diving in
o
got it -- I can't see why those particular imports would take so long (the `dagster` package can be slow to import, but that's on the order of like .5-1 second, not 30 🤯). My guess is that there's some weird OS/python installation/machine issue going on, as we haven't seen any other reports of this (and I can't replicate), but let me know if I can be useful here. We're also restructuring how our imports work in today's release, but I don't think that's super likely to change anything.
d
I'll take a look. For you though, the `_check` import is fast (even on Windows)?
o
was asking around internally, and was recommended this method of profiling imports (it may give more detailed info than the flamegraph). Also "security software i think has been involved when ive seen things like this in the past"
(I don't have a windows machine handy, but the _check import is ~.25 seconds for me)
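One way to get per-module import timings, which may or may not be the method linked above, is CPython's `-X importtime` flag (Python 3.7+). A rough sketch that runs it in a fresh interpreter and sorts the results; the helper name `import_time_report` is made up for this example:

```python
import subprocess
import sys


def import_time_report(module_name, top=10):
    """Return the slowest imports as (cumulative_us, self_us, module) tuples."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    # -X importtime writes to stderr, one line per imported module:
    # "import time: <self us> | <cumulative us> | <module>"
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3 or not parts[0].strip().isdigit():
            continue  # skip the header row
        rows.append((int(parts[1]), int(parts[0]), parts[2].strip()))
    return sorted(rows, reverse=True)[:top]
```

Assuming dagster is installed in the same environment, `import_time_report("dagster")` should list the modules with the largest cumulative import time, which would show whether `dagster._check` or one of its dependencies dominates.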
b
on my macbook:
on windows: