#ask-ai

James Robinson

02/21/2024, 4:02 PM
What do I need to do to set up a Selenium resource in Dagster Cloud? I currently have a `dagster_cloud_pre_install.sh` script:
# Update package list and install wget and gnupg
apt-get update \
    && apt-get install -y wget gnupg

# Download and install chrome
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list \
    && apt-get update \
    && apt-get -y install google-chrome-stable
And the following code to set up my webdriver:
driver_service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=driver_service, options=options)
But it currently fails when the Chrome driver manager's install step tries to get the latest release version. Am I missing anything, or have I got anything wrong?

Scout

02/21/2024, 4:02 PM
To set up a Selenium resource in Dagster Cloud, you need to ensure that all the necessary dependencies and configurations are in place. Based on the documentation, here are the general steps you would typically follow:

1. Install System Dependencies: Your `dagster_cloud_pre_install.sh` script is responsible for installing system-level dependencies. This includes installing Chrome, which Selenium needs in order to drive the Chrome browser. Your script seems to be doing this correctly by installing `wget`, `gnupg`, and `google-chrome-stable`.
2. Install Python Dependencies: You need to ensure that all the required Python packages are installed. This includes `selenium` itself and any driver manager you are using, such as `webdriver-manager` for Chrome.
3. Configure the Selenium WebDriver: In your code, you are using `ChromeDriverManager().install()` to install the ChromeDriver. This is typically done in a local development environment. However, in a cloud environment like Dagster Cloud, you might need to handle driver installation differently.
4. Handle ChromeDriver Installation: The error you're encountering with the ChromeDriver installation could have several causes, such as network issues, incorrect permissions, or the cloud environment not supporting dynamic downloads and installations in the same way as a local environment.
5. Adjust the Deployment Scripts: If the dynamic installation of ChromeDriver is not supported in the Dagster Cloud environment, you may need to adjust your deployment scripts to include a statically downloaded version of ChromeDriver that matches the installed version of Chrome.
6. Test the Setup: Before deploying to Dagster Cloud, test your setup locally to ensure that all components are working together correctly.
7. Debugging: If you encounter issues during deployment, review the logs to identify the exact cause of the failure. This can help you determine whether there are additional dependencies or configurations that need to be addressed.

The documentation does not provide a specific example or guide for setting up a Selenium resource in Dagster Cloud, so the above steps are general guidelines based on typical Selenium setup procedures. If you continue to face issues, you may need to consult the Dagster Cloud documentation for any cloud-specific considerations or reach out to the Dagster community for support.
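As a rough illustration of steps 3 to 5, the driver could be built from a ChromeDriver binary that is already on disk rather than one downloaded at run time. This is only a sketch: the `CHROMEDRIVER_PATH` environment variable and both function names are hypothetical, and the Chrome flags are ones commonly used in containers, not something the Dagster documentation prescribes.

```python
import os
import shutil
from typing import Mapping, Optional


def find_chromedriver(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the path of a pre-installed chromedriver binary.

    Looks at the (hypothetical) CHROMEDRIVER_PATH variable first, then
    falls back to searching PATH for an executable named "chromedriver".
    """
    env = os.environ if env is None else env
    candidate = env.get("CHROMEDRIVER_PATH") or shutil.which("chromedriver")
    if not candidate or not os.path.isfile(candidate):
        raise FileNotFoundError(
            "No chromedriver binary found; install one in the pre-install "
            "script or set CHROMEDRIVER_PATH."
        )
    return candidate


def build_chrome_driver(driver_path: str):
    """Create a headless Chrome WebDriver from an explicit driver path."""
    # Imported lazily so this module can be imported without selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.add_argument("--headless=new")           # no display in a container
    options.add_argument("--no-sandbox")             # often needed when running as root
    options.add_argument("--disable-dev-shm-usage")  # avoid a small /dev/shm
    return webdriver.Chrome(
        service=Service(executable_path=driver_path), options=options
    )
```

With this shape, `build_chrome_driver(find_chromedriver())` would replace the `ChromeDriverManager().install()` call, and a missing binary fails with a clear message instead of a download error.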

James Robinson

02/21/2024, 4:04 PM
@Scout It runs successfully locally, but this is my traceback for dagster-cloud:
AttributeError: 'NoneType' object has no attribute 'split'
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/dagster/_core/errors.py", line 287, in user_code_error_boundary
    yield
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/dagster/_core/execution/resources_init.py", line 332, in single_resource_event_generator
    resource = next(resource_iter)
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/dagster/_core/execution/resources_init.py", line 451, in _gen_resource
    with resource_or_gen as resource:
  File "/usr/local/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/dagster/_config/pythonic_config/resource.py", line 544, in _initialize_and_run_cm
    with updated_resource.yield_for_execution(context) as value:
  File "/usr/local/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/venvs/806156935750/lib/python3.9/site-packages/working_directory/root/./.../resources/sources/_selenium.py", line 32, in yield_for_execution
    with self._get_selenium_driver(visible=self.visible) as d:
  File "/venvs/806156935750/lib/python3.9/site-packages/working_directory/root/./.../resources/selenium.py", line 42, in _get_selenium_driver
    driver_service = Service(ChromeDriverManager().install())
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/webdriver_manager/chrome.py", line 40, in install
    driver_path = self._get_driver_binary_path(self.driver)
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/webdriver_manager/core/manager.py", line 40, in _get_driver_binary_path
    file = self._download_manager.download_file(driver.get_driver_download_url(os_type))
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 32, in get_driver_download_url
    driver_version_to_download = self.get_driver_version_to_download()
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/webdriver_manager/core/driver.py", line 48, in get_driver_version_to_download
    return self.get_latest_release_version()
  File "/venvs/4db669e58d6c/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 64, in get_latest_release_version
    determined_browser_version = ".".join(determined_browser_version.split(".")[:3])

Scout

02/21/2024, 4:04 PM
The traceback you've provided indicates that the error occurs when the `ChromeDriverManager().install()` method is called. This method is trying to determine the version of the browser so that it can download the corresponding driver, but it encounters a `NoneType` object where it expects a string that it can call `split()` on. This could be due to several reasons:

1. Browser Version Detection: The `ChromeDriverManager` is likely trying to automatically detect the installed Chrome version to download the matching driver. If Chrome is not installed or not accessible in the expected location within the Dagster Cloud environment, this could result in a `NoneType` when attempting to retrieve the version.
2. Environment Differences: Since you mentioned that it runs successfully locally, the issue might be due to differences between your local environment and the Dagster Cloud environment. The cloud environment might have restrictions or a different setup that prevents `ChromeDriverManager` from working as expected.
3. Dynamic Downloads: Dagster Cloud might not support dynamic downloads and installations of binaries like ChromeDriver. This is a common restriction in managed cloud environments for security and reproducibility reasons.

To resolve this issue, consider the following steps:

- Pre-install ChromeDriver: Instead of relying on `ChromeDriverManager` to dynamically download the driver, you could pre-install the version of ChromeDriver that matches the version of Chrome in your Dagster Cloud environment. This would involve adding the ChromeDriver binary to your deployment package or Docker image if you're using one.
- Specify the Path to ChromeDriver: Once you have a pre-installed ChromeDriver, modify your Selenium setup to use the specific path to the driver instead of attempting to download it.
- Check Chrome Installation: Ensure that Google Chrome is installed in the Dagster Cloud environment and that your code has the necessary permissions to access it.
- Review Cloud Environment Restrictions: Consult the Dagster Cloud documentation or support to understand any restrictions on running browsers and drivers in their environment. There might be specific guidelines you need to follow.
- Debugging: Add logging around the area where the error occurs to get more information about the state of the environment and the variables involved. This can help you pinpoint the exact cause of the issue.

Since the documentation does not provide specific guidance on setting up Selenium in Dagster Cloud, these suggestions are based on general practices for troubleshooting similar issues in cloud environments. If the problem persists, reaching out to Dagster support or the community might provide more targeted assistance for your specific setup.
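One way to act on the "Check Chrome Installation" and "Debugging" suggestions is to log, from inside the run, whether a Chrome binary is visible and what version it reports. The sketch below is an assumption-laden diagnostic, not `webdriver_manager`'s own detection logic: the candidate binary names and both function names are illustrative.

```python
import re
import shutil
import subprocess
from typing import Optional, Sequence

# Binary names commonly used on Linux; treat this list as an assumption.
CANDIDATES = ("google-chrome", "google-chrome-stable", "chromium", "chromium-browser")


def parse_chrome_version(output: str) -> Optional[str]:
    """Extract a dotted version such as 121.0.6167.184 from `--version` output."""
    match = re.search(r"\d+\.\d+\.\d+(?:\.\d+)?", output)
    return match.group(0) if match else None


def detect_chrome_version(candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the version of the first Chrome binary found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue  # this binary name is not on PATH
        try:
            result = subprocess.run(
                [path, "--version"],
                capture_output=True,
                text=True,
                timeout=10,
                check=False,
            )
        except (OSError, subprocess.TimeoutExpired):
            continue  # binary exists but could not be executed
        version = parse_chrome_version(result.stdout)
        if version:
            return version
    return None
```

If `detect_chrome_version()` returns `None` inside Dagster Cloud but a version string locally, that would be consistent with the `'NoneType' object has no attribute 'split'` error coming from Chrome not being visible to the process at run time.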

James Robinson

02/21/2024, 4:09 PM
@Scout, will my script work on Dagster Cloud? That is, will Dagster Cloud support dynamic downloads and installations of binaries like ChromeDriver?

Scout

02/21/2024, 4:09 PM
The documentation does not explicitly state whether Dagster Cloud supports dynamic downloads and installations of binaries like ChromeDriver. Managed cloud environments often restrict such operations for security and reproducibility reasons.

To ensure compatibility with Dagster Cloud, it is generally recommended to include all necessary dependencies within your deployment package. This means that instead of relying on dynamic downloads, you should pre-install ChromeDriver and any other binaries or system-level dependencies your code requires.

If you need to use Selenium with ChromeDriver in Dagster Cloud, consider the following:

- Pre-installed ChromeDriver: Include a pre-installed ChromeDriver in your deployment package or Docker image.
- Static Configuration: Configure your Selenium setup to use the pre-installed ChromeDriver binary by providing the path to the executable.
- Dependencies: Ensure all Python dependencies are specified in your `requirements.txt` or `setup.py` file, as these will be installed automatically by the Dagster Cloud Serverless infrastructure.

If you are unsure about the capabilities and restrictions of Dagster Cloud regarding dynamic downloads and installations, it is best to reach out to Dagster support or consult the Dagster Cloud documentation for more detailed information on the runtime environment and deployment process.
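For the "Static Configuration" point, the resource in the traceback yields a driver from `yield_for_execution`. A small, framework-independent sketch of that lifecycle is shown here, so the browser is always shut down even if an op fails. The names `driver_session` and `make_driver` are hypothetical; `make_driver` stands for whatever function builds the WebDriver from the pre-installed binary.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Protocol


class Quittable(Protocol):
    """Anything with a quit() method, such as a Selenium WebDriver."""

    def quit(self) -> None: ...


@contextmanager
def driver_session(make_driver: Callable[[], Quittable]) -> Iterator[Quittable]:
    """Create a driver, hand it to the caller, and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs on success and on error, so no browser process is left behind.
        driver.quit()
```

Inside a resource, the body of `yield_for_execution` could then be a `with driver_session(...) as d: yield d` block, which keeps driver construction in one replaceable function.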