
Selenium Driver

This guide explains how to use Datallog’s built-in helpers for creating Selenium drivers in a controlled, cloud-friendly environment.

The module provides:

  • Automatic creation of Selenium options (Chrome or Firefox)
  • Built-in headless mode when running inside Datallog environments
  • A context manager (DatallogSeleniumDriver) that opens and closes drivers cleanly

Overview

The driver flow works as follows:

  1. Prepare driver options
    Call selenium_driver_options(driver_type) to configure Chrome or Firefox.

  2. Initialize the WebDriver
    Call selenium_driver(driver_type) to create the actual driver instance.

  3. (Optional) Use the context manager
    The DatallogSeleniumDriver class creates the driver and closes it automatically.

Closing drivers through the context manager helps prevent orphaned browser processes, and the helpers as a whole keep drivers compatible with Datallog cloud execution and simplify Selenium usage.


Requirements

To use the Selenium driver utilities, you need:

  • Python 3.10+
  • The selenium library installed
  • Chrome or Firefox binaries and corresponding drivers available at:

Browser   Driver Path             Binary Path
Chrome    /usr/bin/chromedriver   (auto-detected)
Firefox   /usr/bin/geckodriver    /usr/bin/firefox

Inside a Datallog execution environment (datallog run or Datallog's cloud), these paths are already configured.
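
When running outside a Datallog environment, it can help to confirm that the paths listed above exist before starting a driver. The following is a minimal sketch using only the standard library. The helper name check_selenium_paths is hypothetical and not part of Datallog, and the Chrome binary is left out because it is listed as auto-detected.

```python
from pathlib import Path

# Paths taken from the requirements table; the Chrome binary is
# auto-detected, so only the driver path is listed for Chrome.
EXPECTED_PATHS = {
    "chrome": ["/usr/bin/chromedriver"],
    "firefox": ["/usr/bin/geckodriver", "/usr/bin/firefox"],
}


def check_selenium_paths(driver_type: str) -> dict[str, bool]:
    """Map each expected path for driver_type to whether it exists.

    Hypothetical helper for local diagnostics, not a Datallog function.
    """
    if driver_type not in EXPECTED_PATHS:
        raise ValueError(f"Unsupported driver type: {driver_type!r}")
    return {path: Path(path).exists() for path in EXPECTED_PATHS[driver_type]}


if __name__ == "__main__":
    for browser in EXPECTED_PATHS:
        for path, found in check_selenium_paths(browser).items():
            print(f"{browser}: {path} -> {'found' if found else 'missing'}")
```

A missing path does not necessarily mean Selenium cannot run, since a local installation may place the binaries elsewhere. The check only reports whether the layout matches the one Datallog expects.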


Functions Overview

selenium_driver_options(driver_type)

Returns a Selenium Options object based on the given driver type.

python
from datallog import selenium_driver_options

options = selenium_driver_options("chrome")

The function:

  • Adds Chrome flags commonly needed in containers, such as --no-sandbox and --disable-gpu (note that --no-sandbox turns Chrome's sandbox off rather than adding protection)
  • Sets a fixed binary location for Firefox

selenium_driver(driver_type)

Creates and returns a fully initialized Chrome or Firefox WebDriver.

python
from datallog import selenium_driver

driver = selenium_driver("firefox")
driver.get("https://example.com")

This function:

  • Loads the appropriate options
  • Uses the correct driver paths
  • Returns a ready-to-use Selenium instance

Using the Context Manager

The easiest and safest way to use Selenium inside Datallog is with the provided context manager:

Example

python
from datallog import DatallogSeleniumDriver

with DatallogSeleniumDriver("firefox") as driver:
    driver.get("https://google.com")
    print(driver.title)

The context manager automatically:

  • Creates the driver on __enter__
  • Calls .quit() on __exit__
  • Prevents resource leaks inside long-running pipelines
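
The create-on-enter, quit-on-exit behaviour can be sketched with a small stand-in class. This is not Datallog's implementation: ManagedDriver and FakeDriver are hypothetical names, and FakeDriver replaces a real WebDriver so the example runs without a browser.

```python
class FakeDriver:
    """Stand-in for a Selenium WebDriver so the sketch runs without a browser."""

    def __init__(self) -> None:
        self.quit_called = False

    def quit(self) -> None:
        self.quit_called = True


class ManagedDriver:
    """Hypothetical context manager mirroring the behaviour listed above."""

    def __init__(self, factory):
        # factory is any zero-argument callable that returns a driver.
        self._factory = factory
        self._driver = None

    def __enter__(self):
        self._driver = self._factory()
        return self._driver

    def __exit__(self, exc_type, exc, tb):
        if self._driver is not None:
            self._driver.quit()
        # Returning False lets any exception from the with block propagate.
        return False


if __name__ == "__main__":
    try:
        with ManagedDriver(FakeDriver) as driver:
            raise RuntimeError("simulated scraping failure")
    except RuntimeError:
        pass
    # quit() ran even though the block raised.
    print(driver.quit_called)
```

The point of the pattern is that quit() runs even when the body of the with block raises, which is what keeps browser processes from being left behind.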

Returned Values

  • selenium_driver_options → returns ChromeOptions or FirefoxOptions
  • selenium_driver → returns a Chrome or Firefox WebDriver instance
  • DatallogSeleniumDriver → yields the driver as the target of the with statement

The returned drivers are standard Selenium WebDriver instances, so the regular Selenium APIs apply to them.

Error Handling

The driver utilities may raise:

  • ValueError — when using an unsupported driver type (anything other than "chrome" or "firefox")

Most other errors originate from the Selenium library itself (for example, missing binaries or navigation errors).
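
One way to handle the ValueError case is to fall back to a known-good driver type. The sketch below assumes a stand-in factory, fake_selenium_driver, that mimics the documented behaviour so the example runs without a browser. Both function names and the error message wording are hypothetical.

```python
def fake_selenium_driver(driver_type: str) -> str:
    """Stand-in that mimics the documented ValueError behaviour."""
    if driver_type not in ("chrome", "firefox"):
        raise ValueError(f"Unsupported driver type: {driver_type!r}")
    return f"<{driver_type} driver>"


def create_driver_or_default(driver_type, default="chrome", factory=fake_selenium_driver):
    """Try driver_type first and fall back to default on ValueError.

    Hypothetical helper; only the unsupported-type error is handled,
    so any other exception raised by the factory still propagates.
    """
    try:
        return factory(driver_type)
    except ValueError:
        return factory(default)


if __name__ == "__main__":
    print(create_driver_or_default("safari"))
    print(create_driver_or_default("firefox"))
```

In real use, the factory argument would be selenium_driver from datallog. Catching only ValueError keeps genuine Selenium failures visible instead of masking them behind the fallback.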

Supported Driver Types

python
from datallog import DriverType

driver_type = DriverType('firefox')  # or 'chrome'

Summary

The Datallog Selenium utilities simplify browser automation by:

  • Providing safe defaults for cloud and container execution
  • Handling headless mode automatically
  • Managing driver lifecycles cleanly via context managers

They integrate seamlessly into data pipelines, scrapers, and automated testing workflows.