Why does my Selenium Python script freeze on EC2?

I'm trying to run a script on an EC2 instance, but when I run it the process freezes. Here is the code:

def get_source_content(url):
    """..."""

    driver_path = f"{settings.BASE_DIR}/geckodriver"

    options = FirefoxOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--no-sandbox")
    options.add_argument("--single-process")
    options.add_argument("--ignore-certificate-errors")

    driver = webdriver.Firefox(
        service=FirefoxService(executable_path=driver_path), options=options
    )

    try:
        driver.get(url)
        WebDriverWait(driver, 3)
        element = driver.find_element(
            "xpath", "//button[@class='sc-beySPh gNAvzR mde-consent-accept-btn']"
        )
        element.click()
        WebDriverWait(driver, 3)
        source = driver.page_source

    except Exception as ex:
        raise ex

    driver.quit()

    return source
Alberto
asked 2 months ago · 115 views
2 Answers
Accepted Answer

The problem was related to the display. I fixed it by adding these extra option arguments:

options.add_argument("--window-size=800,600")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--enable-automation")

and by using a third-party library, pyvirtualdisplay, to simulate a display:

from pyvirtualdisplay import Display
display = Display(visible=0, size=(800, 600))
display.start()
...
display.stop()
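
If the code between display.start() and display.stop() raises, the stop call above is skipped and the virtual display keeps running. A minimal sketch of one way to guard against that, assuming pyvirtualdisplay is installed on the instance; the helper name virtual_display and its display_factory parameter are illustrative and not part of pyvirtualdisplay:

```python
from contextlib import contextmanager


@contextmanager
def virtual_display(size=(800, 600), display_factory=None):
    """Start a virtual display and stop it again, even if the body raises.

    display_factory is a hypothetical hook (not a pyvirtualdisplay feature)
    that allows the helper to be exercised without Xvfb installed.
    """
    if display_factory is None:
        # Imported lazily so this module still loads without pyvirtualdisplay.
        from pyvirtualdisplay import Display
        display_factory = Display

    display = display_factory(visible=0, size=size)
    display.start()
    try:
        yield display
    finally:
        # Runs on success and on error alike.
        display.stop()
```

With this helper, the scraping call would sit inside the block, for example: with virtual_display(): source = get_source_content(url)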
Alberto
answered 2 months ago
EXPERT
reviewed 2 months ago

Hello.

Could the specs of the EC2 instance running the code be the problem?
Depending on the size of the website you are scraping, the script may stall if the instance is underpowered.
Have you confirmed that this code runs to completion elsewhere?
For example, does it complete normally when run on a local PC?

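According to the documentation, a wait should be used as shown below. Creating WebDriverWait(driver, 3) by itself does not pause the script; the wait only happens when you call until() on it.
The waiting logic may be the problem here, so please check it.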
https://www.selenium.dev/documentation/webdriver/waits/

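from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
wait = WebDriverWait(driver, 3)
element = wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@class='sc-beySPh gNAvzR mde-consent-accept-btn']")))
element.click()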
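
In the question's code, driver.quit() comes after the try/except block, so it never runs when the lookup or click raises, and a Firefox process can be left behind. The following is a rough sketch of how the explicit wait could be combined with guaranteed cleanup. The names fetch_page_source, accept_consent, driver_factory, before_read and CONSENT_XPATH are illustrative assumptions, not part of Selenium:

```python
CONSENT_XPATH = "//button[@class='sc-beySPh gNAvzR mde-consent-accept-btn']"


def accept_consent(driver, timeout=3):
    """Wait up to `timeout` seconds for the consent button, then click it."""
    # Imported lazily so the rest of the module loads without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    button = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.XPATH, CONSENT_XPATH))
    )
    button.click()


def fetch_page_source(url, driver_factory, before_read=accept_consent):
    """Load `url`, run `before_read(driver)`, and return the page source.

    driver.quit() sits in `finally`, so the browser is closed even when
    the wait times out or the click fails.
    """
    driver = driver_factory()  # hypothetical callable that builds the WebDriver
    try:
        driver.get(url)
        before_read(driver)
        return driver.page_source
    finally:
        driver.quit()
```

Here driver_factory would be a small function or lambda that returns webdriver.Firefox(...) configured with the same service and options as in the question. If the button never becomes clickable, until() raises a TimeoutException after the timeout instead of the script continuing silently.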
EXPERT
answered 2 months ago
  • It works perfectly locally. I also checked the instance metrics (RAM 47.3%, CPU 64.9%, I/O, etc.), and the instance's maximum resources are not exceeded.
