Web automation is a complex and fragile task, with many ways to go wrong. This article describes some common errors and pitfalls, and how you might go about resolving them.
Starting the client
If initializing a SeleniumSession fails, it is often useful to look at the logs from the server. If you ran the java -jar ... command manually, then you should be able to see the logs, but if you used selenium_server(), then the logs are unavailable by default. However, you can use the stdout and stderr arguments to enable log collection, and then use read_output() and read_error() to read the logs.
server <- selenium_server(stdout = "|", stderr = "|")
server$read_output()
server$read_error()
This will show any output/errors that the server has written to the console.
Timeout
Sometimes, the server can just take a very long time to start up. If you get an error from wait_for_server() or wait_for_selenium_available(), it can be worth increasing the max_time argument to something higher than 60, and seeing if that fixes the issue.
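As a rough illustration, the sketch below starts a server with log collection enabled, waits longer than the default, and prints the server's error output if the wait still fails. It assumes that the selenium package and Java are installed, and that wait_for_server() signals an R error when it gives up. start_server_patiently() is a hypothetical helper name, and 120 is an arbitrary value.

```r
# Hypothetical helper, not part of the selenium package.
start_server_patiently <- function(max_time = 120) {
  # Collect stdout/stderr so they can be read back if startup fails.
  server <- selenium::selenium_server(stdout = "|", stderr = "|")

  tryCatch(
    selenium::wait_for_server(server, max_time = max_time),
    error = function(e) {
      # Show whatever the server wrote to stderr, then re-signal the error.
      message(paste(server$read_error(), collapse = "\n"))
      stop(e)
    }
  )

  server
}
```

Calling start_server_patiently() returns the server object on success, so it can be used in place of a bare selenium_server() call.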
If starting the server results in an error, and wait_for_server(), server$read_error() or the logs show an error similar to:
... java.lang.UnsupportedClassVersionError: {file} has been compiled by a more recent version of the Java Runtime ...
This probably means that you need to update your Java version. Selenium’s minimum version is now Java 11 (see https://www.selenium.dev/blog/2023/java-8-support/), so installing Java 11 or later should resolve the error.
Port and IP address
One reason why you may be unable to connect to the server is that the port or IP address you are connecting to is wrong.
If you are using selenium_server(), server$host and server$port give you the host IP address and port, respectively.
You can also get the IP address and port from the server logs. You should see a line like:
INFO [Standalone.execute] - Started Selenium Standalone ... (revision ...): http://<IP>:<PORT>
The URL at the end of this message contains the IP address and the port number, which can then be passed into the host and port arguments. For example, if the URL was http://172.17.0.1:4444, you would pass host = "172.17.0.1" and port = 4444 when creating the session.
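A minimal sketch of that step, assuming the selenium package is installed and a server is listening at that address. parse_server_url() and connect_to() are hypothetical helpers written for this example, not part of the package.

```r
# Hypothetical helper: split a URL such as "http://172.17.0.1:4444"
# into its host and port.
parse_server_url <- function(url) {
  m <- regmatches(url, regexec("^https?://([^:/]+):([0-9]+)", url))[[1]]
  if (length(m) != 3) {
    stop("Could not find a host and port in: ", url)
  }
  list(host = m[2], port = as.integer(m[3]))
}

address <- parse_server_url("http://172.17.0.1:4444")

# Hypothetical wrapper: only call this when the server is actually running.
connect_to <- function(address) {
  selenium::SeleniumSession$new(host = address$host, port = address$port)
}
```

Here address$host and address$port hold the two values taken from the log line, and connect_to(address) would open the session against them.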
If you are using Chrome, and you see a browser open, but the call to SeleniumSession$new() times out, you may need to use a different debugging port. For example:
session <- SeleniumSession$new(
  browser = "chrome",
  capabilities = list(
    `goog:chromeOptions` = list(
      args = list("remote-debugging-port=9222")
    )
  )
)
Increasing /dev/shm/ size when using Docker
If you are running Selenium using Docker, you may need to increase the size of /dev/shm/ to avoid running out of memory. This issue usually happens when using Chrome, and usually results in a message like session deleted because of page crash.
You can pass the --shm-size option to docker run when starting the Selenium Docker images to fix this issue. For example:
docker run --shm-size="2g" selenium/standalone-chrome:<version>
At some point, when using selenium, you will encounter the following error:
#> Error in `element$click()`:
#> ! Stale element reference.
#> ✖ The element with the reference <...> is not known in the current browsing context
#> Caused by error in `httr2::req_perform()`:
#> ! HTTP 404 Not Found.
#> Run `rlang::last_trace()` to see where the error occurred.
This error is common when automating a website. Selenium is telling you that an element which you previously identified no longer exists. On many websites, especially complex ones, the DOM is updated frequently, and each update can invalidate existing references to elements. This error is particularly annoying because it can happen at any time and is hard to predict.
One way to deal with this error is to use elements as soon as they are found, only keeping references to elements if you are sure that they will not be invalidated. For example, if you want to click the same element twice, with a second-long gap in between, consider fetching the element again before each click, rather than sharing one reference between the two actions.
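The idea of fetching an element fresh for each action can be sketched as follows. This is only an illustration: retry() and click_fresh() are hypothetical helpers, the retry count and wait are arbitrary, and the sketch assumes that session$find_element() takes a locator strategy and a value. Note that retry() retries on any error, not only on stale element references.

```r
# Hypothetical helper: run `action` until it succeeds or `times` is reached.
retry <- function(action, times = 3, wait = 0.5) {
  for (i in seq_len(times)) {
    result <- tryCatch(
      list(ok = TRUE, value = action()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) {
      return(result$value)
    }
    if (i < times) {
      Sys.sleep(wait)
    }
  }
  # Every attempt failed: re-signal the last error.
  stop(result$error)
}

# Hypothetical wrapper: look the element up again on every attempt, so a
# reference is never reused after the DOM has changed.
click_fresh <- function(session, css) {
  retry(function() session$find_element("css selector", css)$click())
}
```

With an open session, click_fresh(session, "#submit") would find and click the element in one step, where "#submit" stands in for whatever selector your page uses.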
However, this solution is not infallible. If you find yourself encountering this error a lot, it may be a sign that you need a higher-level package that handles this issue for you (e.g. selenider).