Showing content from https://github.com/openwpm/OpenWPM/issues/503 below:
Support running OpenWPM crawls on Windows · Issue #503 · openwpm/OpenWPM · GitHub
Support running OpenWPM crawls on Windows #503
Description
This is, I hope, a path to supporting Windows. There may be some limitations, but it is a first step.
ToDo:
- Once Conda install dependencies #648 is merged, there will be three dependencies that don't work on Windows:
  - leveldb
  - plyvel
    - I believe we can switch out plyvel for python-leveldb with almost no fuss.
  - python-virtualdriver
    - This is for running xvfb, which won't work on Windows anyway, so we just need to figure out a package management solution / environment.yaml that accommodates both (most likely by making the python-xvfb install a manual step, as installing xvfb is manual anyway; maybe moving to pip will work around it).
- Make some tweaks in deploy_firefox so we're not manually making paths by concatenating strings
- I also suggest making some tweaks in deploy_firefox so that we let geckodriver set a profile path and then read it back. This will help with the goal of restoring stateful crawls and will make the work here easier.
- Find a replacement for the log interceptor, which uses mkfifo and is therefore Unix-only. This Stack Overflow thread has something we may be able to drop in as a replacement. Alternatively, I used a different approach in faust-selenium and created something to constantly "tail" geckodriver.log (https://github.com/birdsarah/faust-selenium/blob/master/crawler/geckodriver_log_reader.py). Alternatively again, we could just save geckodriver.log at the end and not weave it into our logging. @englehardt - what is the motivation for interleaving the geckodriver logs?
  - A first step could be to skip the geckodriver logs on Windows; they're not crawl-essential as best I can tell.
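As a rough illustration of the deploy_firefox item about not concatenating strings into paths, here is a minimal sketch using `pathlib`. The helper names `make_profile_dir` and `profile_path` are hypothetical and are not taken from OpenWPM; `prefs.js` is only used as an example file name.

```python
import tempfile
from pathlib import Path


def make_profile_dir(prefix="firefox_profile_"):
    """Create a fresh temporary directory and return it as a Path.

    Hypothetical helper: it stands in for wherever the profile
    directory actually comes from.
    """
    return Path(tempfile.mkdtemp(prefix=prefix))


def profile_path(profile_dir, *parts):
    """Join components with the separator of the running platform.

    This replaces patterns such as profile_dir + "/" + name, which
    hard-code the Unix separator.
    """
    return Path(profile_dir).joinpath(*parts)


if __name__ == "__main__":
    profile_dir = make_profile_dir()
    # Example file name only; any file inside the profile works the same way.
    print(profile_path(profile_dir, "prefs.js"))
```

Returning `Path` objects rather than strings keeps the separator handling in one place; callers that need a plain string can wrap the result in `str()`.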
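The "tail geckodriver.log" idea from the log interceptor item could look roughly like the sketch below. It polls a regular file instead of reading from a named pipe, so it does not depend on mkfifo. This is not the faust-selenium implementation or existing OpenWPM code; the names `follow_log`, `on_line`, `stop` and `poll_interval` are assumptions made for illustration.

```python
import threading
import time
from pathlib import Path


def follow_log(path, on_line, stop, poll_interval=0.1):
    """Call on_line(text) for every line appended to `path`.

    Runs until the `stop` event is set, then drains whatever was
    written before the event was set and returns.
    """
    path = Path(path)
    # The log file may not exist yet when the reader starts.
    while not path.exists():
        if stop.wait(poll_interval):
            return
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        pending = ""
        while True:
            # Sample the flag before reading so that data written
            # before stop.set() is never skipped.
            stopping = stop.is_set()
            chunk = handle.readline()
            if chunk:
                pending += chunk
                if pending.endswith("\n"):
                    on_line(pending.rstrip("\r\n"))
                    pending = ""
                continue
            if stopping:
                break
            time.sleep(poll_interval)
        if pending:
            # Final line without a trailing newline.
            on_line(pending)


def start_log_reader(path, on_line, poll_interval=0.1):
    """Start follow_log in a daemon thread; return (thread, stop_event)."""
    stop = threading.Event()
    thread = threading.Thread(
        target=follow_log, args=(path, on_line, stop, poll_interval), daemon=True
    )
    thread.start()
    return thread, stop
```

A caller would pass something like `logger.info` as `on_line`, set the returned event once the browser has shut down, and then join the thread. Whether interleaving is worth the extra thread is exactly the open question raised above.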
Future (open issues):
- Add CircleCI tests and test on Win, OSX, and Linux (at least once per PR - or once a week).