
Showing content from https://github.com/sananth12/ImageScraper below:

sananth12/ImageScraper: :scissors: High performance, multi-threaded image scraper

A high performance, easy to use, multithreaded command line tool which downloads images from the given webpage.


Grab the latest stable build from PyPI: https://pypi.python.org/pypi/ImageScraper

Install with pip (recommended):

$ pip install ImageScraper

Note that ImageScraper depends on lxml, requests, setproctitle, and future. If you run into problems in the compilation of lxml through pip, install the libxml2-dev and libxslt-dev packages on your system.
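If installation problems are suspected, a quick way to see which of the dependencies listed above are importable is sketched below. The helper name `missing_modules` is hypothetical, and the sketch assumes each package is imported under the same name it is installed with.

```python
import importlib.util

# Dependencies named in this README (import names assumed equal to package names).
REQUIRED = ["lxml", "requests", "setproctitle", "future"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

If lxml shows up as missing after a failed build, installing libxml2-dev and libxslt-dev and re-running pip is the first thing to try.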

$ image-scraper [OPTIONS] URL

You can also use it in your Python scripts. (Deprecated)

import image_scraper
image_scraper.scrape_images(URL)

-h, --help            show this help message and exit
-m MAX_IMAGES, --max-images MAX_IMAGES
                    Limit on number of images
-s SAVE_DIR, --save-dir SAVE_DIR
                    Directory in which images should be saved
-g, --injected        Scrape injected images
--proxy-server PROXY_SERVER
                    Proxy server to use
--min-filesize MIN_FILESIZE
                    Lower limit on size of image in bytes
--max-filesize MAX_FILESIZE
                    Upper limit on size of image in bytes
--dump-urls           Print the URLs of the images
--formats [FORMATS [FORMATS ...]]
                    Specify formats in a list without any separator. This
                    argument must be after the URL.
--scrape-reverse      Scrape the images in reverse order
--filename-pattern FILENAME_PATTERN
                    Only scrape images with filenames that match the given
                    regex pattern
--nthreads NTHREADS   The number of threads to use when downloading images.
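Because the in-script API is marked deprecated, one way to drive the tool from Python is to call the command line through `subprocess`. The sketch below only assembles an argument list from the options documented above. The helpers `build_command` and `run_scraper` are hypothetical and not part of ImageScraper, and passing several formats as separate arguments is an assumption based on the `[FORMATS [FORMATS ...]]` usage string.

```python
import subprocess


def build_command(url, max_images=None, save_dir=None, formats=None, nthreads=None):
    """Assemble an image-scraper argument list (hypothetical helper)."""
    cmd = ["image-scraper"]
    if max_images is not None:
        cmd += ["--max-images", str(max_images)]
    if save_dir is not None:
        cmd += ["--save-dir", save_dir]
    if nthreads is not None:
        cmd += ["--nthreads", str(nthreads)]
    cmd.append(url)
    if formats:
        # The help text says --formats must come after the URL.
        cmd += ["--formats", *formats]
    return cmd


def run_scraper(url, **options):
    """Run the CLI; assumes image-scraper is installed and on PATH."""
    return subprocess.run(build_command(url, **options), check=True)
```

For example, `run_scraper("ananth.co.in/test.html", max_images=2)` would invoke the same command as `image-scraper --max-images 2 ananth.co.in/test.html`.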

If you downloaded the tar:

Extract the contents of the tar file.

$ cd ImageScraper/
$ python setup.py install
$ image-scraper --max-images 10 [url to scrape]

Scrape all images

$ image-scraper ananth.co.in/test.html

Scrape at most 2 images

$ image-scraper -m 2 ananth.co.in/test.html

Scrape only gifs and download to folder ./mygifs

$ image-scraper -s mygifs ananth.co.in/test.html --formats gif
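To make the effect of `-m`, `--formats`, and `--filename-pattern` concrete, here is a minimal sketch of how such filters could be applied to a list of image URLs. It is an illustration only, not ImageScraper's own code: the function `filter_image_urls` is hypothetical, and both the use of `re.search` and matching on the file extension are assumptions.

```python
import posixpath
import re
from urllib.parse import urlparse


def filter_image_urls(urls, formats=None, filename_pattern=None, max_images=None):
    """Keep URLs whose filename passes the format and regex filters (illustrative)."""
    regex = re.compile(filename_pattern) if filename_pattern else None
    wanted = {fmt.lower() for fmt in formats} if formats else None
    kept = []
    for url in urls:
        # Stop as soon as the image limit is reached.
        if max_images is not None and len(kept) >= max_images:
            break
        filename = posixpath.basename(urlparse(url).path)
        extension = posixpath.splitext(filename)[1].lstrip(".").lower()
        if wanted is not None and extension not in wanted:
            continue
        if regex is not None and not regex.search(filename):
            continue
        kept.append(url)
    return kept
```

Called with `formats=["gif"]`, the sketch keeps only URLs ending in `.gif` (case-insensitively), which mirrors the intent of the last command above.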

By default, a new folder called "images_" will be created in the working directory, containing all the downloaded images.

Q.) Not all images were downloaded?

It could be that the content was injected into the page via JavaScript; this scraper doesn't run JavaScript.
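The limitation can be seen with a small sketch: a parser that reads only the HTML as delivered sees `<img>` tags present in the markup, but not images that a script would add later. This uses Python's standard `html.parser` as a stand-in (ImageScraper itself depends on lxml), and the sample page is made up for illustration.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in static HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


# Made-up page: one image in the markup, one that only a script would add.
PAGE = """
<html><body>
  <img src="static.png">
  <script>
    var img = document.createElement("img");
    img.src = "injected.png";
    document.body.appendChild(img);
  </script>
</body></html>
"""

collector = ImgSrcCollector()
collector.feed(PAGE)
print(collector.sources)  # prints ['static.png']
```

Only `static.png` is found, because `injected.png` exists only after a browser executes the script.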

If you want to add features, improve existing ones, or fix issues, feel free to send a pull request!

ImageScraper is to be used for education/research purposes only. The authors take NO responsibility and/or liability for how you choose to use any of the tools, source code, or files provided. By using ImageScraper, you understand that you are AGREEING TO USE IT AT YOUR OWN RISK.
