BasicCrawler | API | Crawlee for Python · Fast, reliable Python web crawlers.
BasicCrawler Index Methods
- __init__(*, configuration, event_manager, storage_client, request_manager, session_pool, proxy_configuration, http_client, request_handler, max_request_retries, max_requests_per_crawl, max_session_rotations, max_crawl_depth, use_session_pool, retry_on_blocked, additional_http_error_status_codes, ignore_http_error_status_codes, concurrency_settings, request_handler_timeout, statistics, abort_on_error, keep_alive, configure_logging, statistics_log_format, respect_robots_txt_file, status_message_logging_interval, status_message_callback, _context_pipeline, _additional_context_managers, _logger): None
- async add_requests(requests, *, forefront, batch_size, wait_time_between_batches, wait_for_all_requests_to_be_added, wait_for_all_requests_to_be_added_timeout): None
- Parameters
- requests: Sequence[str | Request]
- optional keyword-only forefront: bool = False
- optional keyword-only batch_size: int = 1000
- optional keyword-only wait_time_between_batches: timedelta = timedelta(0)
- optional keyword-only wait_for_all_requests_to_be_added: bool = False
- optional keyword-only wait_for_all_requests_to_be_added_timeout: timedelta | None = None
Returns None
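A minimal sketch of calling `add_requests` with the keyword-only options listed above. The helper name `enqueue_in_batches` is hypothetical, `crawler` is assumed to be an already constructed `BasicCrawler`, and the batch size and timeouts are illustrative values only.

```python
from datetime import timedelta


async def enqueue_in_batches(crawler, urls):
    """Queue URLs in smaller batches (hypothetical helper, not part of Crawlee)."""
    await crawler.add_requests(
        urls,
        forefront=False,  # keep the documented default
        batch_size=500,  # the listing above gives 1000 as the default
        wait_time_between_batches=timedelta(seconds=1),
        wait_for_all_requests_to_be_added=True,
        wait_for_all_requests_to_be_added_timeout=timedelta(minutes=5),
    )
```

Because the helper only forwards documented parameters, it can be exercised with any object that exposes a compatible `add_requests` coroutine.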
- async export_data(path, dataset_id, dataset_name): None
- Parameters
- path: str | Path
- optional dataset_id: str | None = None
- optional dataset_name: str | None = None
Returns None
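A short sketch of exporting the default dataset with `export_data`. The helper name and the `results.json` file name are hypothetical. It assumes the output format is inferred from the file extension, which is worth confirming against the Crawlee version in use.

```python
from pathlib import Path


async def export_default_dataset(crawler, out_dir):
    """Export the crawler's default dataset to a JSON file (hypothetical helper)."""
    path = Path(out_dir) / 'results.json'
    # dataset_id and dataset_name are left at None, i.e. the default dataset.
    await crawler.export_data(path)
    return path
```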
- async get_data(dataset_id, dataset_name, *, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, flatten, view): DatasetItemsListPage
- Parameters
- optional dataset_id: str | None = None
- optional dataset_name: str | None = None
- optional keyword-only offset: int
- optional keyword-only limit: int | None
- optional keyword-only clean: bool
- optional keyword-only desc: bool
- optional keyword-only fields: list[str]
- optional keyword-only omit: list[str]
- optional keyword-only unwind: list[str]
- optional keyword-only skip_empty: bool
- optional keyword-only skip_hidden: bool
- optional keyword-only flatten: list[str]
- optional keyword-only view: str
Returns DatasetItemsListPage
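The `offset` and `limit` parameters suggest a simple paging loop. The sketch below is one way to walk a dataset page by page. `iter_dataset_items` is a hypothetical helper, and it assumes the returned `DatasetItemsListPage` exposes the page contents as an `items` list.

```python
async def iter_dataset_items(crawler, page_size=100, dataset_name=None):
    """Yield dataset items one by one, fetching them in pages (hypothetical helper)."""
    offset = 0
    while True:
        page = await crawler.get_data(
            dataset_name=dataset_name,
            offset=offset,
            limit=page_size,
        )
        if not page.items:  # an empty page means there is nothing left to read
            break
        for item in page.items:
            yield item
        offset += len(page.items)
```

It can be consumed with `async for item in iter_dataset_items(crawler): ...` inside a coroutine.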
- async get_dataset(*, id, name): Dataset
- Parameters
- optional keyword-only id: str | None = None
- optional keyword-only name: str | None = None
Returns Dataset
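As an illustration, `get_dataset` can be used to write to a named dataset instead of the default one. The helper name is hypothetical, and the sketch assumes the returned `Dataset` offers an async `push_data` method.

```python
async def save_to_named_dataset(crawler, name, item):
    """Store one item in a named dataset (hypothetical helper)."""
    dataset = await crawler.get_dataset(name=name)
    await dataset.push_data(item)  # assumed Dataset method
    return dataset
```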
- async get_key_value_store(*, id, name): KeyValueStore
- Parameters
- optional keyword-only id: str | None = None
- optional keyword-only name: str | None = None
Returns KeyValueStore
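A sketch of opening a key-value store through the crawler's `get_key_value_store` method and doing a round trip. The helper name is hypothetical, and `set_value` / `get_value` are assumed to be the async accessors on `KeyValueStore`.

```python
async def remember(crawler, key, value, store_name=None):
    """Write a value to a key-value store and read it back (hypothetical helper)."""
    store = await crawler.get_key_value_store(name=store_name)
    await store.set_value(key, value)  # assumed KeyValueStore method
    return await store.get_value(key)  # assumed KeyValueStore method
```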
- async run(requests, *, purge_request_queue): FinalStatistics
- Parameters
- optional requests: Sequence[str | Request] | None = None
- optional keyword-only purge_request_queue: bool = True
Returns FinalStatistics
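A minimal sketch of starting a crawl through the crawler's `run` method and keeping the returned `FinalStatistics`. The helper name and the `resume` flag are hypothetical. Passing `purge_request_queue=False` is assumed here to keep requests left over from an earlier run.

```python
async def crawl(crawler, start_urls, resume=False):
    """Run the crawler and return its final statistics (hypothetical helper)."""
    stats = await crawler.run(
        start_urls,
        purge_request_queue=not resume,  # True is the documented default
    )
    return stats
```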
- stop(reason): None
- Parameters
- optional reason: str = 'Stop was called externally.'
Returns None
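One possible use of the crawler's `stop` method is ending a crawl once enough items have been collected. The sketch below assumes `stop` is a plain (non-async) method that accepts the `reason` string. `make_budget_guard` and `record_item` are hypothetical names.

```python
def make_budget_guard(crawler, max_items):
    """Return a callback that stops the crawler after max_items calls (hypothetical helper)."""
    seen = 0

    def record_item():
        nonlocal seen
        seen += 1
        if seen == max_items:  # request the stop exactly once
            crawler.stop(reason=f'Collected {max_items} items.')
        return seen

    return record_item
```

A request handler could call `record_item()` each time it stores a result.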
Properties