Showing content from https://github.com/trinhminhtriet/github-toolkit below:

trinhminhtriet/github-toolkit: github-toolkit: Scrapes GitHub developers, followers, repositories into MySQL database.

A Python-based web scraper that collects GitHub developer information, their followers, and repository details using Selenium and stores the data in a MySQL database.

github-toolkit/
├── config/
│   └── settings.py           # Configuration and environment variables
├── core/
│   ├── entities.py          # Domain entities
│   └── exceptions.py        # Custom exceptions
├── infrastructure/
│   ├── database/           # Database-related code
│   │   ├── connection.py
│   │   └── models.py
│   └── auth/              # Authentication service
│       └── auth_service.py
├── services/
│   └── scraping/          # Scraping services
│       ├── github_developer_scraper.py
│       └── github_repo_scraper.py
├── utils/
│   └── helpers.py         # Utility functions
├── controllers/
│   └── github_scraper_controller.py  # Main controller
├── main.py                # Entry point
└── README.md

To install:

  1. Clone the repository:
git clone git@github.com:trinhminhtriet/github-toolkit.git
cd github-toolkit
  2. Create a virtual environment and activate it:
python3 -m venv .venv
source .venv/bin/activate
  3. Install dependencies (if requirements.txt does not exist yet, create it first with the packages listed below):
pip install -r requirements.txt
  4. Create a .env file in the root directory with the following variables:
GITHUB_USERNAME=your_username
GITHUB_PASSWORD=your_password
DB_USERNAME=your_db_username
DB_PASSWORD=your_db_password
DB_HOST=your_db_host
DB_NAME=your_db_name
  5. Create a config directory in the project root if it does not already exist (the project layout above places settings.py there).
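
As an illustration of how the .env variables above might be consumed, here is a minimal sketch of a settings loader. It is not taken from the project's config/settings.py: the `Settings` class, `load_settings` function, and the `mysql+pymysql` driver prefix are assumptions made for the example (the requirements list does not name a MySQL driver). python-dotenv's `load_dotenv()` is used only if the package is installed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import quote_plus

# Variable names come from the .env template in this README.
REQUIRED_KEYS = (
    "GITHUB_USERNAME", "GITHUB_PASSWORD",
    "DB_USERNAME", "DB_PASSWORD", "DB_HOST", "DB_NAME",
)


@dataclass(frozen=True)
class Settings:  # hypothetical container, not the project's actual class
    github_username: str
    github_password: str
    db_username: str
    db_password: str
    db_host: str
    db_name: str

    def database_url(self) -> str:
        # Assumption: PyMySQL as the driver; swap the prefix for another driver.
        user = quote_plus(self.db_username)
        password = quote_plus(self.db_password)  # escape characters such as "@"
        return f"mysql+pymysql://{user}:{password}@{self.db_host}/{self.db_name}"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, or from .env plus the process environment."""
    if env is None:
        try:
            from dotenv import load_dotenv  # provided by python-dotenv
            load_dotenv()  # reads .env from the working directory if present
        except ImportError:
            pass  # fall back to variables already exported in the shell
        env = os.environ
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return Settings(**{key.lower(): env[key] for key in REQUIRED_KEYS})
```

Failing fast on missing variables keeps a misconfigured .env from surfacing later as an opaque login or connection error.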

Create a requirements.txt file with:

selenium
sqlalchemy
python-dotenv
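
A quick way to confirm that the packages above are installed in the active virtual environment is to check that their import names resolve. This is a small sketch, not part of the project; `missing_modules` is a hypothetical helper. Note that python-dotenv is imported as `dotenv`.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names for the packages listed in requirements.txt.
REQUIRED_MODULES = ("selenium", "sqlalchemy", "dotenv")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All dependencies found.")
```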

Run the scraper from the project root (main.py is the entry point listed in the project layout):

python main.py

The scraper will:

  1. 🔑 Authenticate with GitHub
  2. 🌟 Scrape trending developers for specified languages
  3. 👥 Collect their followers (up to 1000 per developer)
  4. 📦 Scrape their repositories
  5. 💾 Store all data in the MySQL database
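
The five steps above can be sketched as a small orchestration function. This is an illustration of the described flow under stated assumptions, not the project's actual controller: `run_pipeline` and all of its callable parameters are hypothetical names, and the real scraping and storage code (Selenium, SQLAlchemy) is left to whatever is passed in. Only the step order and the 1000-follower cap come from this README.

```python
from itertools import islice
from typing import Callable, Iterable, List

MAX_FOLLOWERS = 1000  # per-developer cap stated in the README


def run_pipeline(
    languages: Iterable[str],
    authenticate: Callable[[], None],
    scrape_trending: Callable[[str], Iterable[str]],
    scrape_followers: Callable[[str], Iterable[str]],
    scrape_repos: Callable[[str], Iterable[str]],
    store: Callable[[str, List[str], List[str]], None],
) -> int:
    """Run the five steps in order and return the number of developers stored."""
    authenticate()  # step 1: log in once before scraping
    stored = 0
    for language in languages:
        for developer in scrape_trending(language):  # step 2
            # step 3: stop consuming followers once the cap is reached
            followers = list(islice(scrape_followers(developer), MAX_FOLLOWERS))
            repos = list(scrape_repos(developer))  # step 4
            store(developer, followers, repos)  # step 5
            stored += 1
    return stored
```

Passing the steps in as callables keeps the ordering logic testable without a browser or a database; `islice` enforces the cap lazily, so a follower generator is not advanced past the limit.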

To contribute:

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feature/your-feature).
  3. Commit changes (git commit -m "Add your feature").
  4. Push to the branch (git push origin feature/your-feature).
  5. Open a pull request.

This project is licensed under the MIT License - see the LICENSE file for details (create one if needed).

