Git is a distributed version control system that tracks changes in any set of computer files, usually used for coordinating work among programmers collaboratively developing source code during software development.
This notebook shows how to load text files from Git
repository.
%pip install --upgrade --quiet GitPython
from git import Repo
repo = Repo.clone_from(
"https://github.com/langchain-ai/langchain", to_path="./example_data/test_repo1"
)
branch = repo.head.reference
from langchain_community.document_loaders import GitLoader
loader = GitLoader(repo_path="./example_data/test_repo1/", branch=branch)
page_content='.venv\n.github\n.git\n.mypy_cache\n.pytest_cache\nDockerfile' metadata={'file_path': '.dockerignore', 'file_name': '.dockerignore', 'file_type': ''}
Clone repository from url
from langchain_community.document_loaders import GitLoader
loader = GitLoader(
clone_url="https://github.com/langchain-ai/langchain",
repo_path="./example_data/test_repo2/",
branch="master",
)
Filtering files to load
from langchain_community.document_loaders import GitLoader
loader = GitLoader(
repo_path="./example_data/test_repo1/",
file_filter=lambda file_path: file_path.endswith(".py"),
)
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4