Spark
Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools, including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
Document loaders
PySpark
PySparkDataFrameLoader loads data from a PySpark DataFrame.
See a usage example.
from langchain_community.document_loaders import PySparkDataFrameLoader
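As a minimal sketch of how the loader might be used, the function below builds a tiny in-memory DataFrame and passes it to PySparkDataFrameLoader. The row data, the column names, the function name, and the local Spark session settings are all assumptions made for illustration. The sketch also assumes that pyspark, langchain-community, and a working Java runtime are installed; the loader may additionally require psutil.

```python
# Illustrative rows and column names; they are assumptions made for this sketch.
COLUMNS = ["team", "wins"]
ROWS = [("Alpha", 98), ("Beta", 94), ("Gamma", 88)]


def load_documents_from_dataframe():
    """Build a small PySpark DataFrame and load it as LangChain documents."""
    # Imports are local so that defining this function does not require
    # pyspark or langchain-community to be installed.
    from pyspark.sql import SparkSession
    from langchain_community.document_loaders import PySparkDataFrameLoader

    # A single-threaded local session is enough for a sketch.
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("loader-sketch")
        .getOrCreate()
    )
    df = spark.createDataFrame(ROWS, schema=COLUMNS)

    # The "team" column is used as the page content of each document;
    # the other columns are expected to end up in the document metadata.
    loader = PySparkDataFrameLoader(spark, df, page_content_column="team")
    return loader.load()
```

Calling `load_documents_from_dataframe()` should return one document per row, each of which can be inspected through its `page_content` and `metadata` attributes.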
Spark SQL toolkit
Toolkit for interacting with Spark SQL.
See a usage example.
from langchain_community.agent_toolkits import SparkSQLToolkit, create_spark_sql_agent
from langchain_community.utilities.spark_sql import SparkSQL
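The imports above can be wired together roughly as follows. This is a sketch under stated assumptions, not a definitive setup: the function name `build_spark_sql_agent` is hypothetical, the `llm` argument stands for whichever LangChain language model you already have, and `schema` is the name of an existing Spark database (or None to use the current one). It also assumes that the langchain package is installed alongside langchain-community and pyspark.

```python
def build_spark_sql_agent(llm, schema=None):
    """Create an agent that answers questions by querying Spark SQL.

    `llm` is any LangChain language model supplied by the caller;
    `schema` optionally names the Spark database to work in.
    """
    # Local imports keep this function definable without Spark installed.
    from langchain_community.agent_toolkits import (
        SparkSQLToolkit,
        create_spark_sql_agent,
    )
    from langchain_community.utilities.spark_sql import SparkSQL

    # Wrap the Spark session; with no session passed, SparkSQL is expected
    # to obtain one itself.
    spark_sql = SparkSQL(schema=schema)

    # The toolkit bundles the Spark SQL tools and shares the LLM with them.
    toolkit = SparkSQLToolkit(db=spark_sql, llm=llm)

    # verbose=True prints the intermediate reasoning and tool calls.
    return create_spark_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
```

Assuming a table named `titanic` exists in the chosen database (a hypothetical name here), the returned executor could then be called with something like `agent.invoke({"input": "Describe the titanic table"})`.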
Spark SQL individual tools
You can use individual tools from the Spark SQL Toolkit:
InfoSparkSQLTool: tool for getting metadata about Spark SQL
ListSparkSQLTool: tool for getting table names
QueryCheckerTool: tool that uses an LLM to check whether a query is correct
QuerySparkSQLTool: tool for querying Spark SQL
from langchain_community.tools.spark_sql.tool import InfoSparkSQLTool
from langchain_community.tools.spark_sql.tool import ListSparkSQLTool
from langchain_community.tools.spark_sql.tool import QueryCheckerTool
from langchain_community.tools.spark_sql.tool import QuerySparkSQLTool
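When only some of these tools are needed, they can be instantiated directly instead of going through the toolkit. The sketch below is one way this might look; the function name `build_spark_sql_tools` is hypothetical, `spark_sql` is assumed to be a `SparkSQL` instance from `langchain_community.utilities.spark_sql`, and `llm` is an optional LangChain language model that is needed only for the query checker.

```python
def build_spark_sql_tools(spark_sql, llm=None):
    """Return individual Spark SQL tools bound to one SparkSQL wrapper.

    The LLM-backed query checker is included only when `llm` is given.
    """
    # Local import so the function can be defined without Spark installed.
    from langchain_community.tools.spark_sql.tool import (
        InfoSparkSQLTool,
        ListSparkSQLTool,
        QueryCheckerTool,
        QuerySparkSQLTool,
    )

    # These three tools need only the database wrapper.
    tools = [
        ListSparkSQLTool(db=spark_sql),
        InfoSparkSQLTool(db=spark_sql),
        QuerySparkSQLTool(db=spark_sql),
    ]

    # The checker asks an LLM to review a query before it is executed.
    if llm is not None:
        tools.append(QueryCheckerTool(db=spark_sql, llm=llm))
    return tools
```

Leaving out `llm` yields the three tools that do not call a model, which may be preferable when queries are written by hand rather than generated.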