apache/sedona: A cluster computing framework for processing large-scale geospatial data

Everyone is welcome to join our community events. We hold a community office hour every 4 weeks. Please register for the event you want to attend: https://bit.ly/3UBmxFY

Please join our Discord community!

For the mailing list, please first subscribe and then post emails. To subscribe, send an email (leave the subject and content blank) to dev-subscribe@sedona.apache.org

Apache Sedona™ is a spatial computing engine that enables developers to easily process spatial data at any scale within modern cluster computing systems such as Apache Spark and Apache Flink. Sedona developers can express their spatial data processing tasks in Spatial SQL, Spatial Python, or Spatial R. Internally, Sedona provides spatial data loading, indexing, partitioning, and query processing/optimization functionality that enables users to efficiently analyze spatial data at any scale.
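
As a brief illustration of the Spatial SQL interface from Python, the sketch below creates a Sedona-enabled Spark session and evaluates a simple spatial predicate. The local-mode builder settings and the assumption that the Sedona Spark jars are already on the classpath are illustrative, not the only way to set this up.

# Minimal sketch: create a Sedona-enabled Spark session and run Spatial SQL.
# Assumes the Sedona Spark jars are available (e.g. via spark.jars.packages).
from sedona.spark import SedonaContext

config = SedonaContext.builder().master("local[*]").getOrCreate()
sedona = SedonaContext.create(config)

# Evaluate a spatial predicate: is a point inside a bounding-box polygon?
sedona.sql(
    "SELECT ST_Contains("
    "ST_PolygonFromEnvelope(-74.01, 40.73, -73.93, 40.79), "
    "ST_Point(-73.97, 40.76)) AS in_envelope"
).show()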

Apache Sedona offers a broad set of features for spatial data processing; the exact capabilities available may vary with the specific version and configuration.

Apache Sedona is a widely used framework for working with spatial data and supports many different use cases and applications.

This example loads NYC taxi trip records and taxi zone information stored as CSV files on AWS S3 into Sedona spatial DataFrames. It then runs a spatial SQL query on the taxi trip dataset to keep only the records within the Manhattan area of New York. The example also shows a spatial join that matches taxi trip records to zones based on whether a trip's pickup point lies within the geographical extent of the zone. Finally, the last code snippet integrates the output of Sedona with GeoPandas and plots the spatial distribution of both datasets.
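
The snippets below assume a Sedona-enabled Spark session named sedona (as in the sketch above) with S3A access configured, plus GeoPandas imported as gpd for the plotting step. The following setup sketch is an assumption for illustration; the S3A configuration keys and credentials are placeholders.

# Sketch of the assumed setup; the S3A credentials are placeholders.
import geopandas as gpd
from sedona.spark import SedonaContext

config = (
    SedonaContext.builder()
    .config("spark.hadoop.fs.s3a.access.key", "YOUR_ACCESS_KEY")  # placeholder
    .config("spark.hadoop.fs.s3a.secret.key", "YOUR_SECRET_KEY")  # placeholder
    .getOrCreate()
)
sedona = SedonaContext.create(config)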

Load NYC taxi trips and taxi zones data from CSV files stored on AWS S3
taxidf = (
    sedona.read.format("csv")
    .option("header", "true")
    .option("delimiter", ",")
    .load("s3a://your-directory/data/nyc-taxi-data.csv")
)
taxidf = taxidf.selectExpr(
    "ST_Point(CAST(Start_Lon AS Decimal(24,20)), CAST(Start_Lat AS Decimal(24,20))) AS pickup",
    "Trip_Pickup_DateTime",
    "Payment_Type",
    "Fare_Amt",
)
zoneDf = (
    sedona.read.format("csv")
    .option("delimiter", ",")
    .load("s3a://your-directory/data/TIGER2018_ZCTA5.csv")
)
zoneDf = zoneDf.selectExpr("ST_GeomFromWKT(_c0) as zone", "_c1 as zipcode")
Spatial SQL query to return only taxi trips in Manhattan
taxidf_mhtn = taxidf.where(
    "ST_Contains(ST_PolygonFromEnvelope(-74.01,40.73,-73.93,40.79), pickup)"
)
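
A quick sanity check on the filtered result can help confirm the envelope covers the expected area; a small illustrative sketch:

# Quick sanity check on the filtered DataFrame (illustrative).
taxidf_mhtn.select("pickup", "Fare_Amt").show(5, truncate=False)
print("Pickups inside the Manhattan envelope:", taxidf_mhtn.count())
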
Spatial join between the taxi DataFrame and the zone DataFrame to find taxi trips in each zone
# Register both DataFrames as SQL views so the query can reference them by name
taxidf.createOrReplaceTempView("taxiDf")
zoneDf.createOrReplaceTempView("zoneDf")
taxiVsZone = sedona.sql(
    "SELECT zone, zipcode, pickup, Fare_Amt FROM zoneDf, taxiDf WHERE ST_Contains(zone, pickup)"
)
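
From the joined result, a typical follow-up is a per-zone aggregation; a brief illustrative sketch:

# Illustrative follow-up: count matched pickups per zipcode.
tripsPerZone = taxiVsZone.groupBy("zipcode").count()
tripsPerZone.orderBy("count", ascending=False).show(5)
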
Show a map of the loaded spatial DataFrames using GeoPandas
zoneGpd = gpd.GeoDataFrame(zoneDf.toPandas(), geometry="zone")
taxiGpd = gpd.GeoDataFrame(taxidf.toPandas(), geometry="pickup")

zone = zoneGpd.plot(color="yellow", edgecolor="black", zorder=1)
zone.set_xlabel("Longitude (degrees)")
zone.set_ylabel("Latitude (degrees)")

zone.set_xlim(-74.1, -73.8)
zone.set_ylim(40.65, 40.9)

taxi = taxiGpd.plot(ax=zone, alpha=0.01, color="red", zorder=3)
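
Outside a notebook, the figure needs to be rendered or saved explicitly; a minimal sketch using Matplotlib (the output filename is arbitrary):

import matplotlib.pyplot as plt

# Persist or display the map rendered above; the filename is arbitrary.
plt.savefig("nyc_taxi_pickups.png", dpi=150, bbox_inches="tight")
plt.show()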

We provide a Docker image for Apache Sedona with Python JupyterLab and a single-node cluster. The images are available on Docker Hub.

| Name | API | Introduction |
|------|-----|--------------|
| common | Java | Core geometric operation logics, serialization, index |
| spark | Spark RDD/DataFrame Scala/Java/SQL | Distributed geospatial data processing on Apache Spark |
| flink | Flink DataStream/Table in Scala/Java/SQL | Distributed geospatial data processing on Apache Flink |
| snowflake | Snowflake SQL | Distributed geospatial data processing on Snowflake |
| spark-shaded | No source code | Shaded jar for Sedona Spark |
| flink-shaded | No source code | Shaded jar for Sedona Flink |
| snowflake-tester | Java | Tester program for Sedona Snowflake |
| python | Spark RDD/DataFrame Python | Distributed geospatial data processing on Apache Spark |
| R | Spark RDD/DataFrame in R | R wrapper for Sedona |
| Zeppelin | Apache Zeppelin | Plugin for Apache Zeppelin 0.8.1+ |

Please visit the Apache Sedona website for detailed information.

