Showing content from https://www.mongodb.com/docs/languages/python/pymongo-arrow-driver/current/schemas/ below:

Schema Examples - PyMongoArrow v1.8

This guide shows examples of how to use PyMongoArrow schemas in common situations.

When performing aggregate or find operations, you can provide a schema for nested data by using the struct object. A sub-document can reuse a field name that also appears in its parent document, as the start field does in the following example:

>>> from pymongo import MongoClient
... from pymongoarrow.api import Schema, find_arrow_all
... from pyarrow import struct, field, int32
... coll = MongoClient().db.coll
... coll.insert_many(
...     [
...         {"start": "string", "prop": {"name": "foo", "start": 0}},
...         {"start": "string", "prop": {"name": "bar", "start": 10}},
...     ]
... )
... arrow_table = find_arrow_all(
...     coll, {}, schema=Schema({"start": str, "prop": struct([field("start", int32())])})
... )
... print(arrow_table)
pyarrow.Table
start: string
prop: struct<start: int32>
  child 0, start: int32
----
start: [["string","string"]]
prop: [
  -- is_valid: all not null
  -- child 0 type: int32
[0,10]]
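In the output above, only prop.start survives because the struct declares that single field; prop.name and _id are absent. The plain-Python sketch below mimics that selection on ordinary dictionaries to make the behavior concrete. It is an illustration only, not how PyMongoArrow is implemented: select_fields and the dict-based schema are hypothetical stand-ins for Schema and struct.

```python
def select_fields(doc, schema):
    """Keep only the keys named in `schema`, recursing into nested dicts.

    `schema` is a hypothetical stand-in for a PyMongoArrow Schema: it maps
    a field name to either a Python type or a nested dict of the same shape.
    """
    out = {}
    for name, spec in schema.items():
        value = doc.get(name)
        if isinstance(spec, dict) and isinstance(value, dict):
            # Nested field: apply the nested schema to the sub-document.
            out[name] = select_fields(value, spec)
        else:
            out[name] = value
    return out


docs = [
    {"start": "string", "prop": {"name": "foo", "start": 0}},
    {"start": "string", "prop": {"name": "bar", "start": 10}},
]

# "start" appears at both levels; each level is resolved independently.
schema = {"start": str, "prop": {"start": int}}
rows = [select_fields(d, schema) for d in docs]
print(rows)
```

The top-level start and the nested prop.start do not collide because each is looked up within its own level of the document.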

You can do the same thing when using Pandas and NumPy:

>>> from pymongoarrow.api import find_pandas_all
... df = find_pandas_all(
...     coll, {}, schema=Schema({"start": str, "prop": struct([field("start", int32())])})
... )
... print(df)
    start           prop
0  string   {'start': 0}
1  string  {'start': 10}

You can also use projections to flatten the data before passing it to PyMongoArrow. The following example illustrates how to do this by using a very simple nested document structure:

>>> from pymongoarrow.api import find_pandas_all
... from pymongoarrow.types import ObjectIdType
... df = find_pandas_all(
...     coll,
...     {
...         "prop.start": {
...             "$gte": 0,
...             "$lte": 10,
...         }
...     },
...     projection={"propName": "$prop.name", "propStart": "$prop.start"},
...     schema=Schema({"_id": ObjectIdType(), "propStart": int, "propName": str}),
... )
... print(df)
                                 _id  propStart propName
0  b'c\xec2\x98R(\xc9\x1e@#\xcc\xbb'          0      foo
1  b'c\xec2\x98R(\xc9\x1e@#\xcc\xbc'         10      bar
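To make the flattening step concrete, the following sketch builds a projection like the one above from a list of dotted paths and applies it to one sample document in plain Python. The helpers flat_projection and resolve are hypothetical names introduced here. The real projection runs on the MongoDB server, which also returns _id by default and omits missing fields rather than returning None, so treat this as a simplified model.

```python
def flat_projection(paths):
    """Map each dotted path to a camelCase top-level name.

    Example: "prop.start" becomes {"propStart": "$prop.start"}.
    """
    projection = {}
    for path in paths:
        head, *rest = path.split(".")
        name = head + "".join(part[:1].upper() + part[1:] for part in rest)
        projection[name] = "$" + path
    return projection


def resolve(doc, field_path):
    """Look up a "$a.b" style field path in a nested dict (None if missing)."""
    value = doc
    for key in field_path.lstrip("$").split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


projection = flat_projection(["prop.name", "prop.start"])
doc = {"start": "string", "prop": {"name": "foo", "start": 0}}

# Simplified local model of what the server-side projection produces.
flat = {name: resolve(doc, path) for name, path in projection.items()}
print(projection)
print(flat)
```

Because the flattened names sit at the top level, the schema can describe them with simple scalar types such as int and str instead of a struct.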

When performing an aggregate operation, you can flatten the fields by using the $project stage, as shown in the following example:

>>> from pymongoarrow.api import aggregate_pandas_all
... df = aggregate_pandas_all(
...     coll,
...     pipeline=[
...         {"$match": {"prop.start": {"$gte": 0, "$lte": 10}}},
...         {
...             "$project": {
...                 "propStart": "$prop.start",
...                 "propName": "$prop.name",
...             }
...         },
...     ],
... )
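When several queries share the same match-then-flatten shape, the pipeline can be assembled by a small helper. The sketch below only builds the list of stages and needs no database connection; flatten_pipeline is a hypothetical name, not part of PyMongoArrow or PyMongo.

```python
def flatten_pipeline(match, fields):
    """Build a two-stage pipeline: filter with $match, then flatten with $project.

    `fields` maps each output column name to a dotted source path.
    """
    project = {name: "$" + path for name, path in fields.items()}
    return [{"$match": match}, {"$project": project}]


pipeline = flatten_pipeline(
    {"prop.start": {"$gte": 0, "$lte": 10}},
    {"propStart": "prop.start", "propName": "prop.name"},
)
print(pipeline)
```

The resulting list has the same structure as the pipeline in the aggregate example and could be passed as its pipeline argument.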
