Showing content from http://cran.rstudio.com/web/packages/Rcpp/../SparseICA/../Rcpp/../cpp11/../arrow/news/news.html below:

NEWS

arrow 21.0.0
New features
Minor improvements and fixes

arrow 20.0.0.1
Minor improvements and fixes

arrow 20.0.0
Minor improvements and fixes

arrow 19.0.1.1
Minor improvements and fixes

arrow 19.0.1

This release primarily updates the underlying Arrow C++ version used by the package to version 19.0.1 and includes all changes from the 19.0.0 and 19.0.1 releases. For what’s changed in Arrow C++ 19.0.0, please see the blog post and changelog. For what’s changed in Arrow C++ 19.0.1, please see the blog post and changelog.

arrow 18.1.0
Minor improvements and fixes

arrow 17.0.0
New features
Minor improvements and fixes

arrow 16.1.0
New features
Minor improvements and fixes

arrow 15.0.1
New features
Minor improvements and fixes

arrow 14.0.2.1
Minor improvements and fixes

arrow 14.0.2
Minor improvements and fixes

arrow 14.0.0.2
Minor improvements and fixes
Installation

arrow 14.0.0.1
Minor improvements and fixes

arrow 14.0.0
New features
Minor improvements and fixes
Installation

arrow 13.0.0.1

arrow 13.0.0
Breaking changes
New features
Minor improvements and fixes
Installation
Docs

arrow 12.0.1.1

arrow 12.0.1

arrow 12.0.0
New features
Installation
Minor improvements and fixes

arrow 11.0.0.3
Minor improvements and fixes

arrow 11.0.0.2
Breaking changes
New features
Docs
Reading/writing data
dplyr compatibility
Function bindings
Arrow object creation
Installation
Minor improvements and fixes

arrow 10.0.1

Minor improvements and fixes:

arrow 10.0.0

Arrow dplyr queries

Several new functions can be used in queries:

The package now has documentation that lists all dplyr methods and R function mappings that are supported on Arrow data, along with notes about any differences in functionality between queries evaluated in R versus in Acero, the Arrow query engine. See ?acero.

A few new features and bugfixes were implemented for joins:

Some changes to improve the consistency of the API:

Finally, long-running queries can now be cancelled and will abort their computation immediately.

Arrays and tables

as_arrow_array() can now take blob::blob and vctrs::list_of, which convert to binary and list arrays, respectively. Also fixed an issue where as_arrow_array() ignored the type argument when passed a StructArray.
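As a minimal sketch of these conversions (assuming the blob and vctrs packages are installed; the sample values are made up for illustration):

```r
library(arrow)

# blob::blob() holds a list of raw vectors; arrow converts it to a binary array.
bin_arr <- as_arrow_array(blob::blob(as.raw(1:3), as.raw(4:5)))

# vctrs::list_of() is a typed list; arrow converts it to a list array.
list_arr <- as_arrow_array(vctrs::list_of(1:3, 4:6))

# Inspect the resulting Arrow types.
bin_arr$type
list_arr$type
```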

The unique() function works on Table, RecordBatch, Dataset, and RecordBatchReader.

Reading and writing

write_feather() can take compression = FALSE to choose writing uncompressed files.

Also, a breaking change for IPC files in write_dataset(): passing "ipc" or "feather" to format will now write files with .arrow extension instead of .ipc or .feather.
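The two changes above could be exercised roughly as follows. This is a sketch under the assumption that arrow 10.0.0 or later is installed; the temporary paths are illustrative only.

```r
library(arrow)

# Write an uncompressed Feather (Arrow IPC) file and read it back.
path <- tempfile(fileext = ".arrow")
write_feather(mtcars, path, compression = FALSE)
df <- read_feather(path)

# Write a dataset in the IPC format, then list the files that were created
# to check which extension they received.
out_dir <- tempfile()
write_dataset(mtcars, out_dir, format = "feather")
files <- list.files(out_dir, recursive = TRUE)
files
```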

Installation

As of version 10.0.0, arrow requires C++17 to build. This means that:

arrow 9.0.0
Arrow dplyr queries
Reading and writing
Arrays and tables
Packaging

arrow 8.0.0
Enhancements to dplyr and datasets
Enhancements to date and time support
Extensibility

Concatenation Support

Arrow arrays and tables can be easily concatenated:
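A short sketch of concatenation, assuming the concat_arrays() and concat_tables() helpers from the arrow package reference (these names are not given in the text above, and the sample data is made up):

```r
library(arrow)

# Concatenate two arrays of the same type into one array.
a <- Array$create(1:3)
b <- Array$create(4:6)
combined_array <- concat_arrays(a, b)

# Concatenate two tables that share a schema.
t1 <- arrow_table(x = 1:2, y = c("a", "b"))
t2 <- arrow_table(x = 3:4, y = c("c", "d"))
combined_table <- concat_tables(t1, t2)
```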

Other improvements and fixes

arrow 7.0.0
Enhancements to dplyr and datasets
CSV
Other improvements and fixes
Installation
Under-the-hood changes

arrow 6.0.1

arrow 6.0.0

There are now two ways to query Arrow data:

1. Expanded Arrow-native queries: aggregation and joins

dplyr::summarize(), both grouped and ungrouped, is now implemented for Arrow Datasets, Tables, and RecordBatches. Because data is scanned in chunks, you can aggregate over larger-than-memory datasets backed by many files. Supported aggregation functions include n(), n_distinct(), min(), max(), sum(), mean(), var(), sd(), any(), and all(). median() and quantile() with one probability are also supported and currently return approximate results using the t-digest algorithm.

Along with summarize(), you can also call count(), tally(), and distinct(), which effectively wrap summarize().
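For instance, a grouped aggregation over a multi-file dataset might look like the sketch below. It assumes dplyr is installed and uses mtcars written to a temporary directory as a stand-in for a real dataset.

```r
library(arrow)
library(dplyr)

# Create a small partitioned dataset (one directory per value of cyl).
ds_dir <- tempfile()
write_dataset(mtcars, ds_dir, partitioning = "cyl")
ds <- open_dataset(ds_dir)

# Grouped aggregation is evaluated by Arrow; collect() brings the result into R.
by_cyl <- ds %>%
  group_by(cyl) %>%
  summarize(n = n(), mean_mpg = mean(mpg), max_hp = max(hp)) %>%
  collect()

# count() wraps summarize(), so it works on the dataset as well.
counts <- ds %>%
  count(gear) %>%
  collect()
```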

This enhancement does change the behavior of summarize() and collect() in some cases: see “Breaking changes” below for details.

In addition to summarize(), mutating and filtering equality joins (inner_join(), left_join(), right_join(), full_join(), semi_join(), and anti_join()) are also supported natively in Arrow.

Grouped aggregation and (especially) joins should be considered somewhat experimental in this release. We expect them to work, but they may not be well optimized for all workloads. To help us focus our efforts on improving them in the next release, please let us know if you encounter unexpected behavior or poor performance.

New non-aggregating compute functions include string functions like str_to_title() and strftime() as well as compute functions for extracting date parts (e.g. year(), month()) from dates. This is not a complete list of additional compute functions; for an exhaustive list of available compute functions see list_compute_functions().

We’ve also worked to fill in support for all data types, such as Decimal, for functions added in previous releases. All type limitations mentioned in previous release notes should no longer be valid, and if you find a function that is not implemented for a certain data type, please report an issue.

2. DuckDB integration

If you have the duckdb package installed, you can hand off an Arrow Dataset or query object to DuckDB for further querying using the to_duckdb() function. This allows you to use duckdb’s dbplyr methods, as well as its SQL interface, to aggregate data. Filtering and column projection done before to_duckdb() is evaluated in Arrow, and duckdb can push down some predicates to Arrow as well. This handoff does not copy the data; instead, it uses Arrow’s C interface (just like passing Arrow data between R and Python). This means that no serialization or data copying costs are incurred.

You can also take a duckdb tbl and call to_arrow() to stream data to Arrow’s query engine. This means that in a single dplyr pipeline, you could start with an Arrow Dataset, evaluate some steps in DuckDB, then evaluate the rest in Arrow.
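A pipeline of this shape could be sketched as follows, assuming the duckdb, dbplyr, and dplyr packages are installed; an in-memory Arrow Table built from mtcars stands in for a real Dataset.

```r
library(arrow)
library(dplyr)

tbl <- arrow_table(mtcars)

result <- tbl %>%
  filter(cyl > 4) %>%                                # evaluated in Arrow
  to_duckdb() %>%                                    # hand off to DuckDB
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg, na.rm = TRUE)) %>%  # evaluated in DuckDB
  to_arrow() %>%                                     # stream back to Arrow
  collect()                                          # materialize in R
```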

Breaking changes
Installation on Linux
Other enhancements and fixes
Internals

arrow 5.0.0.2

This patch version contains fixes for some sanitizer and compiler warnings.

arrow 5.0.0
More dplyr
CSV writing
C interface
Other enhancements

arrow 4.0.1

arrow 4.0.0.1

arrow 4.0.0

dplyr methods

Many more dplyr verbs are supported on Arrow objects:

Over 100 functions can now be called on Arrow objects inside a dplyr verb:

Datasets
Other improvements
Installation and configuration

arrow 3.0.0
Python and Flight
Enhancements
Bug fixes
Packaging and installation

arrow 2.0.0
Datasets
AWS S3 support

Flight RPC

Flight is a general-purpose client-server framework for high performance transport of large datasets over network interfaces. The arrow R package now provides methods for connecting to Flight RPC servers to send and receive data. See vignette("flight", package = "arrow") for an overview.

Computation
Packaging and installation
Bug fixes and other enhancements

arrow 1.0.1
Bug fixes

arrow 1.0.0
Arrow format conversion
Datasets
Other enhancements
Bug fixes and deprecations
Installation and packaging

arrow 0.17.1

arrow 0.17.0

Feather v2

This release includes support for version 2 of the Feather file format. Feather v2 features full support for all Arrow data types, fixes the 2GB per-column limitation for large amounts of string data, and it allows files to be compressed using either lz4 or zstd. write_feather() can write either version 2 or version 1 Feather files, and read_feather() automatically detects which file version it is reading.
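As an illustrative sketch (the data frame and file paths are made up, and zstd is used only if the local Arrow build includes that codec), writing both Feather versions and reading them back might look like:

```r
library(arrow)

df <- data.frame(x = 1:3, y = c("a", "b", "c"))
v2_path <- tempfile(fileext = ".feather")
v1_path <- tempfile(fileext = ".feather")

# Feather v2 supports compression; fall back to uncompressed if zstd is missing.
codec <- if (codec_is_available("zstd")) "zstd" else "uncompressed"
write_feather(df, v2_path, version = 2, compression = codec)

# Feather v1 files can still be written when needed.
write_feather(df, v1_path, version = 1)

# read_feather() detects the file version on its own.
from_v2 <- read_feather(v2_path)
from_v1 <- read_feather(v1_path)
```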

Related to this change, several functions around reading and writing data have been reworked. read_ipc_stream() and write_ipc_stream() have been added to facilitate writing data to the Arrow IPC stream format, which is slightly different from the IPC file format (Feather v2 is the IPC file format).

Behavior has been standardized: all read_<format>() functions return an R data.frame (default) or a Table if the argument as_data_frame = FALSE; all write_<format>() functions return the data object, invisibly. To facilitate some workflows, a special write_to_raw() function is added to wrap write_ipc_stream() and return the raw vector containing the buffer that was written.
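A small round trip through a raw vector illustrates the standardized behavior. This is a sketch with a made-up data frame, not a prescribed workflow.

```r
library(arrow)

df <- data.frame(id = 1:3, label = c("a", "b", "c"))

# Serialize to the IPC stream format and get the bytes back as a raw vector.
buf <- write_to_raw(df)

# By default the reader returns an R data frame.
round_trip <- read_ipc_stream(buf)

# With as_data_frame = FALSE it returns an Arrow Table instead.
as_table <- read_ipc_stream(buf, as_data_frame = FALSE)
```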

To achieve this standardization, read_table(), read_record_batch(), read_arrow(), and write_arrow() have been deprecated.

Python interoperability

The 0.17 Apache Arrow release includes a C data interface that allows exchanging Arrow data in-process at the C level without copying and without libraries having a build or runtime dependency on each other. This enables us to use reticulate to share data between R and Python (pyarrow) efficiently.

See vignette("python", package = "arrow") for details.

Datasets
Installation
Other bug fixes and enhancements

arrow 0.16.0.2

arrow 0.16.0

Multi-file datasets

This release includes a dplyr interface to Arrow Datasets, which let you work efficiently with large, multi-file datasets as a single entity. Explore a directory of data files with open_dataset() and then use dplyr methods to select(), filter(), etc. Work will be done where possible in Arrow memory. When necessary, data is pulled into R for further computation. dplyr methods are conditionally loaded if you have dplyr available; it is not a hard dependency.

See vignette("dataset", package = "arrow") for details.
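As a rough sketch of the multi-file workflow (assuming dplyr is installed; the directory and file names are invented, and mtcars is split into two Parquet files to simulate a larger dataset):

```r
library(arrow)
library(dplyr)

# Build a directory containing two Parquet files.
cars <- data.frame(mtcars, row.names = NULL)
data_dir <- tempfile()
dir.create(data_dir)
write_parquet(cars[1:16, ], file.path(data_dir, "part-1.parquet"))
write_parquet(cars[17:32, ], file.path(data_dir, "part-2.parquet"))

# Treat the directory as a single dataset and query it with dplyr verbs.
ds <- open_dataset(data_dir)
small <- ds %>%
  filter(cyl == 4) %>%
  select(mpg, cyl, hp) %>%
  collect()
```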

Linux installation

A source package installation (as from CRAN) will now handle its C++ dependencies automatically. For common Linux distributions and versions, installation will retrieve a prebuilt static C++ library for inclusion in the package; where this binary is not available, the package executes a bundled script that should build the Arrow C++ library with no system dependencies beyond what R requires.

See vignette("install", package = "arrow") for details.

Data exploration
Compression
Other fixes and improvements

arrow 0.15.1

arrow 0.15.0
Breaking changes
New features
Other upgrades

arrow 0.14.1

Initial CRAN release of the arrow package. Key features include:

