Showing content from https://github.com/tensorflow/io/issues/315 below:

Standardize columnized dataset? · Issue #315 · tensorflow/io · GitHub

With the upcoming DatasetV2, many of the APIs are being simplified. That also opens up possibilities beyond just passing the dataset to tf.keras.

One area of interest is that we already support many columnized datasets, e.g., Arrow, Avro, Parquet, JSON, HDF5, etc. Those datasets could potentially be standardized on the same API so that we can treat them homogeneously. For example, ArrowDataset already exposes a columns() property method. We could apply the same to Avro, Parquet, JSON, HDF5, etc. Thoughts?

Since those columnized datasets hold largely numeric values, I think we could also have a common base class for them and support additional operations. For example, dataset_1 + dataset_2 => dataset_3 (add), where dataset_3 could be passed to tf.keras. The implementation could start with zip + map in Python (C++ would not even be needed). Maybe this is a use case that would help users?
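To make the idea concrete, here is a rough, framework-free sketch of what such a base class might look like. Everything in it is an assumption for illustration: the class name ColumnarDataset, its constructor, and the rule that both operands must have matching column names are all hypothetical and not part of tensorflow/io. It uses Python's built-in zip and map as stand-ins; with real datasets the same shape would presumably be expressed with tf.data.Dataset.zip followed by Dataset.map.

```python
class ColumnarDataset:
    """Hypothetical common base class for column-oriented datasets."""

    def __init__(self, columns, rows):
        # Column names plus rows of numeric values, one value per column.
        self._columns = list(columns)
        self._rows = [tuple(row) for row in rows]

    @property
    def columns(self):
        # Mirrors the idea of a shared columns accessor across formats.
        return list(self._columns)

    def __iter__(self):
        return iter(self._rows)

    def __add__(self, other):
        # Element-wise add: zip the two datasets row by row, then map
        # an addition over each pair of rows.
        if not isinstance(other, ColumnarDataset):
            return NotImplemented
        if self.columns != other.columns:
            # Assumed rule for this sketch: columns must line up exactly.
            raise ValueError("column names must match")
        summed = map(
            lambda pair: tuple(a + b for a, b in zip(*pair)),
            zip(self, other),
        )
        return ColumnarDataset(self._columns, summed)


dataset_1 = ColumnarDataset(["x", "y"], [(1.0, 2.0), (3.0, 4.0)])
dataset_2 = ColumnarDataset(["x", "y"], [(10.0, 20.0), (30.0, 40.0)])
dataset_3 = dataset_1 + dataset_2

print(dataset_3.columns)
print(list(dataset_3))
```

Because the add is only zip + map, nothing here would need C++; the open question is mostly which operations and which column-matching rules the base class should define.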

/cc @terrytangyuan @BryanCutler

