Module arrow2::io::parquet::read

Available on crate feature io_parquet only.

APIs to read from Parquet format.

Re-exports

pub use parquet2::fallible_streaming_iterator;
pub use schema::infer_schema;

Modules

API to perform page-level filtering (also known as indexes)
APIs to handle Parquet <-> Arrow schemas.
APIs exposing parquet2’s statistics as arrow’s statistics.

Structs

Metadata for a column chunk.
A descriptor for leaf-level primitive columns. This encapsulates information such as definition and repetition levels and is used to re-assemble nested data.
A CompressedDataPage is a compressed, encoded representation of a Parquet data page. It holds actual data, so cloning it is expensive.
Decompressor that allows re-using the page buffer of PageIterator.
Metadata for a Parquet file.
An iterator of Chunks coming from row groups of a parquet file.
A fallible Iterator of CompressedDataPage. This iterator reads pages back to back until all pages have been consumed. For pages from this iterator, CompressedDataPage::selected_rows() always returns None, since filter pushdown is not supported without a pre-computed page index.
A MutStreamingIterator of pre-read column chunks
An Iterator of Chunk that (dynamically) adapts a vector of iterators of Array into an iterator of Chunk.
Metadata for a row group.
An Iterator<Item=RowGroupDeserializer> from row groups of a parquet file.
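
As an illustration of the "vector of iterators into an iterator of Chunk" adaptation described above, here is a minimal standard-library sketch. It is not the crate's implementation: `ColumnsToChunks` is a hypothetical name, a `Vec<i32>` stands in for an Array, and a `Vec` of such arrays stands in for a Chunk.

```rust
/// Hypothetical adapter: holds one iterator per column and yields one
/// "chunk" (one array per column) per call to `next`.
struct ColumnsToChunks<I> {
    columns: Vec<I>,
}

impl<I: Iterator> Iterator for ColumnsToChunks<I> {
    // A chunk is modelled as a Vec holding one item from every column.
    type Item = Vec<I::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        // Without columns there is nothing to assemble.
        if self.columns.is_empty() {
            return None;
        }
        // Pull one array from each column; collecting into Option
        // yields None as soon as any column is exhausted.
        self.columns.iter_mut().map(|column| column.next()).collect()
    }
}
```

With two columns that each produce two arrays, the adapter yields two chunks, each holding one array per column. A real adapter would also have to propagate deserialization errors, which this sketch leaves out.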

Enums

A Page is an uncompressed, encoded representation of a Parquet page. It may hold actual data and thus cloning it may be expensive.
Errors generated by this crate
Representation of a Parquet type describing primitive and nested fields, including the top-level schema of the parquet file.
The set of all physical types representable in Parquet

Traits

A fallible, streaming iterator.
A special kind of fallible streaming iterator where advance consumes the iterator.
Trait describing a FallibleStreamingIterator of Page
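
To make the "fallible, streaming" idea concrete, the sketch below defines a simplified stand-in trait using only the standard library. The names `FallibleStreaming` and `Utf8Pages` are hypothetical and are not this crate's API; the point is only that items are borrowed from the iterator (so an internal buffer can be reused) and that advancing can fail.

```rust
/// Hypothetical, simplified stand-in for a fallible streaming iterator.
trait FallibleStreaming {
    type Item: ?Sized;
    type Error;

    /// Moves to the next element; this step may fail.
    fn advance(&mut self) -> Result<(), Self::Error>;

    /// Borrows the current element, or `None` if there is none.
    fn get(&self) -> Option<&Self::Item>;

    /// Advances, then borrows the current element.
    fn next(&mut self) -> Result<Option<&Self::Item>, Self::Error> {
        self.advance()?;
        Ok(self.get())
    }
}

/// Decodes byte "pages" as UTF-8 into a single reused buffer.
struct Utf8Pages<'a> {
    pages: std::slice::Iter<'a, &'a [u8]>,
    buffer: String,
    valid: bool,
}

impl<'a> Utf8Pages<'a> {
    fn new(pages: &'a [&'a [u8]]) -> Self {
        Self { pages: pages.iter(), buffer: String::new(), valid: false }
    }
}

impl<'a> FallibleStreaming for Utf8Pages<'a> {
    type Item = str;
    type Error = std::str::Utf8Error;

    fn advance(&mut self) -> Result<(), Self::Error> {
        // Until decoding succeeds there is no current element.
        self.valid = false;
        if let Some(page) = self.pages.next() {
            let text = std::str::from_utf8(page)?;
            // Reuse the allocation instead of creating a new String per page.
            self.buffer.clear();
            self.buffer.push_str(text);
            self.valid = true;
        }
        Ok(())
    }

    fn get(&self) -> Option<&str> {
        if self.valid { Some(self.buffer.as_str()) } else { None }
    }
}
```

Because `get` returns a borrow tied to the iterator, the caller cannot hold on to an element while advancing. That restriction is what makes buffer reuse safe, and it is the main difference from a standard `Iterator`.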

Functions

Reads the column indexes of all ColumnChunkMetaData and deserializes them into Index. Returns an empty vector if indexes are not available.
Reads a FileMetaData from the reader, located at the end of the file.
Asynchronously reads the file’s metadata
An iterator adapter that maps multiple iterators of Pages into an iterator of Arrays.
Decompresses the page, using buffer for decompression. If page.buffer.len() == 0, there was no decompression and the buffer was moved. Else, decompression took place.
Returns a ColumnIterator of column chunks corresponding to field.
Returns all ColumnChunkMetaData associated to field_name. For non-nested parquet types, this returns a single column.
Returns all ColumnChunkMetaData associated to field_name. For non-nested parquet types, this returns a single column.
Creates a new iterator of compressed pages.
Returns a stream of compressed data pages
Reads all columns that are part of the parquet field field_name
Reads all columns that are part of the parquet field field_name
Returns a vector of iterators of Array (ArrayIter) corresponding to the top-level parquet fields whose names match the names in fields.
Returns a vector of iterators of Array corresponding to the top-level parquet fields whose names match the names in fields.
Reads a parquet file’s metadata synchronously.
Reads a parquet file’s metadata asynchronously.
Reads PageLocations from the ColumnChunkMetaDatas. Returns an empty vector if indexes are not available.
Converts a vector of columns associated with the given parquet Field into an ArrayIter, an iterator of Array with chunk size chunk_size.
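
The metadata readers above rely on the metadata being located at the end of the file. Per the Parquet file format, an unencrypted file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`. The sketch below uses only the standard library to find where the serialized metadata starts. It is an illustration under those assumptions, not this crate's implementation: `locate_metadata` is a hypothetical helper, and it does not decode the Thrift-encoded metadata itself.

```rust
use std::io::{self, Read, Seek, SeekFrom};

const MAGIC: &[u8; 4] = b"PAR1";
// Trailer: 4-byte little-endian metadata length + 4-byte magic.
const FOOTER_SIZE: u64 = 8;

/// Hypothetical helper: returns (offset, length) of the serialized
/// metadata in an unencrypted Parquet file.
fn locate_metadata<R: Read + Seek>(reader: &mut R) -> io::Result<(u64, u64)> {
    let invalid = |msg: &str| io::Error::new(io::ErrorKind::InvalidData, msg.to_string());

    let file_len = reader.seek(SeekFrom::End(0))?;
    // Smallest possible file: leading magic plus the 8-byte trailer.
    if file_len < MAGIC.len() as u64 + FOOTER_SIZE {
        return Err(invalid("file too small to be Parquet"));
    }

    // Read the last 8 bytes of the file.
    reader.seek(SeekFrom::End(-(FOOTER_SIZE as i64)))?;
    let mut footer = [0u8; 8];
    reader.read_exact(&mut footer)?;

    if footer[4..] != MAGIC[..] {
        return Err(invalid("missing PAR1 magic at end of file"));
    }

    let metadata_len =
        u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]) as u64;
    // The metadata must fit between the leading magic and the trailer.
    if metadata_len > file_len - MAGIC.len() as u64 - FOOTER_SIZE {
        return Err(invalid("metadata length exceeds file size"));
    }

    Ok((file_len - FOOTER_SIZE - metadata_len, metadata_len))
}
```

A caller would then seek to the returned offset and hand that many bytes to a Thrift decoder. Validating the length before using it prevents a corrupted trailer from causing an out-of-range seek or an oversized allocation.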

Type Definitions

Type definition for a shareable, boxed dyn Iterator of arrays
Type declaration for a page filter