Module arrow2::io::parquet::write

Available on crate feature io_parquet only.

APIs to write to Parquet format.

Arrow/Parquet Interoperability

As of parquet-format v2.9 there are Arrow DataTypes which do not have a parquet representation. These include but are not limited to:

  • DataType::Timestamp(TimeUnit::Second, _)
  • DataType::Int64
  • DataType::Duration
  • DataType::Date64
  • DataType::Time32(TimeUnit::Second)

Using these Arrow types results in no logical type being stored in the parquet file.
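A minimal end-to-end write can be sketched as follows. This assumes the API around arrow2 0.17 (the `WriteOptions` fields, `RowGroupIterator`, and `FileWriter` names); exact fields and signatures vary between versions:

```rust
use arrow2::{
    array::Int32Array,
    chunk::Chunk,
    datatypes::{DataType, Field, Schema},
    io::parquet::write::{
        transverse, CompressionOptions, Encoding, FileWriter, RowGroupIterator, Version,
        WriteOptions,
    },
};

fn main() -> arrow2::error::Result<()> {
    // A one-column schema and a single chunk of data.
    let schema = Schema::from(vec![Field::new("c1", DataType::Int32, true)]);
    let chunk = Chunk::new(vec![Int32Array::from_slice([1, 2, 3]).boxed()]);

    let options = WriteOptions {
        write_statistics: true,
        compression: CompressionOptions::Uncompressed,
        version: Version::V2,
        data_pagesize_limit: None, // field present in recent versions only
    };

    // Assign an Encoding to every leaf parquet column of every field.
    let encodings = schema
        .fields
        .iter()
        .map(|f| transverse(&f.data_type, |_| Encoding::Plain))
        .collect();

    // Convert an iterator of chunks into an iterator of row groups
    // consumable by the writer.
    let row_groups = RowGroupIterator::try_new(
        vec![Ok(chunk)].into_iter(),
        &schema,
        options,
        encodings,
    )?;

    let file = std::fs::File::create("example.parquet")?;
    let mut writer = FileWriter::try_new(file, schema, options)?;
    for group in row_groups {
        writer.write(group?)?;
    }
    // Writes the footer and returns the number of bytes written.
    let _size = writer.end(None)?;
    Ok(())
}
```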

Re-exports

pub use parquet2::fallible_streaming_iterator;

Structs

Represents a valid brotli compression level.
A CompressedDataPage is a compressed, encoded representation of a Parquet data page. It holds actual data and thus cloning it is expensive.
A FallibleStreamingIterator that consumes Page and yields CompressedPage holding a reusable buffer (Vec<u8>) for compression.
A descriptor of a parquet column. It contains the necessary information to deserialize a parquet column.
DynIter is an implementation of a single-threaded, dynamically-typed iterator.
Common type information.
Metadata for a Parquet file.
Sink that writes array chunks as a Parquet file.
An interface to write a parquet file to a Write
Represents a valid gzip compression level.
Wrapper struct to store key values
An iterator adapter that converts an iterator over Chunk into an iterator of row groups. Use it to create an iterator consumable by the parquet’s API.
A schema descriptor. This encapsulates the top-level schemas for all the columns, as well as all descriptors for all the primitive columns.
Description for file metadata
Currently supported options to write to parquet
Represents a valid zstd compression level.

Enums

A CompressedPage is a compressed, encoded representation of a Parquet page. It holds actual data and thus cloning it is expensive.
Defines the compression settings for writing a parquet file.
Descriptor of nested information of a field
A Page is an uncompressed, encoded representation of a Parquet page. It may hold actual data and thus cloning it may be expensive.
The set of all physical types representable in Parquet
Representation of a Parquet type describing primitive and nested fields, including the top-level schema of the parquet file.
The parquet version to use

Traits

A fallible, streaming iterator.

Functions

Returns a vector of iterators of Page, one per leaf column in the array
Converts an Array to a CompressedPage based on options, descriptor and encoding.
Converts an Array to a CompressedPage based on options, descriptor and encoding.
Returns an iterator of Page.
Checks whether the data_type can be encoded as encoding. Note that this is whether this implementation supports it, which is a subset of what the parquet spec allows.
Compresses an [EncodedPage] into a CompressedPage using compressed_buffer as the intermediary buffer.
Maps a Chunk and parquet-specific options to a RowGroupIter used to write to parquet
Creates a parquet SchemaDescriptor from a Schema.
Traverses the data_type down to its (parquet) leaf columns and returns a vector of items based on map. This is used to assign an Encoding to every parquet column based on the column’s type (see example)
Writes a parquet file containing only the header and footer
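For instance, the schema-conversion and traversal functions can be combined to derive a parquet schema and pick a per-column Encoding from each leaf type. A sketch, again assuming arrow2 ~0.17 (the `to_parquet_schema` and `transverse` names are from that release; the specific Encoding choices here are illustrative, not a recommendation):

```rust
use arrow2::{
    datatypes::{DataType, Field, Schema},
    io::parquet::write::{to_parquet_schema, transverse, Encoding},
};

fn main() -> arrow2::error::Result<()> {
    let schema = Schema::from(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("name", DataType::Utf8, true),
    ]);

    // Creates a parquet SchemaDescriptor from the Arrow Schema.
    let parquet_schema = to_parquet_schema(&schema)?;
    assert_eq!(parquet_schema.columns().len(), 2);

    // One Vec<Encoding> per field, with one Encoding per leaf parquet
    // column (nested fields yield more than one leaf).
    let _encodings: Vec<Vec<Encoding>> = schema
        .fields
        .iter()
        .map(|f| {
            transverse(&f.data_type, |data_type| match data_type {
                DataType::Utf8 | DataType::LargeUtf8 => Encoding::DeltaLengthByteArray,
                _ => Encoding::Plain,
            })
        })
        .collect();
    Ok(())
}
```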

Type Definitions