bytebuild-0.3.16.3: Build byte arrays
Safe Haskell: None
Language: Haskell2010

Data.Bytes.Builder

Bounded Primitives

data Builder #

An unmaterialized sequence of bytes that may be pasted into a mutable byte array.

Instances

ToBuilder Builder #

Identity

Instance details

Defined in Data.Bytes.Builder.Class

Methods

toBuilder :: Builder -> Builder #

Monoid Builder # 
Instance details

Defined in Data.Bytes.Builder.Unsafe

Semigroup Builder # 
Instance details

Defined in Data.Bytes.Builder.Unsafe

IsString Builder # 
Instance details

Defined in Data.Bytes.Builder.Unsafe

Methods

fromString :: String -> Builder #

fromBounded :: forall (n :: Nat). Nat n -> Builder n -> Builder #

Convert a bounded builder to an unbounded one. If the size is a constant, use Arithmetic.Nat.constant as the first argument to let GHC conjure up this value for you.

Evaluation

run #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks 

Run a builder.
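As a rough usage sketch, the following composes two fixed-width encoders with the Semigroup instance and runs the result. The helper name encodeHeader is hypothetical, and the example assumes that the byteslice package provides Data.Bytes.Chunks.concat and an IsList instance for Bytes; neither is documented on this page.

```haskell
import qualified Data.Bytes.Builder as Builder
import qualified Data.Bytes.Chunks as Chunks
import Data.Word (Word16, Word32, Word8)
import qualified GHC.Exts as Exts

-- Hypothetical helper: a 2-byte tag followed by a 4-byte length,
-- both written in big-endian order.
encodeHeader :: Word16 -> Word32 -> [Word8]
encodeHeader tag len =
  Exts.toList        -- assumption: IsList instance for Bytes (byteslice)
    (Chunks.concat   -- assumption: Data.Bytes.Chunks.concat :: Chunks -> Bytes
      (Builder.run 4080 (Builder.word16BE tag <> Builder.word32BE len)))
```

The initial chunk size of 4080 follows the suggestion in the argument documentation above.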

runOnto #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks

Suffix

-> Chunks 

Run a builder. The resulting chunks are consed onto the beginning of an existing sequence of chunks.

runOntoLength #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks

Suffix

-> (Int, Chunks) 

Variant of runOnto that additionally returns the number of bytes consed onto the suffix.

reversedOnto #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks 
-> Chunks 

Variant of runOnto that conses the additional chunks in reverse order.

putMany #

Arguments

:: Foldable f 
=> Int

Size of shared chunk (use 8176 if uncertain)

-> (a -> Builder)

Value builder

-> f a

Collection of values

-> (MutableBytes RealWorld -> IO b)

Consume chunks.

-> IO () 

Run a builder against lots of elements. This fills the same underlying buffer over and over again. Do not let the argument to the callback escape from the callback (i.e. do not write it to an IORef). Also, do not unsafeFreezeByteArray any of the mutable byte arrays in the callback. The intent is that the callback will write the buffer out.

putManyConsLength #

Arguments

:: forall f m (n :: Nat) a b. (Foldable f, MonadIO m) 
=> Nat n

Number of bytes used by the serialization of the length

-> (Int -> Builder n)

Length serialization function

-> Int

Size of shared chunk (use 8176 if uncertain)

-> (a -> Builder)

Value builder

-> f a

Collection of values

-> (MutableBytes RealWorld -> m b)

Consume chunks.

-> m () 

Variant of putMany that prefixes each pushed array of chunks with the number of bytes that the chunks in each batch required. (This excludes the bytes required to encode the length itself.) This is useful for chunked HTTP encoding.

Materialized Byte Sequences

bytes :: Bytes -> Builder #

Create a builder from a sliced byte sequence. The variants copy and insert provide more control over whether or not the byte sequence is copied or aliased. This function is preferred when the user does not know the size of the byte sequence.

chunks :: Chunks -> Builder #

Paste byte chunks into a builder.

copy :: Bytes -> Builder #

Create a builder from a byte sequence. This always results in a call to memcpy. This is beneficial when the byte sequence is known to be small (less than 256 bytes).

copyCons :: Word8 -> Bytes -> Builder #

Variant of copy that additionally pastes an extra byte in front of the bytes.

copy2 :: Bytes -> Bytes -> Builder #

Create a builder from two byte sequences. This always results in two calls to memcpy. This is beneficial when the byte sequences are known to be small (less than 256 bytes).

insert :: Bytes -> Builder #

Create a builder from a byte sequence. This never calls memcpy. Instead, it pushes a chunk that references the argument byte sequence. This wastes the remaining space in the active chunk, so it may adversely affect performance if used carelessly. See flush for a way to mitigate this problem. This function is most beneficial when the byte sequence is known to be large (more than 8192 bytes).

byteArray :: ByteArray -> Builder #

Create a builder from an unsliced byte sequence. Implemented with bytes.

shortByteString :: ShortByteString -> Builder #

Create a builder from a short bytestring. Implemented with bytes.

textUtf8 :: Text -> Builder #

Create a builder from text. The text will be UTF-8 encoded.

shortTextUtf8 :: ShortText -> Builder #

Create a builder from text. The text will be UTF-8 encoded.

shortTextJsonString :: ShortText -> Builder #

Create a builder from text. The text will be UTF-8 encoded, and JSON special characters will be escaped. Additionally, the result is surrounded by double quotes. For example:

  • foo ==> "foo" (no escape sequences)
  • \_"_/ ==> "\\_\"_/" (escapes backslashes and quotes)
  • hello<ESC>world ==> "hello\u001Bworld" (where <ESC> is code point 0x1B)
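The three examples above can be reproduced with a small reference model that works on String rather than ShortText. This is only a sketch: the name jsonStringModel is hypothetical, and the treatment of control characters is an assumption drawn from the <ESC> example (the library may use short escapes such as \n for some of them).

```haskell
import Data.Char (ord, toUpper)
import Numeric (showHex)

-- Hypothetical reference model, not the library's implementation.
-- Produces the escaped text as a String; the library emits UTF-8 bytes.
jsonStringModel :: String -> String
jsonStringModel s = "\"" ++ concatMap escape s ++ "\""
  where
    escape '"'  = "\\\""   -- quote becomes backslash-quote
    escape '\\' = "\\\\"   -- backslash becomes two backslashes
    escape c
      -- Assumption: every control character uses the \uXXXX form.
      | ord c < 0x20 = "\\u" ++ pad4 (map toUpper (showHex (ord c) ""))
      | otherwise    = [c]
    pad4 h = replicate (4 - length h) '0' ++ h
```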

cstring :: CString -> Builder #

Create a builder from a NUL-terminated CString. This ignores any textual encoding, copying bytes until NUL is reached.

cstringLen :: CStringLen -> Builder #

Create a builder from a C string with explicit length. The builder must be executed before the C string is freed.

stringUtf8 :: String -> Builder #

Create a builder from a cons-list of Char. The characters are UTF-8 encoded in the output.

Byte Sequence Encodings

sevenEightRight :: Bytes -> Builder #

Encode seven bytes into eight so that the encoded form is eight-bit clean. Specifically, this segments the input bytes into 7-bit groups (lowest-to-highest index byte, most-to-least significant bit within a byte), pads the last group with trailing zeros, and forms octets by prepending a zero to each group.

The name was chosen because this pads the input bits with zeros on the right, and also because this was likely the originally intended behavior of the SMILE standard (see sevenEightSmile). Right padding the input bits to a multiple of seven, as in this variant, is consistent with base64 encodings (which encode 3 bytes in 4) and base85 (which encodes 4 bytes in 5).

sevenEightSmile :: Bytes -> Builder #

Encode seven bytes into eight so that the encoded form is eight-bit clean. Specifically, this segments the input bytes into 7-bit groups (lowest-to-highest index byte, most-to-least significant bit within a byte), then pads each group with zeros on the left until each group is an octet.

The name was chosen because this is the implementation that is used (probably unintentionally) in the reference SMILE implementation, and so is expected to be accepted by existing SMILE consumers.
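The difference between the two seven-to-eight variants only shows up in the final, possibly short, group. The following list-based model is a sketch of the bit layout as described in the prose; the function names are hypothetical and the code is not the library's implementation.

```haskell
import Data.Bits (testBit)
import Data.List (foldl')
import Data.Word (Word8)

-- Flatten bytes into bits, most significant bit of each byte first.
toBits :: [Word8] -> [Bool]
toBits = concatMap (\w -> [testBit w i | i <- [7, 6 .. 0]])

-- Split a bit list into groups of at most seven bits.
groupsOf7 :: [Bool] -> [[Bool]]
groupsOf7 [] = []
groupsOf7 bs = let (g, rest) = splitAt 7 bs in g : groupsOf7 rest

-- Read a bit list (most significant bit first) as a number.
fromBits :: [Bool] -> Word8
fromBits = foldl' (\acc b -> acc * 2 + (if b then 1 else 0)) 0

-- Last group is padded with trailing zeros before becoming an octet.
sevenEightRightModel :: [Word8] -> [Word8]
sevenEightRightModel = map (fromBits . padRight) . groupsOf7 . toBits
  where padRight g = g ++ replicate (7 - length g) False

-- Every group is padded with leading zeros, which leaves its value unchanged.
sevenEightSmileModel :: [Word8] -> [Word8]
sevenEightSmileModel = map fromBits . groupsOf7 . toBits
```

For the single input byte 0xFF, the first model yields 0x7F 0x40 and the second yields 0x7F 0x01. When the input length is a multiple of seven bytes, the two agree.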

Encode Integral Types

Human-Readable

word64Dec :: Word64 -> Builder #

Encodes an unsigned 64-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

word32Dec :: Word32 -> Builder #

Encodes an unsigned 32-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

word16Dec :: Word16 -> Builder #

Encodes an unsigned 16-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

word8Dec :: Word8 -> Builder #

Encodes an unsigned 8-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

wordDec :: Word -> Builder #

Encodes an unsigned machine-sized integer as decimal. This encoding never starts with a zero unless the argument was zero.

naturalDec :: Natural -> Builder #

Encodes an unsigned arbitrary-precision integer as decimal. This encoding never starts with a zero unless the argument was zero.

int64Dec :: Int64 -> Builder #

Encodes a signed 64-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

int32Dec :: Int32 -> Builder #

Encodes a signed 32-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

int16Dec :: Int16 -> Builder #

Encodes a signed 16-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

int8Dec :: Int8 -> Builder #

Encodes a signed 8-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

intDec :: Int -> Builder #

Encodes a signed machine-sized integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

integerDec :: Integer -> Builder #

Encodes a signed arbitrary-precision integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

Unsigned Words

64-bit

word64PaddedUpperHex :: Word64 -> Builder #

Encode a 64-bit unsigned integer as hexadecimal, zero-padding the encoding to 16 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 00000000000003FE.

32-bit

word32PaddedUpperHex :: Word32 -> Builder #

Encode a 32-bit unsigned integer as hexadecimal, zero-padding the encoding to 8 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 000003FE.

16-bit

word16PaddedUpperHex :: Word16 -> Builder #

Encode a 16-bit unsigned integer as hexadecimal, zero-padding the encoding to 4 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 03FE.

word16PaddedLowerHex :: Word16 -> Builder #

Encode a 16-bit unsigned integer as hexadecimal, zero-padding the encoding to 4 digits. This uses lowercase for the alphabetical digits. For example, this encodes the number 1022 as 03fe.

word16LowerHex :: Word16 -> Builder #

Encode a 16-bit unsigned integer as hexadecimal without leading zeroes. This uses lowercase for the alphabetical digits. For example, this encodes the number 1022 as 3fe.

word16UpperHex :: Word16 -> Builder #

Encode a 16-bit unsigned integer as hexadecimal without leading zeroes. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 3FE.
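The four 16-bit hexadecimal variants differ only in letter case and zero-padding. A minimal String-based model, using base's Numeric.showHex, illustrates the expected text; the names below are hypothetical and stand in for the byte-producing builders.

```haskell
import Data.Char (toUpper)
import Data.Word (Word16)
import Numeric (showHex)

-- Model of word16LowerHex: no leading zeroes, lowercase digits.
lowerHexModel :: Word16 -> String
lowerHexModel w = showHex w ""

-- Model of word16UpperHex: no leading zeroes, uppercase digits.
upperHexModel :: Word16 -> String
upperHexModel = map toUpper . lowerHexModel

-- Left-pad with '0' to a fixed width (4 for the padded 16-bit variants).
padTo :: Int -> String -> String
padTo n s = replicate (n - length s) '0' ++ s
```

With these definitions, padTo 4 (upperHexModel 1022) gives 03FE and lowerHexModel 1022 gives 3fe, matching the examples above.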

8-bit

word8PaddedUpperHex :: Word8 -> Builder #

Encode an 8-bit unsigned integer as hexadecimal, zero-padding the encoding to 2 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 11 as 0B.

word8LowerHex :: Word8 -> Builder #

Encode an 8-bit unsigned integer as hexadecimal without leading zeroes. This uses lowercase for the alphabetical digits. For example, this encodes the number 254 as fe.

ascii :: Char -> Builder #

Encode an ASCII char. Precondition: Input must be an ASCII character. This is not checked.

ascii2 :: Char -> Char -> Builder #

Encode two ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

ascii3 :: Char -> Char -> Char -> Builder #

Encode three ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

ascii4 :: Char -> Char -> Char -> Char -> Builder #

Encode four ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

ascii5 :: Char -> Char -> Char -> Char -> Char -> Builder #

Encode five ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

ascii6 :: Char -> Char -> Char -> Char -> Char -> Char -> Builder #

Encode six ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

ascii7 :: Char -> Char -> Char -> Char -> Char -> Char -> Char -> Builder #

Encode seven ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

ascii8 :: Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Builder #

Encode eight ASCII characters. Precondition: Inputs must be ASCII characters. This is not checked.

char :: Char -> Builder #

Encode a UTF-8 char. This only uses as much space as is required.
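"Only as much space as is required" means one to four bytes depending on the code point. The sketch below models standard UTF-8 for a single Char; the name utf8Model is hypothetical, and how the library treats surrogate code points is not covered here.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical reference model of standard UTF-8 for one code point.
utf8Model :: Char -> [Word8]
utf8Model c
  | n < 0x80    = [fromIntegral n]                            -- 1 byte
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]                     -- 2 bytes
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]            -- 3 bytes
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]   -- 4 bytes
  where
    n = ord c
    -- Leading payload bits for the first byte.
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)
```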

Machine-Readable

One

word8 :: Word8 -> Builder #

Requires exactly 1 byte.

Big Endian

word256BE :: Word256 -> Builder #

Requires exactly 32 bytes. Dump the octets of a 256-bit word in a big-endian fashion.

word128BE :: Word128 -> Builder #

Requires exactly 16 bytes. Dump the octets of a 128-bit word in a big-endian fashion.

word64BE :: Word64 -> Builder #

Requires exactly 8 bytes. Dump the octets of a 64-bit word in a big-endian fashion.

word32BE :: Word32 -> Builder #

Requires exactly 4 bytes. Dump the octets of a 32-bit word in a big-endian fashion.

word16BE :: Word16 -> Builder #

Requires exactly 2 bytes. Dump the octets of a 16-bit word in a big-endian fashion.

int64BE :: Int64 -> Builder #

Requires exactly 8 bytes. Dump the octets of a 64-bit signed integer in a big-endian fashion.

int32BE :: Int32 -> Builder #

Requires exactly 4 bytes. Dump the octets of a 32-bit signed integer in a big-endian fashion.

int16BE :: Int16 -> Builder #

Requires exactly 2 bytes. Dump the octets of a 16-bit signed integer in a big-endian fashion.

Little Endian

word256LE :: Word256 -> Builder #

Requires exactly 32 bytes. Dump the octets of a 256-bit word in a little-endian fashion.

word128LE :: Word128 -> Builder #

Requires exactly 16 bytes. Dump the octets of a 128-bit word in a little-endian fashion.

word64LE :: Word64 -> Builder #

Requires exactly 8 bytes. Dump the octets of a 64-bit word in a little-endian fashion.

word32LE :: Word32 -> Builder #

Requires exactly 4 bytes. Dump the octets of a 32-bit word in a little-endian fashion.

word16LE :: Word16 -> Builder #

Requires exactly 2 bytes. Dump the octets of a 16-bit word in a little-endian fashion.

int64LE :: Int64 -> Builder #

Requires exactly 8 bytes. Dump the octets of a 64-bit signed integer in a little-endian fashion.

int32LE :: Int32 -> Builder #

Requires exactly 4 bytes. Dump the octets of a 32-bit signed integer in a little-endian fashion.

int16LE :: Int16 -> Builder #

Requires exactly 2 bytes. Dump the octets of a 16-bit signed integer in a little-endian fashion.
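To make the byte order concrete, here is a list-based model of the 32-bit big-endian and little-endian encoders. The names are hypothetical, and the signed variant assumes the usual two's-complement reinterpretation performed by fromIntegral.

```haskell
import Data.Bits (shiftR)
import Data.Int (Int32)
import Data.Word (Word32, Word8)

-- Most significant octet first.
word32BEModel :: Word32 -> [Word8]
word32BEModel w = [fromIntegral (w `shiftR` s) | s <- [24, 16, 8, 0]]

-- Least significant octet first.
word32LEModel :: Word32 -> [Word8]
word32LEModel = reverse . word32BEModel

-- Assumption: signed values are dumped as their two's-complement bit pattern.
int32BEModel :: Int32 -> [Word8]
int32BEModel = word32BEModel . fromIntegral
```

For example, 0x01020304 becomes 01 02 03 04 in big-endian order and 04 03 02 01 in little-endian order.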

LEB128

intLEB128 :: Int -> Builder #

Encode a signed machine-sized integer with LEB-128. This uses zig-zag encoding.

int32LEB128 :: Int32 -> Builder #

Encode a 32-bit signed integer with LEB-128. This uses zig-zag encoding.

int64LEB128 :: Int64 -> Builder #

Encode a 64-bit signed integer with LEB-128. This uses zig-zag encoding.

wordLEB128 :: Word -> Builder #

Encode a machine-sized word with LEB-128.

word16LEB128 :: Word16 -> Builder #

Encode a 16-bit word with LEB-128.

word32LEB128 :: Word32 -> Builder #

Encode a 32-bit word with LEB-128.

word64LEB128 :: Word64 -> Builder #

Encode a 64-bit word with LEB-128.
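The LEB-128 encoders can be modeled in a few lines: unsigned values are emitted seven bits at a time, least significant group first, with the high bit set on every byte except the last. Signed values are first mapped to unsigned ones with zig-zag encoding, as stated above. The names here are hypothetical and the code is a sketch, not the library's implementation.

```haskell
import Data.Bits (shiftL, shiftR, xor, (.&.), (.|.))
import Data.Int (Int64)
import Data.Word (Word64, Word8)

-- Unsigned LEB-128: low seven bits first, continuation bit 0x80 on all but the last byte.
leb128Model :: Word64 -> [Word8]
leb128Model w
  | w < 0x80  = [fromIntegral w]
  | otherwise = (fromIntegral (w .&. 0x7F) .|. 0x80) : leb128Model (w `shiftR` 7)

-- Zig-zag: 0, -1, 1, -2, ... map to 0, 1, 2, 3, ...
-- shiftR on Int64 is an arithmetic shift, so the second operand is 0 or -1.
zigZag :: Int64 -> Word64
zigZag i = fromIntegral ((i `shiftL` 1) `xor` (i `shiftR` 63))

-- Model of int64LEB128: zig-zag, then unsigned LEB-128.
int64LEB128Model :: Int64 -> [Word8]
int64LEB128Model = leb128Model . zigZag
```

Under this model, 300 encodes as AC 02, and the signed values -1 and 1 encode as 01 and 02 respectively.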

VLQ

wordVlq :: Word -> Builder #

Encode a machine-sized word with VLQ.

word32Vlq :: Word32 -> Builder #

Encode a 32-bit word with VLQ.

word64Vlq :: Word64 -> Builder #

Encode a 64-bit word with VLQ.
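VLQ is the big-endian counterpart of LEB-128: seven-bit groups are emitted most significant first, and every byte except the last has its high bit set. The sketch below assumes the library follows this common (MIDI-style) definition; the name vlqModel is hypothetical.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word64, Word8)

-- Hypothetical reference model of VLQ, assuming the common MIDI-style layout.
vlqModel :: Word64 -> [Word8]
vlqModel w = go (w `shiftR` 7) [fromIntegral (w .&. 0x7F)]
  where
    -- The accumulator starts with the final byte (no continuation bit);
    -- higher groups are prepended with the continuation bit set.
    go 0 acc = acc
    go n acc = go (n `shiftR` 7) ((fromIntegral (n .&. 0x7F) .|. 0x80) : acc)
```

Under this model, 300 encodes as 82 2C, the byte-reversed arrangement of its LEB-128 groups with the continuation bit moved to the leading byte.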

Many

word8Array :: PrimArray Word8 -> Int -> Int -> Builder #

Create a builder from a slice of an array of Word8. This is the same as bytes but is provided as a convenience for users working with different types.

Big Endian

Little Endian

Prefixing with Length

consLength #

Arguments

:: forall (n :: Nat). Nat n

Number of bytes used by the serialization of the length

-> (Int -> Builder n)

Length serialization function

-> Builder

Builder whose length is measured

-> Builder 

Prefix a builder with the number of bytes that it requires.

consLength32LE :: Builder -> Builder #

Variant of consLength32BE that encodes the length in a little-endian fashion.

consLength32BE :: Builder -> Builder #

Prefix a builder with its size in bytes. This size is presented as a big-endian 32-bit word. The need to prefix a builder with its length shows up in a number of wire protocols, including those of PostgreSQL and Apache Kafka. Note the equivalence:

forall (n :: Int) (x :: Builder).
  let sz = Chunks.length (run n x)
  in consLength32BE x === word32BE (fromIntegral sz) <> x

However, using consLength32BE is much more efficient here since it only materializes the ByteArray once.
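The intended result can be modeled on plain lists of bytes. This sketch assumes the prefix counts only the payload bytes and not the four bytes of the prefix itself; the name consLength32BEModel is hypothetical.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word32, Word8)

-- Hypothetical reference model: a 4-byte big-endian payload length, then the payload.
-- Assumption: the length excludes the prefix itself.
consLength32BEModel :: [Word8] -> [Word8]
consLength32BEModel payload = prefix ++ payload
  where
    n = fromIntegral (length payload) :: Word32
    prefix = [fromIntegral (n `shiftR` s) | s <- [24, 16, 8, 0]]
```

A three-byte payload AA BB CC therefore becomes 00 00 00 03 AA BB CC. The model traverses the payload twice (once to measure, once to append), which is the cost the builder version avoids.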

consLength64BE :: Builder -> Builder #

Prefix a builder with its size in bytes. This size is presented as a big-endian 64-bit word. See consLength32BE.

Encode Floating-Point Types

Human-Readable

doubleDec :: Double -> Builder #

Encode a double-precision floating-point number, using decimal notation or scientific notation depending on the magnitude. The output is unspecified for +inf, -inf, and NaN: the function will not crash, but the generated digits will be nonsense.

Replication

replicate #

Arguments

:: Int

Number of times to replicate the byte

-> Word8

Byte to replicate

-> Builder 

Replicate a byte the given number of times.

Control

flush :: Int -> Builder #

Push the buffer currently being filled onto the chunk list, allocating a new active buffer of the requested size. This is helpful when a small builder is sandwiched between two large zero-copy builders:

insert bigA <> flush 1 <> word8 0x42 <> insert bigB

Without flush 1, word8 0x42 would see the zero-byte active buffer that insert returned, decide that it needed more space, and allocate a 4080-byte buffer to which only a single byte would be written.

Rebuild

rebuild :: Builder -> Builder #

This function and the documentation for it are copied from Takano Akio's fast-builder library.

rebuild b is equivalent to b, but it allows GHC to assume that b will be run at most once. This can enable various optimizations that greatly improve performance.

There are two types of typical situations where a use of rebuild is often a win:

  • When constructing a builder using a recursive function. e.g. rebuild $ foldr ....
  • When constructing a builder using a conditional expression. e.g. rebuild $ case x of ...