Reading

kevin-montrose edited this page Apr 10, 2021 · 9 revisions

Introduction

Cesil splits reading into two interfaces, IReader<TRow> and IAsyncReader<TRow>, for synchronous and asynchronous operations respectively. Each interface supports the same conceptual operations.

Reader interfaces are obtained with the CreateReader(...) and CreateAsyncReader(...) methods on IBoundConfiguration<TRow> instances. Configurations are created with the Configuration static class's For<TRow>() and ForDynamic() methods.

Particulars of the format being read, and the manner to initialize and mutate the created TRow, are controlled by the Options (and the ITypeDescriber on it) provided to create the IBoundConfiguration<TRow>. By default, Options.Default or Options.DynamicDefault are used - both of which use the Default Type Describer.
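The steps above can be sketched as follows. This is a minimal illustration, assuming a hypothetical row type Person and an in-memory StringReader as the data source; neither is part of Cesil, and passing Options.Default explicitly is assumed to match the default behavior.

```csharp
using System.Collections.Generic;
using System.IO;
using Cesil;

public static class CreateReaderExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Person
    {
        public string Name { get; set; } = "";
        public int Age { get; set; }
    }

    public static List<Person> ReadPeople(string csv)
    {
        // Bind a configuration for Person using the default Options.
        IBoundConfiguration<Person> config = Configuration.For<Person>(Options.Default);

        // The reader wraps the TextReader; disposing the reader also disposes it.
        using (TextReader text = new StringReader(csv))
        using (IReader<Person> reader = config.CreateReader(text))
        {
            return reader.ReadAll();
        }
    }
}
```

A call such as CreateReaderExample.ReadPeople("Name,Age\r\nAda,36\r\nAlan,41") would be expected to yield two Person instances, with the first line treated as a header.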

Comments, assuming they are configured in the used Options, are skipped automatically by methods that do not explicitly return them.

Synchronous Reading

These methods all block if data is not available and the underlying data stream has not completed.

If you know the underlying data stream will always return data and immediately complete (such as reading from a MemoryStream that has already been completely populated), this is the more efficient interface to use.

If your use case is entirely synchronous (i.e. there's no other work your code could yield its thread to, you're calling ValueTask<T>.Result, etc.), then use this interface to avoid the inefficiency of sync-over-async.

EnumerateAll()

This method returns an IEnumerable<TRow> which can be used to create an IEnumerator<TRow>. Each call to MoveNext() consumes only as much data as is needed to return the next row. Enumerating multiple times is not supported.

Typically, you will want to use this method with foreach or LINQ rather than consuming it directly.
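As a rough sketch of the LINQ case, assuming a hypothetical Item row type and an in-memory data source, the enumerable can be consumed in a single pass like this:

```csharp
using System.IO;
using System.Linq;
using Cesil;

public static class EnumerateAllExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Item
    {
        public string Name { get; set; } = "";
        public int Count { get; set; }
    }

    public static int SumCounts(string csv)
    {
        using (IReader<Item> reader = Configuration.For<Item>().CreateReader(new StringReader(csv)))
        {
            // Rows are pulled lazily while Sum() enumerates.
            // The sequence is enumerated exactly once, before the reader is disposed.
            return reader.EnumerateAll().Sum(item => item.Count);
        }
    }
}
```

Because Sum() finishes enumerating before the using block exits, no row is requested from a disposed reader.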

ReadAll()

This method reads all rows into a List<TRow> and returns it. This method does its work eagerly, only returning once all rows have been read and the end of data has been reached.

This method is conceptually equivalent to calling ReadAll(new List<TRow>()).

ReadAll(TCollection)

This method reads all rows into a provided implementation of ICollection<TRow> and then returns that collection. This method does its work eagerly, only returning once all rows have been read and the end of data has been reached.

Modifying the provided collection while this method is running may result in undefined behavior.
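One possible use, sketched under the assumption that rows are added to the provided collection without clearing it first, is to accumulate rows from several sources into a single list. The Item type and the ReadBoth helper are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using Cesil;

public static class ReadAllIntoExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Item
    {
        public string Name { get; set; } = "";
        public int Count { get; set; }
    }

    public static List<Item> ReadBoth(string firstCsv, string secondCsv)
    {
        IBoundConfiguration<Item> config = Configuration.For<Item>();
        var rows = new List<Item>();

        foreach (string csv in new[] { firstCsv, secondCsv })
        {
            // A fresh reader per source; the same list is passed each time.
            using (IReader<Item> reader = config.CreateReader(new StringReader(csv)))
            {
                reader.ReadAll(rows);
            }
        }

        return rows;
    }
}
```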

TryRead(out TRow)

This method attempts to read a single row, returning true if it does and false if it doesn't. If it does read a row it is stored in the out parameter, otherwise default(TRow) is stored.

An instance of TRow is always obtained from the configured ITypeDescriber on each call. To avoid creating excess rows, row instances can be reused with TryReadWithReuse(ref TRow).

TryReadWithReuse(ref TRow)

This method attempts to read a single row, returning true if a row is read and false otherwise. If a row is read, it is stored into the value passed as the ref parameter if that value is non-null; otherwise an instance of TRow is obtained from the configured ITypeDescriber.

If false is returned, the value of the ref parameter should not be relied upon.
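A minimal sketch of a reuse loop follows, assuming a hypothetical Item row type. Since the same instance is overwritten on every iteration, anything needed later has to be copied out before the next call.

```csharp
using System.IO;
using Cesil;

public static class TryReadWithReuseExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Item
    {
        public string Name { get; set; } = "";
        public int Count { get; set; }
    }

    public static int TotalCount(string csv)
    {
        int total = 0;

        using (IReader<Item> reader = Configuration.For<Item>().CreateReader(new StringReader(csv)))
        {
            // A caller-provided, non-null instance that each read writes into.
            Item row = new Item();

            while (reader.TryReadWithReuse(ref row))
            {
                // Copy out what is needed; row is overwritten by the next read.
                total += row.Count;
            }
        }

        return total;
    }
}
```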

TryReadWithComment()

This method attempts to read a row or a comment, returning a ReadWithCommentResult<TRow> representing the result.

An instance of TRow is always obtained from the configured ITypeDescriber on each call. To avoid creating excess rows, row instances can be reused with TryReadWithCommentReuse(ref TRow).
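The sketch below shows one way to tell rows and comments apart. It assumes '#' is configured as the comment character through an options builder, and that ReadWithCommentResult<TRow> exposes HasValue and HasComment; the Item type and the CountRowsAndComments helper are hypothetical.

```csharp
using System.IO;
using Cesil;

public static class TryReadWithCommentExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Item
    {
        public string Name { get; set; } = "";
        public int Count { get; set; }
    }

    public static (int Rows, int Comments) CountRowsAndComments(string csv)
    {
        // Assumption: comments start with '#'; everything else uses the defaults.
        Options options = Options.CreateBuilder(Options.Default)
            .WithCommentCharacter('#')
            .ToOptions();

        int rows = 0;
        int comments = 0;

        using (IReader<Item> reader = Configuration.For<Item>(options).CreateReader(new StringReader(csv)))
        {
            while (true)
            {
                ReadWithCommentResult<Item> result = reader.TryReadWithComment();

                if (result.HasValue)
                {
                    rows++;
                }
                else if (result.HasComment)
                {
                    comments++;
                }
                else
                {
                    // Neither a row nor a comment: the end of data was reached.
                    break;
                }
            }
        }

        return (rows, comments);
    }
}
```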

TryReadWithCommentReuse(ref TRow)

This method attempts to read a row or a comment, returning a ReadWithCommentResult<TRow> representing the result. If a row is read it will be stored into the value passed as a ref parameter if it is non-null, otherwise an instance of TRow will be obtained from the configured ITypeDescriber.

Asynchronous Reading

These methods do not block. If at any point data is not available, they yield control back to the calling thread and return a ValueTask or ValueTask<T>. If it is possible to complete without yielding control, these methods complete synchronously with only the minor overhead imposed by returning a ValueTask.

Every method takes an optional CancellationToken, which is checked periodically to see if cancellation is requested. Cancellation leaves the reader in an exceptional state; it is not legal to resume using a reader post-cancellation.

This is the preferred interface for most development scenarios, especially those (like web development) where blocking threads can lead to serious issues. However, if your use case is naturally synchronous or you know that all data will be available without blocking then the synchronous interface is more efficient.

EnumerateAllAsync(CancellationToken)

This method returns an IAsyncEnumerable<TRow> which can be used to create an IAsyncEnumerator<TRow>. Each call to MoveNextAsync() consumes only as much data as is needed to return the next row. Enumerating multiple times is not supported.

Typically, you will want to use this method with await foreach rather than consuming it directly.
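A minimal await foreach sketch follows, assuming a hypothetical Item row type. To avoid relying on the exact parameter list of EnumerateAllAsync, the token is supplied here through the standard WithCancellation extension on IAsyncEnumerable<T>.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Cesil;

public static class EnumerateAllAsyncExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Item
    {
        public string Name { get; set; } = "";
        public int Count { get; set; }
    }

    public static async Task<List<string>> ReadNamesAsync(string csv, CancellationToken cancel = default)
    {
        var names = new List<string>();

        await using (IAsyncReader<Item> reader = Configuration.For<Item>().CreateAsyncReader(new StringReader(csv)))
        {
            // Rows are produced one at a time; the sequence is enumerated only once.
            await foreach (Item item in reader.EnumerateAllAsync().WithCancellation(cancel))
            {
                names.Add(item.Name);
            }
        }

        return names;
    }
}
```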

ReadAllAsync(CancellationToken)

This method reads all rows into a List<TRow>, and returns a ValueTask<List<TRow>> wrapping it. When the task completes, the list is fully populated.

This method is conceptually equivalent to calling ReadAllAsync(new List<TRow>(), CancellationToken).

ReadAllAsync(TCollection, CancellationToken)

This method reads all rows into a provided implementation of ICollection<TRow>, and returns a ValueTask<TCollection> wrapping it. When the task completes, the collection is fully populated.

Modifying the provided collection while the returned task has not yet completed may result in undefined behavior.

TryReadAsync(CancellationToken)

This method attempts to read a single row, returning a ValueTask<ReadResult<TRow>> representing the result.

An instance of TRow is always obtained from the configured ITypeDescriber on each call. To avoid creating excess rows, row instances can be reused with TryReadWithReuseAsync(ref TRow, CancellationToken).
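Reading row by row might look like the sketch below. It assumes ReadResult<TRow> exposes HasValue and Value, and uses a hypothetical Item row type.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Cesil;

public static class TryReadAsyncExample
{
    // Hypothetical row type, used only for illustration.
    public sealed class Item
    {
        public string Name { get; set; } = "";
        public int Count { get; set; }
    }

    public static async Task<int> TotalCountAsync(string csv, CancellationToken cancel = default)
    {
        int total = 0;

        await using (IAsyncReader<Item> reader = Configuration.For<Item>().CreateAsyncReader(new StringReader(csv)))
        {
            while (true)
            {
                // Each returned ValueTask is awaited exactly once.
                ReadResult<Item> result = await reader.TryReadAsync(cancel);

                if (!result.HasValue)
                {
                    break; // end of data
                }

                total += result.Value.Count;
            }
        }

        return total;
    }
}
```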

TryReadWithReuseAsync(ref TRow, CancellationToken)

This method attempts to read a single row, returning a ValueTask<ReadResult<TRow>> representing the result. If a row is read it will be stored into the value passed as a ref parameter if it is non-null, otherwise an instance of TRow will be obtained from the configured ITypeDescriber.

Modifying or re-using the instance, if any, the ref parameter was pointing to while the task has not yet completed may result in undefined behavior.

TryReadWithCommentAsync(CancellationToken)

This method attempts to read a row or a comment, returning a ValueTask<ReadWithCommentResult<TRow>> representing the result.

An instance of TRow is always obtained from the configured ITypeDescriber on each call. To avoid creating excess rows, row instances can be reused with TryReadWithCommentReuseAsync(ref TRow, CancellationToken).

TryReadWithCommentReuseAsync(ref TRow, CancellationToken)

This method attempts to read a row or a comment, returning a ValueTask<ReadWithCommentResult<TRow>> representing the result. If a row is read it will be stored into the value passed as a ref parameter if it is non-null, otherwise an instance of TRow will be obtained from the configured ITypeDescriber.

Modifying or re-using the instance, if any, the ref parameter was pointing to while the task has not yet completed may result in undefined behavior.

Disposing

Reading can block (or, in the asynchronous case, return tasks that do not complete) until the underlying data stream completes, so readers take conceptual ownership of the underlying data stream. This means that calling Dispose() (on IReader<TRow>) or DisposeAsync() (on IAsyncReader<TRow>) will invoke the equivalent method on the underlying data stream.

If TRow is bound to dynamic, disposing may result in all returned rows also being disposed - this behavior is controlled by the used Options.

Typically, disposing will be handled with either using or await using statements rather than direct calls to the appropriate methods.

Thread Safety

No methods exposed by any reader are thread safe; invoking any of them simultaneously may result in undefined behavior. That said, it is legal to change the thread invoking methods provided there is no overlap in invocation.

As all async methods return a ValueTask or ValueTask<T>, it is illegal to await or invoke AsTask() on their returns multiple times.
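The single-consumption rule is a general property of ValueTask rather than something specific to Cesil. The sketch below uses a hypothetical ProduceAsync method as a stand-in for a reader call, and shows one common workaround when a result must be observed more than once: convert to a Task exactly once and reuse that Task.

```csharp
using System.Threading.Tasks;

public static class ValueTaskExample
{
    // Hypothetical stand-in for a reader method; not part of Cesil.
    private static async ValueTask<int> ProduceAsync()
    {
        await Task.Yield();
        return 42;
    }

    public static async Task<int> ConsumeAsync()
    {
        // Fine: the ValueTask is awaited exactly once.
        int first = await ProduceAsync();

        // If the result is needed in several places, call AsTask() once
        // and await the resulting Task as often as required.
        Task<int> task = ProduceAsync().AsTask();
        int second = await task;
        int third = await task;

        return first + second + third;
    }
}
```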