Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first attempt at resolving nil dereference errors #439

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

shlomi-dr
Copy link

@shlomi-dr shlomi-dr commented Jan 9, 2022

I was battling some nil dereference that was concealed deep in the library. I added a bunch of debug.Stack() calls to retain the original stacktrace, and found that in the following section pT is some times nil:

parquet-go/common/common.go

Lines 667 to 671 in 5f74d27

func FindFuncTable(pT *parquet.Type, cT *parquet.ConvertedType, logT *parquet.LogicalType) FuncTable {
if cT == nil && logT == nil {
if *pT == parquet.Type_BOOLEAN {
return boolFuncTable{}
} else if *pT == parquet.Type_INT32 {

Not sure why that happens, see below comment for proposed fix

Comment on lines +305 to +308
if table.Schema.Type != nil {
pagesMapList[index][name], _ = layout.TableToDataPages(table, int32(pw.PageSize),
pw.CompressionType)
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to wrap the invocation with a nil guard, and that seems to have helped.

Ran the test suite and it looks fine.

Is this correct? Am I missing something?

@shlomi-dr shlomi-dr force-pushed the shlomi/nil-dereference branch 2 times, most recently from a60aec2 to 5f74d27 Compare January 9, 2022 23:44
@hangxie
Copy link
Contributor

hangxie commented Jan 30, 2022

This PR is related to #395 and I'd suggest fix it like #407, which changes function signature to return (value, error) instead value only, this is a typical go way to my understanding. Once the function can return error, one can check pT before dereference it, and return error to notify callers.

The preferred way may introduce large scale refactor though.

zolstein pushed a commit to zolstein/parquet-go that referenced this pull request Jun 23, 2023
* better estimation of the size of decode output buffers

* detect whether some encodings support decoding values in place

* fix page size

* chunk large writes of rows

* support configuring the sorting buffer pool

* debug page buffer ref count

* track stack traces of buffer allocations

* fix buffer reference counting

* fix backward compatiblity with Go 1.17

* set PARQUETGODEBUG=tracbuf=1 in CI

* reduce memory waste of row buffer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants