Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add arrow support for timestamp and bytea #724

Merged
merged 3 commits into from
Mar 6, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 7 additions & 2 deletions schema/bytea.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ package schema

import (
"bytes"
"encoding/base64"
"encoding/hex"
)

Expand Down Expand Up @@ -41,7 +42,7 @@ func (dst *Bytea) Equal(src CQType) bool {

func (dst *Bytea) String() string {
if dst.Status == Present {
return hex.EncodeToString(dst.Bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we use this String method? Is this not a breaking change as we return a different format of the string?
Trying to understand the impact of this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it is a breaking change (I did add a way for us to support both conversion in Set) but yes moving forward we will use base64 encoding for bytea to work like arrow. I can add a new type but I don't think we have anything that depends on that.

return base64.StdEncoding.EncodeToString(dst.Bytes)
} else {
return ""
}
Expand Down Expand Up @@ -72,7 +73,11 @@ func (dst *Bytea) Set(src any) error {
b := make([]byte, hex.DecodedLen(len(value)))
_, err := hex.Decode(b, []byte(value))
if err != nil {
return &ValidationError{Type: TypeByteArray, Msg: "hex decode failed", Err: err, Value: value}
b = make([]byte, base64.StdEncoding.DecodedLen(len(value)))
_, err := base64.StdEncoding.Decode(b, []byte(value))
if err != nil {
return &ValidationError{Type: TypeByteArray, Msg: "base64 and hex decode failed", Err: err, Value: value}
}
}
*dst = Bytea{Status: Present, Bytes: b}
} else {
Expand Down
2 changes: 1 addition & 1 deletion schema/resource_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -438,7 +438,7 @@ var calculateCQIDPrimaryKeyTestCases = []struct {
"MacAddressValue": "aa:bb:cc:dd:ee:ff",
"MacAddressArrayValue": []string{"aa:bb:cc:dd:ee:ff", "11:22:33:44:55:66"},
},
ExpectedValue: &UUID{Bytes: [16]uint8{0xa6, 0x76, 0x76, 0xfb, 0x4c, 0x95, 0x51, 0x2d, 0x9e, 0xcd, 0xf4, 0xc4, 0xae, 0xc1, 0x2a, 0xf5}, Status: 0x2},
ExpectedValue: &UUID{Bytes: [16]uint8{0x82, 0xaa, 0xa2, 0xc7, 0x75, 0x10, 0x5c, 0x6d, 0xb8, 0xef, 0x18, 0x49, 0x33, 0x99, 0x4a, 0x9d}, Status: 0x2},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this PR is backwards compatible I would expect existing tests to pass. Is this related to the change in testdata/testdata.go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is backward compat the the binary can read older encoding but the encoding for the binary types moving forward will be base64. I don't think it should impact anything though.

DeterministicCQID: true,
},
}
Expand Down
8 changes: 8 additions & 0 deletions schema/timestamptz.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,9 @@ import (
// this is the default format used by time.Time.String()
const defaultStringFormat = "2006-01-02 15:04:05.999999999 -0700 MST"

// this is used by arrow string format (time is in UTC)
const arrowStringFormat = "2006-01-02 15:04:05.999999999"

// const microsecFromUnixEpochToY2K = 946684800 * 1000000

const (
Expand Down Expand Up @@ -170,6 +173,11 @@ func (dst *Timestamptz) DecodeText(src []byte) error {
*dst = Timestamptz{Time: tim.UTC(), Status: Present}
return nil
}
tim, err = time.Parse(arrowStringFormat, sbuf)
if err == nil {
*dst = Timestamptz{Time: tim.UTC(), Status: Present}
return nil
}
return &ValidationError{Type: TypeTimestamp, Msg: "cannot parse timestamp", Value: sbuf, Err: err}
}

Expand Down
2 changes: 1 addition & 1 deletion testdata/testdata.go
Original file line number Diff line number Diff line change
Expand Up @@ -183,7 +183,7 @@ func GenTestData(table *schema.Table) schema.CQTypes {
data[i] = uuidArrayColumn
case schema.TypeInet:
inetColumn := &schema.Inet{}
if err := inetColumn.Set("192.0.2.1/24"); err != nil {
if err := inetColumn.Set("192.0.2.0/24"); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's because the bitmask filters out the final .1, so it amounts to the same, but if Arrow rejects this type of IP we may need to handle it better

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 192.0.2.1./24 is basically the same as 192.0.2.0/24 after applying the mask.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yevgenypats Right, but was this change necessary for the tests to pass, or was it just some cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe Im confusing something about inet vs cidr . but I did need it for a test to pass because our own arrow inet converted it to 192.0.2.0/24 but I think this happened just from the original IPNet.String conversion so I suspect it should affect anything but maybe I missed something and didn't get to the root cause of this, not sure.

panic(err)
}
data[i] = inetColumn
Expand Down