-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(tod.concatenate): Convert index map to unicode. #201
Conversation
@tristpinsm do we really need to deal with the case where the concatenation axis has a string type? I struggle to see when a time like axis that is fine for concatenation would have a string type. |
That would obviously simplify this PR as you could remove two of the chunks. |
Yeah, I dunno. If you did pass it a dataset that fits that description with In any case I'm fine with removing that part, just let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to go for me. I think in the end it makes sense to keep the general behaviour for concatenating along string-types axes, there isn't anything which restricts it in the current implementation and it could be nice in the future.
caput/tod.py
Outdated
@@ -350,13 +350,21 @@ def dataset_filter(d): | |||
for axis, index_map in first_data.index_map.items(): | |||
if axis in concatenation_axes: | |||
# Initialize the dataset. | |||
dtype = index_map.dtype | |||
if convert_dataset_strings and memh5.has_bytestring(index_map.dtype): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory you could remove the memh5.has_bytestring
clause as the conversion should be a no-op on non-bytestring types. Not sure if it's really worth it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to say that at least leaving the clause in would save it some work on non- string types, but given that's not true as .has_bytestring
does a very similar recursive crawl through the dtype structure, so you don't actually save anything. So probably remove it to simplify the implementation.
d825b5b
to
06596db
Compare
06596db
to
73d8188
Compare
I think it might rely on the axis being sanely sortable.
…On Wed, Jun 29, 2022 at 7:38 PM tristpinsm ***@***.***> wrote:
@tristpinsm <https://github.com/tristpinsm> do we really need to deal
with the case where the concatenation axis has a string type? I struggle to
see when a time like axis that is fine for concatenation would have a
string type.
Yeah, I dunno. If you did pass it a dataset that fits that description
with convert_dataset_strings=True wouldn't you expect it to do the
conversion? I also can't think of an example where that happens but I don't
see any reason why it's impossible.
In any case I'm fine with removing that part, just let me know.
—
Reply to this email directly, view it on GitHub
<#201 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAFJXJQTUAG76DFYXZSNSHDVRSCYDANCNFSM5ZKN77AQ>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
No description provided.