fix(tod.concatenate): Convert index map to unicode. #201

tristpinsm · 2022-06-21T00:13:19Z

No description provided.

jrs65 · 2022-06-29T08:23:47Z

@tristpinsm do we really need to deal with the case where the concatenation axis has a string type? I struggle to see when a time-like axis that is fine for concatenation would have a string type.

jrs65 · 2022-06-29T08:24:10Z

That would obviously simplify this PR as you could remove two of the chunks.

tristpinsm · 2022-06-29T17:37:58Z

> @tristpinsm do we really need to deal with the case where the concatenation axis has a string type? I struggle to see when a time like axis that is fine for concatenation would have a string type.

Yeah, I dunno. If you did pass it a dataset that fits that description with convert_dataset_strings=True wouldn't you expect it to do the conversion? I also can't think of an example where that happens but I don't see any reason why it's impossible.

In any case I'm fine with removing that part, just let me know.

jrs65

This looks good to go for me. I think in the end it makes sense to keep the general behaviour for concatenating along string-typed axes: nothing in the current implementation restricts it, and it could be nice in the future.

jrs65 · 2022-06-30T09:18:52Z

caput/tod.py

@@ -350,13 +350,21 @@ def dataset_filter(d):
    for axis, index_map in first_data.index_map.items():
        if axis in concatenation_axes:
            # Initialize the dataset.
-            dtype = index_map.dtype
+            if convert_dataset_strings and memh5.has_bytestring(index_map.dtype):


In theory you could remove the memh5.has_bytestring clause as the conversion should be a no-op on non-bytestring types. Not sure if it's really worth it though.

I was going to say that leaving the clause in would at least save some work on non-string types, but that's not true: .has_bytestring does a very similar recursive crawl through the dtype structure, so you don't actually save anything. So probably remove it to simplify the implementation.
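To make the point above concrete, here is a minimal sketch of the two recursive crawls, written against plain numpy. The helper names `has_bytestring_sketch` and `to_unicode_dtype_sketch` are hypothetical and are not caput's implementation; they only illustrate why a guard check and a conversion walk the dtype tree in much the same way, and why the conversion alone leaves non-bytestring dtypes unchanged.

```python
import numpy as np


def has_bytestring_sketch(dt):
    """Return True if the dtype, or any nested field/subarray, is a bytestring."""
    dt = np.dtype(dt)
    if dt.kind == "S":
        return True
    if dt.fields is not None:
        # Each value is (dtype, offset) or (dtype, offset, title).
        return any(has_bytestring_sketch(sub) for sub, *_ in dt.fields.values())
    if dt.subdtype is not None:
        return has_bytestring_sketch(dt.subdtype[0])
    return False


def to_unicode_dtype_sketch(dt):
    """Return a copy of the dtype with every bytestring replaced by unicode.

    Non-bytestring leaves are returned as they are, so this is effectively
    a no-op on dtypes that contain no bytestrings.
    """
    dt = np.dtype(dt)
    if dt.kind == "S":
        return np.dtype(f"U{dt.itemsize}")
    if dt.fields is not None:
        return np.dtype(
            [(name, to_unicode_dtype_sketch(dt.fields[name][0])) for name in dt.names]
        )
    if dt.subdtype is not None:
        base, shape = dt.subdtype
        return np.dtype((to_unicode_dtype_sketch(base), shape))
    return dt
```

Both functions visit every leaf of the dtype, so calling the check before the conversion roughly doubles the traversal rather than saving it.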

jrs65 · 2022-10-11T08:39:32Z

I think it might rely on the axis being sanely sortable.
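As an illustration of what "sanely sortable" can mean for a string-typed axis, the sketch below shows that numpy orders strings lexicographically, which may differ from the order one would expect for numeric-looking labels. The labels are made up for the example, and whether `tod.concatenate` actually sorts the axis is not confirmed in this thread.

```python
import numpy as np

# Hypothetical string labels for an axis (not taken from any real dataset).
labels = np.array(["9", "10", "2"])

# Strings are compared character by character, so "10" sorts before "2".
order = np.argsort(labels)
sorted_labels = labels[order]

# The same labels interpreted as integers sort in numeric order instead.
numeric_sorted = np.sort(labels.astype(int))
```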


tristpinsm requested a review from jrs65 June 21, 2022 00:13

tristpinsm mentioned this pull request Jun 21, 2022

[WIP] Support distributed read for other acquisition files chime-experiment/ch_util#30

Open

jrs65 approved these changes Jun 30, 2022

tristpinsm force-pushed the tpm/unicode_im branch from d825b5b to 06596db Compare June 30, 2022 17:02

fix(tod.concatenate): Convert index map to unicode.

73d8188

tristpinsm force-pushed the tpm/unicode_im branch from 06596db to 73d8188 Compare June 30, 2022 17:59

tristpinsm merged commit 33316b2 into master Jun 30, 2022

tristpinsm deleted the tpm/unicode_im branch June 30, 2022 18:12
