Add missing data types to IngestDocument deep copy #14380

andrross · 2024-06-15T18:35:29Z

PR #11725 added a new deep copy in the ScriptProcessor flow. If a script uses a Short or Byte data type then this new deep copy introduced a regression. This commit fixes that regression.

However, there appears to be a pre-existing bug where using a Character type in the same way fails (this failed before PR #11725). The failure is different and appears to be related to something deep in the XContent serialization layer. For now, I have fixed the regression but have not yet dug into the failure with the Character data type. I have added a test that expects this failure.
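
To illustrate the kind of regression described above, here is a minimal, self-contained sketch of a deep copy that dispatches on value type and rejects anything it does not recognize. It is not the OpenSearch source: the class name `DeepCopySketch`, the `allowByteAndShort` flag, and the exact list of accepted types are assumptions made for illustration. The flag only lets the same method show the behavior before and after Byte and Short are accepted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeepCopySketch {

    // Hypothetical stand-in for a type-dispatching deep copy; not the actual IngestDocument code.
    // allowByteAndShort = false models the behavior before the fix, true models it after.
    static Object deepCopy(Object value, boolean allowByteAndShort) {
        if (value instanceof Map) {
            Map<Object, Object> copy = new HashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                copy.put(entry.getKey(), deepCopy(entry.getValue(), allowByteAndShort));
            }
            return copy;
        }
        if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) value) {
                copy.add(deepCopy(item, allowByteAndShort));
            }
            return copy;
        }
        // Immutable values can be shared between the original and the copy.
        if (value == null
            || value instanceof String
            || value instanceof Integer
            || value instanceof Long
            || value instanceof Float
            || value instanceof Double
            || value instanceof Boolean) {
            return value;
        }
        // The missing branch: Byte and Short are immutable too, but were not listed.
        if (allowByteAndShort && (value instanceof Byte || value instanceof Short)) {
            return value;
        }
        throw new IllegalArgumentException("unexpected value type [" + value.getClass() + "]");
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("short_field", (short) 1); // e.g. a value a script wrote as a Short
        source.put("byte_field", (byte) 2);

        try {
            deepCopy(source, false);
        } catch (IllegalArgumentException e) {
            System.out.println("before fix: " + e.getMessage());
        }

        Object copy = deepCopy(source, true);
        System.out.println("after fix, copy equals source: " + copy.equals(source));
    }
}
```

Because the copy rejects unknown types instead of passing them through, any new call site for it (such as the one added in the ScriptProcessor flow) exposes every type missing from the list.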

Resolves #14379

Check List

Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2024-06-15T18:43:34Z

❌ Gradle check result for dc0ccdc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2024-06-15T19:36:09Z

❕ Gradle check result for 40b9e07: UNSTABLE

TEST FAILURES:

      1 org.opensearch.gateway.RecoveryFromGatewayIT.testMultipleReplicaShardAssignmentWithDelayedAllocationAndDifferentNodeStartTimeInBatchMode

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

codecov · 2024-06-15T19:38:14Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.68%. Comparing base (b15cb0c) to head (40b9e07).
Report is 433 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #14380      +/-   ##
============================================
+ Coverage     71.42%   71.68%   +0.26%     
- Complexity    59978    62066    +2088     
============================================
  Files          4985     5118     +133     
  Lines        282275   291829    +9554     
  Branches      40946    42188    +1242     
============================================
+ Hits         201603   209190    +7587     
- Misses        63999    65304    +1305     
- Partials      16673    17335     +662


vikasvb90 · 2024-06-16T03:45:19Z

> PR #11725 added a new deep copy in the ScriptProcessor flow.

No, it did not. If you look at the code again, you will see that the PR only reused an existing deepCopyMap. This means the issue has existed since the Byte data type was introduced in the processor without being reflected in the deep copy map and, surprisingly, there were no tests to validate an ingest document with a byte field in the script processor.

Without my PR, the code would still break in toXContent in WriteableIngestDocument in the simulate flow.

andrross · 2024-06-16T15:01:14Z

> > PR #11725 added a new deep copy in the ScriptProcessor flow.
>
> No, it did not. If you look at the code again

@vikasvb90 My initial read of the code, and all subsequent readings, suggest that a new invocation of deep copy was added in the ScriptProcessor code flow that was not there before, which is why this fix is needed now. I'm happy to update the description if that is incorrect. I would also appreciate a review of this code so I can address any comments. Thanks!

vikasvb90 · 2024-06-16T15:08:48Z

@andrross Yes, the invocation is new in the ingest flow. It wasn't there before. I thought you meant that I added the deep copy method itself. The code looks fine to me and I ran the REST tests myself. There are two things, though, which we can probably take up in a separate PR:

We may need tests for all data types on top of deepCopy method.
Char data type will still not be fixed with this I think right? I can try digging in for it.

andrross · 2024-06-16T16:15:40Z

> We may need tests for all data types on top of deepCopy method.

I intended to do this with the new IngestDocumentTests::testCopy method. Let me know if you think we need more coverage of that method.

> Char data type will still not be fixed with this I think right?

Correct. As far as I can tell this was broken in all previous versions as well. I opened #14382 to track it separately.

Thanks @vikasvb90!

andrross requested review from anasalkouz, Bukhtawar, CEHENKLE, dblock, dbwiddis, dreamer-89, gbbafna, kotwanikunal, mch2, msfroh, nknize, owaiskazi19, reta, Rishikesh1159, sachinpkale, saratvemulapalli, shwetathareja, sohami, tlfeng and VachaShah as code owners June 15, 2024 18:35

github-actions bot added the bug and Indexing labels Jun 15, 2024

andrross force-pushed the ingest-script-data-types branch from dc0ccdc to 40b9e07 Compare June 15, 2024 18:40

andrross added the backport 2.x label Jun 15, 2024

opensearch-ci-bot mentioned this pull request Jun 16, 2024

[AUTOCUT] Gradle Check Flaky Test Report for RecoveryFromGatewayIT #14304

Open

kotwanikunal approved these changes Jun 17, 2024

andrross merged commit 112704b into opensearch-project:main Jun 17, 2024
33 of 34 checks passed

andrross deleted the ingest-script-data-types branch June 17, 2024 20:33

opensearch-trigger-bot bot mentioned this pull request Jun 17, 2024

[Backport 2.x] Add missing data types to IngestDocument deep copy #14413

Merged
