Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AvgTests error on -0.0 avg #113272

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ivancea
Copy link
Contributor

@ivancea ivancea commented Sep 20, 2024

Fixes #113225

@ivancea ivancea added >test Issues or PRs that are addressing/adding tests Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.16.0 v9.0.0 labels Sep 20, 2024
@ivancea ivancea requested a review from a team as a code owner September 20, 2024 14:31
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@not-napoleon
Copy link
Member

Maybe I'm reading this wrong, but it looks to me like this will accept -0.0 as the correct output. We really shouldn't be returning negative zeros from any operations. (In fact, probably AbstractAggregatorTestCase#extractResultsFromAggregator should assert that we never return a forbidden double value).

@ivancea
Copy link
Contributor Author

ivancea commented Sep 25, 2024

Maybe I'm reading this wrong, but it looks to me like this will accept -0.0 as the correct output. We really shouldn't be returning negative zeros from any operations.

@not-napoleon TestCaseSupplier provides -0.0 values, so if it can be an input, it should also be an output.

In ES, I see this: https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html#_which_type_should_i_use (Just above that anchor)

The double, float and half_float types consider that -0.0 and +0.0 are different values.

So if ES also allows it, I would allow it too. And for this case, the avg of -0.0 should be -0.0 too (IMO) to be as precise as possible.

I don't have an specific opinion on this topic, I'm just basing my thought process on what we currently have (And compatibility with ES)

Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be as precise as possible.

It's not more precise, it's just more confusing. The IEEE spec for floating points says that the two zeros should be considered equal when possible. Java honors this with it's binary comparison operators (== and friends), but not with Double.equals(). You can read the docs on the double class to get an understanding for why this is the case. Practically, this means that if we start allowing negative zeros in the language, we need to be VERY clear on where they compare via == and where they compare via Double.equals() or Number.equals(). It becomes extremely trappy for users if that behavior is not entirely consistent.

We made the decision early on to not allow -0.0, positive or negative infinity, or NaN as possible values in ESQL. We should be consistent with that here. /cc @costin

@not-napoleon
Copy link
Member

so if it can be an input, it should also be an output.

It's an input on the test precisely to test that it isn't an output.

@nik9000
Copy link
Member

nik9000 commented Sep 27, 2024

We made the decision early on to not allow -0.0, positive or negative infinity, or NaN as possible values in ESQL. We should be consistent with that here. /cc @costin

I think we should be consistent with this decision. Our functions should never receive any of those forbidden doubles. We could (should?) add an assertion to the generated code that reads doubles for EVAL and STATS that asserts that we don't receive those numbers. They should always come in as null. And they should always be emitted as null.

ES might support these but they are all null to ESQL. One day we may change this but that's a difficult thing to change at the moment.

@not-napoleon
Copy link
Member

ES might support these but they are all null to ESQL. One day we may change this but that's a difficult thing to change at the moment.

Well, the infinities and NaN are null; -0.0 is 0.

@ivancea
Copy link
Contributor Author

ivancea commented Sep 30, 2024

Ok, so:

  • To ensure it's not an output, I'll add a check to tests like we have to NaN and infinities
  • I'll revert the changes on AvgTests for now

Then, if we have to keep the -0.0 as a potential input in tests, but forbid it as an output, as many functions are currently returning it (Max, MvMax, Min, Values...), I'll fix them by doing a conversion (-0.0 -> 0.0).
I could add it in the agg/evaluator implementers, but it would affect even functions never emitting -0.0. So I'll see first how many functions are failing with this

@ivancea ivancea marked this pull request as draft October 2, 2024 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] AvgTests testFold {TestCase=<double> #2} failing
4 participants