Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

q param in quantiles #278

Closed
soxofaan opened this issue Sep 21, 2021 · 6 comments · Fixed by #294
Closed

q param in quantiles #278

soxofaan opened this issue Sep 21, 2021 · 6 comments · Fixed by #294
Assignees
Milestone

Comments

@soxofaan
Copy link
Member

"name": "q",
"description": "Intervals to calculate quantiles for. Calculates q-quantiles with (nearly) equal-sized intervals.",
"schema": {
"type": "integer",
"minimum": 2
},

I just noticed we have this q parameter in quantiles.
A couple of remarks:

  • this q is a pretty short short name and not very self-explanatory. We probably refer to the statistical concept of "q-quantiles", but numpy,quantile has a q param and that one is equivalent to the probabilities we already have, so that might be confusing
  • " (nearly) equal-sized intervals" : why "nearly"? It's confusing to specify it like that at the moment. If you talk about cutting a "probability distribution" in "intervals", they can be exactly the same size I think. If we would talk about splitting a set of observations in subsets then it makes more sense to state they are nearly same size.
  • "q: intervals to calculate quantiles for": I would add "number of intervals" for clarity
@soxofaan
Copy link
Member Author

On a more general note: I find the probabilistic definition of this process a bit strange (parameter probabilities, talking about "probability distribution"). We don't work here with random/unknown variables, it's all fully defined observation data.
For example: https://numpy.org/doc/stable/reference/generated/numpy.quantile.html does not mention something like "probability"

@m-mohr m-mohr self-assigned this Oct 7, 2021
@m-mohr m-mohr added this to the 1.2.0 milestone Oct 7, 2021
@m-mohr
Copy link
Member

m-mohr commented Oct 25, 2021

Hmm, this process was one of the first ones I defined and nowadays I'd actually split this process into two processes and then, I think, most issues could be solved a bit easier. To avoid breaking changes we can only improve the documentation, but can't really change the names. The naming for statistical processes is more aligned with R, that's why this time it's a bit more confusing for Python users it seems. Thanks for the suggestions, I'll try to come up with a reasonable PR.

@m-mohr
Copy link
Member

m-mohr commented Oct 26, 2021

  • " (nearly) equal-sized intervals" : why "nearly"? It's confusing to specify it like that at the moment. If you talk about cutting a "probability distribution" in "intervals", they can be exactly the same size I think.

I can't recall why this is mentioned here, I've tried to consult my sources when writing the process. I guess one of the implementations I've looked at mentioned it so I left it in brackets to get a bigger coverage of tools that can implement the process, but I can't find it anymore and it is indeed not helpful for implementations to leave this ambiguous. So I'll remove this.

@m-mohr
Copy link
Member

m-mohr commented Oct 26, 2021

  • this q is a pretty short short name and not very self-explanatory. We probably refer to the statistical concept of "q-quantiles", but numpy,quantile has a q param and that one is equivalent to the probabilities we already have, so that might be confusing

I've opened a separate issue as this is breaking by itself, but we could work around it for now. For details see #293 and let me know there whether the non-breaking behavior should be implemented already in 1.x

@m-mohr
Copy link
Member

m-mohr commented Oct 26, 2021

PR is available for review: #294

@m-mohr
Copy link
Member

m-mohr commented Oct 27, 2021

All issues have been solved or are now part of a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants