Eltype Union{Missing, T} gets treated as categorical? Or something... #4

Closed
kescobo opened this issue Jun 7, 2022 · 2 comments

kescobo commented Jun 7, 2022

In the following example, x and xm are identical, except that x is Vector{Float64} and xm is Vector{Union{Missing, Float64}}. I remember Makie used to have a similar problem, where the union eltype wouldn't get plotted as continuous.

julia> x = rand(100);

julia> xm = Union{Missing, Float64}[x...];

julia> df = DataFrame(x = x, xm = xm);

julia> y = rand(100, 5);

julia> permanova(df, y, BrayCurtis, @formula(1~x))

         | Df | SumOfSqs |  R²   |   F   |   P
-------------------------------------------------
       x |  1 |    0.014 | 0.002 | 0.211 | 0.897
Residual | 98 |    6.664 | 0.998 |       |
   Total | 99 |    6.678 |     1 |       |


julia> permanova(df, y, BrayCurtis, @formula(1~xm))

         | Df |  SumOfSqs  |  R²   |   F    |   P
----------------------------------------------------
      xm | 99 |      6.678 | 1.000 | -0.000 | 0.995
Residual |  0 | -1.561e-17 | 0.000 |        |
   Total | 99 |      6.678 |     1 |        |
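
Until the eltype handling is sorted out, one possible workaround is to narrow the column to a concrete element type before building the formula, since the `Vector{Float64}` column is the case that behaves as expected above. This is only a sketch, assuming DataFrames.jl is loaded; the name `clean` is purely illustrative and not part of the package.

```julia
using DataFrames

x = rand(100)
xm = Union{Missing, Float64}[x...]
df = DataFrame(x = x, xm = xm)

# The column keeps the Union eltype even though it holds no missing values.
eltype(df.xm)

# dropmissing removes any rows with missing in :xm and, by default
# (disallowmissing = true), narrows the column eltype to Float64.
# `clean` is a hypothetical name used only for this example.
clean = dropmissing(df, :xm)
eltype(clean.xm)
```

Note that if rows are actually dropped, the response matrix `y` would need the same rows removed so that the two stay aligned.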

EvoArt mentioned this issue Jun 7, 2022

EvoArt commented Jun 7, 2022

Thanks for that. I've patched it in #5, but to be honest I need to sit down and think properly about the different data types people will be using.

julia> x = rand(100);

julia> xm = Union{Missing, Float64}[x...];

julia> df = DataFrame(x = x, xm = xm);

julia> y = rand(100, 5);

julia> permanova(df, y, BrayCurtis, @formula(1~x))

         | Df | SumOfSqs |  R²   |   F   |   P
-------------------------------------------------
       x |  1 |    0.091 | 0.013 | 1.273 | 0.313
Residual | 98 |    7.028 | 0.987 |       |
   Total | 99 |    7.119 |     1 |       |


julia> permanova(df, y, BrayCurtis, @formula(1~xm))

         | Df | SumOfSqs |  R²   |   F   |   P
-------------------------------------------------
      xm |  1 |    0.091 | 0.013 | 1.273 | 0.282
Residual | 98 |    7.028 | 0.987 |       |
   Total | 99 |    7.119 |     1 |       |


julia> df.xm[2] = missing
missing

julia> permanova(df, y, BrayCurtis, @formula(1~xm))
┌ Warning: 1 data row(s) dropped due to missing values.
└ @ PERMANOVA C:\Users\arn203\.julia\dev\PerMANOVA\src\perm2.jl:90

         | Df | SumOfSqs |  R²   |   F   |   P
-------------------------------------------------
      xm |  1 |    0.092 | 0.013 | 1.275 | 0.300
Residual | 97 |    7.000 | 0.987 |       |
   Total | 98 |    7.092 |     1 |       |


kescobo commented Jun 8, 2022

Nice! It's also good that it handles the case where there are actual missings; I've been filtering those out manually.

kescobo closed this as completed Jun 8, 2022