Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for getproperty broadcasting #2655

Merged
merged 10 commits into from
Mar 25, 2021
Merged

add support for getproperty broadcasting #2655

merged 10 commits into from
Mar 25, 2021

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Mar 12, 2021

Now we will be able to do:

df.a .= 1

even if :a is not present in df.

The only decision to be made is if we want to allow:

dfv.a .= 1 if dfv is a SubDataFrame. Currently this is allowed, but it is inconsistent with dfv[!, :a] .= 1 so I would disallow it. The problem is that then DataFrames.jl would work differently under Julia 1.7 and prior to 1.7 in this area - thus it is a hard decision to make.

@bkamins bkamins added breaking The proposed change is breaking. decision labels Mar 12, 2021
@bkamins bkamins added this to the 1.0 milestone Mar 12, 2021
Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; I agree with disallowing for now; better to error for now and we can allow things later.

@bkamins
Copy link
Member Author

bkamins commented Mar 12, 2021

The problem is that it works under Julia 1.6 although we wanted to disallow it, but we could not because of the limitations of Julia Base.

@bkamins
Copy link
Member Author

bkamins commented Mar 12, 2021

todo: tests and docs need an update after @nalimilan comments

@bkamins
Copy link
Member Author

bkamins commented Mar 12, 2021

@nalimilan - I have summarized the consequences of this PR in the docstrings. It is super unfortunate we cannot make it consistent. What do you think we should do?

@bkamins
Copy link
Member Author

bkamins commented Mar 12, 2021

Unfortunately following the comment by @KristofferC in JuliaLang/julia#39473 (comment) the change will not go into 1.6 LTS.

@bkamins bkamins linked an issue Mar 13, 2021 that may be closed by this pull request
@bkamins
Copy link
Member Author

bkamins commented Mar 13, 2021

I really have a mental problem here. If we introduce this PR now then DataFrames.jl will have a different behavior under different versions of Julia and we will not be able to even give a warning.

Maybe a lesser evil is only to add support for creating columns via df.a .= 1 and leave the rest of the functionality "as is" and only print deprecation warning in Julia 1.7 that things will change in the future (before Julia 1.7 it is even impossible to print a deprecation warning in this case).

@pdeffebach
Copy link
Contributor

This is exciting! I didn't realize this would get added so soon.

I am fine with disallowing for a while. It's a "nice to have" and I'm not sure how many users will be burned by this, at least until we get a new LTS and can be more aggressive with version requirements.

@bkamins
Copy link
Member Author

bkamins commented Mar 14, 2021

The problem is that new LTS will be 1.6 most likely. And in 1.6 this feature will not be available, but only in 1.7.

Just to clarify the issue: In Julia 1.6 and earlier we have no control over df.a .= 1 as it gets transformed before it gets passed to DataFrames.jl. Therefore we have two roads:

  1. have Julia 1.7 and Julia 1.6 have different behaviors, but the behavior in Julia 1.7 would be a correct one (this might be acceptable as as @pdeffebach noted - the cases when behaviors differ are super rare - essentially broadcasting to SubDataFrame and when you would want to change eltype of a column in DataFrame); please vote 🚀 for this option under this post
  2. have Julia 1.7 copy the incorrect behavior of Julia 1.6 with a deprecation warning + only allow df.a .= 1 for new columns in Julia 1.7 (which is an error in Julia 1.6 so it is not a huge problem) and make the change in some future release of Julia that becomes LTS which would support getproperty broadcasting (probably 2.0); please vote 👀 for this option under this post

@nalimilan
Copy link
Member

I'd say option 2 is safer, especially since we're likely to release DataFrames 1.0 before Julia 1.7: people won't have had the occasion to test their code against Julia 1.7. But I think we should feel free to remove the deprecation warning and switch to the new, correct behavior relatively quickly, i.e. even before we drop support for Julia 1.6 (anyway the behavior won't change there). Otherwise I'd rather take option 1.

@bkamins
Copy link
Member Author

bkamins commented Mar 14, 2021

OK. I will go for option 2 then and add comments in the code what has to be changed.

To be clear, what is incorrect in Julia 1.6 are the following things (in general the incorrect things are where getproperty now works differently than ! row selector):

  • sub_data_frame.col .= 1 is allowed and is in-place (technically this should not be allowed and throw an error and we should require sub_data_frame[:, :col] .= 1 for this case); the reason is that getproperty should match ! column selector which replaces columns (and thus is disallowed for SubDataFrame)
  • if col is a vector of e.g. Int then data_frame.col .= 'a' works but converts a to integer and stores it in-place in col; what it should do is to replace the column and create column of Char filled with a.

As noted above it is impossible to fix this behavior in Julia 1.6 because the resolution of the operation happens at parse time before DataFrames.jl mechanics is able to intervene and change this behavior.

src/other/broadcasting.jl Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
docs/src/lib/indexing.md Outdated Show resolved Hide resolved
test/broadcasting.jl Outdated Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
@bkamins
Copy link
Member Author

bkamins commented Mar 25, 2021

@nalimilan - OK to merge?

@bkamins bkamins merged commit 34307bc into main Mar 25, 2021
@bkamins bkamins deleted the bk/dotgetproperty branch March 25, 2021 19:37
@bkamins
Copy link
Member Author

bkamins commented Mar 25, 2021

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking The proposed change is breaking. decision
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use dotgetproperty
4 participants