Combine selection subtraction #3504

Open
mreppen opened this issue May 25, 2020 · 16 comments

Comments

@mreppen
Contributor

mreppen commented May 25, 2020

Feature

The possibility to subtract one selection from another in <a-z> aka combine selections from register.

Use case

Creating more complex selections. Subtraction is a primitive and becomes very powerful together with the other selection tools.

PR?

I have a proof of concept in https://github.com/mreppen/kakoune/tree/subtract . Please excuse the coding/style, as I am not a C.*-programmer. I would gladly work it into a PR though. The main thing missing for a correctly working version is moving some BufferCoords one step backward/forward.

@Delapouite
Contributor

Hi @mreppen , thanks for bringing that up.

In this topic, @alexherbo2 was talking about a "difference" operator: #2850 . Is the "subtraction" you propose related or another concept?

Would you mind sharing a short example of manipulation starting from a buffer with lorem ipsum content?

@mreppen
Contributor Author

mreppen commented May 25, 2020

Hi @Delapouite, for some reason I had missed that issue despite looking. I had searched for "difference" in the issues, but named it subtract to not confuse with symmetric difference. I would like to point out that along with <a-z>a (which is what I would call a union) both (set) intersection and symmetric difference can be constructed.

Here is an example with lorem ipsum, one line per sentence (start with a paragraph, then %s(?<=\.) c<ret>). I copy only the first line here for brevity. The subtraction/difference is <a-z>s

# Wrap each key sequence below in an `exec` command and send it to the running session.
sed -e "s/^.*/exec -client client0 '&'/" << EOF |
gk100C
10L"aZ
4h10L"bZ
"a<a-z>a"uZ
"a<a-z>s"lZ
"uz"b<a-z>s"rZ
"uz"l<a-z>s"r<a-z>s"iZ
"lz"r<a-z>a"dZ
EOF
kak -p "$kak_session"

a holds [Lorem ipsum] dolor sit amet, ... (A)
b holds Lorem [ipsum dolor] sit amet, ... (B)
u holds [Lorem ipsum dolor] sit amet, ... (A∪B)
l holds [Lorem ]ipsum dolor sit amet, ... (A\B)
r holds Lorem ipsum[ dolor] sit amet, ... (B\A)
i holds Lorem [ipsum] dolor sit amet, ... (A∩B)
d holds [Lorem ]ipsum[ dolor] sit amet, ... ((A\B)∪(B\A))

In terms of sets, <a-z>a=∪ (union) and <a-z>s=\ (set difference), and these two are sufficient to construct the others above (intersection and symmetric difference).
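The constructions in the key script above can be checked with a small Python sketch. It assumes the simplified model in which a selection is just the set of selected character offsets; the helper names `span` and `show` are made up for illustration and have nothing to do with Kakoune's implementation.

```python
# Hypothetical model: a selection is the set of selected character offsets.
TEXT = "Lorem ipsum dolor sit amet"

def span(first, last):
    """Offsets first..last, both inclusive."""
    return set(range(first, last + 1))

def show(chars, text=TEXT):
    """Render a set of offsets with [ ] around each contiguous run."""
    out = []
    for pos, ch in enumerate(text):
        if pos in chars and pos - 1 not in chars:
            out.append("[")
        out.append(ch)
        if pos in chars and pos + 1 not in chars:
            out.append("]")
    return "".join(out)

a = span(0, 10)    # [Lorem ipsum]
b = span(6, 16)    # [ipsum dolor]

union = a | b                    # register u: A∪B
left = a - b                     # register l: A\B
right = b - a                    # register r: B\A
inter = (union - left) - right   # register i: built from union and difference only
symdiff = left | right           # register d: (A\B)∪(B\A)

for name, value in [("u", union), ("l", left), ("r", right),
                    ("i", inter), ("d", symdiff)]:
    print(name, "holds", show(value))
```

In this model `inter` equals `a & b` and `symdiff` equals `a ^ b`, which is the sense in which union and difference are enough for these two operators.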

On my implementation:

  • If you run my code, l, r and d will be off by 1. That is easily fixed by moving some BufferCoords by 1, but I did not want to get into that before feedback.
  • It is only pairwise so far (pairs of selection and register content). I think it is easily extended to the "full" version if that is more desirable.
  • As it is pairwise, any subtraction that results in the empty set is removed.

@mreppen
Contributor Author

mreppen commented May 29, 2020

I have given this some thought, and for the use cases I have in mind, I think not doing pairwise subtraction is better. It also fits the interpretation of kakoune selections as sets.

@Delapouite
Contributor

Did you have a look at how vis handles these distinctions?

From what I remember this editor offers pairwise (z|, z&, z<, z>, z+, z-) and non-pairwise (|, &, \, !) operations : https://github.com/martanne/vis/wiki/Differences-from-Kakoune

@mreppen
Contributor Author

mreppen commented May 29, 2020

I had a look at your link and at vis's list of available commands, and my understanding is that it only offers pairwise operators.

Without pairwise I could save one or more selections of one part of my buffer like a mask and remove that from any other selection. Such things would not work with pairwise.

Most pairwise situations would work also without pairwise, I imagine. The cases where it would not are those where one selection overlaps multiple register selections:

[Lorem ipsum] dolor [sit] amet, ...
[Lorem] [ipsum] dolor sit amet, ...

The first minus the second becomes either of

Lorem[ ipsum] dolor [sit] amet, ...
Lorem[ ]ipsum dolor [sit] amet, ...

depending on whether it is pairwise (first result) or not pairwise (second result).
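A minimal sketch of the two behaviours, assuming selections are inclusive `(start, end)` offset ranges and that "pairwise" means pairing the n-th selection with the n-th register entry. The function names are hypothetical and this is not the code from the linked branches.

```python
TEXT = "Lorem ipsum dolor sit amet"

def subtract_one(sel, cut):
    """Pieces of the inclusive range `sel` that remain after removing `cut`."""
    (s, e), (cs, ce) = sel, cut
    if ce < s or cs > e:          # no overlap: selection is untouched
        return [sel]
    pieces = []
    if cs > s:                    # part left of the cut; note the one-step-back end
        pieces.append((s, cs - 1))
    if ce < e:                    # part right of the cut; note the one-step-forward start
        pieces.append((ce + 1, e))
    return pieces                 # empty list when `cut` covers `sel` entirely

def subtract_pairwise(current, register):
    """n-th selection minus n-th register entry (assumed pairing)."""
    result = []
    for sel, cut in zip(current, register):
        result.extend(subtract_one(sel, cut))
    return result

def subtract_all(current, register):
    """Every selection minus every register entry (non-pairwise)."""
    result = []
    for sel in current:
        pieces = [sel]
        for cut in register:
            pieces = [p for piece in pieces for p in subtract_one(piece, cut)]
        result.extend(pieces)
    return result

def show(ranges, text=TEXT):
    """Render ranges with [ ] markers, one pair per selection."""
    opens = {s for s, _ in ranges}
    closes = {e for _, e in ranges}
    return "".join(("[" if i in opens else "") + ch + ("]" if i in closes else "")
                   for i, ch in enumerate(text))

current = [(0, 10), (18, 20)]    # [Lorem ipsum] dolor [sit] amet
register = [(0, 4), (6, 10)]     # [Lorem] [ipsum] dolor sit amet

print(show(subtract_pairwise(current, register)))
print(show(subtract_all(current, register)))
```

The `cs - 1` and `ce + 1` adjustments are where an off-by-one slip would show up when the ends are inclusive.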

Of course, both pairwise and not could be implemented, but I personally would likely make more use of the latter.

@mreppen
Contributor Author

mreppen commented May 29, 2020

Another remark on pairwise vs not: As a subtraction can both increase and decrease the number of selections, the behavior would be a bit unintuitive when chaining pairwise combine commands. Non-pairwise combine commands can be chained and I think everyone agrees on what is expected to happen.

@mreppen
Contributor Author

mreppen commented May 30, 2020

@Delapouite if you would like to try it out, I did a quick implementation of both pairwise and not.
Pairwise: https://github.com/mreppen/kakoune/tree/subtract_pairwise Not pairwise: https://github.com/mreppen/kakoune/tree/subtract

The ±1 bug is also fixed.

@vbauerster
Contributor

vbauerster commented Jun 25, 2020

@mreppen Why do you prefer subtract to symmetric difference?
I've tried your fork (subtract branch) with the following line:

a holds [Lorem ipsum] dolor sit amet, ... (A)
b holds Lorem [ipsum dolor] sit amet, ... (B)

I got Lorem ipsum[ dolor] sit amet, ... or [Lorem ]ipsum dolor sit amet, ... depending on order of selection addition, after <a-z>s.

With a symmetric difference operation I would get: [Lorem ]ipsum[ dolor] sit amet. But I could still get the result of the subtract operation by applying either <space> or <a-space>. So from that point of view I think plain symmetric difference is more versatile than just a subtract operation.

@mreppen
Contributor Author

mreppen commented Jun 25, 2020

@vbauerster The neat thing about subtraction is that you can construct the symmetric difference from it ((A\B)∪(B\A), which could be automated with a mapping).
However, if the decision were mine, several common set operators like mathematical intersection and possibly symm. diff. would be built-in. The potentially frustrating thing about subtraction is that its non-commutativity means an extra register (and steps) is needed for (A\B) if B is the current selection and A is in a register. I would probably solve this by having <a-s> for subtraction with reversed order.

Note that symmetric difference + <a-space> is not always the difference. Consider (with Δ = symmetric difference)

  1. L[orem ipsum dolo]r sit amet, ... Δ [Lorem ]ipsum[ dolor] sit amet, ... = [L]orem [ipsum] dolo[r] sit amet, ..., followed by <a-space> is not the subtraction.
  2. The case is similar when you start out with several selections in different regions. Then, even where <a-space> does give subtraction for each, it only fixes one region.
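The first case can be reproduced with a short sketch, again under the assumption that a selection set is modelled as a set of character offsets. It does not model Kakoune's main selection, so it says nothing about which selection <space> or <a-space> would act on; it only shows which pieces belong to which difference.

```python
TEXT = "Lorem ipsum dolor sit amet"

def span(first, last):
    """Offsets first..last, both inclusive."""
    return set(range(first, last + 1))

def show(chars, text=TEXT):
    """Render a set of offsets with [ ] around each contiguous run."""
    out = []
    for pos, ch in enumerate(text):
        if pos in chars and pos - 1 not in chars:
            out.append("[")
        out.append(ch)
        if pos in chars and pos + 1 not in chars:
            out.append("]")
    return "".join(out)

a = span(1, 15)                  # L[orem ipsum dolo]r
b = span(0, 5) | span(11, 16)    # [Lorem ]ipsum[ dolor]

print(show(a ^ b))   # symmetric difference: three separate runs
print(show(a - b))   # A\B: one run
print(show(b - a))   # B\A: two runs
```

Here the symmetric difference has three runs, of which one belongs to A\B and two to B\A, so which selections are kept afterwards decides whether the result is A\B, B\A, or neither.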

If you want to try out other set operators, you could construct them from mappings of subtraction and "add" (set union) for now.

@Delapouite
Contributor

Delapouite commented Jun 26, 2020

I don't know if this link is really on topic and can be useful to this discussion, but I remembered it recently: https://blog.jooq.org/2016/07/05/say-no-to-venn-diagrams-when-explaining-joins/ Especially the graphics in the second part of the article that could be more inline with the mental model of selections lists

@mreppen
Contributor Author

mreppen commented Jun 26, 2020

Thanks @Delapouite. I agree with that author: joins are indeed subsets of the Cartesian product. Maybe there is some insight I am missing, but I do not see Cartesian products in buffer selections (or at least none that are not very forced), as implemented in kakoune.

The operations I am interested in, and I believe also @vbauerster, are precisely the set operations union (kak append), intersection (not in kak, SQL intersect), and set difference (not in kak, SQL except, subtraction in my wording to avoid confusion with symmetric difference).

I see the buffer as a set of characters. Each selection is a ("connected") set. Multiple selections are also a set (but not "connected"). This is the context for interpretation as set operators.

With the risk of getting a bit technical, a pairwise operation has the same interpretation. The difference is just that it is more like a batch/map operation on pairs of "connected" sets/selections. I'll mention again that I really prefer the non-pairwise version. With the exception of some special cases, I believe it is more general.

@vbauerster
Contributor

I would be satisfied with either subtract (coined by @mreppen) or symmetric difference, as long as I can accomplish the following illustration with the key sequence Z<a-S><a-z>? where ? is the appropriate operation.
Screenshot-2020-06-27-at-10-43-29.png

@Screwtapello
Contributor

> I see the buffer as a set of characters. Each selection is a ("connected") set. Multiple selections are also a set (but not "connected").

If you think of the buffer as a set of characters, and selections as an arbitrary subset, then these sorts of set operations are very natural and make perfect sense.

Kakoune doesn't quite work like that, though: there's a very distinct difference between a selection covering the word "hello" and five selections covering the letters "h", "e", "l", "l" and "o". It's certainly possible to "flatten" Kakoune's selections into "a set of selected characters", then perform a set intersection and then reconstruct the original selections by gluing together adjacent selections, and that will work most of the time, but not always.
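A small sketch of the flatten-and-glue approach described here, assuming selections are inclusive `(start, end)` offset ranges; `flatten` and `reglue` are hypothetical helper names. It shows one way the round trip loses information: one selection over a word and five single-character selections become indistinguishable.

```python
def flatten(ranges):
    """Collapse a list of inclusive ranges into a set of selected offsets."""
    return {i for start, end in ranges for i in range(start, end + 1)}

def reglue(offsets):
    """Rebuild ranges by gluing adjacent offsets back together."""
    ranges = []
    for i in sorted(offsets):
        if ranges and ranges[-1][1] == i - 1:
            ranges[-1] = (ranges[-1][0], i)   # extend the previous range
        else:
            ranges.append((i, i))             # start a new range
    return ranges

word = [(0, 4)]                          # one selection covering "hello"
letters = [(i, i) for i in range(5)]     # five selections: h, e, l, l, o

print(reglue(flatten(word)))
print(reglue(flatten(letters)))
```

Both calls yield a single range, so any operation routed through the flattened form cannot preserve the five-selection case.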

The current "pairwise" selection operations are, I think, about as flexible as you're going to get while respecting Kakoune's selection model. I'd definitely love to see more "mathematical" selection operations (I've missed them myself) but I don't think it would be appropriate for them to replace the current operations. Perhaps it would be a good experiment to implement these operations with a plugin?

@mreppen
Contributor Author

mreppen commented Jun 28, 2020

@Screwtapello I should have been clearer in my interpretation. My idea of what these "set operators" should do, and also my implementation, do indeed distinguish between "hello" and the five separate character selections. I can make more precise descriptions if you want. For now though I will make it less technical:

  • subtraction: "remove any overlap with register ^ from the current selection" (or the reverse, or other register)
  • intersection: "overlap between register ^ and the current selection"

> The current "pairwise" selection operations are, I think, about as flexible as you're going to get while respecting Kakoune's selection model. I'd definitely love to see more "mathematical" selection operations (I've missed them myself) but I don't think it would be appropriate for them to replace the current operations.

What I suggest as non-pairwise fits kak's model after making my description more precise (pretty much iteration over all pairs of selections). I think nobody suggests replacing current operators; it is just unfortunate that there is a name overlap for intersection, which is part of why I only did subtraction.

> Perhaps it would be a good experiment to implement these operations with a plugin?

Part of the problem is that kak does not allow extending the selection mechanism for register combination (the key bindings), so it can't be done. (A trick I used in https://github.com/mreppen/custom-objects.kak works in this context as well, but it's more of a hack because it fails if more than one plugin wants to extend functionality.)

@mreppen
Contributor Author

mreppen commented Jun 28, 2020

Take the following as a sidenote. There are other more direct ways to describe the same thing.

Maybe @Delapouite was ahead of me, but it appears that the following gets it right. I'll describe intersection, as it is simpler.

  • Create Cartesian product of two selections -> for each pair (a, b), perform intersection -> filter out all empty -> append.

In this sense, it is reminiscent of join with the extra map before the filter (maybe SQL joins can do that?).
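The product, map, filter, append pipeline above could look roughly like this in Python, assuming selections are inclusive `(start, end)` offset ranges. The names `intersect_one` and `intersect_all` are illustrative, not part of Kakoune.

```python
from itertools import product

def intersect_one(a, b):
    """Overlap of two inclusive ranges, or None when they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

def intersect_all(current, register):
    """Cartesian product -> intersect each pair -> drop empties -> collect."""
    overlaps = (intersect_one(a, b) for a, b in product(current, register))
    return [o for o in overlaps if o is not None]

# [Lorem ipsum] dolor [sit] amet  with register  [Lorem] [ipsum] dolor sit amet
print(intersect_all([(0, 10), (18, 20)], [(0, 4), (6, 10)]))

# One selection over "hello" against five single-character selections:
# the five results stay separate instead of being glued back together.
print(intersect_all([(0, 4)], [(i, i) for i in range(5)]))
```

Because each pair is handled on its own, adjacent results are never merged, which keeps the distinction between one selection over a word and several selections over its letters. Results come out in product order, so they are sorted only if both inputs are.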

This is not a result of extensive deliberation, and I don't want to bring this technicality into the discussion unless somebody finds it useful; I just wanted to connect to @Delapouite's comment.

@Delapouite
Contributor

A new vis version was released today with some notable changes regarding the current topic: https://github.com/martanne/vis/releases/tag/v0.7

…
removed pairwise selection combinators z>, z<, z-, z+, z&, z|
use ~ instead of ! for selection complement
