Pass the message to `rabbit_backing_queue:discard` callback #9620

noxdafox · 2023-10-01T19:09:55Z

Proposed Changes

This is a re-attempt at PR #7802 which was not rolling-upgrade safe.
I ended up partly following @lhoguin suggestion: the gm receives the full message but forwards to the slaves only the MsgId. The slaves do not call the underlying BQ:discard function.

Rationale behind this logic is all BQ implementations resolve to rabbit_variable_queue:discard which does nothing. Hence, there is no difference from slaves perspective on whether to call or not the discard implementation.

Therefore:

If the leader is a new one, it will still forward MsgId to old nodes which will handle it as such
If the leader is an old one, new nodes will still receive MsgId and simply do not call BQ:discard as un-relevant

Drawback for this implementation is we loose uniformity of implementation. Benefit is a simpler code-base as we do not need to maintain 2 callbacks for discard.

Here I include the previous PR description:

The rabbit_backing_queue:discard callback is passing the message ID to the implementer. This is often not enough to carry on some necessary work as it's seen in the rabbit_priority_queue comment.

In my particular case, this makes it hard to fix the following issue:
noxdafox/rabbitmq-message-deduplication#96

In the above issue a consumer starts consuming with noAck over an empty queue. A publisher publishes a single message which gets forwarded directly to the consumer. In this case, the discard callback is invoked instead of publish_delivered and therefore it's header is not removed from the deduplication cache.

Problem is the discard callback only forwards the message ID and not the whole message not providing enough context for the implementer.

This patch adjust the rabbit_backing_queue behaviour passing the whole message to the discard callback instead of its sole ID.

This implementation has some drawback as, when using mirrored queues, more data will now be exchanged between the master node and the slaves. This will be detrimental to empty HA queue performance.

Yet the following observations should be considered:

This only applies when messages are consumed directly without acknowledgment from an empty queue
HA/mirrored queues are not the ideal choice when high performance are a need
The consumption without acknowledgment is a questionable pattern when combined with HA/mirrored queues
HA/mirrored queues do not provide much value compared to normal ones when empty

Types of Changes

Bug fix (non-breaking change which fixes issue #NNNN)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause an observable behavior change in existing systems)
Documentation improvements (corrections, new content, etc)
Cosmetic change (whitespace, formatting, etc)
Build system and/or CI

Checklist

I have read the CONTRIBUTING.md document
I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
I have added tests that prove my fix is effective or that my feature works
All tests pass locally with my changes
If relevant, I have added necessary documentation to https://github.com/rabbitmq/rabbitmq-website
If relevant, I have added this change to the first version(s) in release-notes that I expect to introduce it

Further Comments

Is there a way to run tests concurrently? It takes fairly long on my machine to run the entire rabbit test suite. OFC I can pin down to relevant tests but it would be nice to run the entire suite.

The previous behaviour was passing solely the message ID making queue implementations such as, for example, the priority one hard to fulfil. Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>

Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>

michaelklishin · 2023-10-01T23:30:38Z

@noxdafox thank you but now a few months later and with Khepri merged into main, we are ready to branch off v3.13.x and remove mirroring from classic queues altogether.

So this work may or may not be considered.

michaelklishin · 2023-10-02T10:59:47Z

The team is leaning towards not doing any more CMQ work that is perceived as risky, and we have prior evidence that this sort of change is risky for upgrades :(

Thank you for taking the time to contribute. By Oct 2024 you won't have to worry about the CQ mirroring code path at all, so at least there's that.

noxdafox · 2023-10-02T11:56:53Z

@michaelklishin issue is not with the mirroring queue at all but with the discard behaviour itself.

As it is right now, it prevents proper implementations both for the priority and the deduplication queue.

The meaningful changes within this PR are related to the rabbit_backing_queue behaviour, it's implementation both on the priority and variable queue side and the adoption of the new behaviour in the rabbit_amqqueue_process module.
The HA/mirroring change was a "necessary evil" but not the core of the contribution.

Hence, we can either push this change through and remove the HA side when deleting the related files or I can provide a different PR for fixing only the above without providing the HA changes.

lhoguin · 2023-10-02T13:38:11Z

I agree that we should make this change, but we should do the change after CMQs are removed (in the same release, 4.0). We will remove CMQs from the main branch some time after branching out 3.13, and the PR should be submitted again at that time IMO.

michaelklishin · 2023-10-02T14:04:15Z

@noxdafox all I am saying is that this PR hits an unfortunate timing where CMQs are soon to be removed, which greatly reduces our team's interest in considering any changes to classic queues. This should be over fairly soon, in maybe a month or so.

noxdafox · 2023-10-02T16:49:09Z

Allright, what is the target tag so I can keep an eye on it? 4.0?

lhoguin · 2023-10-03T07:07:31Z

Around 4.0-alpha-1 would be a good time.

michaelklishin · 2023-10-04T14:12:21Z

I'd say -beta.1, as the first alpha will be produced shortly after we branch off v3.13.x and bump the version number in main.

noxdafox added 2 commits October 1, 2023 21:52

rabbit_backing_queue: pass the whole message to discard callback

f14efc6

The previous behaviour was passing solely the message ID making queue implementations such as, for example, the priority one hard to fulfil. Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>

Adopt new rabbit_backing_queue:discard implementation

4945d39

Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>

michaelklishin closed this Oct 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pass the message to `rabbit_backing_queue:discard` callback #9620

Pass the message to `rabbit_backing_queue:discard` callback #9620

noxdafox commented Oct 1, 2023

michaelklishin commented Oct 1, 2023

michaelklishin commented Oct 2, 2023

noxdafox commented Oct 2, 2023 •

edited

Loading

lhoguin commented Oct 2, 2023

michaelklishin commented Oct 2, 2023

noxdafox commented Oct 2, 2023

lhoguin commented Oct 3, 2023

michaelklishin commented Oct 4, 2023

Pass the message to rabbit_backing_queue:discard callback #9620

Pass the message to rabbit_backing_queue:discard callback #9620

Conversation

noxdafox commented Oct 1, 2023

Proposed Changes

Types of Changes

Checklist

Further Comments

michaelklishin commented Oct 1, 2023

michaelklishin commented Oct 2, 2023

noxdafox commented Oct 2, 2023 • edited Loading

lhoguin commented Oct 2, 2023

michaelklishin commented Oct 2, 2023

noxdafox commented Oct 2, 2023

lhoguin commented Oct 3, 2023

michaelklishin commented Oct 4, 2023

Pass the message to `rabbit_backing_queue:discard` callback #9620

Pass the message to `rabbit_backing_queue:discard` callback #9620

noxdafox commented Oct 2, 2023 •

edited

Loading