Skip to content

Commit

Permalink
chore(core): Update comment about how drop op behaves with rollups (#…
Browse files Browse the repository at this point in the history
…8808)

## Problem
Dgraph.drop.op doesn't conflict out other transactions, unlike what we
have written in code.

## Solution
We have to update the explanation about how dgraph.drop.op would behave
under rollups.
  • Loading branch information
harshil-goel committed May 12, 2023
1 parent 4c718c0 commit 86d97a4
Showing 1 changed file with 28 additions and 2 deletions.
30 changes: 28 additions & 2 deletions posting/list.go
Original file line number Diff line number Diff line change
Expand Up @@ -831,8 +831,34 @@ func (l *List) Length(readTs, afterUid uint64) int {
// if someone is reading data between two timestamps.
//
// - Drop operation can cause issues if they are rolled up. Since we are storing results at ts + 1,
// in dgraph.drop.op. If some operations were done then, they would be overwriten.
// This won't happen as the transactions should conflict out.
// in dgraph.drop.op. When we do drop op, we delete the relevant data first using a mutation.
// Then we write a record into the dgraph.drop.op. We use this record to figure out if a
// drop was performed. This helps us during backup, when we want to know if we need to
// read a given backup or not. A backup, which has a drop record, would render older backups
// unnecessary.
//
// If we rollup the dgraph.drop.op, and store result on ts + 1, it effectively copies the
// original record into a new location. We want to see if there can be any issues in
// backup/restore due to this. To ensure that there is no issue in writing on ts + 1,
// we do the following analysis.
//
// Analysis is done for drop op, but it would be the same for drop predicate and namespace.
// Assume that there were two backups, at b1 and b2. We move rollup ts around to see if it
// can cause any issues. There can be 3 cases:
//
// 1. b1 < ts < b2. In this case, we would have a drop record in b2. This is the same behaviour
// as we would have writen on ts.
//
// 2. b1 = ts < b2. In this case, we would have a drop record in b1, and in b2. Originally, only
// b1 would have a drop record. With this new approach, b2 would also have a drop record. This
// is okay because last entry in b1 is drop, so it wouldn't have any data to be applied.
//
// 3. b1 < ts < ts + 1 = b2. In this case, we would have both drop drop records in b2. No issues
// in this case.
//
// This proves that writing rollups at ts + 1 would not cause any issues with dgraph.drop.op.
// The only issue would come if a rollup happens at ts + k. If a backup happens in between
// ts and ts + k, it could lead to some data being dropped during restore.
func (l *List) Rollup(alloc *z.Allocator, readTs uint64) ([]*bpb.KV, error) {
l.RLock()
defer l.RUnlock()
Expand Down

0 comments on commit 86d97a4

Please sign in to comment.