celestiaorg · rootulp · Dec 14, 2022 · Nov 18, 2022 · Nov 22, 2022 · Nov 24, 2022
diff --git a/...r-011-optimistic-blob-size-independent-inclusion-proofs-and-pfd-fraud-proofs.md b/...r-011-optimistic-blob-size-independent-inclusion-proofs-and-pfd-fraud-proofs.md
@@ -0,0 +1,173 @@
+# ADR 011: Optimistic Blob Size Independent Inclusion Proofs and PFB Fraud Proofs
+
+## Changelog
+
+- 18.11.2022: Initial Draft
+
+## Context
+
+The blob inclusion verification game is between the verifier and a prover. The verifier has the blob or its commitment and access to the block header of Celestia. The prover wants to prove the inclusion of the blob in the DA Layer.
+
+Validators check if the PFB commitment matches the commitment that is referenced in [ProcessProposal](https://github.com/celestiaorg/celestia-app/blob/3473000a9ff04fccfbba83929711fe11643b782c/app/process_proposal.go#L113). Given that 2/3 of validators can collude and break this consensus rule, we want a fraud-proof to inform light clients of violations of this rule. If we assume that we have this fraud-proof then we can use the check in `ProcessProposal` to our advantage for an optimistic blob size independent inclusion proof.
+
+## Blob size independent inclusion proof - PFB inclusion proof
+
+Instead of proving the Merkle proof over all subtree roots over the blob we create a Merkle proof over the PFB transaction as it includes the commitment already. Therefore this proof will include the shares where the PFB transaction lies in the Celestia square plus the inclusion proof of those shares to the `DataRoot`.
+
+The verifier could do the following steps:
+
+- Verify that the namespaceID is the transaction namespaceID (i.e. `namespace.ID{0, 0, 0, 0, 0, 0, 0, 1}`). We have to do this so we don't accidentally verify a blob share that looks like a PFB transaction
+- Unserialize the PFB transaction
+- Verify the signatures
- Verify the signatures
+- Verify that the PFB transaction signature is valid
- Verify the signatures
+- Verify that the PFB transaction signature is valid
+- Verify the given commitment matches the commitment in the header
+
+Optional if you have the whole blob:
+
+- Verify that the blob length in the PFB is as long as the size of the blob
+
+Optional if you have the rollup header:
+
+- Verify if the signatures match the sequencer
+
+## PFB Fraud Proof
+
+First, we need to ask the question of what the PFB Fraud Proof tries to proof. A PFB Fraud Proof tries to prove that a sound PFB in the square exists but the corresponding blob does not exist where it should be.
+
+A PFB Fraud Proof = PFB Inclusion Proof + Blob Inclusion Proof
+
+After you verify the PFB Inclusion Proof you will get the:
+
+- index where the blob is located
+- how long the bob is
+- the commitment of the blob
+
+With the information that you now confirmed you can confirm the blob inclusion proof of the shares that start at the index for the given length.
+
+This means that you will verify the Merkle proof over the subtree roots over the blob that was specified in the PFB transaction. The final step would be to calculate the commitment over the subtree roots for the final commitment.
+
+If the calculated commitment equals the commitment in the PFB transaction then reject the Fraud-Proof and block the full node, otherwise halt the chain as 2/3 of the validators signed an invalid block.
+
+(TODO: We might need to verify the share after the blob to see if the blob ended if we want to prevent nested blob attacks. Forum post coming soon)
+
+## Quantitative size comparison
+
+### PFB Inclusion Proof
+
+We are comparing the size of a PFB inclusion proof to a worst-case blob inclusion proof.
+
+The size of a normal PFB transaction is about 330 bytes. This PFB transaction can span over 2 shares because the start of a transaction can happen in the middle of a share. The worst case is if the transaction starts in the middle of the share of the last block in a row and continues to the first share of the row after. The picture below shows this scenario and what you need to prove the inclusion of those shares. It is a Merkle Proof of the shares to ROW1 and ROW2 and then a Merkle proof of ROW1 and ROW2 to the DataRoot. The blue nodes are additional nodes that are needed for the Merkle proof.
+
+![PFB Merkle Proof](./assets/pfd-merkle-proof.png)
+
+Let's assume a square size of k. The amount of blue nodes from the shares to ROW1 is O(log(k). The amount of blue nodes from ROW1 to the `DataRoot` is also O(log(k). You will have to include the shares themselves in the proof.
+Share size := 512 bytes
+NMT-Node size := 32 bytes + 2\*8 bytes = 48 bytes
+MT-Node size := 32 bytes
+
+Worst Case Normal PFB proof size in bytes  
+= 2 PFB Shares + blue nodes to row roots + blue nodes to (`DataRoot`)  
+= 2 \* Share size + 2 \* log(k) \* NMT-Node size + log(k) \* MT-Node size  
+= 2 \* 512 + 2 \* log(k) \* 48 + log(k) \* 32  
+= 1024 + 128 \* log(k)  
+
+As the size of a PFB transaction is unbounded you can encompass even more shares. To put a bound on this we assume that most PFB transactions will be able to be captured by 4 Shares.
+
+Huge PFB proof size in bytes  
+= 4 PFB Shares + blue nodes to row roots + blue nodes to (`DataRoot`)  
+= 4 \* Share size + 2 \* log(k) \* NMT-Node size + log(k) \* MT-Node size  
+= 2048 + 128 \* log(k)
+
+### Blob Inclusion Proof
+
+The worst-case blob inclusion proof size will result from the biggest possible blob with the most amount of subtree roots. This blob is constructed as filling up the whole block and having the last row missing one share.
+
+![Blob Inclusion Proof](./assets/blob-merkle-proof.png)
+
+With a blob of size n and a square size of k, this means that we have O(sqrt(n)) subtree row roots and O(log(k)) subtree row roots in the last row. As the whole block is filled up, sqrt(n) tends towards k. We will also require additional k blue parity nodes to prove the row roots.
+
+Worst case blob inclusion proof size  
+= subtree roots (all rows) + subtree roots (last row) + blue nodes (parity shares)  
+= (sqrt(n) + log(k) + k) \* NMT-Node size     | sqrt(n) => k  
+= ( 2 \* k + log(k) ) \* 48
+
+## Optimizations
+
+### Optimization 1
+
+If a PFB would be guaranteed to be in one share then we could decrease the PFB proof size significantly. You would not only get rid of one share resulting in a 512 bytes save but also log(k) fewer blue nodes in the worst case. The maximum size PFB that fits into one share is 501 bytes long:  
+= Share size - nid - special byte - reserved bytes - transaction length  
+= 512 - 8 - 1 - 1 - 1 = 501  
+Therefore a normal-sized PFB of 330 bytes fits easily into a share with additional spare bytes to be used for more complex PFBs.
+A requirement for this is an option for the transaction to be the only one in a share or at the start of a share if it fits in the share. This would extend to bigger PFB shares as well so you can potentially reduce the size of overlapping PFB proofs by 1 share (512 bytes).
+
+One share PFB proof size  
+= Share + blue nodes to row root + blue nodes (`DataRoot`)  
+= Share size + log(k) \* NMT-Node size + log(k) \* MT-Node size  
+= 512 + log(k) \* 48 + log(k) \* 32  
+= 512 + 80 \* log(k)
+
+### Optimization 2
+
+The second optimization that could be possible is to only prove the commitment over the PFB transaction and not the PFB transaction itself. This requires the next block header in the rollup chain to include the commitment of the PFB transaction shares as well.  
+This would require the PFB transaction to be deterministic and therefore predictable so you can calculate the commitment over PFB transaction beforehand. How the PFB transaction is created like gas used and signatures, is something the roll-up has to agree upon before. It only needs to be predictable, if we want to keep the option of asynchronous blocktimes and multiple Rollmint blocks per Celestia block.  
+Another requirement is that the content of the PFB shares is predictable. You could enforce this by only having the PFB transaction in the share.
+
+The commitment over the PFB transaction is created by using the same principle of creating the commitment over a blob. Therefore the size of the inclusion proof is the same as the size of a blob inclusion proof of the size of how many shares the PFB transaction spans.
+
+Commitment inclusion proof size over one share PFB
+= subtree root + blue nodes to share + blue nodes to (`DataRoot`)  
+= NMT-Node size + log(k) \* NMT-Node size + log(k) \* MT-Node size  
+= 48 + log(k) \* 48 + log(k) \* 32  
+= 48 + 80 \* log(k)
+
+For two shares the best case is the same as for one share but the worst case includes twice the blue nodes to the row root. For 3 and 4 shares for a PFB transaction, the worst case proof size does not change significantly.
+
+Worst case commitment inclusion proof size over 2-4 share PFB
+= subtree roots + blue nodes to share + blue nodes to (`DataRoot`)  
+= 2 \* NMT-Node size + 2 \* log(k) \* NMT-Node size + log(k) \* MT-Node size  
+= 96 + log(k) \* 96 + log(k) \* 32  
+= 96 + 128 \* log(k)
+
+![Optimized Pfb Proofs](./assets/optimized-pfb-proofs.png)
+
+<!--- This does not need a fraud proof as it could be a validation rule that even light clients can check. This would require the light clients to know the sequencer set and whose turn it was. (not sure about this)
+--->
+The fraud proof for this would be to prove that the commitment of the PFB transaction does not equal the predicted commitment in the header. Therefore this is equivalent to a PFB transaction inclusion proof. This fraud proof would be optimistic as we would assume that the PFB commitment is correct. But realistically if the commitment over the PFB transaction is wrong then the PFB commitment is most likely wrong as well. Therefore the fraud poof would be a PFB Fraud Proof as described at the top.  
+If we do not have a PFB transaction that can be predicted, we also need to slash double signing of 2 valid PFB transactions in Celestia. This is required so we don't create a valid fraud proof over a valid commitment over the PFB transaction.
+
+The third optimization could be to SNARK the PFB Inclusion Proof to reduce the size even more.?
+
+## Result
+
+![PFB Inclusion Proof](./assets/pfb-proof-size-result.png)
+
+For normal-sized PFBs the Proof size will be worth it from blob size x to y depending on how many shares the PFB occupies.
+If the PFB is guaranteed to be in one share then it is always worth from blob size x.
+For Huge PFBs that span over 4 shares the PFB Proof will be worth it from blob size z.
+Detailed analysis of x,y and z will follow if the general concept makes sense.
+
+## Alternative Approaches
+
+The other way to prove blob inclusion is dependent on the blob size. A blob inclusion proof includes all the subtree roots of the blob and the Merkle proof of those subtree roots to the `DataRoot` of Celestia.
+
+## Detailed Design
+
+TODO
+
+## Status
+
+Proposed
+
+## Consequences
+
+### Positive
+
+By creating PFB inclusion proofs you create PFB fraud proofs at the same time.
+
+### Negative
+
+Optimization 1 might introduce more intra-transaction padding. Optimization 2 will introduce intra-transaction padding so only the PFB transaction occupies the shares.
+
+### Neutral
+
+TODO