Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a test to show a problem with leaf overriding #469

Merged
merged 3 commits into from
Jun 9, 2023

Conversation

xgreenx
Copy link
Collaborator

@xgreenx xgreenx commented Jun 3, 2023

@bvrooman It seems I found a bug=( We use the same storage for leafs and nodes, it means the node can override the leaf that hash leaf_key == node_hash

@bvrooman
Copy link
Contributor

bvrooman commented Jun 3, 2023

You're probably noticing this because of these lines in the SMT update:

        ...
        self.storage.insert(&leaf_node.hash(), &leaf_node.as_ref().into())?;
        self.storage.insert(leaf_node.leaf_key(), &leaf_node.as_ref().into())?;
        ...

In theory, you could override an existing value in storage when leaf_node.hash() equals another leaf_node.leaf_key(). In your test, you created that scenario. I can see this would be a concern if there is a scenario where leaf keys are manually set to equal other leaf hashes.

However, we must consider that we control how we use SMTs and we don't manufacture any scenario where leaf keys are explicitly set to equal other leaf hashes. In order for this to happen in practice, there would have to be a case where a leaf hash - the 32 byte output of SHA256 - happens to equal another leaf key by chance. And since leaf keys are also 32 bytes, generally also the output SHA256, this means a SHA256 collision would have to happen. I've tested with millions of leaves, and even with millions of leaves I haven't seen a collision. The chances are still very low - that's essentially the promise of cryptographic hash functions.

What is more interesting is this: Given the fact that we have an address space of 256 bits, we have (at most) 2^256 leaves. We also have (at most) 2^256 - 1 internal nodes. This means we have 2^257 - 1 total nodes sharing an address space of 256 bits. This represents another area of possible collisions. But again, in practice, this has yet to happen, even with millions of leaves.

However, it's arguable that there is still risk, since the chance is still nonzero. To mitigate these risks, I think we could separate storage for internal and leaf nodes.

@xgreenx
Copy link
Collaborator Author

xgreenx commented Jun 3, 2023

The problem is that we use SparseMerkleTree for the contract's state. It means that the user of the contract may set the key manually and override the data to corrupt the Merkle tree.

image

@bvrooman
Copy link
Contributor

bvrooman commented Jun 3, 2023

Interesting - if a user can set the key, then there is potential for a malicious insert. I don't know enough about how/where the keys are created, but if that is a possibility, we can mitigate that by separating leaves and nodes.

Additionally, do we need to allow users to set their own keys? Allowing direct user input into storage already sounds suspicious. What if we simply hash their input?

@xgreenx xgreenx marked this pull request as ready for review June 9, 2023 22:37
Copy link
Member

@Voxelot Voxelot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find @xgreenx

@xgreenx xgreenx added this pull request to the merge queue Jun 9, 2023
Merged via the queue into master with commit 0b04b2d Jun 9, 2023
@xgreenx xgreenx deleted the bugfix/override-leafs branch June 9, 2023 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants