Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There is non-determinsim in cosmos-sdk simulation #559

Closed
adu-crypto opened this issue Jun 20, 2022 · 13 comments
Closed

There is non-determinsim in cosmos-sdk simulation #559

adu-crypto opened this issue Jun 20, 2022 · 13 comments
Assignees

Comments

@adu-crypto
Copy link
Contributor

Describe the bug

There is non-deterministic behavior in cosmos-sdk level simulation caused by Simstate.Rand.
if we commented this line x/cronos/simulaltion/genesis.go

	// if we use rand here it will cause non determinism, but if we comment this line everything is fine.
	_ = simState.Rand.Intn(100)

TestAppStateDeterminism could pass.
But TestAppStateDeterminism failed if we used Simstate.Rand even if the value was not used anywhere.

To Reproduce

Steps to reproduce the behavior:

  1.       git pull https://github.com/adu-crypto/cronos.git
  2.       git checkout evm-simulation-determinism
  3.       make test-sim-nondeterminism
  4. See error:
    sim_test.go:339: 
        	Error Trace:	sim_test.go:339
        	Error:      	Not equal: 
        	            	expected: "\xad\xa7㌵\x01\xc3bR\xfb\xb1;\xad\x13\x04\xb5d@\xdaS?\xa7;\x05`X\x90\xa5\x91\xf1M\xd2"
        	            	actual  : "4\xbd\xa6\x1f\xabzY4\xfe?e\x8b\xc6\xff%\xb9\xed1u\xdc\x19G\xf7ȑVv}\xae\xe5\xcf\xd4"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-��㌵��bR��;����d@�S?�;�`X����M�
        	            	+4����zY4�?e���%��1u��G�ȑVv}����
        	Test:       	TestAppStateDeterminism
        	Messages:   	non-determinism in seed 3916589616287113937: 2/3, attempt: 2/5
--- FAIL: TestAppStateDeterminism (146.91s)
FAIL
FAIL	github.com/crypto-org-chain/cronos/app	149.376s
FAIL

Expected behavior
Simstate.Rand should be deterministic with the same seed.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: [e.g. iOS]: macos Big Sur
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]: go 1.18.1

Smartphone (please complete the following information):

  • Device: [e.g. iPhone6]
  • OS: [e.g. iOS8.1]
  • Browser [e.g. stock browser, safari]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@adu-crypto
Copy link
Contributor Author

iaviewer shows the evm and cronos module data is not wrote to the backend db.

(base) adudu@CNMAC0342 cronos % iaviewer data app/tmp/2-1/testing.db "s/k:cronos/" 1  
Got version: 1
Printing all keys with hashed values (to detect diff)
Hash: E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855
Size: 0

@adu-crypto
Copy link
Contributor Author

by changing the backend db to goleveldb and compare store's data and hash of all modules in non-deterministic rounds:

#!/usr/bin

declare -a modules=("acc" 
                    "bank" 
                    "staking" 
                    "mint" 
                    "distribution" 
                    "slashing" 
                    "gov" 
                    "params" 
                    "upgrade" 
                    "evidence" 
                    "capability" 
                    "feegrant" 
                    "authz" 
                    "ibc" 
                    "transfer" 
                    "evm" 
                    "feemarket" 
                    "cronos"
                   )
declare -a dbs=("1-0" "1-1")
        
for module in "${modules[@]}"
do
    for db in "${dbs[@]}"
    do
        iaviewer data app/tmp/"$db"/testing.db "s/k:${module}/" 1 > tmp/"$module"-"$db".data
    done
    delta=$(diff tmp/"$module"-2-0.data tmp/"$module"-2-1.data)
    if [ "$delta" != "" ] 
    then
        echo "${module} is not deterministic"
    fi
done

I got:

feemarket is not deterministic

@adu-crypto
Copy link
Contributor Author

feemarket stores the BlockGasUsed.

(base) adudu@192 cronos % iaviewer data app/tmp/1-0/testing.db "s/k:feemarket/" 1                
Got version: 1
Printing all keys with hashed values (to detect diff)
  01
    AE2D60A6B8D5FBC4C55365A3ED0FF2788EF9F88509C8B9CCD5555EE54130F854
Hash: 55C1DAD91EDEAEE2DFBD2F9A9C3C86938549AE19C4A9A291D18005F2B04D1EBD
Size: 1
(base) adudu@192 cronos % iaviewer data app/tmp/1-1/testing.db "s/k:feemarket/" 1 
Got version: 1
Printing all keys with hashed values (to detect diff)
  01
    941CD147CB2BEE405776E744644AB12FADB8AB7351FDCD2807B4786A098579A3
Hash: 2F1FEB4BC48C0967AA23E81E38B156C0C76F10C19F236F9AD95E1EAD673F3F34
Size: 1

we can see BlockGasUsed value is different

@adu-crypto
Copy link
Contributor Author

Besides, by debugging, I find the genesis states and operations of all modules are deterministic even in the non-deterministic rounds.
Maybe non-deterministic behavior happens in gas calculation?

@adu-crypto
Copy link
Contributor Author

Since the feemarket module store is not deterministic, I debug feemarket blockGasUsed in ethermint repo.

  • change blocksize to 1 in `Makefile` to fasten testing speed.
  • replace cosmos-sdk dependency to custom code base for debugging `runTx`.
By debugging, I find that `blockGasUsed` value is not deterministic even if the random seed, genesis states and module operations are the same. The value changes every time we re-run the simulation. By diff the two times simulation logs, we see:
< after anteHandler, gas meter value: 99423
< before runMsgs, gas meter value: 99423
< finish running tx, gas meter value: 102011
---
> after anteHandler, gas meter value: 98611
> before runMsgs, gas meter value: 98611
> finish running tx, gas meter value: 101199
224c224
< tx execution: gasWanted 10000000, gasUsed 102011
---
> tx execution: gasWanted 10000000, gasUsed 101199
227,229c227,229
< after anteHandler, gas meter value: 111558
< before runMsgs, gas meter value: 111558
< finish running tx, gas meter value: 129051
---
> after anteHandler, gas meter value: 112174
> before runMsgs, gas meter value: 112174
> finish running tx, gas meter value: 129667
231c231
< tx execution: gasWanted 10000000, gasUsed 129051
---
> tx execution: gasWanted 10000000, gasUsed 129667
234,236c234,236
< after anteHandler, gas meter value: 99675
< before runMsgs, gas meter value: 99675
< finish running tx, gas meter value: 211407
---
> after anteHandler, gas meter value: 98877
> before runMsgs, gas meter value: 98877
> finish running tx, gas meter value: 210609
238c238
< tx execution: gasWanted 10000000, gasUsed 211407
---
> tx execution: gasWanted 10000000, gasUsed 210609
241,243c241,243
< after anteHandler, gas meter value: 112060
< before runMsgs, gas meter value: 112060
< finish running tx, gas meter value: 129829
---
> after anteHandler, gas meter value: 112620
> before runMsgs, gas meter value: 112620
> finish running tx, gas meter value: 130389

The non-deterministic behavior happens in app.anteHandler

@adu-crypto
Copy link
Contributor Author

the root cause is found: evmos/ethermint#1151 (comment)

the reason is easy:
https://github.com/mmsqe/cosmos-sdk/blob/dbe1dba5a6dfa118412a6a3ea795c5f1b746eee7/simapp/helpers/test_helpers.go#L26

r := rand.New(rand.NewSource(time.Now().UnixNano()))
GenTx is not using Simstate.Rand, so every time it creates memo with different length.
And ethermint.AnteHandler is consuming gas according to the tx bytes length, so finally gas used is not deterministic.

@yihuang
Copy link
Collaborator

yihuang commented Jun 27, 2022

the root cause is found: evmos/ethermint#1151 (comment)

but it shouldn't cause app hash mismatch?

@adu-crypto
Copy link
Contributor Author

feemarket store BlockGasWanted value in its store.
so when root multi store commits, the feemarket substore is not deterministic

@adu-crypto
Copy link
Contributor Author

Oh, sorry maybe I describe the problem in a confusing way.
The problem is not app hash mismatch in consensus level, but actually different root hash by running simulation two times with the same random seed.

@adu-crypto
Copy link
Contributor Author

open a PR in cosmos-sdk repo to fix this

@adu-crypto
Copy link
Contributor Author

/runsim

@tomtau
Copy link
Contributor

tomtau commented Jul 18, 2022

@tomtau tomtau closed this as completed Jul 18, 2022
@adu-crypto
Copy link
Contributor Author

/runsim

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants