Sergey O edited this page Feb 5, 2023 · 20 revisions

change log

Models

  • --model-type= specify which model to use.
    • By default, multimer inputs use the new alphafold2_multimer_v3 model with num-recycle=20 and recycle-early-stop-tolerance=0.5. Monomer inputs use the alphafold2_ptm model with num-recycle=3 and recycle-early-stop-tolerance=0.0. The user can override all of these options!
    • Bonus: all models can be used for either monomer or multimer prediction.
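
The default selection described above can be sketched as follows. This is a minimal illustration, not ColabFold's actual code: `Defaults` and `pick_defaults` are hypothetical names, and treating "more than one chain" as the multimer test is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defaults:
    model_type: str
    num_recycle: int
    recycle_early_stop_tolerance: float


def pick_defaults(num_chains: int) -> Defaults:
    """Hypothetical helper: return the defaults listed in this change log."""
    if num_chains > 1:
        # Multimer input: new multimer model, more recycles, early stopping on.
        return Defaults("alphafold2_multimer_v3", 20, 0.5)
    # Monomer input: pTM model, few recycles, tolerance of 0.0.
    return Defaults("alphafold2_ptm", 3, 0.0)


print(pick_defaults(1))
print(pick_defaults(2))
```

Any of the three values can still be overridden explicitly with the corresponding flag.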

Recycles

  • --num-recycle= specify number of recycles to run.
  • --recycle-early-stop-tolerance= specify when to stop.
    • When the RMSD between the distance matrices of consecutive recycles (in angstroms) falls below the specified threshold, the run will terminate.
  • --save-recycles save models generated at all recycles.
    • --save-all will do the same, but will also save all the intermediate outputs between recycles as a pickle file.
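
To make the early-stop criterion concrete, here is a rough sketch in plain Python. It assumes one 3D coordinate per residue and a simple root-mean-square over all entries of the distance-matrix difference; the exact atoms, masking, and normalization used by ColabFold may differ. `step` is a hypothetical stand-in for one recycle of the model.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between all coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distance_rmsd(prev, curr):
    """RMS difference between the distance matrices of two structures."""
    d_prev, d_curr = distance_matrix(prev), distance_matrix(curr)
    n = len(prev)
    sq = sum((d_prev[i][j] - d_curr[i][j]) ** 2
             for i in range(n) for j in range(n))
    return math.sqrt(sq / (n * n))


def run_recycles(step, coords, num_recycle, tolerance):
    """Run up to num_recycle recycles, stopping once the change is small.

    `step` is a hypothetical callable: coords -> coords for one recycle.
    Returns the list of structures, starting with the input.
    """
    history = [coords]
    for _ in range(num_recycle):
        new = step(history[-1])
        history.append(new)
        if distance_rmsd(history[-2], new) < tolerance:
            break  # converged: consecutive recycles barely differ
    return history


# Toy "model": each recycle moves every point halfway towards a target.
target = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]


def toy_step(coords):
    return [tuple((c + t) / 2 for c, t in zip(p, q))
            for p, q in zip(coords, target)]


start = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(len(run_recycles(toy_step, start, num_recycle=20, tolerance=0.2)) - 1)
```

With the strict `<` comparison used in this sketch, a tolerance of 0.0 never triggers an early stop, so all requested recycles run.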

Sampling

  • --random-seed= Specify random seed.
  • --num-seeds= Number of seeds to try.
    • Will iterate over range(random_seed, random_seed+num_seeds)
  • --use-dropout Activate dropouts during inference to sample from the uncertainty of the models.
  • --max-seq Number of sequence clusters to use.
  • --max-extra-seq Number of extra sequences to use.
    • These two options were previously set by --max-msa="max-seq:max-extra-seq", but are now split up to be more user-friendly.
    • Reducing either option will make the model less certain about its prediction and, when combined with random seeds, may allow sampling of alternative conformations.
    • --disable-cluster-profile For multimers, we find that reducing the number of sequence clusters (max-seq) results in poor model quality due to more diverse profiles. Disabling profiles appears to fix this issue! We suggest using this flag in combination with --max-seq when introducing uncertainty in multimer sampling.
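
As a small illustration of the seed iteration and of how the old combined --max-msa value maps onto the two new options, consider the sketch below. Both helper names are hypothetical and only mirror the behavior described in the list above.

```python
from typing import List, Tuple


def seeds_to_run(random_seed: int, num_seeds: int) -> List[int]:
    """Seeds tried for one job: range(random_seed, random_seed + num_seeds)."""
    return list(range(random_seed, random_seed + num_seeds))


def split_max_msa(max_msa: str) -> Tuple[int, int]:
    """Split a legacy "max-seq:max-extra-seq" string into its two values."""
    max_seq, max_extra_seq = (int(part) for part in max_msa.split(":"))
    return max_seq, max_extra_seq


print(seeds_to_run(random_seed=5, num_seeds=3))
print(split_max_msa("32:64"))
```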

Other

  • --num-relax= Specify the number of top models to relax. On its own, the --amber flag will by default trigger ALL models to be relaxed.
  • --recompile-padding= Now accepts an integer, which specifies how much to pad each input by, instead of a factor. This is now only used if more than one input is provided for "batch" computation.
  • bfloat16 is now enabled by default for both monomer and multimer models.
  • --stop-at-score=[0,100] As soon as any recycle, model, or random seed reaches the specified score, the job will terminate.
    • The metric used can be specified by the --rank=[auto,plddt,multimer,ptm,iptm] flag. For "auto", "multimer" is used for complexes and "plddt" is used for monomers. The "multimer" metric is computed as 80*iptm + 20*ptm. Note that all metrics are now on a scale of 0 to 100.
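
The ranking metric and the stop condition can be sketched like this. It is a hedged illustration only: `rank_score` and `should_stop` are hypothetical names, and the sketch assumes pLDDT is supplied on a 0 to 100 scale while pTM and ipTM are supplied on a 0 to 1 scale, and that "reaches" means greater than or equal to.

```python
def rank_score(rank: str, is_complex: bool,
               plddt: float, ptm: float, iptm: float) -> float:
    """Return the ranking metric on a 0-100 scale.

    Assumes plddt is in [0, 100] and ptm, iptm are in [0, 1].
    """
    if rank == "auto":
        # "multimer" for complexes, "plddt" for monomers.
        rank = "multimer" if is_complex else "plddt"
    if rank == "plddt":
        return plddt
    if rank == "ptm":
        return 100 * ptm
    if rank == "iptm":
        return 100 * iptm
    if rank == "multimer":
        return 80 * iptm + 20 * ptm
    raise ValueError(f"unknown rank metric: {rank}")


def should_stop(score: float, stop_at_score: float) -> bool:
    """True once the score reaches the requested threshold (assumed >=)."""
    return score >= stop_at_score


score = rank_score("auto", is_complex=True, plddt=90.0, ptm=0.5, iptm=0.75)
print(score, should_stop(score, stop_at_score=70))
```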

Bugfixes

  • ipTM and pTM scores were incorrectly computed if padding was used, because the padded region was included in the computation. This only affects local users, as padding was disabled in the Colab Notebook. Since padding was at most a factor of 1.1, this likely didn't have a big effect on the scores. The model quality/ranking is unaffected.
  • When the monomer model (alphafold2_ptm) option was used for modeling complexes, the first full-length sequence was not defined.