Sergey O edited this page Feb 5, 2023 · 20 revisions

change log

Models

  • --model-type=[auto,alphafold2_ptm,alphafold2_multimer_v3] specify which model to use.
    • If auto, alphafold2_ptm is selected for monomer inputs, and alphafold2_multimer_v3 is selected for complex (multimer) inputs.
    • Bonus: all models can be used for either monomer or multimer prediction.
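
The "auto" rule above can be sketched as follows. This is a minimal illustration, not ColabFold's actual code: the helper name `resolve_model_type` and the use of a chain count to decide whether an input is a complex are assumptions made for the example.

```python
def resolve_model_type(model_type: str, num_chains: int) -> str:
    """Hypothetical helper mirroring the 'auto' rule described above.

    Assumption: an input with more than one chain counts as a complex.
    """
    if model_type != "auto":
        # An explicit choice is respected; any model can be used for
        # either monomer or multimer prediction.
        return model_type
    if num_chains > 1:
        return "alphafold2_multimer_v3"
    return "alphafold2_ptm"


print(resolve_model_type("auto", 1))
print(resolve_model_type("auto", 2))
print(resolve_model_type("alphafold2_ptm", 2))
```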

Recycles

  • --num-recycle= specify number of recycles to run. --recycle-early-stop-tolerance= specify when to stop.
    • The tolerance is compared against the RMSD between the distance matrices of consecutive recycles (in angstroms). If the RMSD drops below the specified value, recycling terminates.
    • If not specified, num-recycle=20 and recycle-early-stop-tolerance=0.5 are used for alphafold2_multimer_v3, and num-recycle=3 and recycle-early-stop-tolerance=0.0 are used for alphafold2_ptm.
  • --save-recycles save models generated at all recycles.
    • --save-all will do the same, but will also save all the intermediate outputs between recycles as a pickle file.
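
To make the early-stop criterion concrete, here is a rough NumPy sketch of an RMSD between the distance matrices of two consecutive recycles. It assumes one coordinate per residue and ignores any masking or distance cutoffs the real implementation may apply; the function names are hypothetical.

```python
import numpy as np


def distance_matrix(coords: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances for an (N, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def recycle_difference(prev_coords: np.ndarray, curr_coords: np.ndarray) -> float:
    """RMSD between the distance matrices of two recycles (angstroms)."""
    delta = distance_matrix(prev_coords) - distance_matrix(curr_coords)
    return float(np.sqrt(np.mean(delta ** 2)))


def should_stop(prev_coords: np.ndarray, curr_coords: np.ndarray, tolerance: float) -> bool:
    """Stop recycling once the difference drops below the tolerance."""
    return recycle_difference(prev_coords, curr_coords) < tolerance


coords = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0]])
print(recycle_difference(coords, coords * 1.01))
print(should_stop(coords, coords * 1.01, tolerance=0.5))
```

Because the comparison is a strict "below", a tolerance of 0.0 never triggers an early stop in this sketch, so all requested recycles would run.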

Sampling

  • --random-seed= Specify random seed.
  • --num-seeds= Number of seeds to try.
    • Will iterate over range(random_seed, random_seed+num_seeds).
  • --use-dropout Activate dropouts during inference to sample from the uncertainty of the models.
  • --max-seq Number of sequence clusters to use. --max-extra-seq Number of extra sequences to use.
    • These two options were previously set by --max-msa="max-seq:max-extra-seq", but are now split up to be more user-friendly.
    • Reducing either option will make the model less certain about its prediction and, when combined with random seeds, may allow sampling of alternative conformations.
    • --disable-cluster-profile For multimers, we find that reducing the number of sequence clusters (max-seq) results in poor model quality due to more diverse profiles. Disabling cluster profiles appears to fix this issue. We suggest using this flag in combination with --max-seq when introducing uncertainty in multimer sampling.
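
As an illustration of two of the points above, the snippet below lists the seeds that would be tried for a given --random-seed and --num-seeds, and shows how an old-style --max-msa value maps onto the two new options. Both helper names are hypothetical and are not part of ColabFold.

```python
def seeds_to_run(random_seed: int, num_seeds: int) -> list[int]:
    """Seeds tried, following range(random_seed, random_seed + num_seeds)."""
    return list(range(random_seed, random_seed + num_seeds))


def split_max_msa(max_msa: str) -> tuple[int, int]:
    """Split an old-style 'max-seq:max-extra-seq' string into two integers."""
    max_seq, max_extra_seq = (int(part) for part in max_msa.split(":"))
    return max_seq, max_extra_seq


print(seeds_to_run(random_seed=0, num_seeds=4))
print(split_max_msa("32:64"))
```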

Other

  • --num-relax= Specify the number of top models to relax. By default, the --amber flag triggers relaxation of ALL models.
  • --recompile-padding= Now accepts an integer, which specifies how much to pad each input by, instead of a factor. This is now only used if more than a single input is provided for "batch" computation.
  • bfloat16 is now enabled by default for both monomer and multimer models. This may change the results slightly (even with older models) due to slight numeric differences in computation.
  • --stop-at-score=[0,100] As soon as one of the recycles or models or random seeds reaches the specified score, the job will terminate.
    • The metric used can be specified by the --rank=[auto,plddt,multimer,ptm,iptm] flag. For "auto", "multimer" is used for complexes and "plddt" is used for monomers. "multimer" metric is computed as 80*iptm + 20*ptm. Note, all metrics are now on a scale of 0 to 100.
  • --save-all will output a pickled file at each recycle, saving all the results as a dictionary of numpy arrays. This includes the single and pair representations. (If you only want to save the single or pair representations, you can use the old flags --save-single-representations and/or --save-pair-representations.)
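
The ranking metric and the --stop-at-score check can be sketched as below. This is an illustration under stated assumptions rather than the actual implementation: pLDDT is assumed to arrive on a 0 to 100 scale, pTM and ipTM on a 0 to 1 scale, "reaches" is taken to mean greater than or equal, and both function names are hypothetical.

```python
def ranking_score(rank: str, plddt: float, ptm: float, iptm: float,
                  is_complex: bool) -> float:
    """Return the ranking metric on a 0 to 100 scale.

    Assumptions: plddt is already in [0, 100]; ptm and iptm are in [0, 1].
    """
    if rank == "auto":
        rank = "multimer" if is_complex else "plddt"
    if rank == "multimer":
        return 80 * iptm + 20 * ptm
    if rank == "plddt":
        return plddt
    if rank == "ptm":
        return 100 * ptm
    if rank == "iptm":
        return 100 * iptm
    raise ValueError(f"unknown rank metric: {rank}")


def reached_stop_score(score: float, stop_at_score: float) -> bool:
    """True once a recycle, model, or seed reaches the requested score."""
    return score >= stop_at_score


score = ranking_score("auto", plddt=90.0, ptm=0.5, iptm=0.75, is_complex=True)
print(score, reached_stop_score(score, stop_at_score=85))
```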

Bugfixes

  • ipTM scores and pTM scores were incorrectly computed if padding was used, because the padded region was included in the computation. This only affects local users, as padding was disabled in the Colab Notebook. Since padding was at most by a factor of 1.1, this likely didn't have a big effect on the scores. The model quality/ranking is unaffected.
  • If the monomer model (alphafold_ptm) option was used for modeling complexes, the first full-length sequence was not defined.