haddock3: How to avoid saving intermediate files

My .cfg configuration file is as follows:

run_dir = "run-test"
mode = "local"
ncores = 64

molecules = [
    "data/1.pdb",
    "data/2.pdb"
]

[topoaa]
autohis = false

[rigidbody] 
tolerance = 5
sampling = 10000
surfrest = true
cmrest = true

[seletop]
select = 400

[flexref]
tolerance = 5

[emref]
tolerance = 5

[clustfcc]
min_population = 4

[seletopclusts]
top_models = 10

[caprieval]
# reference_fname = "data/1dee.pdb"

[prodigyprotein]
chains =  ["A", "B"] 
to_pkd = true

Each step generates its own folders and files, and I don't want to save so many. We are less interested in the docking process itself than in obtaining the score directly, without all the intermediate files. Alternatively, is there a parameter in the .cfg configuration file that prevents the files of a given step from being saved?

Do you want to only score an existing model of a complex? And not redock?

If you just want to get the HADDOCK score, you can use haddock3-score with the PDB file of your complex as its argument.
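
To score several complexes in one go, the command can be wrapped in a small script. The following is only a sketch: it assumes haddock3-score is on the PATH, that it is called with a single PDB path as in this thread, and that the result appears on a line of the form "HADDOCK-score (emscoring) = <value>" (the format quoted further down). The function names are illustrative, and whether the line is written to stdout or stderr is not confirmed here, so both are searched.

```python
import re
import subprocess

# Output line format as quoted in this thread, e.g.
#   HADDOCK-score (emscoring) = -323.9506
SCORE_RE = re.compile(r"HADDOCK-score \(emscoring\)\s*=\s*(-?\d+(?:\.\d+)?)")


def parse_haddock_score(text):
    """Return the score found in haddock3-score output, or None if absent."""
    match = SCORE_RE.search(text)
    return float(match.group(1)) if match else None


def score_pdb(pdb_path):
    """Run `haddock3-score <pdb_path>` and return the parsed score.

    Assumption: haddock3 is installed and the score line is printed to
    stdout or stderr; both streams are searched.
    """
    result = subprocess.run(
        ["haddock3-score", pdb_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_haddock_score(result.stdout + "\n" + result.stderr)
```

Calling score_pdb("data/7fbk.pdb") would then return the score as a float, or None if no matching line was found.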

In the current workflow you are performing an ab-initio docking run with increased sampling.
This does indeed generate a lot of files, which you can delete afterwards. There is no way of keeping all those files in memory.
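
Deleting the files afterwards can be scripted. The sketch below assumes that each step folder inside the run directory is named "<index>_<module>" (for example "1_rigidbody"); check this against your own run directory before using it. The function name and the list of modules to keep are hypothetical, and dry_run defaults to True so that nothing is deleted until the listing has been reviewed.

```python
import re
import shutil
from pathlib import Path

# Assumption: step folders are named "<index>_<module>", e.g. "1_rigidbody".
STEP_DIR_RE = re.compile(r"^(\d+)_(\w+)$")


def remove_intermediate_steps(run_dir, keep_modules=("caprieval", "prodigyprotein"), dry_run=True):
    """Delete step folders in run_dir except those whose module is in keep_modules.

    Returns the names of the folders that were (or, in a dry run, would be) removed.
    """
    removed = []
    for entry in sorted(Path(run_dir).iterdir()):
        match = STEP_DIR_RE.match(entry.name)
        if not entry.is_dir() or match is None:
            continue  # leave log files and unrelated folders alone
        if match.group(2) in keep_modules:
            continue  # keep the steps whose results are needed
        removed.append(entry.name)
        if not dry_run:
            shutil.rmtree(entry)
    return removed
```

Run it only once the workflow has finished, first with dry_run=True to see what would be removed, then with dry_run=False.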

Note that you don’t need surfrest to be turned on when you use cmrest - it unnecessarily increases the computational time.

I tried the method you provided for haddock3-score and discovered some problems.

There are three questions:

  1. What is the scoring logic of haddock3-score? How does it select the chains? If there are multiple chains in a complex, and I only want to know the docking scores of two of them, what should I do?
    For example, I downloaded the 7fbk.pdb complex from the PDB website. It contains four chains A, B, C, and D, along with some other molecules. Without any processing, I directly used haddock3-score data/7fbk.pdb, and the output was HADDOCK-score (emscoring) = -323.9506. Then I only kept the four protein chains and removed all other substances, and the score was -289.3158. Subsequently, I kept the two chains A and C, and the score became -127.0435.

  2. What is the difference between the score computed directly with haddock3-score and the emscoring.tsv scores obtained by re-docking?
    For example, the first method: I downloaded the 7fbk.pdb complex from the PDB website and kept only the required A and C chains. Running haddock3-score data/7fbk.pdb gave HADDOCK-score (emscoring) = -127.0435.
    The second method: I used the FASTA sequences of the A and C chains of the 7fbk.pdb complex to regenerate separate PDB files, 7fbk_a.pdb and 7fbk_c.pdb, and ran the docking with the following .cfg configuration file:

run_dir = "run-test"
mode = "local"
ncores = 64

molecules = [
    "data/7fbk_a.pdb",
    "data/7fbk_c.pdb"
]

[topoaa]
autohis = false

[rigidbody]
tolerance = 5
sampling = 10000
cmrest = true

[seletop] 
select = 400

[flexref]
tolerance = 5

[emref] 
tolerance = 5

[clustfcc]
min_population = 4

[seletopclusts]
top_models = 10

[emscoring]

the top-ranked score in the resulting emscoring.tsv file was:

structure        original_name          md5   score
emscoring_1.pdb  cluster_1_model_1.pdb  None  -105.694

  3. What is the docking scoring threshold used to distinguish whether two molecules are bound?

By default HADDOCK scores all chains.

If you use, for example, the emscoring module, you can turn on per-interface scoring with the option below.
The per-interface information is then written to the header of the PDB files.

per_interface_scoring = true
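
Because all chains are scored by default, the other way to restrict the score to two chains is to remove the remaining chains and hetero atoms from the PDB file first, as was done by hand in the question. A minimal sketch of that filtering step follows. It relies only on the fixed-column PDB format, where the chain ID sits in column 22 of ATOM and TER records; the function name is illustrative, and HETATM records are dropped on purpose.

```python
def keep_chains(pdb_lines, chains):
    """Return the ATOM/TER lines belonging to the given chains, followed by END.

    HETATM records (ligands, waters) and all header records are dropped.
    """
    wanted = set(chains)
    kept = []
    for line in pdb_lines:
        # Chain ID is column 22 of the record, i.e. string index 21.
        if line.startswith(("ATOM", "TER")) and len(line) > 21 and line[21] in wanted:
            kept.append(line.rstrip("\n"))
    kept.append("END")
    return kept


def write_chains(src_path, dst_path, chains):
    """Read src_path, keep only the requested chains, and write dst_path."""
    with open(src_path) as handle:
        kept = keep_chains(handle, chains)
    with open(dst_path, "w") as handle:
        handle.write("\n".join(kept) + "\n")
```

For example, write_chains("data/7fbk.pdb", "data/7fbk_AC.pdb", ["A", "C"]) would produce a file containing only chains A and C; the output file name here is just an example.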

  2. What is the difference between the score computed directly with haddock3-score and the emscoring.tsv scores obtained by re-docking?
    For example, the first method: I downloaded the 7fbk.pdb complex from the PDB website and kept only the required A and C chains. Running haddock3-score data/7fbk.pdb gave HADDOCK-score (emscoring) = -127.0435.
    The second method: I used the FASTA sequences of the A and C chains of the 7fbk.pdb complex to regenerate separate PDB files, 7fbk_a.pdb and 7fbk_c.pdb, and ran the docking with the following .cfg configuration file:

In your second approach you are redocking the complex, which is a pure waste of time for what you are doing!
You could combine the two files into one and use the following workflow:

run_dir = "run-test"
mode = "local"
ncores = 64

molecules = [
    "data/7fbk_a+c.pdb"
]

[topoaa]

[emscoring]
per_interface_scoring = true

You can even generate a single ensemble of models for scoring purposes. This ensemble file should combine the models using MODEL/ENDMDL records (it can be generated from a set of single-model files using the pdb_mkensemble command).
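
To illustrate the MODEL/ENDMDL layout, here is a minimal sketch that wraps each single-model PDB text in its own MODEL ... ENDMDL block. It is only a stand-in for pdb_mkensemble to show what the file looks like, not a replacement for it; the function name is hypothetical, and no check is made that the models contain the same atoms.

```python
def make_ensemble(models):
    """Combine single-model PDB texts into one MODEL/ENDMDL ensemble string."""
    out = []
    for number, text in enumerate(models, start=1):
        # MODEL record: the serial number is right-justified in columns 11-14.
        out.append(f"MODEL     {number:>4d}")
        for line in text.splitlines():
            # Drop existing MODEL/ENDMDL/END records so the framing stays valid.
            if line.startswith(("MODEL", "END")):
                continue
            out.append(line)
        out.append("ENDMDL")
    out.append("END")
    return "\n".join(out) + "\n"
```

The returned string can be written to a single .pdb file and listed under molecules in place of the individual models.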