Coverage in FCC clustering

mpavsic · December 4, 2021, 9:00pm

Dear HADDOCKers,

I did several HADDOCK runs using different parameters, and I’d like to see how similar the models obtained in these runs are. To do this, I’m clustering all obtained models (basically, the cluster*_1.pdb files) using FCC clustering (GitHub - haddocking/fcc: Fraction of Common Contacts Clustering Algorithm for Protein Structures). The reported coverage is between 34% (for a 0.75 cutoff) and 57% (for a 0.33 cutoff).

My question is: why are not all models clustered? Are some models so distant from the others that they can’t form a cluster of even 2 members?

Many thanks in advance.
Miha

mpavsic · December 5, 2021, 10:06am

Oh, I think I’ve found the answer - setting the minimum cluster size (for cluster-fcc.py) to 1 (option: -c 1) results in 100% coverage, with some clusters containing only 1 model (i.e. no neighbours).
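
To illustrate why the minimum cluster size changes the reported coverage, here is a minimal sketch. It is not the FCC code itself: the `coverage` helper and the example grouping of 10 models are hypothetical, and it only assumes that groups smaller than the minimum size are left out of the output.

```python
def coverage(clusters, n_models, min_size):
    """Fraction of models that end up in a reported cluster.

    clusters: list of groups of model ids (hypothetical input, not FCC output format)
    min_size: groups with fewer members than this are not reported
    """
    kept = [c for c in clusters if len(c) >= min_size]
    return sum(len(c) for c in kept) / n_models


# Hypothetical grouping of 10 models; models 7-10 have no neighbour
# above the similarity cutoff, so each forms a group of one.
clusters = [[1, 2, 3, 4], [5, 6], [7], [8], [9], [10]]

print(coverage(clusters, 10, min_size=4))  # 0.4
print(coverage(clusters, 10, min_size=2))  # 0.6
print(coverage(clusters, 10, min_size=1))  # 1.0
```

With a minimum size of 1, every model counts as clustered, which matches the 100% coverage described above.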

amjjbonvin · December 5, 2021, 6:42pm

But then you are effectively ranking based on the individual model scores and no longer on the cluster-average score (taken over the best 4 models of the cluster, if a minimum of four members is required).
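
A small sketch of that point, assuming the cluster score is the mean over the best-scoring members and that a lower score is better. The `cluster_score` helper and the score values are made up for illustration and are not taken from HADDOCK's code.

```python
def cluster_score(scores, top_n=4):
    """Mean score of the top_n best members (lower score = better)."""
    best = sorted(scores)[:top_n]
    return sum(best) / len(best)


# Hypothetical model scores for two clusters.
five_members = [-120.0, -110.0, -100.0, -90.0, -50.0]
singleton = [-130.0]

print(cluster_score(five_members))  # -105.0, averaged over the best 4
print(cluster_score(singleton))     # -130.0, just the single model's score
```

A cluster with one member is ranked purely on that model's score, so a single outlier with a good score can outrank a well-populated cluster.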

mpavsic · December 5, 2021, 10:19pm

Yes, true, but I tried to cluster only the cluster*_1.pdb files which had significantly different subunit-subunit orientation (the cluster*_1.pdb through cluster*_4.pdb files from the same cluster looked very similar).
