Andy M. Yip (1,2) and Steve Horvath (3,4)

1 Dept. of Mathematics, UCLA

2 Dept. of Mathematics, National University of Singapore

3 Dept. of Human Genetics, David Geffen School of Medicine, UCLA

4 Dept. of Biostatistics, School of Public Health, UCLA

shorvath@mednet.ucla.edu

http://www.ph.ucla.edu/biostat/people/horvath.htm

Department of Human Genetics and Department of Biostatistics

University of California, Los Angeles, CA 90095

**Background**

Network methods are increasingly used to represent the interactions of genes and/or proteins. Genes or proteins that are directly linked may have a similar biological function or may be part of the same biological pathway. Since the information on the connection (adjacency) between two nodes may be noisy or incomplete, it can be desirable to consider alternative measures of pairwise interconnectedness. Here we study a class of measures that are proportional to the number of neighbors that a pair of nodes share. For example, the topological overlap measure of Ravasz et al. [1] can be interpreted as a measure of agreement between the m=1 step neighborhoods of two nodes. Several studies have shown that two proteins with a higher topological overlap are more likely to belong to the same functional class than proteins with a lower topological overlap. We ask whether a measure of topological overlap based on higher-order neighborhoods could give rise to a more robust and sensitive measure of interconnectedness.

**Results**

We generalize the topological overlap measure from m=1 step neighborhoods to m ≥ 2 step neighborhoods. This allows us to define the m-th order generalized topological overlap measure (GTOM) by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing this count to take a value between 0 and 1. Using theoretical arguments, a yeast co-expression network application, and a fly protein network application, we illustrate the usefulness of the proposed measure for module detection and gene neighborhood analysis.
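The two steps above (count shared m-step neighbors, then normalize to [0, 1]) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function name `gtom` is hypothetical. The normalization used here, (shared + a_ij) / (min(n_i, n_j) + 1 - a_ij), is an assumption: it extends the m=1 topological overlap formula by replacing direct neighbors with nodes reachable within m steps, and the abstract itself does not state the exact formula. The input is assumed to be a symmetric 0/1 adjacency matrix with a zero diagonal.

```python
import numpy as np


def gtom(adj, m=1):
    """Sketch of an m-th order topological overlap matrix.

    adj : symmetric 0/1 adjacency matrix with zero diagonal (assumed).
    m   : neighborhood order; m=1 uses direct neighbors only.
    """
    adj = np.asarray(adj, dtype=int)
    n = adj.shape[0]

    # reach[i, j] = 1 if node j can be reached from node i in at most m steps.
    step = adj + np.eye(n, dtype=int)
    reach = np.eye(n, dtype=int)
    for _ in range(m):
        reach = ((reach @ step) > 0).astype(int)
    np.fill_diagonal(reach, 0)  # a node is not its own neighbor

    # (i) Count the m-step neighbors shared by each pair of nodes.
    shared = reach @ reach.T
    size = reach.sum(axis=1)  # size of each m-step neighborhood

    # (ii) Normalize to [0, 1] (assumed form, see the text above).
    denom = np.minimum.outer(size, size) + 1 - adj
    overlap = (shared + adj) / denom
    np.fill_diagonal(overlap, 1.0)  # convention: a node fully overlaps itself
    return overlap


# Path graph 0 - 1 - 2 - 3 as a small usage example.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
tom_1 = gtom(path, m=1)
tom_2 = gtom(path, m=2)
```

On this path graph the two end nodes share no direct neighbors, so their m=1 overlap is 0. With m=2 they share both interior nodes and the overlap rises to 2/3, which illustrates how a higher order makes the measure more sensitive to indirect connections.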

**Conclusions**

Topological overlap can serve as an important filter to counter the effects of spurious or missing connections between network nodes. The m-th order topological overlap measure allows one to trade off sensitivity against specificity when defining pairwise interconnectedness and network modules.

Yip A, Horvath S (2007) Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics 8:22.

Tutorial in Microsoft Word Format

Dataset

Data
Annotation File

PowerPoint version PDF version

Weighted Gene Co-Expression Network Page

The old webpage has been moved to here.

2007-01-27

*Please send your suggestions and comments to: shorvath@mednet.ucla.edu*