Cluster and propensity based approximation of a network
John Ranola, Peter Langfelder, Kenneth Lange, Steve Horvath
Human Genetics and Biostatistics, University of California, Los Angeles
SHorvath (at) mednet (dot) ucla (dot) edu
Peter (dot) Langfelder (at) gmail (dot) com
- Article abstract
- Talk, ppt slides
- Automatic installation from CRAN
- Manual download and installation
- Problems installing or using the package
- Introduction to PropClust
- Old versions of the R package
- Citing the PropClust package
The models in the article generalize current models for both correlation networks and multigraph networks. Correlation networks are widely applied in genomics research. In contrast to general networks, it is straightforward to test the statistical significance of an edge in a correlation network. It is also easy to decompose the underlying correlation matrix and generate informative network statistics such as the module eigenvector. However, correlation networks only capture the connections between numeric variables. An open question is whether one can find suitable decompositions of the similarity measures employed in constructing general networks. Multigraph networks are attractive because they support likelihood based inference. Unfortunately, it is unclear how to adjust current statistical methods to detect the clusters inherent in many data sets.
Here we present an intuitive and parsimonious parametrization of a general similarity measure such as a network adjacency matrix. The cluster and propensity based approximation (CPBA) of a network not only generalizes correlation network methods but also multigraph methods. In particular, it gives rise to a novel and more realistic multigraph model that accounts for clustering and provides likelihood based tests for assessing the significance of an edge after controlling for clustering. We present a novel Majorization-Minimization (MM) algorithm for estimating the parameters of the CPBA. To illustrate the practical utility of the CPBA of a network, we apply it to gene expression data and to a bi-partite network model for diseases and disease genes from the Online Mendelian Inheritance in Man (OMIM).
The CPBA of a network is theoretically appealing since a) it generalizes correlation and multigraph network methods, b) it improves likelihood based significance tests for edge counts, c) it directly models higher-order relationships between clusters, and d) it suggests novel clustering algorithms. The CPBA of a network is implemented in Fortran 95 and bundled in the freely available R package PropClust.
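As a rough illustration of the idea, the sketch below builds a small approximate adjacency matrix in R. It assumes the form used in the article cited at the bottom of this page, in which the adjacency between nodes i and j is approximated by the product of the two node propensities and a cluster-to-cluster similarity. The cluster labels, propensities, and similarity values are made up for illustration. The code does not call PropClust and is not the package's estimation algorithm.

```r
# Toy sketch of a cluster and propensity based approximation (not PropClust code).
# Assumption: a_ij is approximated by p_i * p_j * r[c_i, c_j].
set.seed(1)
n <- 6
cluster <- c(1, 1, 1, 2, 2, 2)          # hypothetical cluster labels c_i
propensity <- runif(n, 0.5, 1)          # hypothetical node propensities p_i
r <- matrix(c(0.9, 0.2,
              0.2, 0.8),
            nrow = 2, byrow = TRUE)     # hypothetical cluster similarity matrix

# outer() gives p_i * p_j; r[cluster, cluster] expands r to an n x n matrix
# whose (i, j) entry is r[c_i, c_j].
approx_adj <- outer(propensity, propensity) * r[cluster, cluster]
print(round(approx_adj, 2))
```

In this toy setup, entries within a cluster are scaled by the larger diagonal values of r, while entries between clusters are scaled by the smaller off-diagonal value.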
A set of tutorials illustrating various aspects of PropClust is available on the tutorial page.
Automatic installation from CRAN
The PropClust package is available from the Comprehensive R Archive Network (CRAN), the standard repository
for R add-on packages. To install PropClust and the packages it requires, type
install.packages("PropClust")
at the R prompt. This installs the PropClust package and all necessary dependencies. Note that CRAN provides the newest version of PropClust only for the newest (minor) version of R.
Users of older R versions will need to follow the manual download and installation instructions below,
although we recommend updating to the latest version of R.
Note for Mac users:
CRAN may occasionally fail to compile the PropClust package for
Mac OS X. This leads to the error message “Package PropClust is not available…” when calling
install.packages(). If this occurs, please download the binary version from here and follow the installation instructions (or, if you are able to compile packages locally, download the source and install that).
Note of caution: The newest version of PropClust is available from CRAN only for the current R version. Please update R to the newest version
or use the manual download below.
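Before troubleshooting, it can help to confirm which R version is running and whether PropClust is already installed. The following is a minimal sketch using base R functions only; the variable names are illustrative.

```r
# Report the running R version and whether PropClust is already installed.
r_version <- paste(R.version$major, R.version$minor, sep = ".")
has_propclust <- requireNamespace("PropClust", quietly = TRUE)

message("R version: ", r_version)
if (has_propclust) {
  message("PropClust version: ", as.character(packageVersion("PropClust")))
} else {
  message("PropClust is not installed; see the installation instructions on this page.")
}
```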
Problems installing or using the package? Please see our list of frequently asked questions. Your problem and the solution may already be posted there.
Manual download and installation
Please follow these steps only if the automatic package installation above does not work.
Short installation instructions, including other required and recommended packages, are available here.
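For orientation, installing a manually downloaded source archive from within R typically looks like the sketch below. The file name is a placeholder, not an actual release name; substitute the archive you downloaded. Because the package contains Fortran code, installing from source presumably requires a working Fortran compiler on your system.

```r
# Hypothetical file name; replace it with the archive you actually downloaded.
pkg_file <- "PropClust_x.y.tar.gz"

if (file.exists(pkg_file)) {
  # repos = NULL tells R to install from the local file rather than from CRAN.
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("File not found: ", pkg_file)
}
```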
Should you discover bugs (of which there are most likely plenty), please report them to Peter Langfelder (peter.langfelder at gmail.com) and Steve Horvath.
Problems installing or using the package
Please see our list of Frequently Asked Questions (and frequently given answers);
the solution to your problem may already be posted there. In particular, you can find answers about spurious Mac
errors, compatibility problems when upgrading PropClust, and others.
If you find a bug in the newest version on CRAN, please see whether this web site has posted a newer
version where the bug may be fixed. If you still cannot solve the problem, email Peter
Langfelder and Steve Horvath.
Getting started with R and the PropClust package
The package described here is an add-on for the statistical language and environment R (free software).
Our tutorial, described below, contains step by
Old versions of R package PropClust
Older versions of the package presented on this page are available here.
Citing the PropClust package
If you use PropClust in published work, please cite it as follows:
The method, software and evaluations are described in
- Ranola JM, Langfelder P, Lange K, Horvath S. Cluster and propensity based approximation of a network. BMC Syst Biol. 2013 Mar 14;7(1):21. PMID: 23497424
(click here to access the article at the BMC web site)
The original code was written by John M Ranola, Peter Langfelder, Kenneth Lange, and Steve Horvath. Peter Langfelder is mainly in charge of maintaining and improving the package.