Tutorials on module preservation

Peter Langfelder1, Luo Rui1, Michael C. Oldham2, and Steve Horvath1,3


1 Dept. of Human Genetics, UC Los Angeles,
2 Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UC San Francisco,
3 Dept. of Biostatistics, UC Los Angeles

Peter (dot) Langfelder (at) gmail (dot) com, SHorvath (at) mednet (dot) ucla (dot) edu

About this page

This page provides a set of tutorials illustrating module preservation studies on several applications. Before going through the tutorials, please make sure you have installed (the newest version of) the WGCNA package.
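Whether the package is available in the current R library can be checked with a few lines of base R. This is a minimal sketch: the variable name hasWGCNA is hypothetical, and the installation route mentioned in the message (BiocManager) is one common option, not necessarily the one the authors recommend.

```r
# Check for WGCNA without installing or loading anything
hasWGCNA <- requireNamespace("WGCNA", quietly = TRUE)

if (hasWGCNA) {
  # Report the installed version so it can be compared with the newest release
  message("WGCNA version installed: ", as.character(utils::packageVersion("WGCNA")))
} else {
  # Assumption: BiocManager is used so that Bioconductor dependencies resolve
  message("WGCNA not found; one option is BiocManager::install(\"WGCNA\")")
}
```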

The tutorials on this page were last updated December 20, 2011. This changelog provides a summary of the updates.


Overview of network terminology

In addition to the tutorials, we provide a short overview table of network terminology.

I. Mini tutorial: preservation of female modules in male samples of F2 mice

Data description and download

In this tutorial we take the gene co-expression network constructed from expression data in female mouse livers (Ghazalpour et al, 2006), and study the preservation of its modules in the corresponding male data. For a detailed description of the data and the biological implications we refer the reader to Ghazalpour et al (2006), Integrating Genetics and Network Analysis to Characterize Genes Related to Mouse Weight (link to paper; link to additional information). We note that the data set contains 3421 measured expression profiles. These were filtered from the original set of more than 20,000 profiles by keeping only the most varying and most connected probes. Please download the following

and unzip them in a folder of your choice, preferably a new folder created specifically for this tutorial. Note the name of the folder; when you start an R session, the first command should be to change the R working directory into this folder.
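Changing the working directory at the start of a session might look as follows. This is a sketch only: the variable workingDir and its value "." are placeholders, to be replaced by the folder that holds the unzipped data.

```r
# Placeholder: replace "." with the path of the folder holding the unzipped data
workingDir <- "."
setwd(workingDir)

# Confirm where R is now looking for files, and list what is there
currentDir <- getwd()
print(currentDir)
print(list.files())
```

The same step applies to the other tutorials on this page, each with its own folder.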

R Tutorial

The tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the results.


II. Preservation of Cholesterol Biosynthesis Process module among 8 tissue/gender combinations in F2 mice

Data description and download

In this tutorial we analyze preservation of a module defined by the GO term "Cholesterol Biosynthesis Process" among 8 tissue/gender combinations in an F2 mouse cross described in Ghazalpour et al (2006). In this application we show that modules need not correspond to clusters. In such cases network preservation statistics, in particular the connectivity preservation statistics, provide useful information for comparing networks defined by external information across different data sets.
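To give a flavor of what a connectivity preservation statistic measures, the sketch below correlates within-module connectivity between a reference and a test data set. It uses base R and simulated data only. It is a simplified illustration, not the WGCNA implementation: the helper names simulateModule and moduleAdjacency, the simulation design, and the soft-thresholding power of 6 are all assumptions made for this example.

```r
set.seed(1)
nSamples <- 50
nGenes <- 20

# Hypothetical helper: genes share one latent signal with gene-specific loadings
simulateModule <- function(nSamples, loadings) {
  signal <- rnorm(nSamples)
  sapply(loadings, function(w) w * signal + sqrt(1 - w^2) * rnorm(nSamples))
}

# The same loadings in both data sets mimic a module with preserved structure
loadings <- seq(0.3, 0.9, length.out = nGenes)
refExpr  <- simulateModule(nSamples, loadings)   # samples x genes
testExpr <- simulateModule(nSamples, loadings)

# Hypothetical helper: unsigned adjacency |cor|^power (power = 6 is an assumption)
moduleAdjacency <- function(expr, power = 6) {
  a <- abs(cor(expr))^power
  diag(a) <- 0
  a
}

# Within-module connectivity of each gene in each data set
kRef  <- rowSums(moduleAdjacency(refExpr))
kTest <- rowSums(moduleAdjacency(testExpr))

# Correlation of connectivities: values near 1 suggest that hub genes in the
# reference data remain hubs in the test data
corKIM <- cor(kRef, kTest)
print(corKIM)
```

Because the statistic only needs a list of module genes, it applies equally to a module defined by a GO term rather than by clustering.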

Please download the following

Save and unzip the data in a folder of your choice, preferably a new folder created specifically for this tutorial. Note the name of the folder; when you start an R session, the first command should be to change the R working directory into this folder.

R Tutorial

The tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the analysis results and selected figures.


III. Preservation of human brain co-expression modules in chimpanzee data and vice-versa

Data description and download

In this tutorial we analyze preservation of human brain co-expression modules in chimpanzee data and vice-versa. The data were originally analyzed in Oldham et al (2006), who constructed modules in the human expression data and studied their preservation using module eigengene-based connectivity. Here we strengthen the analysis by using a full compendium of preservation statistics. For a detailed description of the data and the biological implications we refer the reader to Oldham et al (2006), Conservation and evolution of gene co-expression networks in human and chimpanzee brain. PNAS (link to paper; link to additional information).

Please download the following

and unzip them in a folder of your choice, preferably a new folder created specifically for this tutorial. Note the name of the folder; when you start an R session, the first command should be to change the R working directory into this folder.

R Tutorial

The tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the results.


IV. Preservation of KEGG signaling pathways between human and chimpanzee data

In this tutorial we analyze preservation of modules defined by pathway membership in KEGG signaling pathways. We study 8 pathways and show that the pathways exhibit varying degrees of preservation. The human and chimpanzee expression data are the same as those used in Tutorial III above. We recommend creating a separate folder for this analysis and either copying or re-downloading the data to the folder specific to this analysis.

Please download the following

Note the name of the folder in which you saved the data and the custom R function above; when you start an R session, the first command should be to change the R working directory into this folder.

R Tutorial

The tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the analysis results and selected figures.


V. F2 Mouse liver expression data: preservation of female modules in male samples

This tutorial is a somewhat expanded version of Tutorial I. We study the reference and test networks in more detail, but the main part and the results are the same. To run this tutorial, please download and extract the data for Tutorial I as described above.

The tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the results.


VI. Simulation studies

Here we provide the code and results of our simulation studies. We start with a study of weak module preservation, then we provide three studies of preservation of modules of varying sizes in three different situations where some modules are preserved and some are non-preserved in various ways. For completeness, in each study we also evaluate the in-group proportion implemented in the package clusterRepro by Kapp and Tibshirani (2007). Each analysis is presented in a self-contained document.

Simulation study of weak module preservation

The tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the results.

Simulation studies of various module preservation situations with modules of varying sizes

The first tutorial is written as a single PDF document that contains sections of code and explanations of what the code does. The code can be copy-pasted into an R session to re-create the results. The rest of the tutorials are provided as plain-text code with some annotation. The code is organized the same way as in the annotated tutorial.


VII. Illustration of preservation study of protein-protein interaction networks

This tutorial illustrates the use of network module preservation statistics for general networks that do not arise as correlation networks. For this tutorial we use two simulated protein-protein interaction (PPI) networks in which we simulate 10 complexes (groups of densely interconnected proteins), 5 of which are preserved in the test data set, and the other 5 are not preserved, analogously to the Simulation 2 scenario above. The simulated networks can be downloaded here.
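The general idea of such a simulation can be sketched in base R as follows. This is not the simulation code from the tutorial: the helper names simulatePPI and moduleDensity, the complex size, the edge probabilities, and the choice to model a non-preserved complex as one whose internal edges fall to the background rate are all assumptions made for illustration.

```r
set.seed(2)
nComplexes <- 10
complexSize <- 20   # assumption: all complexes have equal size
complexLabel <- rep(seq_len(nComplexes), each = complexSize)

# Hypothetical helper: random binary network in which proteins of the listed
# complexes are densely interconnected (probability pIn); all other pairs are
# connected with background probability pOut
simulatePPI <- function(labels, denseComplexes, pIn = 0.6, pOut = 0.02) {
  n <- length(labels)
  isDense <- labels %in% denseComplexes
  dense <- outer(labels, labels, "==") & outer(isDense, isDense, "&")
  prob <- ifelse(dense, pIn, pOut)
  adj <- matrix(0, n, n)
  upper <- upper.tri(adj)
  adj[upper] <- rbinom(sum(upper), 1, prob[upper])
  adj + t(adj)   # symmetric adjacency with zero diagonal
}

refNet  <- simulatePPI(complexLabel, denseComplexes = 1:10)  # all 10 complexes dense
testNet <- simulatePPI(complexLabel, denseComplexes = 1:5)   # only complexes 1-5 remain dense

# Hypothetical helper: fraction of possible within-complex edges that are present
moduleDensity <- function(adj, labels) {
  sapply(sort(unique(labels)), function(m) {
    sub <- adj[labels == m, labels == m]
    mean(sub[upper.tri(sub)])
  })
}

refDensity  <- moduleDensity(refNet, complexLabel)
testDensity <- moduleDensity(testNet, complexLabel)
print(round(rbind(reference = refDensity, test = testDensity), 2))
```

In this sketch, complexes that keep a high density in the test network play the role of preserved modules, while the remaining ones do not.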

Alternatively, the reader can execute the code provided in the tutorial below to re-create the simulated networks. The tutorial itself is written as a PDF file containing annotated code that can be copied into an R session.


VIII. Meta-analysis of gene expression data sets

Jeremy Miller's meta-analysis tutorial illustrates the meta-analysis of multiple data sets, including the use of the functions collapseRows and userListEnrichment as well as interfacing with the VisANT software. Click here to visit his page.