
Dr Guiyuan Lei
University of Newcastle
Newcastle upon Tyne, United Kingdom

Calibayes CISBAN


iNotes

System Biology

Computer Tools

Misc

The use of R packages to analyse data

I am currently working on identifying differential expression and inferring networks from microarray data using R packages in Bioconductor. Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data. We use several R packages: affy, affyPLM, smida, limma, timecourse and GeneNet. These packages have been integrated into Bioconductor. Please refer to Bioconductor for how to install Bioconductor and these packages. A detailed note is available: The use of R packages to analyse data.

Cytoscape

Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data. A detailed note is available: The use of Cytoscape.

Metropolis-Hastings for Decomposable Gaussian Graphs

Mike West has a program for Metropolis-Hastings search over decomposable Gaussian graphs (MH-d). I have tried it and so far found two problems. First, the graph matrix is never initialized, so some entries of current_graph hold values other than 0 or 1. To fix this, initialize current_graph in the function read_starting_point() in copy.C. Second, the value of the log-posterior is NaN (not a number); I have not found the reason. I also modified Metropolis.C to record the edges of the graph instead of the full graph matrix. Because biological networks are sparse, printing the whole matrix wastes disk space and makes it hard to see where the edges are.
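The two changes described above can be illustrated with a small, self-contained C++ sketch. It is not the MH-d code itself: the `Graph` type and the helper names `make_empty_graph`, `add_edge` and `edges_to_string` are hypothetical stand-ins, assuming only that the graph is stored as a symmetric 0/1 adjacency matrix. It shows explicit zero-initialization of the matrix and output of one line per edge instead of the full matrix.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the program's graph storage:
// a p x p symmetric adjacency matrix with entries 0 or 1.
using Graph = std::vector<std::vector<int>>;

// Build a p x p graph with every entry explicitly set to 0 (no edges),
// so no entry is ever left holding an indeterminate value.
Graph make_empty_graph(std::size_t p) {
    return Graph(p, std::vector<int>(p, 0));
}

// Add the undirected edge i-j by setting both symmetric entries.
void add_edge(Graph& g, std::size_t i, std::size_t j) {
    g[i][j] = 1;
    g[j][i] = 1;
}

// Return one "i j" line per edge, scanning the upper triangle only.
// For a sparse graph this is far shorter than printing all p*p entries.
std::string edges_to_string(const Graph& g) {
    std::ostringstream out;
    for (std::size_t i = 0; i < g.size(); ++i) {
        for (std::size_t j = i + 1; j < g.size(); ++j) {
            if (g[i][j] == 1) {
                out << i << ' ' << j << '\n';
            }
        }
    }
    return out.str();
}
```

For a graph on p genes with only a few edges, the edge list has one line per edge, whereas the matrix form always has p*p entries regardless of how sparse the network is.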

This is a serial program, so I select a subset of genes from the full 54390 genes (for the hgu133plus2 Affymetrix microarray). For the whole procedure of constructing a decomposable Gaussian graph from microarray data, please see Gaussian Graphical Models for microarray data.

HdBCS

HdBCS (Bayesian Covariance Selection in High-Dimensions) was designed by Adrian Dobra (adobra@stat.duke.edu). It is a package for performing covariance selection on datasets with tens or possibly hundreds of thousands of variables. The code can be downloaded from HdBCS. I wrote notes on how to configure ssh to work properly with mpich, how to run mpich on a single machine, how to compile mpich, one bug in STEP 1 of the software, and other issues in STEP 2 and 3...

See more: Notes4HdBCS.pdf (Created: 27 Feb, 2006). HdBCS can also be compiled with lammpi; see lamboot with specific nodes and Compile C++ program which using fortran package in lammpi.

When I tested HdBCS on 54390 genes (for the hgu133plus2 Affymetrix microarray), it exited with the error message "signal 8 SIGFPE floating point exception" in step 2. The gene it was processing when it exited differed from run to run, and it never finished successfully. I then tried a small list of genes (953 genes); after 12 hours of running I got nothing (an empty network), maybe because my sample is small. In any case, I have had to give up on this program. The authors have released a new version of HdBCS, and I hope to have time to try it in the future. For now I am moving on to algorithms for decomposable Gaussian graphs. (4 June, 2006)

Related notes: Configure ssh for running MPICH.

Bioconductor

Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data. These notes describe how to install Bioconductor, how to pre-process Affymetrix microarray data, and resources on Bioconductor ...

See more: Notes4BioConductor.pdf (Created: 14 March, 2006)

 

Last modified:
8 January, 2008