Grand Prismatic Spring Lab

Thursday, December 04, 2008

Installing Ubuntu on HP Pavilion dv4 1114nr

I got a new HP Pavilion dv4 1114nr this holiday season. It came with Windows Vista Home Edition pre-installed. Here is a log of how I installed Ubuntu on this laptop.

1. Create Windows Vista recovery disk
Boot into Windows Vista. Since HP no longer provides recovery disks with new laptops, you need to create your own in case you need Windows Vista in the future. Go to Start -> Recovery Disk Creation and follow the instructions.

2. Repartition the hard drive
Windows Vista comes with hard drive resizing and repartitioning utilities. That's cool! It saves us the trouble of searching for third-party software.
Follow the instructions in the following documents:
1. Screenshot Tour: Repartition your hard drive in Windows Vista
2. Can I repartition my hard disk?

3. Download
Don't bother to download the Ubuntu installation ISO and burn your own installation CD. If you have internet access (a fairly weak condition, isn't it?), you can use UNetbootin (http://en.wikipedia.org/wiki/UNetbootin).

I am not exactly sure, but there seems to be a bug in UNetbootin.
I partitioned my hard drive into three partitions: C: Windows system partition; D: HP recovery partition; F: unformatted free partition, intended for the Linux installation.

But when I select Hard Drive as the mode, only the C: partition is displayed; I have to select USB Live mode and choose the F: partition there. I am not sure what this implies and am still waiting for the result.

5. Sound issues
After the installation, the speaker and the microphone do not work. In particular, I could not use Skype :-(.

Solution to the "no sound" problem
Open the file
sudo vi /etc/modprobe.d/alsa-base
and add the following line to the end of it

options snd-hda-intel model=laptop enable_msi=1

Solution to the microphone problem:
The mic is possibly muted.
Open Volume Control by double-clicking the icon in the top-right corner. Select Preferences and choose the devices for recording and playback. Then uncheck the mute option.

Solution to the Skype "Audio playback" problem
Execute the following commands in a terminal

killall pulseaudio
sudo apt-get remove pulseaudio # this seems not necessary
sudo apt-get install esound
sudo rm /etc/X11/Xsession.d/70pulseaudio

refer to http://www.econowics.com/news-from-the-net/170/skype-problem-with-audio-playback-ubuntu-810-intrepid-ibex/

refer to
https://bugs.launchpad.net/ubuntu/+bug/269586
https://help.ubuntu.com/community/HdaIntelSoundHowto

6. install skype
7. install songbird
8. install Java Runtime Environment
9. install Open Office 3.0
10. install Mac4lin
11. install VLC and other codecs
12. install sopcast and gsopcast (online TV channel)
13. install fcitx Chinese input
First remove the default scim framework and install fcitx

sudo apt-get autoremove scim
sudo apt-get install fcitx

Next, modify Xsession to start fcitx automatically for all users. Open

sudo gedit /etc/X11/Xsession.d/95xinput

and change it to

export XMODIFIERS=@im=fcitx
export XIM=fcitx
export XIM_PROGRAM=fcitx
export GTK_IM_MODULE=fcitx
export QT_IM_MODULE=XIM
fcitx
Then open

sudo vim /usr/lib/gtk-2.0/2.10.0/immodule-files.d/libgtk2.0-0.immodules

Change the line about xim to

"xim" "X Input Method" "gtk20" "/usr/share/locale" "en:ko:ja:th:zh"

======
Well, I have come back to update this post. I just returned this HP laptop. This was the first time I bought a laptop from HP, and unfortunately it was a disappointing experience. I have two issues to complain about. The CPU fan is too noisy. Even after I disabled the "Keep fan running" feature in the BIOS, the fan still makes too much noise. The CD-ROM drive is not quiet either; it feels like an earthquake when the CD drive is working.

The recovery tool is also annoying. I could not restore my laptop to its factory configuration, either via the hard drive recovery tool or via the recovery CDs. It failed with "error 1002", and HP customer service could not provide any useful help (they outsource customer service to India, and as a result we have to adapt to Indian English).

Anyway, I will blacklist this model from HP: HP Pavilion dv4.

Reference:
1. Screenshot Tour: Repartition your hard drive in Windows Vista
2. Can I repartition my hard disk?
3. Unetbootin http://unetbootin.sourceforge.net/
4. Tutorial: Ubuntu Linux on HP Pavilion
http://aldeby.org/blog/index.php/howto-ubuntu-linux-on-hp-pavilion-dv2000-dv6000-dv9000-series-laptops
5. http://www.dailygyan.com/2008/11/10-things-you-should-do-immediately.html
6. Top 10 Ubuntu downloads http://lifehacker.com/5227309/top-10-ubuntu-downloads
7. http://theindexer.wordpress.com/2009/04/24/to-do-list-after-installing-ubuntu-904-aka-jaunty-jackalope/
8. Install Microsoft YaHei font http://hi.baidu.com/zzy011/blog/item/6651e3ed44a9c62f63d09f37.html

Saturday, November 08, 2008

<R>andom Notes

1. How to estimate the running time of an R function?

R has a function proc.time() http://rweb.stat.umn.edu/R/library/base/html/proc.time.html
sample code

## a way to time an R expression: system.time is preferred
> ptm <- proc.time()
> for (i in 1:50) mad(stats::runif(500))
> proc.time() - ptm
user system elapsed 
0.039 0.001 0.052 

2. string manipulation in R

define a string
> s = "some characters"

convert other type into a string
> s = as.character(some_variable_in_other_type)

Convert a string into numbers
> pi = as.numeric("3.14159")

string length
> nchar(s)

string concatenation
> s1 = "string1"
> s2 = "string2"
> paste(s1, s2, sep = "")

given a vector of strings, vs, return a single string that is the concatenation of the elements of vs
> vs = c("song", "qiang")
> paste(vs, collapse = "")
[1] "songqiang"

string slicing
Suppose s is a string. How do we slice a substring of s given a starting position and an ending position?
We use the following function. There is no default value for stop. If the value of stop is larger than the total
length of the string, it is truncated to the length of the string
> substr(s, start = 1, stop = 12)

string split

> strsplit("song qiang", split=" ")
[[1]]
[1] "song"  "qiang"

3. When making figures with a legend box, the text spills out of the legend box when we use dev.copy2eps() to convert the figure to an EPS file

This problem comes from the different specification of font sizes on different devices. An ugly way to solve it is to pass text.width=strwidth("some string") to legend(),
where "some string" refers to the longest legend text plus some extra characters. The optimal number of extra characters has to be determined by trial and error.

4. How to handle exceptions in R?
Read about the two functions try and tryCatch (R FAQ 7.32). An example with try is shown below:

for (i in 1:16)
{
    result <- try(nonlinear_modeling(i))
    # skip this iteration if the call raised an error
    if (inherits(result, "try-error")) next
}

GNU/Linux Notes

1. How to speed up my Linux booting?
See Bootchart http://www.bootchart.org/index.html
and remove unnecessary services in the booting process

2. One important thing to remember when creating a SVN repository
In Subversion 1.1, a repository is created with a Berkeley
DB back-end by default. This behavior may change in future
releases. Regardless, the type can be explicitly chosen with
the --fs-type argument:

$ svnadmin create --fs-type fsfs /path/to/repos
$ svnadmin create --fs-type bdb /path/to/other/repos

Do not create a Berkeley DB repository on a network
share—it cannot exist on a remote
filesystem such as NFS, AFS, or Windows SMB. Berkeley DB
requires that the underlying filesystem implement strict POSIX
locking semantics, and more importantly, the ability to map
files directly into process memory. Almost no network
filesystems provide these features. If you attempt to use
Berkeley DB on a network share, the results are
unpredictable—you may see mysterious errors right away,
or it may be months before you discover that your repository
database is subtly corrupted.
If you need multiple computers to access the repository,
you create an FSFS repository on the network share, not a
Berkeley DB repository. Or better yet, set up a real server
process (such as Apache or svnserve), store
the repository on a local filesystem which the server can
access, and make the repository available over a network.
Chapter 6, Server Configuration covers this process in
detail.

3. Count the number of files in a directory and its subdirectories

total number of files
find some_directory -type f | wc -l

list number of files in each directory in detail

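#!/usr/bin/env python

import os
import sys

def count(p):
    """Print and return the number of files under path p."""
    if not os.path.isdir(p):
        print("%s\t%d" % (p, 1))
        return 1

    s = 0
    for d in os.listdir(p):
        child = os.path.join(p, d)  # listdir returns bare names, not full paths
        if os.path.isdir(child):
            s += count(child)
        else:
            s += 1

    print("%s\t%d" % (p, s))
    return s

p = sys.argv[1]
count(p)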

4. Ubuntu DNS Server Problem
Problem Description: I run Ubuntu 9.04 on my computer and use Wicd (Wired and Wireless Network Manager) to configure network settings. However, sometimes when I use a wireless network, Wicd is able to connect to the router (it is pingable) but fails to resolve domain names. Something is wrong with the DNS server settings.

Tentative Solution: 1) First disable all DNS-related settings inside Wicd, i.e. use neither a static nor a global DNS server; 2) edit /etc/resolv.conf and add available DNS servers; 3) restart the computer. 4) [Optional] Sometimes, if we configure Wicd to connect automatically and use a static DNS server, Wicd freezes while setting the static server. In this case, we can edit /etc/wireless-settings.conf to disable automatic connection and the static DNS server.

5. How to rename files or directories to remove white space from their names?

for i in *" "*; do
    [ -e "$i" ] || continue   # nothing matched the glob
    mv -- "$i" "$(echo "$i" | sed 's/ /-/g')"
done
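
The same renaming can be sketched in Python, which avoids shell quoting problems altogether. This is only a minimal sketch: the function name replace_spaces and the rule of skipping a rename when the target already exists are my own assumptions, not part of the note above.

```python
import os
import tempfile

def replace_spaces(directory, replacement="-"):
    """Rename every entry in `directory` whose name contains a space.

    Returns a list of (old_name, new_name) pairs for the entries renamed.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        if " " not in name:
            continue
        new_name = name.replace(" ", replacement)
        src = os.path.join(directory, name)
        dst = os.path.join(directory, new_name)
        if os.path.exists(dst):
            # Assumption: skip rather than overwrite an existing entry.
            continue
        os.rename(src, dst)
        renamed.append((name, new_name))
    return renamed

if __name__ == "__main__":
    # Demonstrate on a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "my notes.txt"), "w").close()
        print(replace_spaces(d))
```

Like the shell loop, it only looks at one directory level and does not descend into subdirectories.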

6. How to back up files (or directories) with tar and 7-zip?
First we create tarballs with the tar utility and then compress them with the 7z program. If the content is sensitive, you can encrypt it with the built-in encryption option of 7z or with GPG. The code is as follows:

for i in *; do
     tar cvf "$i.tar" "$i" && \
     7z a "$i.tar.7z" "$i.tar"
     # Optionally remove the originals after checking the archive:
     # rm -rf "$i" "$i.tar"
done

7. How do I output only the part of a line that matches a regex pattern?
Use grep -o PATTERN.
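
For comparison, roughly the same behaviour can be sketched in Python with the re module. The helper name matching_parts and the sample lines are made up for illustration, and Python's regex syntax differs in places from grep's, so patterns may need adjusting.

```python
import re

def matching_parts(pattern, lines):
    """Yield only the matched substrings of each line, like `grep -o PATTERN`."""
    regex = re.compile(pattern)
    for line in lines:
        for match in regex.finditer(line):
            if match.group(0):  # ignore empty matches
                yield match.group(0)

# Hypothetical input lines, used only to show the idea.
text = ["eth0 192.168.1.10 up", "lo 127.0.0.1 up", "no address here"]
for part in matching_parts(r"\d+\.\d+\.\d+\.\d+", text):
    print(part)
```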

Wednesday, May 07, 2008

Connecting USC VPN Network in Ubuntu

[Update 2013-02-12]
Surprisingly, this old post still receives visitors occasionally. Right now, if you just want to browse the internet and download some papers, you may try the web VPN service: sslvpn1.usc.edu.

[Original Post:]
At USC, when you use computers on campus, you can directly use electronic resources, databases and electronic journals because you are inside the USC private network. Now suppose that you go back to your apartment off campus or travel away from USC. How can you get access to the electronic resources that USC pays for? That's where VPN comes into play. VPN, also called IP tunneling, is a secure method of accessing computer resources in a private network. VPN stands for "virtual private network". Generally speaking, USC runs a VPN server which listens for your connection and access requests. You need to run a VPN client on your own computer, which connects to the server and gives you access to USC resources as if you were inside the USC private network.

However, ITS only provides official support for VPN clients on Windows (link) and Mac OS (link). Here we give a VPN solution for Linux users (taking Ubuntu 8.04 as an example).

1. Install the Network Manager Applet through Add/Remove in the Ubuntu menu. (Most of the time this applet is installed by default; if so, just skip to step 2);

2. Install the VPN plug-in network-manager-vpnc. Open Synaptic Package Manager, search for package network-manager-vpnc and install;

3. Left click the network manager applet (usually in the top right corner of your screen) and select VPN Connections->Configure VPN->Add. Type a name in the Connection Name box, USC VPN for example; in the Gateway field, type vpn3k.usc.edu; in the Group Name field, type USC. Click the Optional tab, select Override user name, and type your USC account (the same as your USC email) in the textbox below. Click Apply. Close the window titled VPN Connections.

4. Left click the network manager applet, select VPN Connections, then click on the USC connection (USC VPN) to connect. In the upper password box, type the password associated with your USC account; in the Group password box below, type GoTrojan. OK, we are done!

This tutorial is based on Ubuntu. I think you can configure the VPN client in Debian, Fedora, OpenSuse and other Linux distributions in a similar way.

References:
1.VPN Client on Ubuntu https://help.ubuntu.com/community/VPNClient
2. Configuring the Cisco VPN 3000 Client (Windows 2000/XP/Vista) http://www.usc.edu/its/vpn/vpn3k47win.html#help

Saturday, May 03, 2008

Fixing Resolution Problem of Ubuntu On Parallels Desktop

Problem

After installing Ubuntu 8.04 Hardy Heron in Parallels Desktop on my MacBook Pro, the default resolution is 1024*768. I want to use my MacBook Pro's full 1440*900 resolution. I tried System->Preferences->Screen Resolution, but 1440*900 is not offered at all.

Solution

Basic idea: The problem arises because Ubuntu fails to detect the settings of my monitor automatically. So can I manually modify xorg.conf to set the right resolution? Let's go!

Open up a terminal. First back up the original xorg.conf

sudo cp /etc/X11/xorg.conf  /etc/X11/xorg.conf.backup

Next, open xorg.conf with your favorite editor

sudo vi /etc/X11/xorg.conf

Search for the section "Screen", which looks like the one below.

Section "Screen"
    Identifier "Default Screen"
    Device "Generic Video Card"
    Monitor "Generic Monitor"
EndSection

Probably your file contains more lines, similar to the following

SubSection "Display"
    Depth 24
    Modes "1024x768" "800x600" "640x480"
EndSubSection

Note the line Modes "1024x768" "800x600" "640x480". It lists three available resolutions, but our desired resolution, 1440x900, is missing. So we can simply add it. After the modification it looks like the following

SubSection "Display"
    Depth 24
    Modes "1440x900" "1024x768" "800x600" "640x480"
EndSubSection

The Modes line will appear several times throughout the file. Each time you see it, just add your desired resolution (in my case, 1440x900).

If your file doesn't contain a similar SubSection "Display" inside the Section "Screen" (as shown above), just add the SubSection "Display" yourself. The final result looks like

Section "Screen"
    Identifier "Default Screen"
    Device "Generic Video Card"
    Monitor "Generic Monitor"
    DefaultDepth 24
    SubSection "Display"
        Depth 1
        Modes "1440x900" "1024x768" "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 4
        Modes "1440x900" "1024x768" "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 8
        Modes "1440x900" "1024x768" "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 15
        Modes "1440x900" "1024x768" "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 16
        Modes "1440x900" "1024x768" "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 24
        Modes "1440x900" "1024x768" "800x600" "640x480"
    EndSubSection
EndSection

Finally, save the modifications. Restart your X session by pressing Ctrl-Alt-Backspace (or reboot your Ubuntu), and it just works!

If you end up with a total mess after this modification, don't panic, because you still have the backup of the original xorg.conf!
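
Since the Modes line may occur several times, the edit can also be scripted. The following is only a sketch under my own assumptions: the function name add_mode and the sample text are hypothetical, and it works on a string, so you would read xorg.conf into it and write the result back yourself (after making the backup).

```python
import re

def add_mode(conf_text, mode="1440x900"):
    """Put `mode` first on every Modes line that does not already list it."""
    out = []
    for line in conf_text.splitlines():
        # Match uncommented lines of the form:   Modes "1024x768" "800x600" ...
        m = re.match(r'^(\s*Modes\s+)(.*)$', line, re.IGNORECASE)
        if m and '"%s"' % mode not in m.group(2):
            line = '%s"%s" %s' % (m.group(1), mode, m.group(2))
        out.append(line)
    return "\n".join(out) + "\n"

# A made-up fragment in the style of the Screen section shown above.
sample = (
    'Section "Screen"\n'
    '    SubSection "Display"\n'
    '        Depth 24\n'
    '        Modes "1024x768" "800x600" "640x480"\n'
    '    EndSubSection\n'
    'EndSection\n'
)

print(add_mode(sample))
```

Running it twice changes nothing the second time, because a Modes line that already lists the mode is left alone.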

Reference
1. http://gonz.wordpress.com/2007/09/22/fixing-screen-resolution-on-ubuntu-linux-in-parallels-desktop/
2. http://www.simplehelp.net/2007/04/30/how-to-increase-the-screen-resolutions-available-to-ubuntu-while-running-in-parallels-for-os-x/

Thursday, April 17, 2008

Installing R in Suse SLED 10.1

Someone commented: "R is a nightmare in ANY distribution." I agree with him as far as SUSE Linux Enterprise Edition version 10.1 is concerned. I am using an x86_64 machine running SUSE SLED 10.1. I wrote in a letter about my bad experience installing R.

"At first I tried SUSE's install software application, but it can't find R in its repository. Then I downloaded the rpm package of R for SUSE 10.1 from CRAN mirrors, and tried install software application again. But it can not resolve dependencies. At last I ran rpm -i R-base-2.6.1-3.1.x86_64.rpm from terminal. It said it needs libgfortran.

OK, I downloaded and installed the libgfortran package, and then tried rpm -i R-base-2.6.1-3.1.x86_64.rpm again. This time it needed base. I downloaded the base package and tried to install it by running rpm -i base-1.3.6-1mdv2008.0.noarch.rpm (I am not sure this is the right package for my computer), but it needs apache-mod_php, apache-mod_ssl, php-mysql, etc. I have to give up."

It also seems difficult to compile R from source, because the package dependencies are so complex.

Finally I figured out a trick: we can run the Windows version of R on Linux with Wine. First go to http://www.winehq.org/, then download and install Wine. Then grab a Windows installer of R and install it. It works, for both basic computation and graphics display. But there is a little problem, as shown in the following graph: the cursor overlaps with the text.

ps: Another by-product is that I can play starcraft on that Linux machine with a big display.

Reference:
1. Stuff I've learned about Wine

Saturday, April 05, 2008

Anti RSI Software

RSI stands for Repetitive Strain Injury. It results from repetitive motion of the hands and wrists and from long periods of incorrect posture that keep specific muscles tense all the time. It is commonly seen in people who use computers a lot.

It helps to use a more ergonomic mouse and keyboard, comfortable chairs and desks, and a pleasant work space; however, we easily forget how long we have been at the computer when we are entirely concentrating on work. Anti-RSI software can remind us to take regular breaks and micropauses.

Yun Fang recommended the Workpace software to me yesterday. It offers a 30-day trial version, but charges a fee after that period. I found two free alternatives: Workrave for Windows and Linux, and AntiRSI for Mac OS.

http://www.workrave.org/welcome/

http://tech.inhelsinki.nl/antirsi/

Reference:
Alleviate RSI the Hacker Way

Saturday, March 01, 2008

Synchronization Between Linux, Mac & Windows

Sooner or later after you have more than one computer, you will face the problem of synchronizing files between them. Of course the easiest solution is to sell n-1 computers, but that is not reasonable, since we must have had a good reason to own n computers in the first place.

Let me do the research.

My situation is as follows:
I run SUSE Linux on the desktop in my lab, which is supposed to be running all the time; I also have a desktop at home running Windows XP, and finally my MacBook laptop. My first priority is to sync between Mac and Linux, for I use them heavily; the second is between Mac and Windows. Also since I use computer

References

Sync folders between a Mac and PC?

How to mount a Windows shared folder on your Mac

Geek to Live: Mirror files across systems with rsync

http://lifehacker.com/software/mac-os-x/how-to-access-a-macs-files-on-your-pc-247541.php

How to set up a home FTP server

http://ceitl.zanestate.edu/blog/archives/2005/10/synchronizing-files-across-computers-and-platforms/

Geek to Live: Automatically back up your hard drive

http://everythinglinux.org/rsync/

Passwordless SSH Login
http://www.hackinglinuxexposed.com/articles/20021226.html

http://linuxmafia.com/%7Erick/linux-info/filesync.html

Monday, January 14, 2008

Synthetic Biology

Drew Endy Foundations for engineering biology

the four challenges that greatly limit engineering biology today are 1)biological complexity; 2)the tedious and unreliable construction and characterization of synthetic biological systems; 3)the spontaneous variation of biological systems; and 4) evolution.

Lessons from the past
standardization
Registry of Standard Biological Parts: http://parts.mit.edu/registry/index.php/Main_Page

decoupling;
separation of design and implementation
for example, one group designs useful DNA sequences and another group synthesizes the DNA chemically. (it seems possible now)

abstraction

design of reproducing machines
reliable computing with unreliable components
error detection & correction mechanism
self-replicating automata

additional reading:
Elowitz, M. B. & Leibler, S. A synthetic oscillatory network of transcriptional
regulators. Nature 403, 335–338 (2000)

Sprinzak, D. & Elowitz, M. B. Reconstruction of genetic circuits. Nature
doi:10.1038/nature04335

Sunday, November 25, 2007

How to read and write NTFS partition in Mac OS X 10.5 Leopard

Boot Camp in Mac OS X 10.5 Leopard enables users to easily install Windows XP alongside Mac OS X on an Intel-based Apple computer. However, Mac OS X 10.5 can natively read, but not write, NTFS partitions, which are commonly used by Windows XP. To both read and write the Windows partition from Mac OS, we could format it as FAT. Here is another approach that bypasses this limitation.

1. Download and install MacFUSE for Mac OS X 10.5
2. Download and install NTFS-3g for Mac OS
3. Restart! If you are lucky you can try read and write your Windows' NTFS partition now.

This approach works on my Mac OS X 10.5 + 10.5.1 update and Windows XP SP2 with an NTFS partition on a MacBook Pro (Model Identifier MacBookPro3,1).

Before you decide to proceed, google "read write NTFS Mac OS X leopard" to be informed of newest advancement and check for latest versions of MacFUSE and NTFS-3g.

Reference
1. MacFUSE http://code.google.com/p/macfuse/
2. NTFS-3g http://www.ntfs-3g.org/
3. NTFS-3g for Mac OS http://macntfs-3g.blogspot.com/
4. Filesystem in Userspace http://en.wikipedia.org/wiki/Filesystem_in_Userspace
5. Filesystem in Userspace http://fuse.sourceforge.net/wiki/index.php/FileSystems
6. NTFS on your Mac http://www.tuaw.com/2007/11/19/ntfs-on-your-mac-two-ways/

Monday, March 12, 2007

Modeling Biomedical Networks

steady state vs. equilibrium
If all variables (the concentrations of the species) are constant in time, i.e. their rates of change are zero, the system is at a steady state. If, additionally, all reaction fluxes are zero, we have an equilibrium.

Calculating steady state
There are several numerical methods for calculating a steady state, such as the improved Newton method, forward integration and backward integration. However, none of them is guaranteed to find a steady state in complex systems, which may have several steady states.
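
To make two of these methods concrete, here is a minimal Python sketch for a one-species toy system. Everything in it is my own assumption rather than something taken from the COPASI tutorial: a constant influx v0 feeding a species S that is consumed with Michaelis-Menten kinetics, plus plain textbook versions of Newton's method and forward (Euler) integration. With the parameter values below the steady state is S = v0*Km/(Vmax - v0) = 0.5, and the flux through the system there equals v0 = 1, so it is a steady state but not an equilibrium.

```python
def rate(s, v0=1.0, vmax=2.0, km=0.5):
    """dS/dt for constant influx v0 and Michaelis-Menten consumption (toy model)."""
    return v0 - vmax * s / (km + s)

def newton_steady_state(f, s0, tol=1e-10, max_iter=50, h=1e-6):
    """Solve f(s) = 0 by Newton's method with a finite-difference slope."""
    s = s0
    for _ in range(max_iter):
        fs = f(s)
        if abs(fs) < tol:
            return s
        slope = (f(s + h) - f(s - h)) / (2 * h)
        s -= fs / slope
    raise RuntimeError("Newton iteration did not converge")

def integrate_to_steady_state(f, s0, dt=0.01, tol=1e-9, max_steps=1000000):
    """Forward (Euler) integration until dS/dt is numerically zero."""
    s = s0
    for _ in range(max_steps):
        ds = f(s)
        if abs(ds) < tol:
            return s
        s += dt * ds
    raise RuntimeError("no steady state reached")

s_newton = newton_steady_state(rate, s0=1.0)
s_forward = integrate_to_steady_state(rate, s0=1.0)
print("Newton:", s_newton)
print("Forward integration:", s_forward)
```

Newton's method needs only a handful of iterations here but depends on a reasonable starting guess, while forward integration is slow but follows the actual dynamics. With several steady states, each method returns whichever one its starting point leads to.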

Metabolic Control Analysis
MCA describes how the system reacts to changes in parameters. Elasticities describe how the reaction rates depend on the metabolite concentrations. Control coefficients describe how the system's behavior depends on the reaction rates.
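
As a small illustration, flux control coefficients can be estimated numerically as logarithmic sensitivities. The two-step pathway below (X0 -> S -> X1, with a reversible first step and an irreversible second step), its rate laws and all parameter values are assumptions I made up for this sketch; e1 and e2 stand for the enzyme activities that scale the two reaction rates.

```python
import math

def steady_state_flux(e1, e2, k1=1.0, k2=2.0, x0=10.0, q=5.0):
    """Steady-state flux J of the toy pathway X0 -> S -> X1.

    v1 = e1*k1*(x0 - s/q), v2 = e2*k2*s; setting v1 = v2 gives s analytically.
    """
    s = e1 * k1 * x0 / (e1 * k1 / q + e2 * k2)
    return e2 * k2 * s

def log_sensitivity(f, x, rel=1e-6):
    """Estimate d ln f / d ln x at x by a central finite difference."""
    up, down = x * (1 + rel), x * (1 - rel)
    return (math.log(f(up)) - math.log(f(down))) / (math.log(up) - math.log(down))

# Flux control coefficients of the two steps at e1 = e2 = 1.
c1 = log_sensitivity(lambda e: steady_state_flux(e, 1.0), 1.0)
c2 = log_sensitivity(lambda e: steady_state_flux(1.0, e), 1.0)

# Elasticity of v2 with respect to S; v2 is linear in S, so it should be 1.
eps_v2 = log_sensitivity(lambda s: 2.0 * s, 3.0)

print("C1 =", c1, "C2 =", c2, "sum =", c1 + c2)
print("elasticity of v2 w.r.t. S =", eps_v2)
```

In this toy pathway the two flux control coefficients add up to one, as the summation theorem of MCA states, and most of the control lies with the first step.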

References:
http://projects.eml.org/downloads/copasi/CopasiTutorial.pdf

Wednesday, March 07, 2007

Install Matlab R2006b

I decided to reinstall MATLAB R2006b, mostly because of a new toolbox, SimBiology

SimBiology extends MATLAB with tools for modeling, simulating, and analyzing biochemical pathways. You can create your own block diagram model using predefined blocks. You can manually enter in species, parameters, reactions, rules, kinetic laws, and units, or read in Systems Biology Mark-Up Language (SBML) models. SimBiology lets you simulate a model using stochastic or deterministic solvers and analyze your pathway with tools such as parameter estimation and sensitivity analysis.

First get the following MATLAB ISO images at ftp://pxe/software/Matlab2006b (perhaps only available for LAN of USTC)

[Mathworks.Matlab].Mathworks.Matlab.R2006b.UNIX.ISO-TBE-CD1.iso [Mathworks.Matlab].Mathworks.Matlab.R2006b.UNIX.ISO-TBE-CD2.iso [Mathworks.Matlab].Mathworks.Matlab.R2006b.UNIX.ISO-TBE-CD3.iso [Mathworks.Matlab].Mathworks.Matlab.R2006b.UNIX.ISO-TBE.nfo

mount these images and enter the directory where you want to install matlab, create a matlab directory ($MATLAB).

Copy the license file from the first CD. There two license files in CD1/crack license_locked.dat license_server.dat. I copy license_locked.dat to $MATLAB and rename it license.dat. Enter $MATLAB
run CD1/install. The graphic interface is easy to complete.

When I finished the normal install and tried to run matlab, it popped up a very lengthy error message

java.lang.ExceptionInInitializerError
at com.mathworks.mde.filebrowser.FileBrowser.(FileBrowser.java:92)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)

and crashed thereafter. But when I ran matlab -nojvm, it worked normally.

Solution: the Java runtime that ships with MATLAB caused the above error. Replace it with your own version of Java (I used jre1.5.0_06)

cd $MATLAB/sys/java
mv java java-backup
ln -s path_of_your_own_java java

And now MATLAB works. Bingo!

PS: kkk recommended another standalone software, Copasi, to build and simulate biomedical networks. Have a look at it.

COPASI is a software application for simulation and analysis of biochemical networks. COPASI — a COmplex PAthway SImulator. Bioinformatics 22, 3067-74.

Current Features:
Stochastic and deterministic time course simulation
Steady state analysis (including stability)
Metabolic control analysis / sensitivity analysis
Elementary mode analysis
Mass conservation analysis
Calculation of Lyapunov exponents
Parameter scans
Optimization of arbitrary objective functions
Parameter estimation using data from time course and/or steady state experiments
Sliders for interactive parameter changes
Global parameter to change multiple kinetic rates at once
Imports and exports SBML (export only in level 2 version 1, import all levels)
Loads Gepasi files
Export in Berkeley Madonna format and C source code of the ODE system generated from the chemical reactions
Versions for MS Windows, Linux, OS X, and Solaris SPARC
Command line version for batch processing
Visit this page often, new releases will contain many more features!

Still No Sense of Signaling Network Research

As graduation approaches, I still have no clear sense of my research subject, the insulin signaling network. I admit that this is partly my laziness, but it is mostly because this is a very new and unclear research area. If I had started with traditional research (cell culture, gene cloning and protein purification), I would have mostly finished by now. Now it is too late to switch to an easier topic, and it would be stupid to do so. Thankfully I have read many enlightening papers in this area and learned to use some software, so why should I give up? It won't be very difficult to graduate no matter what research you have done. It is just a try.

After I realized this, I decided to read the publications in this area systematically. Today I am reading the Science STKE Signaling Breakthroughs of the Year. And here is another list of papers to be read (the number of papers in this list is increasing exponentially; I don't know when I will make sense of them all)

[1] G. Altan-Bonnet, R. N. Germain, Modeling T cell antigen discrimination based on feedback control of digital ERK responses. PLoS Biol. 3, e356 (2005).

[2] J. R. Pomerening, S. Y. Kim, J. E. Ferrell, Jr., Systems-level dissection of the cell-cycle oscillator: Bypassing positive feedback produces damped oscillations. Cell 122, 565–578 (2005).

[3] O. Brandman, J. E. Ferrell, Jr., R. Li, T. Meyer, Interlinked fast and slow positive feedback loops drive reliable cell decisions. Science 310, 496–498 (2005).

Friday, March 02, 2007

Paper Analysis -2007-03-02

Reconstruction of Cellular Signaling Networks and Analysis of Their Properties Nature Reviews Molecular Cell Biology 6, 99-111 (2005); doi:10.1038/nrm1570

A NETWORK RECONSTRUCTION includes a chemically accurate representation of all of the biochemical events that are occurring within a defined signalling network, and incorporates the interconnectivity and functional relationships that are inferred from experimental data.

This article gives an enlightening theoretical analysis of signal transduction networks: the order of magnitude of the number of network components (receptors, kinases, phosphatases) and the order of magnitude of their interconnectivity (a degree of about 2.5 per component). We can use combinatorial complexity to characterize this idea. The catalog of network components without post-translational modification can be inferred from genome annotation. The spectrum of network components after PTM and protein-protein interaction in the various states of the network is expected to be assayed with future proteomic techniques (though I am pessimistic about this expectation). But what is the use, or what are the consequences, of this large potential spectrum of network components?

The following papers that it cites may be worth reading.

[1]
Papin, J. A. & Palsson, B. O. The JAK–STAT signaling network in the human B-cell: an extreme signaling pathway analysis. Biophys. J. 87, 37–46 (2004).

[2]
Resat, H., Wiley, H. S. & Dixon, D. A. Probability-weighted dynamic Monte Carlo method for reaction kinetics simulations. J. Phys. Chem. B 105, 11026–11034 (2001)

[3]
Bhalla, U. S. & Iyengar, R. Emergent properties of networks of biological signaling pathways. Science 283, 381–387 (1999).
Describes some of the first large-scale analyses of signalling reactions.

[4]
Hoffmann, A., Levchenko, A., Scott, M. L. & Baltimore, D. The IkappaB–NF-kappaB signaling module: temporal control and selective gene activation. Science 298, 1241–1245 (2002).
Shows the powerful integration of mathematical modelling with experimental investigation

[5]
Lee, E., Salic, A., Kruger, R., Heinrich, R. & Kirschner, M. W. The roles of APC and Axin derived from experimental and theoretical analysis of the Wnt pathway. PLoS Biol. 1, 116–132 (2003).

[6]
Prill, R., Iglesias, P.A. and Levchenko, A. Dynamic Properties of Small Regulatory Motifs Contribute to Biological Network Organization. PLoS Biology 3(11): e343 (2005)

[7]
Sivakumaran, S., Hariharaputran, S., Mishra, J. & Bhalla, U. S. The database of quantitative cellular signaling: management and analysis of chemical kinetic models of signaling networks. Bioinformatics 19, 408–415 (2003)

Thursday, March 01, 2007

Omics is Just a Startup

While I was listening to the talk titled Using Genomics to Explore the Microbial World by Prof. James Tiedje this afternoon, one idea kept haunting my mind. "Omics is dead": I forget where I read this remark, but it struck me then and it strikes me now. Omics is like listing all the components of a computer. However, due to technical limitations and time constraints, we will never be able to get a full list of genes and proteins, despite the optimistic promises of genomics and proteomics. Even if we could get the full catalogue of the human machine, we still would not understand how the human body functions and malfunctions, just as knowing all the components of a computer does not necessarily imply understanding how it works.

Now, besides proteomics and genomics, here comes metabolomics, with similarly promising declarations. As the latest Nature essay (Meet the human metabolome) states,

Metabolomics is the study of the raw materials and products of the body's biochemical reactions, molecules that are smaller than most proteins, DNA and other macromolecules. The aim is to be able to take urine, blood or some other body fluid, scan it in a machine and find a profile of tens or hundreds of chemicals that can predict whether an individual is on the road to a disease, say, or likely to experience side-effects from a particular drug.

In fact, researchers in metabolomics are even more optimistic, declaring that

Small changes in the activity of a gene or protein (which may have an unknown impact on the workings of a cell) often create a much larger change in metabolite levels particular concentrations and combinations can reveal something about drugs or disease

However, I am suspicious of their promise. First, considering the great diversity of metabolites in human fluids, we still do not have an assay powerful enough to identify all metabolites in a high-throughput manner and measure their concentrations. Second, changes in the metabolome are more susceptible to environmental factors, so it will be difficult to tell significant disease-related changes from temporal fluctuations.

Anyway, let's be a little optimistic: omics is just a startup!

Monday, February 05, 2007

Uwe Ohler

Uwe Ohler's research focuses on sequence analysis. His previous and current research projects include: Regulation of gene expression in Arabidopsis root development; Prediction and validation of skipped mammalian exons; Analysis of transcription start sites in fungal genomes; Motif finding with Bayesian approaches; Identification of core promoter elements in Drosophila; Post-transcriptional regulation with RNA-binding proteins; Regulation of neuronal gene expression in C. elegans Pavel Tomancak; Embryonic expression patterns in Drosophila. It is worth mentioning that his research also deals with gene expression analysis, but I am not familiar with his thoughts and methods in this field. So I will focus on part of his research: alternative splicing site identification and promoter prediction.

Ohler U, Shomron N, Burge CB (2005) Recognition of Unknown Conserved Alternatively Spliced Exons. PLoS Comput Biol 1(2): e15 doi:10.1371/journal.pcbi.0010015

Ohler collaborates with Christopher B. Burge of MIT, probably a BIG guy in this area. Pay attention to him.

What is the use of identifying alternative splicing sites? The authors say that "The identification of such variants has until recently relied solely on the sequencing and comparison of expressed sequence tags (ESTs), but the number of available ESTs is not large enough to cover all variants under all conditions". According to a Nature Reviews Genetics review, which I covered in my last post, microarray platforms for finding unknown exons are under development. Probably even a microarray experiment still cannot cover all variants under all conditions. Thus a preliminary computational prediction yields many candidate alternative splicing sites, many of which may be false positives, and these can then be tested by a microarray experiment. Such predictions may also help the design of the array.

Method: pair hidden Markov model

Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification RNA (2004), 10:1309-1322

Quantification of transcription factor expression from Arabidopsis images Bioinformatics 2006 22(14):e323-e331; doi:10.1093/bioinformatics/btl228

In spite of the great success of the microarray technique in gene expression profiling, it fails to detect spatial features of gene expression. Confocal microscopy, by contrast, can provide quantitative information on gene expression with greater spatial and temporal resolution. This paper describes a software protocol for analyzing confocal microscopy images. (How is the high throughput achieved?)

image registration
GFP transcriptional fusion: GFP serves as a marker of mRNA expression level
GFP translational fusion

Monday, January 29, 2007

Paper Analysis: Microarray technology: beyond transcript profiling and genotype analysis

Microarray technology: beyond transcript profiling and genotype analysis
Nature Reviews Genetics 7, 200-210 (March 2006) | doi:10.1038/nrg1809

I have spent nearly three days reading this review on microarrays. This is partly because the paper involves too many new concepts for me to digest and partly because, I have to admit, I have wasted too much time on BBS, films and music ^_^. Even now I cannot claim to have absorbed all the material in this paper, but I think it is better to take some notes here, as it may urge me to concentrate on research.

This paper describes the following microarray developments:

Process	Status*
Transcriptional profiling	Mature, but still to be improved
Genotyping	Mature, but still to be improved
Splice-variant analysis	In progress
Identification of unknown exons	Early stages
DNA-structure analysis	Pilot phase
ChIP-on-chip	In progress
Protein binding	Under development
Protein–RNA interaction	Idea
Chip-based CGH	In progress
Epigenetic studies	Under development
DNA mapping	Mature
Resequencing	In progress
Large-scale sequencing	Under development
Gene/genome synthesis	Early stages
RNA/RNAi synthesis	Pilot phase
Protein–DNA interaction	Under development
On-chip translation	Under development
Universal microarray	Under development

*From most to least developed: mature, in progress, under development, early stages, pilot phase, idea. CGH, comparative genomic hybridization; ChIP-on-chip, on-chip chromatin immunoprecipitation.

The author thinks transcriptional profiling is relatively mature as a technique, but data analysis and interpretation still lag behind. Some organizations are working on this, such as the Microarray Gene Expression Data (MGED) Society, the Gene Ontology Consortium and Bioconductor.

Expanding RNA studies: the transcribed RNA profile is a mixture of pre-mRNA, various forms of alternatively spliced mature mRNA, non-coding RNA and regulatory RNA. Because of alternative splicing, we may overlook other isoforms and exons in the genome sequence that do not appear in our experimental samples. To find out which other exons exist and under what conditions they are retained in mature mRNA, we can build an array of oligonucleotides representing all known exons from the genome annotation. This array can then be used to test those conditions.

Another question is how we can find exons that escape the notice of genome annotation analysis. "One option is to synthesize oligonucleotides that correspond to the sequences at the exon–intron boundaries with their 5' ends attached to the chip surface"

Another approach is a whole-genome microarray (tiling path), but the fragments are rather long, so some active sites of interest may be missed.
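To make the tiling idea concrete, here is a minimal sketch of cutting a sequence into overlapping probes. The function name tiling_probes, the probe length of 25 and the step of 10 are my own illustrative assumptions, not values from the review; the step is what sets the resolution of the tiling path.

```python
def tiling_probes(sequence, probe_len=25, step=10):
    """Return (start, probe) pairs that tile `sequence` with overlapping windows.

    probe_len and step are hypothetical values chosen for illustration only.
    """
    probes = []
    # Slide a fixed-size window along the sequence, moving `step` bases each time.
    for start in range(0, len(sequence) - probe_len + 1, step):
        probes.append((start, sequence[start:start + probe_len]))
    return probes


genome = "ACGT" * 15  # toy 60-base "genome", purely illustrative
probes = tiling_probes(genome, probe_len=25, step=10)
for start, probe in probes:
    print(start, probe)
```

A smaller step gives finer resolution at the cost of more probes. Note that bases beyond the last full window are not covered in this simple version.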

ChIP-on-chip: on-chip chromatin immunoprecipitation. But how does this technique achieve high throughput if only one kind of protein can be precipitated at a time, given the specificity of antibody binding? I need more reading to understand this technique.

The author also predicted that "all analyses that are carried out with DNA are feasible at the level of RNA also."

Comparative genomic hybridization (CGH): a method that is used to analyse variations in DNA copy number.
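As a rough illustration of how copy-number variation is usually read from array CGH data, the sketch below computes per-probe log2 ratios of test versus reference signal. The function names, the toy intensities and the +/-0.3 cutoffs are assumptions of mine for illustration; they do not come from the review.

```python
import math


def log2_ratios(test, reference):
    """Per-probe log2(test / reference) intensity ratios."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def call(ratio, gain=0.3, loss=-0.3):
    """Label a probe as gain, loss or neutral; the cutoffs are illustrative assumptions."""
    if ratio >= gain:
        return "gain"
    if ratio <= loss:
        return "loss"
    return "neutral"


# Toy intensities for three probes (hypothetical numbers).
test_signal = [200.0, 100.0, 50.0]
reference_signal = [100.0, 100.0, 100.0]

ratios = log2_ratios(test_signal, reference_signal)
calls = [call(r) for r in ratios]
print(list(zip(ratios, calls)))
```

A ratio near 0 means the test sample has the same copy number as the reference at that probe; positive values suggest a gain and negative values a loss.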

The next part of the paper describes on-demand synthesis based on microfluidic microarrays, such as probe production (parallel production of large numbers of different oligomers), gene synthesis, RNAi production and in situ protein synthesis. Finally, the author introduces, with great enthusiasm, a universal microarray platform based on L-DNA.

Conclusions:
1. To some extent, the microarray technique represents a new data-driven method, i.e. placing data production before intellectual concepts. This differs from the traditional hypothesis-driven research in biology but has been successful in physics.
2. The global view obtained by microarray approaches might lead researchers to better appreciate the complexity of biological systems.
3. Experimental multiplexing, i.e. analysing different processes on a single system platform, will become important. In vitro systems biology will emerge, competing with (or complementing) in silico systems biology.

Here is a list of notable research projects on microarray analysis: http://filtr.blogspot.com/2007/02/research-projects-on-microarray.html

Monday, January 22, 2007

Xianghong Zhou's Papers

If you know the enemy and know yourself, you need not fear the result of a hundred battles.

--Sun Tze, the Art of War

Comments on Zhou's papers:
1. Gene Aging Nexus: A Web Database and Data Mining Platform for Microarray Data on Aging
keywords:
meta-analysis: by first extracting expression patterns from individual microarray datasets and then identifying recurrent signals, these approaches may enhance signal-noise separation.
differential expression analysis:
co-expression analysis: Zhou proposed a new method to mine regulatory modules in a previous paper, Mining dense subgraphs across massive biological networks for functional discovery.
No major biological breakthrough.

2. Integrative missing value estimation for microarray data
Question Answered:
Due to inherent noise and the limitations of experimental systems, a microarray dataset on average has more than 5% missing values, affecting more than 60% of the genes. Such missing values make some subsequent analysis methods inapplicable or greatly decrease their performance. Hence the question of missing value estimation.

Basic Idea:
How should we choose neighboring genes when the microarray dataset itself does not contain enough information? Intuitively, if a set of genes frequently shows expression similarity to the target gene over multiple datasets, they constitute a robust neighborhood that tends to co-vary in expression with the target gene.
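The intuition can be sketched in a few lines of Python. This is only my own toy reading of the idea, not the algorithm from the paper: the function names, the top_k and min_support parameters, the toy data and the simple straight-line fit used for the estimate are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def robust_neighbors(datasets, target, top_k=2, min_support=2):
    """Genes among the target's top_k most correlated genes in >= min_support datasets."""
    counts = {}
    for data in datasets:  # each dataset: list of gene rows, same gene order everywhere
        scores = [(abs(pearson(data[target], row)), g)
                  for g, row in enumerate(data) if g != target]
        for _, g in sorted(scores, reverse=True)[:top_k]:
            counts[g] = counts.get(g, 0) + 1
    return sorted(g for g, c in counts.items() if c >= min_support)


def impute(data, target, sample, neighbors):
    """Estimate a missing value (None) as the mean of per-neighbor straight-line fits."""
    obs = [i for i, v in enumerate(data[target]) if v is not None]
    y = [data[target][i] for i in obs]
    my = sum(y) / len(y)
    predictions = []
    for g in neighbors:
        x = [data[g][i] for i in obs]
        mx = sum(x) / len(x)
        slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
                 / sum((a - mx) ** 2 for a in x))
        predictions.append(my + slope * (data[g][sample] - mx))
    return sum(predictions) / len(predictions)


# Two complete reference datasets (4 genes x 5 samples), hypothetical numbers.
dataset_a = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 3, 4, 1, 2], [1, 5, 2, 4, 3]]
dataset_b = [[2, 1, 4, 3, 6], [4, 2, 8, 6, 12], [1, 2, 1, 2, 1], [3, 3, 5, 4, 7]]
# Dataset with a missing value for gene 0 at sample 2.
dataset_c = [[1.0, 2.0, None, 4.0], [3.0, 5.0, 7.0, 9.0]]

neighbors = robust_neighbors([dataset_a, dataset_b], target=0)
estimate = impute(dataset_c, target=0, sample=2, neighbors=neighbors)
print(neighbors, estimate)
```

In this toy data, gene 2 looks similar to the target only in the first dataset and gene 3 only in the second, so neither survives the min_support filter. Only gene 1, which is similar in both, is used for the estimate.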

other concepts:
LLS: local least squares
Bayesian principal component analysis
singular value decomposition
support vector machines

Tuesday, December 05, 2006

BioMed Search

BioMed Search was created by Alex Ksikes, a PhD student at the University of Cambridge. The goal of BioMed Search is to organize the figures, images and schemas found in biomedical articles. Over 1 million images have been indexed and more are on the way. BioMed Search indexes image captions along with the citations to these images.

A sample search with the keyword insulin returns the following result:

Figure 2: Insulin-induced lipogenesis, glycogen synthesis, and glucose uptake in basal and insulin-resistant 3T3-L1 adipocytes: effects of E64 on insulin resistance. Panel A, effect of E64 on insulin-induced lipogenesis in cells chronically pretreated with insulin. 3T3-L1 adipocytes were pretreated with 1.7 µM insulin alone for 0 (C), 9 (9I), or 18 h (18I) ...

From Insulin resistance is mediated by a proteolytic fragment of the insulin receptor.
The Journal of biological chemistry.

Knutson VP, Donnelly PV, Balba Y, Lopez-Reyes M · 1995 Oct 20

It is the well-formatted scientific literature that makes such a search possible. Pay attention and try to improve it.