Showing posts with label computer. Show all posts

Thursday, July 14, 2011

Setting Up a Hadoop Cluster

This post lists the steps to set up a Hadoop cluster on Ubuntu 11.04. Most of the code can be copied and pasted directly.

* Hadoop
** Install Java
#+begin_src shell
sudo apt-get install sun-java6-jdk
sudo update-java-alternatives -s java-6-sun
#+end_src

** Add Hadoop User and Group
#+begin_src shell
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop
#+end_src

** Configuring SSH and Password-less Login
#+begin_src sh
  # In the master node
  su hadoop
  ssh-keygen -t rsa -P ""
 
  for node in $(cat /conf/slaves);
  do
      ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@$node;
  done
#+end_src
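Before moving on, it can help to confirm that key-based login really works on every node. The following is a minimal sketch rather than part of the original setup: `list_nodes` and `check_passwordless` are hypothetical helper names, and it assumes the slaves file lists one host per line.

```shell
#!/bin/sh
# Print the hosts in a slaves file, skipping blank lines and '#' comments.
list_nodes() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}

# Try a non-interactive login on each host. BatchMode makes ssh fail
# instead of prompting when key-based login is not set up.
check_passwordless() {
    for node in $(list_nodes "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "hadoop@$node" true; then
            echo "$node: ok"
        else
            echo "$node: password-less login FAILED"
        fi
    done
}
```

Run it as the hadoop user on the master node, passing the path of the slaves file as the only argument to `check_passwordless`.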

** Install Hadoop
*** Install
#+begin_src sh
  ## download and install
  cd /home/hadoop/
  tar xzf hadoop-0.21.0.tar.gz
  mv hadoop-0.21.0 hadoop
#+end_src
*** Update .bashrc
#+begin_src sh
  ## update .bashrc
  # Set Hadoop-related environment variables
  export HADOOP_HOME=/home/hadoop/hadoop
  export HADOOP_COMMON_HOME="/home/hadoop/hadoop"
  export PATH=$PATH:$HADOOP_HOME/bin
  export PATH=$PATH:$HADOOP_COMMON_HOME/bin/
#+end_src
*** Update conf/hadoop-env.sh
#+begin_src sh
  export JAVA_HOME=/usr/lib/jvm/java-6-sun
  export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
#+end_src
*** Update conf/core-site.xml
#+begin_src html
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<!-- In: conf/core-site.xml -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://128.125.86.89:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>


</configuration>
#+end_src
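Because the master address is repeated in several files, it may be convenient to generate this file from a variable instead of editing it by hand. This is only a sketch: `write_core_site` is a hypothetical helper name, the default port is the one used above, and the property names and values are copied from the file above.

```shell
# write_core_site MASTER_HOST [PORT]
# Print a core-site.xml for the given master host to standard output.
write_core_site() {
    master=$1
    port=${2:-54310}
    cat <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://${master}:${port}</value>
</property>
</configuration>
EOF
}
```

For example, `write_core_site 128.125.86.89 > conf/core-site.xml` would reproduce the two properties shown above, without the descriptions.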
*** Update conf/mapred-site.xml
#+begin_src html
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<!-- In: conf/mapred-site.xml -->
<property>
<name>mapreduce.jobtracker.address</name>
<value>128.125.86.89:54311</value>
</property>

</configuration>
#+end_src
*** Update conf/hdfs-site.xml
#+begin_src html
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<!-- In: conf/hdfs-site.xml -->
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>

</configuration>
#+end_src
*** Update conf/masters (master node only)
#+begin_src sh
128.125.86.89
#+end_src
*** Update conf/slaves (master node only)
#+begin_src sh
128.125.86.89
slave-ip1
slave-ip2
......
#+end_src
*** Copy hadoop installation and configuration files to slave nodes
#+begin_src sh  
# In the master node  
su hadoop    
for node in $(cat /conf/slaves);  
do
      scp ~/.bashrc hadoop@$node:~
      scp -r ~/hadoop hadoop@$node:~
done
#+end_src
** Run Hadoop
*** Format HDFS
#+begin_src sh
hdfs namenode -format
#+end_src
*** Start Hadoop
#+begin_src sh
start-dfs.sh && sleep 300 && start-mapred.sh && echo "GOOD"
#+end_src
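The fixed five-minute sleep gives HDFS time to come up before MapReduce starts. One alternative is to poll until the NameNode has left safe mode. The sketch below is a suggestion, not what the command above does: `wait_until` and `safemode_off` are hypothetical helper names, and it assumes `hdfs dfsadmin -safemode get` prints a line containing `OFF` once safe mode has ended.

```shell
# wait_until TIMEOUT INTERVAL COMMAND...
# Re-run COMMAND every INTERVAL seconds until it succeeds.
# Gives up with a non-zero status after about TIMEOUT seconds.
wait_until() {
    timeout=$1; interval=$2; shift 2
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}

# Succeeds once the NameNode reports that safe mode is off.
safemode_off() {
    hdfs dfsadmin -safemode get 2>/dev/null | grep -q 'OFF'
}
```

With these defined, the start command could become `start-dfs.sh && wait_until 300 5 safemode_off && start-mapred.sh && echo "GOOD"`, which waits at most about five minutes but continues as soon as HDFS is ready.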
*** Run Jobs
#+begin_src sh
hadoop jar hadoop pipes
#+end_src
*** Stop Hadoop
#+begin_src sh
stop-mapred.sh && stop-dfs.sh
#+end_src
** References:
1. http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ 
2. http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
3. http://fclose.com/b/cloud-computing/290/hadoop-tutorial/
4. Fix "could only be replicated to 0 nodes instead of 1" error

Thursday, December 04, 2008

Installing Ubuntu on HP Pavilion dv4 1114nr

I got a new HP Pavilion dv4 1114nr this holiday season. It came with Windows Vista Home Edition pre-installed. Here is a log of how I installed Ubuntu on this laptop.

1. Create Windows Vista recovery disks
Boot into Windows Vista. Since HP no longer provides recovery disks with new laptops, you need to create your own in case you need Windows Vista in the future. Go to Start -> Recovery Disk Creation and follow the instructions.

2. Repartition the hard drive
Windows Vista comes with hard drive resizing and repartitioning utilities. That's cool! It saves us the trouble of searching for third-party software.
Follow the instructions in the following documents:
1. Screenshot Tour: Repartition your hard drive in Windows Vista
2. Can I repartition my hard disk?

3. Download
Don't bother downloading the Ubuntu installation ISO and burning your own installation CD. If you have internet access (a fairly weak condition, isn't it?), you can use UNetbootin (http://en.wikipedia.org/wiki/UNetbootin).

I am not exactly sure, but there seems to be a bug in UNetbootin.
I partitioned my hard drive into three partitions: C:, the Windows system partition; D:, the HP recovery partition; and F:, an unformatted free partition intended for the Linux installation.

But when I select Hard Drive mode, only the C: partition is displayed; I have to select USB Live mode and choose the F: partition there. I am not sure what this implies; I am still waiting for the result.

5. Sound issues
After the installation, the speaker and the microphone do not work. In particular, I could not use Skype :-(.

Solution to the "no sound" problem
Open
sudo vi /etc/modprobe.d/alsa-base
Add the following line to the end of the file
options snd-hda-intel model=laptop enable_msi=1
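
If you would rather not edit the file by hand, the line can be appended from a script in a way that is safe to run more than once. This is a minimal sketch; `append_once` is a hypothetical helper name, and it has to run with root privileges to write under /etc.

```shell
# append_once FILE LINE
# Add LINE to the end of FILE unless an identical line is already there.
append_once() {
    file=$1
    line=$2
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

In a root shell, `append_once /etc/modprobe.d/alsa-base 'options snd-hda-intel model=laptop enable_msi=1'` adds the option the first time and does nothing on later runs.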

Solution to the microphone problem:
The mic may simply be muted.
Open Volume Control by double-clicking the icon at the top-right corner. Select Preferences, choose the devices for recording and playback, and uncheck the mute option.

Solution to the Skype "Audio playback" problem
Execute the following commands in a terminal

killall pulseaudio
sudo apt-get remove pulseaudio # this does not seem to be necessary
sudo apt-get install esound
sudo rm /etc/X11/Xsession.d/70pulseaudio
refer to http://www.econowics.com/news-from-the-net/170/skype-problem-with-audio-playback-ubuntu-810-intrepid-ibex/

refer to
https://bugs.launchpad.net/ubuntu/+bug/269586
https://help.ubuntu.com/community/HdaIntelSoundHowto


6. install skype
7. install songbird
8. install Java Runtime Environment
9. install Open Office 3.0
10. install Mac4lin
11. install VLC and other codecs
12. install sopcast and gsopcast (online TV channel)
13. install fcitx Chinese input
First remove the default scim framework and install fcitx
sudo apt-get autoremove scim
sudo apt-get install fcitx
Next, modify Xsession to start fcitx automatically for all users. Open
sudo gedit /etc/X11/Xsession.d/95xinput
and change it to
export XMODIFIERS=@im=fcitx
export XIM=fcitx
export XIM_PROGRAM=fcitx
export GTK_IM_MODULE=fcitx
export QT_IM_MODULE=XIM
fcitx
Open
sudo vim /usr/lib/gtk-2.0/2.10.0/immodule-files.d/libgtk2.0-0.immodules
Change the line about xim to
"xim" "X Input Method" "gtk20" "/usr/share/locale" "en:ko:ja:th:zh"
======
Well, I came back to update this post. I just returned this HP laptop. This was the first time I bought a laptop from HP, and unfortunately it was a disappointing experience. I have two complaints. The CPU fan is too noisy: even after I disabled the "Keep fan running" feature in the BIOS, the fan still makes too much noise. The CD-ROM drive is not quiet either; it feels like an earthquake when the drive is working.

The recovery tool is also annoying. I could not restore my laptop to its factory configuration, either via the hard drive recovery tool or via the recovery CDs. It failed with "error 1002", and HP customer service could not provide any useful help (they outsource customer service to India, so we have to adapt to Indian English).

Anyway, I will blacklist this model from HP: HP Pavilion dv4.

Reference:
1. Screenshot Tour: Repartition your hard drive in Windows Vista
2. Can I repartition my hard disk?
3. Unetbootin http://unetbootin.sourceforge.net/
4. Tutorial: Ubuntu Linux on HP Pavilion
http://aldeby.org/blog/index.php/howto-ubuntu-linux-on-hp-pavilion-dv2000-dv6000-dv9000-series-laptops
5. http://www.dailygyan.com/2008/11/10-things-you-should-do-immediately.html
6. Top 10 Ubuntu downloads http://lifehacker.com/5227309/top-10-ubuntu-downloads
7. http://theindexer.wordpress.com/2009/04/24/to-do-list-after-installing-ubuntu-904-aka-jaunty-jackalope/
8. Install Microsoft YaHei font http://hi.baidu.com/zzy011/blog/item/6651e3ed44a9c62f63d09f37.html

Saturday, March 01, 2008

Synchronization Between Linux, Mac & Windows

Sooner or later after you begin to have more than one computer, you will face the problem of synchronizing files between them. Of course, the easiest solution is to sell n-1 computers, but that is not reasonable, since we must have had even stronger reasons to get n computers in the first place.

Let me do the research.

My situation is as follows:
I run SUSE Linux on the desktop in my lab, which is supposed to be running all the time. I also have a desktop at home running Windows XP, and finally my MacBook laptop. My first priority is to sync between the Mac and Linux, since I use them heavily; the second is between the Mac and Windows. Also since I use computer

References
1. Sync folders between a Mac and PC?
2. How to mount a Windows shared folder on your Mac
3. Geek to Live: Mirror files across systems with rsync
4. http://lifehacker.com/software/mac-os-x/how-to-access-a-macs-files-on-your-pc-247541.php
5. How to set up a home FTP server
6. http://ceitl.zanestate.edu/blog/archives/2005/10/synchronizing-files-across-computers-and-platforms/
7. Geek to Live: Automatically back up your hard drive
8. http://everythinglinux.org/rsync/
9. Passwordless SSH Login http://www.hackinglinuxexposed.com/articles/20021226.html
10. http://linuxmafia.com/%7Erick/linux-info/filesync.html