Hadoop is an open-source framework for distributed storage and processing of big data, typically at the tera- or petabyte scale.
- Installation
- Testing
Installation
1. Requirements/Preparation
- Install OpenJDK 11
- Create a Hadoop user
$sudo adduser hdadmin
- SSH server and client
$sudo apt-get install openssh-server openssh-client pdsh
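The Hadoop start scripts log in to each node (here, localhost) over SSH, so the hdadmin user normally also needs a passwordless key. This step is not in the original notes; the sketch below is an assumption based on the standard single-node setup:

```shell
# Run as the hdadmin user: generate a passwordless RSA key and authorize
# it for SSH logins to localhost (needed by start-dfs.sh/start-yarn.sh).
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```

Afterwards, `ssh localhost` should connect without a password prompt.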
2. Install Hadoop as the hdadmin user
- Download and extract Hadoop
$wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
$tar xzf hadoop-3.2.0.tar.gz
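Optionally, the download can be verified before extracting. The `.sha512` companion file below is an assumption based on the usual archive.apache.org layout:

```shell
# Fetch the published checksum and verify the tarball against it
# (GNU coreutils check mode also accepts the BSD-style "SHA512 (file) = ..." format).
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz.sha512
sha512sum -c hadoop-3.2.0.tar.gz.sha512
```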
- Set up Hadoop
- Set the environment variables
$vi .bashrc
Add the following lines:
```shell
export PDSH_RCMD_TYPE="ssh"
export HADOOP_HOME=/home/hdadmin/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64
```
Activate `.bashrc` by logging in again or by running `source .bashrc`.
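A quick sanity check that the variables took effect (the expected value assumes the exports above):

```shell
# After re-login or `source .bashrc`, the variables should resolve:
echo "$HADOOP_HOME"        # expected: /home/hdadmin/hadoop-3.2.0
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin is on PATH" ;;
  *) echo "PATH is missing $HADOOP_HOME/bin" ;;
esac
```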
- Set up the Hadoop environment
$vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
Add the following line:
```shell
export JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64
```
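If the JAVA_HOME path differs on your machine, it can be derived from the installed `java` binary. This one-liner assumes Debian/Ubuntu-style `update-alternatives` symlinks, as used above:

```shell
# Resolve the real java binary through its symlinks and strip the
# trailing /bin/java to get a JAVA_HOME candidate.
readlink -f "$(which java)" | sed 's:/bin/java$::'
```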
- Create a folder and copy the configuration files into it
$mkdir $HADOOP_HOME/input
$cp $HADOOP_HOME/etc/hadoop/*.xml $HADOOP_HOME/input/
- Edit core-site.xml
$vi $HADOOP_HOME/etc/hadoop/core-site.xml
Adjust the configuration:
```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.user.group.static.mapping.overrides</name>
    <value>dr.who=;hdadmin=hdadmin;</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>hdadmin</value>
  </property>
</configuration>
```
- Edit hdfs-site.xml
$vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.namenode.fs-limits.min-block-size</name>
    <value>32768</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hdadmin/hadoop-3.2.0/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hdadmin/hadoop-3.2.0/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
```
- Edit mapred-site.xml
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>
```
- Edit yarn-site.xml
```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
```
- Verify the workers file
```shell
$vi $HADOOP_HOME/etc/hadoop/workers
```
This file usually contains the cluster IP addresses or hostnames.
- Copy or download activation-1.1.jar (needed by YARN when running on Java 11)
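Before formatting the NameNode, the local directories referenced by `dfs.namenode.name.dir` and `dfs.datanode.data.dir` should exist. The paths below are an assumption and must match whatever your hdfs-site.xml actually configures:

```shell
# Create the local storage directories HDFS was configured to use.
mkdir -p "$HADOOP_HOME/hdfs/namenode" "$HADOOP_HOME/hdfs/datanode"
```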
- Format the NameNode (master node)
```shell
$hdfs namenode -format
```
- Start/stop the single-node Hadoop cluster
- Start the HDFS daemons
Start the NameNode and DataNode daemons by running the following command from hadoop-3.2.0/sbin/:
```shell
$ start-dfs.sh
```
- Start the ResourceManager and NodeManager daemons
Start the ResourceManager and NodeManager daemons by running the following command from hadoop-3.2.0/sbin/:
```shell
$ start-yarn.sh
```
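Once both start scripts have run, `jps` (shipped with the JDK) is a quick way to confirm which daemons are up. The process names below are what a healthy single-node cluster typically shows, not guaranteed output:

```shell
# List local JVM processes; expect NameNode, DataNode, SecondaryNameNode,
# ResourceManager and NodeManager after both start scripts.
$ jps
```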
- Stop the NameNode and DataNode daemons
$ stop-dfs.sh
- Stop the ResourceManager and NodeManager daemons
$ stop-yarn.sh
- Verify and access the Hadoop services in a browser
- Use a web browser to view the NameNode; by default the URL is:
http://localhost:9870/ (or your IP address instead of localhost)
- Use a web browser to view the ResourceManager; by default the URL is:
http://localhost:8088/ (or your IP address instead of localhost)
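As a final smoke test (not in the original notes), the MapReduce examples jar bundled with the distribution can be run against the fresh cluster; the jar path shown assumes the standard 3.2.0 layout:

```shell
# Estimate pi with 2 map tasks x 5 samples per map; on success the job
# log ends with an "Estimated value of Pi" line.
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 2 5
```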