
Setting Up a Fully Distributed Hadoop Environment

Edit /etc/profile.d/java.sh and add the following line to it:
export PATH=/usr/java/latest/bin:$PATH
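The change can be applied to the current shell without logging out again. The original file only sets PATH; adding JAVA_HOME is a common companion step, since Hadoop's scripts look it up (an assumption, shown here as a sketch using the Oracle JDK RPM's /usr/java/latest layout):
[root@master ~]# cat /etc/profile.d/java.sh
export JAVA_HOME=/usr/java/latest
export PATH=/usr/java/latest/bin:$PATH
[root@master ~]# source /etc/profile.d/java.sh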

[hadoop@master ~]$ ssh hadoop@slave1
Last login: Wed Jul 26 01:11:22 2017 from master
[hadoop@slave1 ~]$ exit
logout
Connection to slave1 closed.
[hadoop@master ~]$ ssh hadoop@slave2
Last login: Wed Jul 26 13:12:00 2017 from master
[hadoop@slave2 ~]$ exit
logout
Connection to slave2 closed.
[hadoop@master ~]$

[root@master ~]# rpm -ivh jdk-7u9-linux-i586.rpm

II. Configure passwordless SSH login between all nodes for the hadoop user

  1. Change the hostname with the hostname command, and update it in /etc/sysconfig/network as well.
    Taking the master node as an example:
    [root@localhost ~]# hostname master.flyence.tk
    [root@localhost ~]# vim /etc/sysconfig/network
    [root@localhost ~]# logout
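    The other two nodes are handled the same way with their own names from the host plan, e.g. on the DataNode:
    [root@localhost ~]# hostname datanode.flyence.tk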

12. Configure the environment variables on slave1 and slave2 (same as step 3), then run hadoop version to verify, as sketched below.
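One way to replicate the settings, assuming the same paths on every node (a sketch, not the only method; appending the same export lines by hand works just as well):
[root@master ~]# scp /etc/profile root@slave1:/etc/profile
[root@master ~]# scp /etc/profile root@slave2:/etc/profile
[root@master ~]# ssh root@slave1 'source /etc/profile; hadoop version'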

6. Configure core-site.xml
<configuration>
<!-- The HDFS nameservice (default filesystem URI) -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
</configuration>
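With the file in place, the effective value can be read back as a quick sanity check (assuming the hadoop binaries are already on PATH):
[hadoop@master ~]$ hdfs getconf -confKey fs.defaultFS
hdfs://master:9000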

2. As the root user, extract the archive and move it under /usr/local
[hadoop@master ~]$ exit
exit
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ls
hadoop-2.7.3.tar.gz  jdk-8u131-linux-x64.tar.gz
[root@master hadoop]# tar -zxf jdk-8u131-linux-x64.tar.gz 
[root@master hadoop]# ls
hadoop-2.7.3.tar.gz  jdk1.8.0_131  jdk-8u131-linux-x64.tar.gz
[root@master hadoop]# mv jdk1.8.0_131 /usr/local/
[root@master hadoop]# cd /usr/local/
[root@master local]# ls
bin  etc  games  include  jdk1.8.0_131  lib  lib64  libexec  sbin 
share  src
[root@master local]#

3. Manage the cluster from a browser
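The original's screenshots are not reproduced here; with this layout the standard Hadoop 2.7 web UIs are served on the default ports (assuming no port overrides in the config):
NameNode status:      http://master:50070
YARN ResourceManager: http://master:8088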

2. Verify
[hadoop@master dfs]$ jps    # processes on master
7491 Jps
6820 NameNode
7014 SecondaryNameNode
7164 ResourceManager
[hadoop@master dfs]$

Hadoop must be installed on every node of the cluster.
[root@master ~]# rpm -ivh hadoop-1.2.1-1.i386.rpm

7. Configure hdfs-site.xml
<configuration>
    <!-- Replication factor -->
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <!-- NameNode data directory; create it manually if it does not exist, and chown it to hadoop -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/usr/local/hadoop/dfs/name</value>
    </property>
    <!-- DataNode data directory; create it manually if it does not exist, and chown it to hadoop -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/usr/local/hadoop/dfs/data</value>
    </property>
</configuration>
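The two directories referenced above must exist before HDFS is started; one way to create them, with ownership matching the hadoop user created earlier:
[root@master ~]# mkdir -p /usr/local/hadoop/dfs/name /usr/local/hadoop/dfs/data
[root@master ~]# chown -R hadoop:hadoop /usr/local/hadoop/dfs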

III. Install Hadoop


Hadoop has to be installed on all three hosts; repeat the following steps on each of them.

[root@master local]# hadoop version
Hadoop 2.7.3
Subversion -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using
/usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
[root@master local]#

[root@slave2 name]# jps    # processes on slave2
3233 DataNode
3469 Jps
3343 NodeManager
[root@slave2 name]#

Switch to the hadoop user and run the following to check that the JDK environment is ready
[hadoop@master ~]$ java -version
java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) Client VM (build 23.5-b02, mixed mode, sharing)

[root@master local]# vim /etc/profile
[root@master local]# tail -4 /etc/profile

#hadoop
export HADOOP_HOME=/usr/local/hadoop    # mind the path
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
[root@master local]#
[root@master local]# source /etc/profile    # apply the settings
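A quick way to confirm the new PATH entries are active (expected output for this layout; a sketch):
[root@master local]# which hadoop
/usr/local/hadoop/bin/hadoop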

II. Install the JDK

I. Configure the environment
1. Set the hostnames and the corresponding address mappings
[root@master ~]# cat /etc/hosts
127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4
::1        localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.230.130 master
192.168.230.131 slave1
192.168.230.100 slave2
# configure the hostname and the hosts file on each of the three machines

1. Generate a key pair
[hadoop@master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
1c:16:61:04:4f:76:93:cd:da:9a:08:04:15:58:7d:96 hadoop@master
The key's randomart image is:
+--[ RSA 2048]----+
|    .===B.o=    |
|    . .=.oE.o    |
|    .  +o o    |
|      .o .. .    |
|      .S. o    |
|        . o      |
|                |
|                |
|                |
+-----------------+
[hadoop@master ~]$

  • The content of /etc/sysconfig/network is as follows:
    NETWORKING=yes
    HOSTNAME=master.flyence.tk

2. Distribute the public key

# useradd hadoop
# echo "hadoop" | passwd --stdin hadoop

9. Configure mapred-site.xml
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master hadoop]# vim mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
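Because mapreduce.framework.name is set to yarn, YARN itself must also know where the ResourceManager runs and which shuffle service to load. A minimal yarn-site.xml for this layout might look like the following (a sketch, not shown in the source; the property names are the standard Hadoop 2.x ones):
<configuration>
    <!-- Run the ResourceManager on master -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <!-- Shuffle service required by MapReduce on YARN -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>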

I. Preparation
Test environment: three hosts virtualized with VMware, running CentOS 6.4 (i386)
Software used: hadoop-1.2.1-1.i386.rpm, jdk-7u9-linux-i586.rpm
Host plan:
IP address      Hostname                Role
192.168.2.22    master.flyence.tk       NameNode, JobTracker
192.168.2.42    datanode.flyence.tk     DataNode, TaskTracker
192.168.2.32    snn.flyence.tk          SecondaryNameNode

[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave1
The authenticity of host 'slave1 (192.168.230.131)' can't be established.
ECDSA key fingerprint is 32:1a:8a:37:f8:11:bc:cc:ec:35:e6:37:c2:b8:e1:45.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@slave1's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'hadoop@slave1'"
and check to make sure that only the key(s) you wanted were added.

[hadoop@master ~]$
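The same is needed for slave2 so that both slaves accept the key:
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave2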

4. The hadoop user on the master node needs key-based login to the other nodes in order to start their daemons and carry out monitoring and other administrative work.
[root@master ~]# su - hadoop
[hadoop@master ~]$ ssh-keygen -t rsa -P ''
[hadoop@master ~]$ ssh-copy-id -i .ssh/id_rsa.pub hadoop@datanode.flyence.tk
[hadoop@master ~]$ ssh-copy-id -i .ssh/id_rsa.pub hadoop@snn.flyence.tk
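A quick check that key-based login works (hostname from the host plan; the output shown is the expected one, not a transcript from the source):
[hadoop@master ~]$ ssh hadoop@datanode.flyence.tk 'hostname'
datanode.flyence.tk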

5. Use scp to copy the JDK to slave1 and slave2
[root@master ~]# scp -r /usr/local/jdk1.8.0_131/ root@slave1:/usr/local/
[root@master ~]# scp -r /usr/local/jdk1.8.0_131/ root@slave2:/usr/local/

V. Start the Hadoop services

  1. Record the IP addresses and hostnames of the 3 hosts in /etc/hosts
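    For example, matching the host plan above:
    192.168.2.22    master.flyence.tk
    192.168.2.42    datanode.flyence.tk
    192.168.2.32    snn.flyence.tk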

Switch to the hadoop user and verify that Hadoop is installed
[hadoop@master ~]$ hadoop version
Hadoop 1.2.1
Subversion -r 1503152
Compiled by mattf on Mon Jul 22 15:17:22 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /usr/share/hadoop/hadoop-core-1.2.1.jar

11. Use scp to copy the configured Hadoop to the slave1 and slave2 nodes
[root@master ~]# scp -r /usr/local/hadoop root@slave1:/usr/local/
[root@master ~]# scp -r /usr/local/hadoop root@slave2:/usr/local/
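Since the copy runs as root, ownership on the slaves should be handed back to the hadoop user (consistent with the chown note in hdfs-site.xml above; a sketch):
[root@master ~]# ssh root@slave1 'chown -R hadoop:hadoop /usr/local/hadoop'
[root@master ~]# ssh root@slave2 'chown -R hadoop:hadoop /usr/local/hadoop'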

  1. Add a hadoop user on all 3 hosts and set its password
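Before the services are started for the first time, the NameNode has to be formatted once; this is the standard step for a new HDFS cluster, run as the hadoop user on master:
[hadoop@master ~]$ hdfs namenode -format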

1. Start all the services
[hadoop@master dfs]$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
hadoop@master's password:    # enter the hadoop user's password on master
master: starting namenode, logging to
/usr/local/hadoop/logs/hadoop-hadoop-namenode-master.out
slave1: starting datanode, logging to
/usr/local/hadoop/logs/hadoop-hadoop-datanode-slave1.out
slave2: starting datanode, logging to
/usr/local/hadoop/logs/hadoop-hadoop-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password:    # enter the hadoop user's password on master
0.0.0.0: starting secondarynamenode, logging to
/usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to
/usr/local/hadoop/logs/yarn-hadoop-resourcemanager-master.out
slave1: starting nodemanager, logging to
/usr/local/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
slave2: starting nodemanager, logging to
/usr/local/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
[hadoop@master dfs]$
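The password prompts above appear because master's own key was never copied to itself; a single ssh-copy-id to the local account should remove both prompts, since master and 0.0.0.0 reach the same authorized_keys file:
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@master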

3. Configure the Java environment variables (global variables are used here)
[root@master ~]# vim /etc/profile
# append the following Java environment variables at the end of the file
[root@master ~]# tail -5 /etc/profile

export JAVA_HOME=/usr/local/jdk1.8.0_131    # mind the JDK version
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH
[root@master ~]#
[root@master ~]# source /etc/profile    # apply the settings
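To confirm that the newly installed JDK is the one being picked up (only the first line of the expected output is shown):
[root@master ~]# java -version
java version "1.8.0_131"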
