Hadoop 2.2 Cluster Installation (Yang Xiongbo)
Hadoop 2.2 Installation Manual
1 Preparation
1.1 Software Downloads
hadoop-2.2.0.tar.gz
jdk1.6.0_35
1.2 Server Configuration
All servers run CentOS 5.5.
IP            HOSTNAME   USER              PURPOSE     NOTE
99.6.137.76   cloud001   hduser/hd_34567   NN/SNN/RM   64-bit
99.6.137.241  cloud002   hduser/hd_34567   DN/NM       64-bit
99.6.136.50   cloud003   hduser/hd_34567   DN/NM       32-bit

NN:  NameNode
SNN: SecondaryNameNode
RM:  ResourceManager
DN:  DataNode
NM:  NodeManager
Rename the servers:
# vi /etc/sysconfig/network
# service network restart
Add the new user and grant it privileges:
# vi /etc/sudoers
Add the IP-to-hostname mappings to /etc/hosts.
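A sketch of the /etc/hosts entries implied by the server table above; append them on all three machines:

```
99.6.137.76   cloud001
99.6.137.241  cloud002
99.6.136.50   cloud003
```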
1.2.1 Passwordless SSH Login
Set up passwordless login among cloud001/002/003.
[root@cloud001]# vi /etc/ssh/sshd_config
Find the following lines and remove the leading comment marker "#":
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
[root@cloud001]# cd ~/.ssh
[root@cloud001]# ssh-keygen -t rsa    -- press Enter at every prompt
[root@cloud001]# cat id_rsa.pub >> authorized_keys    -- append the public key
Then restart the sshd service as root: # service sshd restart
Repeat the steps above on cloud001/002/003, then copy cloud001's authorized_keys to 002 and 003 so that cloud001 can log in to both without a password:
[root@cloud001]# scp authorized_keys root@cloud002:~/.ssh/authorized_keys_cloud001
Log in to cloud002:
[root@cloud002]# cat authorized_keys_cloud001 >> authorized_keys
1.2.2 Disable the Firewall
Run the following on all three servers:
[root@cloud001]# /etc/init.d/iptables stop
[root@cloud001]# chkconfig iptables off
2 Installation
The Hadoop configuration is essentially the same on every machine in the cluster, so deploy on the namenode first and then copy the installation to the other nodes.
Extract hadoop-2.2.0.tar.gz into /home/hduser/, then create the following directories on cloud001's local file system:
/home/hduser/dfs/name
/home/hduser/dfs/data
/home/hduser/tmp
2.1 Configuration Changes
/home/hduser/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
/home/hduser/hadoop-2.2.0/etc/hadoop/yarn-env.sh
Set JAVA_HOME in both files, for example:
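One way to make the change, sketched here against a throwaway copy of the file; the JDK path /usr/java/jdk1.6.0_35 is an assumption based on the jdk1.6.0_35 download above and should be adjusted to the actual install location:

```shell
# Demonstrated on a temporary stand-in for /home/hduser/hadoop-2.2.0/etc/hadoop.
HADOOP_CONF=$(mktemp -d)
# Stock line as shipped in hadoop-env.sh (yarn-env.sh is edited the same way):
echo 'export JAVA_HOME=${JAVA_HOME}' > "$HADOOP_CONF/hadoop-env.sh"
# Pin JAVA_HOME to an explicit path (assumed JDK location):
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.6.0_35|' "$HADOOP_CONF/hadoop-env.sh"
grep '^export JAVA_HOME' "$HADOOP_CONF/hadoop-env.sh"
```

On the real cluster, run the sed against both env scripts under etc/hadoop instead of the temporary file.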
/home/hduser/hadoop-2.2.0/etc/hadoop/slaves
Add cloud002 and cloud003.
/home/hduser/hadoop-2.2.0/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cloud001:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hduser/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
/home/hduser/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>cloud001:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hduser/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hduser/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
/home/hduser/hadoop-2.2.0/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>cloud001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>cloud001:19888</value>
  </property>
</configuration>
/home/hduser/hadoop-2.2.0/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>cloud001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>cloud001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>cloud001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>cloud001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>cloud001:8088</value>
  </property>
</configuration>
2.2 Copy to the Other Nodes
Copy /home/hduser/hadoop-2.2.0 to cloud002 and cloud003.
2.3 Start and Verify
# cd /home/hduser/hadoop-2.2.0
# ./bin/hdfs namenode -format    -- format the namenode
# ./sbin/start-dfs.sh    -- start HDFS
[root@cloud001 logs]# jps
15764ResourceManager
30150NameNode
30359SecondaryNameNode
16803Jps
Processes to verify on cloud001:
namenode, secondarynamenode
Processes to verify on cloud002 and cloud003:
datanode
Note:
Formatting the namenode more than once leaves the clusterID on cloud001 different from the clusterID on cloud002/003, and the datanodes then fail to start.
To fix it, edit the clusterID in the /home/hduser/dfs/data VERSION file to match the one in the /home/hduser/dfs/name VERSION file.
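The fix can be scripted; the sketch below runs against throwaway stand-ins for the two VERSION files (on a real 2.2 install they sit under the current/ subdirectories of dfs/name and dfs/data), copying the namenode's clusterID into the datanode's file:

```shell
# Stand-ins for the namenode and datanode VERSION files:
NAME_VERSION=$(mktemp)
DATA_VERSION=$(mktemp)
echo 'clusterID=CID-after-reformat' > "$NAME_VERSION"   # namenode, freshly formatted
echo 'clusterID=CID-old-cluster'    > "$DATA_VERSION"   # datanode, stale ID
# Read the namenode's clusterID and write it into the datanode's VERSION file:
NN_ID=$(grep '^clusterID=' "$NAME_VERSION" | cut -d= -f2)
sed -i "s|^clusterID=.*|clusterID=$NN_ID|" "$DATA_VERSION"
grep '^clusterID=' "$DATA_VERSION"
```

On the cluster, point NAME_VERSION and DATA_VERSION at the real files and run the datanode-side edit on cloud002 and cloud003.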
# ./sbin/start-yarn.sh    -- start YARN
A resourcemanager process appears on cloud001, and nodemanager processes appear on cloud002 and cloud003.
Check cluster status: # ./bin/hdfs dfsadmin -report
View HDFS:
http://99.6.137.76:50070
View YARN:
http://99.6.137.76:8088
3 Example
# ./bin/hdfs dfs -ls /
# ./bin/hdfs dfs -mkdir /user
# ./bin/hdfs dfs -mkdir -p /user/yxb/wordcount/in
Create three txt files containing some English words, then upload them.
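For example (file names match the listing below; the words themselves are arbitrary, and a temporary directory stands in for the manual's ../test):

```shell
# Create the three sample input files in a scratch directory:
TESTDIR=$(mktemp -d)
echo 'hello hadoop hello world'   > "$TESTDIR/yxb-01.txt"
echo 'word count word count demo' > "$TESTDIR/yxb-02.txt"
echo 'hadoop yarn mapreduce'      > "$TESTDIR/yxb-03.txt"
ls "$TESTDIR"
```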
# ls -l ../test/
total 24
-rw-r--r-- 1 root root 45 May 29 12:12 yxb-01.txt
-rw-r--r-- 1 root root 47 May 29 12:12 yxb-02.txt
-rw-r--r-- 1 root root 32 May 29 12:13 yxb-03.txt
# ./bin/hdfs dfs -put ../test/yxb*.txt /user/yxb/wordcount/in
# ./bin/hdfs dfs -ls /user/yxb/wordcount/in
14/05/29 12:15:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r-- 3 root supergroup 45 2014-05-29 12:15 /user/yxb/wordcount/in/yxb-01.txt
-rw-r--r-- 3 root supergroup 47 2014-05-29 12:15 /user/yxb/wordcount/in/yxb-02.txt
-rw-r--r-- 3 root supergroup 32 2014-05-29 12:15 /user/yxb/wordcount/in/yxb-03.txt
Run the job (if the out directory already exists, delete it first; otherwise the job fails):
# ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/yxb/wordcount/in /user/yxb/wordcount/out
View the results:
# ./bin/hdfs dfs -cat /user/yxb/wordcount/out/part-r-00000
4 Q&A
Q: How do I change the Hadoop log level?
A:
export HADOOP_ROOT_LOGGER=DEBUG,console
export HADOOP_ROOT_LOGGER=WARN,console
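The variable only affects hadoop commands launched from the same shell, for example (the hdfs invocation is shown as a comment since it needs a running install):

```shell
# Raise the log level for subsequent hadoop commands in this shell:
export HADOOP_ROOT_LOGGER=DEBUG,console
#   ./bin/hdfs dfs -ls /    -- would now print DEBUG output to the console
echo "$HADOOP_ROOT_LOGGER"
```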
Q: Hostname problems
A: Hadoop reverse-resolves hostnames; even when IP addresses are configured, YARN is started using the hostnames. As a workaround, set /etc/hosts on every machine so that the hostnames resolve.
Q: Offending key for IP in /root/.ssh/known_hosts: 1
A: Open /root/.ssh/known_hosts and delete line 1.