Tech Share
Hyper-V: Building a Local yum Mirror and a Fully Distributed Hadoop Cluster with a Separated Name Node
2021-06-08
Designing the network topology
All hosts sit on the 192.168.0.* subnet:

yum      192.168.0.104
hadoop1  192.168.0.105
hadoop2  192.168.0.106
hadoop3  192.168.0.107
hadoop4  192.168.0.108
hadoop5  192.168.0.109

# Check a host's IP
ip a

The yum host's IP:
Setting up the local yum repository
Install nginx on the yum host
# Install the dependency repo
yum install epel-release -y
# Install nginx
yum install nginx -y
# Locate nginx
whereis nginx
nginx: /usr/sbin/nginx /usr/lib64/nginx /etc/nginx /usr/share/nginx

The config file used by default is /etc/nginx/nginx.conf.
Disable the firewall
# Check the firewall status
firewall-cmd --state
>> running
# Disable start on boot
systemctl disable firewalld.service
# Stop the firewall
systemctl stop firewalld.service
Access successful.
Modify the nginx configuration
vi /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
        listen      80;
        server_name _;
        root        /usr/share/nginx/html;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        location / {
        }

        error_page 404 /404.html;
        location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }
    }

    server {
        listen      8080;
        server_name _;
        location / {
            autoindex on;
            root /home/iso/;  # (replace with your actual directory path)
        }
    }
}
Pin a static IP
The DHCP lease we were using changes on its own, so the IP must be pinned.
Edit the NIC configuration to fix the IP:
vi /etc/sysconfig/network-scripts/ifcfg-enp0s10f0

IPADDR=192.168.0.104   # IP address
GATEWAY=192.168.0.1    # gateway
DNS1=192.168.1.1       # DNS resolvers (one per DNSn entry)
DNS2=192.168.0.1

service network restart
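For reference, a fuller ifcfg sketch; the BOOTPROTO, ONBOOT, and PREFIX lines are assumptions not spelled out above (CentOS needs BOOTPROTO=static for the address to stick), and multiple DNS servers go in separate DNS1/DNS2 entries:

```bash
# /etc/sysconfig/network-scripts/ifcfg-enp0s10f0 (sketch; adapt the device name)
TYPE=Ethernet
BOOTPROTO=static   # assumption: disable DHCP explicitly
ONBOOT=yes         # assumption: bring the NIC up at boot
NAME=enp0s10f0
DEVICE=enp0s10f0
IPADDR=192.168.0.104
PREFIX=24          # assumption: /24 to match the 192.168.0.* plan
GATEWAY=192.168.0.1
DNS1=192.168.1.1
DNS2=192.168.0.1
```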
Upload and mount the ISO image
Upload the image to /home/CentOS-7-x86_64-Everything-1908.iso
mkdir /home/iso
mount -o loop /home/CentOS-7-x86_64-Everything-1908.iso /home/iso/
# mount warns: /dev/loop0 is write-protected, mounting read-only
# Adjust permissions
chmod 777 /home/CentOS-7-x86_64-Everything-1908.iso
# Reload nginx
nginx -s reload
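With the ISO mounted and nginx reloaded, a quick check from any host on the subnet should show the autoindex listing (IP and port per the topology above; adjust to your setup):

```bash
# Expect an HTML directory listing of the ISO contents
curl http://192.168.0.104:8080/
# Spot-check a file that ships at the root of the CentOS ISO
curl -I http://192.168.0.104:8080/RPM-GPG-KEY-CentOS-7
```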
Configure the yum repo path on the client hosts
Now configure yum itself: create a file named Nginx-yum.repo under /etc/yum.repos.d with the following contents.
Note: change baseurl to your own VM's URL.
[Nginx-yum]
name=Nginx-yum
#mirrorlist
baseurl=http://192.168.1.3:8080
enabled=1
gpgcheck=1
gpgkey=http://192.168.1.3:8080/RPM-GPG-KEY-CentOS-7
Refresh yum
yum clean all
yum makecache
# Caution: never run yum update here, or the virtual IP configuration will be lost
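A sanity check that the local repo is actually serving packages (tree is just an example of a small package):

```bash
yum repolist enabled   # Nginx-yum should be listed with a nonzero package count
yum install -y tree    # should download from the local mirror, not the internet
```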
Export the virtual machine
The setup is tedious enough to be worth exporting for reuse.
Export the configuration to C:\Users\Public\Documents\Hyper-V\localyum; the yum folder there is the VM itself.

VM properties:
IP: 192.168.0.104
yum port: 8080
nginx config location: /etc/nginx/nginx.conf
NIC config location: /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
Configuring a Hadoop node
Passwordless SSH
systemctl enable sshd
systemctl start sshd
Change the hostname
vi /etc/hostname
vi /etc/hosts
service network restart
Download the packages
Extract them
unzip jdk1.8.0_181.zip
tar -zxvf hadoop-3.2.1.tar.gz
Configure Java
export JAVA_HOME=/cloudcomput/jdk1.8.0_181
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Configure Hadoop
Learning pssh
pssh is a remarkably handy parallel-SSH tool.
Install pssh
cd pssh-2.3.1/
# Note: this must be installed with Python 2, otherwise it errors out
python setup.py install
Using pssh
Usage note: all hosts must already be able to log in to one another without a password.
First create a host list in the current directory:

vi hosts.txt

with contents:

hadoop1
hadoop2
hadoop3
hadoop4
hadoop5

pssh -h hosts.txt -i [command to run on every host]
Worked example: pushing the Hadoop paths in /etc/profile to every node
# First edit the file on the primary node
vi /etc/profile

Append at the end:

export HADOOP_HOME=/cloudcomput/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Distribute with pscp
pscp -h hosts.txt /etc/profile /etc/profile
# Apply it everywhere with pssh
pssh -h hosts.txt -i "source /etc/profile"

On success:

[1] 07:24:13 [SUCCESS] hadoop2
[2] 07:24:13 [SUCCESS] hadoop1
[3] 07:24:13 [SUCCESS] hadoop4
[4] 07:24:13 [SUCCESS] hadoop3
[5] 07:24:13 [SUCCESS] hadoop5
Hadoop startup error
ERROR: Attempting to operate on yarn nodemanager as root
Under the hadoop/sbin directory, add the following variables at the top of both start-dfs.sh and stop-dfs.sh:
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Likewise, add the following at the top of start-yarn.sh and stop-yarn.sh:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
After the change, rerun ./start-dfs.sh; it now starts cleanly.
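Rather than editing the four scripts by hand, a sketch using GNU sed to inject the variables right after the shebang; the paths follow the layout used later in this post:

```bash
cd /cloudcomput/hadoop-3.2.1/sbin
# Append the user variables after line 1 (the shebang) of the dfs scripts
for f in start-dfs.sh stop-dfs.sh; do
  sed -i '1a HDFS_DATANODE_USER=root\nHADOOP_SECURE_DN_USER=hdfs\nHDFS_NAMENODE_USER=root\nHDFS_SECONDARYNAMENODE_USER=root' "$f"
done
# Same idea for the yarn scripts
for f in start-yarn.sh stop-yarn.sh; do
  sed -i '1a YARN_RESOURCEMANAGER_USER=root\nHADOOP_SECURE_DN_USER=yarn\nYARN_NODEMANAGER_USER=root' "$f"
done
```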
When distributing a directory, add the -r flag:
pscp -h hosts.txt -r /cloudcomput/hadoop-3.2.1/sbin /cloudcomput/hadoop-3.2.1/sbin
Reformatting HDFS after changing the configuration
stop-all.sh
# Wipe old data and logs on every node
pssh -h hosts.txt -i "cd /cloudcomput/hadoop-3.2.1/tmp && rm -rf *"
pssh -h hosts.txt -i "cd /cloudcomput/hadoop-3.2.1/logs && rm -rf *"
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs/name
chmod -R 777 /cloudcomput/hadoop-3.2.1/tmp
hdfs namenode -format
start-all.sh
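Note that the mkdir/chmod above only ran on the local node; a sketch that recreates the directories on every node via pssh (hosts.txt as defined earlier):

```bash
pssh -h hosts.txt -i "mkdir -p /cloudcomput/hadoop-3.2.1/tmp/dfs/name && chmod -R 777 /cloudcomput/hadoop-3.2.1/tmp"
```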
After starting Hadoop, every daemon except the namenode came up; the logs showed this error:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /cloudcomput/hadoop-3.2.1/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
Fix
The relevant setting is hadoop.tmp.dir (whose default is /tmp/hadoop-${user.name}); recreate the storage directory it points at:
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs/name
After a long hunt, the culprit turned out to be a full-width (Chinese) hyphen in the path!
With that fixed, the fully distributed setup works.
Secondary NameNode
Port 8088 (YARN web UI)
Start the history server
mapred --daemon start historyserver
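The JobHistory server's web UI listens on port 19888 by default; assuming it was started on hadoop1 (192.168.0.105), a quick probe:

```bash
curl -I http://192.168.0.105:19888/   # expect HTTP 200 from the JobHistory web UI
```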
Inspect the DataNodes
DataNode 1: hadoop3
DataNode 2: hadoop4
DataNode 3: hadoop5
Check the services on all five nodes
pssh -h hosts.txt -i '/cloudcomput/java/bin/jps'
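Beyond eyeballing the jps output, a sketch that checks each node for the daemon it should be running; the role mapping follows the topology at the top of this post, and the jps path matches the command above:

```bash
#!/usr/bin/env bash
# Expected primary daemon per node, per the cluster plan
declare -A want=(
  [hadoop1]=NameNode
  [hadoop2]=SecondaryNameNode
  [hadoop3]=DataNode
  [hadoop4]=DataNode
  [hadoop5]=DataNode
)
for h in "${!want[@]}"; do
  if ssh "$h" /cloudcomput/java/bin/jps | grep -q "${want[$h]}"; then
    echo "$h: ${want[$h]} running"
  else
    echo "$h: ${want[$h]} MISSING"
  fi
done
```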
"Big Data Technology and Applications"
Experiment Report
Student ID: 20153442
Class: Xin 1701-3
Name: Li Mengkai
School of Information Science and Technology, Shijiazhuang Tiedao University
(Octopus Internet Academy platform)
Spring semester, 2020
Experiment 1: Building a fully distributed Hadoop cluster with the namenode and secondnamenode separated, plus a pseudo-distributed Hadoop setup
I. Objectives
1. Build a simulated fully distributed Hadoop cluster
2. Build a pseudo-distributed Hadoop
II. System environment
Windows 10
III. Task description
Build a fully distributed Hadoop cluster inside virtual machines.
All hosts are on the 192.168.0.* subnet:
local yum mirror for cluster deployment: yum 192.168.0.104
namenode: hadoop1 192.168.0.105
secondnamenode: hadoop2 192.168.0.106
datanode1: hadoop3 192.168.0.107
datanode2: hadoop4 192.168.0.108
datanode3: hadoop5 192.168.0.109
IV. Procedure
0. Pseudo-distributed setup
A pseudo-distributed Hadoop container was already configured earlier, so it can be deployed straight from Docker.
docker run -tdi -p 8088:8088 -p 9000:9000 -p 9864:9864 -p 9866:9866 -p 9867:9867 -p 9870:9870 -p 19888:19888 -p 50100:50100 -p 50105:50105 --hostname localhost --privileged -e "container=docker" --name hadoopweifb registry.cn-hangzhou.aliyuncs.com/mkmk/hadoop:weifb3 init
docker exec hadoopweifb /bin/bash -c '/starthadoop.sh'
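Once the container's daemons are up (give it a minute), the mapped web UIs can be probed from the host:

```bash
curl -I http://localhost:9870/   # NameNode web UI
curl -I http://localhost:8088/   # YARN ResourceManager web UI
```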
Deployment notes
8088: YARN ResourceManager web UI
9000: namenode RPC port
9864 and the rest: the remaining HDFS web and service ports
All of these ports are required by Hadoop; map them through as shown.
Hadoop version: 3.2.1
Java version: 1.8
The image registry is my Alibaba Cloud repository:
registry.cn-hangzhou.aliyuncs.com/mkmk/hadoop:weifb3
1. Choose the cluster operating system.
I use CentOS 7 the most, so CentOS 7 is the cluster OS. To keep each node's resource footprint small, we use the CentOS 7 minimal install, which is only about 900 MB.
Huawei Cloud mirror (domestic):
https://repo.huaweicloud.com/centos/7/isos/x86_64/CentOS-7-x86_64-Minimal-1908.iso
2. Hyper-V Manager
There is no need to download VMware; Windows 10 ships with the Hyper-V virtualization tool. Just search for Hyper-V and enable it.
3. Build the local yum mirror
With a minimal install the OS ships with almost nothing, so everything has to be installed by hand; to speed up installation across the whole cluster, we host a yum mirror locally.
Download the CentOS 7 Everything ISO from Huawei Cloud (about 10 GB, roughly half an hour):
https://repo.huaweicloud.com/centos/7/isos/x86_64/CentOS-7-x86_64-Everything-1908.iso
First create the yum VM, using the minimal install image:
Memory: 4 GB
Disk: 50 GB
Install from CentOS-7-x86_64-Minimal-1908.iso.
Install nginx
Install the dependency repo:
yum install epel-release -y
Install nginx:
yum install nginx -y
Locate nginx:
whereis nginx
nginx: /usr/sbin/nginx /usr/lib64/nginx /etc/nginx /usr/share/nginx
Default config path:
/etc/nginx/nginx.conf
Disable the VM's firewall
Check the firewall status:
firewall-cmd --state
>> running
# Disable start on boot
systemctl disable firewalld.service
# Stop the firewall
systemctl stop firewalld.service
Expose the Everything ISO contents through nginx
vi /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;
events {
worker_connections 1024;
}
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;

include /etc/nginx/mime.types;
default_type application/octet-stream;

# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/*.conf;

server {
    listen 80;
    server_name _;
    root /usr/share/nginx/html;

    # Load configuration files for the default server block.
    include /etc/nginx/default.d/*.conf;

    location / {
    }

    error_page 404 /404.html;
    location = /40x.html {
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
    }
}

server {
    listen 8080;
    server_name _;
    location / {
        autoindex on;
        root /home/iso/;  # (replace with your actual directory path)
    }
}
}
Manually configure the IP
By default DHCP is used, so the IP changes on its own.
Edit the NIC configuration to pin the IP:
vi /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
IPADDR=192.168.0.104   # IP address
GATEWAY=192.168.0.1    # gateway
DNS1=192.168.1.1       # DNS resolvers (one per DNSn entry)
DNS2=192.168.0.1
service network restart
Upload and mount the Everything ISO
mkdir /home/iso
mount -o loop /home/CentOS-7-x86_64-Everything-1908.iso /home/iso/
mount warns: /dev/loop0 is write-protected, mounting read-only
Adjust permissions:
chmod 777 /home/CentOS-7-x86_64-Everything-1908.iso
Reload nginx:
nginx -s reload
Serve Java 1.8 and Hadoop 3.2.1 through nginx
After uploading the files, add a new server block to the nginx configuration:
server {
    listen 9090;
    server_name _;
    add_header Access-Control-Allow-Origin *;
    add_header Access-Control-Allow-Methods 'GET,POST';
    add_header Access-Control-Allow-Headers 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization';
    location / {
        root /home/javainstall/;
        autoindex on;
    }
}
Reload nginx:
nginx -s reload
4. Build the Hadoop cluster
Build the cluster template machine
The software environment is nearly identical across the cluster, so we fully configure the namenode first, clone the VM for the other nodes, and then adjust each clone's configuration.
Configure the local yum repository
Delete all the stock repo files under /etc/yum.repos.d, then create a new file there named Nginx-yum.repo:
[Nginx-yum]
name=Nginx-yum
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
baseurl=http://192.168.1.3:8080
enabled=1
gpgcheck=1
gpgkey=http://192.168.1.3:8080/RPM-GPG-KEY-CentOS-7
Refresh yum:
yum clean all
yum makecache
Configure hosts resolution
vi /etc/hosts
127.0.0.1 localhost
192.168.0.105 hadoop1
192.168.0.106 hadoop2
192.168.0.107 hadoop3
192.168.0.108 hadoop4
192.168.0.109 hadoop5
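A quick loop to confirm the names resolve and the nodes are reachable (run from any node once all VMs are up):

```bash
for h in hadoop1 hadoop2 hadoop3 hadoop4 hadoop5; do
  ping -c1 -W1 "$h" >/dev/null && echo "$h ok" || echo "$h FAILED"
done
```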
Download and configure Java and Hadoop
Create the cloudcomput directory:
mkdir /cloudcomput
Enter the directory:
cd /cloudcomput
Download the packages from the local mirror:
wget http://192.168.0.104:9090/jdk1.8.0_181.zip
wget http://192.168.0.104:9090/hadoop-3.2.1.tar.gz
Extract the packages and configure the environment variables:
vi /etc/profile
Append at the end:
export JAVA_HOME=/cloudcomput/java
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/cloudcomput/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
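After sourcing the profile, verify both toolchains are on the PATH:

```bash
source /etc/profile
java -version      # expect 1.8.0_181
hadoop version     # expect 3.2.1
```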
Consult the configuration reference on the Hadoop website, then edit the various files under /cloudcomput/hadoop-3.2.1/etc/hadoop.
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/hadoop-env.sh
Add around line 55:
export JAVA_HOME=/cloudcomput/java
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/core-site.xml and append:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.0.105:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cloudcomput/hadoop-3.2.1/tmp/hadoop-${user.name}</value>
  </property>
</configuration>
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>192.168.0.105:9870</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.0.106:9868</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
</configuration>
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>
      /cloudcomput/hadoop-3.2.1/etc/*,
      /cloudcomput/hadoop-3.2.1/etc/hadoop/*,
      /cloudcomput/hadoop-3.2.1/lib/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/common/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/common/lib/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/lib-examples/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/lib/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/lib/*
    </value>
  </property>
</configuration>
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>
      /cloudcomput/hadoop-3.2.1/etc/*,
      /cloudcomput/hadoop-3.2.1/etc/hadoop/*,
      /cloudcomput/hadoop-3.2.1/lib/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/common/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/common/lib/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/lib-examples/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/lib/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/*,
      /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/lib/*
    </value>
  </property>
</configuration>
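Before formatting HDFS it is worth confirming the four edited files are well-formed XML; xmllint comes with libxml2 (installable from the local mirror if missing):

```bash
cd /cloudcomput/hadoop-3.2.1/etc/hadoop
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
  xmllint --noout "$f" && echo "$f OK"
done
```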
Patch the Hadoop start/stop scripts (the run-as-user variables)
vi /cloudcomput/hadoop-3.2.1/sbin/start-dfs.sh
vi /cloudcomput/hadoop-3.2.1/sbin/stop-dfs.sh
vi /cloudcomput/hadoop-3.2.1/sbin/start-yarn.sh
vi /cloudcomput/hadoop-3.2.1/sbin/stop-yarn.sh
Add the following starting on line 2 of each file:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
After the template machine is configured, back it up.
Set up and configure each node
Change the hostname:
vi /etc/hostname
vi /etc/hosts
service network restart
Change the IP
vi /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
Set the current node's IP:
IPADDR=192.168.0.<current node IP>   # IP address
GATEWAY=192.168.0.1                  # gateway
DNS1=192.168.1.1                     # DNS resolvers
DNS2=192.168.0.1
service network restart
Enable sshd at boot
systemctl enable sshd
systemctl start sshd
Configure passwordless SSH across the cluster
mkdir ~/.ssh
ssh-keygen -t rsa
Starting from the first host, append every host's public key to authorized_keys in turn:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Send ~/.ssh/authorized_keys on to the next host:
scp ~/.ssh/authorized_keys root@<next host>:~/.ssh/authorized_keys
ssh root@<next host IP>
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@<next host>:~/.ssh/authorized_keys
Loop through the hosts this way until every host's key has been appended to ~/.ssh/authorized_keys, then push the final file to every host:

scp ~/.ssh/authorized_keys root@<each host>:~/.ssh/authorized_keys
Verify passwordless login
First log in to hadoop1, then try each node in turn:
ssh hadoop2
ssh hadoop3
ssh hadoop4
ssh hadoop5
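A scripted variant of the same check; BatchMode makes ssh fail instead of prompting for a password if key-based login is broken:

```bash
for h in hadoop2 hadoop3 hadoop4 hadoop5; do
  ssh -o BatchMode=yes "$h" 'echo "$(hostname): key login ok"'
done
```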
This completes the fully distributed Hadoop configuration.
Using the fully distributed cluster
Start Hadoop
Log in to hadoop1 and simply run start-all.sh.
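Once start-all.sh returns, confirm all three datanodes registered with the namenode:

```bash
hdfs dfsadmin -report | grep -E 'Live datanodes|^Name:'   # expect Live datanodes (3)
```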
Install the parallel batch tool pssh
Install pssh on hadoop1, the namenode:
wget https://pypi.tuna.tsinghua.edu.cn/packages/60/9a/8035af3a7d3d1617ae2c7c174efa4f154e5bf9c24b36b623413b38be8e4a/pssh-2.3.1.tar.gz#sha256=539f8d8363b722712310f3296f189d1ae8c690898eca93627fc89a9cb311f6b4
# Note: this must be installed with Python 2, otherwise it errors out
python setup.py install
Check the startup processes on every node:
pssh -h hosts.txt -i '/cloudcomput/java/bin/jps'
Check each node's web UI
NameNode
Secondary NameNode
DataNode 1
DataNode 2
DataNode 3
V. Summary