OpenShift Tutorial
OpenShift - Clusters
OpenShift supports two installation methods for setting up an OpenShift cluster.
- Quick installation method
- Advanced configuration method
Setting Up a Cluster
Quick Installation Method
This method is used for running a quick, unattended cluster setup. In order to use this method, we first need to install the installer. This can be done by running the following command.
Interactive Method
$ atomic-openshift-installer install
This is useful when one wishes to run an interactive setup.
Unattended Installation Method
This method is used when one wishes to set up an unattended installation, wherein the user defines a configuration YAML file and places it under ~/.config/openshift/ with the name installer.cfg.yml. The installation can then be run with the –u flag using the following command.
$ atomic-openshift-installer –u install
By default, it uses the config file located under ~/.config/openshift/. Under the hood, Ansible is used to carry out the installation.
version: v2
variant: openshift-enterprise
variant_version: 3.1
ansible_log_path: /tmp/ansible.log
deployment:
  ansible_ssh_user: root
  hosts:
  - ip: 172.10.10.1
    hostname: vklnld908.int.example.com
    public_ip: 24.222.0.1
    public_hostname: master.example.com
    roles:
    - master
    - node
    containerized: true
    connect_to: 24.222.0.1
  - ip: 172.10.10.2
    hostname: vklnld1446.int.example.com
    public_ip: 24.222.0.2
    public_hostname: node1.example.com
    roles:
    - node
    connect_to: 10.0.0.2
  - ip: 172.10.10.3
    hostname: vklnld1447.int.example.com
    public_ip: 24.222.0.3
    public_hostname: node2.example.com
    roles:
    - node
    connect_to: 10.0.0.3
  roles:
    master:
      <variable_name1>: "<value1>"
      <variable_name2>: "<value2>"
    node:
      <variable_name1>: "<value1>"
Here, we have role-specific variables, which can be defined if one wishes to set some variable for a particular role.
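For instance, a real node-level variable such as openshift_node_labels could be set in place of the placeholders above; the exact placement shown below is a sketch based on the placeholder format, so verify it against your installer version.

```yaml
# Sketch: setting a concrete node-role variable in installer.cfg.yml.
# openshift_node_labels is a real openshift-ansible variable; the values
# here are example labels only.
roles:
  node:
    openshift_node_labels: "{'region': 'primary', 'zone': 'east'}"
```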
Once done, we can verify the installation using the following command.
$ oc get nodes
NAME STATUS AGE
master.example.com Ready 10d
node1.example.com Ready 10d
node2.example.com Ready 10d
Advanced Installation
Advanced installation is completely based on Ansible configuration, wherein the complete host configuration and variable definitions regarding the master and node configuration are present. This contains all the details regarding the configuration.
Once we have the setup in place and the playbook is ready, we can simply run the following command to set up the cluster.
$ ansible-playbook -i inventory/hosts ~/openshift-ansible/playbooks/byo/config.yml
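A minimal inventory for the advanced method might look like the sketch below. The hostnames reuse the earlier examples, and the group and variable names follow standard openshift-ansible conventions, but check them against the documentation for your release.

```ini
# Sketch of an advanced-install Ansible inventory (e.g. inventory/hosts).
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=root
deployment_type=openshift-enterprise

[masters]
master.example.com

[etcd]
master.example.com

[nodes]
master.example.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node1.example.com openshift_node_labels="{'region': 'primary', 'zone': 'east'}"
node2.example.com openshift_node_labels="{'region': 'primary', 'zone': 'west'}"
```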
Adding Hosts to a Cluster
We can add a host to the cluster using either of the following −
- Quick installer tool
- Advanced configuration method
The quick installation tool works in both interactive and non-interactive modes. Use the following command.
$ atomic-openshift-installer -u -c </path/to/file> scaleup
The same application configuration file format, scaled up with additional host entries, can be used for adding both masters as well as nodes.
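As a sketch, scaling up via the quick installer amounts to appending another host entry to installer.cfg.yml in the same format shown earlier. The new host's addresses below are invented for illustration.

```yaml
# Hypothetical entry appended under deployment.hosts in
# ~/.config/openshift/installer.cfg.yml to add one more node:
- ip: 172.10.10.4
  hostname: vklnld1448.int.example.com
  public_ip: 24.222.0.4
  public_hostname: node3.example.com
  roles:
  - node
  connect_to: 10.0.0.4
```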
Advanced Configuration Method
In this method, we update the Ansible hosts file and then add the new node or server details to it. The configuration file looks like the following.
[OSEv3:children]
masters
nodes
new_nodes
new_master
In the same Ansible hosts file, add variable details regarding the new node, as shown below.
[new_nodes]
vklnld1448.int.example.com openshift_node_labels="{'region': 'primary', 'zone': 'east'}"
Finally, using the updated hosts file, run the new configuration and invoke the scaleup playbook to complete the setup using the following command.
$ ansible-playbook -i /inventory/hosts /usr/share/ansible/openshift-ansible/playbooks/test/openshift-node/scaleup.yml
Managing Cluster Logs
OpenShift cluster logs are nothing but the logs generated from the master and node machines of the cluster. These can be any kind of log, starting from server logs, master logs, container logs, pod logs, etc. There are multiple technologies and applications available for container log management.
A few of the tools which can be implemented for log management are listed below.
- Fluentd
- ELK
- Kibana
- Nagios
- Splunk
ELK stack − This stack is useful while trying to collect logs from all the nodes and present them in a systematic format. The ELK stack is mainly divided into three major categories.
ElasticSearch − Mainly responsible for collecting information from all the containers and putting it into a central location.
Fluentd − Used for feeding the collected logs to the Elasticsearch engine.
Kibana − A graphical interface used for presenting the collected data as useful information.
One key point to note is that when this system is deployed on the cluster, it starts collecting logs from all the nodes.
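On OpenShift, the aggregated (EFK) logging stack can be enabled from the Ansible inventory. The sketch below shows one such variable; note that logging-related variable names changed across openshift-ansible releases, so verify this against the documentation for your version.

```ini
# Sketch: enabling hosted aggregated logging from the inventory.
# Variable name is release-dependent -- confirm for your openshift-ansible version.
[OSEv3:vars]
openshift_hosted_logging_deploy=true
```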
Log Diagnostics
OpenShift has an inbuilt oc adm diagnostics command that can be used for analyzing multiple error situations. This tool can be used from the master as a cluster administrator. The utility is very helpful in troubleshooting and diagnosing known problems. It runs on the master clients and nodes.
If run without any arguments or flags, it will look for configuration files of the client, server, and node machines, and use them for diagnostics. One can run the diagnostics individually by passing any of the following arguments −
- AggregatedLogging
- AnalyzeLogs
- ClusterRegistry
- ClusterRoleBindings
- ClusterRoles
- ClusterRouter
- ConfigContexts
- DiagnosticPod
- MasterConfigCheck
- MasterNode
- MetricsApiProxy
- NetworkCheck
- NodeConfigCheck
- NodeDefinitions
- ServiceExternalIPs
- UnitStatus
One can simply run them with the following command.
$ oc adm diagnostics <DiagnosticName>
Upgrading a Cluster
Upgrading the cluster involves upgrading multiple things within the cluster and getting the cluster updated with new components and upgrades. This involves −
- Upgrading master components
- Upgrading node components
- Upgrading policies
- Upgrading routes
- Upgrading image streams
In order to perform all these upgrades, we first need to get the quick installers or utils in place. For that, we need to update the following utilities −
- atomic-openshift-utils
- atomic-openshift-excluder
- atomic-openshift-docker-excluder
- etcd package
Before starting the upgrade, we need to back up etcd on the master machine, which can be done using the following commands.
$ ETCD_DATA_DIR=/var/lib/origin/openshift.local.etcd
$ etcdctl backup \
--data-dir $ETCD_DATA_DIR \
--backup-dir $ETCD_DATA_DIR.bak.<date>
Upgrading Master Components
On the OpenShift master, we start the upgrade by updating etcd and then moving on to Docker. Finally, we run the automated executer to get the cluster into the required state. However, before starting the upgrade, we first need to make the atomic-openshift packages available on each of the masters. This can be done using the following commands.
Step 1 − Remove the atomic-openshift packages from the yum exclude list.
$ atomic-openshift-excluder unexclude
Step 2 − Upgrade etcd on all the masters.
$ yum update etcd
Step 3 − Restart the etcd service and check if it has started successfully.
$ systemctl restart etcd
$ journalctl -r -u etcd
Step 4 − Upgrade the Docker package.
$ yum update docker
Step 5 − Restart the Docker service and check if it is correctly up.
$ systemctl restart docker
$ journalctl -r -u docker
Step 6 − Once done, reboot the system with the following commands.
$ systemctl reboot
$ journalctl -r -u docker
Step 7 − Finally, run the atomic-executer to put the packages back on the list of yum excludes.
$ atomic-openshift-excluder exclude
There is no such compulsion for upgrading the policy; it only needs to be upgraded if recommended, which can be checked with the following command.
$ oadm policy reconcile-cluster-roles
In most cases, we don't need to update the policy definition.
Upgrading Node Components
Once the master update is complete, we can start upgrading the nodes. One thing to keep in mind is that the upgrade window should be short in order to avoid any kind of issue in the cluster.
Step 1 − Remove all atomic OpenShift packages from the yum exclude list on all the nodes where you wish to perform the upgrade.
$ atomic-openshift-excluder unexclude
Step 2 − Next, disable node scheduling before the upgrade.
$ oadm manage-node <node name> --schedulable=false
Step 3 − Evacuate all the pods from the current node onto other hosts.
$ oadm drain <node name> --force --delete-local-data --ignore-daemonsets
Step 4 − Upgrade the Docker setup on the host.
$ yum update docker
Step 5 − Restart the Docker service and then restart the node service.
$ systemctl restart docker
$ systemctl restart atomic-openshift-node
Step 6 − Check if both of them started correctly.
$ journalctl -r -u atomic-openshift-node
Step 7 − After the upgrade is complete, reboot the node machine.
$ systemctl reboot
$ journalctl -r -u docker
Step 8 − Re-enable scheduling on the nodes.
$ oadm manage-node <node> --schedulable=true
Step 9 − Run the atomic-openshift excluder to put the OpenShift packages back on the node's yum exclude list.
$ atomic-openshift-excluder exclude
Step 10 − Finally, check if all the nodes are available.
$ oc get nodes
NAME STATUS AGE
master.example.com Ready 12d
node1.example.com Ready 12d
node2.example.com Ready 12d