Remove a host from a cluster

Removing a host from a cluster is a tricky operation that may be required when a node becomes unavailable. Normally, running the Remove, Move, or Decommiss actions on an unavailable host fails with unavailability errors. Maintenance mode was introduced so that such errors can be ignored.

To remove a host from a cluster using maintenance mode, follow these steps.

Step 1. Prepare the host

  1. Turn on maintenance mode on the host you want to remove. You can do that on the Hosts page or on the Hosts tab of the cluster page by clicking the maintenance mode icon.

    Figure: Enabling the maintenance mode
  2. Check that the number of live DataNodes is not smaller than the HDFS replication factor. Otherwise, add a new DataNode or reduce the dfs.replication parameter value. This is necessary because an insufficient number of replicas puts the NameNode into safe mode, which makes DataNodes and dependent components (HBase RegionServer, Tez, etc.) go down.
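
The comparison between the replication factor and the number of live DataNodes can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, the sample report text is made up, and the parser assumes that the output of `hdfs dfsadmin -report` contains a line of the form "Live datanodes (N):". The replication factor itself would have to be read separately, for example with `hdfs getconf -confKey dfs.replication`.

```python
import re


def parse_live_datanodes(report_text: str) -> int:
    """Extract the live DataNode count from a dfsadmin report.

    Assumption: the report contains a line such as "Live datanodes (3):".
    """
    match = re.search(r"Live datanodes \((\d+)\)", report_text)
    if match is None:
        raise ValueError("no 'Live datanodes (N)' line found in the report")
    return int(match.group(1))


def replication_is_satisfiable(replication_factor: int, live_datanodes: int) -> bool:
    """Return True if there are enough live DataNodes to hold all replicas."""
    return live_datanodes >= replication_factor


if __name__ == "__main__":
    # Made-up fragment standing in for real `hdfs dfsadmin -report` output.
    sample_report = "Live datanodes (2):\n"
    live = parse_live_datanodes(sample_report)
    # With dfs.replication = 3 and only 2 live DataNodes the check fails,
    # so a DataNode must be added or dfs.replication reduced.
    print(replication_is_satisfiable(3, live))
```

If the check fails, proceed as described in the step above before removing any components.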

Step 2. Remove components from the host

To remove all components from a host, repeat the following steps for each component:

  1. Check if a component is present on the host using the Mapping tab.

    Figure: The Mapping tab
  2. If it is present on the unavailable host, go to the Services tab, find the corresponding service, and run the action that removes that component.

    Figure: The service action menu
  3. In the pop-up window, click the close icon next to the unavailable host. After that, click Run and wait for the job to complete.

    Figure: The process of a component removal
Special cases

 

  • Some components cannot be removed from a host until they have been added to another host (e.g. HBase Master Server). In that case, run the Add action for the component in the corresponding service action menu and select an additional host for it before removing it from the unavailable host.

Figure: The process of a component addition
  • Some components can have only one instance in the cluster (e.g. MapReduce History Server), so they need to be moved to another host using the Move action of the corresponding service. Keep in mind that while a host is in maintenance mode, the only operation allowed on it is removing components. For the Move action, turn the maintenance mode off and turn it back on once the move is complete.

  • If a component is an HDFS DataNode or a YARN NodeManager, it must be decommissioned before it is removed. You can do that by running the Maintenance/Decommiss DataNodes and Decommiss/Recommiss NodeManagers actions for HDFS and YARN, respectively.

  • Some services need to be restarted after certain components have been moved. Restart Hive and Spark after moving or removing YARN components; HBase after moving HDFS NameNodes; Spark after moving Hive Metastore; and all HA services (Hive, YARN, Flink) after moving, removing, or adding a ZooKeeper Server.
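
The special cases above amount to two lookups: which actions a component needs before it leaves the host, and which services need a restart afterwards. A minimal Python sketch of that bookkeeping follows. The category names and function names are hypothetical and exist only for illustration; the action sequences and restart rules are taken from the list above, and nothing here calls the cluster manager itself.

```python
from typing import Dict, List, Set

# Hypothetical categories named after the special cases above.
ACTIONS_BY_KIND: Dict[str, List[str]] = {
    # No special case applies.
    "regular": ["Remove"],
    # Must exist on another host first (e.g. HBase Master Server).
    "needs_replacement": ["Add", "Remove"],
    # Only one instance per cluster (e.g. MapReduce History Server).
    "single_instance": [
        "Turn maintenance mode off",
        "Move",
        "Turn maintenance mode on",
    ],
    # HDFS DataNode or YARN NodeManager.
    "decommissionable": ["Decommiss", "Remove"],
}

# Changed component -> services to restart, as listed above.
RESTARTS_BY_COMPONENT: Dict[str, Set[str]] = {
    "YARN component": {"Hive", "Spark"},
    "HDFS NameNode": {"HBase"},
    "Hive Metastore": {"Spark"},
    "ZooKeeper Server": {"Hive", "YARN", "Flink"},
}


def plan_actions(kind: str) -> List[str]:
    """Return the ordered actions for a component of the given category."""
    return list(ACTIONS_BY_KIND[kind])


def services_to_restart(changed_components: List[str]) -> List[str]:
    """Return the sorted, de-duplicated services to restart."""
    services: Set[str] = set()
    for component in changed_components:
        services |= RESTARTS_BY_COMPONENT.get(component, set())
    return sorted(services)


if __name__ == "__main__":
    print(plan_actions("single_instance"))
    print(services_to_restart(["HDFS NameNode", "Hive Metastore"]))
```

Collecting the restarts into one set means that a service affected by several changes, such as Spark here, is restarted once at the end rather than after each change.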

Step 3. Remove the host

Once all the components have been removed from the unavailable host, you can remove it from the cluster by clicking the unlink icon.

Figure: The process of a host removal

Next, confirm the action by clicking Unlink in the pop-up window.

Figure: An action confirmation window