Planning Your Core Upgrade

Describes how to develop a successful plan for your upgrade process.

The key to a successful upgrade process is to plan the process ahead of time. This page helps you develop an upgrade process that fits the needs of your cluster and users.

Choosing a Cluster Upgrade Method

The sections below describe the offline and rolling upgrade methods, and Upgrade Workflows (Release 6.x to 7.0.0) describes them in more detail. The method you choose affects the flow of events while upgrading packages on nodes and the duration of the maintenance window.

Offline Upgrade

The offline upgrade process is simpler than a rolling upgrade, and usually completes faster. In an offline upgrade, data-fabric software processes and the jobs that depend on them are stopped on all nodes so that packages can be updated. Offline upgrade is the default upgrade method when other methods cannot be used.

Offline Upgrade Paths without the Installer

You can perform an offline upgrade from the following core versions:

  • Release 6.x
  • Release 5.x
  • Release 4.1
  • Release 4.0.x
  • Release 3.x

NOTE After upgrading core to release 6.0 or later, you must upgrade ecosystem components to an EEP that is compatible with your core 6.0 or later release. To determine the compatible EEPs, see EEP Support and Lifecycle Status. This must be done before you enable core features.

During the maintenance window, the administrator:
  • Stops all jobs on the cluster.
  • Stops all cluster services.
  • Upgrades packages on all nodes (which can be done in parallel).
  • Brings the entire cluster back online at the same time.
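As a rough illustration, the offline upgrade paths listed above can be written as a small pre-flight check. This is only a sketch: the function names are hypothetical and are not part of any data-fabric tool, and the check assumes that release strings are plain dotted numbers such as 5.2.1.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Turn a release string such as '5.2' into a padded tuple like (5, 2, 0)."""
    parts = [int(piece) for piece in version.split(".")]
    parts = (parts + [0, 0, 0])[:3]
    return (parts[0], parts[1], parts[2])


def offline_upgrade_supported(core_version: str) -> bool:
    """Hypothetical helper: is this release on the offline upgrade path list?"""
    major, minor, _ = parse_release(core_version)
    if major in (3, 5, 6):   # Release 3.x, 5.x, and 6.x
        return True
    if major == 4:           # Release 4.0.x and 4.1
        return minor in (0, 1)
    return False


if __name__ == "__main__":
    for release in ("6.1.0", "4.1", "2.1.3"):
        print(release, offline_upgrade_supported(release))
```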

Rolling Upgrade

In a manual rolling upgrade, you upgrade the data-fabric software one node at a time so that the cluster as a whole remains operational throughout the process. The fileserver service on each node goes offline while packages are upgraded, but its absence is short enough that the cluster does not raise the data-under-replication alarm.

The following restrictions apply to rolling upgrades:

  • In release 6.0 and later, only manual rolling upgrades are supported. Scripted rolling upgrades are not supported.
  • Rolling upgrades only upgrade core packages, not ecosystem components. A rolling upgrade of ecosystem components is not supported.
  • If you choose to do a rolling upgrade on a cluster with core and ecosystem components, the ecosystem components will continue to work during the rolling upgrade as long as the ecosystem components are not updated. If you choose to upgrade core and ecosystem components together, the ecosystem components might not function properly during the upgrade process.
  • The administrator should block off a maintenance window, during which only critical jobs are allowed to run and users should expect longer-than-average run times.

Rolling Upgrade Paths

You can perform a manual rolling upgrade from only the following core versions:

  • Release 5.2.x with EEP 3.0.1 or later
  • Release 6.x with EEP 4.0.0 or later

NOTE After upgrading core, you must upgrade ecosystem components to EEP 4.0.0 or later, and this must be done before you enable release 6.x or later features. To determine the EEP required by your release, see EEP Support and Lifecycle Status.
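The rolling upgrade paths above pair a core release with a minimum EEP version, which can be sketched as a simple eligibility check. The function names are hypothetical, and the sketch assumes that both core and EEP versions are plain dotted numbers; it does not query a cluster.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Turn a version string such as '4.0' into a padded tuple like (4, 0, 0)."""
    parts = [int(piece) for piece in version.split(".")]
    parts = (parts + [0, 0, 0])[:3]
    return (parts[0], parts[1], parts[2])


def rolling_upgrade_supported(core_version: str, eep_version: str) -> bool:
    """Hypothetical helper: can this core/EEP pair start a manual rolling upgrade?"""
    core = parse_release(core_version)
    eep = parse_release(eep_version)
    if core[:2] == (5, 2):        # Release 5.2.x needs EEP 3.0.1 or later
        return eep >= (3, 0, 1)
    if core[0] == 6:              # Release 6.x needs EEP 4.0.0 or later
        return eep >= (4, 0, 0)
    return False                  # every other release: use an offline upgrade


if __name__ == "__main__":
    print(rolling_upgrade_supported("5.2.1", "3.0.1"))  # True
    print(rolling_upgrade_supported("5.1.0", "3.0.1"))  # False
```

A cluster that fails this check is not necessarily blocked from upgrading; it may still be on one of the offline upgrade paths.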

Updating the JDK

Check the JDK Support Matrix to verify that your JDK version is supported by the core version to which you are upgrading. Releases 6.0 and 6.1 require JDK 8. Releases 6.2.0 and later require JDK 11.
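The JDK requirements stated in this section can be expressed as a small lookup, shown here as a sketch only. The function name is hypothetical, releases earlier than 6.0 return None because this page does not state their requirement, and the JDK Support Matrix remains the authoritative source.

```python
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Turn a release string such as '6.2' into a padded tuple like (6, 2, 0)."""
    parts = [int(piece) for piece in version.split(".")]
    parts = (parts + [0, 0, 0])[:3]
    return (parts[0], parts[1], parts[2])


def required_jdk(target_core_version: str) -> Optional[int]:
    """Hypothetical helper: JDK major version stated on this page for a target release."""
    core = parse_release(target_core_version)
    if core[:2] in ((6, 0), (6, 1)):   # Releases 6.0 and 6.1
        return 8
    if core >= (6, 2, 0):              # Release 6.2.0 and later
        return 11
    return None                        # not covered here; consult the JDK Support Matrix


if __name__ == "__main__":
    for release in ("6.1.0", "6.2.0", "7.0.0"):
        print(release, required_jdk(release))
```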

Planning for Security

Security is not enabled by default for upgrades. During an upgrade, the security attributes of your cluster are preserved unless you decide to change them. Note that if you have configured security on a release 5.2.x cluster, you cannot use the Installer or Stanzas to upgrade. You must upgrade manually. For information about custom security, see Customizing Security in HPE Ezmeral Data Fabric.

Before upgrading core software, make sure that you have reviewed the list of known vulnerabilities in Security Vulnerabilities. If a vulnerability applies to your release, contact your HPE support representative for a fix, and apply the fix immediately, if applicable.

Scheduling the Upgrade

Consider the following factors when scheduling the upgrade:

  • When will preparation steps be performed? How much of the process can be performed before the maintenance window?
  • What calendar time would minimize disruption in terms of workload, access to data, and other stakeholder needs?
  • How many nodes need to be upgraded? How long will the upgrade process take for each node, and for the cluster as a whole?
  • When should the cluster stop accepting new non-critical jobs?
  • Will existing jobs be terminated and, if so, when?
  • How long will it take to clear the pipeline of current workload?
  • Will other Hadoop ecosystem components (such as Hive) be upgraded during the same maintenance window?
  • When and how will stakeholders be notified?
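To make the node-count and duration questions above concrete, the following sketch compares a rough maintenance-window estimate for the two methods. It is a deliberately simple model with hypothetical function names and made-up timings: it assumes that an offline upgrade updates all nodes fully in parallel, that a rolling upgrade handles exactly one node at a time, and it ignores the time needed to drain running jobs.

```python
from typing import Sequence


def offline_window_minutes(per_node_minutes: Sequence[float],
                           stop_minutes: float,
                           start_minutes: float) -> float:
    """Offline: stop services, upgrade all nodes in parallel, restart the cluster."""
    # With parallel package upgrades, the slowest node sets the pace.
    return stop_minutes + max(per_node_minutes) + start_minutes


def rolling_window_minutes(per_node_minutes: Sequence[float]) -> float:
    """Rolling: nodes are upgraded one at a time, so per-node times add up."""
    return sum(per_node_minutes)


if __name__ == "__main__":
    per_node = [20, 25, 30]  # hypothetical per-node upgrade times, in minutes
    print(offline_window_minutes(per_node, stop_minutes=15, start_minutes=10))  # 55
    print(rolling_window_minutes(per_node))  # 75
```

Even under these assumptions, the rolling estimate grows with the number of nodes while the offline estimate does not, which is consistent with the offline method usually completing faster.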

Planning Upgrades to Data Fabric Clients

Determine if you need to upgrade data-fabric client nodes. You upgrade data-fabric client nodes after you upgrade the cluster nodes but before enabling new features.

Data Fabric Client Nodes

On each data-fabric client node, upgrade to the client version that is compatible with the operations that you want to perform on the cluster. The following table shows which supported client operations are available based on the client version and the cluster version.

POSIX Client Nodes

On POSIX client nodes, the only supported client operation is file system access. As of release 5.1, FUSE-based POSIX clients are available in addition to loopback NFS clients.

POSIX loopback NFS clients can be upgraded, or a fresh install can be performed.

See Upgrading the Data Fabric POSIX loopbacknfs Client for more information.

NOTE Basic and Platinum POSIX client packages are recommended for fresh installation and for all new clusters.

The following table shows which loopback NFS client versions are supported by which data-fabric clusters. For example, the release 6.0 cluster supports 4.0.2, 4.1, 5.0, 5.1, and 5.2 loopback NFS clients.

Table 1. Loopback NFS POSIX Client Upgrades

              7.x Client    6.x Client
7.x Cluster   Yes           Yes
6.x Cluster   Yes           Yes

Determining Cross-Cluster Feature Support

HPE Ezmeral Data Fabric supports features that operate on more than one cluster. Before you upgrade, consider the impact of the following cross-cluster features:

Volume Mirroring
Volume mirroring works from a lower version to a higher version irrespective of the features that you enable on the higher version. For example, you can mirror volumes from a release 6.1 cluster to a release 6.2 cluster irrespective of whether or not you have enabled the new features present in the release 6.2 version.

However, volume mirroring from a higher release version to a lower release version works only when you enable identical sets of features on both clusters. For example, you can mirror volumes from a release 6.2 cluster to a release 6.1 cluster only if you do not enable new features that are present on the release 6.2 cluster.
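The two mirroring rules above can be sketched as a single check. The function names and the feature name used in the example are hypothetical. The text does not cover two clusters on the same release, so the sketch assumes, conservatively, that identical feature sets are required in that case as well.

```python
from typing import Set, Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Turn a release string such as '6.2' into a padded tuple like (6, 2, 0)."""
    parts = [int(piece) for piece in version.split(".")]
    parts = (parts + [0, 0, 0])[:3]
    return (parts[0], parts[1], parts[2])


def mirroring_supported(source_version: str, dest_version: str,
                        source_features: Set[str],
                        dest_features: Set[str]) -> bool:
    """Hypothetical helper: can volumes be mirrored from source to destination?"""
    source = parse_release(source_version)
    dest = parse_release(dest_version)
    if source < dest:
        # Lower to higher release: works regardless of enabled features.
        return True
    # Higher to lower release: the enabled feature sets must be identical.
    # Same release: ASSUMPTION, treated the same way (not stated in the text).
    return source_features == dest_features


if __name__ == "__main__":
    new_feature = {"hypothetical.new.feature"}
    print(mirroring_supported("6.1.0", "6.2.0", set(), new_feature))  # True
    print(mirroring_supported("6.2.0", "6.1.0", new_feature, set()))  # False
    print(mirroring_supported("6.2.0", "6.1.0", set(), set()))        # True
```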

Table Replication
Table replication works between clusters of different versions as long as both versions support HPE Ezmeral Data Fabric Database table replication. For example, you can replicate HPE Ezmeral Data Fabric Database binary tables from a release 6.2 cluster to a release 6.0 cluster.
NOTE As of release 5.2, HPE Ezmeral Data Fabric Database JSON table replication is also supported. You cannot replicate HPE Ezmeral Data Fabric Database JSON tables to a cluster that runs a version prior to release 5.2.

Policy-Based Security
An upgraded data-fabric platform has all the policy-based security features set to the default values:
  • Upgraded volumes are not tagged with any security policies, and have the enforcementMode setting at its default (PolicyAceAndDataAce). Determination of access rights is based on the existing access determination algorithm:

    Grant access if Permitted(mode bits) OR Permitted(ACE)
  • Files and directories are not tagged with any security policies.
  • After enabling the policy-based security feature, use maprcli, the extended attribute commands, and the Java, C, and Hadoop APIs to tag volumes, files, and directories.
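The access rule quoted above, which applies to upgraded volumes that have no security policies, is a plain logical OR. The sketch below only restates that rule; the function name and its boolean inputs are hypothetical stand-ins for the results of the mode-bit and ACE checks, not calls into any data-fabric API.

```python
def access_granted(mode_bits_permit: bool, ace_permits: bool) -> bool:
    """Grant access if Permitted(mode bits) OR Permitted(ACE)."""
    return mode_bits_permit or ace_permits


if __name__ == "__main__":
    # Print the full truth table for the rule.
    for mode_bits_permit in (False, True):
        for ace_permits in (False, True):
            print(mode_bits_permit, ace_permits,
                  access_granted(mode_bits_permit, ace_permits))
```

Access is denied only when both the mode bits and the ACE deny it.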

Planning for the Ecosystem Pack

To plan for the Ecosystem Pack (EEP), see Planning Ecosystem Pack Upgrades.

What's Next

Go to Preparing to Upgrade Core.