ADH releases

3.2.4.2

3.2.4.2.b2

 
     Date: 27.04.2024

  • Improvements

  • Bug fixes

Removed the need to install Axiom JDK when using Astra Linux

Added the ability to set a custom value to the JAVA_HOME variable

Configuration for Shiro Simple Authentication displayed passwords in plain text

Package conflict when updating ADH and installing the SSM service

3.2.4.2.b1

 
     Date: 27.03.2024

  • New features

  • Bug fixes

  • Misc/Internal

Added the SSM service

Added the Kyuubi service

Added the JDBC checks

The Manage Ranger plugin action is now customizable and more understandable

ADQM Spark connector is now included in the ADH bundle

Added the Spark Connect component for the Spark3 service

Corrected permissions on directories and files in the bundle so that the product can be installed on a file system with a umask of 027
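For reference, a umask of 027 clears the write bit for the group and every bit for other users. The sketch below is generic POSIX permission arithmetic and does not show the bundle's actual fix; the helper name effective_mode is hypothetical.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Return the permission bits that remain after the umask clears its bits."""
    return requested & ~umask


UMASK = 0o027

# Typical modes requested by programs before the umask is applied.
dir_mode = effective_mode(0o777, UMASK)   # new directories
file_mode = effective_mode(0o666, UMASK)  # new regular files

print(oct(dir_mode), stat.filemode(stat.S_IFDIR | dir_mode))
print(oct(file_mode), stat.filemode(stat.S_IFREG | file_mode))
```

Under this umask, new directories come out as 750 and new files as 640, so anything that must be readable by users outside the owning group needs its permissions set explicitly.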

Added ability to manage environment variables for Hive

Added ability to manage environment variables for HBase

Bumped Spark to 3.4.2_arenadata1

Bumped Hive to 3.1.3_arenadata6

Bumped ADB Spark connector to 1.0.5-spark-3.4.x

Incorrectly generated hdfs-site.xml during the YARN’s Manage Ranger plugin action

Spark3 didn’t pick up SSL settings after the Remove/Install actions

ADB PySpark connector errors

LDAP didn’t work on Airflow2

Faulty removal of Spark2 in ADH 3.2.4

HDFS start balancer failed on Astra Linux

Wrong paths of logs for Impala in ADH v3.1.2_arenadata1_b1-2

HiveServer2 couldn’t start after upgrading ADH to 3.2.4 due to the SAN section in certificates

The SSM service is currently in the technology preview state and is not intended for use in a production environment. It is under development and is provided to clients for review

The path to the Python interpreter for the Spark 2/3 services has been changed from /opt/python3.10/bin/python3 to /opt/pyspark3-python/bin/python3. Take this into account when setting the PYSPARK_PYTHON service parameter
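A minimal sketch of how a stored PYSPARK_PYTHON value could be checked against the path change described above. The two paths come from this note; the helper name migrate_pyspark_python is hypothetical and is not part of ADH or Spark.

```python
OLD_PYTHON = "/opt/python3.10/bin/python3"
NEW_PYTHON = "/opt/pyspark3-python/bin/python3"


def migrate_pyspark_python(value: str) -> str:
    """Swap the old default interpreter path for the new one.

    Custom interpreter paths are returned unchanged, since they were
    chosen deliberately and are not affected by the default path change.
    """
    return NEW_PYTHON if value.strip() == OLD_PYTHON else value


print(migrate_pyspark_python("/opt/python3.10/bin/python3"))
print(migrate_pyspark_python("/usr/local/bin/python3"))
```

Only an exact match on the old default is rewritten, so a value that already points to a custom interpreter is left for the administrator to review.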

To install Impala on Red Hat, you need to manually install the cyrus-sasl-sql package. This will be fixed in the next version

ADCM minimum version is 2.0 now

3.2.4.1

3.2.4.1.b3

 
     Date: 27.04.2024

  • Improvements

Removed the need to install Axiom JDK when using Astra Linux

3.2.4.1.b2

 
     Date: 16.01.2024

  • Bug fixes

Faulty Spark 2/3 configuration when kerberizing

3.2.4.1.b1

 
     Date: 26.12.2023

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Upgraded Flink to 1.17.1_arenadata1 with the FLINK-32976 patch

Upgraded Tez to 0.10.1_arenadata1

Upgraded ADB Spark Connector to 1.0.5-spark-3.3.x with performance boosts, security improvements, and bug fixes

Added jdbc-tools 1.0 package

Upgraded Spark2 to 2.3.2_arenadata2 with the SPARK-31644 patch

Upgraded Hadoop to 3.2.4_arenadata1 with the following patches:

  • HDFS-14768: Busy DN replica should be consider in live replica check

  • HDFS-15186: Erasure Coding: Decommission may generate the parity block’s content with all 0 in some case

  • HADOOP-17340: TestLdapGroupsMapping failing -string mismatch in exception validation

  • HADOOP-15783: TestSFTPFileSystem.testGetModifyTime fails

  • YARN-9554: TimelineEntity DAO has java.util.Set interface which JAXB can’t handle

Added zstd support in HDFS

Added PMDK support in HDFS

Upgraded Spark3 to 3.3.2_arenadata1 with the SPARK-39910 patch (internal development)

Upgraded Spark3 Livy to 0.7.2_arenadata5

Upgraded Sqoop to 1.4.7_arenadata2

Upgraded Phoenix Query Server to 6.0.0_arenadata2

Upgraded Phoenix to 5.1.3_arenadata2

Upgraded HBase operator tools to 1.2.0_arenadata2

Upgraded Hive to 3.1.3_arenadata5 with the following patches:

  • HIVE-20344: PrivilegeSynchronizer for SBA might hit AccessControlException

  • HIVE-21844: HMS schema Upgrade Script is failing with NPE

Upgraded HBase to 2.4.17_arenadata1

Upgraded Airflow to 2.6.3

Excluded the vulnerability of the log4j library up to the version 2.15

Added log settings for Solr in ADCM

Added log settings for ZooKeeper in ADCM

Excluded Airflow1 from the bundle

Added the Thrift Server component for Spark3

Added the template for changing Ulimits to ADCM component configuration

Added support for Astra Linux 1.7 SE "Orel"

Reconfigured Hive connection settings to the metadata store database to improve flexibility

Updated the Postgres JDBC driver supplied with the distribution

Existing keytabs are no longer completely recreated when expanding/installing services

Error adding roles to shiro.ini when setting up Zeppelin

Spark3 Server failed to start after upgrade

Because of the Airflow 1 deprecation, it’s mandatory to remove it from the cluster with the upgrade. Airflow 2 can be installed instead

Minimum ADPS version — 1.1.0

3.1.2.1

3.1.2.1.b2

 
     Date: 30.11.2023

  • Bug fixes

Reconfiguration error due to HDFS being in SafeMode for a while

Error updating repository URL during offline upgrade

Upstream changes related to Phoenix Query Server are taken into account while updating HBase

Errors upgrading Community Edition cluster

3.1.2.1.b1

 
     Date: 20.10.2023

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Upgraded Flink to 1.16.2

Incorporated the following upstream updates:

  • HIVE-5312: Let HiveServer2 run simultaneously in HTTP (over thrift) and Binary (normal thrift transport) mode

  • HIVE-20187: Incorrect query results in hive when hive.convert.join.bucket.mapjoin.tez is set to true

  • HIVE-21940: Metastore: Postgres text <-> clob mismatch for PARTITION_PARAMS/PARAM_VALUE

  • HIVE-22914: Make Hive Connection ZK Interactions Easier to Troubleshoot

  • HIVE-19825: HiveServer2 leader selection shall use different zookeeper znode

Bumped Solr to 8.11.2 with fixed vulnerability of the log4j library

Bumped Sqoop to 1.4.7 with fixed vulnerability of the log4j library

Bumped HBase to 2.2.7

Bumped Phoenix to 5.1.3

Implemented Impala support

ADB Spark 3 Connector is now included in the ADH bundle

Implemented hbck2 (an HBase component)

Introduced the Maintenance mode, which makes it possible to remove any node from a cluster

Introduced High Availability auto-management for ADH services

Added the SQL Gateway component for Flink

Added PySpark 3 for customer installation

Added Knox SSO authorization for Zeppelin

Tweaked Kerberos management (enable, disable, configure)

Added logging settings for Spark in ADCM

Added the Precheck packages cluster parameter that enables/disables package checks. By default the checks are disabled

Added logging settings for Sqoop in ADCM

Airflow 2 services failed after kerberization and restart

Metadata and statistics errors for Hive

Hive configuration failed without an installed ZooKeeper on CE

The problem with config groups for Hive

Spark 3 check failed if Spark 2 was not installed

Inability to remove Spark 3 without mapped components

The problem with HDFS Balancer with enabled Kerberos

Minimum ADPS version — 1.0.5

Minimum ADCM version — 2023.10.10.08

2.1.10

2.1.10.b1

 
     Date: 21.06.2023

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Added an ability to select a TLS version for ADH services

Added support for custom Zeppelin interpreters

Added the new component Spark History Server for Spark3

The following upstream updates have been incorporated:

  • HIVE-20192: HS2 with embedded metastore is leaking JDOPersistenceManager objects

  • HIVE-20209: Metastore connection fails for first attempt in repl dump

  • HIVE-20511: REPL DUMP is leaking metastore connections

  • HIVE-20522: HiveFilterSetOpTransposeRule may throw assertion error due to nullability of fields

  • HIVE-20627: Concurrent async queries intermittently fail with LockException and cause memory leak

  • HIVE-20682: Async query execution can potentially fail if shared sessionHive is closed by master thread

  • HIVE-21018: Grouping/distinct on more than 64 columns should be possible

  • HIVE-26743: Backport HIVE-24694 to 3.1.x

  • HIVE-21206: Bootstrap replication is slow as it opens a lot of metastore connections

  • HIVE-22393: HiveStreamingConnection: Exception in beginTransaction causes AbstractRecordWriter to throw NPE, covering up real exception

  • HIVE-22841: ThriftHttpServlet#getClientNameFromCookie should handle CookieSigner IllegalArgumentException on invalid cookie signature

  • HIVE-24552: Possible HMS connections leak or accumulation in loadDynamicPartitions

  • HIVE-25830: Hive::loadPartitionInternal occur connection leak

  • HIVE-25268: date_format udf returns wrong results for dates prior to 1900 if the local timezone is other than UTC

  • HIVE-25075: Hive::loadPartitionInternal establishes HMS connection for every partition for external tables

ResourceManager high availability mode activates automatically

Spark can work with a custom Hive Metastore

Added SSL support for Hive Metastore

Added the Remove action for the MariaDB service with the created status

Updated Spark3 version to 3.3.2

ZooKeeper: added links to Admin Server endpoints on the ZooKeeper page in ADCM

Enhanced the Move action behavior

Fixed: can’t remove faulty installed Airflow 1 after Airflow 2

Zeppelin: fixed incorrect Hive JDBC string with enabled Hive HA
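With Hive HA enabled, clients usually locate HiveServer2 through ZooKeeper service discovery instead of a fixed host. The sketch below builds such a JDBC string in the upstream Hive URL format. It is an illustration only, not the string that Zeppelin generates in ADH: the function name and host names are hypothetical, and the hiveserver2 namespace and port 2181 are upstream defaults that a cluster may override.

```python
def hive_ha_jdbc_url(zk_hosts, namespace="hiveserver2", zk_port=2181):
    """Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery."""
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    # The authority part lists the ZooKeeper quorum, not HiveServer2 hosts.
    quorum = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return (
        f"jdbc:hive2://{quorum}/"
        f";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    )


print(hive_ha_jdbc_url(["zk1.example.com", "zk2.example.com", "zk3.example.com"]))
```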

Hive: fixed timezone for date_format()

Fixed NiFi hive3streaming Unable to close exception

Fixed GROUPING/DISTINCT limitations for Hive tables with 64+ columns

Hive: fixed the AssertionError for requests with union

ZooKeeper: changed the default port for Admin Server

2.1.8

2.1.8.b3

 
     Date: 02.03.2023

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Airflow2: added the high availability mode

Airflow2: added LDAP authentication/authorization support

Airflow2: added support for external broker configuration

Hive version updated to 3.1.3

The following upstream updates have been incorporated:

  • HIVE-20007: Hive carries out timestamp computations in UTC

  • HIVE-20221: Increase column width for partition_params

  • HIVE-20437: Handle schema evolution from float, double and decimal

  • HIVE-20833: package.jdo needs to be updated to conform with HIVE-20221 changes

  • HIVE-20839: Fix the "Cannot find field" error during dynamically partitioned hash join

  • HIVE-21050: Use Parquet LogicalTypes

  • HIVE-21215: Read Parquet INT64 timestamp

  • HIVE-21837: MapJoin throws an exception when selected column has completely null values

  • HIVE-21987: Hive is unable to read Parquet int32 annotated with decimal

  • HIVE-22476: Hive datediff function provided inconsistent results when hive.fetch.task.conversion is set to none

  • HIVE-22540: Vectorization: Decimal64 columns don’t work with VectorizedBatchUtil.makeLikeColumnVector(ColumnVector)

  • HIVE-22589: Add storage support for ProlepticCalendar in ORC, Parquet, and Avro

  • HIVE-22648: Upgrade Parquet to 1.11.0

  • HIVE-23345: INT64 Parquet timestamps cannot be read into bigint Hive type

  • HIVE-24074: Incorrect handling of timestamp in Parquet/Avro when written in certain time zones in versions before Hive 3.x

  • HIVE-25054: Upgrade jodd-core due to CVE-2018-21234

  • HIVE-25093: date_format() UDF is returning output in UTC time zone only

  • HIVE-25104: Backward incompatible timestamp serialization in Parquet for certain timezones

  • HIVE-25299: Casting timestamp to numeric data types is incorrect for non-UTC timezones

  • HIVE-25458: unix_timestamp() with string input returns wrong result

  • HIVE-25559: to_unix_timestamp() udf result is incorrect

  • HIVE-25577: unix_timestamp() is ignoring the time zone value

  • HIVE-26233: Problems reading back PARQUET timestamps above 10000 years

  • HIVE-26270: Wrong timestamps when reading Hive 3.1.x Parquet files with the vectorized reader

  • HIVE-26320: Incorrect results for IN UDF on Parquet column of CHAR/VARCHAR type

  • HIVE-26612: INT64 Parquet timestamps cannot be read into BIGINT Hive type

  • HIVE-26658: INT64 Parquet timestamps cannot be mapped to most Hive numeric types

  • HIVE-26955: Select query fails when decimal column data type is changed to string/char/varchar in Parquet

Changed the hive.server2.transport.mode default value to binary
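The transport mode affects how clients connect, so a JDBC string written for the HTTP mode may need adjusting after this change. The sketch below shows the difference, assuming the upstream Hive defaults (port 10000 for binary, port 10001 and the cliservice path for HTTP). The function name and host name are hypothetical, and the actual ports in a given cluster may differ.

```python
def hive_jdbc_url(host, mode="binary", database="default",
                  port=None, http_path="cliservice"):
    """Build a direct HiveServer2 JDBC URL for the given transport mode."""
    if mode == "binary":
        # Plain Thrift transport: no extra session parameters are needed.
        return f"jdbc:hive2://{host}:{port or 10000}/{database}"
    if mode == "http":
        # Thrift over HTTP: the mode and the endpoint path must be stated.
        return (
            f"jdbc:hive2://{host}:{port or 10001}/{database}"
            f";transportMode=http;httpPath={http_path}"
        )
    raise ValueError(f"unknown transport mode: {mode!r}")


print(hive_jdbc_url("hs2.example.com"))
print(hive_jdbc_url("hs2.example.com", mode="http"))
```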

The cluster_version_before_upgrade variable hidden from config.yaml.j2

HDFS: removed RW permissions from the dfs.permissions.superusergroup property

Spark3: added ADH python package to the installation routine

Implemented a custom configuration for container-executor.cfg

Fixed CORS errors in ResourceManager UI2

Fixed: no retries for the Disable balancer task during the HBase shrink action

Fixed: missing timestamps in the ansible log output in ADCM

Fixed: the configuration for checker-thriftserver is not overwritten during reconfiguration

Fixed: the livy_spark3_move action finished execution with unexpected result

Fixed: status checker does not run for Spark3 after reinstalling

Fixed the checker-thriftserver conflict with other services

Fixed wrong setting of dfs.datanode.https.address when running the Disable Kerberos action

Fixed NameNode stop sequence for the JournalNode expand action

Fixed: HDFS doesn’t stop with enabled Kerberos

Fixed issues with host cert key permissions

Fixed .jceks file permissions

Fixed: Sqoop Hive import fails with enabled Ranger plugin

Fixed the NameNode stop action behavior

Airflow2: fixed support for external database connections

Airflow2: fixed Redis configuration

Airflow2: fixed template usage for airflow.cfg (cfg_properties_template)

Airflow2: fixed permissions for Redis config directory

Airflow2: fixed the Logging level list

Fixed: the check that a Tez submit creates an entry in Timeline Server failed

Fixed the ZooKeeper zkCli.sh error with enabled Kerberos

Fixed: Spark checks failed after YARN expand

Fixed errors with disabling Kerberos

ADH MySQL Service: changed the display name and version

Fixed the description for name.dirs in config.yml

Fixed the description for spark.ssl.trustStore

Updated the hadoop.http.authentication.signature.secret.file behavior

Updated offline package versions

2.1.7

2.1.7.b1

 
     Date: 20.12.2022

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Added the livy-spark3 component to the Spark3 service

Added Hive delegation token

Added the Apply configs from ADCM checkbox for all services

Flink build 1.15.1 is available

Added the ability to connect to Flink JobManager in the high availability mode

Added package checks optimizations for the installation

Added a cleanup of MIT credentials after disabling Kerberos in an ADH cluster for the following services:

  • Hive

  • YARN

  • HDFS

Passwords are hidden in the Config check and Enable SSL actions for the following services:

  • YARN

  • HBase

  • HDFS

Refactored Livy impersonation options

Added the ability to configure Livy impersonation

Added additional xasecure.add-hadoop-authorization parameter

The Disable HA HiveServer2 action uses the deleteall command instead of rmr

Performed actions optimizations

Added the ability to delete a service in the created state

Fixed the High Availability mode activation for YARN ResourceManager

Fixed: failed to enable SSL with Airflow2 on AltLinux

Fixed: Flink TaskManager doesn’t start in the High Availability mode

Fixed: no metrics when Monitoring service was installed in a wrong order

Fixed: missing Spark3 actions Add Spark3 Livy, Remove Spark3 Livy

Fixed: Flink upgrade from 2.1.6 to 2.1.7 fails if Job Manager is not collocated with the HDFS Client

Fixed: missing parameter container-executor.class in yarn-site.xml

Fixed: Flink installation failed

Fixed the Manage ranger plugin action in HBase

Fixed SSL settings for the following services:

  • HDFS

  • YARN

  • Hive

Fixed: the Enable SSL action failed due to absence of ranger-hbase-plugin

Fixed the upgrade from 2.1.6.b4-1 to 2.1.7.b1-pre_rc

Fixed: incorrect minimum version for an upgrade

Fixed: the Enable HA/Disable HA actions lead to a configuration divergence of hive-site.xml on servers

Fixed the Disable HA HiveServer2 action error

Fixed HBase SASL error while connecting to a kerberized ZooKeeper (no hbase-jaas.conf)

Fixed the error when expanding Hive Server 2 on a host with Hive Metastore

Fixed the Add Hive Metastore action

Fixed: no description for the Reliability Control → timeout field

Fixed: the Scheduler type does not pass to the yarn_scheduler_jmx filter

Fixed: passwords shown in the Ansible log during the Enable SSL action

Fixed: wrong Spnego keytab value when expanding Hive Server 2

Fixed components list on the Hosts - Components page

Fixed: need to install the jdbc-mysql driver on nodes with the hive-client service when internal MariaDB is used

Fixed: the Enable Ranger plugin action for Hive failed if no YARN service and policies were defined in Ranger

Fixed: enabling SSL repeatedly failed

Fixed: the Spark Thriftserver check fails due to FairScheduler being used

Fixed: YARN reconfiguration fails if FairScheduler is enabled

Fixed errors with the order of hosts decommissioning

Fixed the bug that prevented running Flink on Yarn

Added install_flag: true to the Upgrade action

Fixed naming for service configs (HDFS, HBase, Hive)

Fixed naming for ADH Service actions

Fixed naming and typos for ADH install actions

Changed the sequence of actions for Kerberos, SSL, and cluster installation

2.1.6

2.1.6.b4

 
     Date: 03.11.2022

  • Bug fixes

Fixed: wrong Spnego keytab value when expanding Hive Server 2

2.1.6.b3

 
     Date: 28.10.2022

  • Bug fixes

Fixed: the expand operation with the HTTP principal failed

Fixed: the Disable/Enable SSL and Kerberos actions were available in any state

Fixed: Spnego error when disabling Kerberos for Hive

2.1.6.b2

 
     Date: 17.10.2022

  • New features

  • Bug fixes

Added support for customization of krb5.conf via ADCM

Fixed: HDFS gets corrupt because of missing Spark files

Fixed an issue with custom Hive configuration

2.1.6.b1

 
     Date: 16.09.2022

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Added support for AltLinux 8.4

Added support for customization of ldap.conf via ADCM

Added support for FreeIPA kerberization

Improved error handling when using cURL

Hive logging refactored

Fixed: cannot change parameters in ranger-yarn-policymgr-ssl.xml (YARN)

Fixed: hive.metastore.sasl.enabled cannot be changed (Hive)

Fixed a configuration name after Flink restarts

Fixed: Zeppelin user does not exist when enabling Ranger for YARN

Fixed: cannot change nameservice when faulty installed (HDFS)

Fixed: a cluster upgrade always switches state to Installed

Fixed: a Spark 3 upgrade does not disable old repositories on Alt Linux

Fixed: the Reconfigure Kerberos action has no allow_to_terminate attribute

Fixed: YARN Resource Manager does not start on Alt 8.4

Fixed ClassNotFoundException while executing spark-sql with enabled Ranger plugin

Fixed inconsistency between security settings in Spark and Spark3 configs

Changed check code due to Scala version upgrade

2.1.4

2.1.4.b11

 
     Date: 08.08.2022

  • New features

  • Improvements

  • Bug fixes

Added the ability to specify external nameservices

Added the ability to connect to HiveServer2 in the fault-tolerant mode

Cluster components states refactored

Refactored the order of Stop, Start, and Restart actions for the HDFS service

Enhanced monitoring metrics collection by YARN queues

Removed read-only attribute from the hadoop.security.auth_to_local property

Fixed the cluster kerberization status error after a bundle upgrade

Fixed: Ansible variable is not resolved during HDFS installation

The hive.zookeeper.quorum property gets reset on the Hiveserver2 Disable action

Fixed permissions for the dfs.datanode.data.dir directory

2.1.4.b10

 
     Date: 16.06.2022

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The ability to install ADH components from custom Docker registry is added

The check box Rewrite current service SSL parameters is added for the Enable SSL action

New parameters are added for the Zeppelin Hive interpreter — in order to enable SSL and Kerberos

Retries for generating Kerberos principals are implemented

Custom authentication (LDAP/AD) is enabled for HiveServer2

The Move action is added for Spark Livy Server

The Move action is added for Spark History Server

The Move action is added for Flink Job Manager

The Move action is added for Sqoop Metastore

The Move action is added for YARN Timeline Server

The Move action is added for YARN MapReduce History Server

The Ranger plugin for Solr authorization is added

The ability to remove services from the cluster is added

The ability to customize configuration files via ADCM is added

The support of Kerberos REALM is added

The Solr connection for audits is changed from Solr server to ZK node

Changing the property type from string to option when upgrading a bundle does not reset the property value anymore

SSL default configuration parameters are changed from invisible to read-only

Fixed: duplicate dictionary keys in the config.yaml file that did not pass the new YAML validation in ADCM

Fixed: the error with Zeppelin installation without Hive

Fixed: users could change some read-only Kerberos-related parameters in services

Fixed failing jobs when enabling the GPU on YARN property

Fixed the error with applying the Remove service action to Spark3

Fixed the error with an incorrect value in interpreter.json in Zeppelin when SSL is off

Fixed: policies did not work in Ranger after enabling SSL

Fixed: the Container DN format when applying Enable Kerberos

Fixed the error with incorrect saving of the HBASE_MASTER_OPTS value in HBase

Fixed: YARN failed to copy container-executor.cfg

Fixed: job status and result did not match when deleting optional (unnecessary) settings in the ZooKeeper service configuration

Fixed: the action Check applied to Flink failed if hosts in ADCM had uppercase letters

Fixed: services did not collect policy from Ranger in SSL

Fixed: applying the action Remove internal database actually removed the service itself

A fixed Ranger Solr plugins repository is added for ADH 2.1.4

The order of bundle upgrades is changed from particular to general

Dependencies between components and services at the ADCM level are implemented

Ranger Plugins are bumped to 1.0.3

The ability to download ADH offline packs from the Arenadata source directory to the customer proxy repository is added

2.1.4.b9

 
     Date: 31.03.2022

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The Kerberos authentication is enabled for Web UI

SSL for Ranger plugins is enabled

SSL for Flink is enabled

SSL for Sqoop is enabled

The rollback operation is enabled in the case of the failed kerberization process

SSL for Zeppelin is enabled

SSL for Airflow is enabled

SSL for Spark is enabled

SSL for Solr is enabled

SSL for Hive is enabled

SSL for HBase is enabled

SSL for YARN is enabled

The ability to configure SSL in the Hadoop clusters is added

SSL for HDFS is enabled

The Custom hive-site.xml block is placed after the hive-site.xml block in the configuration settings

The links to NameNodes and HttpFS are moved to the top of the HDFS web links list

The order of cluster stop actions is reversed

The Reconfig and restart action is replaced by the Restart action that runs three operations: stops the service, applies configuration parameters, and starts the service

The ability to execute the resourcemanager_enable_ha action without changing hc_map is disabled for Resource Manager

The ability to execute the resourcemanager_expand action without enabling High Availability (HA) is disabled for Resource Manager, since it does not work without enabling HA

Fixed: the parameters from the httpfs-site.xml configuration file did not apply to the HttpFS service

Fixed: SQL queries launched from Spark3 or Spark2 did not work correctly with the Ranger Hive plugin being enabled

Fixed: the keystore/truststore parameters specified in the service settings did not override the default cluster settings during the per-service installation

Fixed: the Phoenix Query Server could not work in the thin mode in the kerberized environment

Fixed: the error with running spark-thrift-server-checker after enabling and disabling Kerberos

Fixed: the error with saving the Flink configuration parameters after installation in the kerberized environment

The ability to work with kerberized ADH clusters is fixed for the Windows operating system

Fixed the Cluster Monitoring imported, but not installed error that occurred during the cluster installation

Fixed the error with checking privileges to the public schema of the ranger database at the preconfigure-check stage

The cluster installation errors at the HBase check stage are fixed

Fixed: application logs were unavailable in the legacy Resource Manager UI of the kerberized clusters

Fixed: Kerberization on AD failed if two instances of Ranger Admin were installed

Configuring SSL settings is added before enabling SSL in autotests

Web links are rewritten to support the http/https schema change

2.1.4.b6

 
     Date: 21.12.2021

  • Improvements

  • Bug fixes

Refactoring of the database for Hive Metastore checks is done

The error with the kdc_type parameter being set to null after the cluster upgrade is fixed

Fixed the error with MapReduce jobs launched in the kerberized cluster by a user other than yarn

Fixed: mapped but not installed services caused errors during the installation of other services

2.1.4.b5
  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The HTTP mode is added for HiveServer2

The AD/LDAP/SIMPLE authorization is added for Zeppelin

The HBase REST Server component is added for HBase

The ability to use Active Directory as Kerberos storage is implemented

The ability to set a Kerberos principal for running Spark jobs via YARN is added. Previously, Spark always launched tasks using the yarn principal

Fixed the error with DataNodes expanding via the Add DataNode action in the kerberized environment

Fixed the error with the Livy interpreter working in the kerberized environment

Fixed the error with the Hive interpreter working in the kerberized environment

Fixed the error with Zeppelin checks after Kerberos activation

Fixed the error with removing the monitoring component jmxtrans

Fixed: the Enable Resource Manager HA action failed in the clusters with Kerberos and Ranger plugin being enabled

Fixed the error with Airflow installation in the kerberized environment

Fixed the error with enabling Kerberos after its disabling (enable → disable → enable)

The full stack testing for using the RedHat 7.9 enterprise license in ADH is added

2.1.4.b4

 
     Date: 01.11.2021

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The Reinstall status-checker action is implemented. It runs the status-checker deployment scripts for services as well as for Docker containers

The Solr check is changed: the number of live Nodes is compared instead of lists
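The reason for this change is not stated here. One plausible motivation is that Solr reports live nodes as strings such as host:port_solr, which never match plain host names, so a list comparison needs normalization while a count comparison does not. The sketch below illustrates that idea under this assumption; the function name and sample host names are hypothetical and do not reproduce the actual check.

```python
def live_node_count_matches(expected_hosts, live_nodes):
    """Compare how many distinct nodes are live against how many are expected."""
    return len(set(live_nodes)) == len(set(expected_hosts))


expected = ["node1.example.com", "node2.example.com"]
live = ["node1.example.com:8983_solr", "node2.example.com:8983_solr"]

# A direct list comparison fails because the string formats differ.
print(sorted(expected) == sorted(live))
# The count comparison succeeds as long as every expected node is up.
print(live_node_count_matches(expected, live))
```

A count comparison is weaker than a normalized list comparison: it cannot tell which node is missing, only that the totals disagree.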

The timeout/retry count is increased for the Zeppelin check

Fixed: the bundle version update error

Fixed: the Ranger plugin worked incorrectly in the case of using some characters in the cluster name

Fixed: the maintenance state of DataNodes was inconsistent after upgrading ADH from 2.1.3 to 2.1.4

Fixed: Keytabs permissions changed during some actions

Fixed: the error with per-service installation in the kerberized ADH clusters

Fixed: the error with parsing the list of containers by the Docker status checker in Airflow

The error with the heap size test is fixed

The broken compatibility with the current dev version of ADCM is fixed

The test logic for per-service installation in the kerberized ADH clusters is changed: before installing each service, it is necessary to add the service to the cluster and add its components to hosts (instead of adding all components to all hosts)

2.1.4.b3

 
     Date: 30.09.2021

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The MIT Kerberos integration is implemented in ADCM

The ability to add the custom port for Kerberos Server is added

Ranger plugin and kerberized YARN are integrated

Ranger plugin and kerberized Hive are integrated

Ranger plugin and kerberized HBase are integrated

Ranger plugin and kerberized HDFS are integrated

The Ranger plugin is made operable on kerberized services

The split memory option is added for Hive services: resource management options can be configured for HiveMetastore and HiveServer2 separately

The edit memory size option is added for Flink components

The edit memory size option is added for Solr components

The edit memory size option is added for Sqoop components

The edit memory size option is added for Spark components

The edit memory size option is added for Zeppelin components

The edit memory size option is added for HBase components

The edit memory size option is added for YARN components

The Add/Remove actions are added for YARN Timeline server

The Add/Remove actions are added for Sqoop Metastore

The edit memory size option is added for HDFS components

The ADH memory management option is added

The Add/Remove actions are added for Flink Job Manager

The Add/Remove actions are added for Spark Thrift Server

The Add/Remove actions are added for Spark Livy Server

The Add/Remove actions are added for Spark History Server

The Add/Remove actions are added for Hive Tez UI

The Move action is added for YARN MapReduce History Server

Kerberos is implemented for ADH in ADCM

The ability to move any service component to another Node or remove it from the cluster is added

The unnecessary repository/packages check at the HDFS installation step for ADH EE is removed

The path for docker-status-checker files is changed

Fixed the error with the refreshUserToGroupsMappings task in the kerberized cluster

Fixed the error with Solr not working after applying host actions

Fixed the error with the Enable Resource Manager HA action in the kerberized environment

Fixed the error with Spark expanding/shrinking in the kerberized environment

Fixed the error with Solr shrinking in the kerberized environment

Fixed the error with YARN Node Manager expanding in the kerberized environment

Fixed the error Keytab does not exist: $user.home/hadoop.keytab

Fixed the error with Flink Server expanding/shrinking

Fixed the error with Flink JobManager Server Port 6123 availability

Fixed the error with Sqoop expanding in the kerberized environment

Fixed the error with YARN Server expanding/shrinking

Fixed: the Solr role tried to import the absent monitoring role even when the monitoring service was not installed

Fixed the error with running the Install service action for Solr

Fixed the error with Spark Livy server expanding in the kerberized environment

Fixed the error with Spark Thrift Server shutting down in the kerberized environment

The Solr shared memory reservation error is fixed

Fixed the error with Solr kerberization with no Hadoop services being added

Fixed the error with starting jmxtrans after the host reboot

The incorrect URL for the Hive Server Web UI is fixed

Fixed the error with opening link to the HiveServer2 UI after ADH installation

Fixed the error with availability of host actions after the cluster upgrade from 2.1.3.0 to 2.1.4.b2

Fixed: the Enterprise cluster installation failed during the per-service action

Fixed: the Reconfig and restart action failed for the Monitoring service with Airflow being installed

Fixed: the Spark ThriftServer process did not stop after the Spark context was killed

In order to speed up autotests and development process, the packages check is made optional for the specified environments

The http and registry versions are bumped to the current ET release

The specifications for new Spark and YARN MapReduce History Server actions are added

2.1.4.b2

 
     Date: 20.07.2021

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The ability to use external MySQL in Airflow is added

The ability to use external PostgreSQL in Airflow is added

Host actions are added for the Spark3 service. Host actions here and below mean actions managed at the host level

Host actions are added for the Monitoring service

Host actions are added for the Sqoop service

Host actions are added for the Airflow service

Host actions are added for the Solr service

Host actions are added for the Flink service

Host actions are added for the Zeppelin service

Host actions are added for the Spark service

Host actions are added for the MySQL service

Host actions are added for the Hive service

Host actions are added for the HBase service

Host actions are added for the YARN service

Host actions are added for the HDFS service

The Sqoop Check action is modified according to the new Hive external DB variables

Host actions are renamed

The unnecessary solr-tools.jar file is removed from the Solr submodule in the bundle, as it caused errors in CI

The error with offline installation in the operating system RH 7.9 is fixed

The error with applying the Reconfig and restart action to the Monitoring service is fixed

Fixed the error with installing MySQL on the host from which it was removed earlier

Fixed the error with the CPU utilization YARN metric after the ADH cluster installation

Fixed the error with the Maintenance DataNode action that occurred due to the incorrect content of the dfs.hosts file (if another DataNode has been switched to the maintenance state earlier)
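
As a rough illustration of the failure mode described above, the sketch below updates the content of a hosts file so that switching one DataNode to maintenance keeps the entries of DataNodes that were switched earlier. The JSON layout (hostName, adminState) is assumed to follow Hadoop's combined host file format; the function name set_admin_state and the host names are hypothetical and are not taken from the ADH bundle.

```python
import json


def set_admin_state(hosts_json: str, host: str, state: str) -> str:
    """Return updated hosts-file content with `host` set to `state`.

    Entries for other hosts are kept as-is, so a DataNode that was put
    into maintenance earlier does not lose its state.
    """
    entries = json.loads(hosts_json) if hosts_json.strip() else []
    for entry in entries:
        if entry.get("hostName") == host:
            entry["adminState"] = state
            break
    else:
        # Host not listed yet: append a new entry instead of rewriting the file.
        entries.append({"hostName": host, "adminState": state})
    return json.dumps(entries, indent=2)


# Hypothetical host names, for illustration only.
content = set_admin_state("", "dn1.example.com", "IN_MAINTENANCE")
content = set_admin_state(content, "dn2.example.com", "IN_MAINTENANCE")
print(content)
```

After the second call both DataNodes are still listed, which is the behavior the fix is expected to guarantee.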

For debugging possible problems via Allure reports, log collection is implemented for the Airflow service

Fixed the wrong description in the autotest that implements migration to the external MySQL database

Specifications for testing host actions are changed

Tests for host actions are added

Specifications and autotests are added for the ADH shrink scenarios

2.1.4.b1

 
     Date: 22.06.2021

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The ability to define custom HBase environment variables is added

The action for removing MySQL from the ADH cluster is added

The ability to use external PostgreSQL in Hive Metastore is added

The ability to change the Hive Metastore host:port is added

The ability to configure Java Heap for HiveServer2 is added

The ability to add/change/remove configuration options from the httpfs-site.xml file via ADCM is added

Start checks for JournalNodes and NameNodes are added

Spark 3.1.1 is implemented for ADH 2.X

The offline installation is implemented for ADH

The Check action is improved for Sqoop

The build process for Solr is changed: Arenadata repositories are used instead of external Maven repositories

The ability to use Docker Registry from Arenadata repository is implemented

In order to install services (e.g. Airflow) without DNS, resolving host names in the Docker containers is implemented

Airflow installation without DNS is implemented

Implemented the DataNode check/wait for membership in the cluster from the DataNode itself

The Spark component Spark History Server is made mandatory

Refactoring of the ZooKeeper service is done

Fixed: Hive installation failed when adding the Tez component without TezUI

Fixed: the duplicate key type: list is removed from the bundle configuration file

Fixed: YARN applications could not run jobs after the Ranger plugin was enabled

Fixed the error with the Enable Resource Manager HA action in the ADH Community Edition

Fixed the error with Sqoop installation after Ranger was installed

Fixed problems with HBase logging

Docker images in the package specifications are changed according to the new naming convention

Packages for the 2.1.4 release are uploaded to the Google repository

Fixed log collection for HttpFS during autotests

The repository for the 2.1.4 version of the product is created

The ability to update a bundle without rebuilding packages is added

Unnecessary garbage files are removed from the bundle build archive

To resolve an ansible-lint issue, the pipefail option is added for shell tasks
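
The effect of pipefail can be seen outside Ansible as well. The following sketch assumes that bash is available on the PATH and compares the exit status of the same pipeline with and without the option; the helper name pipeline_status is hypothetical.

```python
import subprocess


def pipeline_status(command: str, pipefail: bool) -> int:
    """Run `command` in bash and return its exit status."""
    prefix = "set -o pipefail; " if pipefail else ""
    return subprocess.run(["bash", "-c", prefix + command]).returncode


# `false` fails, but `cat` succeeds and is the last command in the pipeline.
without_option = pipeline_status("false | cat", pipefail=False)
with_option = pipeline_status("false | cat", pipefail=True)
print(without_option, with_option)
```

Without the option, the status of a pipeline is the status of its last command, so the failure of an earlier command stays hidden from the task.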

Tests for checking the integration between ADH and Ranger are added

Build 2.1.3.1 ADH

2.1.3

2.1.3.0

 
     Date: 14.01.2021

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The Remove/Add Hive Tez actions are added

The Add diamond and Remove diamond actions are added

Build Ranger 2.0.0

The logic of the YARN Resource Manager expanding is changed

The validation logic for Spark Client is changed from parallel to sequential

ADPS integration: the new ADPS bundle that contains Ranger has to be re-integrated with ADH after moving Ranger from it

Fixed the error with closing Hive tasks after finishing checks

Livy checks are temporarily disabled

Fixed the error with bad cluster name that occurred when creating the HDFS service via Ranger Admin

Fixed the error with the HDFS action Remove Client

Fixed unsuccessful Hive CLI checks after the Ranger plugin was enabled

Fixed the error with connecting multiple clusters to one ADPS

The repository for plugins is added to the release bundle

Packages for the 2.1.3.0 release are uploaded to the Google repository

Build 2.1.3.0 ADH

ADH is bumped to 2.1.3.0

New repositories for Ranger plugins are added

Specifications on the workaround for the error with HBase expanding after HDFS expanding are edited

Specifications on expanding ADH services are created

2.1.2

2.1.2.5

 
     Date: 19.11.2020

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Client components for Flink are added

Client components for HDFS are added

Client components for YARN are added

The timeout for the docker_container is increased in autotests. Checks of the correct start order are also added for services in containers

The default port number for MySQL in the Airflow Metastore is changed

The volume for Hadoop configurations (e.g. /etc/hadoop/conf/, /etc/hive/conf, etc.) in Docker images with Airflow is increased

Fixed the race condition within Sqoop checks (part 2 — with multiple Clients)

Fixed the cluster installation error that occurred when MySQL was being installed

Fixed the cluster installation error that occurred when checking after installing Spark

Packages for the 2.1.2.5 release are uploaded to the Google repository

Building the offline package for ADH to the ADH repository is implemented

ADH is bumped to 2.1.2.5

Tests for the YARN/HDFS Client are created

Specifications for autotests of the YARN/HDFS Client are added

Changes in the Hadoop tests related to uploading bundles are made

Autotests for Airflow are created

All file accesses are made independent from the current working directory in autotests
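
One common way to get this kind of independence is to build every path from the location of the test module rather than from the current working directory. The sketch below is an assumption about the approach, not the code used in the ADH autotests; resolve_from and the file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def resolve_from(anchor_file: str, *parts: str) -> Path:
    """Build an absolute path relative to the directory of `anchor_file`."""
    return Path(anchor_file).resolve().parent.joinpath(*parts)


# In a test module this would typically be resolve_from(__file__, "data", "bundle.tgz").
with tempfile.TemporaryDirectory() as tmp:
    anchor = os.path.join(tmp, "test_example.py")  # hypothetical test module
    before = resolve_from(anchor, "data", "bundle.tgz")
    previous_cwd = os.getcwd()
    os.chdir(tempfile.gettempdir())  # simulate a runner started elsewhere
    try:
        after = resolve_from(anchor, "data", "bundle.tgz")
    finally:
        os.chdir(previous_cwd)
    print(before == after)
```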

The dev repository for ADH 2.1.2.5 is initialized

2.1.2.4

 
     Date: 18.09.2020

  • Bug fixes

  • Misc/Internal

The incorrect repositories in the ADH release and ET packages are fixed

The error with processing the release_version flag is fixed

Packages are copied from 2.1.2.3 to 2.1.2.4

2.1.2.3

 
     Date: 15.09.2020

  • New features

  • Bug fixes

  • Misc/Internal

The ADH bundle is divided into community and enterprise versions

The High Availability for NameNodes is implemented

Fixed the error that occurred at the Restart NameNodes step during the Remove NameNode action

Fixed the error with checking Hive Tez on multiple hosts

Fixed the error with switching dynamic allocation

Fixed: ZKFC ignored the dfs.namenode.rpc-bind-host parameter and used the dfs.namenode.rpc-address parameter for binding to the host address

Packages for the 2.1.2.3 release are uploaded to the Google repository

All specifications and BOMs related to ADH20 are moved to the prj_adh. Publishing of artifacts to the artifactory is changed

The release and develop repositories are segregated in bundles

2.1.2.2

 
     Date: 05.06.2020

  • Improvements

  • Bug fixes

  • Misc/Internal

The epel-release installation is disabled

The race condition within Sqoop checks is fixed

Fixed the error with running the cluster Check action

Packages for the 2.1.2.2 release are uploaded to the Google repository

ADH is bumped to 2.1.2.2

Nginx is copied from the Epel repository to the ADH2 repository

2.1.2.1

 
     Date: 21.05.2020

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Sqoop deployment is ported to ALT Linux

Solr deployment is ported to ALT Linux

Flink deployment is ported to ALT Linux

The public ALT Linux repository for ZooKeeper 3.4.14 is created

Airflow deployment is ported to ALT Linux

The ability to set nproc limits for HBase is added

Sqoop is added into the ADH bundle

ADH 2.X packages are built for ALT Linux

Solr 8.2.0 is added for ADH 2.2

Refactoring of the ADH deployment process for ALT Linux is made

The error with commissioning/decommissioning Nodes via ADCM is fixed

Fixed the error Port already in use: 10102 that occurred with HBase hbck

Fixed the ordering of Generic components

Fixed: web links in ADCM did not refresh after HDFS DataNodes or YARN Node Manager shrinking

Fixed the error that occurred with YARN 3.1.2 in ALT Linux during ansible tasks

Fixed the absence of org.json.JSONObject for Sqoop Metastore

The /var/run/sqoop directory is created for Sqoop Metastore

The missing dependency for Flink-related packages is added

Fixed the error with installing HBase and Solr when using the external ZooKeeper

Airflow deployment is disabled (visible in ADCM and only for ALT Linux)

The public repository for the release is changed

Packages for the 2.1.2.1 release are uploaded to the Google repository

Changes for libisal are merged

Changes for bigtop-groovy, bigtop-jsvc, bigtop-tomcat, and bigtop-utils are merged

Changes for Bigtop are merged

Changes for Livy are merged

Changes for ZooKeeper are merged

Changes for Zeppelin are merged

Changes for Spark are merged

Changes for Phoenix are merged

Changes for Tez are merged

Changes for Hive are merged

Changes for HBase are merged

Changes for Hadoop are merged

Bigtop branches for CentOS and ALT Linux are manually merged

The repository URL is changed to 2.1.2

Autotests for ADH services are reviewed according to the current stack version

2.1.2.0

 
     Date: 19.02.2020

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

The ability to configure Hive ACID is added

SELinux is disabled for all components during installation

Support for Flink 1.8.0 is implemented for ADCM

Flink is added into the ADH bundle

The logic of the Shrink action is improved

GPU support is enabled for YARN

Airflow is added into the ADH bundle

The UI link is added for Solr at the main ADCM page

The Shrink/Expand actions are implemented for HDFS HttpFS

HDFS HttpFS checks are implemented

The Solr Cloud Mode is implemented

The Solr deployment is implemented

Solr is added into the ADH bundle

Tez libraries are installed on Hive Client Nodes

Fixed: it was impossible to use Hive with Tez due to the configuration mismatch

Fixed the error with saving configurations for HDFS and YARN

Fixed the error with HBase checks after installation

Fixed the error with YARN checks in the HA mode

Tests/example DAGs for checking the Airflow functionality are added

Tests for checking the Solr functionality are added

2.1.1

2.1.1.1

 
     Date: 02.12.2019

  • Misc/Internal

Conversion to the custom systemd units

2.1.1.0

 
     Date: 21.11.2019

  • New features

  • Improvements

  • Misc/Internal

YARN Scheduler configuration is implemented

HDFS mover is implemented

The cluster-wide Install button is added to the ADCM UI

The ability to define the external ZooKeeper in the core-site.xml file is added

The ability to add custom/advanced configuration parameters to the *-site.xml files is added
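
For readers unfamiliar with the file layout, a custom parameter in a *-site.xml file is a property element with a name and a value inside the configuration root. The sketch below shows how such a parameter could be added or updated; it is not how ADCM renders the files, and set_property and the host name nn are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_property(site_xml: str, name: str, value: str) -> str:
    """Add or update a <property> in a Hadoop-style *-site.xml document."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No property with this name yet: append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


original = (
    "<configuration><property><name>fs.defaultFS</name>"
    "<value>hdfs://nn:8020</value></property></configuration>"
)
updated = set_property(original, "io.file.buffer.size", "131072")
print(updated)
```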

YARN Node labels are implemented

HDFS HttpFS is implemented

HDFS Short-Circuit Local Reads are implemented

HDFS Disk Balancer is implemented

HDFS Balancer is implemented

The *-site.xml files are unified

Asserts and fails are replaced with adcm_check

Monitoring is refactored: code/dashboards are unified, metrics are redesigned, etc.

The hostname variable is removed from the Zeppelin PID definition

The HDFS dashboard is divided into HDFS and YARN dashboards in Grafana

Hadoop PID file names are changed

Manual testing of the ADH 2.1 installation is performed according to the documentation

2.1.0

2.1.0.0

 
     Date: 10.10.2019

  • New features

  • Improvements

  • Bug fixes

  • Misc/Internal

Implemented the ability to get a status for the following services:

  • Zeppelin

  • Livy Server

  • Thrift Server

  • Spark Server

  • Phoenix

  • HBase Thrift

  • HBase

  • MySQL

  • YARN

  • HDFS

  • Hive

Implemented service management for the following services:

  • Livy Server

  • Zeppelin

  • Spark Thrift Server

  • Spark Server

  • Phoenix Server

  • HBase Thrift

  • HBase Region Server

  • HBase Master

  • Node Manager

  • Resource Manager

  • Timeline Service

  • WebHCat

  • MySQL

  • Hive Metastore

  • Hive Server

  • DataNodes

  • Secondary NameNodes

  • NameNodes

Prepared deployment scripts for the following services:

  • Livy Server

  • Spark Thrift

  • Spark Server

  • MySQL

  • HBase Thrift

  • Phoenix Server

  • Hive

  • HDFS

Implemented service checks for the following services:

  • Zeppelin

  • Spark Thrift Server

  • Spark Server

  • Livy Server

  • Hive

  • MySQL

  • YARN

  • HDFS

  • ZooKeeper

Implemented the deployment of the following services:

  • Zeppelin

  • Livy Server

  • Spark Server

  • HBase Thrift

  • Phoenix Server

  • HBase Region Server

  • HBase Master

  • WebHCat

  • Hive Metastore

  • MySQL

  • Hive Server

  • Hive Client

  • Node Manager

  • Resource Manager

  • Timeline Server

  • DataNode

  • Secondary NameNode

  • NameNode

The following builds are available:

  • Tez 0.9.2

  • Livy 0.6

  • Pig 0.17

  • Flume 1.8

  • YARN UI 2.0

  • Sqoop 1.4.7

Monitoring features are implemented for the following services:

  • Spark

  • YARN

  • HBase

  • HDFS

  • Hive

Necessary configurations for Hive/Tez are added

The hbase-site block is added to the config.yml file

Necessary configurations for Hadoop services are added

Quick links for services are added

The HDFS rack awareness is implemented via custom scripts
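
Hadoop can resolve racks through an external script referenced by the net.topology.script.file.name property: the script receives host names or IP addresses as arguments and prints one rack path per host. The sketch below shows what such a script might look like; the mapping, the addresses, and the rack names are hypothetical, and the scripts shipped with ADH may work differently.

```python
import sys

# Hypothetical mapping; a real script would usually read it from a file.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Map each host or IP address to a rack path, in the order given."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more hosts as arguments and reads one rack per host.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

Unknown hosts fall back to a default rack, so the script always returns one answer per argument.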

The YARN and MapReduce services are combined into a single one

The Resource Manager High Availability is implemented

Checks for Decommission/Recommission for Node Managers are implemented

Checks for Decommission/Recommission for DataNodes are implemented

Zeppelin is bumped to 0.8.1

Zeppelin is implemented for ADCM

Tez UI is implemented for ADCM

The ability to add a new Node Manager to ADH is added

The ability to add new DataNodes to ADH clusters is added

Spark is implemented for ADCM

Ranger is bumped to 1.1

The ZooKeeper Quorum configuration is added

The MySQL role is added to the ADH bundle (as a service)

Multiple configuration directories for Nodes are implemented

The YARN logs aggregation is enabled

Spark and Hive roles are reviewed

The Hadoop role is divided into HDFS, YARN, MapReduce

The Hadoop role for ADCM is refactored

Separate roles for Hadoop are implemented

The ZooKeeper service role is ported from the ADS Bundle

Basic YARN service features are refactored

Fixed the error with the hbase.zookeeper.quorum parameter missing after the installation

Pre-release preparations are made for ADH 2.1.0

The EULA.txt file is added to the bundle root

The repository for ZooKeeper packages is added to ADH

All ADH bundle submodules are switched to Master

Documentation on Decommission/Recommission/HA is prepared

Documentation on HBase deployment via ADCM is prepared

Documentation on Spark deployment via ADCM is prepared

Documentation on Hive deployment via ADCM is prepared

Documentation on YARN deployment via ADCM is prepared

Documentation on HDFS deployment via ADCM is prepared

Documentation for the ADH bundle is prepared

Spark autotests are implemented

Hive autotests are implemented

YARN autotests are implemented

HDFS autotests are implemented

Smoke tests for the Livy Server service check are prepared

Smoke tests for the Spark Thrift Server service check are prepared

Smoke tests for the Spark Server service check are prepared

Smoke tests for the MySQL service check are prepared

Smoke tests for the HBase service check are prepared

Smoke tests for the Phoenix service check are prepared

Smoke tests for the Hive service check are prepared

Smoke tests for the HDFS service check are prepared

The latest stable packages for ADH are built
