High Availability introduction & key concepts

Iraje PAM is a Privileged Access Management (PAM) solution that enables organizations to manage privileged accounts across a hybrid environment. It increases the visibility of operations through session management regardless of location, provides automated discovery of assets and accounts, detects anomalous behavior in the system, investigates threat patterns, and offers a comprehensive approach to privileged password management with its automated password vault. It provides an integrated, highly flexible, and scalable platform that addresses the challenges of a hybrid enterprise.

The High Availability (HA) feature keeps the application and the vault available. The availability of the system depends on several factors, such as the number of components, their configuration settings, and the resources allocated to each component. In this system, high availability is characterized by the number of failover combinations supported, and it aims for 99% uptime with near-zero downtime.
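To make the uptime target concrete, the following minimal sketch converts an uptime percentage into the downtime it allows over a year. Only the 99% figure comes from the text above; the function name and the 365-day year are assumptions made for illustration.

```python
# Illustrative arithmetic only: converts an uptime target into a downtime budget.
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year of 8760 hours


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget (in hours) implied by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


if __name__ == "__main__":
    hours = allowed_downtime_hours(99)
    print(f"99% uptime allows about {hours:.1f} hours of downtime per year")
```

A calculation like this can help when comparing the uptime target against the expected duration of each failover in your environment.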

Key concepts in the context of the system

This section covers some of the key concepts used across the documentation in describing high availability situations and architecture.

Vault Replication

Replication is the process of storing data in more than one vault. It is achieved by electronically copying data from one database to another and keeping the copies automatically synchronized, resulting in a distributed system.

  • Replication when using the Iraje non-container vault:

    Iraje PAM encrypts all data stored in the secured vault. The inbuilt replication works in a master-slave configuration, as supported by MySQL database replication (the primary database is called the master and the other synchronized databases are called slaves). This lets you access data without interruption, which supports high availability and keeps the system integrated. Replication operates on port 1521 between instances.

  • Vault replication when working with Microsoft SQL Server:

    When configuring your vault instance on Microsoft SQL Server, refer to the guidance on building high availability, depending upon the configuration and licensing of Microsoft SQL Server, here.

Redundancy & failover

The main aspect of HA is to eliminate any Single Point of Failure (SPoF). To achieve this, the system is implemented with redundant servers running multiple instances of services at the same time; this is called redundancy. Similarly, when the fallback server takes over from the primary server in case of failure, it is called failover. If one server fails, the system can fail over to another server that did not fail. For example, server A is the primary server and server B is the fallback server. If server A fails, user traffic is directed to server B.
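The server A and server B example above can be sketched as follows. This is only an illustration of the failover idea, not the product's routing logic; the `Server` class and `route_request` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical representation of one redundant server."""
    name: str
    healthy: bool = True


def route_request(primary: Server, fallback: Server) -> Server:
    """Pick the server that should receive user traffic."""
    if primary.healthy:
        return primary
    if fallback.healthy:
        # Failover: the fallback server takes over from the failed primary.
        return fallback
    raise RuntimeError("no healthy server available")


if __name__ == "__main__":
    server_a = Server("A")
    server_b = Server("B")
    print(route_request(server_a, server_b).name)  # traffic goes to the primary
    server_a.healthy = False                       # simulate a failure of server A
    print(route_request(server_a, server_b).name)  # traffic goes to the fallback
```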

Iraje supports both instance types (Active-Active and Active-Passive) to ensure failover of the Iraje PAM access and vault components.

Load balancing

To achieve optimum utilization of instance resources, it is recommended to configure load balancing between Iraje PAM instances or components. Typically, a load balancer sits between the client and the server farm, accepting incoming network and application traffic and distributing that traffic across multiple servers using various methods. By spreading the work evenly, load balancing improves application responsiveness. It also increases the availability of the application for users. To achieve load balancing, all instances must run similar versions of the solution. The solution also has inbuilt software-based load balancing capabilities when enabled and configured. The following load balancing techniques are supported by the solution.

  • Inbuilt load balancing of Iraje PAM:

    The system supports inbuilt load balancing defined at the application level between two nodes and does not depend on any external components.
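As a rough illustration of application-level load balancing between two nodes, the sketch below distributes requests in round-robin order and skips a node that has been marked unhealthy. Round robin is an assumed method chosen for simplicity; the documentation does not state which method the inbuilt load balancer uses, and the class and node names are hypothetical.

```python
from itertools import cycle
from typing import List


class RoundRobinBalancer:
    """Hypothetical two-node balancer; not the product's implementation."""

    def __init__(self, nodes: List[str]) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._healthy = {node: True for node in self._nodes}
        self._ring = cycle(self._nodes)

    def mark(self, node: str, healthy: bool) -> None:
        """Record whether a node can currently accept traffic."""
        self._healthy[node] = healthy

    def next_node(self) -> str:
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self._nodes)):
            candidate = next(self._ring)
            if self._healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy nodes")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["node-1", "node-2"])
    print([balancer.next_node() for _ in range(4)])
    balancer.mark("node-1", False)  # simulate a node failure
    print([balancer.next_node() for _ in range(2)])
```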

High-Availability configurations

In normal scenarios, there are two options for configuring high availability in your environment.
Both options are described below:

  • Active-Passive configuration:

    In this configuration, there is a primary node and a fallback node. At any point, only a single node is active and processing requests; the fallback node is activated only if the primary node fails. This is the easier and recommended configuration for small-to-mid size environments that want high availability with minimal configuration and operational requirements and that can tolerate an expected downtime of 1-5 minutes in case of a failover.

  • Active-Active configuration:

    In an Active-Active configuration, both nodes process requests in parallel. If one of the nodes fails, its user traffic and operational load are shifted to the other node.

Monitoring Service

The monitoring service is an internal process used to monitor, alert on, and handle failover situations. The service, which is active and running on each node in a high availability setup, is responsible for initiating the database on both nodes, deciding which node is primary or secondary when there is a failover, and so on. The monitoring service running on the fallback node periodically sends heartbeat messages to the port on which the primary application is running to check the availability of the primary node. When it discovers that the primary application is not responding, it makes itself the new master and also makes the fallback vault the new master. When the original primary application comes up again, it takes the role of a fallback node and continues to operate in passive mode.
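The heartbeat behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the monitoring service itself: the `FallbackMonitor` class, the `tcp_probe` helper, and the threshold of three missed heartbeats are all hypothetical, since the documentation does not specify how many missed heartbeats trigger a failover.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FallbackMonitor:
    """Hypothetical monitor running on the fallback node."""

    def __init__(self, probe: Callable[[], bool], max_missed: int = 3) -> None:
        self._probe = probe            # checks whether the primary responds
        self._max_missed = max_missed  # assumption: not a documented value
        self._missed = 0
        self.role = "fallback"

    def heartbeat(self) -> str:
        """Send one heartbeat and return this node's role afterwards."""
        if self.role == "primary":
            return self.role  # already promoted; nothing left to check
        if self._probe():
            self._missed = 0
        else:
            self._missed += 1
            if self._missed >= self._max_missed:
                # The primary is considered down: promote this node.
                self.role = "primary"
        return self.role
```

In practice, `heartbeat()` would be called on a timer, with a probe such as `lambda: tcp_probe(primary_host, primary_port)`, where the host and port are placeholders for the address of the primary application.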

Load Management Service

The load management service checks the number of sessions and the resource utilization on each node, and dynamically decides whether to pass traffic to another node. The service must be in active sync on both instances to achieve this.
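A decision of this kind could look like the minimal sketch below. The thresholds, the use of CPU as the resource metric, and all names are assumptions for illustration; the documentation does not describe the actual decision rules.

```python
from dataclasses import dataclass


@dataclass
class NodeLoad:
    """Hypothetical snapshot of the load on one node."""
    name: str
    sessions: int
    cpu_percent: float


def choose_node(local: NodeLoad, peer: NodeLoad,
                max_sessions: int = 100,
                max_cpu_percent: float = 80.0) -> NodeLoad:
    """Keep traffic on the local node unless it is overloaded and the peer is not."""

    def overloaded(node: NodeLoad) -> bool:
        return node.sessions >= max_sessions or node.cpu_percent >= max_cpu_percent

    if overloaded(local) and not overloaded(peer):
        return peer
    return local


if __name__ == "__main__":
    busy = NodeLoad("node-1", sessions=120, cpu_percent=55.0)
    idle = NodeLoad("node-2", sessions=10, cpu_percent=20.0)
    print(choose_node(busy, idle).name)  # traffic is passed to the less loaded peer
```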

Vault Replication

When using the embedded vault option without clustering, replication between nodes can be initiated to prevent any data loss. In a high availability setup, all configuration files are synchronized automatically from the primary node to the secondary node at an interval of two minutes. Database synchronization happens instantly through physical replication of the database.