Administration Guides

ECA Cluster Disaster Recovery and Failover Supported Configurations


Overview

This section covers how to configure the ECA cluster for failover to another site, or to migrate the ECA cluster to another cluster, either because the current Isilon cluster will be decommissioned or because a failover has made the failover cluster active. This guide applies to Ransomware Defender, Easy Auditor and Performance Auditor.

Unsupported ECA HA Configurations

  1. Having 2 different ECA clusters use the same Eyeglass VM is not supported.
  2. Stretching an ECA cluster between 2 data centers is not supported. An ECA cluster is designed as a local cluster with load balancing and local node failover; it is not designed to be stretched or to handle node failures in a stretched configuration. Warm Standby is the only supported HA solution.

Prerequisites for both Scenarios for the DR Cluster and DR Site 

  1. Easy Auditor checklist - All steps below are found in the installation guide. The checklist is a summary of key steps that must be completed using the installation guide. All steps MUST be completed before the ECA cluster can manage the target cluster.
    1. (Mandatory) If a Warm Standby ECA cluster is used, it must have the same VM count as the production site ECA cluster. This option is only used when ECA cluster failover to the 2nd site is required.
    2. (Mandatory) The Easy Auditor database has been synced to the DR site into the correct access zone. The guide here can be followed to protect the audit database with SyncIQ.
    3. (Mandatory) The IP pool is created in the Eyeglass access zone with at least 3 nodes in the pool to receive HDFS IO from the ECA, and the smartconnect name is created for HDFS access.
    4. (Mandatory) Smartconnect DNS delegation is created for the HDFS access zone in your DNS infrastructure following Dell documentation.
    5. (Mandatory) The OneFS HDFS license is applied to the failover target cluster for the Easy Auditor database.
    6. (Mandatory) All firewall ports are open between the ECA cluster and the failover target cluster.
    7. (Mandatory) A System zone NFS export is created with the IP addresses of the warm standby ECA cluster at the failover site, for the ECA mount to ingest audit data.
    8. (Mandatory) Smartconnect DNS delegation is created for NFS access in the System zone in your DNS infrastructure following Dell documentation.
    9. (Mandatory) The ECA requires a mount path that contains the target cluster name (case sensitive) and cluster GUID. The GUID is visible on the Cluster Management > General page.
    10. (Mandatory) Prepare the DR cluster for the HDFS database following the steps below. These are the high-level steps that need to be completed; the installation guide has the detailed steps.
      1. Create the local Hadoop user for the HDFS database (eyeglasshdfs) in the System access zone of the DR PowerScale cluster as per the Preparation of Analytics Database Cluster documentation. Specify the same UID as the eyeglasshdfs user on the production PowerScale cluster. Example:
        1. isi auth users create --name=eyeglasshdfs --provider=local --enabled=yes --password-expires=no --zone=system --uid=eyeglasshdfs_uid_on_production_PowerScale
      2. As the root user on the DR cluster, set the permissions on the HDFS folder using the commands below:
        1. mkdir -p /ifs/data/igls/analyticsdb/eca1/
        2. chown -R eyeglasshdfs:'Isilon Users' /ifs/data/igls/analyticsdb/eca1/
        3. chmod -R 755 /ifs/data/igls/analyticsdb/eca1/
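
The user-creation command above depends on reusing the production cluster's UID for the eyeglasshdfs user. As a minimal sketch, assuming the UID has already been looked up on the production cluster, the helper below only assembles the command line shown in this guide so it can be reviewed before being run on the DR cluster. The function name build_user_create_cmd is hypothetical and is not part of any Superna or Dell tooling.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the user-creation command
# from this guide, filled in with the production cluster's UID.
build_user_create_cmd() {
    uid="$1"
    # Reject empty or non-numeric input so a typo never reaches the cluster.
    case "$uid" in
        ''|*[!0-9]*)
            echo "error: UID must be numeric, got '$uid'" >&2
            return 1
            ;;
    esac
    printf '%s\n' "isi auth users create --name=eyeglasshdfs --provider=local --enabled=yes --password-expires=no --zone=system --uid=$uid"
}
```

For example, build_user_create_cmd 2001 prints the command ending in --uid=2001, where 2001 is a placeholder UID. Printing rather than executing keeps the sketch safe to try on any workstation.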

Scenario #1 - Production Writeable cluster fails over to DR site, ECA cluster stays at the production site (Longer RTO)

In this scenario the active cluster is failed over to the DR cluster and the ECA needs to be reconfigured to monitor the DR cluster. This procedure takes longer to complete than Scenario #2, where a warm standby ECA offers a faster RTO. NOTE: Reconfiguration is not covered under the support contract. Assistance must be scheduled with services, which operates M-F based on resource availability.

Post Failover Reconfiguration Steps

  1. Before failover, audit data is ingested from the production site cluster and the ECA VMs are located at the production site.
  2. After failover to the DR site, the DR cluster needs to have auditing enabled and an NFS export configured, with a smartconnect name delegation, to allow the ECA cluster at the production site to mount the DR cluster audit folder. This allows audit data to be ingested by the ECA cluster.
  3. Eyeglass license must be assigned to the DR site cluster after failover.  The license manager UI in eyeglass is used to assign the license to the DR site cluster.
    1. Set the production site cluster to unlicensed and click submit.
    2. Then set the DR site cluster to user licensed and click submit.
  4. Steps to complete (follow the installation guide for detailed steps):
    1. Get the DR cluster name (case sensitive) and GUID, to be used in later steps.
    2. Enable auditing on the DR cluster: in the OneFS GUI, go to Cluster Management --> Auditing, enable protocol auditing, add all access zones to the list and save the configuration.
    3. Create the target cluster NFS export for audit data ingestion.
    4. Create a smartconnect name to mount the export in the System zone, and set up DNS delegation for the smartconnect name to the IP pool to be used for the NFS mount in the System zone.
    5. Create the mount path on the ECA cluster using the target cluster name and GUID. See the guide for steps to create the mount point on all ECA nodes using the information collected in step 1.
    6. Edit the auto.nfs file on ECA node 1 (/opt/superna/eca/data/audit-nfs/auto.nfs): add an entry with the mount point path from the step above and the new smartconnect name of the DR cluster in the System zone. Comment out the previous mount entry with the # character to prevent mounting the previous cluster. Save the file. (example /opt/superna/mnt/audit/00505699ec55c64cf45d411e36ac285fff13/prod8 -fstype=nfs,nfsvers=3,ro 172.31.1.104:/ifs/.ifsvar/audit/logs)
    7. Sync the Configuration to all ECA nodes
      1. ecactl cluster push-config
    8. Unmount the audit NFS export of the previously active cluster on the ECA cluster.
      1. ecactl cluster exec "sudo umount -a -t autofs" (note: the password for ecaadmin will be requested for each node)
    9. Remount the new cluster NFS audit export.
      1. ecactl cluster exec "sudo systemctl restart autofs"
      2. Verify the mount was successful on each node; the command below should show the mount on each node.
      3. ecactl cluster exec mount | grep "ifsvar"
    10. Restart services
      1. ecactl cluster services restart --container turboaudit --all
    11. Complete license steps documented above if not already completed
    12. Reconfigure the Security Guard and RoboAudit features to self-test the new cluster. A new local account needs to be created on the new target cluster, and the configuration must be edited to switch to the target cluster. This requires editing the user and changing the cluster that the automation will run against. The license step must be completed before you can select the target cluster.
      1. See the Security Guard guide here, and the RoboAudit guide here.
      2. NOTE: The SMB port must be open between Eyeglass and the new target cluster.
      3. Run Security Guard and RoboAudit and monitor the job from the Jobs Icon --> running jobs tab.
    13. Done
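
The auto.nfs edit in the steps above (comment out the old entry, add one for the DR cluster) is easy to get wrong by hand. The following is a minimal sketch of that edit, assuming the entry format matches the example in this guide. The function names are hypothetical, and the GUID, cluster name and smartconnect name must come from the values collected earlier. It only rewrites the file it is given; syncing the configuration, unmounting and restarting autofs are still done with the ecactl commands listed above.

```shell
#!/bin/sh
# Hypothetical helpers for the auto.nfs edit; names are illustrative only.

# Print one auto.nfs entry in the format of this guide's example:
#   <mount base>/<GUID>/<cluster name> <options> <smartconnect>:<audit path>
auto_nfs_entry() {
    guid="$1"; cluster="$2"; smartconnect="$3"
    printf '%s\n' "/opt/superna/mnt/audit/$guid/$cluster -fstype=nfs,nfsvers=3,ro $smartconnect:/ifs/.ifsvar/audit/logs"
}

# Print the file with every active entry commented out.
# Blank lines and existing comments pass through unchanged.
comment_out_entries() {
    awk '/^[ \t]*$/ || /^[ \t]*#/ { print; next }
         { print "#" $0 }' "$1"
}

# Rewrite the given auto.nfs file: old entries commented, new entry appended.
update_auto_nfs() {
    file="$1"; guid="$2"; cluster="$3"; smartconnect="$4"
    [ -f "$file" ] || { echo "error: $file not found" >&2; return 1; }
    tmp=$(mktemp) || return 1
    comment_out_entries "$file" > "$tmp" &&
        auto_nfs_entry "$guid" "$cluster" "$smartconnect" >> "$tmp" &&
        cat "$tmp" > "$file"    # copy back in place: keeps owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}
```

A call would look like update_auto_nfs /opt/superna/eca/data/audit-nfs/auto.nfs followed by the DR cluster GUID, the DR cluster name and the DR smartconnect name. Trying it on a copy of the file first is a sensible precaution.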


Scenario #2 - Production Writeable cluster fails over to DR site and Production site ECA cluster fails over to warm standby ECA cluster at the DR Site (Lower RTO)

In this scenario the ECA cluster site is impacted and the warm standby ECA cluster becomes the active ECA cluster. It is assumed that the Eyeglass VM is already located at the DR site, as is best practice, or that the warm standby Eyeglass VM has been activated at the DR site.


Prerequisites to Complete Before a Failover 

  1. Deploy a warm standby Eyeglass VM following this guide. The active Eyeglass VM should be located at the DR site; if the Eyeglass VM at the DR site is not the active appliance, follow the guide to make it the active appliance.
  2. Deploy a 2nd warm standby ECA cluster at the DR site and configure it to use the Eyeglass VM at the DR site. Follow the installation guide here to install the warm standby ECA cluster, and complete all steps to set up the DR cluster to be protected by the ECA. The points below are used to customize the ECA installation for the DR cluster. Use the preparation information in this guide to collect the information needed for the steps below.
    1. NOTE: The audit data NFS mount point will use the DR cluster name and cluster GUID. The cluster name and GUID will be added to the warm standby ECA mount points.
    2. NOTE: The ECA configuration file /opt/superna/eca/eca-env-common.conf references the DR site Eyeglass VM (export EYEGLASS_LOCATION=) and the API token (export EYEGLASS_API_TOKEN). Edit this file and add the warm standby VM IP address. Use the same API token value as the production ECA, which can be found in its eca-env-common.conf file.
    3. NOTE: Verify the HDFS database URL is set correctly for the DR cluster Eyeglass access zone and smartconnect name.
      1. Make sure the path at the end of the URL (/eca1 in the example below) matches the production ECA cluster path. If this path does not match on the DR cluster, the ECA will not be able to locate the database files during the cluster up command.
      2. Edit /opt/superna/eca/eca-env-common.conf, locate the line below and make sure the DR HDFS smartconnect name is added and the path is the same as on the production ECA.
      3. export ISILON_HDFS_ROOT='hdfs://dr_hdfs_sc_zone_name:8020/eca1'
    4. The Audit Database DR protection guide should be followed to sync the audit database to the DR cluster. This is required to switch the HDFS database to the DR cluster after failover. Follow the guide here to configure database sync to the DR cluster.
    5. In warm standby mode the ECA cluster should be down (ecactl cluster down) until the warm standby cluster is needed.
    6. Mandatory - Ensure the warm standby VMs (Eyeglass and ECA) are always running the same release version as deployed at the production site.
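
The requirement above that the path at the end of ISILON_HDFS_ROOT be identical on the production and warm standby ECA clusters can be checked mechanically. This is a minimal sketch using only POSIX shell parameter expansion; the function names are hypothetical, and the two URL values are assumed to have been copied from each cluster's eca-env-common.conf by hand.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# Print the path portion of an hdfs:// URL, e.g.
#   hdfs://name:8020/eca1  ->  /eca1
hdfs_root_path() {
    rest="${1#hdfs://}"        # drop the scheme
    case "$rest" in
        */*) printf '/%s\n' "${rest#*/}" ;;   # keep everything after host:port
        *)   printf '/\n' ;;                  # no path component given
    esac
}

# Succeed only when both URLs end in exactly the same path.
# The comparison is strict: /eca1 and /eca1/ are treated as different.
hdfs_paths_match() {
    [ "$(hdfs_root_path "$1")" = "$(hdfs_root_path "$2")" ]
}
```

For example, hdfs_paths_match 'hdfs://prod_hdfs_sc_zone_name:8020/eca1' 'hdfs://dr_hdfs_sc_zone_name:8020/eca1' succeeds because only the smartconnect name differs (prod_hdfs_sc_zone_name is a placeholder), while a DR URL ending in a different path would fail the check.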

Post Failover Reconfiguration Steps

  1. After failover to the DR cluster, shut down the production site ECA cluster.
    1. ecactl cluster down (log in to ECA node 1)
  2. Follow the steps in the Easy Auditor database protection guide to make the DR cluster database writeable. See the guide here.
    1. NOTE: Make sure to run the SyncIQ policy manually to sync all changes to the database once the ECA cluster is shut down.
    2. The guide steps above will bring up the cluster and validate that the database is healthy.
    3. Do not proceed until the ECA and database are fully operational.
    4. Log in to Eyeglass and verify the Managed Services Icon shows all ECA VMs as green with no warnings.
  3. Eyeglass license must be assigned to the DR site cluster after failover.  The license manager UI in eyeglass is used to assign the license to the DR site cluster.
    1. Set the production site cluster to unlicensed and click submit.
    2. Then set the DR site cluster to user licensed and click submit.
  4. Reconfigure the Security Guard and RoboAudit features to self-test the new cluster. A new local account needs to be created on the new target cluster, and the configuration must be edited to switch to the target cluster. This requires editing the user and changing the cluster that the automation will run against. The license step must be completed before you can select the target cluster.
    1. See the Security Guard guide here, and the RoboAudit guide here.
    2. NOTE: The SMB port must be open between Eyeglass and the new target cluster.
    3. Run Security Guard and RoboAudit and monitor the job from the Jobs Icon --> running jobs tab.
  5. Done



© Superna Inc