Failover Process and Customer Role


Overview

This document describes:

  • Roles and Responsibilities during Failover with Superna Eyeglass

  • Day of Failover Support Process

  • Superna Eyeglass Failover Planning Process - requires 7 days notice if you plan to use support services to validate your environment. If you choose to open a case with less notice, some validations covered by this process may not be completed in time for your failover. Customers must accept the risk of some validations or remediations not being completed in time.

Roles and Responsibilities during Failover with Superna Eyeglass

Superna Eyeglass Support Role in Failover

The Superna Eyeglass EULA and Support Services Agreement includes support for the Superna Eyeglass product. As related to failover, this includes:

  • Failover planning process including readiness health check and remediation prior to a planned event

  • Failover log analysis during and post failover health check and failback assessment

  • Root cause of issues during a failover

  • Next steps required to complete a failover under all conditions. All recovery steps are documented.


Superna Eyeglass product support entitlement does not include

  1. Assisted failover of customer data.

    1. Professional services from 3rd party Eyeglass Certified partners offer hands on failover services. (Eyeglass Professional Services)

  2. Decisions on data protection before, during and post failover.
    1. These decisions reside with customers. For legal reasons, Superna Eyeglass Support cannot be responsible for them.
  3. Assistance with Data migration using the Data and Configuration tools in Eyeglass.
    1. Support can answer questions on use cases, limitations, recommendations
    2. Support will not join a Zoom or Webex session to assist with data migration. Executing the migration steps is considered a data protection decision that resides with the customer.
    3. We suggest testing all options before attempting any data migrations
  4. Support for external hardware and software vendors' products.

    1. Support agreements must be in place for all external hardware and software vendors. Functional recovery steps will be provided that require customer subject matter experts to execute on the 3rd party vendor products.

    2. Customers must have access to support for AD, DNS, Networking, Hosts, and PowerScale. Superna Eyeglass Support can provide root cause of external component issues but is not a primary support replacement for these components.

    3. Superna Eyeglass Support is not legally authorized to take control of any devices, to make business decisions on behalf of customers during a failover event, or to provide specific technical advice that affects a 3rd party vendor where that vendor should be consulted.

    4. Superna Support cannot provide assistance that would violate customer support agreements with 3rd party vendors.

    5. For legal reasons, Superna Eyeglass Support cannot take control or join a Webex or phone call to assist with hands-on failover procedures or troubleshooting for 3rd party hardware and software.


Customer Expectations and Role in Failover

  • Customers must provide or have access to all skills required to complete a failover and debug any issues in their IT environment which includes (AD, DNS resolution and updates if necessary, AD domain edit permissions to computer objects with ADSI Edit, PowerScale knowledge on SyncIQ operations, share/export management, Networking, firewalls, Windows logon process, Linux mount requirements, application specific knowledge that uses NAS shares).

  • Customers must be logged in to the support.superna.net portal for the purpose of uploading failover logs and communicating with the Superna Support team on any questions or issues that may arise during the failover.

How to Open a Failover support Case and Set the Correct Type

When a case is opened, four choices are available:


  1. Not a failover case

    1. The case will not be treated as a failover case if this is selected.

  2. Failover Case Type Test Only 

    1. This is assumed to be non-production data with no business impact.

    2. This case type is handled at lower priority when higher-priority cases exist, depending on the active case workload at the time the case is opened.
    3. Support will automatically assume the failover health check support process for this case type with suggested 7 days notice to allow the full process to be completed.   
    4. You may opt out of this process. Please indicate this to support when opening the case to ensure communication and understanding on support.
  3. Failover Case Type Planned

    1. This is assumed to be a production data failover for business continuity testing.
    2. Support assumes this is a planned and scheduled event, not a last-minute decision to fail over, and that you have a full planning phase for the event.
    3. Support will automatically assume the failover support process will be used to health check your environment and remediate any issues prior to the planned event. This process works best if at least 7 days notice is provided. This advance notice is based on our experience with many failovers and allows time to address many of the known issues that could impact your failover. This process is designed to significantly reduce your risks for the planned event.
    4. You may opt out of this process. Please indicate this to support when opening the case to ensure clear communication and understanding of the support provided and of the risks you assume by opting out of this process, which is included in your support contract.
    5. If you are not running the latest target GA code, please read all release notes for the Eyeglass release you are running. All risks and known issues in those notes are assumed to be accepted and understood by you (release notes).
    6. All process steps outlined on this page will be assumed to be understood and used for the planned event. You should schedule a meeting if you have questions on this process.  We will happily review the details on a call to ensure both support and your organization are aligned on expectations and responsibilities before you execute a failover.
  4. Failover Case Type Unplanned Real DR Event

    1. This assumes it affects production data.
    2. Support will prioritize this case type above all others
    3. The DR Assistant controlled check box should be UNCHECKED only if the source cluster shows as unreachable in the Continuous Operations Dashboard icon.
    4. Providing the failover log is a mandatory step to get support for an uncontrolled or Real DR event failover.
    5. This was not a planned event.
    6. This case type should not be used for testing. For testing of DR events, use this supported procedure. Any other procedure is not supported, and cases opened as a DR event that are in fact a test will be switched to case type test.
    7. Refer to the process on this page so that support exclusions are clearly understood as they relate to the support contract. Domain knowledge of 3rd party products (for example AD, DNS, PowerScale, Windows, Linux) must come from experts in your organization or from support contracts, as required to follow the steps and directions provided by Superna support throughout the process. Superna support is not a substitute for the knowledge and skills related to 3rd party software and hardware.
    8. As stated on this page, assistance with or decisions about your business data are not available with product support. Decisions on and execution of DR failover reside with the customer's IT staff.

Day of Failover Support Process


The steps for the day of failover are listed below. NOTE: Support will follow this process exactly as written, as this is the fastest process to complete a failover.

How to Receive the fastest Failover support during a failover

The following is the fastest process to get timely support.  Steps below are based on 10,000 plus failovers executed globally and any deviation will negatively affect response time.  

  1. Update the case when you are about to start your failover so that the support engineer can expect the failover log soon.  
  2. Remain logged into our support portal with the case open on a browser tab and copy and paste the failover log from the running failover tab of the DR Assistant. This provides support realtime notification.  Do NOT use email to provide status.
    1. If you want feedback on anything listed in the failover log, copy and paste the whole log to the case EVEN if the failover is not finished. We can provide feedback during the failover if you have questions.
    2. DURING FAILOVER: Release > 2.5 will post a message to the failover log indicating when each policy is failed over and data access testing can begin. Monitor the log and start testing as soon as the message is seen per policy OR post the partial log to the case for review and confirmation.
      Example:
      Failover steps for policy: <policy Name> completed in X minute(s). Data access manual testing should be completed, using this guide as a reference How to Validate and troubleshoot A Successful Failover WHEN Data is NOT Accessible on the Target Cluster.
  3. When asked by support to upload the completed failover log do this immediately (How to download failover log).  This file is mandatory to receive support on ANY question you may have.
  4. Create a full backup (how to here) and upload it immediately after the failover. Our support system has automated failover debugging and can analyze > 300 issues in minutes. We will not join a Webex, since our support engineer cannot review the debug analysis while on a Webex. It is not technically possible to provide a faster response or analyze an issue without this support log analysis. Delaying the upload to support WILL delay your support response to ANY question.
  5. If the support engineer determines a rapid response is required, we can open a chat window to provide faster responses or directions while continuing to review the failover analysis provided by our support tools. This chat window feature is only supported if you are logged into our support website.
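
As an illustration of monitoring the log for the per-policy completion message quoted in the example above, the following is a minimal Python sketch. It assumes the log text has been copied from the running failover tab into a string and that the message wording matches the example exactly; the real log layout may differ between releases. The function name, the sample policy names, and the first sample line are hypothetical.

```python
import re

# Pattern built from the example message quoted in this document.
# Assumption: the wording in a real failover log matches this example.
POLICY_DONE = re.compile(
    r"Failover steps for policy:\s*(?P<policy>.+?)\s+completed in\s+"
    r"(?P<minutes>\d+)\s+minute\(s\)"
)


def completed_policies(log_text: str) -> dict[str, int]:
    """Map each policy reported as completed to its duration in minutes."""
    return {
        match.group("policy"): int(match.group("minutes"))
        for match in POLICY_DONE.finditer(log_text)
    }


# Hypothetical sample: policy names and the first line are illustrative only.
sample_log = """\
Starting failover (hypothetical sample line)
Failover steps for policy: policy-home completed in 12 minute(s). Data access manual testing should be completed
Failover steps for policy: policy-projects completed in 7 minute(s). Data access manual testing should be completed
"""

for name, minutes in completed_policies(sample_log).items():
    print(f"{name}: start data access testing (failed over in {minutes} min)")
```

A script like this only helps decide when to begin data access testing for each policy. It does not replace posting the failover log to the case.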

Objective:

  • Verifying access to data on the target cluster post failover is Support's priority when reviewing failover logs.

  • Once data access is validated and you have confirmed data access on the case, Support will move on to assess failback steps if there were any errors in the failover steps.

  1. BEFORE FAILOVER: OneFS has a known bug with quota scan. Follow these Failover Release Notes before failover.

  2. DURING FAILOVER: Release > 2.5 will post a message to the failover log indicating when each policy is failed over and data access testing can begin. Monitor the log and start testing as soon as the message is seen per policy. Failover steps for policy: <policy Name> completed in X minute(s).

  3. AFTER FAILOVER: Attach the failover log from Eyeglass as a reply to this email.

  4. AFTER FAILOVER: Test data access by following these instructions: How to Validate and troubleshoot A Successful Failover WHEN Data is NOT Accessible on the Target Cluster

  5. AFTER FAILOVER: Upload a full backup from the Eyeglass appliance.

  6. AFTER FAILOVER: Reply to this email with results from data access testing.

 

If you are planning to fail back on the same day, you must upload a new Eyeglass full backup (Create Full backup) so that Support can assess failback readiness. This is mandatory. Follow this procedure:


How to Raise an Eyeglass Support Request

Superna Eyeglass Failover Planning Process

The Failover Planning process is a series of steps designed to eliminate known risks during a failover event.  It includes planning steps to be completed by the customer and a Superna Eyeglass failover readiness health check and remediation prior to a planned event completed by Superna Eyeglass Support.


Step 1 in the process is to open a support case notifying the Superna Support team of the planned failover.  The Superna Eyeglass Support team will post the Failover planning checklist to the case describing planning process and next steps.

Provide the date, time and time zone of a planned failover.  


If this information is provided, we ensure a dedicated support engineer is scheduled for your failover to review support logs and information as it is posted to the case.  If this information is not provided we cannot plan to support the failover with a dedicated support resource and normal support agreement response times apply.


Superna Eyeglass Support requires 7 days notice to allow this process to be followed by opening a case.   Customers that do not follow the process are accepting all risks from release notes and known issues the planning process is designed to eliminate.  Support level is reduced for customers that do not follow documented procedures to eliminate known risks.
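
To make the notice requirement concrete, here is a minimal Python sketch that computes the latest date on which a case can be opened while still giving 7 days notice. It assumes the 7 days are calendar days, which this page does not specify; the function names and example dates are hypothetical.

```python
from datetime import date, timedelta

# 7 days notice, as stated in the planning process.
# Assumption: calendar days (the document does not say calendar vs business).
NOTICE_DAYS = 7


def latest_case_open_date(failover_day: date) -> date:
    """Latest date to open the case and still give the full notice."""
    return failover_day - timedelta(days=NOTICE_DAYS)


def has_full_notice(case_opened: date, failover_day: date) -> bool:
    """True if the case was opened at least NOTICE_DAYS before failover."""
    return case_opened <= latest_case_open_date(failover_day)


# Hypothetical example dates.
planned = date(2024, 3, 15)
print("Open the case on or before:", latest_case_open_date(planned))
print("Opened 2024-03-10, full notice?", has_full_notice(date(2024, 3, 10), planned))
```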

Failover Planning Checklist

For a planned failover please review the following to maintain support for your installation. We provide a Failover Planning Guide and Checklist. We provide a complete planning process and customer role.


  1. STEP 1: We require written acknowledgement of the failover mandatory steps below to be posted to the case. This process ensures you have read and understood all steps for failover to ensure a successful failover and ownership of the failover process resides with your company.

  2. STEP 2: Submit support logs for review of your environment. (NOTE: Until step 1 is complete, support is unable to review logs.) NOTE: The readiness check only includes items listed at the end of this message.

  3. STEP 3: Support will determine next steps based on review and post to the case.

  4. NOTE1: Customers must have access to support for AD, DNS, Networking, Hosts, and PowerScale. Eyeglass support can provide root cause of external component issues but is not a primary support replacement for these components.

  5. NOTE2: For legal reasons, Support cannot take control or join a Webex to assist with hands-on failover procedures for 3rd party hardware and software.

  6. NOTE3: Support agreements must be in place for all external hardware and software vendors. Functional recovery steps will be provided that require customer subject matter experts to execute on the 3rd party vendor products.

  7. NOTE4: Decisions on data protection before, during and post failover reside with customers. Support cannot be responsible for them.

  8. NOTE5: For a permanent link to policies and support expectations for planned failovers, refer to this Failover Process and Customer Role.

Technical details for failover planning are provided below. We would like to schedule a 30 minute webex to review these failover planning mandatory steps as well as the day of failover support process and answer any questions you might have. Please provide us with your availability so that we can schedule.



 

PART 1 - Per Failover Acknowledgement Required

Mandatory steps in this section need to be acknowledged for every failover.

0) What is covered by readiness check

As per your request to review readiness for failover, we will review the following for failover readiness validation:

- Access Zone Failover hints

- Access Zone Readiness status / DFS Readiness status / Policy Readiness Status

- SPN errors

- Eyeglass misconfigurations (example Eyeglass version)

- SyncIQ Domain Mark

- Planned Failover Type

- Client Redirection Guide Followed (DNS Dual delegation/DFS Dual Delegation)

- Eyeglass appliance health check

- Cluster API response health check


Anything not listed above is not checked and must be verified by you in the context of your overall failover plan.



Acknowledged: Yes/No

1-1) Mandatory - Are you using Superna Eyeglass Easy Auditor? yes or no

1-2) Mandatory - Are you using Superna Eyeglass Ransomware Defender?  yes or no

1-3) Mandatory - How many Access zones or policies will be failed over?

1-3A) Mandatory - Do you mount shares with DNS CNAMEs?
If yes, you must manually fail over DNS CNAME SPNs or, as a best practice, create an IP pool alias equal to the CNAME to ensure the SPNs are failed over by Eyeglass.

1-4) Mandatory - Are you failing over and failing back on the same day? yes or no

1-5) Mandatory - How long a maintenance window have you scheduled?

Note: If you are failing over 25 or fewer SyncIQ policies, we recommend a 4 hour maintenance window. If you are failing over more than 25 SyncIQ policies, we recommend a 6 hour maintenance window.
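
The recommendation in the note above can be expressed as a small rule. The following Python sketch is illustrative only; the function name is hypothetical and the thresholds are taken directly from the note.

```python
def recommended_window_hours(policy_count: int) -> int:
    """Recommended maintenance window in hours for a number of SyncIQ policies.

    Thresholds come from the note above: 25 or fewer policies -> 4 hours,
    more than 25 policies -> 6 hours.
    """
    if policy_count < 0:
        raise ValueError("policy_count must not be negative")
    return 4 if policy_count <= 25 else 6


# Hypothetical policy counts.
for count in (10, 25, 26):
    print(f"{count} policies -> {recommended_window_hours(count)} hour window")
```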

1-6) Mandatory - Please create new RPO reports from "Reports On Demand" window and attach to the case.

This will help us to determine average time for SyncIQ policies to complete. Please use GIF animated procedure to create RPO reports: How to create RPO report using Reports on Demand


1-7) Mandatory Step: If you are also planning a OneFS 7 to 8 upgrade, review this procedure and execute it after the upgrade to OneFS 8 for ANY cluster: PowerScale Upgrade Procedure with Eyeglass

Acknowledged: Yes/No or NA


1-8) Mandatory Step: Please upload a full Eyeglass Backup 7 days prior to failover for our review of your failover readiness.

Please upload a new set of eyeglass full backup logs (Create Full backup) by following this procedure:

How to Raise an Eyeglass Support Request


Acknowledged: Yes/No


PART 2 - One Time Acknowledgement Required

Mandatory steps in this section only need to be acknowledged once and therefore do not need to be re-acknowledged for every failover. Once Acknowledged support will assume these steps below are done for all failovers and all risks associated with these validations are accepted.


2-1) Mandatory Step: Use the latest Eyeglass release for a planned failover; otherwise you automatically accept the risks of using an older release. An N-2 release is accepted under the conditions covered in the Failover Release Notes.

Acknowledged: Yes/No

What about unplanned DR event?

An unplanned failover (a real DR event with the source cluster destroyed) does not require the latest release. However, we always expect the software to be upgraded to a supported release published here: Software Releases. We provide notifications to @eyeglassPowerScale or by email distribution.


2-2) Mandatory Step: Review the Failover Release Notes and confirm that all risks to your environment have been assessed. If you have questions, now is the time to ask.

Acknowledged: Yes/No

>> We provide notice when a release has support removed and an upgrade is required. Each release improves failover based on failovers executed by all customers, so without upgrading to the latest release you do not benefit from other customers' failovers.

>> We send all Eyeglass release notifications to all registered emails in the support portal.


2-3) Mandatory Step: Complete the failover planning checklist.

This document has links to all key documents that you need to review and steps to execute before running your DR test:

Failover Planning Guide and Checklist

Acknowledged: Yes/No


2-4) Mandatory Step: Have a contact list in place for failover day.

Acknowledged: Yes/No


Customers must provide or have access to all skills required to complete a failover and debug any issues in their IT environment which includes (AD, DNS resolution and updates if necessary, AD domain edit permissions to computer objects with ADSI Edit, PowerScale knowledge on SyncIQ operations, share/export management, Networking, firewalls, Windows logon process, Linux mount requirements, application specific knowledge that uses NAS shares)


Superna Eyeglass Support agreement is for the Superna Eyeglass product and is not a replacement for skill or support agreement for all external hardware and software vendors.


2-5) Mandatory Step: Plan to stop IO to the source cluster before failover. This is a failover best practice.

Acknowledged: Yes/No


If yes, for SMB protocol the 2.0 or later feature can be used to block IO to shares with DR assistant. This inserts a deny read permission dynamically before failover starts and removes after failover completes.


NFS requires the protocol to be disabled to guarantee no IO. Exports should be unmounted before disabling the protocol on the cluster.


If no, you will incur data loss.


2-6) Mandatory Step: Run domain mark jobs on all SyncIQ policies prior to failover.

Acknowledged: Yes/No


For an explanation, consult Eyeglass and PowerScale Failover Best Practices.


2-7) Mandatory Step: Review Operational steps and procedures for the day of failover

Acknowledged: Yes/No


  1. Operational Steps for Eyeglass Assisted Access Zone Failover

    1. How to Monitor the Eyeglass Assisted Failover

    2. How to Validate and troubleshoot A Successful Failover WHEN Data is NOT Accessible on the Target Cluster

    3. Post Access Zone Failover Steps

    4. Post Access Zone Failover Checklist

    5. Troubleshooting Failover

  2. Operational Steps for Eyeglass Microsoft DFS Mode Failover

    1. How to Monitor the Eyeglass Assisted Failover

    2. How to Validate and troubleshoot A Successful Failover WHEN Data is NOT Accessible on the Target Cluster

    3. Post Eyeglass Microsoft DFS Mode Failover Checklist

2-8) Mandatory Step: Review failover Recovery document

  1. Failover Recovery Procedures


Acknowledged: Yes/No


2-9) Mandatory - Follow the supported procedure for an uncontrolled failover in which you simulate the source cluster being unavailable

If YES, the ONLY supported procedure using Superna Eyeglass is documented here: Eyeglass Simulated Disaster Event Test Procedure.


ANY change to the documented procedure is NOT supported.


Recovery from a real uncontrolled failover is customer responsibility and is NOT covered by Superna Support.  Refer to the Failover Design Guide - How to Execute A Failover with DR Assistant for more information on execution and responsibility for recovery on a real uncontrolled failover. 


2-10) Mandatory - For multi-phase failover upload Eyeglass full backup between failovers for assessment of environment to reduce risk for subsequent failovers. Support level is reduced without backup being provided.

Acknowledged: Yes/No

 

© Superna Inc