Storage Failover with Eyeglass Failover Modes
The following section outlines the storage layer failover steps. A full end-to-end DR plan should also include application shutdown and bring-up procedures. The storage layer is the foundation on which all higher-layer failover depends, and Eyeglass makes this step simple to execute and detects errors during failover.
Superna Professional Services can be engaged for an end-to-end POC, or for recommendations and assessments of complex or application-layer orchestrated failover scenarios. Examples include:
- VMware SRM + storage externally mounted by VMs.
- Oracle RAC Data Guard + File System dependencies for applications.
- Please see Eyeglass Solutions page.
Once you have determined which Failover Mode is appropriate for your environment, the table below provides the high level steps for each mode:
Column 1 - Ordered Steps: Ordered steps and purpose of step.
Column 2 - Description: Description of action taken by step.
Columns 3 and 4 - How Action is Initiated: How each step is executed with Eyeglass, depending on whether a SyncIQ Policy failover (column 3) or a Microsoft DFS Mode failover (column 4) is being done.
Column 5 - Ordered Steps for Access Zone OR IP Pool Failover: Ordered steps and purpose of step.
Column 6 - Description: Description of action taken by step.
Column 7 - How Action is Initiated: How each step is executed for an Access Zone or IP Pool Failover.
Target of operation is shown in brackets as source, target or Eyeglass in the table below.
Ordered Steps for Non-DFS and DFS Mode | Description | How Action is Initiated (SyncIQ Mode) | How Action is Initiated (DFS Mode) | Ordered Steps for Access Zone OR IP Pool Failover | Description | How Action is Initiated (Access Zone OR IP Pool Failover) |
1 - Ensure that there is no live access to data (source) (see the new feature in 1a in 2.0 or later) | Manual check for open files. If open files are found, decide whether to fail over or wait for them to be closed. NOTE: The DR Assistant Data Integrity failover option in 2.0 or later releases blocks I/O to SMB shares before failover. To eliminate data loss, it is recommended to always disable the SMB and NFS protocols on the SOURCE cluster prior to failover; note that this is a CLUSTER-WIDE operation. | Manual | Manual | 1 - Ensure that there is no live access to data (source) | Manual check for open files. If open files are found, decide whether to fail over or wait for them to be closed. To eliminate data loss, it is recommended to always disable the SMB and NFS protocols on the SOURCE cluster prior to failover; note that this is a CLUSTER-WIDE operation. | Manual |
1a - Enable Data Integrity Failover (SMB only) | Applies Deny Everyone to SMB shares before failover starts | Automated by Eyeglass | Automated by Eyeglass | 1a - Enable Data Integrity Failover (SMB only) | Applies Deny Everyone to SMB shares before failover starts | Automated by Eyeglass |
1b - Cache schedule for SyncIQ policies being failed over and prevent SyncIQ policies being failed over from running (source) | Get schedule associated with the SyncIQ policies being failed over on OneFS, set policies to manual so they don’t run again during failover | Automated by Eyeglass | Automated by Eyeglass | 1b - Cache schedule for SyncIQ policies being failed over and prevent SyncIQ policies being failed over from running (source) | Get schedule associated with the SyncIQ policies being failed over on OneFS, set policies to manual so they don’t run again during failover | Automated by Eyeglass |
2 - Begin Failover with DR Assistant (Eyeglass) | Initiate Failover from Eyeglass | Manual or Eyeglass REST API | Manual or Eyeglass REST API | 2 - Begin Failover with DR Assistant (Eyeglass) | Initiate Failover from Eyeglass | Manual or Eyeglass REST API |
3 - Validation of failover job (Eyeglass) | Verify all warnings before submitting the failover job | Automated by Eyeglass | Automated by Eyeglass | 3 - Validation of failover job (Eyeglass) | Verify all warnings before submitting the failover job | Automated by Eyeglass |
3a - Validation - Block on Warning enabled | Prevents a failover from continuing when warnings are present (for example, detection that a quota scan is required) | Automated by Eyeglass | Automated by Eyeglass | 3a - Validation - Block on Warning enabled | Prevents a failover from continuing when warnings are present (for example, detection that a quota scan is required) | Automated by Eyeglass |
3b - Set Eyeglass config Jobs to userdisabled | Sets config jobs to the user-disabled state so that, if a failover step fails, these jobs do not run unless a user enables them post failover | Automated by Eyeglass | Automated by Eyeglass | 3b - Set Eyeglass config Jobs to userdisabled | Sets config jobs to the user-disabled state so that, if a failover step fails, these jobs do not run unless a user enables them post failover | Automated by Eyeglass |
4 - Synchronize data (Run SyncIQ policies) (source) (parallelized step) [3] | Run all OneFS SyncIQ policy jobs related to the Access Zone being failed over | Automated by Eyeglass | Automated by Eyeglass | 4 - Synchronize data (run SyncIQ policies) (source) (parallelized step) [3] | Run all OneFS SyncIQ policy jobs related to the Access Zone being failed over | Automated by Eyeglass (all policies in the Access Zone) |
5 - Synchronize configuration (shares/export/alias, snapshot schedules, dedupe paths) (Eyeglass) (parallelized step) [3] | Run Eyeglass configuration replication | Automated by Eyeglass (configuration exists on source and target) | Automated by Eyeglass | 5 - Synchronize configuration (shares/export/alias, snapshot schedules, dedupe paths) (Eyeglass) (parallelized step) [3] | Run Eyeglass configuration replication | Automated by Eyeglass (based on matching Access Zone base path) |
6 - Renaming shares in DFS mode to redirect DFS clients (multi-threaded) (parallelized step) [3] | For DFS Failover, shares are renamed on the source and target clusters so that clients with dual DFS target paths are redirected to the target cluster | Not Applicable | Automated by Eyeglass (special handling renames shares on source and target so that only one DFS target UNC is reachable and active for DFS clients to switch over) | NOTE: It is possible to integrate DFS protected data inside an Access Zone failover to protect shares, exports and DFS data with Access Zone failover. | If DFS is configured, the redirect rename steps are executed at this point | Automated by Eyeglass (based on matching Access Zone base path) |
| | | | 6 - Change SmartConnect Zone on Source so that it is not resolved by clients (source) (dual delegation eliminates DNS updates) | Rename SmartConnect Zones and Aliases (Source) | Automated by Eyeglass (based on matching Access Zone base path) |
| | | | 7 - Avoid SPN Collision (source) | Sync SPNs in all AD providers to current SmartConnect Zone names and aliases (proxied through target cluster) (Source) | Automated by Eyeglass (AD delegation must be completed as per install docs) |
9 - Provide write access to data on target (target) (single threaded for safe failover) (parallelized step) [3] | Allow writes to SyncIQ policy(s) related to failover [2] | Automated by Eyeglass | Automated by Eyeglass | 8 - Move SmartConnect Zone to Target (target) | Add source SmartConnect Zone(s) and Alias(es) on (Target) | Automated by Eyeglass |
10 - Resync prep step SyncIQ - Disable SyncIQ on source and make active on target (source) (parallelized step) [3] | Resync prep SyncIQ policy step to failover (creates Mirror Policy on target, disables source cluster policy, and enables target cluster policy on OneFS) | Automated by Eyeglass | Automated by Eyeglass | 9 - Update SPN to allow for authentication against target (target) | Sync SPNs in all AD providers to current SmartConnect Zone names and aliases (proxied through target cluster) (Target) | Automated by Eyeglass |
11 - Reset SyncIQ schedule on target mirror policy (target) | Set schedule on Mirror Policy (Target) using schedule from step 1 from OneFS for policy(s) related to the Failover job | Automated by Eyeglass | Automated by Eyeglass | 10 - Repoint DNS to the Target cluster IP address | DNS Dual delegation for all SmartConnect Zones that are members of the Access Zone | Automated by Eyeglass (See "Geographic Highly Available Storage solution with Eyeglass Access Zone Failover and Dual Delegation") |
12 - Failover quota(s) (Eyeglass) (optional can be skipped) (parallelized step) [3] | Eyeglass DR Assistant automatically fails over quotas by running the Quota Jobs related to the SyncIQ Policy(s) being failed over | Automated by Eyeglass (deleted on source cluster and created on target cluster) | Automated by Eyeglass (deleted on source cluster and created on the target cluster so that post failover quotas are applied) | 12 - Failover quota(s) (Eyeglass) (optional can be skipped) (parallelized step) [3] | Eyeglass DR Assistant automatically fails over quotas by running the Quota Jobs related to the SyncIQ Policy(s) being failed over | Automated by Eyeglass (deleted on source cluster and created on target cluster) |
13 - Remove quotas on directories that are target of SyncIQ (PowerScale best practice) (source) (parallelized step) [3] | Eyeglass deletes all quotas on the source for all the policies | Automated by Eyeglass | Automated by Eyeglass | |||
13a - Run Mirror policy (parallelized step) [3] | Run policy to resync data in reference direction | Automated by Eyeglass | Automated by Eyeglass | 13a - Run Mirror policy (parallelized step) [3] | Run policy to resync data in reference direction | Automated by Eyeglass |
13b - Set Eyeglass config Jobs to enabled | Enables configuration sync ONLY if Resync prep completes successfully for the policy | Automated by Eyeglass | Automated by Eyeglass | 13b - Set Eyeglass config Jobs to enabled | Enables configuration sync ONLY if Resync prep completes successfully for the policy | Automated by Eyeglass |
14 - Change SmartConnect Zone on Source so that names are not resolved by Clients (source) | Rename SmartConnect Zones and Aliases (Source) | Manual | Not Required (source and destination clusters can use existing SmartConnect Zones) | 14 - Disable SyncIQ on source and make active on target (source) | Resync prep SyncIQ policy step to failover (creates Mirror Policy on target, disables source cluster policy, and enables target cluster policy on OneFS) (parallelized step) [3] | Automated by Eyeglass |
15 - Avoid SPN Collision (source) | Sync SPNs in all AD providers to current SmartConnect Zone names and aliases (Source) | Manual (deletes SmartConnect SPN from source cluster machine account) | Not Applicable (DFS SPNs are not changed during failover) | 15 - Set proper SyncIQ schedule on target (target) | Set schedule on Mirror Policy (Target) using schedule from step 6 from OneFS for policy(s) related to the Failover | Automated by Eyeglass |
16 - Move SmartConnect Zone to Target (target) | Add source SmartConnect Zone(s) as Alias(es) on (Target) | Manual | Not Required (source and destination clusters can use existing SmartConnect Zones) | 16 - Synchronize quota(s) (Eyeglass) (parallelized step) [3] | Run Eyeglass Quota Jobs related to the SyncIQ Policy or Access Zone being failed over | Automated by Eyeglass |
17 - Create SPNs to allow for Kerberos authentication against target for SMB shares (target) | Sync SPNs in all AD providers to current SmartConnect Zone names and aliases (Target) | Manual (adds new SmartConnect alias SPNs to target cluster machine account) | Not Applicable (DFS SPNs are not changed or registered to Cluster machine accounts) | 17 - Remove quotas on directories that are target of SyncIQ (PowerScale best practice) (source) (parallelized step) [3] | Delete all quotas on the source for all the policies | Automated by Eyeglass (requires that IP pool hints are configured; see docs) |
18 - Repoint DNS to the Target cluster IP address | Update DNS delegations for all SmartConnect Zones that are members of the Access Zone | Manual | Not Applicable (no updates are needed because DFS resolution has not changed in DNS, only the target UNC with an active share) | 18 - Repoint DNS to the Target cluster IP address | Dual Delegation feature with Eyeglass avoids any DNS steps during failover for all SmartConnect Zones that are failed over | Automated by Eyeglass (see dual delegation one time configuration here) |
19 - Refresh session to pick up DNS change | Remount the SMB/NFS share(s) (if Data Integrity failover was used, a remount is not required for SMB) | Manual on clients | Automatic (Windows 7 or later with DFS support) | 19 - Refresh session to pick up DNS change | Remount the SMB/NFS share(s) or remount exports (if Data Integrity failover was used, a remount is not required for SMB). See IP pool Interface Removal procedure for SMB. | Automated by Eyeglass using Dual SmartConnect Zone Delegation (How to Configure Here) |
[1] Initiates the Eyeglass Configuration Replication task for all Eyeglass jobs.
[2] SyncIQ does NOT modify the ACLs (access control settings on the file system); it locks the file system. The output of ls -l will be identical on both source and target.
[3] A System.xml change is required to enable parallel step mode. NOTE: All policy steps are attempted. On failure, the failover job continues to attempt all steps, and skips the downstream steps of any policy whose previous SyncIQ step failed.
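The failure handling described in the parallel step mode note can be pictured with a short Python sketch. It is only an illustrative model, not Eyeglass code: the step names, the `run_step` callback, and the policy names are all hypothetical, and the sketch does not reproduce how Eyeglass schedules work internally. It shows one behavior only: every policy is processed, and once a step fails for a policy, the remaining steps for that policy are skipped while other policies continue.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical step names, loosely following the per-policy order in the table.
STEPS = ["run_synciq_policy", "allow_writes", "resync_prep", "run_mirror_policy"]

StepFn = Callable[[str, str], bool]  # (policy, step) -> True on success


def fail_over_policy(policy: str, run_step: StepFn) -> List[Tuple[str, str]]:
    """Run the steps for one policy in order; after a failure, skip the rest."""
    results: List[Tuple[str, str]] = []
    failed = False
    for step in STEPS:
        if failed:
            results.append((step, "skipped"))
            continue
        try:
            ok = run_step(policy, step)
        except Exception:
            ok = False  # treat an exception the same as a failed step
        results.append((step, "ok" if ok else "failed"))
        failed = not ok
    return results


def fail_over_all(policies: List[str], run_step: StepFn) -> Dict[str, List[Tuple[str, str]]]:
    """Process all policies in parallel; one policy's failure does not stop the others."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {p: pool.submit(fail_over_policy, p, run_step) for p in policies}
        return {p: f.result() for p, f in futures.items()}


def demo_step(policy: str, step: str) -> bool:
    """Stand-in for real work: simulate one failing step on one policy."""
    return not (policy == "policy-b" and step == "allow_writes")


if __name__ == "__main__":
    report = fail_over_all(["policy-a", "policy-b"], demo_step)
    for policy, steps in report.items():
        print(policy, steps)
```

In this model, `policy-a` completes every step, while `policy-b` records `allow_writes` as failed and its later steps as skipped. A per-policy report of this kind makes it easy to see which policies need manual follow-up after the job ends.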