Administration Guides

Golden Copy Backup Bundle & Adv License and Pipeline Configuration Steps


Overview

This topic covers installations licensed with the Backup Bundle or Advanced license key features.  These features require the corresponding license key to be installed.


Golden Copy Advanced License And Backup Bundle Feature Configuration

Overview

These features require the Advanced license add-on for Golden Copy to use the backup use case features.  The backup features make backup workflows easier to manage and monitor and add automated reporting.  New data protection features enable several new workflows: data integrity features, API access for automation with external tools, and new reporting options.

Requirements

  1. Advanced Golden Copy license applied
  2. Golden Copy Backup Bundle license

How to Assign an Advanced License to a Cluster

  1. A Golden Copy Advanced license enables the features in this guide for a cluster managed by Golden Copy.  Use the command below; an example follows the note.
  2. searchctl isilons license --name <clusterName> --applications GCA

NOTE:  An available Advanced license must exist for this command to succeed, and the cluster also requires a base Golden Copy license before an Advanced license key can be applied.
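For example, assuming an available Advanced license and a cluster already added to Golden Copy under the placeholder name prodcluster, the assignment would look like this:

  searchctl isilons license --name prodcluster --applications GCA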


Powerscale Cluster Configuration Backup

  1. Backing up data alone does not protect the cluster configuration needed to recover your cluster.  This feature collects a backup of the shares, exports and quotas using the OneFS 9.2 configuration export feature and stores a copy of this configuration in the S3 bucket configured to store the backup.
  2. Configuration backup jobs can be scheduled to be generated on the cluster, and the incremental, changelist-aware backup will pick up the new backup job files and copy them to the bucket.

Requirements

  1. OneFS 9.2 or later

Cluster Configuration Backed Up

  1. Quotas
  2. SMB shares
  3. NFS exports
  4. S3 Configuration
  5. Snapshot settings
  6. NDMP configuration settings
  7. HTTP settings

Procedures To Configure Cluster Backup

How to run an immediate cluster backup
    1. searchctl isilon config  (shows help)
    2. searchctl isilon config --name <isilon-name> --backup  backs up the configuration immediately and returns the task ID of the backup job.  The task ID will be referenced in the folder name copied to the S3 bucket.
    3. NOTE:   This only generates a backup on the cluster.  It does not copy the backup to S3 until a folder configuration is added and run.  Follow the steps below; a usage sketch follows this note.
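    As a minimal usage sketch, assuming a cluster added to Golden Copy under the placeholder name prodcluster, an immediate configuration backup would be triggered as follows; the task ID that is returned later appears in the folder name copied to the S3 bucket:

      searchctl isilon config --name prodcluster --backup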

    How to enable a folder to store a cluster backup

    1. A new flag for the searchctl archivedfolders add command specifies the S3 bucket that will store the cluster backup.  This is a required step to protect the cluster configuration.  The flag is used instead of the --folder path; it auto-populates the correct path to back up the cluster configuration data and copy it to the configured S3 bucket.
    2. searchctl archivedfolders add --backup-isilon  using this flag sets the folder path to ISILON_BACKUP_CONFIGURATIONS_PATH automatically. It cannot be used together with --folder.  Complete the other mandatory flags needed to copy data to the S3 bucket.
    3. Example for AWS with a daily backup.  NOTE: This will back up the cluster configuration 5 minutes after midnight, which gives the cluster backup 5 minutes to be created.  The incremental schedule uses 5 0 * * * and the first field means 5 minutes after the hour.  This value can be changed if needed.
      1. searchctl archivedfolders add --isilon <cluster name> --backup-isilon --accesskey xxxxxxx --secretkey yyyyyy --endpoint <region endpoint> --region <region> --bucket <backup bucket> --cloudtype aws --incremental-schedule "5 0 * * *"
      2. The job definition will fill in the correct path to pick up the cluster backup.



    How to set up a backup schedule for the cluster configuration backup

    1. This step is needed to create a new backup of the cluster configuration and should be set to once per day as a best practice.
    2. searchctl isilon config --name <isilon-name> --set-backup-cron  sets the cron schedule for backing up the configuration; a value of "NEVER" or "DISABLED" deactivates the schedule.  Examples follow this list.
      1. searchctl isilon config --name <isilon-name> --set-backup-cron "0 0 * * *" 
    3. Done
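    For example, assuming a cluster named prodcluster (placeholder name), the schedule could be set to run daily at midnight, or deactivated later, using the values described above:

      searchctl isilon config --name prodcluster --set-backup-cron "0 0 * * *"
      searchctl isilon config --name prodcluster --set-backup-cron "NEVER"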


    How to Restore a Configuration Backup from the S3 Bucket to the Source Cluster

    1. NOTE: Never use this command unless you need to restore the configuration; this command will overwrite all shares, exports and quotas on the named cluster.
    2. Browse the S3 bucket with the Cloud Browser GUI, locate the bucket/clustername/ifs/data/Isilon_Support/config_mgr/backup path, and find the subfolder name that contains the backup you want to use for the restore.  More than one backup may exist to choose from.
    3. searchctl isilon config --name <isilon-name> --restore <id>  restores the configuration backup named by id and returns the task ID of the restore job.
      1. example
        1. searchctl isilons config --name gcsource93 --restore gcsource93-20220312020124
        2. The job will read the backup data from the bucket and restore it to the source cluster.
        3. Done.
    4. (Optional, advanced) A new environment variable sets the path used to store the backup in the S3 bucket.
      1. Edit /opt/superna/eca/eca-env-common.conf and add this variable to change the default.
      2. export ISILON_BACKUP_CONFIGURATIONS_PATH=xxxx   This is the path of the folder in which the backup job will create the new backup folder. The path is fixed by OneFS, but the environment variable is provided in case this path changes in the future.
      3. Default path: /ifs/data/Isilon_Support/config_mgr/backup



    Cloud Storage Tier Aware Copy and Sync (AWS, Azure)

    1. This feature allows copying or syncing data directly into an archive tier for AWS S3 and Azure targets.  Google Cloud Storage requires creating the bucket with a storage class and does not support setting the tier of individual objects.
      1. This feature avoids life cycle policies to move objects to different tiers.
    2. The add folder and modify folder CLI command allows specifying the tier that objects should be copied into.
      1. [--tier] default is standard (AWS), Cool (Azure)
        1. Requires Golden Copy Advanced license or Backup Bundle license
        2. Requires 1.1.6
        3. Azure
          1. Flag to specify the tier that API calls are sent to; this should match the container tier configuration (Access tier options for Azure are hot, cool, archive).  Without this flag the default is cool.
          2. NOTE: The Archive tier does not support versioning and only allows one version of an object to exist.  This also means incremental mode should be disabled on this tier.   This tier is intended for cold data without any modifications on the source.
        4. AWS
          1. Specify the AWS tier using STANDARD (default), STANDARD_IA, GLACIER, DEEP_ARCHIVE, INTELLIGENT_TIERING, ONEZONE_IA, OUTPOSTS, REDUCED_REDUNDANCY or GLACIER_IR (Glacier Instant Retrieval tier).  Use the upper case tier name.
          2. NOTE: Not all tier options are valid for all use cases.  Consult AWS documentation.
    3. Example command
      1. searchctl archivedfolders add --isilon gcsource --folder /ifs/archive --secretkey  xxx --endpoint blob.core.windows.net --container gc1 --accesskey yyyy --cloudtype azure --tier archive


    Version Aware Recall / Restore

    Overview

    1. Requires
      1. Release 1.1.6
    2. Overview
      1. A path based recall solution for bulk data recall.  
      2. NOTE:  It is not a file based recall solution.
    3. Limitations
      1. On a folder with an active archive job, recall jobs will be blocked until the archive job completes.  This prevents trying to recall data that is not yet copied.
    4. Full and incremental copies with S3 bucket versioning allow multiple versions of files to be protected using S3 policies configured on the target storage. The storage bucket must have versioning enabled, and the folder should be configured in sync mode or have run a copy job multiple times to detect file changes and update objects with a new version.  NOTE: Storage bucket version configuration is external to Golden Copy; consult your S3 storage device documentation.
      1. This allows recall jobs to select files based on a date range, using older than X date or newer than Y date.  Files are selected based on the object creation date (the date the backup ran) using the older than / newer than flags on the recall job.
      2. NOTE: The date range is evaluated against the object creation date in the version history of an object.  This date is when the object was backed up.
      3. NOTE: If you run multiple recall jobs with the same path, files in the recall staging area under /ifs/goldencopy/recall will be overwritten if they already exist.
    5. Use Case #1: Recall "hot data" first
      1. Use this solution when a large amount of data needs to be recalled/restored but you want the most recent data recalled first.  Use the --newer-than flag to select a date, for example 2 weeks in the past.
      2. Example
        1. searchctl archivedfolders recall --id 3fd5f459aab4f84e --subdir /ifs/xxx --newer-than "2020-09-14 14:01:00" --apply-metadata
    6. Use Case #2: Recall "cold data" last
      1. Use this solution to recall data after the "hot data" has been recalled, since it has not been recently updated.  Using the same date 2 weeks in the past with the --older-than flag starts a recall job that locates and recalls unmodified data that is at least 2 weeks old.
      2. Example
        1. searchctl archivedfolders recall --id 3fd5f459aab4f84e --subdir /ifs/xxx --older-than "2020-09-14 14:01:00" --apply-metadata

    Non-Version Aware Recall / Restore

    1. Overview:
      1. This feature also adds the ability to scan the metadata in the object properties to recall files based on the created or modified date stamps of the files as they existed on the file system at the time they were backed up.
    2. Limitations
      1. On a folder with an active archive job, recall jobs will be blocked until the archive job completes. This prevents trying to recall data that is not yet copied.
    3. The recall command adds two new options with the following date syntax. Use double quotes.
      1. NOTE: The times entered must be in the UTC time zone.
      2. --newer-than  "<date and time>" (yyyy-mm-dd HH:MM:SS e.g 2020-09-14 14:01:00)
      3. --older-than "<date and time>"  (yyyy-mm-dd HH:MM:SS e.g 2020-09-14 14:01:00)
    4. Use Case #3: Recall files with a specific created or modified time stamp
      1. This use case allows scanning the metadata that Golden Copy encodes into the properties of the objects as criteria to select data to recall, based on the created or modified time stamp of the files.
      2. [--start-time STARTTIME]  (yyyy-mm-dd HH:MM:SS e.g 2020-09-14 14:01:00)
      3. [--end-time ENDTIME] (yyyy-mm-dd HH:MM:SS e.g 2020-09-14 14:01:00) 
      4. [--timestamps-type {modified, created}]  The default is the modified date stamp.  Files are backed up with created and modified time stamps; this flag selects which time stamp to use when evaluating the date range.
      5. Example to scan for files with a last modified date stamp between Sept 14, 2020 and Sept 30 2020 under the /ifs/xxx folder.
        1. searchctl archivedfolders recall --id 3fd5f459aab4f84e --subdir /ifs/xxx --start-time "2020-09-14 14:01:00"  --end-time "2020-09-30 14:01:00" --apply-metadata 
      6. Example to scan for files with a created date stamp between Sept 14, 2020 and Sept 30, 2020 under the /ifs/xxx folder.
        1. searchctl archivedfolders recall --id 3fd5f459aab4f84e --subdir /ifs/xxx --start-time "2020-09-14 14:01:00"  --end-time "2020-09-30 14:01:00" --timestamps-type created --apply-metadata 

    Power User Restore Portal

    Overview

    1. This feature targets power users who need to log in and recall data but should not be administrators within the Golden Copy appliance.  The solution is SMB share security overlay aware: after AD login to the Golden Copy appliance, the user's SMB share paths are compared to the backed up data, restricting visibility of the data in the Cloud Browser interface.  The user is also limited to seeing only icons related to Cloud Browser.

    Requirements

    1. Golden Copy VM node 1 must have SMB TCP port 445 access to the clusters added to the appliance
    2. SMB shares on a path at or below the folder definitions configured in Golden Copy
    3. Disable Admin Only mode on the Golden Copy appliance


    User Login and Data Recall Experience

    1. The user must log in with user@domain and password
    2. The options in the UI are limited to Cloud Browser and stats.
    3. Select the Cloud Type and bucket name
      1. The displayed data is based on the SMB share permissions of the user.  In this case an SMB share is created on path /ifs/archive/incrementaltest and this is the only path the user is allowed to see.
      2. The user can only browse backed up data under SMB share paths; all other data is hidden from view.
      3. The user can select a folder check box and the Recall icon will appear.
      4. Clicking the icon will launch a recall job.
      5. NOTE:  The recall data path is not the original data location; the power user would need access via NFS or SMB to the /ifs/goldencopy/recall/ifs/archive/incrementaltest path to retrieve the data.   Designing a user restore solution requires some planning with SMB shares to grant access to data that has been recalled.



    How to redirect recall object data to a different target cluster 

    Overview

    1. Use this option to add a cluster to Golden Copy that does not require a license, since it will be used as a recall target only.  This option requires the Advanced or Backup Bundle license key.
      1. Requirements
        1. Release 1.1.6
        2. Advanced license key or Backup Bundle license
    2. This process requires adding the target cluster to Golden Copy; this cluster does not require a license when using the --goldencopy-recall-only flag.
      1. searchctl isilons add --host IP address --user EyeglassSR [--isilon-ips x.x.x.x, y.y.y.y] --goldencopy-recall-only   
    3. Add a folder definition that specifies the new cluster name added above and keep all other folder settings the same.  You can locate the original folder add command using this command:
      1. history | grep -i "searchctl archivedfolders add"
      2. Modify the command: in searchctl archivedfolders add --isilon xxxx {keep all other folder settings the same}, replace xxxx with the name of the cluster added in the steps above.
    4. Start the recall job and specify the source path where the data is stored in the S3 bucket.  A worked sketch follows this list.
      1. searchctl archivedfolders recall  --id YY  --source-path 'xxxx/ifs/data/<path to data to recall>'  --apply-metadata 
        1. Replace xxxx with the new cluster name
        2. Replace YY with the new folder ID created when adding the folder in the steps above.
        3. Replace <path to data to recall>; for example /ifs/data/backup will recall all data under this folder to the cluster xxxx.
        4. NOTE:  The recall NFS mount must be created on the new target cluster on the /ifs/goldencopy/recall path before a recall can be started. 
      2. This will start a recall job and data will be recalled to the new cluster added to the folder definition.
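    A worked sketch of the full sequence, using placeholder values throughout (target cluster name dr-cluster, host IP 10.1.1.10, folder ID 12ab34cd, and an example source path), and assuming the recall NFS mount already exists on the target cluster:

      # Add the target cluster as a recall-only target (no license consumed)
      searchctl isilons add --host 10.1.1.10 --user EyeglassSR --goldencopy-recall-only
      # Re-add the folder definition against the new cluster, keeping all other original folder settings
      searchctl archivedfolders add --isilon dr-cluster <other original folder parameters>
      # Start the recall from the path stored in the S3 bucket
      searchctl archivedfolders recall --id 12ab34cd --source-path 'dr-cluster/ifs/data/backup' --apply-metadata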


    Target Object Store Stats Job to Monitor the backup object store

    Overview

    This job type will scan all the objects protected by a folder definition and provide a count of objects, the sum total of data, and the age of the objects.

    In addition it will summarize:

    1. File count
    2. Total data stored
    3. Oldest file found
    4. Newest file found

    How to Report on the Object Count and Quantity of Data Protected by a Folder Definition

    1. searchctl archivedfolders s3stat --id <folderID>
      1. NOTE:  Cloud provider storage usage charges will be assessed based on API list and get requests to objects.   The job can be canceled at any time using the searchctl jobs cancel command.
      2. Use the searchctl jobs view --follow --id xxxxx (job ID) to view the results
    2. Sample output summarizes the file count, total data stored, and the oldest and newest files found.  See the usage sketch below.
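    As a usage sketch, assuming a folder with the placeholder ID 12ab34cd, the stats job would be started and then monitored using the job ID printed when the job starts:

      searchctl archivedfolders s3stat --id 12ab34cd
      searchctl jobs view --follow --id <job ID returned above>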


    How to Run a Data Integrity Job

    1. Overview
      1. This feature provides a data integrity audit of folders by leveraging the metadata checksum custom property. Random files are selected for audit on the target device; each file is downloaded, its checksum is computed, and the result is compared to the checksum stored in the metadata.   If any files fail the audit, the job report summarizes the failures and successful audit passes.   This verifies that your target device stored the data correctly.
    2. Requirements
      1. Release 1.1.6 update 2 or later
      2. NOTE: If data was copied without a checksum applied, the job will return a 100% error rate.
      3. This feature requires that data is copied with the --checksum ON global flag enabled.   See the Global configuration settings.
        1. searchctl archivedfolders getConfig
        2. searchctl archivedfolders configure -h
    3. How to run a data integrity job
      1. searchctl archivedfolders audit --folderid yy --size x   (x is the total GB of data that will be randomly audited, and yy is the folder ID that will be audited).  An example follows these steps.
      2. Using the job id to monitor the success of the job
        1. searchctl jobs view --id   xxxxx 
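      For example, to audit 10 GB of randomly selected data on a folder with the placeholder ID 12ab34cd and then monitor the result (both values are illustrative):

        searchctl archivedfolders audit --folderid 12ab34cd --size 10
        searchctl jobs view --id <job ID returned above>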


    How to Enable Ransomware Defender Smart Airgap

    1. This feature integrates with Ransomware Defender to enable Smart Airgap.  Full or incremental jobs are blocked if an active alarm is raised in Ransomware Defender or by an Easy Auditor Active Auditor trigger.
    2. Log in to node 1 of Golden Copy as ecaadmin
    3. nano /opt/superna/eca/eca-env-common.conf
    4. Generate a new API token in Eyeglass to authenticate Golden Copy API calls.  This can be completed from the Eyeglass GUI: main action menu, Eyeglass REST API, API Tokens, then create a new token named for Golden Copy.
    5. Add these variables and enter the Eyeglass IP address and API token.  A consolidated example follows these steps.
    6. export EYEGLASS_LOCATION=x.x.x.x
    7. export EYEGLASS_API_TOKEN=yyyy
      1. Copy the API token created in the Eyeglass API menu to replace yyyy
    8. export ARCHIVE_RSW_CHECK_THRESHOLD=WARNING
      1. Options are WARNING, MAJOR, CRITICAL to determine the severity of alarm that will block the backup process.  If set to WARNING, all severities will block; if set to MAJOR, warnings will be ignored; if set to CRITICAL, warning and major will be ignored.
    9. export ARCHIVE_RSW_CHECK_INTERVAL_MINS=1
      1. How often to poll Eyeglass for ransomware events; 1 minute is recommended
    10. export ARCHIVE_RSW_CHECK_ENABLED=TRUE
      1. TRUE / FALSE to enable the functionality.  TRUE is required to activate the feature.
    11. control + x to save the file
    12. ecactl cluster down
    13. ecactl cluster up
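    After completing the steps above, the Smart Airgap entries in /opt/superna/eca/eca-env-common.conf would look like the sketch below; the IP address is a placeholder and the token value comes from the Eyeglass API token created earlier:

      export EYEGLASS_LOCATION=192.168.1.50
      export EYEGLASS_API_TOKEN=<API token from Eyeglass>
      export ARCHIVE_RSW_CHECK_THRESHOLD=WARNING
      export ARCHIVE_RSW_CHECK_INTERVAL_MINS=1
      export ARCHIVE_RSW_CHECK_ENABLED=TRUE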


    Media Mode

    Overview

    Media workflows and applications generate symlinks and hardlinked files in the file system to optimize disk space.  These file system structures allow efficient re-use of common files in a media project space.   The challenge with this data structure is that when these files are backed up from file to object, each symlink or hardlink is treated as a full sized unique file.   This causes an explosion of the data set in the object store, and objects recalled back to a file system lose the symlink and hardlink references.

    Media mode in Golden Copy is designed to automatically detect symlinks and hardlinks, back up the real file only once, and maintain the references to this file to a) optimize backup performance and b) preserve the links in the object store.   This feature allows recalling the data back to the file system and will recreate the symlinks and hardlinks to maintain the optimized data structure.

    This mode can be enabled or disabled and supports full backup and incremental detection of new symlinks or hardlinks created between incremental sync jobs.

    Requirements

    1. License key required - Pipeline subscription

    Important Use Case Information

    1. The source and target of a hard link or symlink must be under the archive folder definition.

    Configuring Hardlink and Symlink Aware Mode

    1. Follow these steps to enable media mode
      1. login to the Golden Copy node 1
      2. nano /opt/superna/eca/eca-env-common.conf
      3. add the following variables
      4. export IGNORE_SYMLINK=false
      5. export IGNORE_HARDLINK=false
      6. control+x  (to save and exit)
      7. ecactl cluster down
      8. ecactl cluster up
      9. done. 


    Symlink and Hardlink Stats and Job View

    1. When media mode is enabled the following stats are available along with job view details.
    2. The Job View summary provides symlink and hardlink counts and MB copied, along with skipped files that did not need to be copied as redundant copies.   A calculation of the data copy savings when duplicate data is not copied is also provided.
    3. The Stats view summary now has detailed stats on inodes, symlinks, hardlinks and skipped copies when data reduction has occurred due to duplicate data removed from the copy job.


    Media Workflows Dropbox Folder or S3 Bucket Feature

    1. This feature allows a folder on the cluster to be monitored for new files and synced to an S3 bucket, with the source files deleted after the copy completes.  This enables a workflow where completed work is dropped into a folder and moved to the production finished-work storage bucket.  This feature manages the on premise to cloud direction.    With the Pipeline license it is also possible to offer an S3 drop box bucket, where new data copied into the bucket is copied to on premise storage and then deleted from the bucket.

    Use Cases

    1. On premise to Cloud drop box
    2. Cloud to on premise drop box - Pipeline license required

    Topology Diagram


    Configuration On Premise Drop box

    1. ecactl search archivedfolders add <add folder parameters> --delete-from-source
      1. This flag will apply delete from source rule to every full archive of this folder.
    2. ecactl search archivedfolders archive --id <folderid> --delete-from-source
      1. This flag will apply delete from source rule to this full archive only.

    Configuration Cloud Bucket Drop Box

    1. This feature allows a bucket in the cloud to receive data from various sources and have it automatically synced to the on premise cluster on a schedule; once the sync is completed, the source files in the bucket are removed.  Two solutions control how data is moved from cloud to on premise storage: one deletes data after the copy is completed, and the second moves data to a trashcan bucket with a TTL lifecycle policy applied.
    2. Move Data
      1. searchctl archivedfolders add --folder /bucket1 --isilon <name of cluster> --source-path '/bucket path' [--pipeline-data-label xxxxx] --recall-schedule "*/30 * * * *" --cloudtype {aws, azure, ecs, other} --bucket <bucket name> --secretkey <> --accesskey xxxxx --trash-after-recall
        1. This will delete the files in the cloud bucket after they have been recalled.
    3. Move with trashcan
      1. searchctl archivedfolders add --folder /bucket1 --isilon <name of cluster> --source-path '/bucket path' [--pipeline-data-label xxxxx] --recall-schedule "*/30 * * * *" --cloudtype {aws, azure, ecs, other} --bucket <bucket name> --secretkey <> --accesskey xxxxx --trash-after-recall --recyclebucket <trashbucket>
        1. The --recyclebucket flag specifies a bucket used to store data deleted from the main bucket.  A bucket policy can then define how long the deleted data should remain before it is purged from the trash bucket.  This provides a time period to recover data from the trash bucket if needed.


    Long Filesystem Path Support

    1. Overview

      1. Many media workflows create long path names, which are supported by NFS and POSIX file systems.  These path lengths can exceed the S3 object key limit of 1024 bytes, which blocks copying folder data at deep depths in the file system.
    2. Requirements

      1. OneFS 9.3 or later supports REST API calls for a path that is 4K bytes long
      2. Golden Copy 1.1.7 build 22067 or later
    3. Feature

      1. This feature supports long path names that exist in the file system and need to be copied to S3.  S3 limits object key length to 1024 bytes, which limits the filesystem path that can be directly copied into the S3 bucket.   This feature creates object symlinks for all data that would otherwise fail to copy to the S3 bucket.
    4. How to Enable:

    1. nano /opt/superna/eca/eca-env-common.conf
    2. add this variable
    3. export ENABLE_LONGPATH_LINKING=true
    4. control+x (to save and exit)
    5. ecactl cluster down
    6. ecactl cluster up
    7. Done.



    Golden Copy Pipeline Workflow License Overview

    Overview

    This feature license enables S3 to file and file to object workflows.    The S3 to file direction requires the incremental detection feature, which leverages the date stamps on the file system to match the object time stamp.  This allows incremental sync from S3 to file.  This unique feature can track differences between S3 and the file system and only sync new or modified objects from the cloud down to the file system.

    Use Cases

    1. Media workflows that pick up media contributions from a 3rd party from a cloud S3 bucket and transfer the data to an on premise PowerScale for editing workflows
    2. Media workflows that download S3 output from a rendering farm, where the output is needed on premise for video editing workflows.
    3. HPC cloud analysis for AI/ML that requires on premise data to be copied to an S3 cloud bucket as analysis input, producing output in a different bucket that needs to be copied back on premise.
    4. Many to one - many source buckets dropping data into a common project folder on PowerScale.  Accept data from multiple locations and drop the data into a common project folder on the file system.

    These workflows are file to object and object to file with different sources and destinations, along with scheduled copies or incremental sync in both directions.  The solution is designed to allow scheduled incremental sync in both directions to pick up only new or modified files from the S3 bucket and copy them to the cluster.


    Typical Use Case

     


    Fan-In Topology

    Requirements

    1. Pipeline license key applied to Golden Copy and assigned to a cluster
      1. searchctl isilons license --name <clusterName> --applications GCP


    Fan-In Configuration Requirements

    1. Fan-in allows multiple source s3 buckets to be synced to a single path on the on premise PowerScale.
    2. The prefix path must be unique to avoid file system collisions
    3. example
      1. bucket source1 - bucket1
      2. bucket source2 - bucket2
      3. searchctl archivedfolders add --folder /ifs/fromcloud/bucket1 --isilon <name of cluster> --source-path '/bucket path'  --recall-schedule "*/30 * * * *"   [--pipeline-data-label xxxxx] --cloudtype {aws, azure, ecs, other} --bucket bucket1 --secretkey <yyyy> --accesskey <xxxx>  --recall-from-sourcepath
      4. searchctl archivedfolders add --folder /ifs/fromcloud/bucket2 --isilon <name of cluster> --source-path '/bucket path'  --recall-schedule "*/30 * * * *"   [--pipeline-data-label xxxxx] --cloudtype {aws, azure, ecs, other} --bucket bucket2 --secretkey <yyyy> --accesskey <xxxx>  --recall-from-sourcepath

    Prerequisite Configuration Steps

    1. The data that is pipelined from the cloud needs to land on the cluster in a path.  An NFS mount is required from /opt/superna/mnt/fromcloud/GUID/NAME on each Golden Copy VM to a path on the PowerScale cluster, for example /ifs/fromcloud.  This path on the cluster will store all data arriving from the cloud S3 buckets.
    2. NOTE: You can share out the data from S3 buckets using SMB or NFS protocols.
    3. Configuration Steps for NFS export for all data arriving from the cloud.
      1. Create an NFS export on the Golden Copy managed cluster(s) on path /ifs/fromcloud and set the client and root client lists to the IP addresses of all Golden Copy VM's
      2. Golden Copy uses PowerScale snapshots to copy content. Follow the steps below to add the NFS export to each of the VMs created in the steps above.  Two NFS mounts are required: one for copying data and one for recalling data.
      3. You will need the following to complete these steps on all nodes:
        1. Cluster GUID and cluster name for each licensed cluster
        2. Cluster name as shown in the top right corner after logging in to the OneFS GUI
      4. Change to Root user
        1. ssh to each VM as ecaadmin over ssh
        2. sudo -s
        3. enter ecaadmin password 3y3gl4ss
      5. Create local mount directory (repeat for each Isilon cluster) 
        1. mkdir -p /opt/superna/mnt/fromcloud/GUID/clusternamehere/    (replace GUID and clusternamehere with correct values)
        2. (Only if you have Virtual Accelerator nodes, otherwise skip) Use this command to run against all Golden Copy nodes; you will be prompted for the ecaadmin password on each node.
        3. NOTE: Must run from the Golden Copy VM and all VAN VM's must be added to the eca-env-common.conf file.
        4. NOTE:  example only.  
          1. ecactl cluster exec "sudo mkdir -p /opt/superna/mnt/fromcloud/00505699937a5e1f5b5d8b2342c2c3fe9fd7/clustername"
      6. Configure automatic NFS mount After reboot
        1. Prerequisites 
          1. This will add a mount for content copying to FSTAB on all nodes
          2. Build the mount command using the cluster GUID and cluster name, replacing the placeholders with the correct values for your cluster. NOTE: This is only an example; a worked example follows these steps.
          3. Replace smartconnect FQDN and <> with a DNS smartconnect name
          4. Replace <GUID> with cluster GUID
          5. Replace <name>  with the cluster name 
        2. On each VM in the Golden Copy cluster:
          1. ssh to the node as ecaadmin
          2. sudo -s
          3. enter ecaadmin password
          4. echo '<CLUSTER_NFS_FQDN>:/ifs/fromcloud /opt/superna/mnt/fromcloud/<GUID>/<NAME> nfs defaults,nfsvers=3 0 0'| sudo tee -a /etc/fstab
          5. mount -a
          6. Run mount to verify the mount
        3. Log in to the next node via ssh as ecaadmin
        4. Repeat the steps above on each VM
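    As a worked example, assuming a SmartConnect name of fromcloud.example.com, the cluster GUID 00505699937a5e1f5b5d8b2342c2c3fe9fd7 from the earlier example, and a cluster name of clustername (all placeholder values, run as root on each VM):

      echo 'fromcloud.example.com:/ifs/fromcloud /opt/superna/mnt/fromcloud/00505699937a5e1f5b5d8b2342c2c3fe9fd7/clustername nfs defaults,nfsvers=3 0 0' | sudo tee -a /etc/fstab
      mount -a
      mount | grep fromcloud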

    Configuration Examples for Creating Pipelines

    A pipeline is a cloud to on premise storage configuration.  Follow the steps below.  NOTE:  One to one or many to one definitions are supported to map multiple source S3 buckets to a single folder path target.

    1. Modify the PowerScale cluster that is the target of the cloud pipeline.  Golden Copy needs to know where the pipeline data will be stored.  This is attached to the cluster configuration globally for all pipelines that drop data on this cluster.  Repeat for each cluster that will have Pipelines configured.
      1. searchctl isilons modify --name xxxx   --recall-sourcepath AbsolutepathforFromcloudData  < add folder parameters here for bucket, credentials etc > 
        1. xxxx is the isilon cluster name
        2. AbsolutepathforFromcloudData is the path used for the fromcloud NFS mount.  The default /ifs/fromcloud should be used.
      2. example
        1. searchctl isilons modify --name cluster1   --recall-sourcepath /ifs/fromcloud
    2. Example to create a pipeline configuration from an S3 bucket and path to a file system path on the cluster, scanning the bucket every 30 minutes for new data.  The files are compared to the file system and only new files are copied to the cluster.  A complete worked sketch follows this list.
      1. searchctl archivedfolders add --folder /ifs/fromcloud/bucket1 --isilon <name of cluster> --source-path '/bucket path'  --recall-schedule "*/30 * * * *"   [--pipeline-data-label xxxxx] --cloudtype {aws, azure, ecs, other} --bucket <bucket name> --secretkey <yyyy> --accesskey <xxxx>  --recall-from-sourcepath
        1. --folder /ifs/fromcloud/bucket1    The fully qualified path where the data will be copied to on the cluster.  The path must already exist.
        2.  --source-path '/bucket path'  or 'bucket prefix'   - The S3 path to read data from in the bucket.  If the entire bucket will be copied then use "/" for the source path.
        3. --recall-from-sourcepath  This flag enables Pipeline feature and is a mandatory flag.
        4. --recall-schedule "*/30 * * * *"  - The schedule to scan the S3 bucket and copy new or modified data found in the bucket.   This example scans the S3 bucket every 30 minutes
        5. --trash-after-recall <trashbucket name>  -  Enables the drop box feature to delete data after the sync of new data from cloud to on premise completes.  The data deleted from the source S3 bucket is moved to the specified trashcan bucket, which should be in the same region as the source bucket.
    3. After adding the folder you can run the job to scan the bucket and copy data from the cloud to the cluster file system.
      1. searchctl archivedfolders recall --id <folderID>
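    Putting the steps together, a minimal sketch of a pipeline that syncs an AWS bucket named mediadrop into /ifs/fromcloud/mediadrop on a cluster named cluster1 every 30 minutes (the cluster name, bucket name and keys are placeholders, and the flags mirror the template above):

      searchctl isilons modify --name cluster1 --recall-sourcepath /ifs/fromcloud
      searchctl archivedfolders add --folder /ifs/fromcloud/mediadrop --isilon cluster1 --source-path '/' --recall-schedule "*/30 * * * *" --cloudtype aws --bucket mediadrop --secretkey <secretkey> --accesskey <accesskey> --recall-from-sourcepath
      searchctl archivedfolders recall --id <folder ID returned by the add command>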


    Advanced Job Transaction Log Backup

    1. Overview:  This feature stores success, skipped, and errored details for each job in an S3 bucket to provide an audit log of all actions taken during copy jobs.  This log records each action for every file copied and is a large log.  It can be exported to JSON using the searchctl archivedfolders export command.  The feature reads the reporting topic and stores the job transactions in an S3 bucket for offsite audit logs.
    2. Implementation:
      1. When enabled, the Archivereporter container is responsible for monitoring all the archivereport-<folder-id> topics; when jobs start, newly created messages in the topics are written to temporary text files at time intervals using a timer.
      2. The host path for temp logs is "/opt/data/superna/var/transactions/"
      3. The time interval is 10 minutes by default; new messages with information about the temporary text files are published to the transactionLog topic after every interval
      4. Archiveworker containers spawn a consumer to monitor the transactionLog topic; messages are picked up individually and sent to the configured S3 logging bucket.  Once the upload is successful, the temporary file is deleted.  Exported log files are sent to the logging bucket under <folder-id>/<job-type>/<job-id>/<file-name>, with the file name containing the timestamp
      5. Metadata logs are sent to the logging bucket under <folder-id>/METADATA/<job-id>/<file-name>
      6. Only the Archiveworker running on the same container as the Archivereporter should continue to consume the message.
    3. Configuration:
      1. nano /opt/superna/eca/eca-env-common.conf
      2. add this variable
      3. Transaction logging is disabled by default (BACKUP_TRANSACTIONS=false) and can be turned on by setting the variable to true.
      4. export BACKUP_TRANSACTIONS=true
      5. The default interval to send temp files to the S3 bucket is every 10 minutes
      6. export TRANSACTION_WRITE_INTERVAL=10
      7. Set the S3 bucket to store the audit transaction logs; note this bucket must be defined in a folder configuration.  A consolidated example follows this list.
      8. export LOGGING_BUCKET='gcandylin-log'
      9. control + X to save and exit
      10. ecactl cluster services restart archivereporter 
      11. done
    4. NOTE:   This feature will consume bandwidth and incur storage costs in the target bucket.
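    As a consolidated sketch, assuming a logging bucket named gc-audit-logs that is already defined in a folder configuration, the entries added to /opt/superna/eca/eca-env-common.conf would be the following, followed by restarting the service with ecactl cluster services restart archivereporter:

      export BACKUP_TRANSACTIONS=true
      export TRANSACTION_WRITE_INTERVAL=10
      export LOGGING_BUCKET='gc-audit-logs'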









    Automation with Golden Copy Rest API

    Overview

    The REST API can be used to automate copy jobs and monitor jobs, allowing integration into application workflows that require data movement from a file system to S3 or from S3 back to a file system.  Examples include media workflows, and machine learning and AI training where cloud computing is used on the data and the results are returned to on premise file systems.


    How to use the Golden Copy API


    What is Smart Archiver?



    How to Configure Smart Archiver


    Summary

    1. Add a folder definition to archive data to a target S3 bucket with all required parameters including storage tier for cloud providers
    2. Specify the authorized user paths that will be enabled for end user archive capabilities

    Steps to Prepare the VM


    1. Add support for Smart Archiver as follows
    2. nano /opt/superna/eca/eca-env-common.conf
    3. add this variable
      1. export ENABLE_SMART_ARCHIVER=true
    4. nano /opt/superna/eca/docker.compose.overrides.yml
    5. Add this text below and save the file with control+x
    6. ecactl cluster down
    7. ecactl cluster up
    version: '2.4'

    services:
         searchmw:
             environment:
                  - ENABLE_SMART_ARCHIVER
         isilongateway:
             environment:
                  - ENABLE_SMART_ARCHIVER
         indexworker:
             environment:
                  - ENABLE_SMART_ARCHIVER

     

    Steps to Configure

    1. Add the folder definition, changing the placeholder values described below
      1. <cluster name> the source cluster name
      2. --folder  - The location where archive data will be staged before it's moved to the cloud.  We recommend using /ifs/tocloud
      3. access and secret keys
      4. endpoint url - See configuration guide on how to specify different target endpoint urls
      5. <region>  - the region parameter based on the s3 target used
      6. cloudtype - set the cloud type to match your s3 target
      7. --smart-archiver-sourcepaths -  This parameter authorizes paths in the file system to appear as eligible paths for end users to select folders to archive to S3.  Enter a comma separated list of paths; example below
        1. --smart-archiver-sourcepaths  /ifs/projects, /ifs/home 
      8. --full-archive-schedule  - Set a schedule that will be used to archive the data selected for archive.  A daily archive at midnight is recommended to move the data copy to off peak hours
        1. example --full-archive-schedule "0 0 * * *" 
      9. --delete-from-source -  This ensures data is deleted after it is copied; it is required for move operations, as opposed to the end user backup to S3 use case
    2. example Smart Archiver command
      1. searchctl archivedfolders add --isilon <cluster name> --folder /ifs/tocloud --accesskey xxxxxxxx --secretkey yyyyyyy --endpoint s3.ca-central-1.amazonaws.com --region <region> --bucket smartarchiver-demo --cloudtype aws --smart-archiver-sourcepaths /ifs/projects --full-archive-schedule "0 0 * * *" --delete-from-source

    How to Archive data as an end user

    1. Login to the Smart Archiver GUI as an Active Directory User
    2. Select the Smart Archiver icon
    3. Browse to the path entered above that was authorized for end users to archive.  Folders that are eligible will show an Archivable label.
    4. Select a folder
      1. The Up Cloud icon appears
      2. Click the button to archive the data
      3. The user action is complete.  The data will not be moved until the scheduled job runs.
      4. After the job completes, when the user browses to the folder they will see the archived data on the right hand screen.


    © Superna Inc