Administration Guides

Writeable Snapshots with File Clones Automation

Overview

A very common requirement is to test against a large data set, but making full copies is slow and consumes a lot of space.  This solution guide walks through how to clone a directory tree using the File Clone feature in OneFS to create writeable copies of the data without consuming the space of the original data set.  This is done with the Search & Recover Command builder tool, Excel and a text editor.  The cloned data only consumes space for the blocks that are changed within each file.  This is a copy-on-first-write solution.


How to create a writeable Snapshot of data for testing

  1. This example will use a source path of /ifs/data/dfsdata/search
  2. NOTE: This folder path has 340 MB of data 
  3. Create the target folder structure where the cloned data will reside
    1. ssh to the cluster as root or admin 
    2. cd /ifs/data/dfsdata/search (change the path from this example)
    3. rsync -av -f"+ */" -f"- *" . /ifs/data/dfsdata/clonedsearch/  (change target path to the location where you plan to create the writeable clones)
  4. In the Search & Recover GUI advanced window, enter the fully qualified path where the data is located into the File path field (in this example, /ifs/data/dfsdata/search).  Then select the files only option in the advanced search UI and execute the search.
  5. Click the Command build icon, enter "cp -c" into the first box, select the check box to add double quotes, switch to the plain text option, leave the second box empty, and click create all.
  6. Excel is an easy tool to modify the script file to specify the output file name and path.
    1. Import the script file into Excel using the default data import settings.
    2. Delete the first few header rows of the script.
    3. Open a new blank Excel file.
      1. Copy the source files column and paste it into the new blank Excel file.
      2. Select the pasted column in the new file (it will become the target path column), open the Replace feature in Excel, set the Search option to By Columns, then enter the source path to find and the target path to replace it with.  The target path is the path created earlier where all the cloned files will be copied.
      3. In this example the source path is /ifs/data/dfsdata/search/ and it is replaced with the target path /ifs/data/dfsdata/clonedsearch/.  Enter both values and click the Replace All button.
    4. Copy the target path column and paste it into the main script file.
  7. Save the file as CSV using Save As.
  8. Open the CSV file in a text editor, replace each triple quote """ with a single " and replace each comma with a space.  This keeps file and folder names that contain spaces enclosed in " so that the script runs correctly.
  9. Rename the file to clone.sh (shell script)  and scp the script to the Isilon cluster.
  10. chmod 777 clone.sh
  11. bash clone.sh  (this runs the clone script).  Monitor the target path for files appearing in the folder tree to track the progress of the copy.  NOTE: This can take a long time depending on the size of the data set, so it is best to run the script from a terminal within a VM, which allows you to disconnect from the session and leave the script running.
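
The steps above can also be sketched as a small shell script.  This is a minimal sketch under stated assumptions, not the Search & Recover workflow itself: the helper names copy_tree_dirs, make_clone_line and build_clone_script are hypothetical, the directory step uses find and mkdir in place of the rsync filter rules from step 3, and the two file paths in the example are made up.  It only prints the clone commands and does not run cp -c.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; paths are this guide's examples.
SRC_ROOT="/ifs/data/dfsdata/search"
DST_ROOT="/ifs/data/dfsdata/clonedsearch"

# Recreate the directory structure of $1 under $2 without copying any files.
# Uses find and mkdir in place of the rsync filter rules from step 3.
copy_tree_dirs() {
  local src="$1" dst="$2" dir
  mkdir -p "$dst"
  ( cd "$src" && find . -type d ) | while IFS= read -r dir; do
    mkdir -p "$dst/$dir"
  done
}

# Print one clone command for a source file path, with both paths wrapped
# in double quotes so that names containing spaces survive.
make_clone_line() {
  local src="$1"
  local rel="${src#"$SRC_ROOT"/}"   # path relative to the source root
  printf 'cp -c "%s" "%s"\n' "$src" "$DST_ROOT/$rel"
}

# Read source file paths on stdin, one per line, and print the clone script.
build_clone_script() {
  local path
  while IFS= read -r path; do
    if [ -n "$path" ]; then
      make_clone_line "$path"
    fi
  done
}

# Example: two made-up file paths, one with a space in a folder name.
printf '%s\n' \
  "$SRC_ROOT/mediumfile.zip" \
  "$SRC_ROOT/sub folder/report.txt" | build_clone_script
```

Each printed line has the same shape as the edited CSV from step 8.  File names that themselves contain double quotes, dollar signs or backticks would need extra escaping, which this sketch does not attempt.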

How to Verify the Cloned Data Consumes no additional space

  1. On an original file in the source path, run the command in the next step.  NOTE: Change directory to the file's location or enter the full path to the file.
  2. isi get -D mediumfile.zip  (the source file is 147 MB).  Notice the physical blocks value of 18019.
  3. Now run the same command on the cloned file: isi get -D cmediumfile.zip.  Note that the clone consumes only 5 blocks versus 18019 blocks.
  4. Done.
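
As a rough cross-check of these figures, the block counts can be converted to bytes.  The sketch below assumes an 8 KiB (8192 byte) block size, which is not stated in this guide, and blocks_to_bytes is a hypothetical helper name.

```shell
# Assumption: each block is 8 KiB (8192 bytes).
BLOCK_BYTES=8192

# Convert a block count ($1) to a size in bytes.
blocks_to_bytes() {
  echo $(( $1 * BLOCK_BYTES ))
}

blocks_to_bytes 18019   # physical blocks reported for the source file
blocks_to_bytes 5       # physical blocks reported for the cloned file
```

Under that assumption the source file's 18019 blocks come to roughly 147.6 million bytes, in line with the 147 MB file size noted above, while the clone's 5 blocks come to about 40 KB.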

Browse the Writeable Clone data for testing 

  1. Create an SMB share to present the cloned data to users or developers for testing.  Any writes to the data consume space only for the blocks modified within each cloned file.


© Superna LLC