Administration Guides

Advanced Searching Syntax and Script Automation Editor


Use Cases

  1. User Search: Find email addresses in documents.
  2. Compliance: Find credit card numbers in a document.
  3. e-Discovery, GDPR: Find employee or customer IDs with wild cards.
  4. e-Discovery, User Search, GDPR: Single character and multi character searches with ? and *.
  5. e-Discovery, User Search, GDPR: Find documents that contain more than one mandatory term, for example a customer ID plus some other term.
  6. File Recovery, User Search: Find files with wild cards.
  7. Capacity Management: Find all files greater than or less than a size, or within a range of file sizes.


How to Use the Script Automation Editor 

The Script Automation editor icon in the UI allows administrators or end users to download a generated script that automates tasks against the results of a search. The CSV and script download icons limit results to 100 000 rows. Any search with results over this limit automatically runs as a server side job, which creates a file that is available from the secure download URL.


How to download very large scripts

  1. The server side download URL provides backup archive, CSV, and script downloads.
  2. Using a browser, access the URL below (NOTE: the secure access userid and password are required). The default userid is ecaadmin. The password is set during installation.
    1. https://x.x.x.x/downloads  (x.x.x.x is node 1 of the search cluster)


How to use the Script Editor

  1. The script editor window allows users to insert an absolute path in the script download (i.e. /ifs/data/.... ), which is useful when the script will be run directly on the PowerScale cluster using ssh and a .sh script.
    1. Alternatively, insert the UNC path in the downloaded script, which enables users to run the script over SMB connections to the cluster. The UNC path consists of the SmartConnect address and the SMB share used to reach the file over the network. This allows PowerShell or batch cmd files to run automation against the files in a result.

  2. The script download also offers a "Plain" option, which downloads a plain file containing the result rows. Use it when the script language needs to be changed or modified, typically when bash shell execution will not be used. The "Shell" option formats the file for the bash shell and downloads it with a .sh extension so that the file can be copied to PowerScale and run on the cluster itself. This option is typically used by PowerScale administrators.

  3. Enter the script action in the first dialog (i.e. cp, mv, del, rm, etc.), depending on the script language you plan to use. Then enter how the output of the command should be handled, for example >> results.txt or | grep "some string to look at".
    1. Examples:
      1. isi get -d 
      2. cp
      3. mv
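
The way an action, a result path, and an output clause combine into script lines can be sketched as follows. This is a minimal illustration and not the product's actual generator: the function name build_script, the example paths under /ifs/data/example, and the assumption that the Shell option starts the file with a bash interpreter line are all hypothetical.

```python
import shlex


def build_script(action, paths, output="", shell=True):
    """Return script text that applies `action` to every result path.

    Hypothetical helper: the real download is produced by the UI.
    """
    # Assumption: a "Shell" style download starts with a bash interpreter line.
    lines = ["#!/bin/bash"] if shell else []
    for path in paths:
        # Quote each path so spaces and special characters survive the shell.
        command = f"{action} {shlex.quote(path)}"
        if output:
            command = f"{command} {output}"
        lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical result paths, used only for illustration.
script = build_script(
    "isi get -d",
    ["/ifs/data/example/report 1.pdf", "/ifs/data/example/notes.txt"],
    output=">> results.txt",
)
print(script)
```

Quoting each path is a design choice of this sketch: file names returned by a search often contain spaces, and an unquoted path would be split into several arguments by the shell.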

Search Operators explained with examples

  1. +  and - term operator
  2. ? single character wild card 
  3. * multiple character wild card
  4. fieldname:  (specific schema search examples below):
    1. filesize:   (file size in bytes).
    2. extension:   (file extension).
    3. language_s:   (detected language of the document).
    4. clustername:   (cluster name).
    5. ishidden:   (hidden file in the file system).
    6. path:   (search the full path to a file using key words).
    7. type:   (find files or directories; valid values are files or directory).
    8. owner:   (AD or Linux owner of the file in the form domain name\\userid. NOTE: the double backslash is needed, e.g. owner:AD01\\usera).
    9. group:  (group ownership of the file, for example group:AD01\\domain users).

Language Detection codes

Note: not all languages are tested or supported. The language codes are provided to help search the language metadata field (language_s).


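| Language Code | Language |
|---|---|
| af | Afrikaans |
| ar | Arabic |
| bg | Bulgarian |
| bn | Bengali |
| cs | Czech |
| da | Danish |
| de | German |
| el | Greek |
| en | English |
| es | Spanish |
| et | Estonian |
| fa | Persian |
| fi | Finnish |
| fr | French |
| gu | Gujarati |
| he | Hebrew |
| hi | Hindi |
| hr | Croatian |
| hu | Hungarian |
| id | Indonesian |
| it | Italian |
| ja | Japanese |
| kn | Kannada |
| ko | Korean |
| lt | Lithuanian |
| lv | Latvian |
| mk | Macedonian |
| ml | Malayalam |
| mr | Marathi |
| ne | Nepali |
| nl | Dutch |
| no | Norwegian |
| pa | Punjabi |
| pl | Polish |
| pt | Portuguese |
| ro | Romanian |
| ru | Russian |
| sk | Slovak |
| sl | Slovene |
| so | Somali |
| sq | Albanian |
| sv | Swedish |
| sw | Swahili |
| ta | Tamil |
| te | Telugu |
| th | Thai |
| tl | Tagalog |
| tr | Turkish |
| uk | Ukrainian |
| ur | Urdu |
| vi | Vietnamese |
| zh-cn | Simplified Chinese |
| zh-tw | Traditional Chinese |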



How to search for directories only

  1. Use this search to find directories at and below a specific path (NOTE: the logged in user must be on the admin list to use the absolute path option).
    1. In the search box, type "type:directory" to add it to the search filter. Enter the absolute path in the "File Path" field to constrain the search to this path and below. The result count is the number of directories. Downloading the results provides a CSV of the directories.

    2. Or use the folders only option in the advanced search field.


How to count the number of files at a path and below

  1. Add your user id to the admin list (see the configuration guide for the CLI command).
  2. In the Advanced Search drop down, enter the absolute path, starting with "/ifs", that is the starting point of the search.
    1. Leave the Search box empty to match all files. The file result count equals the number of files at the entered path and below.

How to search the size field to find files with certain file size ranges

  1. This search targets a specific field and, when combined with other search terms, allows very controlled searches of data. These examples can be entered into the search bar instead of using the advanced UI.
    1. Enter Search: filesize:[10 TO 100]  AND  extension:pdf (this will find all the pdf files that are 10 to 100 bytes in size).
    2. Enter Search: filesize:[100 TO *]  AND  extension:pdf (this will find all the pdf files that are 100 bytes or greater).
    3. NOTE: TO and the AND operator must be upper case. File sizes are stored in bytes, so sizes in GB or MB must be converted to bytes. Many online calculators can assist with this conversion.
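
The MB or GB to byte conversion can also be scripted. The sketch below converts sizes and builds a filesize range query in the form used in the examples above. It assumes binary units (1 MB = 1024 * 1024 bytes); the guide does not state whether binary or decimal units apply, and the helper names to_bytes and filesize_query are hypothetical.

```python
# Assumption: binary units. Swap in powers of 1000 if decimal units are wanted.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(amount, unit):
    """Convert a size such as (10, "MB") to a whole number of bytes."""
    return int(amount * UNITS[unit.upper()])


def filesize_query(min_bytes=0, max_bytes=None, extension=None):
    """Build a filesize range query; an open upper bound is written as *."""
    high = "*" if max_bytes is None else str(max_bytes)
    query = f"filesize:[{min_bytes} TO {high}]"
    if extension:
        query += f" AND extension:{extension}"
    return query


# pdf files between 10 MB and 1 GB
print(filesize_query(to_bytes(10, "MB"), to_bytes(1, "GB"), "pdf"))
# filesize:[10485760 TO 1073741824] AND extension:pdf

# pdf files of 100 bytes or greater
print(filesize_query(100, None, "pdf"))
# filesize:[100 TO *] AND extension:pdf
```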


How to use Wild Cards for single character or multiple characters

  1. In the search dialog box, use ? as a wild card for a single character or * to represent any number of characters. See the examples below:
    1. Employee ID:  3 letters + 3 numbers, for example abc123.  Enter Search:  abc1??  will find all employee IDs that start with abc1.
    2. Credit card search for PCI (Mastercard):  Enter Search: 5524-????-????-???? will find all documents with this pattern beginning with 5524.
    3. Multi character wildcard:  Enter Search: abc*
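
The difference between ? and * can be tried out locally before running a search. This sketch uses Python's standard fnmatch module, whose ? and * follow the same single character and multi character convention. It only illustrates the wild card semantics on individual strings and is not the search engine's own matcher; the card number is made up for the example.

```python
from fnmatch import fnmatchcase


def matches(pattern, term):
    """Return True when `term` fits the wild card `pattern` (case sensitive)."""
    return fnmatchcase(term, pattern)


# ? stands for exactly one character
print(matches("abc1??", "abc123"))    # True
print(matches("abc1??", "abc1234"))   # False, one character too many

# * stands for any run of characters
print(matches("abc*", "abc1234"))     # True

# Hypothetical card number in the 5524-????-????-???? pattern
print(matches("5524-????-????-????", "5524-1234-5678-9012"))  # True
```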


How to search with AND to join terms together

  1. Enter a search for 2 terms:  dog AND cat  (use upper case AND to join terms that must both be in the document).
  2. This means both dog and cat must be in the document to match.
  3. When no operator is specified, the default between terms is OR. For example, dog  cat  is the same as dog OR cat.
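
The AND versus default OR behaviour can be modelled with a few lines of Python. This is a simplified sketch that treats each document as a set of lower case words; the file names and contents are hypothetical, and a real index tokenizes text in more elaborate ways.

```python
def terms_of(text):
    """Split a document into a set of lower case words."""
    return set(text.lower().split())


def match_and(text, *terms):
    """dog AND cat: every term must be present."""
    words = terms_of(text)
    return all(term in words for term in terms)


def match_or(text, *terms):
    """dog cat (default OR): at least one term must be present."""
    words = terms_of(text)
    return any(term in words for term in terms)


# Hypothetical documents
docs = {
    "a.txt": "the dog chased the cat",
    "b.txt": "the dog slept",
    "c.txt": "no pets here",
}

and_hits = [name for name, text in docs.items() if match_and(text, "dog", "cat")]
or_hits = [name for name, text in docs.items() if match_or(text, "dog", "cat")]
print(and_hits)  # ['a.txt']
print(or_hits)   # ['a.txt', 'b.txt']
```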


How to Search for email addresses and partial email addresses

  1. Typing the full email address into the search bar will find exact matches in documents.
  2. To search for a partial email address, use wild cards as in the following examples:
    1. * multi character wild card use case:  To find "test.user@superna.net", combine a wild card term, the AND operator, and a quoted phrase:
      1. test* AND "@superna.net"   
      2. *.user AND "@superna.net" 
    2. ? single character wild card use case: To find "test.user@superna.net", combine a wild card term, the AND operator, and a quoted phrase:
      1. t???.user AND "@superna.net" 
      2. test.us?? AND "@superna.net" 
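
The partial email searches above all share one shape: a wild card pattern for the part before the @ sign, joined with AND to the quoted domain. A small sketch of that composition follows. The helper names are hypothetical, and the local check with fnmatch only confirms that a pattern fits the intended name; it does not predict how the search index tokenizes an address.

```python
from fnmatch import fnmatchcase


def partial_email_query(local_pattern, domain):
    """Join a wild card pattern for the local part to a quoted domain phrase."""
    return f'{local_pattern} AND "@{domain}"'


def pattern_covers(local_pattern, address):
    """Check locally that the pattern fits the part of the address before @."""
    local_part = address.split("@", 1)[0]
    return fnmatchcase(local_part, local_pattern)


target = "test.user@superna.net"
for pattern in ["test*", "*.user", "t???.user", "test.us??"]:
    print(partial_email_query(pattern, "superna.net"), pattern_covers(pattern, target))
```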

How to search for a phrase

  1. Use double quotes.
  2. Example: "my dogs name". These words must appear in the document in this exact order.


How to search for a term that MUST exist or MUST NOT exist in a document to narrow the search to a userid, employee

There are many use cases for this type of search. For example, a search to find documents that must contain an employee ID.

  1. Enter a search for an employee with a userid of abc123 and add a plus sign + to the term that must exist.  Example: +abc123  performance reviews  (this search finds all documents that contain abc123, then looks for the words performance OR reviews within those documents).
  2.  How to find documents that do not have a term.
    1. Enter a search that excludes the userid abc123:  -abc123 performance reviews  (you can combine this with other terms, for example to find all documents that do not contain the user ID abc123 and may contain the terms performance OR reviews).
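
The effect of + and - can be modelled as a filter applied before the remaining terms are considered. The sketch below is a simplified model under stated assumptions: documents are treated as sets of lower case words, a + term must be present, a - term must be absent, and the remaining terms are only counted. The file names and contents are hypothetical, and the real engine's handling of the optional terms may differ.

```python
def parse(query):
    """Split a query into required (+), excluded (-), and optional terms."""
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if len(token) > 1 and token[0] == "+":
            required.append(token[1:])
        elif len(token) > 1 and token[0] == "-":
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional


def evaluate(query, text):
    """Return (passes_filter, optional_hits) for one document."""
    words = set(text.lower().split())
    required, excluded, optional = parse(query)
    passes = all(t in words for t in required) and not any(t in words for t in excluded)
    optional_hits = sum(1 for t in optional if t in words)
    return passes, optional_hits


# Hypothetical documents
docs = {
    "review.txt": "abc123 performance reviews for 2023",
    "memo.txt": "abc123 parking memo",
    "other.txt": "xyz789 performance summary",
}

for name, text in docs.items():
    print(
        name,
        evaluate("+abc123 performance reviews", text),
        evaluate("-abc123 performance reviews", text),
    )
```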

© Superna Inc