Classification Module 6.1.3 - User Guide

Working with Dictionary Text Extractors

Dictionary extractors let you define the patterns to locate in the extracted text as a set of terms, where each term is a single word or a multi-word phrase. Data Governance provides two types of dictionaries: public and protected. For details on each, see Managing Public Dictionary Text Extractors and Managing Protected Dictionary Text Extractors.

When managing dictionaries, you should keep in mind:

  • Any dictionary may include one or more other dictionaries. By default, it contains the terms from all included dictionaries. However, terms in the dictionary's own definition take precedence, and may override or disable the terms from an included dictionary.
  • You can add, include/exclude, and remove terms, as well as specify a term's case sensitivity, for all dictionaries as required. Note: For protected dictionaries, edits are possible only through customizations.
  • To work with dictionaries that contain more than 500 terms, you must use PowerShell commands.

Managing Public Dictionary Text Extractors

Public dictionaries are not secured, so their terms can be modified directly. Because of this, if you need to update the dictionary at any time, it is replaced and any modifications you made are lost.

To create a public dictionary text extractor using the web portal

  1. Select Governed Data | Categorization Manager | Extractors.
  2. Select Dictionary to create a new text extractor.
  3. Enter a unique identifier.
    The identifier is used by the classification system. Once created, you cannot change this value. It is recommended you use a naming convention that reflects the purpose of the text extractor.
  4. Enter a name and description.
    The name and description are useful when building rules to ensure the proper text extractors are being included, so provide all necessary information.
  5. Select Add dictionary (included dictionaries) and Remove dictionary as required.
  6. Click Add term to augment the dictionary as required. These are the terms that will be searched for in the extracted text.
    Each term can be up to 200 characters long, and you can add up to 500 terms. If your requirements exceed this limit, use PowerShell commands instead.
  7. Click Remove to remove the term from the list of terms.
  8. You can include or exclude terms and adjust their case sensitivity as required. Select the term and click within the column where you want to make the change.
  9. Carefully review your settings and save your changes.

To edit a public dictionary text extractor using the web portal

  1. Select Governed Data | Categorization Manager | Extractors.
  2. Locate the required text extractor and click Edit.
  3. From the General tab, you can edit the name and description.
    The name and description are visible to all users who are building rules. Including detailed information helps to ensure the proper text extractors are being included.
  4. On the Definition tab, select Add dictionary (included dictionaries) and Remove dictionary as required.
    The list of terms included in the dictionary is displayed. You can add, include, exclude, and remove terms, and alter their case sensitivity as required.
  5. Click Add term to add new terms to the list of terms that will be included in this dictionary. These are the terms that will be searched for in the extracted text.
    Each term can be up to 200 characters long, and you can add up to 500 terms. If your requirements exceed this limit, use PowerShell commands instead.
  6. Click Remove to remove the term from the list of terms.
  7. You can include or exclude terms and adjust their case sensitivity as required. Select the term and click within the column where you want to make the change.
  8. Carefully review your settings and save your changes.
  9. Click Validate to ensure the text extractor can be processed by the classification system.

Managing Protected Dictionary Text Extractors

Protected dictionaries are secured and their terms cannot be modified, making them the best candidates for dictionary re-use.

You can, however, extend a protected dictionary with customizations to suit your needs. If you “replace” a taxonomy containing dictionaries with custom terms, the custom terms are removed. To preserve these terms, export the taxonomy and re-import it. When creating your customizations, keep in mind that custom terms take precedence over both included terms and terms in the dictionary definition.

To protect a dictionary text extractor with PowerShell

  1. Run the Set-QDictionaryTextExtractor command with the following parameters:
    1. ServerAddress
      Provide the name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. Id
      Provide the ID of the text extractor that you want to edit. You can get the ID using the Get-QTextExtractors command.
    3. State
      Set the state to Protected to secure a dictionary, making its terms read only.
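
As a rough illustration of the step above, the following sketch sets an existing dictionary to Protected. The server name dgserver01 and the extractor ID CompanyProjectNames are hypothetical placeholders, and the sketch assumes the Data Governance PowerShell commands are already available in the session. Check the command help for the exact value types before relying on it.

```powershell
# Hypothetical values: replace with your own server and extractor ID.
$server      = "dgserver01:8723"      # computername:port (8723 is the default port)
$extractorId = "CompanyProjectNames"  # ID of an existing dictionary text extractor

# Mark the dictionary as protected so that its terms become read only.
Set-QDictionaryTextExtractor -ServerAddress $server -Id $extractorId -State Protected
```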

To edit a protected dictionary text extractor using the web portal

  1. Select Governed Data | Categorization Manager | Extractors.
  2. Locate the required text extractor and click Edit.
  3. From the General tab, you can edit the name and description.
    The name and description are visible to all users who are building rules. Including detailed information helps to ensure the proper text extractors are being included.
  4. Click Add customization to augment the dictionary as required. These are the terms that will be searched for in the extracted text.
    Each term can be up to 200 characters long, and you can add up to 500 terms. If your requirements exceed this limit, use PowerShell commands instead.
  5. Click Remove to remove the term from the list of customizations.
  6. You can include or exclude terms and adjust their case sensitivity as required. Select the term and click within the column where you want to make the change.
  7. Carefully review your settings and save your changes.
  8. Click Validate to ensure the text extractor can be processed by the classification system.

Managing Dictionary Text Extractors with PowerShell

If you are working with a large dictionary (more than 500 terms), you will need to manage the text extractor with PowerShell commands.

To add a dictionary text extractor with PowerShell

  1. Run the Add-QDictionaryTextExtractor command with the following mandatory parameters:
    1. ServerAddress
      Provide the name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. Id
      Provide an ID for this text extractor. The identifier is used by the classification system and once created, it cannot be changed. It is recommended you use a naming convention that reflects the purpose of the text extractor.
    3. Name
      The name should reflect the purpose of the text extractor.
    4. IncludedDictionaries
      Include a list of the dictionaries that this dictionary is based on, specified by their extractor IDs.
  2. If desired, use the following optional parameters:
    1. Description
      Provide a description for the text extractor. This is useful when building rules to ensure the proper text extractors are being included, so provide all necessary information.
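
The parameters above can be combined as in the following sketch. All values (server name, IDs, name, description) are hypothetical, and passing IncludedDictionaries as an array of ID strings is an assumption based on the description "a list of the dictionaries"; the sketch also assumes the Data Governance PowerShell commands are available in the session.

```powershell
# Collect the parameters in a hashtable and pass them by splatting.
$params = @{
    ServerAddress        = "dgserver01:8723"            # computername:port
    Id                   = "CompanyProjectNames"        # cannot be changed after creation
    Name                 = "Company project names"
    IncludedDictionaries = @("BaseProjectTerms")        # assumed: array of extractor IDs
    Description          = "Internal project code names used by the Sales rules."
}

# Create the dictionary text extractor.
Add-QDictionaryTextExtractor @params
```

Splatting keeps long parameter lists readable and makes it easy to reuse the same values in a later Set-QDictionaryTextExtractor call.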

To edit a dictionary text extractor with PowerShell

  1. Make sure you know the ID of the desired text extractor. For more information, see Finding a Taxonomy, Category, or Extractor ID using PowerShell.
  2. Run the Set-QDictionaryTextExtractor command with the following mandatory parameters:
    1. ServerAddress
      Provide the name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. Id
      Provide the ID of the text extractor that you want to edit. You can get the ID using the Get-QTextExtractors command.
    3. Name
      Provide a name for the text extractor.
    4. Description
      Provide a description for the text extractor.
    5. IncludedDictionaries
      List the dictionaries that this dictionary is based on, specified by their extractor IDs.
    6. State
      Set the state to Protected to secure a dictionary, making its terms read only.
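
A minimal sketch of the edit procedure follows. It supplies every parameter listed above, which means it also protects the dictionary, because Protected is the only State value this section documents. The server name, ID, and text values are hypothetical, and the array form of IncludedDictionaries is an assumption.

```powershell
# Hypothetical values: replace with your own.
$params = @{
    ServerAddress        = "dgserver01:8723"          # computername:port
    Id                   = "CompanyProjectNames"      # existing extractor ID
    Name                 = "Company project names (reviewed)"
    Description          = "Reviewed list of internal project code names."
    IncludedDictionaries = @("BaseProjectTerms")      # assumed: array of extractor IDs
    State                = "Protected"                # makes the terms read only
}

# Update the extractor's properties and protect it in one call.
Set-QDictionaryTextExtractor @params
```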

To add a term for a dictionary text extractor with PowerShell

  1. Run the New-QDictionaryTerm command with the following mandatory parameters:
    1. ServerAddress
      Provide the name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. Text
      Provide the term to add to the dictionary.
    3. IsCaseSensitive
      Indicates whether or not the term is case sensitive. The default is $false.
    4. IsExcluded
      Indicates whether or not to exclude a term from a dictionary. The default is $false.
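
The sketch below creates two terms with the parameters listed above. Because the command takes no extractor ID, the sketch assumes it returns a term object that is later passed to Set-QDictionaryTerms; that behavior, the server name, and the term text are assumptions rather than documented facts. The colon form (-IsExcluded:$true) is used because it is valid PowerShell syntax for both Boolean and switch parameters.

```powershell
$server = "dgserver01:8723"   # hypothetical server, computername:port

# A case-sensitive term to search for in the extracted text.
$projectTerm = New-QDictionaryTerm -ServerAddress $server -Text "Project Falcon" `
    -IsCaseSensitive:$true -IsExcluded:$false

# An excluded term, matched regardless of case.
$excludedTerm = New-QDictionaryTerm -ServerAddress $server -Text "falcon crest" `
    -IsCaseSensitive:$false -IsExcluded:$true

# Keep both terms together for a later Set-QDictionaryTerms call (assumed usage).
$terms = @($projectTerm, $excludedTerm)
```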

To view the terms included in a dictionary text extractor with PowerShell

  1. Run the Get-QDictionaryTerms command with the following mandatory parameters:
    1. ServerAddress
      Provide the name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. Id
      Provide the ID of the text extractor that you want. You can get the ID using the Get-QTextExtractors command.
    3. GetCustomTerms
      Specifies whether you want to return terms used with a protected dictionary (customizations). Set to $true if you want to get the custom terms, $false if you want to get the regular terms. The default is $false.
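
For illustration, the following sketch retrieves the regular terms and the customizations of one dictionary and reports how many of each were returned. The server name and extractor ID are hypothetical, and the shape of the returned objects is not described in this guide, so the sketch only counts them.

```powershell
$server      = "dgserver01:8723"      # hypothetical server, computername:port
$extractorId = "CompanyProjectNames"  # hypothetical extractor ID

# Regular terms (GetCustomTerms defaults to $false; shown explicitly here).
$regularTerms = Get-QDictionaryTerms -ServerAddress $server -Id $extractorId -GetCustomTerms:$false

# Customizations added to a protected dictionary.
$customTerms = Get-QDictionaryTerms -ServerAddress $server -Id $extractorId -GetCustomTerms:$true

# @(...) forces an array so that Count works for zero or one result as well.
"Regular terms: $(@($regularTerms).Count)"
"Custom terms:  $(@($customTerms).Count)"
```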

To update the terms included in a dictionary text extractor with PowerShell

  1. Run the Set-QDictionaryTerms command with the following mandatory parameters:
    1. ServerAddress
      Provide the name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. Id
      Provide the ID of the text extractor that you want. You can get the ID using the Get-QTextExtractors command.
    3. TermsToUpdate
      Provide the list of terms for the specified dictionary as an array. The existing terms are overwritten with this list.
    4. UpdateCustomTerms
      Specifies whether you are updating the terms of a public dictionary or the customizations of a protected dictionary. The value is $true or $false, and the default is $false. Set it to $true to update the customizations list.
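
A large dictionary could be loaded from a text file along the lines of the sketch below. The file terms.txt (one term per line), the server name, and the extractor ID are hypothetical, and the sketch assumes that TermsToUpdate accepts the objects returned by New-QDictionaryTerm; verify this against the command help before use.

```powershell
$server      = "dgserver01:8723"      # hypothetical server, computername:port
$extractorId = "CompanyProjectNames"  # hypothetical extractor ID

# Read one term per line, skip blank lines, and build a term object for each.
$terms = Get-Content -Path ".\terms.txt" |
    Where-Object { $_.Trim() } |
    ForEach-Object {
        New-QDictionaryTerm -ServerAddress $server -Text $_.Trim() `
            -IsCaseSensitive:$false -IsExcluded:$false
    }

# Overwrite the dictionary's regular terms with the new list.
Set-QDictionaryTerms -ServerAddress $server -Id $extractorId `
    -TermsToUpdate @($terms) -UpdateCustomTerms:$false
```

Because the existing terms are overwritten, retrieve the current list with Get-QDictionaryTerms first and merge it into the array if you want to keep those terms. To update the customizations of a protected dictionary instead, pass -UpdateCustomTerms:$true.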