Classification Module 6.1.1 - User Guide

Testing and Reviewing Automated Classification

Before you make your category available to the automated system, you should test that the rules and category are behaving as desired. You can use the following diagnostics:

  • Test a rule against a resource
  • Test a resource against all rules
  • See what text is extracted from a resource
  • See why a particular resource is categorized the way it is
  • Browse all categorized resources to see the results of the system

Testing a Rule against a Resource

Once you add a rule to the system, check that it produces the desired results. A rule must already be in the system before you can test it. For more information, see Managing Rules in the Classification System.

To test the rule, set up a test file or SharePoint document containing data that exercises the rule. For example, if the rule involves credit card numbers, ensure the test resource contains credit card numbers. Then use the Get-QRuleResults cmdlet to run the test. You may find it useful to build a library of test resources for easier testing. For information on testing all rules at once, see Testing all Rules Against a Resource.

The result of this cmdlet is an XML file, which details the results of your test. The file is divided into seven sections:

Table: Test Results

| Section | Description |
|---|---|
| Log Messages | Contains the messages that the rules or extractors wrote to the log, along with timestamps. |
| AutomaticClassification | Includes the categories added in the 'Adds' subsection, removed in the 'Removes' subsection, and other operations in the 'Others' subsection. The 'Key' node specifies the topic ID that corresponds to each category. |
| EntityCache | Contains text that the extractors found and the rules had hits on. For example, a 'Cities in California' rule could have a hit on 'Los Angeles', so it is stored in the entity cache, along with the offset (number of characters from the beginning) and length of the item. |
| ExtractorEvents | Shows which rules requested which extractors to perform extraction on the content, and what the results were. The extractor event is either an ExtractorResult, if an extractor was run on the content for the first time, or an ExtractorCacheHit, if the extractor's result for the given content had already been cached. Each event also has a timestamp. |
| FinalRuleStates | Shows data contained in any rule states that had a match. |
| LastExtractorTime | Shows the timestamp for when each extractor was last invoked. |
| Properties | Contains any properties that were set during processing by the rules or extractors. |
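
To make the EntityCache offset and length concrete, the following minimal PowerShell sketch recovers a hit from a piece of text. The sample text, offset, and length are invented for illustration and are not output from the product. The sketch also assumes the offset is zero-based, which the guide does not state.

```powershell
# Hypothetical extracted text and entity-cache values, for illustration only.
$text   = 'Offices in Los Angeles and San Diego'
$offset = 11   # characters from the beginning of the text (assumed zero-based)
$length = 11   # length of the matched item

# Substring(startIndex, length) returns the text the rule had a hit on.
$hit = $text.Substring($offset, $length)
$hit   # Los Angeles
```

Reading the offset and length this way lets you confirm that a hit in the entity cache corresponds to the text you expected the rule to match.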

To test a rule against a resource using PowerShell:

  1. If you do not know the ID of the rule you want to test, run the Get-QXmlRules cmdlet with the mandatory ServerAddress parameter, and note or copy the rule ID.
  2. Run the Get-QRuleResults cmdlet with the following mandatory parameters:
    1. ServerAddress
      The name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. ResourcePath
      The full path to the test resource. For example, c:\test files\creditcard.txt.
    3. RuleID
      The full ID of the rule you want to test.
    You may want to send the results to an output file by appending the PowerShell redirection operator to the command (> filename.xml). This makes the results much easier to interpret.
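
The procedure above might look like the following in a PowerShell session. This is a sketch only: it assumes the Classification cmdlets are already available in the session, and the server name, rule ID placeholder, and output file name are hypothetical. The cmdlet and parameter names are the ones listed in the steps.

```powershell
# Hypothetical values: replace with your own server, test file, and rule ID.
$server   = 'dgserver01:8723'                     # computername:port (8723 is the default port)
$resource = 'c:\test files\creditcard.txt'        # full path to the test resource
$ruleId   = '<rule ID noted from Get-QXmlRules>'  # placeholder, not a real ID

# Step 1: list the rules so you can note the ID of the rule to test.
Get-QXmlRules -ServerAddress $server

# Step 2: test the rule and redirect the XML results to a file.
Get-QRuleResults -ServerAddress $server -ResourcePath $resource -RuleID $ruleId > ruleresults.xml
```

Keeping the server address and resource path in variables makes it easier to rerun the same test against each file in a library of test resources.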

Testing all Rules Against a Resource

Testing all rules allows you to see the results of each rule when run on a test resource. This can help you understand how each rule works and how the rules interact on a single resource. Only rules that have been added to the system are included in this diagnostic. Use the Get-QAllRuleResults cmdlet to perform your test. For more information, see Managing Rules in the Classification System.

The result of this cmdlet is an XML file that details the results of your test. You can use this output to see the effect of your rules and to infer categorization. To determine whether a category would be applied, you need to know the category's threshold and its other settings. For more information, see How Categories Work Together: Mutual Exclusivity, Strict Ordering and Inheritance and How Rules Affect Categorization.

For an explanation of the resulting XML file, see Testing a Rule against a Resource.

Depending on the number of rules in your system, you may find it helpful to test a single rule. For more information, see Testing a Rule against a Resource.

To test all rules against a resource using PowerShell:

  1. Run the Get-QAllRuleResults cmdlet with the following mandatory parameters:
    1. ServerAddress
      The name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. ResourcePath
      The full path to the test resource, including the computer name if applicable. For example, c:\test files\creditcard.txt.
    You may want to send the results to an output file by appending the PowerShell redirection operator to the command (> filename.xml). This makes the results much easier to interpret.
  2. Examine the output.
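
As a sketch of these two steps, the commands below run the test, save the results, and then list the top-level elements of the saved file as a starting point for examining it. The server name and output file name are hypothetical. The second half assumes the saved file is well-formed XML with the result sections directly under a single root element, which the guide does not confirm.

```powershell
# Hypothetical values: replace with your own server and test file.
$server   = 'dgserver01:8723'                # computername:port (8723 is the default port)
$resource = 'c:\test files\creditcard.txt'   # full path to the test resource

# Step 1: run every rule in the system against the resource and save the XML results.
Get-QAllRuleResults -ServerAddress $server -ResourcePath $resource > allruleresults.xml

# Step 2: load the saved file and list the names of the elements under the root.
# Assumption: the output is well-formed XML with one root element.
[xml]$results = Get-Content -Path .\allruleresults.xml -Raw
$results.DocumentElement.ChildNodes | ForEach-Object { $_.Name }
```

If the listing does not show the sections described in Testing a Rule against a Resource, open the file in an XML viewer instead and navigate from the root element.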

Viewing the Text from a Resource

A rule is run against the text that is extracted from a resource. You may be unsure what content in the resource caused the results of a rule. For example, you may wonder why a rule identifies a credit card number in your resource. Using this diagnostic, you can see exactly what text is extracted from a resource. Use the Get-QResourceTextExtracted cmdlet to perform this test.

To examine the text extracted from a resource:

  1. Run the Get-QResourceTextExtracted cmdlet with the following mandatory parameters:
    1. ServerAddress
      The name of the computer hosting the Data Governance server, and the port, in the form computername:port. The default port is 8723.
    2. ResourcePath
      The full file path to the resource, including the computer name if applicable.
    You may want to send the results to an output file by appending the PowerShell redirection operator to the command (> filename.xml). This makes the results much easier to interpret.
  2. Examine the output.
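
A minimal sketch of this procedure follows. The server name, output file name, and search string are hypothetical, and the Classification cmdlets are assumed to be available in the session. Select-String is a standard PowerShell cmdlet, used here only as one convenient way to search the saved output.

```powershell
# Hypothetical values: replace with your own server and resource.
$server   = 'dgserver01:8723'                # computername:port (8723 is the default port)
$resource = 'c:\test files\creditcard.txt'   # full file path to the resource

# Step 1: save the text that is extracted from the resource.
Get-QResourceTextExtracted -ServerAddress $server -ResourcePath $resource > extractedtext.xml

# Step 2: search the saved output for a string you expected a rule to match.
# 'Visa' is an example search term only.
Select-String -Path .\extractedtext.xml -Pattern 'Visa' -SimpleMatch
```

If the search returns nothing, the text was not extracted from the resource, which may explain why a rule did not produce the hit you expected.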