This content is now archived and is no longer updated. Progress is not calculated. Pega Cloud instances are disabled, and badges are no longer awarded.

Training a text categorization model

Introduction

When customers send emails to a contact center application, the emails are routed to one or more work queues based on the configured topics and routing rules. To correctly process emails that touch on multiple topics, create and train a model-based topic detection text categorization model.

Transcript

This video will show you how to create a model-based topic detection text categorization model.

U+ currently uses intelligent email routing with rule-based topic detection models in the Contact Center. The problem is that, with the current rules, certain emails are often categorized incorrectly. For example, this dispute transaction email is categorized both as Complaint or Compliment and as Dispute Transaction. To fix this, as a Data Scientist, you need to create and use a model-based topic detection text categorization model.

This is the email channel configuration. Notice that the email channel has multiple topics and intelligent routing rules configured.

Routing

Now, test the dispute transaction email to see how the rule-based topic detection works. Notice that the email is associated with two topics, each with a confidence factor of 1.

Two topics

Every channel configuration has an associated Text Analyzer. To open it, navigate to Actions > Open Text Analyzer. The text analyzer is currently configured as a rule-based model. This means the topic detection is not very intelligent: it relies mainly on the must-match and should-match keywords defined for each topic. To switch the text analyzer to a model-based approach, select Use model based topics if available as the topic preference and save the change.
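To see why keyword rules can tag one email with two topics, each at a confidence of 1, here is a minimal sketch of keyword-based detection. It is not Pega's implementation. The keyword lists, the matching semantics (all must-match words plus at least one should-match word), and the sample email are assumptions made for illustration.

```python
import re

# Hypothetical keyword rules; the real keywords live in the channel configuration.
RULES = {
    "Dispute Transaction": {"must": {"dispute"}, "should": {"transaction", "charge"}},
    "Complaint or Compliment": {"must": set(), "should": {"unhappy", "complaint", "thank"}},
}

def detect_topics(text):
    """Return every topic whose keyword rule matches, each with confidence 1.0."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    matches = {}
    for topic, rule in RULES.items():
        has_must = rule["must"] <= words                      # all must-match words present
        has_should = not rule["should"] or bool(rule["should"] & words)  # any should-match word
        if has_must and has_should:
            matches[topic] = 1.0  # a keyword match is all-or-nothing
    return matches

# Hypothetical email text.
email = "I am unhappy about a charge on my card and want to dispute the transaction."
print(detect_topics(email))
```

Because a rule either matches or does not, the sketch cannot rank the two topics against each other, which is the limitation the model-based approach addresses.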

Model based

Now you need to create the model-based version of the MySupport model.

Navigate to Prediction Studio. Filter the models by text categorization and select the MySupport model. Test the same dispute transaction email in the model to view the results. Confirm that the email is still associated with two topics. Note that the confidence score for a rule-based model is always 1.

Test

To update the topic detection model, click Update language. This enables you to build a new model and train it as required. Begin the topic detection model creation wizard by selecting Use machine learning and clicking Update. Review the topics listed in the Topics section. The new model will be created for this set of topics.

Update

The training data should contain text examples for each topic, each labeled with the expected result. The file must contain columns named “Content”, “Result”, and “Type”. A sample training data file looks like this, with the sample text, the desired topic, and a flag that indicates whether each row is used for training or for testing.
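As a rough illustration of that layout, the following sketch builds a small file in memory and checks that the three required columns are present. The sample rows and the Type values "Train" and "Test" are assumptions; the exact values expected by the wizard are not stated here.

```python
import csv
import io

REQUIRED_COLUMNS = ["Content", "Result", "Type"]

# Hypothetical sample rows; the Type values "Train" and "Test" are assumed.
SAMPLE_FILE = """Content,Result,Type
I want to dispute a charge on my card,Dispute Transaction,Train
Your agent was rude and I am unhappy,Complaint or Compliment,Train
Thank you for the excellent service,Complaint or Compliment,Train
I need to dispute a transaction on my statement,Dispute Transaction,Test
"""

def load_training_rows(text):
    """Parse CSV text and fail early if a required column is missing."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [name for name in REQUIRED_COLUMNS if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return list(reader)

rows = load_training_rows(SAMPLE_FILE)
topics = sorted({row["Result"] for row in rows})
print(len(rows), "rows covering topics:", topics)
```

A check like this before uploading catches misspelled column headers, which would otherwise only surface inside the wizard.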

Data

Now, you must choose the data source. Select the user-defined sampling based on the ‘Type’ column to complete the sample construction. This ensures that the training rows in the uploaded data source are used to create the model, and that the created model is then tested against the test rows in the same data source to verify the results. Next, select Maximum Entropy as the model to be built. Pega supports all the listed algorithms. However, Maximum Entropy is the most appropriate one for the current use case.
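The following sketch shows the idea behind these two choices: rows are split by their Type value, and a maximum entropy classifier (mathematically the same as multinomial logistic regression) is fitted on the training rows only. It is a toy pure-Python version, not what Pega runs internally. The rows, the "Train"/"Test" values, the bag-of-words features, and the learning settings are all assumptions.

```python
import math
import re
from collections import defaultdict

# Hypothetical rows in the Content / Result / Type layout.
ROWS = [
    ("I want to dispute a charge on my card", "Dispute Transaction", "Train"),
    ("Please reverse this transaction, I did not make it", "Dispute Transaction", "Train"),
    ("Your agent was rude and I am unhappy", "Complaint or Compliment", "Train"),
    ("Thank you for the excellent service", "Complaint or Compliment", "Train"),
    ("I need to dispute a transaction on my statement", "Dispute Transaction", "Test"),
    ("I am unhappy with the service", "Complaint or Compliment", "Test"),
]

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

def softmax(scores):
    """Turn raw per-topic scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def predict(weights, labels, text):
    scores = {lab: sum(weights[(lab, tok)] for tok in tokens(text)) for lab in labels}
    return softmax(scores)

def train(rows, epochs=200, rate=0.5):
    """Fit one weight per (topic, word) pair by stochastic gradient ascent."""
    labels = sorted({label for _, label in rows})
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, gold in rows:
            probs = predict(weights, labels, text)
            for lab in labels:
                error = (1.0 if lab == gold else 0.0) - probs[lab]
                for tok in tokens(text):
                    weights[(lab, tok)] += rate * error
    return weights, labels

# User-defined sampling: the Type column decides which rows train and which test.
train_rows = [(content, result) for content, result, kind in ROWS if kind == "Train"]
test_rows = [(content, result) for content, result, kind in ROWS if kind == "Test"]

weights, labels = train(train_rows)
results = []
for text, gold in test_rows:
    probs = predict(weights, labels, text)
    best = max(probs, key=probs.get)
    results.append((gold, best, probs[best]))
    print(f"expected {gold!r}, predicted {best!r}, confidence {probs[best]:.2f}")
```

Unlike the keyword rules, the classifier produces a probability for every topic, so the predicted topic comes with a confidence between 0 and 1 rather than a fixed 1.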

Model creation

Review the created model and its measured performance on the test data.
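One common way to measure performance on test data is per-topic precision, recall, and F-score. The sketch below computes them from expected and predicted topics. Which metrics the wizard reports is not stated here, so treat the choice of metrics as an assumption; the label lists are made up for the example.

```python
from collections import Counter

def per_topic_scores(actual, predicted):
    """Return {topic: (precision, recall, f_score)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, guess in zip(actual, predicted):
        if gold == guess:
            tp[gold] += 1
        else:
            fp[guess] += 1   # guessed this topic, but it was wrong
            fn[gold] += 1    # missed the topic that was expected
    scores = {}
    for topic in sorted(set(actual) | set(predicted)):
        p_den = tp[topic] + fp[topic]
        r_den = tp[topic] + fn[topic]
        precision = tp[topic] / p_den if p_den else 0.0
        recall = tp[topic] / r_den if r_den else 0.0
        both = precision + recall
        f_score = 2 * precision * recall / both if both else 0.0
        scores[topic] = (precision, recall, f_score)
    return scores

# Hypothetical test-set outcome: one dispute email was misread as a complaint.
actual = ["Dispute Transaction", "Dispute Transaction",
          "Complaint or Compliment", "Complaint or Compliment"]
predicted = ["Dispute Transaction", "Complaint or Compliment",
             "Complaint or Compliment", "Complaint or Compliment"]

scores = per_topic_scores(actual, predicted)
for topic, (precision, recall, f_score) in scores.items():
    print(f"{topic}: precision={precision:.2f} recall={recall:.2f} f={f_score:.2f}")
```

A low recall for a topic means emails on that topic are being routed elsewhere, which is a signal to add more training examples for it.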

You have completed all the configuration steps. You can now save your configuration and test the changes.

Test the same email using the model and verify that the case category is now more accurate. Notice also that the confidence factor is now 0.85 instead of 1. The confidence factor is a value from 0 to 1 that indicates how likely it is that the topic is correctly detected.

Results

Perform an end-to-end test by sending the same dispute transaction email to Pega Customer Service, and notice that the email is now routed more accurately.

This video has concluded. What did it show you?

  • How to enable model-based topic detection in a Text Analyzer
  • How to create and train a model-based topic detection text categorization model
