
Greenfield data modeling

You create a greenfield data model from scratch when no data model from Pega Marketplace fulfills the client's data model requirements.

One technique for developing an object model is to parse a business requirement document while extracting nouns and verbs. Nouns become data types, and verbs become processes. A process can be a case type or an action that occurs in a case type. Once project development teams identify the nouns, the data modeling job begins.
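The noun and verb technique can be sketched in a few lines of Python. This is a minimal illustration only: the requirement sentence, the hand-assigned tags, and the function name `extract_candidates` are hypothetical, and in practice the tagging is done by the people reading the requirement document.

```python
# Hypothetical, hand-tagged words from a requirement sentence such as:
# "A customer submits a claim, and an adjuster reviews the claim."
# The tags are illustrative assumptions, not the output of any tool.
TAGGED_WORDS = [
    ("customer", "noun"),
    ("submits", "verb"),
    ("claim", "noun"),
    ("adjuster", "noun"),
    ("reviews", "verb"),
    ("claim", "noun"),
]


def extract_candidates(tagged_words):
    """Split tagged words into candidate data types and candidate processes."""
    data_types, processes = [], []
    for word, tag in tagged_words:
        target = data_types if tag == "noun" else processes
        if word not in target:  # keep the first occurrence, drop repeats
            target.append(word)
    return data_types, processes


data_types, processes = extract_candidates(TAGGED_WORDS)
print("Candidate data types:", data_types)
print("Candidate processes:", processes)
```

The resulting lists are only candidates. The team still decides which nouns are real data types and whether each verb is a case type or an action within a case type.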
 
Industry-standard data modeling techniques prioritize the data needs of the business and application first, deferring all considerations for the physical data models until the application and business needs are complete. 

Approach in greenfield data modeling

Greenfield data modeling follows a three-level approach:  

  1. Conceptual  
  2. Logical 
  3. Physical

As a Lead System Architect (LSA), you play a significant role at every level of data modeling by collaborating with other stakeholders.

The following summary shows the activities that stakeholders perform at each level:

Design level: Conceptual
Stakeholders: Business user, subject matter expert (SME) for a business domain, business analyst, and Lead System Architect
Activities:
  • Identify the data elements and their data source.
  • Understand the flow of information.
  • Determine the proper data type for each data element.

Design level: Logical
Stakeholders: Business user, subject matter expert (SME) for a business domain, business analyst, and Lead System Architect
Activities:
  • Create a logical group of elements by using inheritance or composition.
  • Choose the correct layer for the data elements.
  • Identify the primary and foreign keys for each data type.
  • Create data entity relationships.
  • Define the cardinality of each relationship.
  • Add constraints, normalize the data, and remove redundancy.
  • Consider data integrity issues.

Design level: Physical
Stakeholders: Database administrator (DBA) and Lead System Architect
Activities:
  • Choose the appropriate schema and create a database table.
  • If required, create indexes, views, and other alterations.
  • Optimize the data model for performance, scalability, and security.

 

Conceptual level in data modeling

Conceptual data modeling is the process of creating a high-level, abstract representation of the data requirements in a system or application. It focuses on defining the main data entities, their relationships, and key attributes, without going into the specific details of the data structure or implementation. The main purpose of conceptual data modeling is to provide a collective understanding of the data requirements among various stakeholders, facilitating communication and agreement on the overall data structure.

Conceptual data modeling in Pega applications consists of an informal or implicit understanding of the following elements:

  • Clear listing of all data elements required for a given business scenario. 
  • Definition of the type of data element. 
  • Identification of the source of the data element.
  • Flow of the data in the business process. 
  • Recognition of the ownership and access control requirements of the data. 
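One way to make this informal understanding explicit is a simple inventory of data elements. The sketch below is an assumption for illustration, not a Pega artifact: the class `ConceptualElement`, its fields, and the sample claim data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualElement:
    name: str       # data element required by the business scenario
    kind: str       # type of the data element (text, date, decimal, ...)
    source: str     # system or person that supplies the value
    used_in: str    # step of the business process where the data flows
    owner: str      # role that owns the data
    readers: tuple  # roles that are allowed to view the data


# Hypothetical inventory for a claim intake scenario.
INVENTORY = [
    ConceptualElement("Policy number", "text", "Policy system",
                      "Claim intake", "Underwriting", ("Adjuster", "Customer")),
    ConceptualElement("Incident date", "date", "",
                      "Claim intake", "Claims", ("Adjuster",)),
    ConceptualElement("Claim amount", "decimal", "Customer",
                      "Claim review", "Claims", ("Adjuster", "Manager")),
]


def missing_details(elements):
    """Return the names of elements that still lack a source or an owner."""
    return [e.name for e in elements if not e.source or not e.owner]


print("Needs follow-up:", missing_details(INVENTORY))
```

A list like this gives business users, SMEs, and the LSA a shared artifact to review, and it exposes gaps (here, an element with no identified source) before logical modeling starts.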

Logical level in data modeling

Logical data modeling is the process of refining the conceptual data model by adding more detail and structure to the entities, relationships, and attributes. Logical data modeling defines primary keys, foreign keys, and constraints to ensure data integrity and the normalization of data to eliminate redundancy and inconsistencies. The logical data model is usually technology-agnostic, focusing on organizing the data independently of any specific database management system (DBMS). 
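As a rough illustration of normalization and key-based integrity, the following sketch splits flat, redundant rows into two entities linked by a foreign key. The record layout, the sample values, and the function names are hypothetical and independent of any DBMS or Pega feature.

```python
# Flat, denormalized rows: the customer name repeats on every claim.
FLAT_CLAIMS = [
    {"claim_id": "C-1", "amount": 1200.0,
     "customer_id": "CU-1", "customer_name": "Ana Ruiz"},
    {"claim_id": "C-2", "amount": 300.0,
     "customer_id": "CU-1", "customer_name": "Ana Ruiz"},
    {"claim_id": "C-3", "amount": 950.0,
     "customer_id": "CU-2", "customer_name": "Ben Okoro"},
]


def normalize(flat_rows):
    """Split flat rows into customers (keyed by primary key) and claims."""
    customers, claims = {}, []
    for row in flat_rows:
        # Each customer is stored once, under its primary key.
        customers[row["customer_id"]] = {
            "customer_id": row["customer_id"],
            "name": row["customer_name"],
        }
        # A claim keeps only the foreign key that points to its customer.
        claims.append({
            "claim_id": row["claim_id"],
            "amount": row["amount"],
            "customer_id": row["customer_id"],
        })
    return customers, claims


def orphaned_claims(customers, claims):
    """Integrity check: list claims whose foreign key matches no customer."""
    return [c["claim_id"] for c in claims if c["customer_id"] not in customers]


customers, claims = normalize(FLAT_CLAIMS)
print(len(customers), "customers,", len(claims), "claims")
print("Orphaned claims:", orphaned_claims(customers, claims))
```

After normalization, a change to a customer's name happens in one place, which is the redundancy and inconsistency problem that the logical level is meant to remove.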

The output of the logical data modeling in Pega includes:

  1. Data Objects (Data types): Data objects, also known as data types, represent the main entities in the application and encapsulate the data structure and behavior. In Pega software, data objects can be created using App Studio or Dev Studio and are typically based on a single database table or an external data source. 
  2. Application layer: The application layer in which the data object is placed. If a data object is required across the organization, place it in the Enterprise layer. If the data object fulfills a requirement that is specific to one layer, place it in that layer.
  3. Properties: Properties are the attributes or fields within a data object that define the specific data elements and their characteristics, such as data type, default value, and validation rules. In Pega software, properties can be created and managed using the Property rule form in Dev Studio. 
  4. Relationships: The relationships between data objects are defined using associations, foreign keys, or other mechanisms. These relationships determine how data objects interact with one another and enforce data integrity across the application. You also decide the cardinality of each relationship.
  5. Constraints and Validation Rules: In Pega applications, constraints and validation rules can be defined within the data object or property rule forms to ensure data integrity and consistency. These rules enforce business logic and help maintain the quality of the data in the application. 
  6. Data Pages: Data pages in Pega applications are used to load, cache, and manage data from various sources. They can be considered a part of logical data modeling, as they define how data is retrieved and manipulated within the application. 
  7. Reporting Database: LSAs should analyze the business requirements. If there is a need for granular reporting, and the frequency of reporting is high, then plan for a reporting database. 
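The logical outputs above (data objects, properties, relationships, cardinality, and validation) can be pictured with plain Python classes. This is a language-neutral sketch under stated assumptions, not how Pega stores or defines rules: the classes `Party`, `Customer`, `LineItem`, and `Claim` and their fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Party:
    party_id: str  # primary key
    name: str


@dataclass
class Customer(Party):  # inheritance: a Customer is a Party
    email: str = ""


@dataclass
class LineItem:
    item_id: str  # primary key
    description: str
    amount: float

    def __post_init__(self):
        # Validation rule: reject values that break the business constraint.
        if self.amount < 0:
            raise ValueError("amount must not be negative")


@dataclass
class Claim:
    claim_id: str     # primary key
    customer_id: str  # foreign key to Customer.party_id (many claims, one customer)
    # Composition: one claim contains many line items.
    items: List[LineItem] = field(default_factory=list)

    def total(self) -> float:
        return sum(item.amount for item in self.items)


customer = Customer(party_id="CU-1", name="Ana Ruiz", email="ana@example.com")
claim = Claim(
    claim_id="C-1",
    customer_id=customer.party_id,
    items=[LineItem("LI-1", "Windshield", 1200.0),
           LineItem("LI-2", "Labor", 300.0)],
)
print("Claim total:", claim.total())
```

The sketch shows the two grouping options from the logical level side by side: `Customer` reuses `Party` through inheritance, while `Claim` holds its `LineItem` objects through composition with a one-to-many cardinality.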

Physical level in data modeling

Physical data modeling is the process of translating the logical data model into a detailed, database-specific implementation. This phase involves defining the actual database objects, such as tables, columns, indexes, constraints, partitions, storage, and other elements that are specific to the chosen database management system (DBMS). The primary goal of physical data modeling is to optimize the data model for performance, security, and maintainability within the target database environment. 
 
Physical data modeling in Pega applications typically consists of the following elements:

  1. Database schema: A schema represents the structure of the database, including tables, columns, data types, and constraints. In Pega software, the standard schemas for application data include Pega Data and Customer Data. As an LSA, you choose the appropriate schema for the given business requirement and determine which information belongs in the Customer Data schema and which belongs in the Pega Data schema. Pega applications follow a set of default naming conventions and structures for creating tables and columns that correspond to the data objects and properties.
  2. Indexes: Indexes are used to optimize the performance of data retrieval operations. In Pega applications, system-generated indexes are automatically created for some properties, such as primary keys. You can also configure additional custom indexes for properties that are frequently used in search or filtering operations. 
  3. Reporting Database: Create the Pega reporting database if there is a need for analytical and trend reporting or for running reports more frequently.
  4. Data storage and partitioning: Pega software uses a set of default storage configurations, such as BLOB (Binary Large Object) storage for storing application data. However, you can also configure additional storage settings, such as partitioning and archiving, to optimize the performance and manageability of the database. 
  5. Integration with external data sources: In Pega applications, you can integrate external databases or data sources using connectors and integration rules. The configuration of these connectors and rules can be considered to be part of the physical data modeling output, as they define how external data is accessed and managed within the Pega application. 
  6. Security and access control: In Pega applications, security and access control configurations, such as authentication profiles and data access roles, define how users can access and manipulate the data stored in the application. These configurations can also be considered to be part of the physical data modeling output, because they impact the overall data management and security of the database environment.
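To make the step from logical to physical concrete, here is a minimal, generic sketch that uses Python's built-in `sqlite3` module with an in-memory database. It is not Pega's schema or tooling: the table names, column names, and the index name `idx_claim_status` are hypothetical, and a production DBMS offers options (partitioning, storage settings) that SQLite does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Tables, keys, constraints, and an index for a frequently filtered column.
conn.executescript("""
CREATE TABLE customer (
    customer_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE claim (
    claim_id    TEXT PRIMARY KEY,
    customer_id TEXT NOT NULL REFERENCES customer(customer_id),
    status      TEXT NOT NULL,
    amount      REAL NOT NULL CHECK (amount >= 0)
);
CREATE INDEX idx_claim_status ON claim (status);
""")

conn.execute("INSERT INTO customer VALUES (?, ?)", ("CU-1", "Ana Ruiz"))
conn.executemany(
    "INSERT INTO claim VALUES (?, ?, ?, ?)",
    [("C-1", "CU-1", "Open", 1200.0), ("C-2", "CU-1", "Closed", 300.0)],
)
conn.commit()

# Filtering on the indexed column.
open_claims = [
    row[0]
    for row in conn.execute(
        "SELECT claim_id FROM claim WHERE status = ? ORDER BY claim_id", ("Open",)
    )
]
print("Open claims:", open_claims)

# The foreign key constraint rejects a claim for an unknown customer.
try:
    conn.execute("INSERT INTO claim VALUES (?, ?, ?, ?)",
                 ("C-3", "CU-404", "Open", 10.0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("Invalid claim rejected:", rejected)
```

The primary keys, the foreign key, and the check constraint come directly from the logical model, while the index is a purely physical decision driven by how the data is queried.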

