Forum

Please or Register to create posts and topics.

Data Warehouse Modeling

In this article we will talk about the steps of Data warehouse modelling, but first let’s talk about what’s meant by Data warehouse and Data warehouse Modeling?

  1. Data Warehouse is the Consolidation of data from Multiple Sources to Target to achieve single point of truth and As defined by ‘Bill Inmon the father of data warehousing’, a Data Warehouse is a subject-oriented, nonvolatile, integrated, and time-variant collection of data that supports management decision-making.
  2. Data warehouse modeling is the process of designing the schemas of the detailed and summarized information of the data warehouse. The goal of data warehouse modelling is to develop a schema describing the reality, or at least a part of the fact, which the data warehouse is needed to support.

Let’s define Data warehouse Modelling Process which are as follows:

Step1: Determine Business Objectives:

Any Business that you will go through it, you will find a rapid growth of each part of it. To answer the decision-makers’ questions, we need to understand what Business defines? , we need to know their requirements and the measurement of their business activity that they use to guide the organization. These measurements are the key performance indicators (KPIs), a numeric measure of the company’s activities, such as units sold, gross profit, net profit and time duration. This KPIs we will collect it into a table called a fact table.

Step 2: Collect and Analyze Information:

The only way to gather this information is to ask questions, Listen carefully and repeat back what you think you heard

  1. Collect all Requirements (technical requirements)
  2. Prepare BRD (Business Requirement Document)
  3. Create TRD (Technical requirement Document)

Once the requirements have been gathered, they should be documented in a business requirements document (BRD) and a technical requirements document (TRD). The BRD should describe the high-level business goals of the data warehouse and the types of questions that need to be answered. The TRD should describe the technical specifications of the data warehouse, such as the data sources, the data model, and the hardware and software requirements.

Step 3: Identify Data Sources:

In this phase we are working on Data sources, This involves identifying all of the systems and applications that contain data that will be used in the data warehouse. The data sources can be databases, files, Reports and even manual processes.

You’ll need access to all of these DS & Reports. Then start to identify it and profile it. Once the data sources have been identified, they need to be profiled. This involves understanding the data formats, the data quality, and the relationships between the different data sources. The data profiling information can be used to design the data warehouse and to develop the data extraction, transformation, and loading (ETL) processes

Once the data sources have been profiled, the next step is to select the data sources that will be used in the data warehouse. This selection should be based on the business requirements and the KPIs that need to be supported.

 Here are some additional things to consider when identifying data sources for data warehouse design:

  • Consider the future needs of the organization. When selecting data sources, it is important to consider the organization’s future needs. What data will the organization need in the future? What systems and applications will the organization be using in the future?
  • Consider the cost and complexity of extracting and loading data from the data sources. When selecting data sources, it is important to consider the cost and complexity of extracting and loading data from the data sources. Some data sources may be more difficult and expensive to extract and load data from than others.
  • Consider the security and privacy of the data. When selecting data sources, it is important to consider the security and privacy of the data. Some data sources may contain sensitive data that needs to be protected.

 

Step 4: Create a Conceptual data Model:

In data modeling, we go through three different stages: conceptual, logical, and physical. The conceptual data model is the most abstract data model that describes the data elements without much detail. It is typically created by business stakeholders and data architects. The purpose of the conceptual data model is to organize, scope, and define business concepts and rules. It establishes the entities and their relationships, but does not specify attributes or primary keys

The conceptual data model is an important step in the data warehouse design process because it helps to ensure that the data warehouse meets the needs of the business. It also provides a common understanding of the data for all stakeholders involved in the data warehouse project. and here an example of conceptual model.

Conceptual Data Model

Once you have created a conceptual data model, you can use it to design the logical and physical data models.

Step 5: Create a Logical Data Model:

The logical data model is a more detailed version of the conceptual data model, it’s a blueprint for the data warehouse that defines how the data should be structured and how the different entities should be related. It is independent of any specific database management system (DBMS), and it is typically created in collaboration with Data Architect and Business Analyst. The purpose of the logical data model is to develop a technical blueprint of the data warehouse, including the rules and data structures that will be used to implement it.

The logical data model is a more detailed version of the conceptual data model, and it includes the following:

  • A list of the entities and their relationships
  • A list of attributes for each entity
  • Normalized data structures

 

Logical Data Model

Once the logical data model has been normalized, it is important to review it with the business stakeholders to ensure that it meets their requirements. Once the logical data model has been finalized, it can be used to design the physical data model.

Step 6: Create a Physical Data Model:

The physical data model is a detailed representation of the data warehouse that describes how it will be implemented using a specific DBMS. It is typically created by database administrators (DBAs) and Data architect. The purpose of the physical data model is to provide a blueprint for the actual implementation of the database.

The physical data model is mapped to the physical structures of the RDBMS system hosting the data warehouse. This includes defining physical RDBMS structures, such as tables and data types.

The steps for designing physical data model design which are as follows:

  • Convert entities to tables.
  • Convert relationships to foreign keys.
  • Convert attributes to columns.

 

Physical Data Model

Once the physical data model has been finalized, it can be used to implement the data warehouse.

Here are some additional tips for designing a physical data model:

  • Use standard naming conventions. This will make the physical data model easier to read and maintain.
  • Use appropriate data types. This will help to ensure the integrity of the data.
  • Normalize the data model. This will help to reduce redundancy and improve performance.
  • Use indexes. Indexes can improve the performance of queries.
  • Document the physical data model. This will make it easier for others to understand and maintain the data warehouse.

 


Here you will find a list of data modeling tools to use during your practice:

  • Erwin
  • Toad Modeler
  • Oracle SQL Developer Data Modeler
  • MySQL Workbench
  • ModelSphere

 


Thanks for completing my article! I wish you the best in your continual journey. I hope this was valuable to you.

Remember, learning is a lifelong journey, so keep exploring and expanding your knowledge. I look forward to sharing the next article about DWH Schemas with you soon.