Select Page

Category Selected: Analytics Testing

8 results Found


People also read

Artificial Intelligence

AI Ethics Guidelines: A Practical Guide

Software Development

Accessible Website Development Tips and Tricks

AI Testing

Prompt Engineering for QA: Essential Tips

Talk to our Experts

Amazing clients who
trust us


poloatto
ABB
polaris
ooredo
stryker
mobility
ETL Data Quality Testing Best Practices

ETL Data Quality Testing Best Practices

Why do my ETL data quality testing techniques fail to identify bad data?

Bad data costs organizations dearly in correction activities, lost customers, missed opportunities, and incorrect decisions.

How to perform ETL Data Quality Testing to find bad data?

Impact of poor data

  • Poor data can lead to bad business decisions
  • Delays in delivering data to decision makers
  • Lost customers through poor service

ETL Data Quality Testing

ETL Testing Techniques

  • Number of Records Validation
  • Data Completeness
  • Not Null Validation
  • Validate valid values
  • Frequency Distribution
  • Min and Max Validations
  • Duplicate Records
  • Pattern Check
  • Consistency
  • Precision
  • Timeliness
  • Business Rules
  • Data Type Check
  • Size & Length Validation

A firm’s basis for competition . . . has changed from tangible products to intangible information. A firm’s information represents the firm’s collective knowledge used to produce and deliver products and services to consumers. Quality information is increasingly recognized as the most valuable asset of the firm. Firms are grappling with how to capitalize on information and knowledge. Companies are striving, more often sliently, to remedy business impacts rooted in poor quality information and knowledge.

– Kuan-Tsae Huang, Yang W. Lee and Richard Y. Wang

Data Quality Checks for Data Warehouse/ETL

Data Quality Checks for Data Warehouse/ETL

Data should be perceived as a strategic corporate tool, and data quality must be regarded as a strategic corporate responsibility. The corporate data universe is made up of a wide range of databases that are connected by infinite real-time and batch data feeds. Data is an ever-constant movement, and transition, the core of any solid and thriving business is high-quality data services which will, in turn, make for efficient and optimal business success.

However, the huge conundrum here is that data quality, solely on its own cannot improve. In truth, most IT processes have a negative impact on the quality of data. Therefore, if nothing is done, the quality of data will continue to plummet until the point that data will be considered a burden. It is also pertinent to point out that data quality is not a feat that can easily be achieved and after that, you brand the mission complete and heap praises on yourself for eternity. Rather, it must be viewed as a garden that must be continuously looked after.

Data Warehouse testing is becoming increasingly popular, and competent testers are being sought after. Data-driven decisions are termed to be accurate. A good grasp of data modelling and source to target data mappings can assist QA analysts with the relevant information to draw up an ideal testing strategy.

Data Quality Check-Verify Field Data Type and Length

Authenticate source and target fields data type and length.

Data Quality Check-Verify Field Data Type and Length

In the verification pictured above, we have a mismatch of the data type and length in the target table. It is a good habit to verify data type and length uniformity between the source and target tables.

Data Quality Check-Verify Not Null Fields

Authenticate Null Values in a column which features a NOT NULL CONSTRAINT

Data Quality Check-Verify Not Null Fields

When a source has a Not Null Constraint, it should not feature Null Values as demonstrated above.

Data Quality Check-Verify For Duplicate Records

If your data warehouse features duplicated records, then your business decisions will be inaccurate and undependable. Poor data that is marred with inaccurate and duplicate data records will not enable stakeholders to properly forecast business targets.

Data Quality Check-Verify For Duplicate Records

A QA team should ensure that source data files are verified to spot duplicate records and any error in data. Similarly, you can integrate automated data testing by adopting Python to authenticate all data rather than just checking an example or subset.

Data Quality Check-Verify for Orphan Records

A situation where there are child records with no matching parent records, then such records are termed “Orphan Records.” The business should outline Strategic relationship rules amongst parent and child tables.

Data Quality Check-Verify for Orphan Records

Data Quality Check-Verify for Unknown Data

On some occasions, unknown data validation is necessary to ensure that the right changes happened as planned for value encoding. For instance, consider during the transformation process, if there is a logic to encode the value “Male” as “M”, subsequently, the QA team should cross-check the Gender column to ensure that it does not feature a different encoded value.

Data Quality Check-Verify for Unknown Data

Conclusion

Data quality is a broad and complicated field with many aspects. Corporate data universe is made up of different databases, linked in countless real-time and batch data interfaces. Irrespective of the industry, revenue size or its target market, virtually every organization depends on credible data to produce useful information for appropriate business decisions. Data quality can be jeopardized at any level; reception, entering, integration, maintenance, loading or processing.

Thus, efforts must be made to ensure that quality data via Data Warehouse testing/ETL testing is aimed at guaranteeing the production, availability, and use of high-quality data within an organization.