Data storage testing comprises a set of critical activities and automated tools for data validation: the process of identifying anomalies introduced during backup and storage. These issues can include hacking, data deletion, corruption and malware.
Damage to data and systems can occur at various points along the path to storage. Problems can arise in the system generating the data, in the applications processing backups, in the network connection between the data source and the data repository, and in the storage technology itself.
Benefits and challenges of data storage testing
The primary benefit of data storage testing is ensuring that backed-up data is exactly as it was prior to backup. Time spent recovering damaged data files or systems is wasted effort. Incorrect data can result in lost customers and even lawsuits; incorrect shipping data can disrupt the supply chain; and malfunctioning systems compound the damage to customer relationships and reputation.
Key challenges for data storage testing include scheduling tests, finding software that provides optimum data analysis and validation, establishing policies and procedures, setting up service-level agreements that include storage testing, and having staff trained to perform testing and validation. With third-party storage, such as cloud technology, check to see if the vendor provides testing and validation services.
Types of data storage tests
Several data storage testing options provide timely data on the health and operating status of primary storage systems. For Windows systems, the chkdsk utility scans drives for file system errors and bad sectors and can repair many of them. Next, try self-monitoring, analysis and reporting technology, or SMART, software to examine the health of storage drives and flag possible problems before they occur.
Examine the status of the system BIOS to identify potential drive issues. Finally, investigate disk diagnostic tools from the primary storage vendor or from third-party vendors.
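As one illustration, drives that expose SMART data can be polled from a script. The following is a minimal sketch that calls smartctl from the smartmontools package, assuming it is installed and the script has the privileges to query the device; the device path is a hypothetical example.

```python
# Minimal sketch: query a drive's SMART health status with smartmontools'
# smartctl. Assumes smartctl is installed and the script runs with
# sufficient privileges; the device path below is hypothetical.
import subprocess

def smart_health(device: str) -> bool:
    """Return True if smartctl reports the drive's overall health as PASSED."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True, text=True, check=False,
    )
    # smartctl prints a line such as:
    # "SMART overall-health self-assessment test result: PASSED"
    return "PASSED" in result.stdout

if __name__ == "__main__":
    device = "/dev/sda"  # hypothetical device; adjust for your system
    print(f"{device}: {'healthy' if smart_health(device) else 'check the drive'}")
```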
On the backup side of testing, the simplest approach is to switch to the storage medium, locate the folder assigned for backups, find the file or application that was just backed up and try to access it. If the file appears in the list of files, that's the first good sign. Next, try to open the file or launch the application. The users who created the files or developed the application may need to perform these checks, as they will know whether workloads look right.
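This spot check is also easy to script. The sketch below, using hypothetical paths and file names, confirms that a backed-up file exists and can actually be opened:

```python
# Minimal sketch of the "locate it and open it" spot check described above.
# The backup folder and file name are hypothetical placeholders.
from pathlib import Path

backup_dir = Path("/mnt/backup/projects")      # hypothetical backup folder
target = backup_dir / "quarterly_report.xlsx"  # hypothetical backed-up file

if not target.exists():
    print(f"MISSING: {target} was not found in the backup")
else:
    try:
        with target.open("rb") as f:
            f.read(4096)  # read the first block to confirm the file opens
        print(f"OK: {target} is present and readable")
    except OSError as exc:
        print(f"DAMAGED: {target} exists but cannot be read ({exc})")
```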
Assuming everything is in order, the next test for file validation is benchmarking. Use software that examines the original and backup items and compares them to identify any anomalies.
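One common technique such comparison tools use is checksum matching: compute a digest of each item, and any mismatch between the original and the backup signals an anomaly. A minimal sketch of that technique, with hypothetical file paths:

```python
# Minimal sketch of file comparison: hash the original and the backup copy
# and flag any mismatch. Paths are hypothetical; a production job would walk
# whole directory trees and log its results.
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

original = Path("/data/orders.db")       # hypothetical source file
backup = Path("/mnt/backup/orders.db")   # hypothetical backup copy

if sha256(original) == sha256(backup):
    print("Backup matches the original")
else:
    print("ANOMALY: backup differs from the original")
```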
Application testing examines metrics such as online transaction processing response times and batch runtimes as compared to vendor specifications. Sample queries are typically entered into the backed-up application to analyze response times in a near-production environment.
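As a simplified illustration, a sample query can be timed against a restored copy of the application database and checked against a target response time. The sketch below uses SQLite only to stay self-contained; the database file, query and threshold are all hypothetical assumptions:

```python
# Minimal sketch of a response-time check against a restored copy of an
# application database. The database file, sample query and 200 ms
# threshold are hypothetical.
import sqlite3
import time

DB_PATH = "restored_app.db"   # hypothetical restored database
SAMPLE_QUERY = "SELECT COUNT(*) FROM orders WHERE status = 'open'"
THRESHOLD_SECONDS = 0.2       # hypothetical acceptable response time

conn = sqlite3.connect(DB_PATH)
start = time.perf_counter()
conn.execute(SAMPLE_QUERY).fetchall()
elapsed = time.perf_counter() - start
conn.close()

status = "within" if elapsed <= THRESHOLD_SECONDS else "OVER"
print(f"Query took {elapsed:.3f}s ({status} the {THRESHOLD_SECONDS}s threshold)")
```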
Use scripts for data validation
Staff with coding expertise may be able to write scripts to perform data validation. The script instructs the system to compare data values and structure against previously defined rules, verifying that the quality parameters for the data have been satisfied.
If a violation occurs, troubleshooting can identify it for remediation. This approach to validation can be time-consuming, which is where software tools are helpful.
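As a simple example of the scripted approach, the sketch below checks each record in a CSV export against a handful of predefined rules and reports violations for remediation; the file name, column names and rules are hypothetical:

```python
# Minimal sketch of rule-based validation: check each record of a CSV
# export against previously defined rules and report violations.
# The file name, columns and rules are hypothetical.
import csv

RULES = {
    "order_id": lambda v: v.isdigit(),                 # must be numeric
    "quantity": lambda v: v.isdigit() and int(v) > 0,  # positive integer
    "country": lambda v: len(v) == 2 and v.isalpha(),  # two-letter code
}

violations = []
with open("backup_orders.csv", newline="") as f:
    # Row 1 is the header, so data rows start at 2.
    for row_num, row in enumerate(csv.DictReader(f), start=2):
        for field, rule in RULES.items():
            value = row.get(field, "")
            if not rule(value):
                violations.append((row_num, field, value))

for row_num, field, value in violations:
    print(f"Row {row_num}: {field} failed validation (value={value!r})")
print(f"{len(violations)} violation(s) found")
```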
Use software for data validation
Software programs are available to facilitate the process of data validation. The software is based on rules and file structures created by the user and provides validation at every point in a workflow, highlighting anomalies for attention and remediation. The following is a list of several data validation software vendors and products:
- RightData
- Xplenty
- iCEDQ
- Informatica
- QuerySurge
- Datagaps ETL Validator
- QualiDI
- Talend