Validating a Data Warehouse

I recently sent some thoughts on validating a data warehouse to a colleague working with a clinical data system.  Data warehousing is a complex topic, but that has never stopped me before.  I am reproducing my thoughts almost verbatim below.  Feel free to disagree or add to the conversation…

You should think about the extent and thoroughness you need at every level of testing (Unit, Integration, System, User Acceptance, Operational Readiness, etc.).  Depending upon the criticality of the data and how it will be used, you will need to do some or all of this testing.  Not all of the testing, however, needs to be done using formal, pre-approved test scripts; though the approach, including how each level of testing will be documented, should be documented and approved within your Validation Plan by the business, the project team (or IT), and Quality. 

For formal testing, I would recommend you focus on a subset of system level and most or all of the User Acceptance level of testing for OQ and Operational Readiness Testing for PQ.  For IQ, I refer to a webinar I gave recently that talks about aspects of testing.  In general, IQ covers hardware and physical connections as well as software installation verification (versions, patches, etc.).

Regarding validating only the reports, I would think that is not enough.  I assume the data may be accessed through channels other than just reports and that new reports will be generated that may access data not included in previous reports.  There will need to be testing that helps you gain a level of confidence that all of the data that may be accessed is valid and secure.