This book will give you a solid foundation in the basics, expose the issues, and provide a high-level process for planning and implementing a data warehouse. It is divided into sections, the first three covering people, process and technology.
Section One starts with an overview of enterprise IT architectures, how data warehousing fits into the scheme of things, and the associated business and technical perspectives. I like the way the authors emphasize business perspectives, which is a consistent thread throughout the book. They use a framework called "InfoMotion", which covers all of the requirements, but (to me) is too wrapped up in "consultant-speak". For example, they litter this section with nonsense such as "InfoMotion = Information/Data * motion". While it makes perfect sense from a conceptual viewpoint, there is no way to compute it, so why express it as a formula? Parenthetically, data is easy to quantify; measuring information is difficult, but can be done. The motion part of the equation is plain silliness because no basis is given for measuring it. But I am nitpicking here.
You are next introduced to data warehouse concepts. This gives a complete foundation that covers all of the key elements, such as reports and the definitions of data warehouses, data marts and operational data stores. I thought this was an excellent introduction. Also included is a brief piece on cost/benefit and return on investment. It was short and hit all of the key points, but would have fit better in the prior discussion of the business perspective.
The next section addresses the people part of a data warehousing project, beginning with the project sponsor. Answers to some incisive questions are given in this part, such as "how will the data warehouse affect decision-making processes?", "how will it improve financial, marketing and operations processes?" and similar business-focused questions. These draw your attention to the real reasons for data warehousing. This section moves naturally into project management considerations, and exposes some common problems like defining project scope, underestimating time and project overhead, or factoring in the operational support issues that arise after the data warehouse is rolled out and in production. One of the best parts of this section is how the authors counter common problems and risks with advice on how to eliminate or mitigate them. I liked the approach to measuring results, which gives some sound key performance indicators that you can use to baseline some total cost of ownership drivers after the data warehouse is in production.
This section continues with the roles and responsibilities of the project team. The authors have crafted a sound team structure that consists of business and technical representatives who are overseen by a steering committee. This is an excellent approach. I thought the inclusion of users from various business domains was one of the key strengths, because these people know the data's value to the business a lot better than the technical side of the team does. On the other hand, I thought it was naive of the authors to state that this group would be required 80% of the time during the project. While I fully agree with this estimate, it is nearly impossible to achieve in practice. I wish the authors had shared how they sold the business side on making an 80% commitment of their best and brightest.
As this section moves into the actual project, there are some things I loved about their approach: breaking the project into four parallel tracks, and the proposed rollout strategy. These give you a good understanding of the scope and magnitude of a typical data warehouse project.
Section Four covers technology. It gets a little too technical for a business user in some places, but is just right for an IT manager who is not a DBA or data architect. I liked the discussion of metadata, the explanation of why normalization is not appropriate for data warehousing, and the treatment of fact and dimension tables.
The final section discusses maintenance requirements once the data warehouse is in production. This prepares you for the realities of managing these systems. I wish the authors had addressed some of the workload and scheduling issues that come with the territory. Refreshing the warehouse requires a fine balancing act that affects maintenance windows and other production jobs, and it will cause a plethora of production headaches if not planned for in advance.
Overall, this is a good book for the audience I cited above. I strongly recommend that anyone considering a data warehouse also read Improving Data Warehouse and Business Information Quality by Larry P. English.