I’ve recently had the opportunity to write about the Logical Data Warehouse (LDW). In the paper I describe three parallel streams rather than a single development path. The development is cyclic. Like the classic data warehouse before it, the logical data warehouse is never finished, because it provides a digital representation of pretty much everything that is happening in the business. Like the business itself, it is constantly changing. It has to adapt to growth, acquisitions, competition and new product lines; in fact, to anything the business itself is subject to.
The public summary of the paper can be found here: Solution Path for Planning and Implementing the Logical Data Warehouse. To click through from there you’ll need a Gartner for Technical Professionals licence. The summary gives a brief description and an outline of the topics covered.
The paper describes the three main development styles used for analytics. Each is typically associated with the architectural elements that enable it, as illustrated in the picture below. They are:
- Classic data warehouse development
- Agile development: using marts, virtual marts, sandboxes and data virtualization
- Development of the data lake
Much time and effort has been expended on debating which is the ‘best’ approach. However, stepping back, we can now see that these are not alternative development styles but three facets of a bigger whole. Each is appropriate in the right context, and they can be used together to optimize the overall effort and the benefit obtained; that is, to optimize return on investment.
During each cycle of development, one, two or all three of the streams are initiated and the requirements are distributed among them. Collections of requirements are steered towards their most suitable mode of development. Each stream will expand a particular part of the architecture, but new servers and components will not be introduced unnecessarily. At the end of each cycle there is a wrap-up, in which artifacts developed in one stream may be transferred to another if that makes sense.
For example, a new piece of analysis developed quickly and simply in the agile stream may be firmed up for regular production running by transferring it into the data warehouse. Crossing a boundary between streams acts as a tripwire: it reminds us to put in place (or remove) the disciplines that differ between streams. Moving that new piece of agilely developed analysis into production prompts us to put in place the necessary checks on data quality, standardized metadata, restart/recovery routines and so on.
Similarly, at the start of a cycle, we may make copies of artifacts from the data warehouse to allow them to be rapidly modified in the agile stream. We’ll relax many of the checks and formal processes whilst working in the agile stream. Crossing boundaries between streams reminds us of what we need to do by explicitly switching development styles.
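To make the tripwire idea concrete, here is a minimal Python sketch of a transfer between streams. It is an illustration only, not something taken from the paper: the names `Stream`, `Artifact`, `REQUIRED_DISCIPLINES` and `transfer` are all hypothetical, the three warehouse disciplines simply echo the examples given above, and the empty set for the data lake is a placeholder assumption rather than a claim that the lake needs no disciplines.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stream(Enum):
    WAREHOUSE = "classic data warehouse"
    AGILE = "agile"
    LAKE = "data lake"


# Hypothetical mapping of each stream to the disciplines it expects.
# The warehouse entries echo the examples in the text; the LAKE entry
# is a placeholder assumption, not a statement from the paper.
REQUIRED_DISCIPLINES = {
    Stream.WAREHOUSE: {
        "data_quality_checks",
        "standardized_metadata",
        "restart_recovery",
    },
    Stream.AGILE: set(),
    Stream.LAKE: set(),
}


@dataclass
class Artifact:
    name: str
    stream: Stream
    disciplines: set = field(default_factory=set)


def transfer(artifact, target):
    """Move an artifact to another stream.

    Returns the moved artifact plus the disciplines that must be
    put in place and those that can be relaxed at the boundary.
    """
    required = REQUIRED_DISCIPLINES[target]
    to_add = required - artifact.disciplines
    to_remove = artifact.disciplines - required
    moved = Artifact(artifact.name, target, set(required))
    return moved, to_add, to_remove


# Promote an agile analysis into the warehouse at the end of a cycle.
analysis = Artifact("new_analysis", Stream.AGILE)
promoted, to_add, to_remove = transfer(analysis, Stream.WAREHOUSE)
print("put in place:", sorted(to_add))
print("relax:", sorted(to_remove))
```

Calling `transfer(promoted, Stream.AGILE)` at the start of a later cycle would report the same three disciplines under "relax", which mirrors copying a warehouse artifact into the agile stream for rapid modification.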
Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.