• Votes for this article 4 people voted for this
  • Dashboard Insight Newsletter Sign Up

To V or Not To V
Business Intelligence Gets Virtual!

by Claudia Imhoff, President and Founder, www.intelsols.comMonday, December 15, 2008

Example 3: When to use Physical Data Marts

There are some matters to consider when determining whether a dependent physical data mart is appropriate.

  • A change in technology – The data warehouse is best stored in technology that can handle massive amounts of data with fast load and unload times. These are usually relational databases or some form of data warehouse appliance. Some BI technologies may use their own form of proprietary technology or require a technology different from that used for the data warehouse. If this is the case, then you have no choice but to extract a subset of data warehouse data for the application and load it into a separate data mart for
    usage by the BI technology.
  • Performance reasons – Performance of the BI application can be better if a physically distinct data mart is set up for it. For example, heavy data mining analytics can impact the lighter reporting requests if both are performed within the data warehouse environment. It makes sense then to replicate the data mining data into its own environment where performance can be tailored to the data miners’ specific needs.
  • Security reasons – For some organizations, there may be a need to create a separate data mart or warehouse that contains sensitive data or algorithms. Privacy issues, security worries, or in some cases, even compliance regulations mandate a separate and protected environment.

There are disadvantages to building separate physical data marts as well:

  • Redundancy of data – Building a separate physical data mart means more storage, more ETL processing, redundancy of efforts, redundancy of the data, and higher costs. Also, anytime you have to re-extract data from the data warehouse, there is always the chance that the data mart data will get out of sync with the data warehouse data. There should be an established reason or ROI before this process is performed.
  • More IT maintenance and costs – Obviously, creation of a physical data mart means extra work for the BI team. Any time that the data mart design changes (which it certainly will), the data has to be unloaded, the schema changed, and the data reloaded – not a trivial task. In addition, IT must maintain backup/recovery schemes and archiving media for both the data warehouse and physical data marts. Most physical marts require additional hardware and software licenses, which add to the overall costs of the BI environment.
  • Potential inconsistency – Many times, the dependent data mart is created by a line-of-business IT group or by technologically savvy business users themselves. These groups are generally not under the same constraints as the data warehouse implementers are, and therefore, they may not use the same nomenclature, algorithms, or even formatting as was used in the data warehouse. This often causes inconsistency between their analyses and those performed by data warehouse analysts.
  • Bulk movement of data – Because the data mart is physically distinct from the data warehouse, the data must be transferred via the network to the mart. Any time data must be physically moved, there exists the possibility of problems, e.g., the delivery can be delayed or data may be only partially delivered. Secondly, if the data warehouse is updated more than once a day, it may not be possible to reload the physical data marts with each update.

Data Warehouse / Data Mart Archiving and Extension Integration Examples

Example 4: Combine Data Warehouse (or Data Mart) Data with Archived Data Warehouse Data

Many enterprises now archive older, infrequently used snapshots of data warehouse data to other media. Because the data is not contiguous with the data warehouse, it may be difficult to satisfy one-off queries that span the two data forms. The easiest and fastest way to combine these sets of data is to use virtual data federation technology. Rather than physically (re)loading archived data into the data warehouse, the BI team can quickly create a virtual environment with EII technology. When the query completes, the virtual combination can be just as quickly dismantled.

The advantages and drawbacks to this approach are the same as for a virtual data mart (Example 2). Fortunately there is one other advantage for this virtual scenario: usually very few users require the ability to span long timeframes of data warehouse data so their queries will usually not affect the performance of other users.

Example 5: Combine Data from Multiple Data Warehouses

Another scenario all too common in enterprises today is the existence of multiple data warehouses. The reasons for this are varied – mergers and acquisitions, political or cultural constraints, and security or privacy concerns to name just a few. Inevitably, a request will come up that requires data from these warehouses. Without having to overhaul the entire BI environment, such requests can often be satisfied by using virtual data federation techniques to create a virtual connection between the physical stores.

The advantages are similar to those for a virtual data mart – speed, low cost, no movement of data, and ease of maintenance. However, there is one significant concern to resolve – it is doubtful that the various data warehouses were built using an overarching standard data model. Therefore, they most likely will not have used consistent formats, names, data quality activities, calculations, etc., during their data consolidation processes. If a virtual connection is to be made between these repositories, then these disparities must
be abstracted or easily transformed.

Operational Data Store Integration Examples

The operational data store (ODS) is a sister to the data warehouse, similar in some aspects, but also similar to operational systems in others. Like the data warehouse, the ODS contains integrated, cleansed data for usage in reporting and analytics. But there the similarities end. The ODS contains current – or as current as the technology will allow – data that is updated rather than stored as historical snapshots. Little history is stored in the ODS which, along with the update status of its data, makes it similar to operational systems.

The ODS is often used to create management reports on operations or to perform analyses on current events like campaigns or other sales events, channel activities, inventory statuses, daily sales, etc.

Tweet article    Stumble article    Digg article    Buzz article    Delicious bookmark      Dashboard Insight RSS Feed
 
Previous Page  Next Page
1 2 3 4 5
Other articles by this author

Discussion:

No comments have been posted yet.

Site Map | Contribute | Privacy Policy | Contact Us | Dashboard Insight © 2013