22

Learn the discipline, pursue the art, and contribute ideas at
www.architecturejournal.net
                                                                        input for better outcomes




Taking Business Intelligence
Beyond the Business Analyst




Thinking Global BI: Data-Warehouse Principles for Supporting Enterprise-Enabled Business-Intelligence Applications
BI-to-LOB Integration: Closing the Cycle
Increasing Productivity by Empowering Business Users with Self-Serve BI
Business Insight = Business Infrastructure = Business-Intelligence Platform
Semantic Enterprise Optimizer and Coexistence of Data Models
Lightweight SOAs: Exploring Patterns and Principles of a New Generation of SOA Solutions
Contents                                                 21




Foreword                                                                                             1
by Diego Dagum

Thinking Global BI: Data-Warehouse Principles for
Supporting Enterprise-Enabled Business-Intelligence Applications                                     2
by Charles Fichter
Design principles to support a global data-warehouse (DW) architecture.

BI-to-LOB Integration: Closing the Cycle                                                             8
by Razvan Grigoroiu
A discussion of the necessity of architecting BI solutions that focus on actionable data and
information flow back to LOB applications.

Increasing Productivity by Empowering
Business Users with Self-Serve BI                                                                    13
by Ken Withee
A look at the latest wave of Microsoft tools and technologies that enable end-user self-serve BI.

Business Insight = Business Infrastructure =
Business-Intelligence Platform                                                                       20
by Dinesh Kumar
A prescriptive approach to planning and delivering BI services through the business-infrastructure
and business-capability models.

Semantic Enterprise Optimizer and Coexistence of Data Models                                         27
by P.A. Sundararajan, Anupama Nithyanand, and S.V. Subrahmanya
Proposal of a semantic ontology–driven enterprise data–model architecture for interoperability,
integration, and adaptability for evolution.

Lightweight SOAs: Exploring Patterns and Principles
of a New Generation of SOA Solutions                                                                 32
by Jesus Rodriguez and Don Demsak
An exploration of how to address common challenges of traditional SOAs by using more scalable,
interoperable, and agile alternatives.




Sign up for your free subscription to The Architecture Journal www.architecturejournal.net
Foreword

Dear Architect,

In this, the 22nd issue of The Architecture Journal, you’ll find our coverage of the business-intelligence (BI) aspects of which architects like you and me need to be aware today.
    As we did in previous issues, so as to guarantee accuracy, relevance, and quality, we set up a board of subject-matter experts to analyze the problem space and harvest today’s main topic concerns.
    The articles that you are about to read were chosen based on those concerns. Let’s take a look at them:

• Enterprise BI strategy: Dinesh Kumar (Microsoft) introduces the notion of business infrastructure, which, together with the capability models that are described in previous issues of The Architecture Journal, helps organizations not only gain business insight, but act upon it, too.
• Also, Sundararajan PA et al. (Infosys) propose a semantic enterprise data model for interoperability that is adaptable for evolution through its life cycle.
• Embedding business insights into our applications: Traditionally, the final output of BI is considered to be scorecards and reports that are used as strategic decision support. Using implementation examples, Razvan Grigoroiu (Epicor) tells us how to put these outcomes within the reach of line-of-business (LOB) applications.
• Infrastructure and performance: Charles Fichter (Microsoft) explains the design principles for supporting a global data-warehouse architecture, with effectiveness and performance in mind.
• End-user and self-service BI: BI projects typically fall short in allowing users who have basic experience to handle how results are exposed, without any dependence on IT specialists. Ken Withee (Hitachi Consulting) shows multiple ways to tackle this issue by using facilities that are already available throughout the Microsoft platform.

As a side topic, outside the cover theme, this issue features an article by MVP Jesus Rodriguez (Tellago) on lightweight SOA implementations, and their patterns and principles.
    The reader will also find more valuable BI input in side columns, as well as our second companion series of short videos, which are available at http://msdn.microsoft.com/en-us/architecture/bb380180.aspx.
    I’d like to finish by thanking the team of subject-matter experts who helped me complete this challenge. First, I want to thank guest editor Matt Valentine for giving me direction in the makeup of this issue. Also, for its guidance and suggestions to give final shape to Matt’s ideas, I’d like to thank the editorial board that we put together this time. (You’ll find their names in this issue’s staff listing, which follows.)
    Enjoy the issue! Remember that you may send any feedback to archjrnl@microsoft.com.

Diego Dagum
Editor-in-Chief

Founder
Arvindra Sehmi

Director
Lucinda Rowley

Editor-in-Chief
Diego Dagum, Matt Valentine (Guest editor)

Editorial Board
Martin Sykes, Gustavo Gattass Ayub, Gunther Lenz, Reeza Ali, Lou Carbone, Alejandro Miguel, Charles Fichter, Danny Tambs, Bob Pfeiff, Jay Gore, Bruno Aziza, Roger Toren, Alejandro Pacheco

Editorial and Production Services
WASSER Studios
Dionne Malatesta, Program Manager
Ismael Marrero, Editor
Dennis Thompson, Design Director

The information contained in The Architecture Journal (“Journal”) is for information purposes only. The material in the Journal does not constitute the opinion of Microsoft Corporation (“Microsoft”) or Microsoft’s advice and you should not rely on any material in this Journal without seeking independent advice. Microsoft does not make any warranty or representation as to the accuracy or fitness for purpose of any material in this Journal and in no event does Microsoft accept liability of any description, including liability for negligence (except for personal injury or death), for any damages or losses (including, without limitation, loss of business, revenue, profits, or consequential loss) whatsoever resulting from use of this Journal. The Journal may contain technical inaccuracies and typographical errors. The Journal may be updated from time to time and may at times be out of date. Microsoft accepts no responsibility for keeping the information in this Journal up to date or liability for any failure to do so. This Journal contains material submitted and created by third parties. To the maximum extent permitted by applicable law, Microsoft excludes all liability for any illegality arising from or error, omission or inaccuracy in this Journal and Microsoft takes no responsibility for such third party material.

A list of Microsoft Corporation trademarks can be found at http://www.microsoft.com/library/toolbar/3.0/trademarks/en-us.mspx. Other trademarks or trade names mentioned herein are the property of their respective owners.

All copyright and other intellectual property rights in the material contained in the Journal belong, or are licensed to, Microsoft Corporation. You may not copy, reproduce, transmit, store, adapt or modify the layout or content of this Journal without the prior written consent of Microsoft Corporation and the individual authors.

Copyright © 2009 Microsoft Corporation. All rights reserved.


                                                                                                                                                               1
                                                                            The Architecture Journal #22
Thinking Global BI:
Data-Warehouse Principles for
Supporting Enterprise-Enabled
Business-Intelligence Applications
by Charles Fichter

Summary
This article focuses on the design principles to support a global data-warehouse (DW) architecture, the golden triumph of any successful business-intelligence (BI) application story. It draws from the Microsoft Global independent software-vendor (ISV) partner experience in designing enterprise BI applications by using Microsoft platform technologies and contains external links and references to public content that delves deeper into the design topics that are covered.
This article assumes that the reader has some basic DW understanding of a dimensional store, the underlying fact tables in which columns are known as measures, dimension tables in which columns are known as attributes, and how schemas take on star and snowflake patterns. There are many available resources to provide this overview; however, if needed, a concise overview can be found here: http://www.simple-talk.com/sql/learn-sql-server/sql-server-data-warehouse-cribsheet.
This article also focuses on successful DW project strategies and advanced topics of effective design for performance.

Introduction
Architects who are looking to deliver enterprise BI solutions are often enticed by packaged software applications that are able to fulfill executive requests for effective reports that reveal deep analysis of business performance. Microsoft and its vast array of ISV partners have made significant inroads into fulfilling a vision for easing the burden of generating the BI dashboards that all executives dream of, providing them with up-to-the-minute results of their business strategies and the ability to drill down into specific areas. Too often left unsaid, however, is the larger 90 percent effort that is left to the supporting architect and IT force behind the glamorous UI: how to get the data; scrub and aggregate it effectively; and design appropriate, manageable, and performant dimensional stores, including ad-hoc query support, remote geography replication, and even data marts for mobile decision-maker support with ever-increasing volumes of dimensional data.

Understanding Isolated Enterprise Data, and Accessing It
Enterprise architects who are looking to aggregate application data stores into meaningful Multidimensional Online Analytical Processing (MOLAP) dimensional models are often faced with many internal obstacles to accessing source data. These obstacles are often less technical than business-, legal-, audit-, or security-related; they can also stem from restrictive overhead, project process, or even politics, as business data can represent “turf” among executives and divisions. Some of the obstacles are technology constraints such as noncompatible or proprietary solutions, legacy file formats, and nonrelational or unstructured data. But as vendor tools (especially enhancements in Microsoft SQL Server 2008, particularly with Microsoft SQL Server Integration Services [SSIS] capabilities) and service-oriented architecture (SOA) technologies advance (for example, adoption of WS-* and other open connectivity standards), this is becoming far less of an issue.
    However, many BI projects are stalled and/or eventually killed because of a failure by the team to understand accurately what data was required, and how to access it successfully and make it usable. Usability is a key concept. How do you take a dozen columns (with names such as “xtssalescongproc”) and consolidate them in a central fact table that has readable column names, so that end users can leverage self-service BI technologies in the future?
    The following are a few general tips to help avoid the pitfalls of navigating access to isolated data:

1. Establish strong executive sponsorship early. The success of your project will be determined by how deeply and broadly across enterprise stores you have executive mandate. Asking for access is merely 1 percent of the effort. You might be imposing significant time and costs across divisions, potentially affecting their service to customers, for them to grant the analysis, access, and/or aggregation that you are asking of them. In addition, does that division truly understand its own data? How much time are you asking its people to spend analyzing and assessing what data and capacity they even have to provide for you? Do not underestimate the importance of executive sponsorship or the potential cost of time and resources that you might be asking across other high-value data stores and the people who manage them.
2. Does anyone truly know the data? This might seem like an obvious question; but, as we have done more of these efforts, it never ceases to surprise how little enterprise customers often know about their own data. Many ambitious BI projects are halted quickly, with a realization that first a project team must perform a full analysis of all enterprise data stores, which can often take





   months. Simply looking at tables can befuddle architects who are new to the data source, as column names do not inherently describe the data that they contain. Often, applications and stores have merely been maintained and not enhanced over the years, and the intent and design of data stores is tribal knowledge that was lost long ago. Can you effectively look at a 500-plus-table database and understand every relationship without understanding the minutiae of every application that utilizes the store? Using advanced vendor tools and ample time, perhaps. The devil is in the details, and the strength of your dimensions and attributes later depends on your understanding of the raw data sources that are at the base of your aggregation.
3. Understand local priorities, and consolidate. The highest BI reporting demands are often local/regional in nature (by country/trading domain), which raises the question: Do you truly need a giant, aggregated DW store immediately? Or can you more effectively build a distributed warehouse and BI strategy, focusing on local/regional needs along with corporate? This is explored in more depth in the next section, as there might be significant cost savings, less network saturation, and performance benefits to a decentralized approach.

Consider Approaching Data Warehousing in Phases
Many BI and DW projects begin with ambitions to design and build the world’s greatest aggregated, dimensional store ever, with the intention of replicating subsets of the dimensional stores to geographies (the notion of data marts). However, is this approach always necessary or even feasible? Nearly every DW and BI project underestimates the investment of time and resources that it takes to design the aggregation and to scrub, build, and/or replicate data to independent MOLAP stores to support global enterprise needs.
    The maturity and skill of each enterprise to deliver BI solutions can vary greatly, and breaking up the effort into smaller, agile-based deliveries can help your teams gain the experience and expertise that are needed to understand how to deliver against the larger, longer-term objectives. Remember the 80/20 rule: Most of the time, you can deliver 80 percent of the required features from 20 percent of the effort. Consider a simplified approach in phases, as shown in Figure 1.




Figure 1: Proposed global BI/DW phases




[Figure 1 diagram: three phases, each rated on complexity, BI depth/capabilities, cost, and delivery time.]
Phase 1. Dynamic, without published MOLAP store: “Show me what is happening in my business now.”
Phase 2. Local geography, published MOLAP store: “Show me historical performance in my business.”
Phase 3. Global DW (traditional, DW-centric model): “Show me trends and predictive future.”
Diagram labels: application server (such as Office SharePoint with SSRS & SSAS); OLTP database with an analysis and/or reporting engine (can be physically local, central, or a combination); application & MOLAP DW server in the local geography (such as Office SharePoint with SSRS & SSAS); aggregate extract pulled to build MOLAP dimensional store; further summary aggregation posted to serve global enterprise, centralized DW (possible future phases 2 & 3); local application & reporting server; data marts published to geographies.
* For visual simplification, OLTP represents all disparate application data sources.





Phase 1
If BI reporting needs are infrequent enough, leave as much source data in place as possible, and build multidimensional views through a distributed caching strategy to deliver a subset of business solutions quickly. Can you merely query efficiently by using advanced vendor tools against existing application stores, even across competing platforms? This is an important decision point that you will face early on. This approach can address BI concerns such as, “Tell me what is happening in my business now.” This strategy can be most effective with noncompatible vendor technologies, or when strong divisional boundaries exist between data stores. It is far more efficient (in terms of person-hour commitment from your team), and you will be able to deliver solutions more quickly to the business units.
    While this might result in longer wait times for users, using tools such as SSIS can help you build packages to retrieve, aggregate, and clean large quantities of data (even from non-Microsoft technology stores), build the dimensional relationships in memory cache, and present them to requesting applications via Microsoft SQL Server Analysis Services (SSAS). In the past, this could be practical only for relatively small quantities of data; however, database vendor optimizations are making this scenario a real choice.

Phase 2
Whenever possible, build and keep the dimensional stores locally. With large volumes of aggregated data, companies can begin more intense mining to address BI concerns such as, “Show me historical performance.” To begin here, it is essential that you critically analyze how much aggregate data replication to a centralized store truly is needed. Can you more effectively populate local MOLAP stores to represent regional business-analysis needs? When you examine regional needs versus larger corporate needs, you will likely find that more of the deeper, drilldown BI reporting needs are local/regional in nature. Enterprise-wide BI queries tend to be broader, trend-based analyses that can be supported by summary aggregations from smaller dimensional stores. Inexpensive MOLAP stores can be effectively maintained by local resources and greatly reduce the complexity of a central warehouse design, as well as mitigate potentially large network congestion and replication needs.
    Consider beginning your efforts in a smaller, less-strategic division of the company or geography to test your team’s ability. This design approach is almost an inverse of the traditional DW and downstream data-mart approach: Instead, the smaller, regionalized MOLAP stores become aggregation feeds to a larger, further aggregated, summary DW store for broader trending analysis. Although business trends are pushing highly globalized patterns, the need for in-depth regional mining is increasing, too; and relying solely on a centralized DW pattern can require a massive investment in both physical and people resources to maintain and might prove overly costly, fraught with performance challenges, and overly cumbersome to enhance or change.




Data-Integration Strategy
by Derek E. Wilson

Today, enterprises collect and store data more easily than ever in a variety of applications. Many of these applications are inside the firewall, while some are in the cloud and others in remote locations. All too often, application data is not collected and used across an organization for consistent analysis and business decisions. The ability to collect and analyze this data can benefit your company, if it is treated as an organizational asset.
    To create an enterprise business-intelligence (BI) architecture, you must identify the core subject areas of your business, such as the following:

    Customer
    Product
    Employee
    Inventory

When these have been identified, you must further define what attributes should be collected for an enterprise view. The fields that you define will be the backbone of your enterprise BI platform. For instance, if a customer relationship management (CRM) system will allow you to capture data that shows that a customer has three children, must this piece of information be migrated to the enterprise BI system, to let everyone in the organization leverage it to make better business decisions? Knowing what attributes you must store for the collective good of the organization will enable you to begin data integration.
    By leveraging Microsoft SQL Server and Microsoft BizTalk Server, an enterprise BI and data-integration strategy can be developed to integrate and store critical subject areas. The database structure should be abstracted by using an enterprise integration pattern that is known as the canonical data model. This model requires all incoming data to meet a user-defined pattern. For instance, an enterprise BI system might require the following fields:

    First Name, Last Name
    Address
    City, State, ZIP Code
    Phone Number
    E-mail

Source applications likely store other information, such as mobile number, gender, and age.
    BizTalk Server can be leveraged to receive messages from various applications and write the information to the appropriate database tables.
    When the data has been collected, it can be stored in an online analytical processing (OLAP) cube and then presented to business users for decision-making. The process of loading and adding calculations to a cube allows everyone in the business to leverage the work that is done to create value from the data. As users access the cube, they get consistent answers to queries; and, when new calculations are requested, everyone benefits from the additions.
    By identifying, integrating, and creating central OLAP stores, an organization can leverage data as an asset across the company.

Derek E. Wilson is a BI Architect and Manager in Nashville, TN. Visit his Web site at www.derekewilson.com.
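
The canonical data model that the sidebar describes can be sketched in a few lines of Python. In the sidebar's scenario this translation would be handled by BizTalk Server; the code below uses no BizTalk API and is only a plain illustration of the idea. The canonical field names follow the sidebar's list, while the CRM field names, the CRM_FIELD_MAP mapping, and the to_canonical function are assumptions made up for this example.

```python
from typing import Any, Dict

# Canonical customer fields, following the sidebar's example list.
CANONICAL_FIELDS = (
    "first_name", "last_name", "address", "city",
    "state", "zip_code", "phone_number", "email",
)

# Hypothetical mapping for one source application:
# source field name -> canonical field name.
CRM_FIELD_MAP: Dict[str, str] = {
    "FName": "first_name",
    "LName": "last_name",
    "Street": "address",
    "City": "city",
    "St": "state",
    "PostalCode": "zip_code",
    "Phone": "phone_number",
    "EmailAddr": "email",
}


def to_canonical(record: Dict[str, Any], field_map: Dict[str, str]) -> Dict[str, Any]:
    """Translate one source record into the canonical customer shape."""
    canonical: Dict[str, Any] = {}
    for source_field, value in record.items():
        target = field_map.get(source_field)
        if target is not None:
            canonical[target] = value
        # Unmapped extras (mobile number, gender, age) are simply dropped.
    missing = [name for name in CANONICAL_FIELDS if name not in canonical]
    if missing:
        raise ValueError(f"record is missing canonical fields: {missing}")
    return canonical


# A made-up CRM record that carries extra, non-canonical fields.
crm_record = {
    "FName": "Ada", "LName": "Lovelace", "Street": "12 Main St",
    "City": "Nashville", "St": "TN", "PostalCode": "37201",
    "Phone": "555-0100", "EmailAddr": "ada@example.com",
    "Mobile": "555-0199", "Gender": "F", "Age": 36,
}

customer = to_canonical(crm_record, CRM_FIELD_MAP)
print(customer)
```

Rejecting records that lack a canonical field is one possible policy; a real integration might instead route incomplete messages to a review queue.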




Figure 2: SSAS Designer—Dimensional modeling in BIDS




Phase 3                                                                      The power of the tools that are available to you at design time
Build and populate a traditional, independent, centralized DW of             can greatly affect the strength of your models, assist visually with
your dreams to reach all of the more ambitious BI needs of your              overcoming the complexity of the relationships, and reveal potential
company. This approach will address the harder BI concerns such as           bottlenecks, poor query structure, and ineffective mining semantics.
the ever-elusive BI goldmine, “Predict future results,” which can be         Through the use of the SSAS designer within Business Intelligence
accomplished only by analysis of trends across often voluminous,             Design Studio (BIDS), the architect is given a comprehensive set of
company-wide historical data.                                                tools for designing and optimizing dimensional stores and queries
    While historical trending and data mining can be performed               against those stores (see Figure 2).
across geographies (read, utilizing or aggregating further from                  Listed here are a few key DW principles to remember when you
Phase 2 [or even Phase 1] repositories), to get the raw reporting            are designing your dimensional models to maximize performance
and drilldown-supported dashboard experience against very large,             later (more comprehensive articles on this subject and advanced
corporate-wide historical data, a centralized DW implementation              SSAS design can be found at http://technet.microsoft.com
most likely will be the most effective choice. However, many successful      /en-us/magazine/2008.04.dwperformance.aspx and at
BI projects will likely find a blend between the Phase 2 and Phase 3         http://www.ssas-info.com/analysis-services-papers/1216-sql-server
approaches.                                                                  -2008-white-paper-analysis-services-performance-guide):

Designing Effective, Performant,                                             1. The overwhelming majority of MOLAP data will grow in your
Maintainable Dimensional Storage                                                fact tables. Constrain the number of measures in your fact tables,
As data warehousing has evolved, what once was a static strategy                as query processing is most effective against narrow-columned
of replicating large, read-only stores for reporting has become a far           tables. Expand the depth of attributes in the supporting dimension
more dynamic environment in which users are given expansive powers              tables. The benefit of breaking dimension tables into further
such as building their own ad-hoc queries, self-service reporting               subdimension tables, when possible (snowflake pattern), is hotly
(using tools such as PowerPivot, previously codenamed “Gemini” and              debated, although this approach generally gives more flexibility
an extension of Office Excel 2010 that will be available in the first half      when one considers scale-out models and utilizing indexing and
of 2010 and enables users to pull down massive dimensional data to              performance-enhancing technologies such as partitioning.
the tune of 100 plus–million rows for real-time, cached pivoting), and       2. Implement surrogate keys for maintaining the key relationship
even write-back capabilities directly into the dimensional stores.              between fact and dimension tables, instead of enforcing



   foreign-key constraints, which can often manifest as compound           5. Implement a clustered index for the most common fact
   keys that cover several columns. A surrogate key is an integer-            table surrogate keys, and nonclustered indexes upon each of
   typed identity column that serves as an artificial primary key of the      the remaining surrogate keys. As per the previous item #4, the
   dimension table. This approach can minimize storage requirements           addition of ad-hoc query support can greatly affect your indexing
   and save storage/processing overhead for maintaining indexes.              strategy; however, for overall common-usage patterns, this
3. While OLTP stores traditionally are highly normalized, this                indexing strategy has proven efficient.
   is far less important for dimensional stores. Denormalized data         6. Use partitioned-table parallelism. Because of the growth of
   has certain advantages in extremely large stores—namely,                   fact tables over time, most DW architects implement a partition
   the reduction of joins that are required in queries. In addition,          strategy (breaking up the fact table over physical storage devices).
   database products such as SQL Server 2008 utilize a highly                 This is most commonly performed by using a date column, but
   optimized bitmap filtering (also known as “bloom filter”) that             it can be performed as a range that is based on usage patterns
   eliminates largely redundant and irrelevant data from query                (supported by SQL Server 2008). SQL Server 2008 implements
   processing during star joins.                                              a new partitioned-table parallelism (PTP) feature that highly
4. Prototype, prototype, prototype. Your physical design might                optimizes queries over partitioned tables by executing queries in
   work well for initial static-reporting needs; however, as users            parallel across all available multicore processors. For more detailed
   become accustomed to the new data that is available to them,               information on PTP and other new DW enhancements in SQL
   and as their expertise grows, you will need to support ad-hoc              Server 2008, visit http://technet.microsoft.com/en-us/library
   querying in a substantial way. Ad-hoc queries have the capability          /cc278097.aspx.
   to create explosive data sets, depending upon the structure of          7. Implement a compression strategy. Over time, data warehouses
   your historical modeling within your dimensions. Most database             can become huge. Overhead for compression can often be
   products support test/modeling partitioning. Spend ample time              overcome by the reduction in redundant data—thereby, reducing
   understanding the overall impact to your environment (including            query processing time, as well as providing maintenance benefits
   indexing/maintenance) when ad-hoc support is considered.                   for size of data storage. However, the general strategy remains to
   Implementing self-service BI products such as PowerPivot, instead          implement compression on the least-used data. Compression can
   of open ad-hoc query support, can greatly ease demands on the              be applied in many different ways, including at partition, page,
   server by delivering large chunks of dimensional data directly             row, and others. As per item #4, the most efficient pattern will
   down to the client for ease of manipulation by the user for                require extensive prototyping in your model.
   drilldown.
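
As a minimal sketch of principles 1 and 2 above, the following example builds a narrow fact table that is joined to a dimension table through an integer surrogate key. It assumes Python's built-in sqlite3 module purely for portability (the article itself targets SQL Server 2008), and every table, column, and function name is hypothetical. The compound natural key (sku, region_code) stays in the dimension table, so the fact table carries only one integer per dimension.

```python
import sqlite3


def build_star_schema(conn):
    """Create one dimension table and one narrow fact table (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY AUTOINCREMENT, -- surrogate key
            sku          TEXT NOT NULL,                     -- natural key, part 1
            region_code  TEXT NOT NULL,                     -- natural key, part 2
            product_name TEXT,
            category     TEXT,
            UNIQUE (sku, region_code)
        );
        CREATE TABLE fact_sales (
            date_key    INTEGER NOT NULL,
            product_key INTEGER NOT NULL,  -- refers to dim_product by surrogate key
            units_sold  INTEGER NOT NULL,
            revenue     REAL NOT NULL
        );
        CREATE INDEX ix_fact_sales_product ON fact_sales (product_key);
    """)


def product_key_for(conn, sku, region_code, name, category):
    """Return the surrogate key for a product, adding the dimension row if needed."""
    row = conn.execute(
        "SELECT product_key FROM dim_product WHERE sku = ? AND region_code = ?",
        (sku, region_code),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_product (sku, region_code, product_name, category) "
        "VALUES (?, ?, ?, ?)",
        (sku, region_code, name, category),
    )
    return cur.lastrowid


def load_sale(conn, date_key, sku, region_code, name, category, units, revenue):
    """Insert one fact row that stores only the surrogate key and the measures."""
    key = product_key_for(conn, sku, region_code, name, category)
    conn.execute(
        "INSERT INTO fact_sales (date_key, product_key, units_sold, revenue) "
        "VALUES (?, ?, ?, ?)",
        (date_key, key, units, revenue),
    )


def revenue_by_category(conn):
    """Star join: aggregate the fact measures by a dimension attribute."""
    return conn.execute(
        "SELECT d.category, SUM(f.revenue) "
        "FROM fact_sales AS f JOIN dim_product AS d ON d.product_key = f.product_key "
        "GROUP BY d.category ORDER BY d.category"
    ).fetchall()
```

The lookup in product_key_for is where a real load process would resolve natural keys to surrogate keys; how that step is implemented in a production ETL tool is outside the scope of this sketch.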



Figure 3: SSIS data-flow representation in BIDS





Consolidation, Aggregation, and Population of the Stores                     Fortunately, data replication has become a seamless art, thanks to
Fortunately, when you have determined what data to access and                improvements within database vendors. Management of replication
designed your data-warehouse topology, aggregation and population            partnerships, rollback on failures, and rules on data-consistency
is becoming a more simplified task, thanks to advancements in                violations all can be handled effectively within the replication-
vendor tools. Using tools such as SSIS within BIDS (which ships with         management console. Together with significant performance
SQL Server Enterprise edition), an architect can build a “package” that      enhancements of both the replication engine and physical data size,
can fetch, manipulate/clean, and publish data. SSIS comes with data          global enterprises can rely upon the SQL Server 2008 toolset to meet
connectors that enable the architect to design packages to access all        even the most demanding data-warehouse and downstream data-
the major database platforms (including Oracle, IBM, Teradata, and           mart strategies.
others) and a comprehensive set of data-manipulation routines that               But what about those power users who demand offline BI analysis
are conveniently arranged in a visual toolbox for easy drag-and-drop         in a mobile fashion? Using the Microsoft platform, you can deliver
operation in a visual data-flow model (see Figure 3 on page 6).              powerful BI even in a disconnected paradigm by utilizing SQL Server
    SSIS allows for highly sophisticated packages, including the ability     CE or Express Edition on the client. Microsoft has worked with dozens
to loop through result sets; the ability to compare data from multiple       of Global ISVs that have designed application suites that utilize large
sources; nested routines to ensure specific order or compensation/           dimensional cube data on the client in a disconnected mode. You
retry rules when failure occurs during the data-collection exercise          can establish replication strategies within the client-side database
(for instance, the remote server in Central America is down for              for when the mobile user is connected, or application designers
maintenance); and sophisticated data transformation.                         can implement Synchronization Services for ADO.NET and manage
    While SSIS can be compared to most traditional extract, transform,       data replication that is specific to the workflow needs within the
and load (ETL)–type tools, it offers a far richer set of features to         application. For more information, visit http://msdn.microsoft.com
complement the dynamic MOLAP environment of a DW that is                     /en-us/sync/default.aspx.
implemented with SQL Server 2008 and Analysis Services (for instance,
sophisticated SSIS packages can be embedded within the DW and                Conclusion
called dynamically by stored procedures to perform routines that are         Enabling BI solutions for your enterprise first requires a significant
dictated by conditional data that is passed from within the Analysis         investment into an advanced understanding of the corporate data
Services engine). Many Microsoft ISVs have even built sophisticated          within your access. For querying extremely large data volumes that
SSIS package–execution strategies that are being called from within          likely include historical references (data over time), dimensional
Windows Workflow Foundation (WF) activities. This gives the added            storage models such as data-warehouse design patterns (including
possibility of managing highly state-dependent (or long-running)             MOLAP and others) have proven the most efficient strategy. For
types of data-aggregation scenarios.                                         more detailed guidance in implementing your DW and BI solutions
    For a more detailed overview of SSIS, visit                              by utilizing SQL Server 2008, visit http://www.microsoft.com
http://download.microsoft.com/download/a/c/d/acd8e043-d69b                   /sqlserver/2008/en/us/white-papers.aspx. In addition, for real-world
-4f09-bc9e-4168b65aaa71/ssis2008Intro.doc.                                   scale-out experiences, visit the best-practices work that has been
                                                                             done by the SQL CAT team at http://sqlcat.com/.
Highly Distributed Data, Global Geography, Mobile-User
Data Marts, Ad Hoc Query, and Data-Synchronization
Considerations                                                               About the Author
Perhaps the most daunting challenge for enterprise architects who are        Charles Fichter (cfichter@microsoft.com) is a Senior Solution
designing a DW solution that supports BI applications is to understand       Architect within the Developer Evangelism, Global ISV (Independent
the potential impact of a design upon the physical network assets. Will      Software Vendor) team at Microsoft Corporation. A 20-year veteran
the environment be able to sustain replication of massive amounts of         software and database architect, Charles specializes in business
dimensional data on ever-increasing historical data?                         intelligence, Microsoft SQL Server, and Microsoft BizTalk Server, as
    A key to success is utilizing effective read-only snapshot replication   well as general .NET middle- and data-tier application and database
of predetermined aggregated subsets. This means a careful analysis of        design. For the past four and a half years, Charles has focused on
the needs of each independent geography and the use of advanced              assisting Global ISVs with their application-design strategies. Other
features within database products, such as extensive data compression        recent publications by Charles include the following:
and “sparse” attributes on columns. SPARSE is a new column attribute
that is available in SQL Server 2008 and eliminates the physical storage     •   “Business Intelligence, Data Warehousing and the ISV”
used by null values. Because data warehouses are typically full of enormous       (white paper)
amounts of null field values (not every shoe from every store in every       •   “Microsoft Manufacturing Toolkit” (video)
region is purchased every day), the compression and removal of the           •   “Emery Forwarding: Freight Forwarder Realizes a 102 Percent ROI
physical size of null value fields is essential. Many traditional data-           with BizTalk Server” (case study)
warehouse products do not have this capability, and SQL Server
2008 has many performance advantages for the replication of large
volumes of dimensional data.
    Another effective strategy is to grant write-back and ad-hoc              Follow up on this topic
query capabilities judiciously. These features by necessity create               •   SQL Server Integration Services (SSIS):
larger overhead in design to support downstream geographies                          http://msdn.microsoft.com/en-us/library/ms141026.aspx
and can greatly increase replication requirements and the possible               •   SQL Server Analysis Services (SSAS): http://www.microsoft.com
reconstitution of the aggregated dimensional stores (a very expensive                /sqlserver/2008/en/us/analysis-services.aspx
operation, as data volumes increase in size).                                    •   PowerPivot: http://www.powerpivot.com/
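
The fetch, clean, and publish flow of an ETL package, together with the retry and compensation rules that the consolidation section describes for a source that is down, can be sketched as follows. This is plain Python and not the SSIS object model; SourceUnavailable, fetch_with_retry, clean, and run_package are hypothetical names that were chosen for illustration, and the row layout (store, units) is an assumption.

```python
import time


class SourceUnavailable(Exception):
    """Raised by a fetch callable when its remote source cannot be reached."""


def fetch_with_retry(fetch, retries=3, delay_seconds=0.0, sleep=time.sleep):
    """Call fetch() up to `retries` times before giving up on the source."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return list(fetch())
        except SourceUnavailable as exc:
            last_error = exc
            if attempt < retries:
                sleep(delay_seconds)  # wait before the next attempt
    raise last_error


def clean(rows):
    """Drop incomplete rows and normalize the remaining values."""
    cleaned = []
    for row in rows:
        store = row.get("store", "").strip()
        units = row.get("units")
        if not store or units is None:
            continue  # skip rows that lack a store name or a measure
        cleaned.append({"store": store.upper(), "units": int(units)})
    return cleaned


def run_package(sources, publish, retries=3):
    """Fetch, clean, and publish each source; return the names that failed."""
    failed = []
    for name, fetch in sources.items():
        try:
            rows = fetch_with_retry(fetch, retries=retries)
        except SourceUnavailable:
            failed.append(name)  # compensation rule: record it and continue
            continue
        publish(name, clean(rows))
    return failed
```

Returning the failed source names, instead of aborting the whole run, lets the remaining geographies load on schedule while the unavailable one is retried later.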


BI-to-LOB Integration:
Closing the Cycle
by Razvan Grigoroiu




Summary
                                                                            BI solutions must be designed with actionable information in mind
The business value of business intelligence (BI) will be                    and must offer a better integration with LOB applications that will
attained only when it leads to actions that result in                       execute the action.
increased revenue or reduced cost. This article discusses
                                                                            Agility Through Actionable Information
the necessity of architecting BI solutions that focus on                    A classical BI solution would consist of four main logical layers: ETL,
actionable data and information flow back to line-of-                       a data warehouse, OLAP cubes, and presentation (that is, analysis
business (LOB) applications.                                                tools).
                                                                                Data flows through these four layers in a way that is similar to the
Introduction                                                                following: A set of ETL jobs run periodically to gather information
While the final output of business intelligence (BI) is traditionally       from LOB data sources such as ERP systems and other applications
considered to be scorecards and reports that are used as strategic          that service a business need or their underlying transactional
decision support, we will see that this is no longer enough. BI can take    databases.
a greater role in ensuring that the value of information is harnessed to        Data is transformed according to the need of the particular
its potential.                                                              BI implementation and loaded into a data warehouse (that is,
    The value of information is quantified by the action that is taken      a database that is modeled in a denormalized star schema that is
following analysis and its result. In order to maximize the value,          optimized for a decision-support perspective).
BI applications must streamline the decision and action processes.              From there, it is then stored in an analysis-friendly
    The business value of BI information is attained only when it results   multidimensional structure such as OLAP cubes.
in a business operation that has the effect of increasing revenue or            On the presentation layer, Microsoft Office Excel is one of the
reducing cost.                                                              more popular analysis tools that is used today. It offers a well-known
    A BI report might indicate increased demand for a particular            user interface that can be utilized by users who occupy different roles
product; but this information has value only if it leads to placed          in the enterprise—from executives and business analysts to buyers
purchase orders and distributions to stores, so that sales figures go up.   and other operational staff.
    Such transactional operations are executed and managed with                 Using Office Excel 2007 and later releases, users can browse their
the help of line-of-business (LOB) applications, such as Enterprise         data that is available in OLAP cubes by using pivot tables to get an
Resource Planning (ERP) systems. BI solutions that implement a better       overview on how the business is performing.
integration to LOB systems will ultimately provide the enterprise with          Potential problems or opportunities can be highlighted by using
a better return on investment (ROI).                                        color schemes, and the user can drill down to analyze the particular
    Recent advances in hardware and software technologies, such as          information and conclude that an action must be taken.
Microsoft SQL Server Integration Services (SSIS), allow up-to-date and          Such an action is usually an operation on the LOB application from
quality data to be summarized and made available to the information         which the transactional data was loaded. Users might switch then to
user in near-real time.                                                     the LOB application and perform the required business operation.
    The times when it took many person-days of IT staff effort to           In most cases, however, this is tedious work and disconnects from the
gather such data for presentation in quarterly meetings are now             analysis context. In many cases, when it comes to this part, users have
things of the past.                                                         expressed a desire for an automated solution.
    It is not uncommon for extract, transform, and load (ETL) processes         In order to streamline decision-making processes, decrease
to collect data from LOB applications and underlying transactional          response time, and track the causality of such operations, this action
databases on an hourly basis or even more frequently.                       can be started from within the decision-support context and triggered
    A wealth of near-real-time information is now available that can        from the analysis tool (Office Excel, in this example).
be utilized by LOB users to streamline operational processes.                   This step closes the data cycle from BI back to LOB applications
    Because of this, today’s BI solutions can take a more active role       (see Figure 1 on page 9) and encapsulates decision-making logic.
in the operational space by offering a closer integration back to               The next generations of BI applications can be designed in such a
LOB applications. Information should be presented in an interactive,        way that users can act on the information easily. This will bridge the
actionable way, so as to allow users to act on it.                          gap between BI analysis and operational processes, and will lead to
    This article argues that in order to attain the business value of       more cohesive enterprise solutions.
BI information, the final output of BI must be an action that triggers          Information utilization is maximized, and BI can become in this
a business operation on LOB applications. To achieve this goal,             way a driver for business-process optimization. Actions that are taken



Figure 1: LOB-BI-LOB data flow


[Figure 1 labels. Business intelligence side: data warehouse (SQL Server), OLAP cubes (SSAS with URL actions), information analysis (Office Excel). Operations side: line-of-business applications, line-of-business transactional database. Flows: ETL with SSIS (data load), intelligence as actionable information, action closing the cycle, transactions.]




based on BI information can be better tracked; effects can be better              Whenever we are faced with data integration between different
analyzed, which will ultimately lead to better business performance.              technology layers, and loose coupling becomes a prime concern,
    BI solutions that provide actionable information allow companies              a service-oriented approach comes naturally to mind as an
to stay agile with business changes by reducing the time between the              architectural solution.
moment that a business event occurs and when an action is taken.                      In this case, a service-oriented architecture (SOA) can represent
                                                                                  the glue between the different components that must interact, and
Actionable information in this sense must contain two parts:                      can provide the governing concepts of such interaction.
                                                                                      But, when it comes to service orientation and BI, where are we
•   The condition in which an action is required (for example, on-hand            today?
    is below a minimum level).
•   An action that defines the response when the condition occurs                 Actionable Information and SOA
    (for example, perform an operation such as order items to restock).               An SOA calls for different functional elements to be implemented
    It must specify how the business action can be invoked and                    as interoperable services that are bound only by interface contracts.
    where it can be reached (the endpoint).                                       This allows for a flexible, loosely coupled integration. Different
                                                                                  components in such architectures can evolve naturally over time, or
Based on the principle of separation of concerns—and to allow a                   they can be exchanged without affecting the solution as a whole.
layered approach between a data tier (such as an OLAP back end) and                   While LOB systems have been at the forefront of SOA adoption,
presentation tier (that is, an analysis tool)—the actionable information          BI products have moved more slowly to support such architectures.
belongs on the data tier.                                                             Traditionally BI-technology vendors have offered monolithic
    Actions in this case would need to be consumer-agnostic and                   OLAP solutions that use analysis capabilities that cannot be extended
triggerable from any analysis tool.                                               with actionable information to simplify end-to-end BI-to-LOB
    Furthermore, the action implementation represents a business                  integration.
operation and, as such, is governed by LOB-specific concerns and                  In recent years, however, we have seen some solutions open up and
business rules. Therefore, it must be executed by the corresponding               offer more interoperability features that bring us closer to service
LOB application.                                                                  orientation.
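
The two parts of actionable information that are listed above (a condition and an action with its endpoint) can be sketched as a small, consumer-agnostic data structure. This is an illustration only: the class name, the field names, the cell layout, and the endpoint URL are all hypothetical, and the sketch merely records where the LOB operation can be reached; it does not execute the operation, which remains the responsibility of the LOB application.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping


@dataclass(frozen=True)
class ActionableInfo:
    """A condition plus the action (and endpoint) that answers it."""
    name: str
    condition: Callable[[Mapping[str, float]], bool]  # when is the action required?
    endpoint: str  # where the LOB business operation can be reached


def available_actions(cell: Mapping[str, float],
                      actions: List[ActionableInfo]) -> List[ActionableInfo]:
    """Return the actions whose condition holds for the analyzed cell."""
    return [action for action in actions if action.condition(cell)]


# Hypothetical example: restock when on-hand is below the minimum level.
restock = ActionableInfo(
    name="Order items to restock",
    condition=lambda cell: cell["on_hand"] < cell["minimum_level"],
    endpoint="https://lob.example.com/purchasing/restock",
)
```

Because the structure holds no presentation logic, any analysis tool could evaluate the conditions and offer the matching actions to the user.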




With SQL Server Analysis Services (SSAS), Microsoft provides a feature   the Web application can collect more information from the user,
that is called “URL actions” and represents a mechanism to store         depending on the action.
actionable information inside the OLAP cube metadata. Conditions             Such a solution can leverage advances in rich Internet application
in which the action becomes available can be expressed by using the      (RIA) technologies such as Ajax or Microsoft Silverlight.
MDX query language, and the URL provides an endpoint.                        A Silverlight front end has an advantage over classical Web
    Such capabilities in OLAP technologies can be utilized to            applications: Code that is written in a CLI language such as C# will
implement an end-to-end integration back to LOB applications in          run on the client in the browser application, which minimizes cross-
an SOA.                                                                  computer calls to the server.
    In an enterprise SOA, operations on LOB applications can be              Silverlight is built on the .NET framework and can utilize Windows
exposed as Web services that hide the complexity of underlying           Communication Foundation (WCF) as an out-of-the-box framework
programs and, together with any external Cloud or B2B services,          for SOA integration. WCF provides a flexible API to implement SOAP
can represent basic components in an infrastructure that models          Web service calls from the client process.
business needs.                                                              Consequently, Silverlight can be utilized as a bridge between SOA
    Office Excel has the native ability to consume the actionable        and BI.
information by displaying to the user any available actions on cells         The cell information (current members), together with any
in which the predefined conditions are met. The caption that will be     additional information that is collected from the user, can be sent to
displayed can also be defined dynamically by using MDX expressions.      the Web service that implements the business action; and, at the end,
    Each cell of the analyzed cube is defined by one member (called      any service feedback can be displayed to the user.
the current member) from every dimension.                                Data flow between any such services can be orchestrated and
The URL can be defined in the cube to include the current members as     customized to fulfill the need of business operations.
a query string.                                                              Using Web services to expose operations on LOB applications in
When the user performs the suggested action on a cell, Office Excel      an SOA is a way to utilize existing software assets better and increase
will call the Web application that the URL has located. If necessary,    their participation in enterprise information flow. This also makes




Guerrilla BI:
Delivering a Tactical Business-Intelligence Project
by Andre Michel Vargas

     “If ignorant both of your enemy and yourself, you are certain to be in peril.”
                      SUN TZU

Business intelligence (BI) offers a distinct competitive advantage. By transforming data into knowledge through BI, you can determine your strengths and weaknesses while better understanding your position in the marketplace. Whether it’s cost reduction, process optimization, increasing revenue, or improving customer satisfaction, it’s more important than ever to run your business better and faster. Knowledge is critical to being agile and adaptive in a changing economy; a 1 percent head start can make all the difference. However, successful BI initiatives are traditionally lengthy and expensive. So, how can you affordably implement a winning BI initiative? Guerrilla business intelligence (GBI) delivers the insight that you need, in weeks.

The GBI approach is the following:

1. Mobilize with speed. Reducing decision-making time is critical. This doesn’t mean that decisions should be made hastily; instead, speed requires preparation. In the first week of the GBI deployment, teams must be embedded with business subject-matter experts, thus allowing teams to identify and address hot spots.
2. Deploy small multiskilled teams. Form crossfunctional teams to interpret your data, to gain insights on operational weaknesses and strengths. Teams can focus on anything from cost-reduction efforts and process-efficiency optimization to market analysis for revenue opportunities.
3. Infiltrate iteratively and collaboratively. Infiltrate iteratively through your business functions in four- to six-week deployments. Clear objectives and short deployments bring focus. Also, teams must collaborate with executives from each business function to ensure that GBI efforts align with overall business strategy.
4. Leverage and extend your existing BI tools. Leverage existing BI technologies with agile implementation techniques to deliver value quickly. Then, expand on traditional BI tools (data warehouse, ODS, dashboards, analytics, and so on) with productivity-monitoring and optimization tools. A small investment in these assets will allow your GBI team to analyze and improve processes and support cost-reduction goals.

     “Strategy without tactics is the slowest route to victory. Tactics without strategy is the noise before defeat.”
                      SUN TZU

Implementing a GBI strategy can provide a key advantage in today’s fluctuating economy. However, do not lose sight of the long term. Investing in enterprise BI remains essential. When signs of growth reappear, revise your BI strategy, including guerrilla tactics, for optimal success and return on investment (ROI).

     “Know thyself, know thine enemy. A thousand battles, a thousand victories.”
                      SUN TZU

Andre Michel Vargas (andre.michel.vargas@paconsulting.com) is a management consultant in PA Consulting Group’s IT Consulting practice. He specializes in solving complex information-related issues, including legacy-systems migration and integration, business-process automation, and delivery of enterprise BI. For more information on PA’s thinking around BI, please visit www.paconsulting.com/our-thinking/business-intelligence-solutions.

                                                                                                              The Architecture Journal 22
BI-to-LOB Integration: Closing the Cycle

collaboration of the components more flexible, and it becomes easier to adapt to changes in business processes, which ultimately leads to a better IT-to-business alignment.
    An architectural solution (see Figure 2) that leverages a technology stack that is based on SQL Server Analysis Services 2008, with URL actions defined on the cube (BI platform), Office Excel 2007 (BI analysis tool), and Silverlight RIA (BI-to-SOA bridge), can exemplify utilization of enterprise SOA as an enabler of BI-to-LOB integration.
    URL actions represent a simple and effective way to implement actionable information for BI solutions that are built on SSAS. The URL can point to a Silverlight RIA application that will act as a BI-to-SOA bridge and make calls to Web services by using WCF.

Figure 2: Technology stack
    SOA: line-of-business applications, cloud services, B2B services (SOAP)
    BI-to-SOA bridge: Microsoft Internet Information Services; Silverlight RIA with out-of-the-box WCF support
    Analysis and decision: Office Excel (analysis tool)
    Business intelligence: SQL Server Analysis Services (OLAP cubes with URL actions); SQL Server (data warehouse)

Implementation Considerations
Integrating data flow from BI back to LOB applications will be beneficial to operational processes. However, as soon as BI becomes an essential part of that world, its design will also be burdened with the mission-critical nature of operations.
    Special implementation considerations and requirements must be taken into account, compared to classical BI applications that are intended exclusively for strategic decision support.
    The BI solution that is integrated with LOB applications will target a wider audience and a larger number of operational users. BI will need to scale from a limited number of report consumers to a larger number of operational users at different levels in the enterprise hierarchy. In this case, scalability will play a much more important role, and it must be a leading factor when hardware and software design choices are made. Taking a more operational role will also result in higher availability requirements for BI. Information must be available, regardless of potential hardware or software outages that could affect a server.
    Data refreshes during ETL processing will have a more pronounced impact in this case, because they can affect user queries and delay operational processes. Therefore, failover strategies will need to be higher on the priority list than they are during the design of classical BI solutions.
    One solution to increase scalability and availability is to use Windows Network Load Balancing (NLB) to distribute user requests across different SSAS instances that are implemented on different cluster hosts. NLB will detect a host failure and accordingly redirect traffic to other hosts. Scheduled outages such as ETL processes can be mitigated by using large-scale staging systems, to keep OLAP data available all the time.
    Data volume will increase, because business operations are performed at a lower level of information aggregation than strategic analysis. In this case, data will also need to be stored and delivered at a finer grain to information consumers.

Conclusion
Enterprise BI architectures can maximize information value by offering a closer integration back to LOB applications. Taking a greater role in operational decision making will empower business users to interact with actionable information in BI.
    This will enable BI solutions to close the data cycle and drive performance improvement of operational processes.
    A technology stack that is based on Office Excel 2007, WCF services, and SSAS (with URL actions that are defined on the cube) can be leveraged to implement data integration easily from BI back to LOB applications in an SOA scenario.

About the Author
Razvan Grigoroiu (rgrigoroiu@epicor.com) is a Software Architect who has been involved for the past 10 years with LOB and BI solutions for the specialty-retail sector.

Follow up on this topic
•   SQL Server Integration Services (SSIS): http://msdn.microsoft.com/en-us/library/ms141026.aspx
•   SQL Server Analysis Services (SSAS): http://www.microsoft.com/sqlserver/2008/en/us/analysis-services.aspx
•   Rich Internet Applications: http://www.microsoft.com/silverlight/
•   Service Orientation: http://msdn.microsoft.com/wcf




Performance Management:
How Technology Is Changing the Game
by Gustavo Gattass Ayub

Many enterprises have built the capabilities to monitor, analyze, and plan their businesses. But the problem is that they are delivering insight into the past, but not into up-to-the-moment performance.
    Front-line managers increasingly need to know what’s happening right now. Individual contributors need to have access to current data to provide quality service. Retailers and manufacturers are taking advantage of this to avoid stock-outs or overproduction. Financial-services, logistics, and utilities companies are using stream-data processing to increase operational efficiency and create new business capabilities.
    There are clearly three challenges: effective delivery of integrated data to end users, the ability to process huge volumes of granular data, and the ability to process data streams.
    In-memory processing and 64-bit PCs are changing the way in which end-users access current integrated data, as it allows them to build reports, dashboards, and scorecards without the direct support from IT (also known as self-service business intelligence [BI]). From the IT perspective, it’s an alternative to delivering insight without the need of long implementation cycles. The integration of in-memory processing with business productivity tools such as spreadsheets and intranet portals is becoming the option of choice to deliver the BI-for-the-masses vision.
    Predictive analytics is a top priority in every enterprise profiting from BI and its potential is directly related to the granularity and latency of data. Today, in general, there is no big advantage in working only with sets of aggregated data. The real advantage comes from processing huge volumes of granular data in near-real-time. From a technology perspective, there is a new generation of data-warehouse (DW) appliances that will enable this capability for organizations that need to predict beyond competition.
    Stream-processing technologies allow real-time monitoring by detecting events or patterns of events as data streams through transactional systems and networks or from sensors. Complex event-processing (or CEP) platforms are enabling new applications, varying from pollution control to algorithmic trading. CEP is becoming a mandatory capability for any BI platform, as new applications emerge and also as the real-data integration paradigm might shift in the near future from the repeatable cycles of the traditional ETL process to the event-processing paradigm.
    Together, these emerging technologies are playing a key role in enabling some enterprises to evolve from the traditional DW to modern BI platforms that have arrived to change the game by providing real-time monitoring and much faster analysis.

Gustavo Gattass Ayub (ggattass@microsoft.com) is a Senior Consultant at Microsoft Consulting Services Brazil.


Relevant Time-Performance Management
by Usha Venkatasubramanian

Organizations need to make informed business decisions at strategic, tactical, and operational levels. Decision-support systems were offline solutions that catered to specific needs. With new trends, there is a need to cover a larger set of people, right from the CEO, who looks at a larger timeframe, up to an operations manager, who needs recent statistics. Therefore, we must build a performance-management system that delivers information at the relevant time: Relevant Time-Performance Management (RTPM).
    How can an organization provide self-service capability to the business, while still maintaining the data recency and granularity? We implemented a multilayered data warehouse that is both a sink and a source of information. Data currency was maintained by using a suitable adapter to poll data (for example, SSIS in the Microsoft BI suite).

Management Organization-Structure Relevance
Near-real time data was trickle-fed into the lowest layer and reflected in the output for the operational manager. Data was sourced to higher levels of managers by creating higher layers of aggregation, and at predefined time intervals. Granular data got offline-archived for the future. When data reached the highest level of aggregation, it was retained for comparative reporting for a longer duration of time.

Information Relevance
Current information requirements that are categorized as primary data (information source) resided in all layers. Data that is not required for querying was captured as supplementary data (data sink). Some data from the secondary layer would move to the primary layer, if there is a request for additional data. Likewise, a primary data element would be retired by moving it to the secondary layer.

Data-Nature Relevance
A careful balancing act is needed to control the unwieldy growth of the data volumes in the data-warehouse database, while still providing the relevant information. An offline retention policy–based archive helps maintain the relevant information.

Recency Relevance
Recency of information calls for a proper Change Data Capture mechanism to be in place for different stakeholders to get what they need. This would primarily depend on the nature of the source data itself. Using metadata-driven CDC and normalized CDC, the data is maintained as recently as required.

Delivery Relevance
Information delivery was a mix of push and pull to maintain the time relevance. Standard reports were delivered through the push method and ad-hoc reports through the pull method.
    Some of the case studies in which we’ve used these principles effectively can be seen at the following Web site: http://www.lntinfotech.com/services/business_analytics/overview.asp.

Usha Venkatasubramanian (usha.v@lntinfotech.com) is the deputy head of the Business Analytics Practice at L&T Infotech.
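
The layering described under Management Organization-Structure Relevance (granular data at the lowest layer, higher layers of aggregation built at predefined intervals) can be illustrated with a minimal sketch. The sample rows, the hourly and daily layers, and the `roll_up` helper are all assumptions made for illustration; the sidebar does not describe the actual implementation.

```python
from collections import defaultdict
from datetime import datetime


def roll_up(rows, bucket):
    """Sum amounts per bucket; `bucket` maps a timestamp to a coarser key."""
    totals = defaultdict(float)
    for ts, amount in rows:
        totals[bucket(ts)] += amount
    return dict(totals)


# Lowest layer: granular, near-real-time rows (hypothetical sales amounts).
granular = [
    (datetime(2009, 10, 1, 9, 15), 120.0),
    (datetime(2009, 10, 1, 9, 40), 80.0),
    (datetime(2009, 10, 1, 14, 5), 50.0),
    (datetime(2009, 10, 2, 11, 30), 200.0),
]

# Middle layer: hourly totals, the kind of view an operations manager might use.
hourly = roll_up(granular, lambda ts: ts.replace(minute=0, second=0, microsecond=0))

# Top layer: daily totals, built from the hourly layer rather than from raw rows.
daily = roll_up(hourly.items(), lambda ts: ts.date())
```

Building each layer from the one below it, rather than from the raw rows, is what would allow the granular data to be archived offline while the higher layers stay available for comparative reporting.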




Increasing Productivity
by Empowering Business Users
with Self-Serve BI
by Ken Withee




Summary
Enabling end-user self-serve business intelligence (BI) is a critical step in modern business. The latest wave of Microsoft products enables self-serve BI by using tools and technologies such as Office SharePoint, PowerPivot, Business Connectivity Services (BCS), Office Excel, and Report Builder.

Introduction
Creating a self-serve environment is a critical evolutionary step in any software environment. Imagine if the IT department had to be involved in sending every e-mail message. The thought is almost laughable. When an e-mail system has been put in place, it is maintained by IT, but users are free to send, receive, and manage their e-mail through self-serve tools such as Office Outlook, Thunderbird, Lotus Notes, Eudora, and Pine.
    Business intelligence has become increasingly important (many would say critical) in modern business. Getting the right information to the right person at the right time is the promise of BI, but this is often easier said than done. Providing a self-serve mechanism for business users is the next step in the BI story. Providing self-serve BI allows for an exponential increase in the usefulness of BI by removing the large hurdle that is involved in the back-and-forth interaction of business users and the technical team. In essence, business users can answer their own questions as they arise, without having to pause to involve the IT team with every detail.
    This article explores the latest Microsoft tools and technologies that enable self-serve BI. In particular, it outlines the latest wave of SQL Server, Office SharePoint, and other Office products and how they can be used to provide self-serve BI to business users.

PowerPivot
When it comes to BI, there is a constant battle between business users and IT. Business users know the functional components of the business; they understand fully what they want to analyze and the questions that they want to answer. The IT team understands the structure of the data, the models, the cubes, data flow from the operational systems, the data warehouse, and the control mechanisms for the data. Business users often feel that IT is always saying “No” when they make a request for a data cube, report, chart, graph, or even raw data. The members of the IT team often feel that users are making unreasonable requests, with timelines that equate to boiling the ocean by first thing tomorrow morning. Self-serve BI attempts to solve this problem by providing a mechanism that will satisfy the analytical needs of users and provide the governance and control that IT requires.
    In its upcoming release of Office Excel 2010 and Office SharePoint Server 2010, Microsoft attempts to meet the self-serve analytical needs of business users with a feature that is known as PowerPivot.

PowerPivot and Office Excel 2010
PowerPivot is an add-in to Office Excel 2010 that, from the point of view of business users, simply allows them to pull massive amounts of data into Office Excel and then analyze it in a familiar environment. The premise is that users are already familiar with Office Excel and prefer to work in Office Excel, but are forced to go through IT when they are dealing with large and complex data. The IT team then goes through the process of modeling the data and building OLAP cubes that then can be used by the business users to perform their analysis. This back-and-forth nature is the root of a great deal of frustration for both sides.
    PowerPivot attempts to remove this interaction by providing features in Office Excel that allow business users to pull in and analyze very large amounts of data without having to interact with IT. When users have completed an analysis, they can upload the Office Excel document to an Office SharePoint library from which it can be shared with the rest of the organization. Because these PowerPivot documents live on the Office SharePoint server, IT maintains governance and control over the entire process.

PowerPivot Architecture
From an IT point of view, the predictable thing about business users is that they just want things to work; they are focused on their business role and see technology only as a tool that helps them perform their tasks. Under the covers, PowerPivot (formerly known as codename “Gemini”) is an incredibly complex piece of technology in which Microsoft has invested heavily to bring it to market. Above the covers, however (and what IT presents to the business user), PowerPivot is simply Office Excel. Many business users will not even care that a new technology has been introduced; they will just know that the new version of Office Excel can solve their problems with analysis of massive and complex data sets.
    The technology that provides Office Excel the ability to analyze millions upon millions of rows of data in line with what business users are expecting can be found in memory. In particular, the data is loaded into memory in what is known as in-memory column-based storage and processing. In essence, users are building their own in-memory data cube for analysis with Office Excel.
    To handle the amounts of data that PowerPivot supports, the in-memory data structure is highly compressed and read-only. The engine that is responsible for compressing and



managing this in-memory data structure is called VertiPaq.

Figure 1: BCS allows Office SharePoint 2010 read/write interaction with external systems. (Diagram labels: ERP, Text, CRM, XML, APP, Office SharePoint.)
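
To make the idea of compressed, read-only, column-based storage more concrete, here is a minimal sketch. It is not VertiPaq's actual algorithm, which the article does not describe; dictionary encoding and run-length encoding are swapped in as two common column-compression techniques, and the `region` column and all function names are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with a small integer id; return (dictionary, ids)."""
    dictionary = []
    index = {}
    ids = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        ids.append(index[value])
    return dictionary, ids


def run_length_encode(ids):
    """Collapse consecutive repeats into [id, count] pairs."""
    runs = []
    for i in ids:
        if runs and runs[-1][0] == i:
            runs[-1][1] += 1
        else:
            runs.append([i, 1])
    return runs


def count_by_value(dictionary, runs):
    """Aggregate directly on the compressed column without expanding it."""
    totals = {}
    for i, n in runs:
        totals[dictionary[i]] = totals.get(dictionary[i], 0) + n
    return totals


# Hypothetical column of a sales table, stored by column rather than by row.
region = ["East", "East", "East", "West", "West", "East", "North", "North"]
dictionary, ids = dictionary_encode(region)
runs = run_length_encode(ids)
counts = count_by_value(dictionary, runs)
```

Because a column holds values of one kind, often with many repeats, it tends to compress far better than row-oriented data, and simple aggregations can run over the compressed form. A read-only structure also avoids the cost of re-encoding on every update.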

PowerPivot and Office SharePoint Server 2010
PowerPivot for Office SharePoint accomplishes two important tasks. The first is that it provides a home for the PowerPivot documents that users create in an environment that IT controls. The second is that it provides users throughout the organization the ability to view and interact with PowerPivot documents by using nothing more than their thin-client Web browser.
    For business users, consuming a PowerPivot document is as simple as going to their Intranet site and clicking a document. The document then renders in their browser, and they can interact with the document and perform their analysis. In fact, the consumers do not even need to have
the consumers do not even need to have
Office Excel installed on their local computers to interact with the PowerPivot document. The result is that business users focus on their business analysis and are oblivious to the technology that enables the interaction.

PowerPivot Control and Governance
One of the biggest challenges in IT involves spreadsheets that become critical to business users without IT knowing of their existence. Office SharePoint provides functionality that gives IT the ability to track PowerPivot usage patterns. As PowerPivot (Office Excel) documents bubble up in importance, IT can monitor and identify the source data and business functionality.
    Having visibility into which PowerPivot documents are most frequently used by business users is critical in developing management and disaster-recovery plans. For example, if an extensively used PowerPivot document is pulling data from an operational system that was thought to have minimal importance, IT has achieved visibility into what is truly important to business users. IT is then better able to accommodate the needs of their users going forward.

Business Connectivity Services (BCS)
The major promise of BI is getting the right information to the right person at the right time. When people think of BI information, they usually think of numeric data and analytics. However, information takes many forms, including nonnumeric information that lives in a plethora of applications and databases.
    Some of the most popular systems that require integration include line-of-business (LOB) systems, often called enterprise resource planning (ERP), such as SAP, Oracle, Dynamics, Lawson, Siebel, and Sage, to name just a few. Access to these systems often is cumbersome and time consuming. A typical interaction involves using specialized applications and screens to access or update the information that lives in the ERP system.
    BCS is a technology that is included in Office SharePoint Server 2010 and provides integration and interaction (read/write) with the information that is contained in external systems, as shown in Figure 1.
    Integrating a front-end user-facing portal with external systems provides a single self-serve access point, without the need to install specialized software on end-user desktops. For example, imagine a customer-support representative taking an order from a customer. There might be one application for entering the order, another for taking notes on the call, and yet another for researching questions that the customer has about products. If the customer notifies the support representative of an address change, the representative must also access the system that stores the customer information and make the update. Business users have no need or desire to understand where their data and information lives; they just want to interact with the right information at the right time, in as easy a manner as possible. The BCS technology provides a mechanism for the consolidation of access points to external systems into one convenient portal location. Consolidation greatly reduces the complexity of end-user job functions by providing a single destination to perform business tasks and find business information.
    Reducing the complexity for users also reduces the number of disparate requests that IT must service. Instead of IT having to support connections, security, audits, and one-off projects for multiple systems, IT must set up the connection in the portal only once and then support the single portal framework. In addition, moving everything to a single framework restores control over the external systems to IT by moving users into a single environment.

BCS Architecture
BCS is an evolution of the Business Data Catalog (BDC) from Office SharePoint 2007. BCS is baked into the Office SharePoint 2010 platform and the Office 2010 clients. BCS uses three primary components that enable the connection to external systems. These include Business Data Connectivity, an External Content Type Repository, and External Lists. In addition, the BCS client is also included in the Office 2010 applications.
    The External Content Type Repository and External Lists allow solution architects not only to describe the external data model, but also to define how the data should behave within Office SharePoint and Office.
    BCS connections are XML-based and include functionality to connect to SOA-based services. When a connection file has been set up, it can be used throughout the Office SharePoint environment by end users.
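
The consolidation idea behind BCS (register each external connection once, then let users read and write through a single portal that IT controls) can be sketched as follows. This is not the BCS API: the `Portal` and `InMemorySystem` classes, the record keys, and the audit log are hypothetical stand-ins, chosen to mirror the customer-support example in the text.

```python
class InMemorySystem:
    """Stand-in for an external system (ERP, CRM, and so on); purely hypothetical."""

    def __init__(self, records):
        self._records = dict(records)

    def read(self, key):
        return self._records[key]

    def write(self, key, value):
        self._records[key] = value


class Portal:
    """Single access point: connections are registered once, then reused."""

    def __init__(self):
        self._connections = {}
        self.audit_log = []

    def register(self, name, system):
        self._connections[name] = system

    def read(self, user, name, key):
        self.audit_log.append((user, "read", name, key))
        return self._connections[name].read(key)

    def write(self, user, name, key, value):
        self.audit_log.append((user, "write", name, key))
        self._connections[name].write(key, value)


crm = InMemorySystem({"cust-42": {"address": "1 Old Road"}})
erp = InMemorySystem({"order-7": {"customer": "cust-42", "status": "open"}})

portal = Portal()
portal.register("crm", crm)
portal.register("erp", erp)

# The support representative looks up the order and updates the address in one place.
order = portal.read("rep1", "erp", "order-7")
portal.write("rep1", "crm", order["customer"], {"address": "9 New Street"})
new_address = portal.read("rep1", "crm", "cust-42")["address"]
```

Routing every call through one object is also what gives IT a single place to apply security and auditing, which is the control argument that the article makes for the portal approach.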



Figure 2: Report Builder is designed to provide business users with an easy-to-use
report-development environment.

Report Builder
Report Builder is an application that is designed to give end users the ability to create
and publish their own SQL Server Reporting Services (SSRS) reports. Report Builder was
designed for the end user, with the comfortable look and feel of other Microsoft Office
products. In particular, Report Builder includes the Office Ribbon at the top of the
report-design surface, as shown in Figure 2.
    The underlying report-engine code base is shared with the Business Intelligence
Development Studio (BIDS) report-design environment. Because the code base is shared, end
users who create reports by using Report Builder and developers who design reports by
using BIDS work with functional parity.
    Report Builder uses ClickOnce technology for deployment. ClickOnce allows users to
click the link in either Report Manager or Office SharePoint and download the application
to their desktop computers. ClickOnce alleviates the need for a mass install by the IT
department. When Report Builder must be upgraded or updated, the new bits are
automatically downloaded to the user desktop without the need for manual updates.

SQL Server Reporting Services
SQL Server Reporting Services (SSRS) is the reporting component of the SQL Server product.
The SSRS architecture consists of a Windows Service that is designed to render reports and
two SQL Server databases that are designed to store content, configuration, metadata, and
temporary rendering information. SSRS reports are defined in an XML-based format that is
called Report Definition Language (RDL). SSRS reports (RDL files, in other words) can be
created by using either BIDS (Visual Studio) or Report Builder.
    SSRS can be installed in either stand-alone mode or integrated mode. When it is
installed in stand-alone mode, a Web application that is known as Report Manager is
responsible for storing, managing, and providing the reporting environment to end users.
When it is installed in integrated mode, Office SharePoint takes over, and Report Manager
is no longer used.
    Although SSRS is a component of the SQL Server product, it is not restricted to
pulling data from only a SQL Server database. Using Report Builder, end users can pull
data from a number of different connection types, including OLE DB, ODBC, Analysis
Services, Oracle, XML, Report Models, SAP NetWeaver BI, Hyperion Essbase, and Teradata.
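Because an RDL file is plain XML, ordinary XML tooling can inspect it. The sketch below is only an illustration, not part of SSRS: it parses a hand-written, heavily simplified RDL-style fragment and lists the data sources that it declares. The server names, catalog names, and the helper names `local_name` and `list_data_sources` are hypothetical, and a real report definition contains many more elements (datasets, layout, parameters).

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written RDL-style fragment for illustration only.
# Server and catalog names are made up.
SAMPLE_RDL = """\
<Report xmlns="http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition">
  <DataSources>
    <DataSource Name="SalesWarehouse">
      <ConnectionProperties>
        <DataProvider>SQL</DataProvider>
        <ConnectString>Data Source=dwserver;Initial Catalog=SalesDW</ConnectString>
      </ConnectionProperties>
    </DataSource>
    <DataSource Name="FinanceCube">
      <ConnectionProperties>
        <DataProvider>OLEDB-MD</DataProvider>
        <ConnectString>Data Source=olapserver;Initial Catalog=Finance</ConnectString>
      </ConnectionProperties>
    </DataSource>
  </DataSources>
</Report>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def list_data_sources(rdl_text):
    """Return (name, provider, connect_string) for each DataSource element."""
    root = ET.fromstring(rdl_text)
    sources = []
    for element in root.iter():
        if local_name(element.tag) != "DataSource":
            continue
        # Collect the text of every descendant, keyed by its local tag name.
        fields = {local_name(child.tag): (child.text or "").strip()
                  for child in element.iter()}
        sources.append((element.get("Name"),
                        fields.get("DataProvider", ""),
                        fields.get("ConnectString", "")))
    return sources


data_sources = list_data_sources(SAMPLE_RDL)
for name, provider, connect_string in data_sources:
    print(f"{name}: provider={provider}, connection={connect_string}")
```

An IT team could use an inventory of this kind, for example, to check which source systems the user-built reports in a library actually touch.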




Figure 3: Making the connection to an SSAS OLAP cube is accomplished on the Data tab of Office Excel.





Figure 4: Browsing an SSAS OLAP cube as a PivotTable in Office Excel.

Self-Serve Reporting
Business users launch Report Builder by clicking a
link in either Report Manager or Office SharePoint.
As soon as Report Builder is launched, business
users create connections to data sources and build
reports. The reports can then be saved into either
Report Manager or an Office SharePoint Document
Library. Other users can then connect to Report
Manager or the Office SharePoint site to view the
available reports. The IT team can maintain control by
providing “approved” reports, monitoring usage, and
limiting access to the servers that contain the source
data.
    When SSRS is installed in integrated mode,
reports can take advantage of the Office SharePoint
Enterprise Content Management (ECM) features,
such as versioning, security, check-in/check-out, and
workflow. In addition, the responsibilities of IT are
reduced, because only a single content-management
system must be maintained and managed. In the
Office SharePoint environment, an SSRS report is
nothing more than a content type, such as an Office
Word document or Office Excel spreadsheet.

Office Excel and Excel Services
Office Excel is one of the most widely used data-analysis applications in the world.
Nearly every organization uses Office Excel in one capacity or another. Some businesses
use Office Excel almost exclusively to run and manage their data needs. One of the most
beloved data-analysis features of Office Excel is the PivotTable. A PivotTable provides an
easy-to-use drag-and-drop interface for slicing, dicing, grouping, and aggregating data.
Beginning with Office Excel 2007, end users have the ability to connect to and utilize the
back-end SQL Server Analysis Services (SSAS) server from the comfort of Office Excel on
their desktops. As a result, end users can browse and analyze OLAP cubes and tap into the
powerful data-mining capabilities that SSAS provides.

Using Office Excel to Browse SSAS Cubes
When connecting to and analyzing an SSAS OLAP cube, the business-user experience is nearly
identical to analyzing a local Office Excel PivotTable. A user makes the connection by
selecting the From Analysis Services option on the From Other Sources menu of the Data
tab, as shown in Figure 3.
    When the connection has been made, users can browse the cube and perform an analysis,
just as they would with a local PivotTable, as shown in Figure 4.

Office Excel and SSAS Data Mining
A Data Mining add-in for Office Excel is available to provide access to the data-mining
algorithms that are contained within SSAS. Installing the add-in provides a Data Mining
tab in Office Excel with which users can access the algorithms that are contained within
the SSAS Data Mining engine. The Data Mining tab in Office Excel is shown in Figure 5.
    The SQL Server Data Mining Add-In provides the following functionality:

•   Data-analysis tools: Provides data-mining analysis tools to the Office Excel client,
    allowing users to perform deeper analysis on tables by using data that is contained in
    their Office Excel spreadsheets.
•   Client for SSAS data mining: Provides the ability to create, manage, and work with
    data-mining models from within the Office Excel environment. Users can use either data
    that is contained in the local spreadsheet or external data that is available through
    the Analysis Services instance.



Figure 5: The Data Mining tab provides end users the ability to interact with the data-mining capabilities of SSAS.





•   Data-mining templates for Office Visio: In addition to the Office Excel functionality,
    the add-in also provides the ability to render and distribute data-mining models as
    Office Visio documents.

Office SharePoint 2010 and Excel Services
Office Excel has to be one of the most popular and prominent data-analysis applications.
Business users create Office Excel spreadsheets that perform everything from ad-hoc
analysis to fully featured profit-and-loss calculators. When these applications are under
the radar, they cannot be backed up or supported by IT. The result is that business users
become frustrated with IT for not being able to support them, and IT becomes frustrated
with users who are working outside the provided system. Business users feel that IT is not
giving them the correct tools, so they go and create their own by using Office Excel. The
IT team members feel that the business users are going around IT and have no right to
complain when the stuff that they created on their own breaks.
    To complicate matters further, users often e-mail Office Excel documents to other
users, who e-mail those documents again. This creates multiple mutations of critical
business functionality, as shown in Figure 6.

Figure 6: Office Excel document is e-mailed to users, who modify and e-mail again, creating
multiple mutations of the document.
[Diagram labels: ERP, Text, CRM, XML, APP; Office Excel (multiple copies)]




Self-Service Reporting Best Practices on the Microsoft BI Platform
by Paul Turley

Once upon a time, there was a big company whose IT department wanted to ensure that
everyone would see only good data in their reports. To make sure of this, they ruled that
all reports would be created by IT from data that was stored on IT-controlled databases.
Business managers and users quietly circumnavigated this by downloading data into
spreadsheets and data files. Another company’s IT group enabled the business to perform
its own reporting by using an ad-hoc tool, opening databases to everyone.
    In both of these companies, when leaders had questions, everyone had answers! The only
problem was that the answers were all different. Many organizations operate in one of
these extremes.
    Business users can gain important insight by using self-service reporting tools. Armed
with the right answers, leaders and workers can take appropriate action and make informed
decisions, instead of shooting from the hip or waiting for reliable information to come
from somewhere else. Functional business-intelligence (BI) solutions don’t evolve into
existence; they must be carefully planned and managed.
    These best practices adhere to some basic principles and experience-borne lessons:

Manage the Semantic Layer
A single version of the truth might consist of data that is derived from multiple sources.
By simply giving users the keys to the database kingdom, you aren’t doing anyone any
favors. One size doesn’t fit all, but business-reporting data should always be abstracted
through a semantic layer. This might be a set of views on a data mart, a report model, or
an online analytical processing (OLAP) cube. There are substantial advantages in using the
latter option, if your organization is prepared for some development and maintenance
overhead. Analysis tools (such as the new generation of Report Builder in Microsoft SQL
Server 2008 and the pending release of SQL Server 2008 R2, Microsoft Office Excel, and
Office PerformancePoint Services for SharePoint) might be given to users, but the semantic
layer must be managed centrally by IT.

Separate User- and Production-Report Libraries
User reports might be used to make important decisions and might even become
mission-critical, but the reports, scorecards, and dashboards that are “guaranteed” to be
accurate and reliable should go through the same rigorous IT-managed design, development,
and testing criteria as any production-ready business application. Designate a library for
ad-hoc reports, separate from production reports. Office SharePoint is an excellent medium
for this purpose.

Conduct Formal Review Cycles, Validate Reports, Consolidate Them in Production
One of the most effective methods for IT designers to understand business-reporting
requirements is to leverage user-designed reports. For mission-critical processes, use
these as proofs of concept, and then work with the business to design consolidated,
flexible “super reports” in a production mode.
    Learn how to implement these tools to build a comprehensive self-service reporting
solution by reading the full article on Paul’s blog: http://www.sqlserverbiblog.com.

Paul Turley is a business-intelligence architect and manager for Hitachi Consulting, and a
Microsoft MVP.
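As a rough illustration of the simplest semantic-layer option named in the sidebar above (a set of views on a data mart), the following sketch uses an in-memory SQLite database as a stand-in for the data mart. The table names, column names, the view name `v_net_sales_by_region`, and the rule that voided sales are excluded are all assumptions made for the example; they do not come from the article.

```python
import sqlite3

# In-memory SQLite database standing in for a data mart (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE fact_sales (region_id INTEGER, amount REAL, is_void INTEGER);

    INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES
        (1, 100.0, 0), (1, 50.0, 1), (2, 75.0, 0), (2, 25.0, 0);

    -- The semantic layer: IT defines the join and the business rule
    -- (voided sales do not count) once, in one centrally managed place.
    CREATE VIEW v_net_sales_by_region AS
    SELECT r.region_name AS region, SUM(s.amount) AS net_sales
    FROM fact_sales AS s
    JOIN dim_region AS r ON r.region_id = s.region_id
    WHERE s.is_void = 0
    GROUP BY r.region_name;
""")

# A self-service report queries the view, never the base tables.
rows = conn.execute(
    "SELECT region, net_sales FROM v_net_sales_by_region ORDER BY region"
).fetchall()
for region, net_sales in rows:
    print(f"{region}: {net_sales:.2f}")
```

Because every user report reads from the same view, two users who ask for net sales by region get the same answer, which is the point of managing the layer centrally.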



Figure 7: Office Excel document is saved to Office SharePoint and accessed by users
throughout the organization by using only a thin client (Web browser).
[Diagram labels: ERP, Text, CRM, XML, APP; Office Excel; Office SharePoint]

Excel Services in Office SharePoint 2010 attempts to solve the issues with Office Excel by
providing a home for Office Excel documents in the Office SharePoint environment that is
controlled and governed by IT. When Office Excel documents are saved in an Office
SharePoint document library, there is one version of the document, and users can connect
and use the document without spawning multiple mutations, as shown in Figure 7. The
document can also take advantage of the ECM features of Office SharePoint, including
versioning, security, check-in/check-out, and workflow. In addition, IT is able to gain
oversight, visibility, and control over the Office Excel applications with which users are
performing their business tasks.
    Office Excel documents in Office SharePoint can use connection files that are managed
by the IT department. For example, IT can create connection files to the source systems
and then simply point end users to these approved connection files. This alleviates the
need for IT to service numerous requests for connection information for every Office Excel
file that is created for a business problem.
    One of the most powerful features of Office SharePoint is called Excel Services. Excel
Services is the ability to render Office Excel documents in a thin client (Web browser).
An important Office Excel document can be saved to a document library, and the entire
organization can then view and interact with the Office Excel document without having to
leave their browser. The consumers of the document just navigate to their company intranet
and click the Office Excel document.
    This functionality is particularly powerful when thinking about rolling out Office
Excel 2010 to provide PowerPivot functionality. Only a handful of business users actually
produce content, with the rest just consuming it. Using Excel Services, the only users who
will need to have Office Excel 2010 are the producers of content. The consumers can
interact with the PowerPivot documents without ever having to leave their browser or
install the latest version of Office Excel.

Conclusion
Report Builder is an end-user report-development tool that provides end users the ability
to create their own SSRS reports without the need for an SSRS expert. The Report Builder
application uses the same underlying code base as the Business Intelligence Development
Studio (BIDS), which is designed for professional developers. Allowing end users to build
their own reports takes a tremendous amount of resource load off of the technical team,
allowing them to focus on the underlying data warehouse instead of the tedious
report-design process.
    Office Excel is one of the most ubiquitous data-analysis programs in use in the world
today. Microsoft has recognized that people are already comfortable with Office Excel and
often do not want to change to another application for data analysis. Office Excel can be
used as a client for the back-end SSAS server.
    In particular, users can connect Office Excel to OLAP cubes that are hosted on SSAS and
slice and dice data in the same fashion in which they would use a local PivotTable. In
addition, Office Excel can be used as a client for the data-mining functionality of the
SSAS server. The power of the data-mining algorithms can be leveraged with data that is
contained in a data warehouse or data that is local in Office Excel spreadsheets. In both
situations, Office Excel acts as a client for the SSAS server, which provides end users
with the power of SQL Server and the comfort of Office Excel.
    One of the biggest pain points in an OLAP environment is the amount of effort that it
takes to organize and develop data cubes. Business users have to coordinate with BI
developers to identify the correct data, relationships, and aggregates. Requirements are
constantly shifting, and, by the time a cube has been developed, the requirement has
changed and the cube must be reworked. Providing end users the ability to create their own
data cubes in an easy-to-use environment is extremely important to the evolution of BI.
PowerPivot provides the ability for users to create in-memory cubes right on their
desktops in the familiar Office Excel environment. The cubes can then be uploaded to an
Office SharePoint site and accessed by users throughout the organization.
    Office SharePoint 2010 includes BCS, which provides read/write integration between
Office SharePoint and external systems. Consolidating functionality into the Office
SharePoint environment reduces complexity for end users and provides a one-stop shop for
all content, including BI, reporting, analysis, communication, and collaboration. In
addition, IT can consolidate focus from multiple access systems into a single portal
system.
    A self-serve environment is a key inflection point in any technological solution.
Placing the power of a solution in the hands of end users unleashes an exponential power
that can only be realized through a self-serve environment. Surfacing BI information into
a collaborative environment such as Office SharePoint enables a new form of BI that is
called human business intelligence (HBI). HBI merges the traditional analytical
capabilities of a BI solution with the knowledge of the people throughout the
organization.
    The latest wave of Microsoft products is interwoven to provide a single cohesive
self-serve environment for end-user content creation. This places the power of content
creation and analysis in the hands of the end users. Without the intensive back-and-forth
BI-development


process that currently exists, users are free to expand their knowledge exponentially and
on their own time.

Resources
Donald Farmer: Foraging in the Data Forest. Blog. Available at
http://www.beyeblogs.com/donaldfarmer/.

Molnar, Sheila, and Michael Otey. “Donald Farmer Discusses the Benefits of Managed
Self-Service.” SQL Server Magazine, October 2009. Available at
http://www.sqlmag.com/Articles/ArticleID/102613/102613.html?Ad=1.

Microsoft Corporation. MSDN SQL Server Developer Center documentation and articles.
Available at http://msdn.microsoft.com/en-us/library/bb418432(SQL.10).aspx.

SQL Server Reporting Services Team Blog. “Report Builder 3.0, August CTP.” August 2009.
Available at http://blogs.msdn.com/sqlrsteamblog/.

Alton, Chris. “Reporting Services SharePoint Integration Troubleshooting.” MSDN SQL Server
Developer Center, August 2009. Available at
http://msdn.microsoft.com/en-us/library/ee384252.aspx.

Pendse, Nigel. “Commentary: Project Gemini—Microsoft’s Brilliant OLAP Trojan Horse.” The BI
Verdict, October 2008. Available at http://www.olapreport.com/Comment_Gemini.htm.

PowerPivot Team Blog. “Linked Tables.” August 2009. Available at
http://blogs.msdn.com/gemini/.

About the Author
Ken Withee (KWithee@hitachiconsulting.com) is a consultant with Hitachi Consulting and
specializes in Microsoft technologies in Seattle, WA. He is author of Microsoft Business
Intelligence for Dummies (Hoboken, NJ: For Dummies; Chichester: Wiley Press, 2009) and,
along with Paul Turley, Thiago Silva, and Bryan C. Smith, coauthor of Professional
Microsoft SQL Server 2008 Reporting Services (Indianapolis, IN: Wiley Publishing, Inc.,
2008).

Follow up on this topic
  •   PowerPivot: http://www.powerpivot.com/

Self-Service BI: A KPI for BI Initiative
by Uttama Mukherjee

A commonly asked question is: Can the overall business performance be attributed directly
to the success of business intelligence (BI)? To come up with a credible answer, the
prerequisite is to have widespread adoption of an intuitive BI, which comes from a
self service–enabled BI setup. To sustain such models, it is better to ensure that
self-service BI initiatives are funded through a “pay-per-use” process.
    An index for assessing the degree of self-serviceability of a BI implementation is
one of the key performance indicators (KPIs) to measure the success of BI.
    The following dimensions become critical to enable an all-pervasive self-service BI.
    People: Self-service for standard users can be addressed through a governance process.
The conflict of flexibility and standardization becomes a topic of more elaborate
deliberation for implementing a self-service environment for power users. Typically, power
users having direct access to the enterprise-wide “single version of truth” results in
possible runaway queries and redundant reports. Such users must be provided privileged
access to a “BI workspace,” defined succinctly by Forrester as a data-exploration
environment in which power users can analyze data with near-complete freedom and minimal
dependency on IT, or without being impeded by data-security restrictions.
    Process: Standard users get a self-service option through a set of predefined
reports/analytics as a relatively static utility service, to which they can subscribe at a
price (notional/actual). The accumulated credit may be used for funding future BI
initiatives. Depending on organization culture, consensus-driven processes are established
through a BI competency center (BICC) framework. Additionally, the BICC ensures transfer of
the gathered insights from power users to the larger group, evolving into a more mature BI
setup.
    Technology: With the preceding two aspects addressed appropriately, self-service
demand of the majority of information consumers can be met by using a standard
enterprise-wide BI setup. Considering that most of these services are predefined, the load
on the BI platform is predictable to a large extent. But for power users, who are
synthesizers, additional data (internal/external, structured/unstructured) and information
requirements demand state-of-the-art technology and higher throughput of the “BI
workspace.” To meet the needs of power users, technology decisions become critical, and
funding becomes a challenge.
    One index (among probable others) to measure the degree of self-service is to monitor
the usage rate of utility analytics by standard users and conduct a qualitative
satisfaction survey to monitor acceptance by power users. The “pay-per-use” fund that is
accumulated gives a quantitative measure.

Uttama Mukherjee (Uttama.mukherjee@hcl.in), Practice Director–HCL Technologies, leads
strategic BI consulting and delivery services in BI across industries.
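The three measures named in the sidebar above (usage rate among standard users, power-user satisfaction, and the accumulated pay-per-use fund) can be computed from simple usage records. The sketch below is one possible reading, not the author's method: the record layout, the function name `self_service_measures`, the 1-to-5 survey scale, and all sample figures are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    user: str
    report: str
    charge: float  # notional or actual price paid for this use


def self_service_measures(standard_users, usage_log, survey_scores):
    """Compute the three measures from the sidebar (illustrative definitions).

    usage_rate:   share of standard users who used any utility analytic
    fund:         total pay-per-use charges accumulated
    satisfaction: mean power-user survey score, or None if no responses
    """
    active = {r.user for r in usage_log if r.user in standard_users}
    usage_rate = len(active) / len(standard_users) if standard_users else 0.0
    fund = sum(r.charge for r in usage_log)
    satisfaction = (sum(survey_scores) / len(survey_scores)
                    if survey_scores else None)
    return {"usage_rate": usage_rate, "fund": fund, "satisfaction": satisfaction}


# Hypothetical sample data.
standard_users = {"ann", "bob", "cho", "dee"}
usage_log = [
    UsageRecord("ann", "weekly_sales", 2.0),
    UsageRecord("ann", "inventory", 2.0),
    UsageRecord("bob", "weekly_sales", 3.0),
]
survey_scores = [4, 5, 3]  # assumed 1-to-5 scale

measures = self_service_measures(standard_users, usage_log, survey_scores)
print(measures)
```

How the three numbers are weighted into a single index is left open here, because the sidebar does not prescribe a formula.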




Business Insight =
Business Infrastructure =
Business-Intelligence Platform
by Dinesh Kumar




Table 1: Business-insight characteristics

Requirement                   Implication
Collaborating across the      User experience: Consistent approach to delivering, analyzing,
organization                  finding, and sharing information to make informed decisions
                              Collaboration: Ability to share, annotate, and perform actions
                              as you would with a document
Transacting decisions         Speed and integration: Real-time data gathering, analysis,
                              decision making, and subsequently taking actions through
                              transactional systems
Anticipating unknowns         Service-oriented: Adding, integrating, and delivering
                              additional data as it becomes available or relevant
Reducing cost of change       Platform: Shared services
and ongoing operations

Summary
To do things differently, one must look at things differently. This article introduces the
notion of business infrastructure providing the necessary bridge between (contextual)
business insight and a (common) business-intelligence (BI) platform. Using the
business-infrastructure and business-capability models, the article provides a
prescriptive approach to planning and delivering BI services.

Changing Landscape
Currently, the IT industry is transitioning from an era of limited capability of
individual/functional reporting and analysis to one that is defined by a connected,
collaborative, and contextual world of BI. Insight is gained not only by analyzing the
past, but also by anticipating and understanding the future. Insight has value only if
people are able to act on it in a timely manner.
    As the need for real-time data gathering, analysis, and decision making increases,
so, too, does the need to perform actions through transactional systems. Insight is not
individual. In a world of collaborative BI, people want to find, share, comment on, and
review
data quite similarly to how they handle documents. Insight is a core         sufficient to support new information-driven business processes and
competency only if it comes naturally to people. As a result, cost,          organizational models. To gain and capitalize on business insight, we
capability, and consistency become equally important.                        must think differently about how we evaluate, plan, communicate, and
    Table 1 provides a list of characteristics for business insight in any   implement BI capabilities in the organization.
organization.
                                                                             Next Practice
Current Practices                                                            Understandably, people are driven by their needs. The BI-capability
Currently, there are two dominant approaches to delivering BI                planning and delivery must respect individuality while driving
capabilities. Some organizations utilize a “make-to-order” approach          consistent thinking and common capabilities in business and IT. This
to deliver a specific solution for a specific business need. For             article introduces the next practice with a capability model and a
example, when a sales team wants to target customers for upgrades            methodical approach for planning BI capabilities for business insight.
and new products, the IT group creates a customer data mart or
a report, extracting and summarizing data from CRM systems.                  Concept #1: Business Infrastructure
When manufacturing wants to analyze inventory or supply chain,
                                                                                Just like IT, business also has an infrastructure.
IT creates a manufacturing data mart, extracting and summarizing
information from an ERP system. To address new requirements,                 Based on the work across a wide variety of industries and solution
these organizations keep adding layers to individual, functional,            scenarios, the author has realized that almost every business process
or unidirectional reporting-and-analysis systems—introducing                 or activity is dependent on similar sets of capabilities. For example,
duplication, delays, complexity, and cost.                                   when cashing a check in a bank branch, a client asks the bank teller
    Other organizations have adopted a “build it and they will               about a new mortgage promotion. The teller calls someone in the
come” approach by building a massive, centralized, enterprise data           mortgage department to inquire about the new promotion. The
warehouse with the expectation that different groups might want to           same client is thinking of rebalancing a financial portfolio and asks a
access and analyze data someday. It takes significant effort as well as      financial advisor in the same bank about treasury bonds. The advisor
expenditure to build something; then, it takes an equally huge effort        calls a bond expert for information.
and cost to maintain and extend it.                                              These two business activities are remarkably different and
    These approaches to planning and building BI capabilities are not        performed in two different departments, yet both rely on a similar


capability, that is, access to an expert. Likewise, various business
processes and functions need or use similar BI capabilities, which are
different only in content and context. For example, the finance department
is focused on financial data, whereas manufacturing is focused on production
and quality data. However, both need the same BI capability: to report and
analyze the information that they value. In other words, improving the
common BI capability (such as reporting and analysis in the preceding
example) will improve or enable multiple business activities.
    Just as with IT infrastructure, business infrastructure represents a set
of common, horizontal business capabilities that support multiple
specialized, vertical business processes and functions. Just as in IT
infrastructure, improvement of the business infrastructure will reduce
process complexity, enhance organizational agility, and drive business
maturity. Just as IT architecture includes both applications and
infrastructure capabilities, business architecture includes both
business-process capabilities and business infrastructure.
    In the context of BI, we have organized the horizontal business
capabilities into three value domains. The value domains represent the type
of outcome or impact that is expected in the context of a business process
or activity in which the underlying capability is leveraged.
    (See Table 2 for the list of business capabilities that make up the
business infrastructure for business insight.)
    One could argue that there could be additional business capabilities
under the business-management value domain. Financial, customer, and product
management are considered core capabilities of every organization,
regardless of size and industry, including the public sector. Other areas of
business management are either extensions to these core areas or specific to
an industry or an organization. For example, in a utility company, logistics
management and workload management are added, as they are quite important
and distinct areas in the organization. In a financial institution,
individual and institutional banking are attributes of customer and product
management, but financial-advisor services are added as a core capability
for additional focus.
    These capabilities are fairly industry-independent or
process-independent; therefore, they can be characterized along a
value-maturity model. The maturity model helps organizations to assess the
current state and articulate the desired state easily and quickly.
    Table 3 on page 22 provides examples of the maturity level of some of
the business-infrastructure capabilities. The detailed model enumerates the
maturity of each capability in terms of people, process, information, access
methods, security and compliance, availability, and performance attributes.

Concept #2: BI Platform

    BI is cross-functional, cross-people, and cross-data.

Just as there are common business capabilities that enable business insight,
there is also a collection of BI services that articulate underlying
technical BI capabilities. A service-oriented approach to defining BI
capabilities minimizes complexity and cost, while it drives consistency and
maturity in capabilities.
    BI services are organized into four domains, each of which addresses a
distinct segment of the information flow. Table 4 on page 22 lists the four
domains and the relevant capabilities that are included in each domain.
    Figure 1 on page 23 articulates the collection of BI services under each
domain that use interfaces to other inbound, outbound, and


Table 2: Business-infrastructure capabilities

 Value domain      Business capability                     Description
 Business management                                       Plan & manage core business functions. Ability to plan & manage:
                   Financial management                    Cost and revenue across the organization.
                   Customer management                     Demand generation, sales, service, and customer satisfaction.
                   Product management                      Product planning, development, manufacturing, and distribution.
 Innovation & transformation                               Drive growth & competitive advantage. Ability:
                   Synergistic work                        For a team to work together to perform an activity or deliver on a shared objective.
                                                           The team might include people from either within or outside the organization.
                   Consensus & decisions                   To gain necessary consensus among stakeholders and make decisions. Stakeholders
                                                           might involve people from across organizational boundaries.
                   Communication of timely, relevant       To send and receive required information to the appropriate people, when needed.
                   information                             Communicator or receiver might be from either inside or outside the organization,
                                                           and communication may be either upstream or downstream.
                   Sense & respond                         To anticipate, detect, and monitor internal or external events or trends, and to take
                                                           appropriate actions.
                   Authoritative source of Information     To rely upon information in any transaction or decision making.
 Planning & delivery excellence                            Drive operational performance objectives. Ability to:
                   Information orchestration               Consolidate information across business activities or disseminate information to
                                                           appropriate consumer, when and where needed.
                   Governance & compliance                 Ensure that various policies and rules are understood and that the organization
                                                           behaves accordingly.
                   Reporting & analysis                    Create, analyze, and deliver appropriate information, when and where needed.
                   Performance measurement                 Measure, monitor, and communicate appropriate cost and performance metrics of a
                                                           business activity or process.




Table 3: Sample business capability-maturity model

                                                                                      Value-maturity level
 Value domain       Business capability      Level 1                    Level 2                   Level 3                    Level 4
 Business           Customer                 Sales and corporate        Any individual or         An extended                Access and analysis
 management         management               functions can              business function         organization (partners)    can be performed
                                             summarize and report       can understand and        can access sales and       with nominal effort
                                             on sales performance.      monitor sales and         support data for its       anytime, anywhere,
                                                                        marketing data in their   own analysis. People       and by anyone,
                                                                        own context.              can perform trending       including customers.
                                                                                                  and develop forecasts.
 Innovation &       Sense & respond          Local, functional level.   Enterprise level; 24/7;   Include partners and       Worldwide level.
 transformation                              Collect and report on      customer data.            remote locations.          Information includes
                                             operational data.                                    Monitor, consolidate,      industry and market
                                                                                                  and analyze.               research.
 Planning           Information              Orchestrate                Orchestrate               Orchestrate                Orchestrate
 & delivery         orchestration            information at             information at            information across         information across the
 excellence                                  functional level—          enterprise level,         partners (supply           industry—allowing
                                             based on internal          including customer        chain)—analyzed and        what-if scenarios, and
                                             operational data, and      information—              available on multiple      available at point-of-
                                             delivered in report        consolidated and          device types or            interest on any device
                                             form on internal           available via remote      networks.                  or network.
                                             network                    access, team sites,
                                                                        portals, and COTS
                                                                        apps.




dependent services.
    The concept of platform or infrastructure services can be applied across
the whole IT domain. It is expected that BI services will leverage and
integrate with other enterprise services such as security, backup/recovery,
and storage. The complete IT-service portfolio with capability models is
covered in a patent pending on IT Service Architecture Planning and
Management.
    Just as there is a capability-maturity model for business
infrastructure, BI services are also characterized by using a
capability-maturity model. The BI-service maturity model leverages and
extends the Microsoft Infrastructure Optimization (IO) Model.
    Table 5 on page 24 provides an example of BI-service maturity levels.
The model includes a range of attributes, addressing the 360-degree view of
the service.

Table 4: BI services and capabilities

 BI-service domain        Capabilities
 Information delivery     Accessing and delivering information, when
                          needed, on a device or tool through one or
                          more channels
 Information analysis     Aggregating, analyzing, visualizing, and
                          presenting information
 Data integration         Mapping, sharing, transforming, and
                          consolidating data
 Data management          Storing, extracting, loading, replicating,
                          archiving, and monitoring data

Concept #3: Relationships and Road Map

    Don’t reinvent what we already know.

As the common business and technical capabilities are industry-,
organization-, or technology-independent, the relationships or dependencies
between business and technical capabilities are predefined. This allows
organizations to answer quickly such questions as, “What technical
capabilities do we need to enable a level of maturity in a
business-infrastructure capability?” and “What business capability can be
enabled by using a technical capability?”
    Relationships help in rapidly defining the scope, identifying the
dependencies, and communicating value to various stakeholders. Figure 2 on
page 23 provides a framework for leveraging known relationships and maturity
models for developing the overall vision and architecture road map.

Case Study: Assessment and Road-Map Planning
With the information model in hand, the process of aligning and anticipating
business needs, evaluating the current state, articulating the vision,
developing the road map, and leveraging every opportunity to advance the
journey becomes simpler, more predictable, and repeatable.
    Using a real case study and the previously described information model,
let us walk through the process and develop an assessment and road map for
BI.

Situation
Smart grids and smart appliances are changing the landscape in the utility
industry. Depending upon the fluctuation in prices, customers might want to
control their use of energy. This requires real-time pricing and usage
analysis. Based on the customers’ thresholds, they control the appliances in
real time. The business model of the utility industry is also evolving.
Within a few years, utility customers could become suppliers by establishing
their own solar power–generation facilities. The utility company in this
study wanted to ensure that its BI efforts were designed to meet future
challenges and plan the BI architecture for the expected change in the
industry.



Figure 1: BI-service architecture

[Figure not reproduced. Labels recoverable from the diagram:
 External interfaces: Messaging client, Office, Web browser, Phone/PDA,
   Web services.
 BI services & capabilities, in four domains: Information delivery,
   Information analysis, Data integration, Data management. Services shown
   include Distribution, Search, Publishing, Portal, Reporting, Analytics,
   Visualization, Data mining, Scorecard, Dashboard, Master-data management,
   Data exchange, Data mapping, Data transformation, Operational data store,
   Data warehouse, Data marts, Data store, OLAP, ETL, Replication, Archiving.
 Back-office services: Supply chain, Human resources, Billing, Customer
   management, Financials, Workload management, ...
 Foundation: Infrastructure & operations services: Firewall, Load balancing,
   Security, Database, Storage, Monitoring, Remote access, Clustering (HA),
   Directory, Server OS, Backup/recovery, Auditing/logging.]




Figure 2: Relationships and planning framework

[Figure not reproduced. Labels recoverable from the diagram:
 Strategic objectives & drivers; Business functions & processes.
 Business infrastructure: Business management, Innovation & transformation,
   Planning & delivery excellence.
 Predefined relationships.
 BI services: Information delivery, Information analysis, Data integration,
   Data management.
 Maturity levels; Technologies.
 Right-hand labels: "Value (impact) priorities", "Capabilities
   dependencies", "Cost design patterns", "Vision architecture road map".]
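
The predefined relationships described under Concept #3 can be pictured as a two-way lookup between business-infrastructure capabilities and technical BI services. The following is a minimal Python sketch of that idea, not the author's actual relationship map: the capability names are taken from Table 2 and the service names from Figure 1, but every pairing, the `REQUIRES` table, both function names, and the rule that maturity levels are cumulative are assumptions made for illustration.

```python
# Hypothetical relationship map. Capability names follow Table 2 and service
# names follow Figure 1, but every pairing below is invented for illustration.
REQUIRES = {
    ("Reporting & analysis", 2): {"Reporting", "Data marts"},
    ("Reporting & analysis", 3): {"Analytics", "Data warehouse"},
    ("Sense & respond", 2): {"Dashboard", "Data exchange"},
    ("Sense & respond", 3): {"Analytics", "Data mining"},
}


def services_needed(capability: str, level: int) -> set[str]:
    """Technical services needed to bring a business capability to a level.

    Assumes levels are cumulative: reaching level N also needs everything
    listed for the levels below it.
    """
    needed: set[str] = set()
    for (name, required_level), services in REQUIRES.items():
        if name == capability and required_level <= level:
            needed |= services
    return needed


def capabilities_enabled(service: str) -> list[str]:
    """Business capabilities that depend on a given technical service."""
    return sorted({name for (name, _), services in REQUIRES.items()
                   if service in services})


# "What technical capabilities do we need to enable a level of maturity?"
print(sorted(services_needed("Sense & respond", 3)))
# "What business capability can be enabled by using a technical capability?"
print(capabilities_enabled("Analytics"))
```

Keeping the map as plain data means the same table answers both questions quoted in the text, which is the point of treating the relationships as known in advance rather than rediscovering them per project.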



Table 5: BI-service capability maturity—Analytics service

 Attribute        Description               Level 1                        Level 2                      Level 3                       Level 4

 Analytics        Provide ability to
                  analyze data from any
                  source

 Information      What types and formats    Functional or                  Business-process data        Summary and details           Historical, forecast, trends,
                  of information are        departmental data                                           across business entities      KPIs, scorecards; multi-
                  provided?                                                                                                           dimensional; XML formats

 Transactions     What actions can be       Import, create                 Slice and dice               Trend, drilldown and          Predictive analysis, data
                  performed?                                                                            across                        mining

 Access           Who can access the        Desktop applications           Web browser and              Integrated productivity       Embedded LOB
                  capability, and how?                                     analytical tools; remote     suite, Web-based              applications, mobile
                                                                           access                       interactive tools; Internet   devices; available both
                                                                                                        access                        offline and on mobile
                                                                                                                                      devices

 Integration      How is information        E-mail attachments; file-      Web sites, database APIs;    Workspaces, XML-based         Subscription and
                  consolidated,             based; batch                   scheduled                    interface; on-demand          notification; Web services;
                  disseminated, or                                                                                                    real-time
                  integrated?

 Infrastructure   How is this capability    Individual files               Functional portals           IT-provisioned and            Hosting, shared services
                  implemented?                                                                          -managed

 Security         How is information        None                           Access user-managed          Role-based access             Rights management;
                  secured or access                                                                                                   compliance management
                  controlled?

 Performance      What are the service      Response time acceptable       Response time acceptable     Response time acceptable      Anytime, anywhere;
                  levels?                   at major sites; availability   at branch location;          at all user locations; 24/7   scalable
                                            unpredictable or not           unplanned down time          availability
                                            monitored

 Operations       How is this capability    No formal backup/              User-managed backup/         Centrally managed             Automated backup/
                  provisioned, monitored,   recovery procedures            recovery; manual             backup/recovery;              recovery; monitoring
                  or managed for                                           monitoring and               automated monitoring,         exceptions; self-
                  business continuity?                                     provisioning                 reporting and provisioning    provisioning

 Technologies     How is the technology     No standard or guidelines      Multiple technologies and    Standard across one or        Enterprise-wide standard
                  life cycle managed?                                      versions                     more business units
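
Table 5 can be read as a scoring rubric: each attribute of the analytics service is placed at one of four levels. The sketch below shows one way such an assessment might be recorded and compared with a target level. The attribute names are the rows of Table 5; the sample scores, the function names, and the rule that overall maturity equals the lowest attribute level are assumptions of this sketch and are not stated in the article.

```python
# Attribute names are the rows of Table 5; levels run from 1 to 4.
ATTRIBUTES = (
    "Information", "Transactions", "Access", "Integration", "Infrastructure",
    "Security", "Performance", "Operations", "Technologies",
)


def overall_level(assessment: dict[str, int]) -> int:
    """Overall maturity, taken here as the lowest attribute level.

    The weakest-link rule is an assumption of this sketch, not a rule
    stated in the article.
    """
    missing = [a for a in ATTRIBUTES if a not in assessment]
    if missing:
        raise ValueError(f"unassessed attributes: {missing}")
    return min(assessment[a] for a in ATTRIBUTES)


def gaps(assessment: dict[str, int], target: int) -> dict[str, int]:
    """Attributes below the target level, with the number of levels to close."""
    return {a: target - assessment[a]
            for a in ATTRIBUTES if assessment[a] < target}


# Hypothetical assessment of an analytics service.
current = {
    "Information": 3, "Transactions": 2, "Access": 3, "Integration": 2,
    "Infrastructure": 3, "Security": 3, "Performance": 2, "Operations": 1,
    "Technologies": 2,
}

print(overall_level(current))
print(gaps(current, target=3))
```

Scoring each attribute separately keeps the view of the service broad: a service with advanced analysis features but no formal backup or recovery still shows up as immature on operations.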




Process and Deliverable
A simple five-step process and the information model produced the
desired output:

1. Understand the strategic objectives: what is or could be
   driving the need for change.
   Through interviews, 10 significant initiatives or objectives were
   identified that addressed business, regulatory, and operational
   goals. These objectives required improvement in people, process,
   and information capabilities; as such, the case study will focus on
   BI-related capabilities.
2. Identify the required business infrastructure.
   Evaluating an individual business process can be very
   time-consuming. Also, the overall scope and objective of each
   strategic initiative is not always clear. Therefore, instead of
   analyzing various business processes, the focus is to understand
   the maturity of business-infrastructure capabilities in support of
   the strategic initiatives.
       Figure 3 shows the number of strategic initiatives or objectives
   that are enabled by each business-infrastructure capability. This
   mapping identified the top business capabilities that the
   organization must explore and understand for the current and
   desired levels of maturity.
3. Identify and evaluate required business-infrastructure
   capabilities.
   Using the capability-maturity model, the current state was
   assessed, and the desired state of business-infrastructure
   capabilities that are needed to achieve the stated strategic
   objectives was articulated. (See Figure 3 for a visual
   representation of the current and desired maturity levels.) The
   analysis focused on the capabilities that had the greatest impact on
   strategic objectives and the largest relative gaps between the
   current and desired states of capability maturity.
4. Identify and evaluate required technical BI capabilities.
   Using the predefined relationship map between business
   infrastructure and technical capabilities, the relevant technical
   capabilities were identified and evaluated.
       Using the maturity model and knowing the desired state of the
   business capabilities, the maturity map for the technical
   capabilities was developed (see Figure 4).
5. Develop the road map for realizing the vision.
   Based on the business priorities, current and planned projects,
   and dependencies between various capabilities, a statement of
   direction was established. The projects and capabilities delivered
   by these projects were organized along the four BI Services
   domains in short-term, near-term and long-term time horizons.
   The road map was not about building future capabilities today.
   It emphasized beginning with the end in mind by architecting
   current capabilities such that new capabilities can be introduced
   cost-effectively when needed.
       Using the knowledge of the maturity map of both business and
   BI capabilities, the current and desired states of these capabilities,



Figure 3: Business-capability alignment and maturity
[Chart: for each capability, the number of strategic objectives enabled
(scale 1 to 10) and the current-state and desired-state value-maturity
levels (scale 1 to 4).]

  Domain                           Capability
  Business management              Financial management
                                   Customer management
                                   Product management
  Innovation & transformation      Synergistic work
                                   Consensus & decisions
                                   Communication of timely & relevant
                                   information
                                   Sense & respond
  Planning & delivery excellence   Authoritative source of information
                                   Information orchestration
                                   Governance & compliance
                                   Reporting & analysis
                                   Performance measurement

Figure 4: BI-capability maturity
[Chart: current-state and desired-state value-maturity levels (scale 1
to 4) for each service.]

  Domain                 Service
  Information access     Presence
                         Search
                         Portal
                         Delivery
  Information analysis   Reporting
                         Analytics
                         Scorecards
  Data integration       Master-data management
                         Data warehouse
                         Data exchange
  Data management        Data store
                         OLAP
                         ETL
                         Replication
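
The gap analysis behind Figures 3 and 4 can be illustrated with a small
sketch. The article says only that the analysis focused on capabilities
with the greatest impact on strategic objectives and the largest relative
gaps; it does not say how the two were combined. The scoring below
(objectives enabled multiplied by relative gap) is therefore an assumption
made for illustration, and the sample numbers are hypothetical, not values
read from the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    objectives_enabled: int  # strategic objectives supported (scale 1 to 10)
    current_level: int       # current value-maturity level (scale 1 to 4)
    desired_level: int       # desired value-maturity level (scale 1 to 4)

    def relative_gap(self) -> float:
        """Gap between desired and current maturity, relative to desired."""
        gap = max(self.desired_level - self.current_level, 0)
        return gap / self.desired_level

    def priority(self) -> float:
        """Assumed score: impact on objectives weighted by relative gap."""
        return self.objectives_enabled * self.relative_gap()


def prioritize(capabilities):
    """Return capabilities ordered from highest to lowest assumed priority."""
    return sorted(capabilities, key=lambda c: c.priority(), reverse=True)


# Hypothetical sample data for illustration only.
sample = [
    Capability("Reporting & analysis", 8, 2, 4),
    Capability("Financial management", 5, 3, 4),
    Capability("Sense & respond", 3, 1, 3),
]
ranked = [c.name for c in prioritize(sample)]
print(ranked)
```

A different weighting (for example, ranking by gap first and using impact
only to break ties) would be equally consistent with the text; the point
is only that both dimensions feed the ranking.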


   and the statement of direction, the organization is also able to
   plan and execute each new initiative or project, ensuring that
   every investment results in another step in the right direction.

Access to the Model
Microsoft has embodied the model that is presented in this article,
along with a structured approach for developing BI strategy, in a
service offering that is called Assessment and Road Map for Business
Intelligence. Using this service, an organization can gain access to the
model, obtain additional knowledge, and develop its first BI strategy
by using the model.

Conclusion
If an IT organization wants to create possibilities or “happy surprises”
for the business, it has to change its mindset and execution from
“think locally, act locally” to “think locally, act globally.” BI as a
platform service will help organizations deliver a consistent BI
experience while it maximizes ROI. Business infrastructure will
help business stakeholders and users to understand the business
commonalities and IT organizations to anticipate the business needs
and plan the BI road map. Together with business infrastructure and
BI platform, organizations not only can gain business insight, but also
can act upon it.

References
Kumar, Dinesh. “IT Service Architecture Planning and Management.”
U.S. Patent Pending. December 2007.

Ross, Jeanne W., Peter Weill, and David C. Robertson. Enterprise
Architecture as Strategy: Creating a Foundation for Business Execution.
Boston, MA: Harvard Business School Press, 2006.

Sykes, Martin, and Brad Clayton. “Surviving Turbulent Times:
Prioritizing IT Initiatives Using Business Architecture.” The Architecture
Journal, June 2009. http://msdn.microsoft.com/en-us/architecture/
aa902621.aspx.

Microsoft TechNet. Microsoft Infrastructure Optimization (IO) Model.
Available at http://www.microsoft.com/io.

Microsoft Services. Microsoft Assessment and Road Map for Business
Intelligence. Available at http://www.microsoft.com/microsoftservices/
en/us/bpio_bi.aspx.

Acknowledgements
Special thanks to Tom Freeman, Geoff Brock, and Brant Zwiefel from
Microsoft Services for their candid feedback and thorough reviews of
the draft.

About the Author
Dinesh Kumar (dineshk@microsoft.com) is a Principal Architect who
focuses on enterprise architecture and IT planning. His recent research
focuses on understanding business needs, planning, optimizing, and
managing IT as a service organization. Dinesh serves as co-chair of the
enterprise-architecture working group at Innovation Value Institute.
He is also on the CIO advisory board for MISQ Executive, a SIM
publication.


Anatomy of a Failure:
10 “Easy” Ways to Fail a BI Project
by Ciprian Jichici

In the past 10 years, business intelligence (BI) has gone
through the typical journey of a “hot” technology. It started
with the mystery of the first implementations, fresh out of
the academic world; went through the buzzword stage;
and ended up in the technological mainstream. Next to this
horizontal evolution, there has been a vertical one, in which
BI has descended from the top of the organization to the
masses and begun addressing a much broader set of business
needs.
    Regardless of the phases through which it has gone,
BI has been (and, some argue, still is) an architectural
challenge. In real life, BI architects have to deal with multiple
technologies, platforms, and sources of data. Today, we do
have some kind of architectural guidance for most of the
components that play together in complex BI solutions.
Yet we still lack the proper architectural guidance on the
complicated topic of being successful in making them play
nicely together. Finally, the success equation of a BI project
has one more key element, people, who are tightly
connected to processes and culture.
    From an architectural point of view, three of the most
common reasons for failure are the inability to recognize
that BI needs consistent architectural planning; unfortunate
selection of technologies; and over-focusing on technological
issues (such as performance and data volumes) too early in
the process. When it comes to data, failure to reveal relevant
information and granularity mismatch (together with poor
query response times) are two of the “easy” ways to derail a
BI project.
    But perhaps the number-one reason for failure that
is related to data is assuming that BI is synonymous with
having a data warehouse. Finally, from a person’s point of
view, failing a BI project is as easy as assuming that end
users have the know-how to use the tools; not getting them
the appropriate levels of detail; and, of course, going over
budget (which, oddly enough, occurs many times as a result
of previous success).
    It is safe to say that it is quite difficult to get BI right and
quite easy to get it wrong. To summarize the preceding, the
easiest way to fail your BI project is probably a “first build the
plant, then decide what to manufacture” approach.

Ciprian Jichici is participating in the Microsoft Regional
Directors Program as a Regional Director for Romania. Since
2003, he has been involved in many BI projects, workshops,
and presentations. You can read an extended version of the
preceding article at http://www.ciprianjichici.ro.



 Follow up on this topic
     •   MS Business Intelligence: http://www.microsoft.com/bi/




Semantic Enterprise Optimizer and
Coexistence of Data Models
by P. A. Sundararajan, Anupama Nithyanand, and S.V. Subrahmanya




Summary
The authors propose a semantic ontology-driven enterprise data-model
architecture for interoperability, integration, and adaptability for
evolution, by autonomic agent-driven intelligent design of logical
as well as physical data models in a heterogeneous distributed
enterprise through its life cycle.

An enterprise-standard ontology (in Web Ontology Language [OWL] and
Semantic Web Rule Language [SWRL]) for data is required to enable an
automated data platform that adds life-cycle activities to the current
Microsoft Enterprise Search and extends Microsoft SQL Server through
various engines for unstructured data types, as well as many domain
types that are configurable by users through a Semantic-query
optimizer, and that uses Microsoft Office SharePoint Server (MOSS) as
a content and metadata repository to tie all these components together.

Introduction
Data models differ in their structural organization to suit various
purposes. For example, product and organization hierarchies lend
themselves well to the hierarchical model, but they are not
straightforward to represent and access in a relational model (see
Table 1).

Table 1: Data models for various purposes

 Data-model name          Purpose
 Hierarchical             Complex master-data hierarchies (1:M)
                          Very high schema-to-data ratio
 Network                  Complex master-data relationships (M:M)
                          Spatial networks, life sciences, chemical
                          structures, distributed network of
                          relational tables
 Relational               Simple flat transactions
                          Very low schema-to-data ratio
 Object                   Complex master-data relationships, with
                          nested repeating groups
 XML                      Integration across heterogeneous
                          components; canonical; extensible
 File systems             Structured search
 Record-oriented          Primary-key retrieval; OLTP; sequential
                          processing
 Column-oriented          Secondary-key retrieval; analytics;
                          aggregates; large data volume, requiring
                          compression
 Entity-attribute-value   Flexibility; unknown domain; changes
                          often to the structure; sparse; numerous
                          types together

The model is decided by the following factors:

1. Ease of representation and understandability of the structure for
   the nature of data
2. Flexibility or maintainability of the representation
3. Ease of access and understanding of the retrieval, which involves
   the query, navigation, or search steps and language
4. Ease of integration, which is an offshoot of maintainability and
   understanding
5. Performance considerations
6. Space considerations

Depending on the requirement (be it a structured exact search
or a similarity-based unstructured, fuzzy search), we can have a
heterogeneous mix of structured, semistructured, and unstructured
information to give the right context to enterprise users.

While the relational database helped with the sharing of data,
metadata sharing itself is a challenge. Here, enterprise ontology
is a candidate solution for metadata integration, and it leverages
such advances for stable Enterprise Information Integration (EII) and
interoperability, in spite of the nebulous nature of an enterprise.
    Ontologies are conceptual, sharable, reusable, generic, and
applicable across technical domains. They contain explicit knowledge
that is represented as rules and aids in inference. Also, they improve
communication across heterogeneous, disparate components, tools,
technologies, and stakeholders who are part of a single-domain
enterprise.

Evolution of Enterprise Integration
It is interesting to note the evolution of enterprise integration over
time, from simple applications for each specific task in the past to
applications on the Web that can communicate




with each other, achieving larger business functionality and blurring
the boundaries of intranets, the Internet, and enterprises:

1. Data file sharing in file systems
2. Databases
3. ERP, CRM, and MDM
4. ETL, data warehouse
5. Enterprise Information Integration (EII)
6. Enterprise Application Integration (EAI), service-oriented
   architecture (SOA)
7. Semantic Web services

With respect to information, too, the lines between fact and
dimension, data and language, and structured and unstructured are
blurred when a particular type of data morphs over time, with volume
and its importance to and relationship with its environment. A normal
transaction data model can progress from business intelligence and
predictive data mining to machine learning toward a knowledge
model and become actionable in language form, where it can
communicate with other systems and humans.
    Enterprise data needs a common vocabulary and understanding
of the meaning of business entities and the relationships among them.
Due to the variety of vendors who specialize in the functions of an
enterprise, it is common to see heterogeneous data models from
disparate products, technologies, and vendors having to interoperate.
    Data as a service with SOA, enterprise service bus (ESB), and
canonical data models have tried to address this disparity in
schema or structure, but seamless semantic interoperability was not
addressed until the advent of Semantic Web services.
    Semantic enterprise integration through enterprise Web Ontology
Language (OWL) can be a solution for the seamless semantic
interoperability in an enterprise.

Motivation for This Paper: Industry Trends
Accommodating and Coexisting with Diversity
Storing all the data in a row-oriented, third normal form (3NF)
relational schema might not be optimal. We see many trends, such as
various types of storage engines, that are configurable and extensible
in that specific domain data types can be configured and special
domain indexes built, and the optimizer can be made aware of them
to cater to heterogeneous data-type requirements. These
object-oriented semantic extensions are built as applications on top of
the database kernel, and there are APIs for developers to customize and
extend to add their own unstructured or semistructured data types.
This is used in spatial, media, and text-processing extensions that
come with the product. In some cases, native XML data types are also
supported.
    Microsoft Enterprise Search is an example of search across
disparate sources, from e-mail in Microsoft Exchange Server to user
profiles in Active Directory to Business Data Catalog in Microsoft SQL
Server RDBMS and Microsoft Office documents.
    Prominent players have addressed unstructured data in the
form of content-management systems, which again have to be
accommodated in the proper context with traditional enterprise
structured data, both metadata- and content-wise.

Offload to Auxiliary Units
Many database systems support a row-oriented OLTP store for
updated rows and a columnar-compressed store for read-only data,
or a disk-based write-only store and a memory-based read-only store,
moving data between them frequently, according to its life-cycle stage
and the characteristics that it exhibits. Offloading some load to
auxiliary processors that specialize in SQL processing is another
practice that we can observe in data-warehouse appliances.

Intelligent Autonomic Design
Many systems produce or recommend an optimal design based on the
following inputs:

•   Logical schema
•   Sample data
•   Query workload

Some systems have abstracted the file-handling parts that must be
handled at the OS level.
   Oracle Query Advisor and Access Advisor offer best plans, based
on statistics.

Impedance Mismatch and Semantic Interoperability
Object-relational mapping (ORM) is a classic pattern in which there are
two different tools that would like to organize the same data, in two
different ways. Here, the same enterprise data has to be represented
by using an inheritance hierarchy in object-oriented design, whereas
it can be one or more tables in a relational database. LINQ to SQL and
LINQ to Entities (Entity Framework) are some tools that address this
space.
    What if the database is a hierarchical database, or a plain
entity-attribute-value model?
    If the language happens to be a functional programming model,
and the database that we use happens to be an object-oriented
database, a different sort of impedance mismatch might emerge.
So, wherever there are different paradigms that must communicate,
it is better to have an intermediary.
    In this case, the authors propose that the domain model (not
an object model) be represented in an enterprise-wide ontological
model, complete with all business logic, rules, constraints, and
knowledge represented. Each system that must communicate uses
this ontological model as a common denominator to talk to other
systems.
    Another area of impedance mismatch is the one between
a relational SQL model and the OLAP cube Multidimensional
Expressions (MDX). MDX is a language in which the levels of
hierarchical dimensions are semantically meaningful, which is not
the case with tuple-based SQL. Here again, a translation is required.
Instead of going for a point-to-point solution, we might benefit from
a common ontology.
    The application requests a semantic-data-services provider, which
translates the query appropriately to the enterprise ontology model
and, depending on the data-source model, federates the query in
a modified form that is understood by the specific data model of the
data source.
    The domain model is conceptual and could replace or reuse
the conceptual entity-relationship or UML class-object structural
diagrams.

Data-Flow Architecture for the Semantic Enterprise Model
The following sections explain the Semantic enterprise optimizer
that can bridge the gap between the various disparate data models
that can coexist and provide the data services intelligently (see also
Figure 1).
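
As a rough illustration of the common-ontology idea discussed under
Impedance Mismatch and Semantic Interoperability, the sketch below shows
one query, phrased in ontology terms, being rendered for two differently
modeled sources. Everything named here is hypothetical and introduced only
for the example: the OntologyQuery type, the binding tables, the table and
column names, and the generic eav(entity_type, entity_id, attribute, value)
layout. A real enterprise ontology would be expressed in OWL rather than in
Python dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyQuery:
    """A query phrased in ontology terms (hypothetical structure)."""
    concept: str
    attributes: tuple
    filters: tuple = ()  # pairs of (attribute, value), combined with AND


# Hypothetical bindings from ontology terms to each source's own names.
RELATIONAL_BINDING = {
    "Employee": ("hr_employee", {"name": "emp_name", "location": "site_code"}),
}
EAV_BINDING = {
    "Employee": ("EMP", {"name": "NAME", "location": "LOC"}),
}


def to_relational_sql(q):
    """Render the query for a conventional relational table."""
    table, cols = RELATIONAL_BINDING[q.concept]
    sql = "SELECT " + ", ".join(cols[a] for a in q.attributes) + " FROM " + table
    params = []
    if q.filters:
        sql += " WHERE " + " AND ".join(cols[a] + " = ?" for a, _ in q.filters)
        params = [v for _, v in q.filters]
    return sql, params


def to_eav_sql(q):
    """Render the same query for an entity-attribute-value table."""
    etype, attrs = EAV_BINDING[q.concept]
    marks = ", ".join("?" for _ in q.attributes)
    sql = ("SELECT entity_id, attribute, value FROM eav "
           "WHERE entity_type = ? AND attribute IN (" + marks + ")")
    params = [etype] + [attrs[a] for a in q.attributes]
    for a, v in q.filters:
        sql += (" AND entity_id IN (SELECT entity_id FROM eav "
                "WHERE entity_type = ? AND attribute = ? AND value = ?)")
        params += [etype, attrs[a], v]
    return sql, params


TRANSLATORS = {"relational": to_relational_sql, "eav": to_eav_sql}


def federate(q, sources):
    """Return one source-specific (sql, params) pair per target source."""
    return {s: TRANSLATORS[s](q) for s in sources}


query = OntologyQuery("Employee", ("name",), (("location", "HQ"),))
plans = federate(query, ["relational", "eav"])
for source, (sql, params) in plans.items():
    print(source, sql, params)
```

Each new source model needs only one translator from the ontology form,
instead of one translator per pair of systems, which is the advantage over
point-to-point mapping that the text argues for.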




Figure 1: Semantic enterprise optimizer and coexistence of data models
[Diagram components: User request; Semantic enterprise optimizer; Data
insertion/update/deletion from user; Query/search from user; Workflow
and rules enterprise semantic-ontology repository (Workflow repository
for life cycle and data-model progression for data; Rules repository for
life cycle and data-model progression for data); Instance metadata
lineage navigator; Event generator from life cycle–state change of data;
data-model labels: Relational, Distributed.]
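
The interplay of the components in Figure 1, which the following sections
describe, can be sketched in a few lines. This is a minimal illustration
under stated assumptions, not the authors' implementation: the class names,
the stage names, and the store names are hypothetical, and the rules table
loosely follows the employee-record example given later in the article.

```python
from dataclasses import dataclass, field

# Hypothetical rules repository: (data type, life-cycle stage) -> data model.
RULES = {
    ("employee_master", "active"): "online_transaction_store",
    ("employee_master", "left_over_1_year"): "records_management_flat",
    ("employee_master", "left_over_3_years"): "compressed_archive",
}


@dataclass
class LineageNavigator:
    """Keeps, per data instance, the path of stores it has lived in."""
    paths: dict = field(default_factory=dict)

    def record(self, instance_id, store):
        self.paths.setdefault(instance_id, []).append(store)

    def locate(self, instance_id):
        return self.paths[instance_id][-1]


@dataclass
class SemanticOptimizer:
    rules: dict
    navigator: LineageNavigator
    stores: dict = field(default_factory=dict)  # store name -> {id: payload}

    def _place(self, data_type, stage, instance_id, payload):
        # Consult the rules to find the data model for this stage.
        target = self.rules[(data_type, stage)]
        self.stores.setdefault(target, {})[instance_id] = payload
        self.navigator.record(instance_id, target)
        return target

    def insert(self, data_type, instance_id, payload, stage="active"):
        """Handle an insert: choose the data model, store, record lineage."""
        return self._place(data_type, stage, instance_id, payload)

    def on_state_change(self, data_type, instance_id, new_stage):
        """Handle a state-change event: move the item to its new model."""
        current = self.navigator.locate(instance_id)
        payload = self.stores[current].pop(instance_id)
        return self._place(data_type, new_stage, instance_id, payload)

    def query(self, instance_id):
        """Handle a query: ask the navigator where the item currently lives."""
        return self.stores[self.navigator.locate(instance_id)][instance_id]


optimizer = SemanticOptimizer(RULES, LineageNavigator())
optimizer.insert("employee_master", "E1", {"name": "A. Example"})
optimizer.on_state_change("employee_master", "E1", "left_over_1_year")
print(optimizer.navigator.paths["E1"], optimizer.query("E1"))
```

In this sketch the event generator is reduced to a direct call to
on_state_change; in the architecture described here it would be a separate
agent that raises the event when a workflow or rule condition is met.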




Autonomic Evolutionary Logical and Physical Design
Depending on the usage, volume of data, and the life-cycle stage, we
have a proposal for automatic logical and physical data-model design.
    Initially, when a domain is not known with concrete requirements,
an entity-attribute-value model is a good starting point. Here,
the structure is defined by an administrative screen with parameters,
which takes the definition of the record structure and stores the
metadata and data by association.
    At periodic intervals, an agent watches the record types over
a period of time, as well as the actual data in the record values, to
see if the changes have settled down and the structure has become
relatively stable. When the structure has stabilized, the analyzer
determines which type of structure is suitable for the entity, taking
into account the queries, the data and its statistics, the changes to
the structure and its relationship with other entities, and its
life-cycle stage.

Component Model of the Semantic Enterprise Optimizer
Semantic Enterprise Optimizer
The Semantic enterprise optimizer consults the workflow and rules
repositories in case of an insert or state change event, to find out
which data model should accommodate the incoming or state-changed
data item. In case of a query, it consults the instance metadata
lineage navigator to locate the data. Accordingly, it federates the
query to online or archived storages, and across heterogeneous models
and products. Here, SOA-based data and enablement of metadata as a
service is helpful.

Semantic Data Services
The Semantic Data Services extend the features of the data-service
object to enable ontology-driven semantics in its service. The
interface services consult the enterprise ontology for interaction.

Workflow and Rules Enterprise Semantic-Ontology Repository
For each type of data that is classified, we can define the lineage that
it should follow. For example, we can say that an employee-master
record in an enterprise will be entering as master data. It will follow
the semantic Resource Description Framework (RDF) model, in which
relationships for this employee with others in the organization are
defined. The employee master will also be distributed to have the
attendance details in the reporting location; however, salary details
will be in the central office, from where disbursements happen. The
employee record will be maintained in the online transaction systems
for the duration of the employee's tenure with the enterprise. After
the employee has left, the employee record will remain for about one
year for year-over-year reporting, before it moves into a
record-management repository, where it is kept flattened for specific
queries. Then, after three years, it is moved into archival storages,
which are kept in highly



compressed form. But the key identifying information is kept online          of an advantage, but where there are changes in existing flows in life-
in metadata repositories, to enable any background/asynchronous/             cycle states or changes in new data types.
offline checks that might come for that employee later throughout
the life of the enterprise.                                                  Conclusion
    All these changes at appropriate life-cycle stages are defined in        We see the enterprise scene dominated by a distributed graph
the workflow repository, together with any rules that are applicable in      network GRID of heterogeneous models, which are semantically
the rules repository.                                                        integrated into the enterprise; also, that enterprise data continually
                                                                             evolves through its logical and physical design, based on its usage,
Event-Generator Agent                                                        origin, and life-cycle characteristics.
Based on the preceding workflows and rules, if a data item qualifies             Various data models that have been found appropriate or any
for a state change, an event is generated by this component, which           combination thereof can coexist to decide the heterogeneous model
alerts the optimizer to invoke the routine to check the appropriate          of an enterprise. The relational model emphasized that the user need
data model for the data item to move into after the state change.            not know the physical structure or organization of data. In this model,
                                                                             we propose that even the logical model need not be known, and any
Instance Metadata Lineage Navigator                                          enterprise-data resource should be reusable across operating systems,
Every data instance has metadata associated with it. This will involve       database products, data models, and file systems.
attributes such as creation date, created by, created system, the path           The architecture describes an adaptable system that can
that it has taken at each stage of its life-cycle state change, and so       intelligently choose the data model as per the profile of the incoming
on. It will also contain the various translations that will be required to   data. The actual models, applications and life-cycle stages that are
trace that data across various systems. This component helps locate          supported themselves are illustrative. The point is that it is flexible
the data.                                                                    enough to accommodate any model that might be invented
                                                                             in the future. Adaptability and extensibility are takeaways from this
Data-Model Universe                                                          architecture.
This is the heterogeneous collection of data models that is available            Also, dynamic integration of enterprise boundaries will lead
for the optimizer to choose when a data item is created and,                 to more agility and informed decisions in the increasing business
subsequently, changes state:                                                 dynamics.

1. Master and reference data—largely static; MDM; hierarchy;                 Acknowledgements
   relationships; graph; network; RDF                                        The first two authors are thankful to the third author,
2. OLTP engine—transaction; normalized                                       S. V. Subrahmanya, Vice President at E&R, Infosys Technologies Ltd.,
3. OLAP cube engine—analytics; transaction life cycle completed;             for seeding and nurturing this idea, and to Dr. T.S. Mohan, Principal
   RDF analytics for relationships and semantic relations                    Researcher at E&R, Infosys Technologies Ltd., for his extensive and
4. Records management or file engine—archived data; for data                 in-depth review.
   mining, compliance reporting                                                  The authors acknowledge and thank the authors and publishers
5. Object and object-relational databases for unstructured                   of the papers, textbooks, and Web sites that are referenced. All
   information—image databases; content-based information                    trademarks and registered trademarks that are used in this article
   retrieval                                                                 are the properties of their respective owners/companies.
6. Text databases for text analytics, full-text search, and natural-
   language processing                                                       References
7. XML engines for integration of distributed-transaction processing         Larson, James A., and Saeed Rahimi. Tutorial: Distributed Database
8. Stream processing—XML                                                     Management. Silver Spring, MD: IEEE Computer Society Press, 1985.
9. Metadata—comprises RDF, XML, hierarchy, graph integration of
   heterogeneous legacy databases in terms of M&A, and partnering            Hebeler, John, Matthew Fisher, Ryan Blace, Andrew Perez-Lopez, and
   for providing collaborative solutions                                     Mike Dean (foreword). Semantic Web Programming. Indianapolis, IN:
                                                                             Wiley Publishing, Inc., 2009.
Here, the query must be federated, and real-time access has to be
enabled with appropriate semantic translation.                               Powers, Shelley. Practical RDF. Beijing; Cambridge: O’Reilly &
    Sensors or machine-learning systems are programmed to detect             Associates, Inc., 2003.
when a data item moves to a new life-cycle stage. When such changes
                                                                             Chisholm, Malcolm. How to Build a Business Rules Engine: Extending
are detected, the record is moved accordingly from transaction               Application Functionality Through Metadata Engineering. Boston:
management to OLAP or data mining, or archival location, as per              Morgan Kaufmann; Oxford: Elsevier Science, 2004.
the lineage.
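As a rough illustration of how the workflow rules, the event-generator agent, and the optimizer might cooperate, the following Python sketch moves the employee record from the earlier example through its life-cycle stages. The store names, the EmployeeRecord class, the function names, and the reading of the one-year and three-year thresholds as time elapsed since the employee's exit are all assumptions made for this sketch; the article does not prescribe an implementation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Hypothetical store names; the data-model universe in the article is richer.
OLTP, RECORDS, ARCHIVE = "oltp", "records_management", "archive"


@dataclass
class EmployeeRecord:
    employee_id: str
    exit_date: Optional[date] = None          # None while still employed
    store: str = OLTP                         # where the record lives now
    lineage: List[str] = field(default_factory=lambda: [OLTP])


def target_store(record: EmployeeRecord, today: date) -> str:
    """Workflow rule (assumed thresholds): where the record belongs today."""
    if record.exit_date is None:
        return OLTP
    years_since_exit = (today - record.exit_date).days / 365.25
    if years_since_exit < 1:
        return OLTP          # kept online for year-over-year reporting
    if years_since_exit < 3:
        return RECORDS       # flattened, for specific queries
    return ARCHIVE           # compressed archival storage


def generate_event(record: EmployeeRecord, today: date) -> Optional[str]:
    """Event-generator agent: return the target store if a move is due."""
    target = target_store(record, today)
    return target if target != record.store else None


def optimize(record: EmployeeRecord, today: date) -> EmployeeRecord:
    """Optimizer: act on the event and extend the record's lineage."""
    target = generate_event(record, today)
    if target is not None:
        record.store = target
        record.lineage.append(target)
    return record


if __name__ == "__main__":
    rec = EmployeeRecord("E1", exit_date=date(2005, 6, 30))
    for check_date in (date(2006, 1, 1), date(2007, 1, 1), date(2009, 1, 1)):
        optimize(rec, check_date)
    print(rec.store, rec.lineage)
```

The lineage list plays the part of the instance metadata: it stays available after the record itself has moved, so a later query can be directed to the store that currently holds the data.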
    So, when information is requested, the optimizer, based on the           Vertica Systems. Home page. Available at http://www.vertica.com
business rules that are configured, is able to find out which engine         (visited on October 16, 2009).
should be able to federate that query, based on the properties of the
search query, and appropriately translate it into hierarchical, OLAP, or     Microsoft Corporation. Enterprise Search for Microsoft. Available
file-system query.                                                           at http://www.microsoft.com/enterprisesearch/en/us/default.aspx
    In regular applications that are developed, we might not see much        (visited on October 16, 2009).




30


G-SDAM. Grid-Enabled Semantic Data Access Middleware. Available at         About the Authors
http://gsdam.sourceforge.net/ (visited on October 18, 2009).               P. A. Sundararajan (sundara_rajan@infosys.com) is a Lead in the
                                                                           Education & Research Department with ECOM Research Lab at Infosys
W3C. “A Semantic Web Primer for Object-Oriented Software                   Technologies Ltd. He has nearly 14 years’ experience in application
Developers.” Available at http://www.w3.org/TR/sw-oosd-primer/             development and data architecture in Discrete Manufacturing,
(visited on October 18, 2009).                                             Mortgage, and Warranty Domains.

Oracle. Oracle Exadata. Available at http://www.oracle.com/database        Anupama Nithyanand (anupama_nithyanand@infosys.com) is a
/exadata.html (visited on October 21, 2009).                               Lead Principal in the Education & Research Department at Infosys
                                                                           Technologies Ltd. She has nearly 20 years’ experience in education,
                                                                           research, consulting, and people development.

                                                                           S. V. Subrahmanya (subrahmanyasv@infosys.com) is currently Vice
                                                                           President at Infosys Technologies Ltd. and heads the ECOM Research
                                                                           Lab in the Education & Research Department at Infosys. He has
                                                                           authored three books and published several papers in international
                                                                           conferences. He has nearly 23 years’ experience in the industry and
                                                                           academics. His specialization is in Software Architecture.




Thinking Global BI: Data-Consistency Strategies for Highly                 •   Minimize fact table columns, and maximize dimension
Distributed Business-Intelligence Applications                                 attributes. The single biggest performance bottleneck in I/O for
by Charles Fichter                                                             building and replicating MOLAP stores is large fact tables that
                                                                               have excessive column size. Large columns (attributes) within the
The need for centralized data-warehousing (DW) systems to update               associated dimension tables can be processed far more efficiently.
and frequently rebuild dimensional stores, and then replicate to               Extending dimension tables to a snowflake pattern (further
geographies (data marts), can create potential consistency challenges          subordinating dimension tables) for extremely large DW sizes can
as data volumes explode. In other words: Does your executive in                further increase efficiencies, as you can utilize partitioning across
Japan see the same business results in near-real time as headquarters          further increase efficiencies, as you can utilize partitioning across
executives in France? Throw in write-back into the dimensional             •   If centralized DW, consider lightweight access (browser).
stores, plus a growing need to support mobile users in an offline              Utilizing tools such as SQL Server Report Builder, architects can
capacity, and suddenly you’ve inherited larger challenges of consistent        provide summary data by designing a set of fixed-format reports
business-intelligence (BI) data. A consistent data view across the global      that are accessible via a browser from a Reporting Services server.
enterprise is a result of DW-performance optimizations that occur at           By enabling technologies such as Microsoft PowerPivot for Excel
design time.                                                                   2010—formerly known as codename “Gemini” (to be available H1
                                                                               2010)—users can download cubes for their own manipulation in
Here are some quick tips for thinking global BI:                               tools such as Office Excel 2010. PowerPivot utilizes an advanced
                                                                               compression algorithm that can greatly reduce the physical data
•   Look for optimizations locally first. Seek ways in which to                size that is crossing the wire—to occur only when a “self-service”
    create and manage Multidimensional Online Analytic Processing              request is initiated directly by a user.
    (MOLAP) stores that are close to the users who consume them. You’ll
    likely find that 80 percent or more of BI reporting needs are local/   You can learn more about Microsoft’s experiences in working directly
    regional in nature. Effective transformation packages (using tools     with customers in extremely large DW and BI implementations by
    such as Microsoft SQL Server Integration Services [SSIS]) or even      visiting the SQL CAT team Web site at http://sqlcat.com/.
    managing data synchronization directly through application code
    for asynchronous/mobile users (such as Sync Services for ADO.NET) can
    often be more flexible than replication partnerships.                  Charles Fichter (cfichter@microsoft.com) is a Senior Solution Architect
•   As much as possible, utilize compression and read-only                 within the Developer Evangelism, Global ISV (Independent Software
    MOLAP for distribution. Many DW vendors have enabled write-            Vendor) team at Microsoft Corporation. For the past four and a
    back capabilities into the MOLAP stores. Use these judiciously, and    half years, Charles has focused on assisting Global ISVs with their
    minimize them to a smaller subset of stores.                           application-design strategies.


                                                                                                                                                  31
Lightweight SOAs:
Exploring Patterns and Principles of
a New Generation of SOA Solutions
by Jesus Rodriguez and Don Demsak




Summary                                                                  Simple enough, right? An ideal SOA infrastructure should resemble
                                                                         Figure 1.
This article explores some of the most common                                We can all agree that Figure 1, at least in theory, represents an
challenges of traditional service-oriented architectures                 ideal architecture for enterprise applications. Unfortunately, large
                                                                         SOA implementations have taught us that the previous architecture is
(SOAs) and discusses how those challenges can be
                                                                         just that: an ideal that is permeated by enormous challenges in areas
addressed by using more scalable, interoperable, and                     such as versioning, interoperability, performance, scalability, and
agile alternatives that we like to call “lightweight SOAs.”              governance.
                                                                             These challenges are a direct consequence of the lack of
Introduction                                                             constraints in SOA systems. Architecture styles that do not impose
During the last few years, we have witnessed how the traditional         constraints in the underlying domain quite often produce complex
approach to service orientation (SOA) has constantly failed to deliver   and unmanageable systems. Instead of simplifying the capabilities of
the business value and agility that were a key element of its value      SOA and focusing on the important aspects (such as interoperability,
proposition. Arguably, we can find the causes of this situation in       performance, and scalability), we decided to abstract complexity with
the unnecessary complexity intrinsic to traditional SOA techniques       more standards and tools. As a result, we have been building systems
such as SOAP, WS-* protocols, or enterprise service buses (ESBs). As     that present similar limitations to the ones that originated the SOA
a consequence, alternate lightweight SOA implementations that are        movement in the first place.
powered by architecture styles such as Representational State Transfer       One thing that we have learned from Ruby on Rails is the
(REST) and Web-Oriented Architectures (WOA) are slowly proving to        “convention over configuration” or “essence vs. ceremony” mantra.
be more agile than the traditional SOA approach.                         By removing options and sticking with conventions (that is, standard
                                                                         ways of doing something), you can remove extra levels of abstraction
SOA: Architecting Without Constraints                                    and, in doing so, remove unneeded complexity of systems. Embrace
SOA has been the cornerstone of distributed systems in the last few      the options, when needed, but do not provide them just for the sake
years. Essentially, the fundamental promise of SOA was to facilitate     of giving options that will be underutilized.
IT agility by implementing business capabilities by using interoperable
interfaces that can be composed to enable new business capabilities. From
an architectural style standpoint, traditional SOA systems share a set of
characteristics, such as the following:

•    Leveraging SOAP and WSDL as the fundamental standards for service interfaces
•    Using WS-* protocols to enable mission-critical enterprise-service capabilities
•    Deploying a centralized ESB for abstracting the different service orchestrations
•    Using an integration server for implementing complex business processes
•    Using an SOA-governance tool to enable the management of the entire SOA

Figure 1: Ideal SOA (labels: Service, Integration server, BPM, Enterprise service bus (ESB), SOA governance, LOB)


32

Figure 2: WSDL dependency, a big challenge in large SOA implementations (labels: Service, WSDL v1.0, New version, WSDL v2.0, Client)

we can all agree that SOAP has failed to meet its original expectations.
Think about it: SOAP was originally coined as the Simple Object Access
Protocol; but, as we all learned, it is not simple, is not about object
access, and, arguably, is not a protocol.

WSDL Abuse
The Web Services Description Language (WSDL) is one of the
fundamental specifications in the SOA portfolio. The purpose of
WSDL is to describe the capabilities of a service, such as the messages
that it can send and receive or how those messages are encoded
by using the SOAP protocol. Conceptually, WSDL represents an
evolution of previous description
Although this article does not attempt to present a forensic analysis          languages, such as the Interface Description Language (IDL), which
of failed SOA initiatives, we think that it is worth highlighting some         was the core of distributed programming technologies such as COM
of the factors that developers should consider when they implement             and CORBA. Following a similar pattern, WSDL quickly became the
large SOA solutions. Given the extent of this article, we decided to           fundamental artifact that client applications used to generate “proxy”
concentrate on the following topics:                                           representations that abstract the communication with the service.
                                                                                    Undoubtedly, the idea of generating proxy artifacts that are
 • SOAP and transport abstraction                                              based on a service's WSDL can
 • Abuse of description languages                                              facilitate client-service interactions in very simple environments. As with
 • ESB complexity                                                              its predecessors, the fundamental challenge of this approach is that
 • WS-* interoperability                                                       it introduces a level of coupling between the client and the service.
 • SOA governance                                                              In large SOA scenarios that involve hundreds of services and clients,
                                                                               that level of service-client dependency is typically the cause of serious
The remainder of this article will dive deep into each of these                service-management and versioning challenges, as Figure 2 illustrates.
topics and will explore alternative architecture styles that can help
developers implement more robust SOA solutions.                                To ESB or Not to ESB
                                                                               The seamless integration of heterogeneous line-of-business (LOB)
SOAP and the Illusion of Transport Abstraction                                 systems as part of business processes has been one of the promises
The current generation of SOA solutions evolved around the                     of enterprise SOA systems. These integration capabilities are normally
concepts of the Simple Object Access Protocol (SOAP). This protocol            powered by a set of common patterns that constitute the core
was originally designed to abstract services from the underlying               of what the industry has considered one of the backbones of SOA:
transport protocol. Conceptually, SOAP services can be hosted by               the enterprise service bus (ESB).
using completely different transports, such as HTTP and
MSMQ. Although in theory it is a lovely idea, in practice
we have learned that SOAP's reliance on transport
neutrality comes at a significant cost, one that can
be reduced when you do not need that neutrality. One
of the best examples of the limitations of transport
neutrality is the use of HTTP with SOAP-based service
endpoints.

Figure 3: Central ESB (labels: Service, ESB)

    The vast majority of SOAP-based services rely on
HTTP as the transport protocol. However, the SOAP
HTTP binding uses only a very small subset of the HTTP
specification, reduced to the POST method and a couple
of headers. As a result, SOAP HTTP–based services do
not take advantage of many of the features that have
made HTTP one of the most popular and scalable
transport protocols in the history of computer systems.
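To make the contrast concrete, the following Python sketch builds (but does not send) two requests for the same hypothetical order: a SOAP call that tunnels through POST, and a plain HTTP GET that can carry a conditional-request header for caching. The endpoint URLs, the GetOrder operation, and the example.org namespace are invented for illustration and do not come from any real service.

```python
import urllib.request
from typing import Optional

# Hypothetical SOAP 1.1 envelope for an invented GetOrder operation.
SOAP_ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetOrder xmlns="http://example.org/orders"><OrderId>42</OrderId></GetOrder>
  </soap:Body>
</soap:Envelope>"""


def build_soap_request(endpoint: str) -> urllib.request.Request:
    """SOAP over HTTP: always a POST, with the operation hidden in the body."""
    return urllib.request.Request(
        endpoint,
        data=SOAP_ENVELOPE.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"http://example.org/orders/GetOrder"',
        },
        method="POST",
    )


def build_resource_request(
    base_url: str, order_id: int, etag: Optional[str] = None
) -> urllib.request.Request:
    """Plain HTTP: the URL names the resource and GET stays cacheable."""
    headers = {"Accept": "application/xml"}
    if etag:
        # Conditional GET: the server may answer 304 Not Modified.
        headers["If-None-Match"] = etag
    return urllib.request.Request(
        f"{base_url}/orders/{order_id}", headers=headers, method="GET"
    )


if __name__ == "__main__":
    soap = build_soap_request("http://example.org/OrderService")
    rest = build_resource_request("http://example.org", 42, etag='"abc"')
    print(soap.get_method(), soap.full_url)
    print(rest.get_method(), rest.full_url)
```

Because the SOAP request is an opaque POST to a single endpoint, intermediaries cannot cache it or tell a read from a write; the GET request exposes both facts through the method, the URL, and standard headers.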
    If we have inherited something good from SOAP, it
has been the use of XML, which has drastically increased
the interoperability of distributed applications. However,


                                                                                                                                                     33

Although there is no formal industry standard that defines
what an ESB is, at least we can find a set of commonalities
across the different ESB products. Typically, an ESB
abstracts a set of fundamental capabilities such as
protocol mapping, service choreographies, line-of-business
(LOB) adapters, message distribution,
transformations, and durability. Theoretically, we can use
these sophisticated features to abstract the communication
between services and systems, making the ESB the
central backbone of the enterprise, as Figure 3 on page 33
illustrates.

Figure 4: ESB reality in a large enterprise (labels: Service, ESB)

    As architects, we absolutely have to love the diagram
in Figure 3. Undoubtedly, it represents an ideal model in
which messages are sent to a central service broker and
from there distributed to the final services. Unfortunately,
if we are working in a large SOA, we are very likely to
find that a central bus architecture introduces serious
limitations in aspects such as management, performance,
and scalability, as our entire integration infrastructure
now lives within a proprietary framework. Instead of
being a facilitator, an ESB can become a bottleneck for the agility,         solutions. Interoperability is by far the most challenging aspect of
governance, and scalability of our SOA. Consequently, we are forced          WS-*–based solutions, as different Web-service toolkits sometimes
to start building applications that do not fully leverage the ESB, and       implement different WS-* protocols, different versions of the same
our architecture quickly starts to resemble the diagram in Figure 4.         protocols, or even different aspects of the same specification.
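The concentration of traffic on a central bus can be sketched in a few lines of Python. The CentralBus class and the service names below are hypothetical and stand in for no particular ESB product; the point is only that every message, whatever its target, passes through the same component.

```python
from collections import Counter
from typing import Callable, Dict


class CentralBus:
    """Hypothetical hub: every message is routed through this one object."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[dict], dict]] = {}
        self.load: Counter = Counter()  # messages handled per component

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        self.handlers[name] = handler

    def send(self, target: str, message: dict) -> dict:
        self.load["bus"] += 1     # the bus touches every message
        self.load[target] += 1    # each service sees only its own share
        return self.handlers[target](message)


bus = CentralBus()
for name in ("billing", "shipping", "inventory"):
    # Bind name as a default argument so each handler keeps its own value.
    bus.register(name, lambda msg, name=name: {"handled_by": name, **msg})

for target in ("billing", "shipping", "inventory", "billing"):
    bus.send(target, {"order": 42})

print(bus.load)
```

However the work is spread across services, the load on the bus equals the total message count, which is why it tends to become the scaling and governance bottleneck in a large deployment.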
                                                                             Additionally, the adoption of WS-* has fundamentally been reduced
WS-* Madness                                                                 to the .NET and Java ecosystems, which makes it completely
After the first wave of SOA specifications was created, several              impractical to leverage emerging programming paradigms such as
technology vendors began a collaborative effort to incorporate               dynamic or functional languages into SOA solutions (see Figure 5).
some key enterprise capabilities such as security, reliable messaging,
and transactions into the SOAP/WSDL model. The result of this                SOA Governance or SOA Dictatorship
effort materialized in a series of specifications such as WS-Security,       Management and governance must be at the center of every
WS-Trust, and WS-ReliableMessaging, which the industry baptized              medium to large SOA. While we keep enabling business capabilities
as WS-* protocols. Beyond the undeniable academic value of the               via services, it is important to consider how to version, deploy,
WS-* specifications, they have not received wide adoption in                 monitor, and manage them. This set of functionalities has been the
heterogeneous environments.                                                  bread and butter of SOA-governance platforms that have traditionally
    The limited WS-* adoption in the enterprise can ultimately               evolved around the concepts of the Universal Description, Discovery,
be seen as a consequence of the incredibly large number of WS-*              and Integration (UDDI) specification.
specifications that have been produced during the last few years.                Depending on our implementation, we might find that SOA-
Currently, there are more than a hundred different versions of WS-*          governance platforms are sometimes too limited for truly managing
protocols, of which just a handful have seen real adoption in SOA            complex Web services. These types of challenges are very common
in SOA-governance solutions and are a consequence of the fact that
Web-service technologies have evolved faster than the
corresponding SOA-governance platforms.
    SOA-governance technologies have traditionally addressed
those limitations by relying on a centralized model in which
services are virtualized in the governance-hosting environments
and policies are enforced at a central location. Although this
model can certainly be applicable in small environments, it presents
serious limitations in terms of interoperability, performance, and
scalability (see Figure 6 on page 35).

Figure 5: WS-* interoperability challenges (labels: WCF client, Ruby client, Oracle WebLogic client, WCF service (secured by using WS-Trust), Apache Axis2 service (hosted by using JMS transport))


Introducing Lightweight SOAs
Conceptually, SOAs can be a very powerful vehicle for delivering true business value to enterprise applications. However, some of the challenges that are explained in the previous sections have caused SOA initiatives to become multiyear, multimillion-dollar projects that failed to deliver the promised agility.

Despite these challenges, correctly enabled SOAs can be a great differentiator that delivers true business value in an enterprise. However, we believe that a lighter, more agile approach that leverages the correct implementation techniques and emerging architecture styles is mandatory to implement SOA solutions correctly.

The remaining sections of this article will introduce some of the patterns and architecture techniques that we believe can help address some of the challenges that were presented in the previous section. Fundamentally, we will focus on the following aspects:

•   Leveraging RESTful services
•   WS-* interoperability
•   Federated ESB
•   Lightweight SOA governance
•   Embracing cloud computing

Figure 6: Centralized SOA-governance models, impractical in large SOA implementations
[Diagram: clients reach services through a central governance layer that provides virtualization, authorization, monitoring, authentication, and policy enforcement]

Embracing the Web: RESTful Services
In previous sections, we have explored various limitations of the fundamental building blocks of traditional SOA systems such as XML, SOAP, WSDL, and WS-* protocols. Although Web Services are transport-agnostic, the vast majority of implementations in the real world leverage HTTP as the underlying protocol. Those HTTP-hosted services should (in theory, at least) work similarly to Web-based systems. However, the different layers of abstractions that we have built over the HTTP protocol limit those services from fully using the capabilities of the Web.

To address some of these challenges, SOA technologies have started to embrace more Web-friendly architecture styles, such as Representational State Transfer (REST). REST has its origins in Roy Thomas Fielding's Ph.D. dissertation, which states the principles that make the Web the most scalable and interoperable system in the history of computer software. Essentially, REST-enabled systems are modeled in terms of URI-addressable resources that can be accessed through stateless HTTP interactions. Following the principles of REST, we can architect highly scalable services that truly leverage the principles of the Web.

Undoubtedly, REST has become a very appealing alternative to SOAP/WS-* Web Services. The use of REST addresses some of the limitations of traditional Web Services such as interoperability and scalability.

The following are capabilities of RESTful services:

•   URI-addressable resources
•   HTTP-based interactions
•   Interoperability
•   Stateless interactions
•   Leveraging syndication formats
•   Multiple resource representation
•   Link-based relationships
•   Scalability
•   Caching
•   Standard methods

The simplicity from the consumer perspective and high levels of interoperability of RESTful services are some of the factors that can drastically improve the agility of the next generation of SOA solutions.

WS-* Interoperability
Despite the remarkable academic work that supports the WS-* family of protocols, we can all agree that its adoption and benefits did not reach initial expectations. Interoperability and complexity remain important challenges that we face in the adoption of WS-* protocols in the enterprise.

When it comes to WS-* interoperability, the absolute best practice is to identify the capabilities of the potential consumers of our services. Based on that, we can determine which WS-* protocols are best suited for our particular scenarios. In highly heterogeneous environments, we should consider enabling different service endpoints that support various WS-* capabilities. This approach can drastically improve the interoperability of our services, given that the different clients can interact with the service endpoint that interoperates best with them.

For example, let us consider a scenario in which we must secure a Web Service that is going to be consumed by .NET, Sun Metro, Oracle WebLogic, and Ruby clients. In this scenario, we could enable three service endpoints with different security configurations, based on the consumer capabilities, as Figure 7 on page 36 illustrates.

Even in scenarios in which we decide to use WS-* protocols, the technique that Figure 7 illustrates helps us improve interoperability by enabling various endpoints that capture the interoperability requirements of the different service consumers.

Figure 7: Pattern of multiple service endpoints
[Diagram: WCF, Metro, Oracle WebLogic, and Ruby clients reach one service through three endpoints: a WS-Trust endpoint, a WS-Security endpoint, and a custom security endpoint]
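
The multiple-endpoint pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the endpoint URLs, the `CLIENT_CAPABILITIES` table, and the `select_endpoint` helper are hypothetical, and the mapping of each client stack to the protocols it supports is invented for the example rather than taken from any vendor documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """One address of the service plus the security protocol it requires."""
    url: str
    security: str


# Hypothetical endpoints for a single service, most capable protocol first.
ENDPOINTS = [
    Endpoint("https://example.org/orders/trust", "WS-Trust"),
    Endpoint("https://example.org/orders/wssec", "WS-Security"),
    Endpoint("https://example.org/orders/custom", "custom"),
]

# Assumed capabilities of each consumer stack (illustrative only).
CLIENT_CAPABILITIES = {
    "wcf": {"WS-Trust", "WS-Security", "custom"},
    "metro": {"WS-Trust", "WS-Security"},
    "weblogic": {"WS-Security"},
    "ruby": {"custom"},
}


def select_endpoint(client: str) -> Optional[Endpoint]:
    """Return the first endpoint whose security protocol the client supports."""
    supported = CLIENT_CAPABILITIES.get(client, set())
    for endpoint in ENDPOINTS:
        if endpoint.security in supported:
            return endpoint
    return None  # No endpoint interoperates with this client.


if __name__ == "__main__":
    for name in CLIENT_CAPABILITIES:
        print(name, "->", select_endpoint(name))
```

The service logic stays the same behind every endpoint; only the security configuration differs, so adding a new kind of consumer means adding an endpoint rather than changing the service.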

Lightweight Federated ESBs
As explored in previous sections, a centralized ESB very often is one of the fundamental causes of failed SOA initiatives. The ability to centralize very smart functionalities such as message routing, transformation, and workflows is as appealing as it is unrealistic in medium-to-large enterprise environments. Essentially, by relying on the clean concept of the central service bus, we drastically constrain the options for scalability, specialization, and management of the enterprise solutions that leverage our SOA infrastructure.

After several failed attempts to implement centralized ESBs in large organizations, the industry has moved to a more agile pattern in which the functionality is partitioned across multiple lightweight physical ESBs that are grouped as a federated entity. This pattern is commonly known as federated ESB and represents one of the emerging architecture styles for building highly scalable ESB solutions (see Figure 8).

The federated-ESB pattern addresses the fundamental limitations of the centralized-ESB model by partitioning the infrastructure into separate ESBs that can be scaled and configured separately. For instance, in this model, we can have a specific ESB infrastructure to host the B2B interfaces, while another ESB is in charge of the financial-transaction processing. This approach also centralizes certain capabilities (such as security or endpoint configuration) that do not impose any scalability limitation on the SOA infrastructure.

Lightweight Governance
The limited adoption of UDDI in large-scale SOA environments has been a catalyst for the emergence of lighter and more interoperable SOA-governance models that leverage new architecture styles, such as REST and Web 2.0. Essentially, these new models look to remove some of the complexities that are bound to the centralized, UDDI-based architectures, replacing them with widely adopted standards such as HTTP, Atom, and JSON.

One of the most popular new SOA-governance models is the idea of a RESTful Service Repository. In this model, traditional SOA constructs such as services, endpoints, operations, and messages are represented as resources that can be accessed through a set of RESTful interfaces. Specifically, the use of resource-oriented standards such as Atom and the Atom Publishing Protocol (APP) is very appealing to represent and interact with SOA artifacts (see Figure 9 on page 37).



Figure 8: Pattern of federated ESB
[Diagram: three ESBs, each with its own group of services, and the capabilities business-activity monitoring, security, business rules, operational monitoring, service registry, and error handling]
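
The federated-ESB pattern of Figure 8 can be sketched as follows. This is a minimal sketch, not a real ESB product: the `LightweightEsb` and `EsbFederation` classes, the message shapes, and the caller names are all hypothetical. It only shows the two ideas from the text, namely that each domain owns its own bus, and that a capability such as security can still be applied once, centrally.

```python
from typing import Callable, Dict, Set

Handler = Callable[[dict], str]


class LightweightEsb:
    """A small bus that owns the routes for one business domain."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._routes: Dict[str, Handler] = {}

    def register(self, message_type: str, handler: Handler) -> None:
        self._routes[message_type] = handler

    def dispatch(self, message: dict) -> str:
        # Route by message type; each ESB can be scaled independently.
        return self._routes[message["type"]](message)


class EsbFederation:
    """Groups domain ESBs and applies the shared security check once."""

    def __init__(self, allowed_callers: Set[str]) -> None:
        self._allowed_callers = allowed_callers
        self._members: Dict[str, LightweightEsb] = {}

    def join(self, domain: str, esb: LightweightEsb) -> None:
        self._members[domain] = esb

    def send(self, caller: str, domain: str, message: dict) -> str:
        if caller not in self._allowed_callers:
            raise PermissionError(f"{caller} is not authorized")
        return self._members[domain].dispatch(message)


# Hypothetical setup: one ESB for B2B interfaces, one for financial transactions.
b2b = LightweightEsb("b2b")
b2b.register("purchase-order", lambda m: f"b2b accepted order {m['id']}")
finance = LightweightEsb("finance")
finance.register("payment", lambda m: f"finance posted payment {m['id']}")

federation = EsbFederation(allowed_callers={"partner-portal"})
federation.join("b2b", b2b)
federation.join("finance", finance)

result = federation.send("partner-portal", "finance", {"type": "payment", "id": 7})
```

Because routing tables live inside each member ESB, a change to the B2B routes never touches the bus that handles financial transactions.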





This model represents a lighter, more flexible approach to both design-time and runtime governance. For example, runtime-governance aspects such as endpoint resolution are reduced to a simple HTTP GET request against the RESTful interfaces. The main advantage of this type of governance is probably the interoperability that is gained by the use of RESTful services, which will allow us to extend our SOA-governance practices beyond .NET and J2EE to heterogeneous technologies such as dynamic or functional languages.

Figure 9: RESTful registry
[Diagram: a client calls the resolve-service endpoint of the RESTful registry, and services call its register-service endpoint]
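
To make the idea of endpoint resolution as a plain GET more concrete, here is a small sketch of the registry's resource model. It is an assumption-laden illustration: the `/services/...` paths, the JSON document shape, and the `handle` function are hypothetical, HTTP is simulated by an ordinary function call instead of a real server, and JSON is used in place of Atom to keep the example short.

```python
import json
from typing import Dict, Optional, Tuple

# In-memory store: resource path -> service description (hypothetical shape).
_services: Dict[str, dict] = {}


def handle(method: str, path: str, body: Optional[str] = None) -> Tuple[int, str]:
    """Dispatch one stateless request and return (status code, JSON body)."""
    if method == "PUT" and path.startswith("/services/"):
        # Register (or update) a service as a URI-addressable resource.
        created = path not in _services
        _services[path] = json.loads(body or "{}")
        return (201 if created else 200), json.dumps(_services[path])
    if method == "GET" and path in _services:
        # Resolve a single service: a simple read of the resource.
        return 200, json.dumps(_services[path])
    if method == "GET" and path == "/services":
        # List every registered service by its path.
        return 200, json.dumps(sorted(_services))
    return 404, json.dumps({"error": "not found"})


# A service registers itself, then a client resolves its endpoint.
handle("PUT", "/services/orders", json.dumps({"endpoint": "https://example.org/orders"}))
status, document = handle("GET", "/services/orders")
endpoint = json.loads(document)["endpoint"]
```

Any platform that can issue an HTTP request and parse the response can take part in this kind of governance, which is the interoperability argument made above.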

Welcoming the Cloud
The emergence of cloud-computing models is steadily changing the way in which we build distributed systems. Specifically, we believe that the next generations of SOA solutions are going to be a hybrid of cloud and on-premises services and infrastructure components. The influence of cloud computing is by no means limited to the ability to host enterprise Web services in private or public clouds. Additionally, there are specific SOA-infrastructure components that can be directly enabled from cloud environments as a complement to the traditional on-premises SOA technologies (see Figure 10).



Figure 10: Enhancing on-premises SOAs with cloud infrastructures
[Diagram: cloud services (Internet service bus, security-token service, and data services) sit above two on-premises SOAs, each with clients, an enterprise service bus, and services]




Let us look at some examples:

•    Cloud-based service bus: Can we host an ESB in a cloud infrastructure? Absolutely! This type of ESB can enable capabilities such as message routing, publish-subscribe, transformations, and service orchestration, which are the cornerstones of on-premises ESBs.
•    Cloud-based security services: In the last few years, we have witnessed the increasing adoption of security services such as Windows Live ID or Facebook Connect. Leveraging cloud security infrastructures can facilitate the implementation of interoperable security mechanisms such as authentication, identity representation, authorization, and federation on Internet Web-service APIs.
•    Cloud-based storage services: Arguably, storage services such as Amazon S3 or Azure DB are the most appealing capability of cloud infrastructures. Leveraging these types of service can drastically increase the flexibility and interoperability of the data-exchange mechanisms of our SOA, while removing some of the complexities that are associated with on-premises storage.

Conclusion
The traditional approach to SOA introduces serious challenges that make it impractical for large implementations. This article suggests a series of patterns that can help developers build lighter, interoperable, and scalable SOAs that can enable true business agility in large enterprise scenarios.

Transport Abstraction
•    Consider first standardizing on the HTTP protocol. HTTP is a great lightweight alternative, and it can interoperate with more frameworks.
•    Do use SOAP & WS-* when transactions, durable messaging, or extreme (TCP/IP) performance is required.

SOAP & WSDL
If you have already decided to standardize on HTTP, there is little need for SOAP and WSDL. Learn to embrace technologies such as REST, JSON, and Atom Pub, and use the Web to the fullest extent.

•    Do not use SOAP and WSDL unless you are sure that you need the services that they provide.
•    Consider using REST, JSON, and Atom Pub as lightweight alternatives.
•    Do not fall into the trap of generating WSDLs as a side effect of creating SOAP-based services. Think contract first.

Governance & Discoverability
If you do not have WSDL, how can you govern your corporate services? By using a service registry, of course! UDDI failed, but that does not mean that a service repository was not needed, just that UDDI was too complex and looking to solve the wrong issues. By using a lightweight service registry that is built upon RESTful services, you can still supply governance and discoverability, without the complexity of UDDI.

•    Do store your service artifacts in a sort of repository, not just as an option on an endpoint.
•    Do use your service repository to help govern the services of your corporation.
•    Consider using a RESTful service repository for SOAP and RESTful services, for governance and discoverability.

Enterprise Service Bus
To ESB or not to ESB: That is the question. Point-to-point communications are strongly coupled and easy to implement. But point-to-point communications, by their very nature, are brittle, tend to stagnate, and limit the business-intelligence opportunities that are embedded in the messages.

•    Do not confuse ESBs with event-processing systems. They are similar, but have different scales and performance requirements.
•    Consider federated ESBs, as they address the limitations of a centralized ESB (spoke and hub).
•    Do not reproduce your strongly coupled point-to-point patterns within your ESB by simply moving the code to the bridge.
•    Consider using pub-sub over request-response when you are building distributed systems.

Cloud-Based Services
Everything in the cloud: That seems to be where we are headed. Microsoft was a bit early with its My Services concept, but more and more services are headed towards the cloud.

•    Consider cloud-based security services over local, proprietary security for public-facing services. It is arguably the most mature of the cloud-based services.
•    Consider the possibility of future enhancements to take advantage of cloud-based storage and the cloud-based service bus.

The most important thing to keep in mind when you are building your enterprise services is the mantra "convention over configuration." By keeping the number of options to a minimum and building only what is required by the business, you can create lighter-weight services that are easier to maintain and enhance.

About the Authors
Jesus Rodriguez (Jesus.Rodriguez@tellago.com) is the Chief Architect at Tellago, Inc. He is also a Microsoft BizTalk Server MVP, an Oracle ACE, and one of a few architects worldwide who is a member of the Microsoft Connected Systems Advisor team.

Don Demsak is a Senior Solution Architect at Tellago, based out of New Jersey, who specializes in building enterprise applications by using .NET. He has a popular blog at www.donxml.com and is a Microsoft Data Platform MVP, and a member of the INETA Speakers Bureau.

Follow up on this topic
•   WCF JSON support: http://msdn.microsoft.com/en-us/library/bb412173.aspx
•   RESTful WCF: http://msdn.microsoft.com/en-us/library/bb412169.aspx
•   MS BizTalk ESB Toolkit: http://msdn.microsoft.com/en-us/library/ee529141.aspx
•   Windows Server AppFabric: http://msdn.microsoft.com/appfabric

22   subscribe at
     www.architecturejournal.net

Or can you more are needed to understand how to deliver against the larger, longer- effectively build a distributed warehouse and BI strategy, focusing term objectives. Remember the 80/20 rule: Most of the time, you can on local/regional needs along with corporate? This is explored in deliver 80 percent of the required features from 20 percent of the more depth in the next section, as there might be significant cost effort. Consider a simplified approach in phases, as shown in Figure 1. Figure 1: Proposed global BI/DW phases Phase ratio: Complexity BI depth/capabilities Cost Delivery time Phase 1 Phase 2 Phase 3 Dynamic, without published MOLAP store: Local geography, published MOLAP store: Global DW (traditional, DW-centric model): “Show me what is happening in my business “Show me historical performance in my “Show me trends and predictive future.” now.” business.” or or Application & MOLAP DW server; local Local application & Application server (such geography (such as reporting server as Office SharePoint with Office SharePoint SSRS & SSAS) with SSRS & SSAS) MOLAP (Can be OLTP * MOLAP physically local, OLTP * Data marts central, or a Aggregate extract pulled to build published to combination) MOLAP dimensional store geographies Further summary aggregation OLTP database with posted to serve global Aggregate an analysis &/or enterprise; centralized DW extract pulled reporting engine (possible future phases 2 & 3) to build MOLAP dimensional OLTP * OLTP * store * For visual simplification, OLTP represents all disparate application data sources. 3 The Architecture Journal 22
Phase 1
If BI reporting needs are infrequent enough, leave as much source data in place, and build multidimensional views through a distributed caching strategy to deliver a subset of business solutions quickly. Can you merely query efficiently by using advanced vendor tools against existing application stores, even across competing platforms? This is an important decision point that you will face early on. This approach can address BI concerns such as, “Tell me what is happening in my business now.” This strategy can be most effective with noncompatible vendor technologies, or when strong divisional boundaries exist between data stores. It is far more efficient (in terms of person-hour commitment from your team), and you will be able to deliver solutions quicker to the business units.
While this might result in longer wait time for users, using tools such as SSIS can help you build packages to retrieve, aggregate, and clean large quantities of data (even from non-Microsoft technology stores), build the dimensional relationships in memory cache, and present them to requesting applications via Microsoft SQL Server Analysis Services (SSAS). In the past, this could be practical only for relatively small quantities of data; however, database vendor optimizations are making this scenario a real choice.

Phase 2
Whenever possible, build and keep the dimensional stores locally. With large volumes of aggregated data, companies can begin more intense mining to address BI concerns such as, “Show me historical performance.” To begin here, it is essential that you critically analyze how much aggregate data replication to a centralized store truly is needed. Can you more effectively populate local MOLAP stores to represent regional business-analysis needs? When you examine regional needs versus larger corporate needs, you will likely find that more of the deeper, drilldown BI reporting needs are local/regional in nature. Enterprise-wide BI queries tend to be broader, trend-based analysis that can be supported by summary aggregations from smaller dimensional stores. Inexpensive MOLAP stores can be effectively maintained by local resources and greatly reduce the complexity of a central warehouse design, as well as mitigate potentially large network congestion and replication needs.
Consider beginning your efforts in a smaller, less-strategic division of the company or geography to test your team’s ability. This design approach is almost an inverse of the traditional DW and downstream data-mart approach: Instead, the smaller, regionalized MOLAP stores become aggregation feeds to a larger, further aggregated, summary DW store for broader trending analysis. Although business trends are pushing highly globalized patterns, the need for in-depth regional mining is increasing, too; and relying solely on a centralized DW pattern can require a massive investment in both physical and people resources to maintain and might prove overly costly, fraught with performance challenges, and overly cumbersome to enhance or change.

Data-Integration Strategy
by Derek E. Wilson

Today, enterprises collect and store data more easily than ever in a variety of applications. Many of these applications are inside the firewall, while some are in the cloud and others in remote locations. All too often, application data is not collected and used across an organization for consistent analysis and business decisions. The ability to collect and analyze this data can benefit your company, if it is treated as an organizational asset.
To create an enterprise business-intelligence (BI) architecture, you must identify the core subject areas of your businesses, such as the following:

Customer
Product
Employee
Inventory

When these have been identified, you must further define what attributes should be collected for an enterprise view. The fields that you define will be the backbone of your enterprise BI platform. For instance, if a customer relationship management (CRM) system will allow you to capture data that shows that a customer has three children, must this piece of information be migrated to the enterprise BI system, to let everyone in the organization leverage it to make better business decisions? Knowing what attributes you must store for the collective good of the organization will enable you to begin data integration.
By leveraging Microsoft SQL Server and Microsoft BizTalk Server, an enterprise BI and data-integration strategy can be developed to integrate and store critical subject areas. The database structure should be abstracted by using an enterprise integration pattern that is known as the canonical data model. This model requires all incoming data to meet a user-defined pattern. For instance, an enterprise BI system might require the following fields:

First Name, Last Name
Address
City, State, ZIP Code
Phone Number
E-mail

Source applications likely store other information, such as mobile number, gender, and age.
BizTalk Server can be leveraged to receive messages from various applications and write the information in the appropriate database tables.
When the data has been collected, it can be stored in an online analytical processing (OLAP) cube and then presented to business users for decision-making. The process of loading and adding calculations to a cube allows everyone in the business to leverage the work that is done to create value from the data. As users access the cube, they get consistent answers to queries; and, when new calculations are requested, everyone benefits from the additions.
By identifying, integrating, and creating central OLAP stores, an organization can leverage data as an asset across the company.

Derek E. Wilson is a BI Architect and Manager in Nashville, TN. Visit his Web site at www.derekewilson.com.
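The canonical data model that the sidebar describes can be illustrated with a minimal Python sketch. It is not BizTalk Server code and does not use any BizTalk API; the source-application names ("crm", "erp"), their field names, and the helper to_canonical are hypothetical and exist only for illustration. The sketch shows the core idea: every incoming record is mapped onto the one user-defined pattern (the fields listed in the sidebar), and extra source attributes such as mobile number or age are left behind.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CanonicalCustomer:
    """The user-defined pattern that every incoming customer record must meet."""
    first_name: str
    last_name: str
    address: str
    city: str
    state: str
    zip_code: str
    phone_number: str
    email: str


# Hypothetical mappings: source-application field name -> canonical field name.
FIELD_MAPS = {
    "crm": {
        "FirstName": "first_name", "LastName": "last_name",
        "Street": "address", "City": "city", "State": "state",
        "Zip": "zip_code", "Phone": "phone_number", "Email": "email",
    },
    "erp": {
        "given_nm": "first_name", "family_nm": "last_name",
        "addr_line": "address", "town": "city", "region": "state",
        "postal_cd": "zip_code", "tel_no": "phone_number", "mail": "email",
    },
}


def to_canonical(source: str, record: dict) -> CanonicalCustomer:
    """Map one source record onto the canonical pattern.

    Fields that are not part of the pattern (mobile number, gender, age)
    are ignored; a record that cannot fill the pattern is rejected.
    """
    mapping = FIELD_MAPS[source]
    values = {
        canonical: str(record[src]).strip()
        for src, canonical in mapping.items()
        if record.get(src) not in (None, "")
    }
    required = {f.name for f in fields(CanonicalCustomer)}
    missing = sorted(required - set(values))
    if missing:
        raise ValueError(f"{source} record is missing canonical fields: {missing}")
    return CanonicalCustomer(**values)
```

Under these assumptions, a CRM record and an ERP record that describe the same customer produce equal CanonicalCustomer values, which is what allows a downstream OLAP cube to give consistent answers regardless of the originating application.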
Phase 3
Build and populate a traditional, independent, centralized DW of your dreams to reach all of the more ambitious BI needs of your company. This approach will address the harder BI concerns such as the ever-elusive BI goldmine, “Predict future results,” which can be accomplished only by analysis of trends across often voluminous, company-wide historical data.
While historical trending and data mining can be performed across geographies (read, utilizing or aggregating further from Phase 2 [or even Phase 1] repositories), to get the raw reporting and drilldown-supported dashboard experience against very large, corporate-wide historical data, a centralized DW implementation most likely will be the most effective choice. However, many successful BI projects will likely find a blend between the Phase 2 and Phase 3 approaches.

Designing Effective, Performant, Maintainable Dimensional Storage
As data warehousing has evolved, what once was a static strategy of replicating large, read-only stores for reporting has become a far more dynamic environment in which users are given expansive powers such as building their own ad-hoc queries, self-service reporting (using tools such as PowerPivot, previously codenamed “Gemini” and an extension of Office Excel 2010 that will be available in the first half of 2010 and enables users to pull down massive dimensional data, to the tune of more than 100 million rows, for real-time, cached pivoting), and even write-back capabilities directly into the dimensional stores.
The power of the tools that are available to you at design time can greatly affect the strength of your models, assist visually with overcoming the complexity of the relationships, and reveal potential bottlenecks, poor query structure, and ineffective mining semantics. Through the use of the SSAS designer within Business Intelligence Development Studio (BIDS), the architect is given a comprehensive set of tools for designing and optimizing dimensional stores and queries against those stores (see Figure 2).

Figure 2: SSAS Designer, dimensional modeling in BIDS

Listed here are a few key DW principles to remember when you are designing your dimensional models to maximize performance later (more comprehensive articles on this subject and advanced SSAS design can be found at http://technet.microsoft.com/en-us/magazine/2008.04.dwperformance.aspx and at http://www.ssas-info.com/analysis-services-papers/1216-sql-server-2008-white-paper-analysis-services-performance-guide):

1. The overwhelming majority of MOLAP data will grow in your fact tables. Constrain the number of measures in your fact tables, as query processing is most effective against narrow-columned tables. Expand the depth of attributes in the supporting dimension tables. The benefit of breaking dimension tables into further subdimension tables, when possible (snowflake pattern), is hotly debated, although this approach generally gives more flexibility when one considers scale-out models and utilizing indexing and performance-enhancing technologies such as partitioning.
2. Implement surrogate keys for maintaining the key relationship between fact and dimension tables, instead of enforcing foreign-key constraints, which can often manifest as compound keys that cover several columns. A surrogate key is an integer-typed identity column that serves as an artificial primary key of the dimension table. This approach can minimize storage requirements and save storage/processing overhead for maintaining indexes.
3. While OLTP stores traditionally are highly normalized, this is far less important for dimensional stores. Denormalized data has certain advantages in extremely large stores, namely the reduction of joins that are required in queries. In addition, database products such as SQL Server 2008 utilize a highly optimized bitmap filtering (also known as “bloom filter”) that eliminates largely redundant and irrelevant data from query processing during star joins.
4. Prototype, prototype, prototype. Your physical design might work well for initial static-reporting needs; however, as users become accustomed to the new data that is available to them, and as their expertise grows, you will need to support ad-hoc querying in a substantial way. Ad-hoc queries have the capability to create explosive data sets, depending upon the structure of your historical modeling within your dimensions. Most database products support test/modeling partitioning. Spend ample time understanding the overall impact to your environment (including indexing/maintenance) when ad-hoc support is considered. Implementing self-service BI products such as PowerPivot, instead of open ad-hoc query support, can greatly ease demands on the server by delivering large chunks of dimensional data directly down to the client for ease of manipulation by the user for drilldown.
5. Implement a clustered index for the most common fact table surrogate keys, and nonclustered indexes upon each of the remaining surrogate keys. As per the previous item #4, the addition of ad-hoc query support can greatly affect your indexing strategy; however, for overall common-usage patterns, this indexing strategy has proven efficient.
6. Use partitioned-table parallelism. Because of the growth of fact tables over time, most DW architects implement a partition strategy (breaking up the fact table over physical storage devices). This is most commonly performed by using a date column, but it can be performed as a range that is based on usage patterns (supported by SQL Server 2008). SQL Server 2008 implements a new partitioned-table parallelism (PTP) feature that highly optimizes queries over partitioned tables by executing queries in parallel across all available multicore processors. For more detailed information on PTP and other new DW enhancements in SQL Server 2008, visit http://technet.microsoft.com/en-us/library/cc278097.aspx.
7. Implement a compression strategy. Over time, data warehouses can become huge. Overhead for compression can often be overcome by the reduction in redundant data, thereby reducing query processing time, as well as providing maintenance benefits for size of data storage. However, the general strategy remains to implement compression on the least-used data. Compression can be applied in many different ways, including at partition, page, row, and others. As per item #4, the most efficient pattern will require extensive prototyping in your model.

Figure 3: SSIS data-flow representation in BIDS
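Principles 1, 2, and 5 in the list above (a narrow fact table, integer surrogate keys in place of compound natural keys, and an index per surrogate key) can be sketched with a small, self-contained Python example. This is an illustration only: it uses the sqlite3 module from the Python standard library rather than SQL Server, SQLite has no clustered/nonclustered distinction, and all table names, column names, and sample rows are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dim_product (
    product_key         INTEGER PRIMARY KEY,  -- surrogate key (integer identity)
    source_system       TEXT NOT NULL,        -- natural key, part 1
    source_product_code TEXT NOT NULL,        -- natural key, part 2
    product_name        TEXT,
    category            TEXT,
    UNIQUE (source_system, source_product_code)
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,             -- for example 20100115
    year     INTEGER,
    month    INTEGER
);
-- Narrow fact table: two surrogate keys and two measures, and no declared
-- foreign-key constraints (the load process guarantees the relationship).
CREATE TABLE fact_sales (
    date_key     INTEGER NOT NULL,
    product_key  INTEGER NOT NULL,
    units_sold   INTEGER NOT NULL,
    sales_amount REAL    NOT NULL
);
CREATE INDEX ix_fact_sales_date    ON fact_sales (date_key);
CREATE INDEX ix_fact_sales_product ON fact_sales (product_key);
"""


def get_product_key(conn, source_system, source_code, name, category):
    """Return the surrogate key for a natural key, creating the row if needed."""
    row = conn.execute(
        "SELECT product_key FROM dim_product "
        "WHERE source_system = ? AND source_product_code = ?",
        (source_system, source_code),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_product (source_system, source_product_code, "
        "product_name, category) VALUES (?, ?, ?, ?)",
        (source_system, source_code, name, category),
    )
    return cur.lastrowid  # equals product_key for an INTEGER PRIMARY KEY


def load_sales(conn, rows):
    """Load (date_key, system, code, name, category, units, amount) rows."""
    for date_key, system, code, name, category, units, amount in rows:
        conn.execute(
            "INSERT OR IGNORE INTO dim_date (date_key, year, month) VALUES (?, ?, ?)",
            (date_key, date_key // 10000, date_key // 100 % 100),
        )
        key = get_product_key(conn, system, code, name, category)
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (date_key, key, units, amount),
        )


def sales_by_category(conn):
    """Star join: the fact table joins to the dimension on one integer column."""
    return conn.execute(
        "SELECT p.category, SUM(f.units_sold), SUM(f.sales_amount) "
        "FROM fact_sales AS f "
        "JOIN dim_product AS p ON p.product_key = f.product_key "
        "GROUP BY p.category ORDER BY p.category"
    ).fetchall()


def build_demo():
    """Create an in-memory store with a few hypothetical sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    load_sales(conn, [
        (20100115, "erp", "SH-1", "Trail Shoe", "Footwear", 3, 150.0),
        (20100115, "pos", "0042", "Rain Jacket", "Outerwear", 1, 80.0),
        (20100116, "erp", "SH-1", "Trail Shoe", "Footwear", 2, 100.0),
    ])
    return conn
```

The design choice to note is that the compound natural key (source system plus product code) is stored once, in the dimension table, while every fact row carries only a single integer. In SQL Server the same shape would typically add a clustered index on the most commonly filtered surrogate key and partition the fact table by date, as items 5 and 6 describe.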
Consolidation, Aggregation, and Population of the Stores
Fortunately, when you have determined what data to access and designed your data-warehouse topology, aggregation and population is becoming a more simplified task, thanks to advancements in vendor tools. Using tools such as SSIS within BIDS (which ships with SQL Server Enterprise edition), an architect can build a “package” that can fetch, manipulate/clean, and publish data. SSIS comes with data connectors that enable the architect to design packages to access all the major database platforms (including Oracle, IBM, Teradata, and others) and a comprehensive set of data-manipulation routines that are conveniently arranged in a visual toolbox for easy drag-and-drop operation in a visual data-flow model (see Figure 3).
SSIS allows for highly sophisticated packages, including the ability to loop through result sets; the ability to compare data from multiple sources; nested routines to ensure specific order or compensation/retry rules when failure occurs during the data-collection exercise (for instance, the remote server in Central America is down for maintenance); and sophisticated data transformation.
While SSIS can be compared to most traditional extract, transform, and load (ETL)-type tools, it offers a far richer set of features to complement the dynamic MOLAP environment of a DW that is implemented with SQL Server 2008 and Analysis Services (for instance, sophisticated SSIS packages can be embedded within the DW and called dynamically by stored procedures to perform routines that are dictated by conditional data that is passed from within the Analysis Services engine). Many Microsoft ISVs have even built sophisticated SSIS package-execution strategies that are being called from within Windows Workflow Foundation (WF) activities. This gives the added possibility of managing highly state-dependent (or long-running) types of data-aggregation scenarios.
For a more detailed overview of SSIS, visit http://download.microsoft.com/download/a/c/d/acd8e043-d69b-4f09-bc9e-4168b65aaa71/ssis2008Intro.doc.

Highly Distributed Data, Global Geography, Mobile-User Data Marts, Ad Hoc Query, and Data-Synchronization Considerations
Perhaps the most daunting challenge for enterprise architects who are designing a DW solution that supports BI applications is to understand the potential impact of a design upon the physical network assets. Will the environment be able to sustain replication of massive amounts of dimensional data on ever-increasing historical data?
A key to success is utilizing effective read-only snapshot replication of predetermined aggregated subsets. This means a careful analysis of the needs of each independent geography and the use of advanced features within database products, such as extensive data compression and “sparse” attributes on columns. SPARSE is a new column attribute that is available in SQL Server 2008 and removes the physical storage cost of null values. Because data warehouses are typically full of enormous amounts of null field values (not every shoe from every store in every region is purchased every day), the compression and removal of the physical size of null value fields is essential. Many traditional data-warehouse products do not have this capability, and SQL Server 2008 has many performance advantages for the replication of large volumes of dimensional data.
Another effective strategy is to grant write-back and ad-hoc query capabilities judiciously. These features by necessity create larger overhead in design to support downstream geographies and can greatly increase replication requirements and the possible reconstitution of the aggregated dimensional stores (a very expensive operation, as data volumes increase in size).
Fortunately, data replication has become a seamless art, thanks to improvements within database vendors. Management of replication partnerships, rollback on failures, and rules on data-consistency violations all can be handled effectively within the replication-management console. Together with significant performance enhancements of both the replication engine and physical data size, global enterprises can rely upon the SQL Server 2008 toolset to meet even the most demanding data-warehouse and downstream data-mart strategies.
But what about those power users who demand offline BI analysis in a mobile fashion? Using the Microsoft platform, you can deliver powerful BI even in a disconnected paradigm by utilizing SQL Server CE or Express Edition on the client. Microsoft has worked with dozens of Global ISVs that have designed application suites that utilize large dimensional cube data on the client in a disconnected mode. You can establish replication strategies within the client-side database for when the mobile user is connected, or application designers can implement Synchronization Services for ADO.NET and manage data replication that is specific to the workflow needs within the application. For more information, visit http://msdn.microsoft.com/en-us/sync/default.aspx.

Conclusion
Enabling BI solutions for your enterprise first requires a significant investment into an advanced understanding of the corporate data within your access. For querying extremely large data volumes that likely include historical references (data over time), dimensional storage models such as data-warehouse design patterns (including MOLAP and others) have proven the most efficient strategy. For more detailed guidance in implementing your DW and BI solutions by utilizing SQL Server 2008, visit http://www.microsoft.com/sqlserver/2008/en/us/white-papers.aspx. In addition, for real-world scale-out experiences, visit the best-practices work that has been done by the SQL CAT team at http://sqlcat.com/.

About the Author
Charles Fichter ([email protected]) is a Senior Solution Architect within the Developer Evangelism, Global ISV (Independent Software Vendor) team at Microsoft Corporation. A 20-year veteran software and database architect, Charles specializes in business intelligence, Microsoft SQL Server, and Microsoft BizTalk Server, as well as general .NET middle- and data-tier application and database design. For the past four and a half years, Charles has focused on assisting Global ISVs with their application-design strategies. Other recent publications by Charles include the following:

• “Business Intelligence, Data Warehousing and the ISV” (white paper)
• “Microsoft Manufacturing Toolkit” (video)
• “Emery Forwarding: Freight Forwarder Realizes a 102 Percent ROI with BizTalk Server” (case study)

Follow up on this topic
• SQL Server Integration Services (SSIS): http://msdn.microsoft.com/en-us/library/ms141026.aspx
• SQL Server Analysis Services (SSAS): http://www.microsoft.com/sqlserver/2008/en/us/analysis-services.aspx
• PowerPivot: http://www.powerpivot.com/
    BI-to-LOB Integration: Closing theCycle by Razvan Grigoroiu Summary BI solutions must be designed with actionable information in mind The business value of business intelligence (BI) will be and must offer a better integration with LOB applications that will attained only when it leads to actions that result in execute the action. increased revenue or reduced cost. This article discusses Agility Through Actionable Information the necessity of architecting BI solutions that focus on A classical BI solution would consist of four main logical layers: ETL, actionable data and information flow back to line-of- a data warehouse, OLAP cubes, and presentation (that is, analysis business (LOB) applications. tools). Data flows through these four layers in a way that is similar to the Introduction following: A set of ETL jobs run periodically to gather information While the final output of business intelligence (BI) is traditionally from LOB data sources such as ERP systems and other applications considered to be scorecards and reports that are used as strategic that service a business need or their underlying transactional decision support, we will see that this is no longer enough. BI can take databases. a greater role in ensuring that the value of information is harnessed to Data is transformed according to the need of the particular its potential. BI implementation and loaded into a data warehouse (that is, The value of information is quantified by the action that is taken a database that is modeled in a denormalized star schema that is following analysis and its result. In order to maximize the value, optimized for a decision-support perspective). BI applications must streamline the decision and action processes. From there, it is then stored in an analysis-friendly The business value of BI information is attained only when it results multidimensional structure such as OLAP cubes. 
in a business operation that has the effect of increasing revenue or On the presentation layer, Microsoft Office Excel is one of the reducing cost. more popular analysis tools that is used today. It offers a well-known A BI report might indicate increased demand for a particular user interface that can be utilized by users who occupy different roles product; but this information has value only if it leads to placed in the enterprise—from executives and business analysts to buyers purchase orders and distributions to stores, so that sales figures go up. and other operational staff. Such transactional operations are executed and managed with Using Office Excel 2007 and later releases, users can browse their the help of line-of-business (LOB) applications, such as Enterprise data that is available in OLAP cubes by using pivot tables to get an Resource Planning (ERP) systems. BI solutions that implement a better overview on how the business is performing. integration to LOB systems will ultimately provide the enterprise with Potential problems or opportunities can be highlighted by using a better return on investment (ROI). color schemes, and the user can drilldown analyze the particular Recent advances in hardware and software technologies, such as information and conclude that an action must be taken. Microsoft SQL Server Integration Services (SSIS), allow up-to-date and Such an action is usually an operation on the LOB application from quality data to be summarized and made available to the information which the transactional data was loaded. Users might switch then to user in near-real time. the LOB application and perform the required business operation. The times when it took many person-days of IT staff effort to In most cases, however, this is tedious work and disconnects from the gather such data for presentation in quarterly meetings are now analysis context. In many cases, when it comes to this part, users have things of the past. 
expressed a desire for an automated solution. It is not uncommon for extract, transform, and load (ETL) processes In order to streamline decision-making processes, decrease to collect data from LOB applications and underlying transactional response time, and track the causality of such operations, this action databases on an hourly basis or even more frequently. can be started from within the decision-support context and triggered There is available a wealth of near-real-time information that can from the analysis tool (Office Excel, in this example). be utilized by LOB users to streamline operational processes. This step closes the data cycle from BI back to LOB applications Because of this, today’s BI solutions can take a more active role (see Figure 1 on page 9) and encapsulates decision-making logic. in the operational space by offering a closer integration back to The next generations of BI applications can be designed in such a LOB applications. Information should be presented in an interactive, way that users can act on the information easily. This will bridge the actionable way, so as to allow users to act on it. gap between BI analysis and operational processes, and will lead to This article argues that in order to attain the business value of more cohesive enterprise solutions. BI information, the final output of BI must be an action that triggers Information utilization is maximized, and BI can become in this a business operation on LOB applications. To achieve this goal, way a driver for business-process optimization. Actions that are taken 8 The Architecture Journal 22
  • 11.
    based on BI information can be better tracked; effects can be better analyzed, which will ultimately lead to better business performance.

    Figure 1: LOB-BI-LOB data flow
    [Diagram labels. Operations: line-of-business applications, transactions, line-of-business transactional database. Business intelligence: data warehouse (SQL Server), OLAP cubes (SSAS with URL actions), information analysis (Office Excel). Flows: data load (ETL, SSIS); intelligence as actionable information; action, closing the cycle.]

    BI solutions that provide actionable information allow companies to stay agile with business changes by reducing the time between the moment that a business event occurs and when an action is taken.

    Actionable information in this sense must contain two parts:

    • The condition in which an action is required (for example, on-hand is below a minimum level).
    • An action that defines the response when the condition occurs (for example, perform an operation such as order items to restock). It must specify how the business action can be invoked and where it can be reached (the endpoint).

    Based on the principle of separation of concerns, and to allow a layered approach between a data tier (such as an OLAP back end) and presentation tier (that is, an analysis tool), the actionable information belongs on the data tier.

    Actions in this case would need to be consumer-agnostic and triggerable from any analysis tool.

    Furthermore, the action implementation represents a business operation and, as such, is governed by LOB-specific concerns and business rules. Therefore, it must be executed by the corresponding LOB application.

    Whenever we are faced with data integration between different technology layers, and loose coupling becomes a prime concern, a service-oriented approach comes naturally to mind as an architectural solution.

    In this case, a service-oriented architecture (SOA) can represent the glue between the different components that must interact, and can provide the governing concepts of such interaction.

    But, when it comes to service orientation and BI, where are we today?

    Actionable Information and SOA

    An SOA calls for different functional elements to be implemented as interoperable services that are bound only by interface contracts. This allows for a flexible, loosely coupled integration. Different components in such architectures can evolve naturally over time, or they can be exchanged without affecting the solution as a whole.

    While LOB systems have been at the forefront of SOA adoption, BI products have moved more slowly to support such architectures.

    Traditionally, BI-technology vendors have offered monolithic OLAP solutions that use analysis capabilities that cannot be extended with actionable information to simplify end-to-end BI-to-LOB integration.

    In recent years, however, we have seen some solutions open up and offer more interoperability features that bring us closer to service orientation.
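The two-part definition of actionable information given above (a condition, plus an action that says how and where it can be invoked) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, the field names, the `OrderItems` operation, and the `lob.example.com` endpoint are all hypothetical and do not come from any product API.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping


@dataclass(frozen=True)
class ActionableInfo:
    """One piece of actionable information: a condition plus an action."""
    caption: str                                      # text shown to the analyst
    condition: Callable[[Mapping[str, float]], bool]  # when the action is required
    operation: str                                    # how the action is invoked (hypothetical name)
    endpoint: str                                     # where the action can be reached (hypothetical URL)


def available_actions(cell: Mapping[str, float],
                      actions: List[ActionableInfo]) -> List[ActionableInfo]:
    """Return the actions whose condition holds for the given cell values."""
    return [action for action in actions if action.condition(cell)]


# Example from the text: on-hand is below a minimum level, so order items to restock.
restock = ActionableInfo(
    caption="Order items to restock",
    condition=lambda cell: cell["on_hand"] < cell["minimum_level"],
    operation="OrderItems",
    endpoint="http://lob.example.com/inventory",
)

low_stock = {"on_hand": 3, "minimum_level": 10}
healthy_stock = {"on_hand": 12, "minimum_level": 10}
```

Because the condition and endpoint live in the data structure rather than in a particular analysis tool, any consumer can evaluate `available_actions` and hand the invocation to the LOB application, which is the consumer-agnostic property that the text asks for.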
  • 12.
    Guerrilla BI: Delivering a Tactical Business-Intelligence Project
    by Andre Michel Vargas

    “If ignorant both of your enemy and yourself, you are certain to be in peril.”
    SUN TZU

    Business intelligence (BI) offers a distinct competitive advantage. By transforming data into knowledge through BI, you can determine your strengths and weaknesses while better understanding your position in the marketplace. Whether it’s cost reduction, process optimization, increasing revenue, or improving customer satisfaction, it’s more important than ever to run your business better and faster. Knowledge is critical to being agile and adaptive in a changing economy; a 1 percent head start can make all the difference. However, successful BI initiatives are traditionally lengthy and expensive. So, how can you affordably implement a winning BI initiative? Guerrilla business intelligence (GBI) delivers the insight that you need, in weeks.

    The GBI approach is the following:

    1. Mobilize with speed. Reducing decision-making time is critical. This doesn’t mean that decisions should be made hastily; instead, speed requires preparation. In the first week of the GBI deployment, teams must be embedded with business subject-matter experts, thus allowing teams to identify and address hot spots.
    2. Deploy small multiskilled teams. Form crossfunctional teams to interpret your data, to gain insights on operational weaknesses and strengths. Teams can focus on anything from cost-reduction efforts and process-efficiency optimization to market analysis for revenue opportunities.
    3. Infiltrate iteratively and collaboratively. Infiltrate iteratively through your business functions in four- to six-week deployments. Clear objectives and short deployments bring focus. Also, teams must collaborate with executives from each business function to ensure that GBI efforts align with overall business strategy.
    4. Leverage and extend your existing BI tools. Leverage existing BI technologies with agile implementation techniques to deliver value quickly. Then, expand on traditional BI tools (data warehouse, ODS, dashboards, analytics, and so on) with productivity-monitoring and optimization tools. A small investment in these assets will allow your GBI team to analyze and improve processes and support cost-reduction goals.

    “Strategy without tactics is the slowest route to victory. Tactics without strategy is the noise before defeat.”
    SUN TZU

    Implementing a GBI strategy can provide a key advantage in today’s fluctuating economy. However, do not lose sight of the long term. Investing in enterprise BI remains essential. When signs of growth reappear, revise your BI strategy, including guerrilla tactics, for optimal success and return on investment (ROI).

    “Know thyself, know thine enemy. A thousand battles, a thousand victories.”
    SUN TZU

    Andre Michel Vargas ([email protected]) is a management consultant in PA Consulting Group’s IT Consulting practice. He specializes in solving complex information-related issues, including legacy-systems migration and integration, business-process automation, and delivery of enterprise BI. For more information on PA’s thinking around BI, please visit www.paconsulting.com/our-thinking/business-intelligence-solutions.

    With SQL Server Analysis Services (SSAS), Microsoft provides a feature that is called “URL actions” and represents a mechanism to store actionable information inside the OLAP cube metadata. Conditions in which the action becomes available can be expressed by using the MDX query language, and the URL provides an endpoint.

    Such capabilities in OLAP technologies can be utilized to implement an end-to-end integration back to LOB applications in an SOA.

    In an enterprise SOA, operations on LOB applications can be exposed as Web services that hide the complexity of underlying programs and, together with any external Cloud or B2B services, can represent basic components in an infrastructure that models business needs.

    Office Excel has the native ability to consume the actionable information by displaying to the user any available actions on cells in which the predefined conditions are met. The caption that will be displayed can also be defined dynamically by using MDX expressions.

    Each cell of the analyzed cube is defined by one member (called the current member) from every dimension.

    The URL can be defined in the cube to include the current members as a query string.

    When the user performs the suggested action on a cell, Office Excel will call the Web application that the URL has located. If necessary, the Web application can collect more information from the user, depending on the action.

    Such a solution can leverage advances in rich Internet application (RIA) technologies such as Ajax or Microsoft Silverlight.

    A Silverlight front end has the advantage over classical Web applications: Code that is written in a CLI language such as C# will run on the client in the browser application, which minimizes cross-computer calls to the server.

    Silverlight is built on the .NET framework and can utilize Windows Communication Foundation (WCF) as an out-of-the-box framework for SOA integration. WCF provides a flexible API to implement SOAP Web service calls from the client process.

    Consequently, Silverlight can be utilized as a bridge between SOA and BI.

    The cell information (current members), together with any additional information that is collected from the user, can be sent to the Web service that implements the business action; and, at the end, any service feedback can be displayed to the user.

    Data flow between any such services can be orchestrated and customized to fulfill the need of business operations.

    Using Web services to expose operations on LOB applications in an SOA is a way to utilize existing software assets better and increase their participation in enterprise information flow. This also makes
  • 13.
    collaboration of the components more flexible, and it becomes easier to adapt to changes in business processes, which ultimately leads to a better IT-to-business alignment.

    An architectural solution (see Figure 2) that leverages a technology stack that is based on SQL Server Analysis Services 2008, with URL actions defined on the cube (BI platform), Office Excel 2007 (BI analysis tool), and Silverlight RIA (BI-to-SOA bridge), can exemplify utilization of enterprise SOA as an enabler of BI-to-LOB integration.

    Figure 2: Technology stack
    [Diagram labels. SOA: line-of-business applications, Cloud services, B2B services, reached over SOAP. BI-to-SOA bridge: Silverlight RIA on Microsoft Internet Information Services, with out-of-the-box WCF support. Analysis and decision: Office Excel as analysis tool. Business intelligence: SQL Server Analysis Services (OLAP cubes with URL actions) and SQL Server (data warehouse).]

    URL actions represent a simple and effective way to implement actionable information for BI solutions that are built on SSAS. The URL can point to a Silverlight RIA application that will act as a BI-to-SOA bridge and make calls to Web services by using WCF.

    Implementation Considerations

    Integrating data flow from BI back to LOB applications will be beneficial to operational processes. However, as soon as BI becomes an essential part of that world, its design will also be burdened with the mission-critical nature of operations.

    Special implementation considerations and requirements must be taken into account, compared to classical BI applications that are intended exclusively for strategic decision support.

    The BI solution that is integrated with LOB applications will target a wider audience and a larger number of operational users. BI will need to scale from a limited number of report consumers to a larger number of operational users at different levels in the enterprise hierarchy. In this case, scalability will play a much more important role, and it must be a leading factor when hardware and software design choices are made.

    Taking a more operational role will also result in higher availability requirements for BI. Information must be available, regardless of potential hardware or software outages that could affect a server.

    Data refreshes during ETL processing will have a more pronounced impact in this case, because they can affect user queries and delay operational processes. Therefore, failover strategies will need to be higher on the priority list than they are during the design of classical BI solutions.

    One solution to increase scalability and availability is to use Windows Network Load Balancing (NLB) to distribute user requests across different SSAS instances that are implemented on different cluster hosts. NLB will detect a host failure and accordingly redirect traffic to other hosts. Scheduled outages such as ETL processes can be mitigated by using large-scale staging systems, to keep OLAP data available all the time.

    Data volume will increase, because business operations are performed at a lower level of information aggregation than strategic analysis. In this case, data will also need to be stored and delivered more coarse-grained to information consumers.

    Conclusion

    Enterprise BI architectures can maximize information value by offering a closer integration back to LOB applications. Taking a greater role in operational decision making will empower business users to interact with actionable information in BI.

    This will enable BI solutions to close the data cycle and drive performance improvement of operational processes.

    A technology stack that is based on Office Excel 2007, WCF services, and SSAS, with URL actions that are defined on the cube, can be leveraged to implement data integration easily from BI back to LOB applications in an SOA scenario.

    About the Author

    Razvan Grigoroiu ([email protected]) is a Software Architect who has been involved for the past 10 years with LOB and BI solutions for the specialty-retail sector.

    Follow up on this topic

    • SQL Server Integration Services (SSIS): http://msdn.microsoft.com/en-us/library/ms141026.aspx
    • SQL Server Analysis Services (SSAS): http://www.microsoft.com/sqlserver/2008/en/us/analysis-services.aspx
    • Rich Internet Applications: http://www.microsoft.com/silverlight/
    • Service Orientation: http://msdn.microsoft.com/wcf
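The URL-action mechanism that this article describes (a URL that carries the current members of the selected cell as a query string, which the receiving Web application then reads) can be sketched in a few lines. This is a minimal illustration rather than how SSAS itself builds the URL: in SSAS the URL is defined on the cube, whereas here the base URL, the dimension names, and the member values are all hypothetical.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit


def build_action_url(base_url: str, current_members: Dict[str, str]) -> str:
    """Append the current member of every dimension as a query string."""
    return f"{base_url}?{urlencode(current_members)}"


def read_current_members(url: str) -> Dict[str, str]:
    """What the receiving Web application does: recover the cell context."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each dimension has exactly one current member.
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical cell context: one current member per dimension.
cell_members = {"Product": "Mountain Bike", "Store": "Store 12"}
action_url = build_action_url("http://lob.example.com/restock", cell_members)
```

The bridge application (Silverlight in the article's stack) would take the recovered members, collect any extra input from the user, and pass both to the Web service that implements the business action.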
  • 14.
    Performance Management: How Technology Is Changing the Game
    by Gustavo Gattass Ayub

    Many enterprises have built the capabilities to monitor, analyze, and plan their businesses. But the problem is that they are delivering insight into the past, but not into up-to-the-moment performance. Front-line managers increasingly need to know what’s happening right now. Individual contributors need to have access to current data to provide quality service. Retailers and manufacturers are taking advantage of this to avoid stock-outs or overproduction. Financial-services, logistics, and utilities companies are using stream-data processing to increase operational efficiency and create new business capabilities.

    There are clearly three challenges: effective delivery of integrated data to end users, the ability to process huge volumes of granular data, and the ability to process data streams.

    In-memory processing and 64-bit PCs are changing the way in which end users access current integrated data, as it allows them to build reports, dashboards, and scorecards without the direct support from IT (also known as self-service business intelligence [BI]). From the IT perspective, it’s an alternative to delivering insight without the need of long implementation cycles. The integration of in-memory processing with business productivity tools such as spreadsheets and intranet portals is becoming the option of choice to deliver the BI-for-the-masses vision.

    Predictive analytics is a top priority in every enterprise profiting from BI, and its potential is directly related to the granularity and latency of data. Today, in general, there is no big advantage in working only with sets of aggregated data. The real advantage comes from processing huge volumes of granular data in near-real-time. From a technology perspective, there is a new generation of data-warehouse (DW) appliances that will enable this capability for organizations that need to predict beyond competition.

    Stream-processing technologies allow real-time monitoring by detecting events or patterns of events as data streams through transactional systems and networks or from sensors. Complex event-processing (or CEP) platforms are enabling new applications, varying from pollution control to algorithmic trading. CEP is becoming a mandatory capability for any BI platform, as new applications emerge and also as the real-data integration paradigm might shift in the near future from the repeatable cycles of the traditional ETL process to the event-processing paradigm.

    Together, these emerging technologies are playing a key role in enabling some enterprises to evolve from the traditional DW to modern BI platforms that have arrived to change the game by providing real-time monitoring and much faster analysis.

    Gustavo Gattass Ayub ([email protected]) is a Senior Consultant at Microsoft Consulting Services Brazil.

    Relevant Time-Performance Management
    by Usha Venkatasubramanian

    Organizations need to make informed business decisions at strategic, tactical, and operational levels. Decision-support systems were offline solutions that catered to specific needs. With new trends, there is a need to cover a larger set of people, right from the CEO, who looks at a larger timeframe, up to an operations manager, who needs recent statistics. Therefore, we must build a performance-management system that delivers information at the relevant time: Relevant Time-Performance Management (RTPM).

    How can an organization provide self-service capability to the business, while still maintaining the data recency and granularity? We implemented a multilayered data warehouse that is both a sink and a source of information. Data currency was maintained by using a suitable adapter to poll data (for example, SSIS in the Microsoft BI suite).

    Management Organization-Structure Relevance
    Near-real time data was trickle-fed into the lowest layer and reflected in the output for the operational manager. Data was sourced to higher levels of managers by creating higher layers of aggregation, and at predefined time intervals. Granular data got offline-archived for the future. When data reached the highest level of aggregation, it was retained for comparative reporting for a longer duration of time.

    Information Relevance
    Current information requirements that are categorized as primary data (information source) resided in all layers. Data that is not required for querying was captured as supplementary data (data sink). Some data from the secondary layer would move to the primary layer, if there is a request for additional data. Likewise, a primary data element would be retired by moving it to the secondary layer.

    Data-Nature Relevance
    A careful balancing act is needed to control the unwieldy growth of the data volumes in the data-warehouse database, while still providing the relevant information. An offline retention policy–based archive helps maintain the relevant information.

    Recency Relevance
    Recency of information calls for a proper Change Data Capture mechanism to be in place for different stakeholders to get what they need. This would primarily depend on the nature of the source data itself. Using metadata-driven CDC and normalized CDC, the data is maintained as recently as required.

    Delivery Relevance
    Information delivery was a mix of push and pull to maintain the time relevance. Standard reports were delivered through the push method and ad-hoc reports through the pull method.

    Some of the case studies in which we’ve used these principles effectively can be seen at the following Web site: http://www.lntinfotech.com/services/business_analytics/overview.asp.

    Usha Venkatasubramanian ([email protected]) is the deputy head of the Business Analytics Practice at L&T Infotech.
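The layered warehouse described under "Management Organization-Structure Relevance" (granular data trickle-fed into the lowest layer, then rolled up into higher layers of aggregation at predefined intervals) can be illustrated with a small sketch. This is an assumption-laden toy, not the authors' implementation: the hourly and daily layers, the function names, and the sample readings are all invented for illustration.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, Iterable, Tuple

Reading = Tuple[datetime, float]


def roll_up(readings: Iterable[Reading],
            bucket: Callable[[datetime], datetime]) -> Dict[datetime, float]:
    """Sum (timestamp, value) readings into the buckets of a higher layer."""
    totals: Dict[datetime, float] = defaultdict(float)
    for timestamp, value in readings:
        totals[bucket(timestamp)] += value
    return dict(totals)


def hourly(ts: datetime) -> datetime:
    """Bucket key for a hypothetical operational-manager layer."""
    return ts.replace(minute=0, second=0, microsecond=0)


def daily(ts: datetime) -> datetime:
    """Bucket key for a hypothetical senior-management layer."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


# Lowest layer: granular, trickle-fed readings (sample values are made up).
granular = [
    (datetime(2009, 6, 1, 9, 5), 10.0),
    (datetime(2009, 6, 1, 9, 40), 5.0),
    (datetime(2009, 6, 1, 14, 15), 20.0),
    (datetime(2009, 6, 2, 8, 0), 7.0),
]

hourly_layer = roll_up(granular, hourly)
# Each higher layer is built from the layer below it, not from the raw data,
# so the granular rows can be archived offline once they have been rolled up.
daily_layer = roll_up(hourly_layer.items(), daily)
```

Building each layer from the one beneath it is what lets the lowest layer be archived on its own schedule while the higher layers are retained longer for comparative reporting.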
  • 15.
    Increasing Productivity by Empowering Business Users with Self-Serve BI
    by Ken Withee

    Summary
    Enabling end-user self-serve business intelligence (BI) is a critical step in modern business. The latest wave of Microsoft products enables self-serve BI by using tools and technologies such as Office SharePoint, PowerPivot, Business Connectivity Services (BCS), Office Excel, and Report Builder.

    Introduction
    Creating a self-serve environment is a critical evolutionary step in any software environment. Imagine if the IT department had to be involved in sending every e-mail message. The thought is almost laughable. When an e-mail system has been put in place, it is maintained by IT, but users are free to send, receive, and manage their e-mail through self-serve tools such as Office Outlook, Thunderbird, Lotus Notes, Eudora, and Pine.

    Business intelligence has become increasingly important (many would say critical) in modern business. Getting the right information to the right person at the right time is the promise of BI, but this is often easier said than done. Providing a self-serve mechanism for business users is the next step in the BI story. Providing self-serve BI allows for an exponential increase in the usefulness of BI by removing the large hurdle that is involved in the back-and-forth interaction of business users and the technical team. In essence, business users can answer their own questions as they arise, without having to pause to involve the IT team with every detail.

    This article explores the latest Microsoft tools and technologies that enable self-serve BI. In particular, it outlines the latest wave of SQL Server, Office SharePoint, and other Office products and how they can be used to provide self-serve BI to business users.

    PowerPivot
    When it comes to BI, there is a constant battle between business users and IT. Business users know the functional components of the business; they understand fully what they want to analyze and the questions that they want to answer. The IT team understands the structure of the data, the models, the cubes, data flow from the operational systems, the data warehouse, and the control mechanisms for the data. Business users often feel that IT is always saying “No” when they make a request for a data cube, report, chart, graph, or even raw data. The members of the IT team often feel that users are making unreasonable requests, with timelines that equate to boiling the ocean by first thing tomorrow morning. Self-serve BI attempts to solve this problem by providing a mechanism that will satisfy the analytical needs of users and provide the governance and control that IT requires.

    In its upcoming release of Office Excel 2010 and Office SharePoint Server 2010, Microsoft attempts to provide the self-serve analytical needs of business users in a feature that is known as PowerPivot.

    PowerPivot and Office Excel 2010
    PowerPivot is an add-in to Office Excel 2010 that, from the point of view of business users, simply allows them to pull massive amounts of data into Office Excel and then analyze it in a familiar environment. The premise is that users are already familiar with Office Excel and prefer to work in Office Excel, but are forced to go through IT when they are dealing with large and complex data. The IT team then goes through the process of modeling the data and building OLAP cubes that then can be used by the business users to perform their analysis. This back-and-forth nature is the root of a great deal of frustration for both sides.

    PowerPivot attempts to remove this interaction by providing features in Office Excel that allow business users to pull in and analyze very large amounts of data without having to interact with IT. When users have completed an analysis, they can upload the Office Excel document to an Office SharePoint library from which it can be shared with the rest of the organization. Because these PowerPivot documents live on the Office SharePoint server, IT maintains governance and control over the entire process.

    PowerPivot Architecture
    From an IT point of view, the predictable thing about business users is that they just want things to work; they are focused on their business role and see technology only as a tool that helps them perform their tasks. Under the covers, PowerPivot (formerly known as codename “Gemini”) is an incredibly complex piece of technology that Microsoft has invested heavily to bring to market. Above the covers, however (and what IT presents to the business user), PowerPivot is simply Office Excel. Many business users will not even care that a new technology has been introduced; they will just know that the new version of Office Excel can solve their problems with analysis of massive and complex data sets.

    The technology that provides Office Excel the ability to analyze millions upon millions of rows of data in line with what business users are expecting can be found in memory. In particular, the data is loaded into memory in what is known as in-memory column-based storage and processing. In essence, users are building their own in-memory data cube for analysis with Office Excel.

    In order to handle the data volumes that PowerPivot supports, the in-memory data structure is highly compressed and read-only. The engine that is responsible for compressing and
  • 16.
    managing this in-memory data structure is called VertiPaq.

    PowerPivot and Office SharePoint Server 2010
    PowerPivot for Office SharePoint accomplishes two important tasks. The first is that it provides a home for the PowerPivot documents that users create in an environment that IT controls. The second is that it provides users throughout the organization the ability to view and interact with PowerPivot documents by using nothing more than their thin-client Web browser.

    For business users, consuming a PowerPivot document is as simple as going to their Intranet site and clicking a document. The document then renders in their browser, and they can interact with the document and perform their analysis. In fact, the consumers do not even need to have Office Excel installed on their local computers to interact with the PowerPivot document. The result is that business users focus on their business analysis and are oblivious to the technology that enables the interaction.

    PowerPivot Control and Governance
    One of the biggest challenges in IT involves spreadsheets that become critical to business users without the knowledge by IT of their existence. Office SharePoint provides functionality that gives IT the ability to track PowerPivot usage patterns. As PowerPivot (Office Excel) documents bubble up in importance, IT can monitor and identify the source data and business functionality.

    Having visibility into which PowerPivot documents are most frequently used by business users is critical in developing management and disaster-recovery plans. For example, if an extensively used PowerPivot document is pulling data from an operational system that was thought to have minimal importance, IT has achieved visibility into what is truly important to business users. IT is then better able to accommodate the needs of their users going forward.

    Business Connectivity Services (BCS)
    The major promise of BI is getting the right information to the right person at the right time. When people think of BI information, they usually think of numeric data and analytics. However, information takes many forms, including nonnumeric information that lives in a plethora of applications and databases.

    Some of the most popular systems that require integration include line-of-business (LOB) systems (often called enterprise resource planning, or ERP) such as SAP, Oracle, Dynamics, Lawson, Siebel, and Sage, to name just a few. Access to these systems often is cumbersome and time consuming. A typical interaction involves using specialized applications and screens to access or update the information that lives in the ERP system.

    BCS is a technology that is included in Office SharePoint Server 2010 and provides integration and interaction (read/write) with the information that is contained in external systems, as shown in Figure 1.

    Figure 1: BCS allows Office SharePoint 2010 read/write interaction with external systems.
    [Diagram labels: Office SharePoint connected to ERP, Text, CRM, XML, and APP.]

    Integrating a front-end user-facing portal with external systems provides a single self-serve access point, without the need to install specialized software on end-user desktops. For example, imagine a customer-support representative taking an order from a customer. There might be one application for entering the order, another for taking notes on the call, and yet another for researching questions that the customer has about products. If the customer notifies the support representative of an address change, the representative must also access the system that stores the customer information and make the update. Business users have no need or desire to understand where their data and information lives; they just want to interact with the right information at the right time, in as easy a manner as possible.

    The BCS technology provides a mechanism for the consolidation of access points to external systems into one convenient portal location. Consolidation greatly reduces the complexity of end-user job functions by providing a single destination to perform business tasks and find business information.

    Reducing the complexity for users also reduces the number of disparate requests that IT must service. Instead of IT having to support connections, security, audits, and one-off projects for multiple systems, IT only must set up the connection in the portal once and then support the single portal framework. In addition, moving everything to a single framework restores control over the external systems to IT by moving users into a single environment.

    BCS Architecture
    BCS is an evolution of the Business Data Catalog (BDC) from Office SharePoint 2007. BCS is baked into the Office SharePoint 2010 platform and the Office 2010 clients. BCS uses three primary components that enable the connection to external systems. These include Business Data Connectivity, an External Content Type Repository, and External Lists. In addition, the BCS client is also included in the Office 2010 applications.

    The External Content Type Repository and External Lists allow solution architects not only to describe the external data model, but also to define how the data should behave within Office SharePoint and Office.

    BCS connections are XML-based and include functionality to connect to SOA-based services. When a connection file has been set up, it can be used throughout the Office SharePoint environment by end users.
  • 17.
    Figure 2: Report Builder is designed to provide business users with an easy-to-use report-development environment.

    Report Builder
    Report Builder is an application that is designed to provide end users the ability to create and publish their own SQL Server Reporting Services (SSRS) reports. Report Builder was designed for the end user with the comfortable look and feel of other Microsoft Office products. In particular, Report Builder includes the Office Ribbon at the top of the report-design surface, as shown in Figure 2.

    The underlying report-engine code base is shared with the Business Intelligence Development Studio (BIDS) report-design environment. This single underlying code base was designed to provide functionality for end users to create reports by using Report Builder and for developers to design reports by using BIDS with functional parity, due to the shared underlying code base.

    Report Builder uses ClickOnce technology for deployment. ClickOnce allows users to click the link in either Report Manager or Office SharePoint and download the application to their desktop computers. ClickOnce alleviates the need for a mass install by the IT department. When Report Builder must be upgraded or updated, the new bits are automatically downloaded to the user desktop without the need for manual updates.

    SQL Server Reporting Services
    SQL Server Reporting Services (SSRS) is the reporting component of the SQL Server product. The SSRS architecture consists of a Windows Service that is designed to render reports and a couple of SQL Server databases that are designed to store content, configuration, metadata, and temporary rendering information. SSRS reports consist of an XML-based format that is called Report Definition Language (RDL). SSRS reports (or RDL files, in other words) can be created by using either BIDS (Visual Studio) or Report Builder.

    An SSRS database can be installed in either stand-alone mode or integrated mode. When it is installed in stand-alone mode, a Web application that is known as Report Manager is responsible for storing, managing, and providing the reporting environment to end users. When it is installed in integrated mode, Office SharePoint takes over, and Report Manager is no longer used.

    Although SSRS is a component of the SQL Server product, it is not restricted to pulling data from only a SQL Server database. Using Report Builder, end users can pull data from a number of different connection types, including OLE DB, ODBC, Analysis Services, Oracle, XML, Report Models, SAP Netweaver BI, Hyperion Essbase, and TERADATA.

    Figure 3: Making the connection to an SSAS OLAP cube is accomplished on the Data tab of Office Excel.
  • 18.
    Self-Serve Reporting
    Business users launch Report Builder by clicking a link in either Report Manager or Office SharePoint. As soon as Report Builder is launched, business users create connections to data sources and build reports. The reports can then be saved into either Report Manager or an Office SharePoint Document Library. Other users can then connect to Report Manager or the Office SharePoint site to view the available reports. The IT team can maintain control by providing “approved” reports, monitoring usage, and limiting access to the servers that contain the source data.

    When SSRS is installed in integrated mode, reports can take advantage of the Office SharePoint Enterprise Content Management (ECM) features, such as versioning, security, check-in/check-out, and workflow. In addition, the responsibilities of IT are reduced, because only a single content-management system must be maintained and managed. In the Office SharePoint environment, an SSRS report is nothing more than a content type, such as an Office Word document or Office Excel spreadsheet.

    Office Excel and Excel Services
    Office Excel has to be one of the most prolific and ubiquitous data-analysis applications in the world. Nearly every organization uses Office Excel in one capacity or another. Some businesses use Office Excel almost exclusively to run and manage their data needs. One of the most beloved data-analysis features of Office Excel is the PivotTable. A PivotTable provides an easy-to-use drag-and-drop interface for slicing, dicing, grouping, and aggregating data. Beginning with Office Excel 2007, end users have the ability to connect to and utilize the back-end SQL Server Analysis Services (SSAS) server from the comfort of Office Excel on their desktops. As a result, end users can browse and analyze OLAP cubes and tap into the powerful data-mining capabilities that SSAS provides.

    Using Office Excel to Browse SSAS Cubes
    When connecting to and analyzing an SSAS OLAP cube, the business-user experience is nearly identical to analyzing a local Office Excel pivot table. A user makes the connection by selecting the From Analysis Services option on the Get From Other Sources menu of the Data tab, as shown in Figure 3.

    When the connection has been made, users can browse the cube and perform an analysis, just as they would a local pivot table, as shown in Figure 4.

    Figure 4: Browsing an SSAS OLAP cube as a PivotTable in Office Excel.

    Office Excel and SSAS Data Mining
    A Data Mining add-in for Office Excel is available to provide access to the data-mining algorithms that are contained within SSAS. Installing the add-in provides a Data Mining tab in Office Excel with which users can access the algorithms that are contained within the SSAS Data Mining engine. The Data Mining tab in Office Excel is shown in Figure 5.

    Figure 5: The Data Mining tab provides end users the ability to interact with the data-mining capabilities of SSAS.

    The SQL Server Data Mining Add-In provides the following functionality:

    • Data-analysis tools: Provides data-mining analysis tools to the Office Excel client, allowing users to perform deeper analysis on tables by using data that is contained in their Office Excel spreadsheets.
    • Client for SSAS data mining: Provides the ability to create, manage, and work with data-mining models from within the Office Excel environment. Users can use either data that is contained in the local spreadsheet or external data that is available through the Analysis Services instance.
  • 19.
    • Data-mining templates for Office Visio: In addition to the Office Excel functionality, the add-in also provides the ability to render and distribute data-mining models as Office Visio documents.

Office SharePoint 2010 and Excel Services
Office Excel has to be one of the most popular and prominent data-analysis applications. Business users create Office Excel spreadsheets that perform everything from ad-hoc analysis to fully featured profit-and-loss calculators. When these applications are under the radar, they cannot be backed up or supported by IT. The result is that business users become frustrated with IT for not being able to support them, and IT becomes frustrated with users who are working outside the provided system. Business users feel that IT is not giving them the correct tools, so they go and create their own by using Office Excel. The IT team members feel that the business users are going around IT and have no right to complain when the stuff that they created on their own breaks.

To complicate matters further, users often e-mail Office Excel documents to other users, who e-mail those documents again. This creates multiple mutations of critical business functionality, as shown in Figure 6.

Figure 6: Office Excel document is e-mailed to users, who modify and e-mail again, creating multiple mutations of the document. (Diagram labels: ERP, CRM, Text, XML, APP, Office Excel.)

Self-Service Reporting Best Practices on the Microsoft BI Platform
by Paul Turley

Once upon a time, there was a big company whose IT department wanted to ensure that everyone would see only good data in their reports. To make sure of this, they ruled that all reports would be created by IT from data that was stored on IT-controlled databases. Business managers and users quietly circumvented this by downloading data into spreadsheets and data files. Another company’s IT group enabled the business to perform its own reporting by using an ad-hoc tool, opening databases to everyone.

In both of these companies, when leaders had questions, everyone had answers! The only problem was that the answers were all different. Many organizations operate in one of these extremes.

Business users can gain important insight by using self-service reporting tools. Armed with the right answers, leaders and workers can take appropriate action and make informed decisions, instead of shooting from the hip or waiting for reliable information to come from somewhere else. Functional business-intelligence (BI) solutions don’t evolve into existence and must be carefully planned and managed.

These best practices adhere to some basic principles and experience-borne lessons:

Manage the Semantic Layer
A single version of the truth might consist of data that is derived from multiple sources. By simply giving users the keys to the database kingdom, you aren’t doing anyone any favors. One size doesn’t fit all, but business-reporting data should always be abstracted through a semantic layer. This might be a set of views on a data mart, a report model, or an online analytical processing (OLAP) cube. There are substantial advantages in using the latter option, if your organization is prepared for some development and maintenance overhead.

Analysis tools (such as the new generation of Report Builder in Microsoft SQL Server 2008 and the pending release of SQL Server 2008 R2, Microsoft Office Excel, and Office PerformancePoint Services for SharePoint) might be given to users, but the semantic layer must be managed centrally by IT.

Separate User- and Production-Report Libraries
User reports might be used to make important decisions and might even become mission-critical, but the reports, scorecards, and dashboards that are “guaranteed” to be accurate and reliable should go through the same rigorous IT-managed design, development, and testing criteria as any production-ready business application. Designate a library for ad-hoc reports, separate from production reports. Office SharePoint is an excellent medium for this purpose.

Conduct Formal Review Cycles, Validate Reports, Consolidate Them in Production
One of the most effective methods for IT designers to understand business-reporting requirements is to leverage user-designed reports. For mission-critical processes, use these as proofs of concept, and then work with the business to design consolidated, flexible “super reports” in a production mode.

Learn how to implement these tools to build a comprehensive self-service reporting solution by reading the full article on Paul’s blog: http://www.sqlserverbiblog.com.

Paul Turley is a business-intelligence architect and manager for Hitachi Consulting, and a Microsoft MVP.

17 The Architecture Journal 22
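The sidebar's advice to abstract business-reporting data through a semantic layer, for example a set of views on a data mart, can be illustrated with a small sketch. The following is only a minimal illustration that uses Python's built-in `sqlite3` module in place of a real data mart; the table names (`fact_sales`, `dim_region`), the view name (`v_sales_by_region`), and the sample rows are all hypothetical and do not come from the article.

```python
import sqlite3


def build_demo_mart() -> sqlite3.Connection:
    """Create a tiny in-memory 'data mart' plus one semantic-layer view.

    All table, view, and column names here are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
        CREATE TABLE fact_sales (region_id INTEGER, product TEXT, amount REAL);

        INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
        INSERT INTO fact_sales VALUES
            (1, 'Widget', 100.0),
            (1, 'Gadget', 50.0),
            (2, 'Widget', 75.0);

        -- The view is the semantic layer: report authors see business
        -- terms (region, total_sales) rather than keys and joins.
        CREATE VIEW v_sales_by_region AS
            SELECT r.region_name AS region, SUM(f.amount) AS total_sales
            FROM fact_sales AS f
            JOIN dim_region AS r ON r.region_id = f.region_id
            GROUP BY r.region_name;
        """
    )
    return conn


def sales_by_region(conn: sqlite3.Connection) -> list:
    """Query only the view, the way a self-service report would."""
    cursor = conn.execute(
        "SELECT region, total_sales FROM v_sales_by_region ORDER BY region"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for region, total in sales_by_region(build_demo_mart()):
        print(region, total)
```

In this arrangement, IT owns the view definition while report authors query only the view, which is one way to read the sidebar's point that the semantic layer must be managed centrally.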
  • 20.
    Excel Services in Office SharePoint 2010 attempts to solve the issues with Office Excel by providing a home for Office Excel documents in the Office SharePoint environment that is controlled and governed by IT.

When Office Excel documents are saved in an Office SharePoint document library, there is one version of the document, and users can connect and use the document without spawning multiple mutations, as shown in Figure 7. The document can also take advantage of the ECM features of Office SharePoint, including versioning, security, check-in/check-out, and workflow. In addition, IT is able to gain oversight, visibility, and control over the Office Excel applications with which users are performing their business tasks.

Figure 7: Office Excel document is saved to Office SharePoint and accessed by users throughout the organization by using only a thin client (Web browser). (Diagram labels: ERP, CRM, Text, XML, APP, Office Excel, Office SharePoint.)

Office Excel documents in Office SharePoint can use connection files that are managed by the IT department. For example, IT can create connection files to the source systems and then simply point end users to these approved connection files. This alleviates the need for IT to service numerous requests for connection information for every Office Excel file that is created for a business problem.

One of the most powerful features of Office SharePoint is called Excel Services. Excel Services is the ability to render Office Excel documents in a thin client (Web browser). An important Office Excel document can be saved to a document library, and the entire organization can then view and interact with the Office Excel document without having to leave their browser. The consumers of the document just navigate to their company intranet and click the Office Excel document.

This functionality is particularly powerful when thinking about rolling out Office Excel 2010 to provide PowerPivot functionality. Only a handful of business users actually produce content, with the rest just consuming it. Using Excel Services, the only users who will need to have Office Excel 2010 are the producers of content. The consumers can interact with the PowerPivot documents without ever having to leave their browser or install the latest version of Office Excel.

Conclusion
Report Builder is an end-user report-development tool that provides end users the ability to create their own SSRS reports without the need for an SSRS expert. The Report Builder application uses the same underlying code base as the Business Intelligence Development Studio (BIDS), which is designed for professional developers. Allowing end users to build their own reports takes a tremendous amount of resource load off of the technical team, allowing them to focus on the underlying data warehouse instead of the tedious report-design process.

Office Excel is one of the most ubiquitous data-analysis programs that are used in the world today. Microsoft has recognized that people are already comfortable with Office Excel and often do not want to change to another application for data analysis. Office Excel can be used as a client for the back-end SSAS server. In particular, users can connect Office Excel to OLAP cubes that are hosted on SSAS and slice and dice data in the same fashion in which they would use a local PivotTable. In addition, Office Excel can be used as a client for the data-mining functionality of the SSAS server. The power of the data-mining algorithms can be leveraged with data that is contained in a data warehouse or data that is local in Office Excel spreadsheets. In both situations, Office Excel acts as a client for the SSAS server, which provides end users with the power of SQL Server and the comfort of Office Excel.

One of the biggest pain points in an OLAP environment is the amount of effort that it takes to organize and develop data cubes. Business users have to coordinate with BI developers to identify the correct data, relationships, and aggregates. Requirements are constantly shifting, and, by the time a cube has been developed, the requirement has changed and the cube must be reworked. Providing end users the ability to create their own data cubes in an easy-to-use environment is extremely important to the evolution of BI. PowerPivot provides the ability for users to create in-memory cubes right on their desktops in the familiar Office Excel environment. The cubes can then be uploaded to an Office SharePoint site and accessed by users throughout the organization.

Office SharePoint 2010 includes BCS, which provides read/write integration between Office SharePoint and external systems. Consolidating functionality into the Office SharePoint environment reduces complexity for end users and provides a one-stop shop for all content, including BI, reporting, analysis, communication, and collaboration. In addition, IT can consolidate focus from multiple access systems into a single portal system.

A self-serve environment is a key inflection point in any technological solution. Placing the power of a solution in the hands of end users unleashes an exponential power that can only be realized through a self-serve environment. Surfacing BI information into a collaborative environment such as Office SharePoint enables a new form of BI that is called human business intelligence (HBI). HBI merges the traditional analytical capabilities of a BI solution with the knowledge of the people throughout the organization.

The latest wave of Microsoft products is interwoven to provide a single cohesive self-serve environment for end-user content creation. This places the power of content creation and analysis in the hands of the end users. Without the intensive back-and-forth BI-development

18 The Architecture Journal 22
  • 21.
    process that currently exists, users are free to expand their knowledge exponentially and on their own time.

Resources
Donald Farmer: Foraging in the Data Forest. Blog. Available at http://www.beyeblogs.com/donaldfarmer/.

Molnar, Sheila, and Michael Otey. “Donald Farmer Discusses the Benefits of Managed Self-Service.” SQL Server Magazine, October 2009. Available at http://www.sqlmag.com/Articles/ArticleID/102613/102613.html?Ad=1.

Microsoft Corporation. MSDN SQL Server Developer Center documentation and articles. Available at http://msdn.microsoft.com/en-us/library/bb418432(SQL.10).aspx.

SQL Server Reporting Services Team Blog. “Report Builder 3.0, August CTP.” August 2009. Available at http://blogs.msdn.com/sqlrsteamblog/.

Alton, Chris. “Reporting Services SharePoint Integration Troubleshooting.” MSDN SQL Server Developer Center, August 2009. Available at http://msdn.microsoft.com/en-us/library/ee384252.aspx.

Pendse, Nigel. “Commentary: Project Gemini: Microsoft’s Brilliant OLAP Trojan Horse.” The BI Verdict, October 2008. Available at http://www.olapreport.com/Comment_Gemini.htm.

PowerPivot Team Blog. “Linked Tables.” August 2009. Available at http://blogs.msdn.com/gemini/.

About the Author
Ken Withee is a consultant with Hitachi Consulting and specializes in Microsoft technologies in Seattle, WA. He is author of Microsoft Business Intelligence for Dummies (Hoboken, NJ: For Dummies; Chichester: Wiley Press, 2009) and, along with Paul Turley, Thiago Silva, and Bryan C. Smith, coauthor of Professional Microsoft SQL Server 2008 Reporting Services (Indianapolis, IN: Wiley Publishing, Inc., 2008).

Follow up on this topic
• PowerPivot: http://www.powerpivot.com/

Self-Service BI: A KPI for BI Initiative
by Uttama Mukherjee

A commonly asked question is: Can overall business performance be attributed directly to the success of business intelligence (BI)? To come up with a credible answer, the prerequisite is widespread adoption of intuitive BI, which comes from a self-service-enabled BI setup. To sustain such models, it is better to ensure that self-service BI initiatives are funded through a “pay-per-use” process.

An index that assesses the degree of self-serviceability of a BI implementation is one of the key performance indicators (KPIs) for measuring the success of BI.

The following dimensions become critical to enabling all-pervasive self-service BI.

People: Self-service for standard users can be addressed through a governance process. The conflict between flexibility and standardization becomes a topic of more elaborate deliberation when implementing a self-service environment for power users. Typically, giving power users direct access to the enterprise-wide “single version of truth” results in possible runaway queries and redundant reports. Such users must be provided privileged access to a “BI workspace,” defined succinctly by Forrester as a data-exploration environment in which power users can analyze data with near-complete freedom and minimal dependency on IT, or without being impeded by data-security restrictions.

Process: Standard users get a self-service option through a set of predefined reports/analytics as a relatively static utility service, to which they can subscribe at a price (notional/actual). The accumulated credit may be used for funding future BI initiatives. Depending on organization culture, consensus-driven processes are established through a BICC framework. Additionally, the BICC ensures transfer of the gathered insights from power users to the larger group, evolving into a more mature BI setup.

Technology: With the preceding two aspects addressed appropriately, the self-service demand of the majority of information consumers can be met by using a standard enterprise-wide BI setup. Considering that most of these services are predefined, the load on the BI platform is predictable to a large extent. But for power users, who are synthesizers, additional data (internal/external, structured/unstructured) and information requirements demand state-of-the-art technology and higher throughput of the “BI workspace.” To meet the needs of power users, technology decisions become critical, and funding becomes a challenge.

One index (among probable others) to measure the degree of self-service is to monitor the usage rate of utility analytics by standard users and conduct a qualitative satisfaction survey to monitor acceptance by power users. The “pay-per-use” fund that is accumulated gives a quantitative measure.

Uttama Mukherjee, Practice Director, HCL Technologies, leads strategic BI consulting and delivery services in BI across industries.

19 The Architecture Journal 22
  • 22.
    Business Insight = Business Infrastructure = Business-Intelligence Platform
by Dinesh Kumar

Summary
To do things differently, one must look at things differently. This article introduces the notion of business infrastructure providing the necessary bridge between (contextual) business insight and a (common) business-intelligence (BI) platform. Using the business-infrastructure and business-capability models, the article provides a prescriptive approach to planning and delivering BI services.

Changing Landscape
Currently, the IT industry is transitioning from an era of limited capability of individual/functional reporting and analysis to one that is defined by a connected, collaborative, and contextual world of BI. Insight is gained not only by analyzing the past, but also by anticipating and understanding the future. Insight has value only if people are able to act on it in a timely manner.

As the need for real-time data gathering, analysis, and decision making increases, so, too, does the need to perform actions through transactional systems. Insight is not individual. In a world of collaborative BI, people want to find, share, comment on, and review data quite similarly to how they handle documents. Insight is a core competency only if it comes naturally to people. As a result, cost, capability, and consistency become equally important.

Table 1 provides a list of characteristics for business insight in any organization.

Table 1: Business-insight characteristics
Requirement | Implication
Collaborating across the organization | User experience: Consistent approach to delivering, analyzing, finding, and sharing information to make informed decisions. Collaboration: Ability to share, annotate, and perform actions as you would with a document.
Transacting decisions | Speed and integration: Real-time data gathering, analysis, decision making, and subsequently taking actions through transactional systems.
Anticipating unknowns | Service-oriented: Adding, integrating, and delivering additional data as it becomes available or relevant.
Reducing cost of change and ongoing operations | Platform: Shared services.

Current Practices
Currently, there are two dominant approaches to delivering BI capabilities. Some organizations utilize a “make-to-order” approach to deliver a specific solution for a specific business need. For example, when a sales team wants to target customers for upgrades and new products, the IT group creates a customer data mart or a report, extracting and summarizing data from CRM systems. When manufacturing wants to analyze inventory or supply chain, IT creates a manufacturing data mart, extracting and summarizing information from an ERP system. To address new requirements, these organizations keep adding layers to individual, functional, or unidirectional reporting-and-analysis systems, introducing duplication, delays, complexity, and cost.

Other organizations have adopted a “build it and they will come” approach by building a massive, centralized, enterprise data warehouse with the expectation that different groups might want to access and analyze data someday. It takes significant effort as well as expenditure to build something; then, it takes an equally huge effort and cost to maintain and extend it.

These approaches to planning and building BI capabilities are not sufficient to support new information-driven business processes and organizational models. To gain and capitalize on business insight, we must think differently about how we evaluate, plan, communicate, and implement BI capabilities in the organization.

Next Practice
Understandably, people are driven by their needs. BI-capability planning and delivery must respect individuality while driving consistent thinking and common capabilities in business and IT. This article introduces the next practice with a capability model and a methodical approach for planning BI capabilities for business insight.

Concept #1: Business Infrastructure
Just like IT, business also has an infrastructure.

Based on work across a wide variety of industries and solution scenarios, the author has realized that almost every business process or activity is dependent on similar sets of capabilities. For example, when cashing a check in a bank branch, a client asks the bank teller about a new mortgage promotion. The teller calls someone in the mortgage department to inquire about the new promotion. The same client is thinking of rebalancing a financial portfolio and asks a financial advisor in the same bank about treasury bonds. The advisor calls a bond expert for information.

These two business activities are remarkably different and performed in two different departments, yet both rely on a similar

20 The Architecture Journal 22
  • 23.
    capability: access to an expert. Likewise, various business processes and functions need or use similar BI capabilities, which are different only in content and context. For example, the finance department is focused on financial data, whereas manufacturing is focused on production and quality data. However, both need the same BI capability: to report and analyze the information that they value. In other words, improving the common BI capability (such as reporting and analysis in the preceding example) will improve or enable multiple business activities.

Just as with IT infrastructure, business infrastructure represents a set of common, horizontal business capabilities that support multiple specialized, vertical business processes and functions. Just as in IT infrastructure, improvement of the business infrastructure will reduce process complexity, enhance organizational agility, and drive business maturity. Just as IT architecture includes both applications and infrastructure capabilities, business architecture includes both business-process capabilities and business infrastructure.

In the context of BI, we have organized the horizontal business capabilities into three value domains. The value domains represent the type of outcome or impact that is expected in the context of a business process or activity in which the underlying capability is leveraged. (See Table 2 for the list of business capabilities that make up the business infrastructure for business insight.)

One could argue that there could be additional business capabilities under the business-management value domain. Financial, customer, and product management are considered core capabilities of every organization, regardless of size and industry, including the public sector. Other areas of business management are either extensions to these core areas or specific to an industry or an organization. For example, in a utility company, logistics management and workload management are added, as they are quite important and distinct areas in the organization. In a financial institution, individual and institutional banking are attributes of customer and product management, but financial-advisor services are added as a core capability for additional focus.

These capabilities are fairly industry-independent or process-independent; therefore, they can be characterized along a value-maturity model. The maturity model helps organizations to assess the current state and articulate the desired state easily and quickly.

Table 3 on page 22 provides examples of the maturity level of some of the business-infrastructure capabilities. The detailed model enumerates the maturity of each capability in terms of people, process, information, access methods, security and compliance, availability, and performance attributes.

Concept #2: BI Platform
BI is cross-functional, cross-people, and cross-data.

Just as there are common business capabilities that enable business insight, there is also a collection of BI services that articulate underlying technical BI capabilities. A service-oriented approach to defining BI capabilities minimizes complexity and cost, while it drives consistency and maturity in capabilities.

BI services are organized into four domains, each of which addresses a distinct segment of the information flow. Table 4 on page 22 lists the four domains and the relevant capabilities that are included in each domain.

Figure 1 on page 23 articulates the collection of BI services under each domain that use interfaces to other inbound, outbound, and

Table 2: Business-infrastructure capabilities
Value domain | Business capability | Description
Business management | (domain) | Plan & manage core business functions. Ability to plan & manage:
Business management | Financial management | Cost and revenue across the organization.
Business management | Customer management | Demand generation, sales, service, and customer satisfaction.
Business management | Product management | Product planning, development, manufacturing, and distribution.
Innovation & transformation | (domain) | Drive growth & competitive advantage. Ability:
Innovation & transformation | Synergistic work | For a team to work together to perform an activity or deliver on a shared objective. The team might include people from either within or outside the organization.
Innovation & transformation | Consensus & decisions | To gain necessary consensus among stakeholders and make decisions. Stakeholders might involve people from across organizational boundaries.
Innovation & transformation | Communication of timely, relevant information | To send and receive required information to the appropriate people, when needed. Communicator or receiver might be from either inside or outside the organization, and communication may be either upstream or downstream.
Innovation & transformation | Sense & respond | To anticipate, detect, and monitor internal or external events or trends, and to take appropriate actions.
Innovation & transformation | Authoritative source of information | To rely upon information in any transaction or decision making.
Planning & delivery excellence | (domain) | Drive operational performance objectives. Ability to:
Planning & delivery excellence | Information orchestration | Consolidate information across business activities or disseminate information to appropriate consumer, when and where needed.
Planning & delivery excellence | Governance & compliance | Ensure that various policies and rules are understood and that the organization behaves accordingly.
Planning & delivery excellence | Reporting & analysis | Create, analyze, and deliver appropriate information, when and where needed.
Planning & delivery excellence | Performance measurement | Measure, monitor, and communicate appropriate cost and performance metrics of a business activity or process.

21 The Architecture Journal 22
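The idea of using the maturity model to assess the current state and articulate the desired state can be pictured as a small data structure. The sketch below is only an illustration, not the author's model: the class and function names are hypothetical, the four-level scale follows the Level 1 to Level 4 columns of Table 3, and the current and desired levels assigned to each capability are made-up sample values.

```python
from dataclasses import dataclass

MIN_LEVEL, MAX_LEVEL = 1, 4  # assumption: four levels, as in Table 3


@dataclass(frozen=True)
class CapabilityAssessment:
    """Current and desired maturity of one business-infrastructure capability."""

    value_domain: str
    capability: str
    current_level: int
    desired_level: int

    def __post_init__(self) -> None:
        # Reject levels outside the assumed 1-4 scale.
        for level in (self.current_level, self.desired_level):
            if not MIN_LEVEL <= level <= MAX_LEVEL:
                raise ValueError(
                    f"maturity level must be {MIN_LEVEL}-{MAX_LEVEL}, got {level}"
                )

    @property
    def gap(self) -> int:
        """Number of levels between current and desired state (never negative)."""
        return max(self.desired_level - self.current_level, 0)


def rank_by_gap(assessments):
    """Return capabilities that need improvement, largest gap first."""
    with_gap = [a for a in assessments if a.gap > 0]
    return sorted(with_gap, key=lambda a: (-a.gap, a.capability))


# Hypothetical sample assessment; the levels are illustrative only.
ASSESSMENTS = [
    CapabilityAssessment("Business management", "Customer management", 2, 3),
    CapabilityAssessment("Innovation & transformation", "Sense & respond", 1, 4),
    CapabilityAssessment("Planning & delivery excellence", "Reporting & analysis", 3, 3),
]

if __name__ == "__main__":
    for item in rank_by_gap(ASSESSMENTS):
        print(f"{item.capability}: level {item.current_level} -> {item.desired_level}")
```

Ranking by gap is just one possible prioritization; a real assessment would presumably also weigh the value priorities and cost factors that the article discusses.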
  • 24.
    dependent services.

The concept of platform or infrastructure services can be applied across the whole IT domain. It is expected that BI services will leverage and integrate with other enterprise services such as security, backup/recovery, and storage. The complete IT-service portfolio with capability models is covered in a patent pending on IT Service Architecture Planning and Management.

Just as there is a capability-maturity model for business infrastructure, BI services are also characterized by using a capability-maturity model. The BI-service maturity model leverages and extends the Microsoft Infrastructure Optimization (IO) Model.

Table 5 on page 24 provides an example of BI-service maturity levels. The model includes a range of attributes, addressing the 360-degree view of the service.

Table 3: Sample business capability-maturity model (value-maturity levels)

Business management / Customer management
Level 1: Sales and corporate functions can summarize and report on sales performance.
Level 2: Any individual or business function can understand and monitor sales and marketing data in their own context.
Level 3: An extended organization (partners) can access sales and support data for its own analysis. People can perform trending and develop forecasts.
Level 4: Access and analysis can be performed with nominal effort anytime, anywhere, and by anyone, including customers.

Innovation & transformation / Sense & respond
Level 1: Local, functional level. Collect and report on operational data.
Level 2: Enterprise level; 24/7; customer data. Monitor, consolidate, and analyze.
Level 3: Include partners and remote locations.
Level 4: Worldwide level. Information includes industry and market research.

Planning & delivery excellence / Information orchestration
Level 1: Orchestrate information at functional level, based on internal operational data, and delivered in report form on internal network.
Level 2: Orchestrate information at enterprise level, including customer information, consolidated and available via remote access, team sites, portals, and COTS apps.
Level 3: Orchestrate information across partners (supply chain), analyzed and available on multiple device types or networks.
Level 4: Orchestrate information across the industry, allowing what-if scenarios, and available at point-of-interest on any device or network.

Table 4: BI services and capabilities
BI-service domain | Capabilities
Information delivery | Accessing and delivering information, when needed, on a device or tool through one or more channels
Information analysis | Aggregating, analyzing, visualizing, and presenting information
Data integration | Mapping, sharing, transforming, and consolidating data
Data management | Storing, extracting, loading, replicating, archiving, and monitoring data

Concept #3: Relationships and Road Map
Don’t reinvent what we already know.

As the common business and technical capabilities are industry-, organization-, or technology-independent, the relationships or dependencies between business and technical capabilities are predefined. This allows organizations to answer quickly such questions as, “What technical capabilities do we need to enable a level of maturity in a business-infrastructure capability?” and “What business capability can be enabled by using a technical capability?”

Relationships help in rapidly defining the scope, identifying the dependencies, and communicating value to various stakeholders. Figure 2 on page 23 provides a framework for leveraging known relationships and maturity models for developing the overall vision and architecture road map.

Case Study: Assessment and Road-Map Planning
With the information model in hand, the process of aligning and anticipating business needs, evaluating the current state, articulating the vision, developing the road map, and leveraging every opportunity to advance the journey becomes simpler, more predictable, and repeatable.

Using a real case study and the previously described information model, let us walk through the process and develop an assessment and road map for BI.

Situation
Smart grids and smart appliances are changing the landscape in the utility industry. Depending upon the fluctuation in prices, customers might want to control their use of energy. This requires real-time pricing and usage analysis. Based on the customers’ thresholds, they control the appliances in real time. The business model of the utility industry is also evolving. Within a few years, utility customers could become suppliers by establishing their own solar power-generation facilities. The utility company in this study wanted to ensure that its BI efforts were designed to meet future challenges and plan the BI architecture for the expected change in the industry.

22 The Architecture Journal 22
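The two questions that Concept #3 says predefined relationships can answer ("What technical capabilities do we need...?" and "What business capability can be enabled...?") amount to looking up the same mapping in two directions. The following sketch assumes a simple dictionary representation; the capability and service names are taken from the article's tables and figure labels, but the specific pairings and maturity levels are hypothetical placeholders, not the author's actual relationship model.

```python
# Hypothetical relationships: (business capability, target maturity level)
# mapped to the BI services assumed to be required. Illustrative only.
REQUIRES = {
    ("Reporting & analysis", 2): {"Reporting", "Data marts"},
    ("Reporting & analysis", 3): {"Reporting", "Analytics", "Data warehouse"},
    ("Sense & respond", 3): {"Dashboard", "Analytics", "Data warehouse"},
}


def services_needed(capability: str, level: int) -> set:
    """Which BI services enable this capability at this maturity level?"""
    # Copy the set so callers cannot mutate the shared mapping.
    return set(REQUIRES.get((capability, level), set()))


def capabilities_enabled_by(service: str) -> list:
    """Which (capability, level) pairs depend on this BI service?"""
    return sorted(key for key, services in REQUIRES.items() if service in services)


if __name__ == "__main__":
    print(services_needed("Sense & respond", 3))
    print(capabilities_enabled_by("Analytics"))
```

Keeping the relationships in one structure and deriving both lookups from it avoids maintaining two mappings that could drift apart.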
Figure 1: BI-service architecture. (Diagram: BI services and capabilities in four domains, information delivery, information analysis, data integration, and data management, placed between external interfaces and back-office systems, on a foundation of infrastructure and operations services such as security, storage, monitoring, and backup/recovery.)

Figure 2: Relationships and planning framework. (Diagram: strategic objectives and drivers, business functions and processes, the business-infrastructure domains, and the four BI-service domains are linked through predefined relationships, capabilities, dependencies, and maturity levels to the vision and architecture road map.)
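As a rough illustration of the predefined relationships behind Figure 2, the dependencies between business-infrastructure capabilities and technical BI capabilities could be held in a simple lookup that answers both questions quoted under Concept #3. This is only a sketch: the capability names echo those used in the article, but the pairings, the maturity levels, and the function names are invented for illustration and are not the article's actual model.

```python
# Hypothetical relationship map (illustrative assumptions, not the article's data):
# business-infrastructure capability -> target maturity level (1-4)
#   -> technical BI capability -> minimum maturity level (1-4) it must reach.
RELATIONSHIPS = {
    "Reporting & analysis": {
        2: {"Information analysis": 2, "Data management": 2},
        3: {"Information analysis": 3, "Data integration": 2},
    },
    "Sense & respond": {
        3: {"Information delivery": 3, "Data integration": 3},
    },
}


def technical_needs(business_capability, target_level):
    """Technical capabilities (with minimum levels) needed for a target level."""
    levels = RELATIONSHIPS.get(business_capability, {})
    return dict(levels.get(target_level, {}))


def enabled_by(technical_capability):
    """Business capabilities (and target levels) that rely on a technical capability."""
    return sorted(
        (business, level)
        for business, levels in RELATIONSHIPS.items()
        for level, needs in levels.items()
        if technical_capability in needs
    )


if __name__ == "__main__":
    print(technical_needs("Reporting & analysis", 3))
    print(enabled_by("Data integration"))
```

Keeping the relationships as data rather than logic is the point of the exercise: once the map exists, scoping a new initiative is a lookup instead of a fresh analysis.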
Table 5: BI-service capability maturity (Analytics service)
Analytics: Provide ability to analyze data from any source.
Attribute | Description | Level 1 | Level 2 | Level 3 | Level 4
Information | What types and formats of information are provided? | Functional or departmental data | Business-process data across business entities | Summary and details | Historical, forecast, trends, KPIs, scorecards; multidimensional; XML formats
Transactions | What actions can be performed? | Import, create | Slice and dice | Trend, drilldown and across | Predictive analysis, data mining
Access | Who can access the capability, and how? | Desktop applications | Web browser and analytical tools; remote access | Integrated productivity suite, Web-based interactive tools; Internet access | Embedded LOB applications, mobile devices; available both offline and on mobile devices
Integration | How is information consolidated, disseminated, or integrated? | E-mail attachments; file-based; batch | Web sites, database APIs; scheduled | Workspaces, XML-based interface; on-demand | Subscription and notification; Web services; real-time
Infrastructure | How is this capability implemented? | Individual files | Functional portals | IT-provisioned and -managed | Hosting, shared services
Security | How is information secured or access controlled? | None | Access user-managed | Role-based access | Rights management; compliance management
Performance | What are the service levels? | Response time acceptable at major sites; availability unpredictable or not monitored | Response time acceptable at branch location; unplanned down time | Response time acceptable at all user locations; 24/7 availability | Anytime, anywhere; scalable
Operations | How is this capability provisioned, monitored, or managed for business continuity? | No formal backup/recovery procedures | User-managed backup/recovery; manual monitoring and provisioning | Centrally managed backup/recovery; automated monitoring, reporting and provisioning | Automated backup/recovery; monitoring exceptions; self-provisioning
Technologies | How is the technology life cycle managed? | No standard or guidelines | Multiple technologies and versions | Standard across one or more business units | Enterprise-wide standard

Process and Deliverable
A simple five-step process and the information model produced the desired output:

1. Understand the strategic objectives: what is or could be driving the need for change.
Through interviews, 10 significant initiatives or objectives were identified that addressed business, regulatory, and operational goals. These objectives required improvement in people, process, and information capabilities; as such, the case study will focus on BI-related capabilities.

2. Identify the required business infrastructure.
Evaluating an individual business process can be very time-consuming. Also, the overall scope and objective of each strategic initiative is not always clear. Therefore, instead of analyzing various business processes, the focus is to understand the maturity of business-infrastructure capabilities in support of the strategic initiatives.
Figure 3 on page 25 shows the number of strategic initiatives or objectives that are enabled by each business-infrastructure capability. This mapping identified the top business capabilities that the organization must explore and understand for the current and desired levels of maturity.

3. Identify and evaluate required business-infrastructure capabilities.
Using the capability-maturity model, the current state was assessed, and the desired state of business-infrastructure capabilities that are needed to achieve the stated strategic objectives was articulated. (See Figure 3 for a visual representation of the current and desired maturity levels.) The analysis focused on the capabilities that had the greatest impact on strategic objectives and the largest relative gaps between the current and desired states of capability maturity.

4. Identify and evaluate required technical BI capabilities.
Using the predefined relationship map between business infrastructure and technical capabilities, the relevant technical capabilities were identified and evaluated.
Using the maturity model and knowing the desired state of the business capabilities, the maturity map for the technical capabilities was developed (see Figure 4 on page 25).

5. Develop the road map for realizing the vision.
Based on the business priorities, current and planned projects, and dependencies between various capabilities, a statement of direction was established. The projects and capabilities delivered by these projects were organized along the four BI Services domains in short-term, near-term and long-term time horizons.
The road map was not about building future capabilities today. It emphasized beginning with the end in mind by architecting current capabilities such that new capabilities can be introduced cost-effectively when needed.

Using the knowledge of the maturity map of both business and BI capabilities, the current and desired states of these capabilities,
Figure 3: Business-capability alignment and maturity. (Chart: for each business-infrastructure capability, the number of strategic objectives enabled, on a scale of 1 to 10, and the current and desired value-maturity levels, on a scale of 1 to 4. Capabilities shown by domain: business management (financial management, customer management, product management); innovation and transformation (synergistic work, consensus and decisions, communication of timely and relevant information, sense and respond); planning and delivery excellence (authoritative source of information, information orchestration, governance and compliance, reporting and analysis, performance measurement).)

Figure 4: BI-capability maturity. (Chart: current and desired value-maturity levels, on a scale of 1 to 4, for each BI service. Services shown by domain: information access (presence, search, portal, delivery); information analysis (reporting, analytics, scorecards); data integration (master-data management, data warehouse, data exchange); data management (data store, OLAP, ETL, replication).)
and the statement of direction, the organization is also able to plan and execute each new initiative or project, ensuring that every investment results in another step in the right direction.

Access to the Model
Microsoft has embodied the model that is presented in this article, along with a structured approach for developing BI strategy, in a service offering that is called Assessment and Road Map for Business Intelligence. Using this service, an organization can gain access to the model, obtain additional knowledge, and develop its first BI strategy by using the model.

Conclusion
If an IT organization wants to create possibilities or “happy surprises” for the business, it has to change its mindset and execution from “think locally, act locally” to “think locally, act globally.” BI as a platform service will help organizations deliver a consistent BI experience while it maximizes ROI. Business infrastructure will help business stakeholders and users to understand the business commonalities and IT organizations to anticipate the business needs and plan the BI road map. Together with business infrastructure and BI platform, organizations not only can gain business insight, but also can act upon it.

References
Kumar, Dinesh. “IT Service Architecture Planning and Management.” U.S. Patent Pending. December 2007.

Ross, Jeanne W., Peter Weill, and David C. Robertson. Enterprise Architecture as Strategy: Creating a Foundation for Business Execution. Boston, MA: Harvard Business School Press, 2006.

Sykes, Martin, and Brad Clayton. “Surviving Turbulent Times: Prioritizing IT Initiatives Using Business Architecture.” The Architecture Journal, June 2009. http://msdn.microsoft.com/en-us/architecture/aa902621.aspx.

Microsoft TechNet. Microsoft Infrastructure Optimization (IO) Model. Available at http://www.microsoft.com/io.

Microsoft Services. Microsoft Assessment and Road Map for Business Intelligence. Available at http://www.microsoft.com/microsoftservices/en/us/bpio_bi.aspx.

Acknowledgements
Special thanks to Tom Freeman, Geoff Brock, and Brant Zwiefel from Microsoft Services for their candid feedback and thorough reviews of the draft.

About the Author
Dinesh Kumar ([email protected]) is a Principal Architect who focuses on enterprise architecture and IT planning. His recent research focuses on understanding business needs, planning, optimizing, and managing IT as a service organization. Dinesh serves as co-chair of the enterprise-architecture working group at Innovation Value Institute. He is also on the CIO advisory board for MISQ Executive, a SIM publication.

Follow up on this topic
• MS Business Intelligence: http://www.microsoft.com/bi/

Anatomy of a Failure: 10 “Easy” Ways to Fail a BI Project
by Ciprian Jichici

In the past 10 years, business intelligence (BI) has gone through the typical journey of a “hot” technology. It started with the mystery of the first implementations, fresh out of the academic world; went through the buzzword stage; and ended up in the technological mainstream. Next to this horizontal evolution, there has been a vertical one, in which BI has descended from the top of the organization to the masses and begun addressing a much broader set of business needs.

Regardless of the phases through which it has gone, BI has been (and, some argue, still is) an architectural challenge. In real life, BI architects have to deal with multiple technologies, platforms, and sources of data. Today, we do have some kind of architectural guidance for most of the components that play together in complex BI solutions. Yet we still lack proper architectural guidance on the complicated topic of making them play nicely together. Finally, the success equation of a BI project has one more key element, people, who are tightly connected to processes and culture.

From an architectural point of view, three of the most common reasons for failure are the inability to recognize that BI needs consistent architectural planning; unfortunate selection of technologies; and over-focusing on technological issues (such as performance and data volumes) too early in the process. When it comes to data, failure to reveal relevant information and granularity mismatch (together with poor query response times) are two of the “easy” ways to derail a BI project.

But perhaps the number-one reason for failure that is related to data is assuming that BI is synonymous with having a data warehouse. Finally, from a people point of view, failing a BI project is as easy as assuming that end users have the know-how to use the tools; not getting them the appropriate levels of detail; and, of course, going over budget (which, oddly enough, occurs many times as a result of previous success).

It is safe to say that it is quite difficult to get BI right and quite easy to get it wrong. To summarize the preceding, the easiest way to fail your BI project is probably a “first build the plant, then decide what to manufacture” approach.

Ciprian Jichici is participating in the Microsoft Regional Directors Program as a Regional Director for Romania. Since 2003, he has been involved in many BI projects, workshops, and presentations. You can read an extended version of the preceding article at http://www.ciprianjichici.ro.
Semantic Enterprise Optimizer and Coexistence of Data Models
by P. A. Sundararajan, Anupama Nithyanand, and S. V. Subrahmanya

Summary
The authors propose a semantic, ontology-driven enterprise data-model architecture for interoperability, integration, and adaptability for evolution, by autonomic agent-driven intelligent design of logical as well as physical data models in a heterogeneous distributed enterprise through its life cycle.

An enterprise-standard ontology (in Web Ontology Language [OWL] and Semantic Web Rule Language [SWRL]) for data is required to enable an automated data platform that adds life-cycle activities to the current Microsoft Enterprise Search and extends Microsoft SQL Server through various engines for unstructured data types, as well as many domain types that are configurable by users through a Semantic-query optimizer, and using Microsoft Office SharePoint Server (MOSS) as a content and metadata repository to tie all these components together.

Introduction
Data models differ in their structural organization to suit various purposes. For example, product and organization hierarchies yield well to the hierarchical model, which would not be straightforward to represent and access in a relational model (see Table 1).

The model is decided by the following factors:

1. Ease of representation and understandability of the structure for the nature of data
2. Flexibility or maintainability of the representation
3. Ease of access and understanding of the retrieval, which involves the query, navigation, or search steps and language
4. Ease of integration, which is an offshoot of maintainability and understanding
5. Performance considerations
6. Space considerations

Depending on the requirement (be it a structured exact search or a similarity-based unstructured, fuzzy search) we can have a heterogeneous mix of structured, semistructured, and unstructured information to give the right context to enterprise users.

Table 1: Data models for various purposes
Data-model name | Purpose
Hierarchical | Complex master-data hierarchies (1:M); very high schema-to-data ratio
Network | Complex master-data relationships (M:M); spatial networks, life sciences, chemical structures, distributed network of relational tables
Relational | Simple flat transactions; very low schema-to-data ratio
Object | Complex master-data relationships, with nested repeating groups
XML | Integration across heterogeneous components; canonical; extensible
File systems | Structured search
Record-oriented | Primary-key retrieval; OLTP; sequential processing
Column-oriented | Secondary-key retrieval; analytics; aggregates; large data volume, requiring compression
Entity-attribute-value | Flexibility; unknown domain; changes often to the structure; sparse; numerous types together

While the relational database helped with the sharing of data, metadata sharing itself is a challenge. Here, enterprise ontology is a candidate solution for metadata integration, and it leverages such advances for stable Enterprise Information Integration (EII) and interoperability, in spite of the nebulous nature of an enterprise.

Ontologies are conceptual, sharable, reusable, generic, and applicable across technical domains. They contain explicit knowledge that is represented as rules and aids in inference. Also, they improve communication across heterogeneous, disparate components, tools, technologies, and stakeholders who are part of a single-domain enterprise.

Evolution of Enterprise Integration
It is interesting to note the evolution of enterprise integration over periods of time, when there were simple applications for each specific task in the past, to the applications on the Web that can communicate
with each other, achieving larger business functionality and blurring the boundaries of intranets, Internet, and enterprises:

1. Data file sharing in file systems
2. Databases
3. ERP, CRM, and MDM
4. ETL: data warehouse
5. Enterprise Information Integration (EII)
6. Enterprise Application Integration (EAI): service-oriented architecture (SOA)
7. Semantic Web services

With respect to information, too, the lines between fact and dimension, data and language, and structured and unstructured are blurred when a particular type of data morphs over time, with volume and its importance to and relationship with its environment. A normal transaction data model can progress from business intelligence and predictive data mining to machine learning toward a knowledge model and becomes actionable in language form, where it can communicate with other systems and humans.

Enterprise data needs a common vocabulary and understanding of the meaning of business entities and the relationships among them. Due to the variety of vendors who specialize in the functions of an enterprise, it is common to see heterogeneous data models from disparate products, technologies, and vendors having to interoperate.

Data as a service with SOA, enterprise service bus (ESB), and canonical data models have tried to address this disparity in schema or structure, but seamless semantic interoperability was not addressed until the advent of Semantic Web services.

Semantic enterprise integration through enterprise Web Ontology Language (OWL) can be a solution for the seamless semantic interoperability in an enterprise.

Motivation for This Paper: Industry Trends

Accommodating and Coexisting with Diversity
Storing all the data in a row-oriented, third normal form (3NF) relational schema might not be optimal. We see many trends, such as various types of storage engines, that are configurable and extensible in that specific domain data types can be configured and special domain indexes built, and the optimizer can be made aware of them to cater to heterogeneous data-type requirements. These object-oriented semantic extensions are built as applications on top of the database kernel, and there are APIs for developers to customize and extend to add their own unstructured or semistructured data types. This is used in spatial, media, and text-processing extensions that come with the product. In some cases, native XML data types are also supported.

Microsoft Enterprise Search is an example of disparate search from e-mail in Microsoft Exchange Server to user profiles in Active Directory to Business Data Catalog in Microsoft SQL Server RDBMS and Microsoft Office documents.

Prominent players have addressed unstructured data in the form of content-management systems, which again have to be accommodated in the proper context with traditional enterprise structured data, both metadata- and content-wise.

Offload to Auxiliary Units
Many database systems support a row-oriented OLTP store for updated rows and columnar-compressed store for read-only store or disk-based write-only store and memory-based read-only store, moving them across frequently, according to their life-cycle stages and the characteristics that they exhibit. Offloading some load to auxiliary processors that specialize in SQL processing is another practice that we can observe in data-warehouse appliances.

Intelligent Autonomic Design
Many systems optimally design or recommend based on the following inputs:

• Logical schema
• Sample data
• Query workload

Some systems have abstracted the file-handling parts that must be handled at the OS level.

Oracle Query Advisor and Access Advisor offer best plans, based on statistics.

Impedance Mismatch and Semantic Interoperability
Object-relational mapping (ORM) is a classic pattern in which there are two different tools that would like to organize the same data, in two different ways. Here, the same enterprise data has to be represented by using an inheritance hierarchy in object-oriented design, whereas it can be one or more tables in a relational database. LINQ to SQL and LINQ to Entities (Entity Framework) are some tools that address this space.

What if the database is a hierarchical database, or a plain entity-attribute-value model?

If the language happens to be a functional programming model, and the database that we use happens to be an object-oriented database, a different sort of impedance mismatch might emerge.

So, wherever there are different paradigms that must communicate, it is better to have an in-between intermediary.

In this case, the authors propose that the domain model (not an object model) be represented in an enterprise-wide ontological model, complete with all business logic, rules, constraints, and knowledge represented. For each system that must communicate, let it use this ontological model as a common denominator to talk to other systems.

Another area of impedance mismatch is the one between a relational SQL model and the OLAP cube Multidimensional Expressions (MDX). MDX is a language in which the levels of hierarchical dimensions are semantically meaningful, which is not the case with tuple-based SQL. Here again, a translation is required. Instead of going for a point-to-point solution, we might benefit from a common ontology.

The application requests a semantic-data-services provider, which translates the query appropriately to the enterprise ontology model and, depending on the data-source model, federates the query in a modified form that is understood by the specific data model of the data source.

The domain model is conceptual and could replace or reuse the conceptual entity-relationship or UML class-object structural diagrams.

Data-Flow Architecture for the Semantic Enterprise Model
The following sections explain the Semantic enterprise optimizer that can bridge the gap between the various disparate data models that can coexist and provide the data services intelligently (see also Figure 1 on page 29).
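The translation role of the semantic-data-services provider described under Impedance Mismatch (resolve a request against the enterprise ontology, then hand it to each data source in that source's own form) can be sketched as follows. Everything here is a hypothetical illustration: the `DataSource` class, the source names, and the query shapes are made up for the example, and a plain Python dictionary stands in for what would be an OWL ontology in the architecture the authors propose.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    model: str        # "relational", "hierarchical", or "file"
    local_name: dict  # ontology concept -> name used inside this source


# Hypothetical sources; the mappings stand in for an enterprise ontology.
SOURCES = [
    DataSource("hr_db", "relational", {"Employee": "EMPLOYEE_MASTER"}),
    DataSource("org_tree", "hierarchical", {"Employee": "Organization/Employee"}),
    DataSource("archive", "file", {"Employee": "archive/employee"}),
    DataSource("sales_db", "relational", {"Order": "ORDERS"}),
]


def translate(source, concept, key):
    """Render one ontology-level lookup in the form the source understands."""
    local = source.local_name[concept]
    if source.model == "relational":
        # Parameterized SQL: the statement text plus its bind values.
        return (f"SELECT * FROM {local} WHERE id = ?", (key,))
    if source.model == "hierarchical":
        # A path expression into a hierarchy or XML document.
        return (f"/{local}[@id='{key}']", ())
    if source.model == "file":
        # A file-name pattern for flattened, archived records.
        return (f"{local}/{key}.*", ())
    raise ValueError(f"no translator for model {source.model!r}")


def federate(sources, concept, key):
    """Translate the request for every source that knows the concept."""
    return {
        s.name: translate(s, concept, key)
        for s in sources
        if concept in s.local_name
    }


if __name__ == "__main__":
    for name, query in federate(SOURCES, "Employee", "E42").items():
        print(name, query)
```

The application only ever names the ontology concept (`"Employee"`); adding a new data model means adding one translator, not one adapter per pair of systems, which is the argument the text makes against point-to-point solutions.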
Figure 1: Semantic enterprise optimizer and coexistence of data models. (Diagram: user requests, both data insertion/update/deletion and query/search, reach the Semantic enterprise optimizer. The optimizer works with the workflow and rules enterprise semantic-ontology repository, which holds a workflow repository and a rules repository for life cycle and data-model progression for data; with the event generator, which is driven by life-cycle state changes of data; and with the instance metadata lineage navigator.)

Distributed Autonomic Evolutionary Logical and Physical Design
Depending on the usage, volume of data, and the life-cycle stage, we have a proposal for automatic logical and physical data-model design.

Initially, when a domain is not known with concrete requirements, an entity-attribute-value model is always good to start with. Here, the structure is defined by an administrative screen with parameters, which takes the definition of the record structure and stores the metadata and data by association.

After a periodic interval, there is an agent that watches over the record types over a period of time, as well as the actual data in the record values, to see if the changes have settled down and that the structure has become relatively stabilized. When the structure is stabilized, the analyzer now qualifies which type of structure is suitable for the entity, taking into account the queries, the data and its statistics, the changes to the structure and its relationship with other entities, and its life-cycle stage.

Component Model of the Semantic Enterprise Optimizer

Semantic Enterprise Optimizer
The Semantic enterprise optimizer consults the workflow and rules repositories in case of an insert or state change event, to find out which data model should accommodate the incoming or state-changed data item. In case of a query, it consults the instance metadata lineage navigator to locate the data. Accordingly, it federates the query to online or archived storages, and across heterogeneous models and products. Here, SOA-based data and enablement of metadata as a service is helpful.

Semantic Data Services
The Semantic Data Services extend the features of the data-service object to enable ontology-driven semantics in its service. The interface services consult the enterprise ontology for interaction.

Workflow and Rules Enterprise Semantic-Ontology Repository
For each type of data that is classified, we can define the lineage that it should follow. For example, we can say that an employee-master record in an enterprise will be entering as master data. It will follow the semantic Resource Description Framework (RDF) model, in which relationships for this employee with others in the organization are defined. The employee master will also be distributed to have the attendance details in the reporting location; however, salary details will be in the central office, from where disbursements happen. The employee record will be maintained in the online transaction systems till the tenure of the employee with the enterprise. After the employee has left, the employee record will remain for about one year for year-over-year reporting, before it moves into a record-management repository, where it is kept flattened for specific queries. Then, after three years, it is moved into archival storages, which are kept in highly
compressed form. But the key identifying information is kept online in metadata repositories, to enable any background/asynchronous/offline checks that might come for that employee later throughout the life of the enterprise.

All these changes at appropriate life-cycle stages are defined in the workflow repository, together with any rules that are applicable in the rules repository.

Event-Generator Agent
Based on the preceding workflows and rules, if a data item qualifies for a state change, an event is generated by this component, which alerts the optimizer to invoke the routine to check the appropriate data model for the data item to move into after the state change.

Instance Metadata Lineage Navigator
Every data instance has metadata associated with it. This will involve attributes such as creation date, created by, created system, the path that it has taken at each stage of its life-cycle state change, and so on. It will also contain the various translations that will be required to trace that data across various systems. This component helps locate the data.

Data-Model Universe
This is the heterogeneous collection of data models that is available for the optimizer to choose when a data item is created and, subsequently, changes state:

1. Master and reference data: largely static; MDM; hierarchy; relationships; graph; network; RDF
2. OLTP engine: transaction; normalized
3. OLAP cube engine: analytics; transaction life cycle completed; RDF analytics for relationships and semantic relations
4. Records management or file engine: archived data; for data mining, compliance reporting
5. Object and object-relational databases for unstructured information: image databases; content-based information retrieval
6. Text databases for text analytics, full-text search, and natural-language processing
7. XML engines for integration of distributed-transaction processing
8. Stream processing: XML
9. Metadata: comprises RDF, XML, hierarchy, graph integration of heterogeneous legacy databases in terms of M&A, and partnering for providing collaborative solutions

Here, the query must be federated, and real-time access has to be enabled with appropriate semantic translation.

When the life cycle of the data changes, there are sensors or machine-learning systems that are programmed to understand the state when the life-cycle stage changes. When such changes are detected, the record is moved accordingly from transaction management to OLAP or data mining, or archival location, as per the lineage.

So, when information is requested, the optimizer, based on the business rules that are configured, is able to find out which engine should be able to federate that query, based on the properties of the search query, and appropriately translate it into a hierarchical, OLAP, or file-system query.

In regular applications that are developed, we might not see much of an advantage; the advantage shows where existing flows through life-cycle states change or new data types are introduced.

Conclusion
We see the enterprise scene dominated by a distributed graph network GRID of heterogeneous models, which are semantically integrated into the enterprise; also, that enterprise data continually evolves through its logical and physical design, based on its usage, origin, and life-cycle characteristics.

Various data models that have been found appropriate or any combination thereof can coexist to decide the heterogeneous model of an enterprise. The relational model emphasized that the user need not know the physical structure or organization of data. In this model, we propose that even the logical model need not be known, and any enterprise-data resource should be reusable across operating systems, database products, data models, and file systems.

The architecture describes an adaptable system that can intelligently choose the data model as per the profile of the incoming data. The actual models, applications, and life-cycle stages that are supported are themselves illustrative. The point is that it is flexible enough to accommodate any model that might be invented in the future. Adaptability and extensibility are takeaways from this architecture.

Also, dynamic integration of enterprise boundaries will lead to more agility and informed decisions in the increasing business dynamics.

Acknowledgements
The first two authors are thankful to the third author, S. V. Subrahmanya, Vice President at E&R, Infosys Technologies Ltd., for seeding and nurturing this idea, and to Dr. T.S. Mohan, Principal Researcher at E&R, Infosys Technologies Ltd., for his extensive and in-depth review.

The authors acknowledge and thank the authors and publishers of the papers, textbooks, and Web sites that are referenced. All trademarks and registered trademarks that are used in this article are the properties of their respective owners/companies.

References
Larson, James A., and Saeed Rahimi. Tutorial: Distributed Database Management. Silver Spring, MD: IEEE Computer Society Press, 1985.

Hebeler, John, Matthew Fisher, Ryan Blace, Andrew Perez-Lopez, and Mike Dean (foreword). Semantic Web Programming. Indianapolis, IN: Wiley Publishing, Inc., 2009.

Powers, Shelley. Practical RDF. Beijing; Cambridge: O’Reilly & Associates, Inc., 2003.

Chisholm, Malcolm. How to Build a Business Rules Engine: Extending Application Functionality Through Metadata Engineering. Boston: Morgan Kaufmann; Oxford: Elsevier Science, 2004.

Vertica Systems. Home page. Available at http://www.vertica.com (visited on October 16, 2009).

Microsoft Corporation. Enterprise Search for Microsoft. Available at http://www.microsoft.com/enterprisesearch/en/us/default.aspx (visited on October 16, 2009).
G-SDAM. Grid-Enabled Semantic Data Access Middleware. Available at http://gsdam.sourceforge.net/ (visited on October 18, 2009).

W3C. “A Semantic Web Primer for Object-Oriented Software Developers.” Available at http://www.w3.org/TR/sw-oosd-primer/ (visited on October 18, 2009).

Oracle. Oracle Exadata. Available at http://www.oracle.com/database/exadata.html (visited on October 21, 2009).

About the Authors

P. A. Sundararajan ([email protected]) is a Lead in the Education & Research Department with ECOM Research Lab at Infosys Technologies Ltd. He has nearly 14 years’ experience in application development and data architecture in Discrete Manufacturing, Mortgage, and Warranty Domains.

Anupama Nithyanand ([email protected]) is a Lead Principal in the Education & Research Department at Infosys Technologies Ltd. She has nearly 20 years’ experience in education, research, consulting, and people development.

S. V. Subrahmanya ([email protected]) is currently Vice President at Infosys Technologies Ltd. and heads the ECOM Research Lab in the Education & Research Department at Infosys. He has authored three books and published several papers in international conferences. He has nearly 23 years’ experience in industry and academia. His specialization is in Software Architecture.

Thinking Global BI: Data-Consistency Strategies for Highly Distributed Business-Intelligence Applications
by Charles Fichter

The need for centralized data-warehousing (DW) systems to update and frequently rebuild dimensional stores, and then replicate to geographies (data marts), can create potential consistency challenges as data volumes explode. In other words: Does your executive in Japan see the same business results in near-real time as headquarters executives in France? Throw in write-back into the dimensional stores, plus a growing need to support mobile users in an offline capacity, and suddenly you’ve inherited larger challenges of consistent business-intelligence (BI) data. A consistent data view across the global enterprise is a result of DW-performance optimizations that occur at design time.

Here are some quick tips for thinking global BI:

• Look for optimizations locally first. Seek ways in which to create and manage Multidimensional Online Analytic Processing (MOLAP) stores that are close to the users who consume them. You’ll likely find that 80 percent or more of BI reporting needs are local/regional in nature. Effective transformation packages (using tools such as Microsoft SQL Server Integration Services [SSIS]) or even managing data synchronization directly through application code for asynchronous/mobile users (such as Sync Services for ADO.NET) can often be more flexible than replication partnerships.
• As much as possible, utilize compression and read-only MOLAP for distribution. Many DW vendors have enabled write-back capabilities into the MOLAP stores. Use these judiciously, and minimize them to a smaller subset of stores.
• Minimize fact table columns, and maximize dimension attributes. The single biggest performance bottleneck in I/O for building and replicating MOLAP stores is large fact tables that have excessive column size. Large columns (attributes) within the associated dimension tables can be processed far more efficiently. Extending dimension tables to a snowflake pattern (further subordinating dimension tables) for extremely large DW sizes can further increase efficiencies, as you can utilize partitioning across tables and other database features to increase performance.
• If centralized DW, consider lightweight access (browser). Utilizing tools such as SQL Server Report Builder, architects can provide summary data by designing a set of fixed-format reports that are accessible via a browser from a Reporting Services server. By enabling technologies such as Microsoft PowerPivot for Excel 2010 (formerly known as codename “Gemini”, to be available H1 2010), users can download cubes for their own manipulation in tools such as Office Excel 2010. PowerPivot utilizes an advanced compression algorithm that can greatly reduce the physical data size that is crossing the wire, and the transfer occurs only when a “self-service” request is initiated directly by a user.

You can learn more about Microsoft’s experiences in working directly with customers in extremely large DW and BI implementations by visiting the SQL CAT team Web site at http://sqlcat.com/.

Charles Fichter ([email protected]) is a Senior Solution Architect within the Developer Evangelism, Global ISV (Independent Software Vendor) team at Microsoft Corporation. For the past four and a half years, Charles has focused on assisting Global ISVs with their application-design strategies.
Lightweight SOAs: Exploring Patterns and Principles of a New Generation of SOA Solutions
by Jesus Rodriguez and Don Demsak

Summary
This article explores some of the most common challenges of traditional service-oriented architectures (SOAs) and discusses how those challenges can be addressed by using more scalable, interoperable, and agile alternatives that we like to call “lightweight SOAs.”

Introduction
During the last few years, we have witnessed how the traditional approach to service orientation (SOA) has consistently failed to deliver the business value and agility that were a key element of its value proposition. Arguably, we can find the causes of this situation in the unnecessary complexity intrinsic to traditional SOA techniques such as SOAP, WS-* protocols, or enterprise service buses (ESBs). As a consequence, alternate lightweight SOA implementations that are powered by architecture styles such as Representational State Transfer (REST) and Web-Oriented Architectures (WOA) are slowly proving to be more agile than the traditional SOA approach.

SOA: Architecting Without Constraints
SOA has been the cornerstone of distributed systems in the last few years. Essentially, the fundamental promise of SOA was to facilitate IT agility by implementing business capabilities through interoperable interfaces that can be composed to enable new business capabilities. From an architectural style standpoint, traditional SOA systems share a set of characteristics, such as the following:

• Leveraging SOAP and WSDL as the fundamental standards for service interfaces
• Using WS-* protocols to enable mission-critical enterprise-service capabilities
• Deploying a centralized ESB for abstracting the different service orchestrations
• Using an integration server for implementing complex business processes
• Using an SOA-governance tool to enable the management of the entire SOA

Simple enough, right? An ideal SOA infrastructure should resemble Figure 1.

Figure 1: Ideal SOA (diagram showing services, an integration server with BPM, an enterprise service bus (ESB), SOA governance, and LOB systems)

We can all agree that Figure 1, at least in theory, represents an ideal architecture for enterprise applications. Unfortunately, large SOA implementations have taught us that this architecture is just that: an ideal that is permeated by enormous challenges in areas such as versioning, interoperability, performance, scalability, and governance.

These challenges are a direct consequence of the lack of constraints in SOA systems. Architecture styles that do not impose constraints on the underlying domain quite often produce complex and unmanageable systems. Instead of simplifying the capabilities of SOA and focusing on the important aspects (such as interoperability, performance, and scalability), we decided to abstract complexity with more standards and tools. As a result, we have been building systems that present limitations similar to the ones that originated the SOA movement in the first place.

One thing that we have learned from Ruby on Rails is the “convention over configuration” or “essence vs. ceremony” mantra. By removing options and sticking with conventions (that is, standard ways of doing something), you can remove extra levels of abstraction and, in doing so, remove unneeded complexity from systems. Embrace the options, when needed, but do not provide them just for the sake of giving options that will be underutilized.
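The “convention over configuration” idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the `ServiceDescriptor` type, the `/services/<name>` URL convention, and the `http://localhost` default. The point is only that a service address is derived from a convention, and an explicit option is supplied solely when a service genuinely needs one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceDescriptor:
    """Hypothetical description of a service (illustrative only)."""
    name: str                           # for example "OrderHistory"
    base_url: str = "http://localhost"  # assumed default host
    path: Optional[str] = None          # explicit override, set only when needed


def kebab_case(name: str) -> str:
    """Convert a name such as "OrderHistory" to "order-history"."""
    out = []
    for i, ch in enumerate(name):
        if ch.isupper() and i > 0:
            out.append("-")
        out.append(ch.lower())
    return "".join(out)


def resolve_url(svc: ServiceDescriptor) -> str:
    """Apply the naming convention unless an explicit path was configured."""
    path = svc.path if svc.path is not None else "/services/" + kebab_case(svc.name)
    return svc.base_url.rstrip("/") + path


conventional = resolve_url(ServiceDescriptor("OrderHistory"))
overridden = resolve_url(ServiceDescriptor("OrderHistory", path="/legacy/orders"))
```

Most services would never set `path`; the override exists for the rare case that needs it, which keeps the common case free of configuration.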
Although this article does not attempt to present a forensic analysis of failed SOA initiatives, we think that it is worth highlighting some of the factors that developers should consider when they implement large SOA solutions. Given the extent of this article, we decided to concentrate on the following topics:

• SOAP and transport abstraction
• Abuse of description languages
• ESB complexity
• WS-* interoperability
• SOA governance

The remainder of this article will dive deep into each of these topics and will explore alternative architecture styles that can help developers implement more robust SOA solutions.

SOAP and the Illusion of Transport Abstraction
The current generation of SOA solutions evolved around the concepts of the Simple Object Access Protocol (SOAP). This protocol was originally designed to abstract services from the underlying transport protocol. Conceptually, SOAP services can be hosted by using completely different transports, such as HTTP and MSMQ. Although in theory it is a lovely idea, in practice we have learned that SOAP’s reliance on transport neutrality comes at a significant cost, a cost that can be avoided when you do not need that neutrality. One of the best examples of the limitations of transport neutrality is the use of HTTP with SOAP-based service endpoints.

The vast majority of SOAP-based services rely on HTTP as the transport protocol. However, the SOAP HTTP binding uses only a very small subset of the HTTP specification, reduced to the POST method and a couple of headers. As a result, SOAP services that are hosted over HTTP do not take advantage of many of the features that have made HTTP one of the most popular and scalable transport protocols in the history of computer systems.

If we have inherited something good from SOAP, it has been the use of XML, which has drastically increased the interoperability of distributed applications. However, we can all agree that SOAP has failed to meet its original expectations. Think about it: SOAP was originally coined as the Simple Object Access Protocol; but, as we all learned, it is not simple, is not about object access, and, arguably, is not a protocol.

WSDL Abuse
The Web Services Description Language (WSDL) is one of the fundamental specifications in the SOA portfolio. The purpose of WSDL is to describe the capabilities of a service, such as the messages that it can send and receive or how those messages are encoded by using the SOAP protocol. Conceptually, WSDL represents an evolution of previous description languages, such as the Interface Description Language (IDL), which was the core of distributed programming technologies such as COM and CORBA. Following a similar pattern, WSDL quickly became the fundamental artifact that client applications used to generate “proxy” representations that abstract the communication with the service.

Undoubtedly, the idea of generating proxy artifacts that are based on a service’s WSDL can facilitate client-service interactions in very simple environments. As with its predecessors, the fundamental challenge of this approach is that it introduces a level of coupling between the client and the service. In large SOA scenarios that involve hundreds of services and clients, that level of service-client dependency is typically the cause of serious service-management and versioning challenges, as Figure 2 illustrates.

Figure 2: WSDL dependency, a big challenge in large SOA implementations (diagram showing a service with WSDL v1.0, a new version of the service with WSDL v2.0, and the many clients that depend on them)

To ESB or Not to ESB
The seamless integration of heterogeneous line-of-business (LOB) systems as part of business processes has been one of the promises of enterprise SOA systems. These integration capabilities are normally powered by a set of common patterns that constitute what the industry has considered one of the backbones of SOA: the enterprise service bus (ESB).

Figure 3: Central ESB (diagram showing services connected through a single central ESB)
Although there is no formal industry standard that defines what an ESB is, at least we can find a set of commonalities across the different ESB products. Typically, an ESB abstracts a set of fundamental capabilities such as protocol mapping, service choreographies, line-of-business (LOB) adapters, message distribution, transformations, and durability. Theoretically, we can use this sophisticated feature set to abstract the communication between services and systems, making the ESB the central backbone of the enterprise, as Figure 3 illustrates.

As architects, we absolutely have to love the diagram in Figure 3. Undoubtedly, it represents an ideal model in which messages are sent to a central service broker and from there distributed to the final services. Unfortunately, if we are working in a large SOA, we are very likely to find that a central bus architecture introduces serious limitations in aspects such as management, performance, and scalability, as our entire integration infrastructure now lives within a proprietary framework. Instead of being a facilitator, an ESB can become a bottleneck for the agility, governance, and scalability of our SOA. Consequently, we are forced to start building applications that do not fully leverage the ESB, and our architecture quickly starts to resemble the diagram in Figure 4.

Figure 4: ESB reality in a large enterprise (diagram showing services that only partly communicate through the ESB)

WS-* Madness
After the first wave of SOA specifications was created, several technology vendors began a collaborative effort to incorporate some key enterprise capabilities such as security, reliable messaging, and transactions into the SOAP/WSDL model. The result of this effort materialized in a series of specifications such as WS-Security, WS-Trust, and WS-ReliableMessaging, which the industry baptized as WS-* protocols. Beyond the undeniable academic value of the WS-* specifications, they have not been widely adopted in heterogeneous environments.

The limited WS-* adoption in the enterprise can ultimately be seen as a consequence of the incredibly large number of WS-* specifications that have been produced during the last few years. Currently, there are more than a hundred different versions of WS-* protocols, of which just a handful have seen real adoption in SOA solutions. Interoperability is by far the most challenging aspect of WS-*-based solutions, as different Web-service toolkits sometimes implement different WS-* protocols, different versions of the same protocols, or even different aspects of the same specification. Additionally, the adoption of WS-* has fundamentally been limited to the .NET and Java ecosystems, which makes it completely impractical to bring emerging programming paradigms such as dynamic or functional languages into SOA solutions (see Figure 5).

Figure 5: WS-* interoperability challenges (diagram showing a WCF client, a Ruby client, and an Oracle WebLogic client facing a WCF service secured by using WS-Trust and an Apache Axis2 service hosted by using JMS transport)

SOA Governance or SOA Dictatorship
Management and governance must be at the center of every medium to large SOA. While we keep enabling business capabilities via services, it is important to consider how to version, deploy, monitor, and manage them. This set of functionalities has been the bread and butter of SOA-governance platforms that have traditionally evolved around the concepts of the Universal Description, Discovery, and Integration (UDDI) specification.

Depending on our implementation, we might find that SOA-governance platforms are sometimes too limited for truly managing complex Web services. These types of challenges are very common in SOA-governance solutions and are a consequence of the fact that Web-service technologies have evolved faster than the corresponding SOA-governance platforms.

SOA-governance technologies have traditionally addressed those limitations by relying on a centralized model in which services are virtualized in the governance-hosting environments and policies are enforced at a central location. Although this model can certainly be applicable in small environments, it presents serious limitations in terms of interoperability, performance, and scalability (see Figure 6).
Figure 6: Centralized SOA-governance models, impractical in large SOA implementations (diagram showing clients reaching services through a central layer that performs virtualization, authentication, authorization, monitoring, and policy enforcement)

Introducing Lightweight SOAs
Conceptually, SOAs can be a very powerful vehicle for delivering true business value to enterprise applications. However, some of the challenges that are explained in the previous sections have caused SOA initiatives to become multiyear, multimillion-dollar projects that failed to deliver the promised agility.

Despite these challenges, correctly enabling SOAs can be a great differentiator for delivering true business value in an enterprise. However, we believe that a lighter, more agile approach that leverages the correct implementation techniques and emerging architecture styles is mandatory to implement SOA solutions correctly.

The remaining sections of this article will introduce some of the patterns and architecture techniques that we believe can help address some of the challenges that were presented in the previous sections. Fundamentally, we will focus on the following aspects:

• Leveraging RESTful services
• WS-* interoperability
• Federated ESB
• Lightweight SOA governance
• Embracing cloud computing

Embracing the Web: RESTful Services
In previous sections, we have explored various limitations of the fundamental building blocks of traditional SOA systems such as XML, SOAP, WSDL, and WS-* protocols. Although Web Services are transport-agnostic, the vast majority of implementations in the real world leverage HTTP as the underlying protocol. Those HTTP-hosted services should (in theory, at least) work similarly to Web-based systems. However, the different layers of abstraction that we have built over the HTTP protocol prevent those services from fully using the capabilities of the Web.

To address some of these challenges, SOA technologies have started to embrace more Web-friendly architecture styles, such as REST. REST has its origins in Roy Thomas Fielding’s Ph.D. dissertation, which states the principles that make the Web the most scalable and interoperable system in the history of computer software. Essentially, REST-enabled systems are modeled in terms of URI-addressable resources that can be accessed through stateless HTTP interactions. Following the principles of REST, we can architect highly scalable services that truly leverage the principles of the Web.

Undoubtedly, REST has become a very appealing alternative to SOAP/WS-* Web Services. The use of REST addresses some of the limitations of traditional Web Services, such as interoperability and scalability.

The following are capabilities of RESTful services:

• URI-addressable resources
• HTTP-based interactions
• Interoperability
• Stateless interactions
• Leveraging syndication formats
• Multiple resource representation
• Link-based relationships
• Scalability
• Caching
• Standard methods

The simplicity from the consumer perspective and the high levels of interoperability of RESTful services are some of the factors that can drastically improve the agility of the next generation of SOA solutions.

WS-* Interoperability
Despite the remarkable academic work that supports the WS-* family of protocols, we can all agree that its adoption and benefits did not reach initial expectations. Interoperability and complexity remain important challenges that we face in the adoption of WS-* protocols in the enterprise.

When it comes to WS-* interoperability, the absolute best practice is to identify the capabilities of the potential consumers of our services. Based on that, we can determine which WS-* protocols are best suited for our particular scenarios. In highly heterogeneous environments, we should consider enabling different service endpoints that support various WS-* capabilities. This approach can drastically improve the interoperability of our services, given that the different clients can interact with the service endpoint that interoperates best with them.

For example, let us consider a scenario in which we must secure a Web Service that is going to be consumed by .NET, Sun Metro, Oracle WebLogic, and Ruby clients. In this scenario, we could enable three service endpoints with different security configurations, based on the consumer capabilities, as Figure 7 illustrates.

Even in scenarios in which we decide to use WS-* protocols, the technique that Figure 7 illustrates helps us improve interoperability by
enabling various endpoints that capture the interoperability requirements of the different service consumers.

Figure 7: Pattern of multiple service endpoints (diagram showing WCF, Metro, Oracle WebLogic, and Ruby clients calling one service through a WS-Trust endpoint, a WS-Security endpoint, and a custom security endpoint)

Lightweight Federated ESBs
As explored in previous sections, a centralized ESB very often is one of the fundamental causes of failed SOA initiatives. The ability to centralize very smart functionalities such as message routing, transformation, and workflows is as appealing as it is unrealistic in medium-to-large enterprise environments. Essentially, by relying on the clean concept of the central service bus, we drastically constrain the options for scalability, specialization, and management of the enterprise solutions that leverage our SOA infrastructure.

After several failed attempts to implement centralized ESBs in large organizations, the industry has moved to a more agile pattern in which the functionality is partitioned across multiple lightweight physical ESBs that are grouped as a federated entity. This pattern is commonly known as federated ESB and represents one of the emerging architecture styles for building highly scalable ESB solutions (see Figure 8).

The federated-ESB pattern addresses the fundamental limitations of the centralized-ESB model by partitioning the infrastructure into separate ESBs that can be scaled and configured separately. For instance, in this model, we can have a specific ESB infrastructure to host the B2B interfaces, while another ESB is in charge of the financial-transaction processing. This approach still centralizes certain capabilities (such as security or endpoint configuration) that do not impose any scalability limitation on the SOA infrastructure.

Figure 8: Pattern of federated ESB (diagram showing several ESBs, each fronting its own services, that share business-activity monitoring, security, business rules, operational monitoring, service registry, and error handling)

Lightweight Governance
The limited adoption of UDDI in large-scale SOA environments has been a catalyst for the emergence of lighter and more interoperable SOA-governance models that leverage new architecture styles, such as REST and Web 2.0. Essentially, these new models look to remove some of the complexities that are bound to the centralized, UDDI-based architectures, replacing them with widely adopted standards such as HTTP, Atom, and JSON.

One of the most popular new SOA-governance models is the idea of a RESTful Service Repository. In this model, traditional SOA constructs such as services, endpoints, operations, and messages are represented as resources that can be accessed through a set of RESTful interfaces. Specifically, the use of resource-oriented standards such as Atom and the Atom Publishing Protocol (APP) is very appealing for representing and interacting with SOA artifacts (see Figure 9).
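A RESTful service repository of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design of any particular product: the `RegistryApp` class, the `/services/.../endpoints/...` URI layout, the `request` helper, and the use of JSON rather than Atom are all hypothetical choices made to keep the example short. The app is written against the standard WSGI interface and is called in-process, so no network server is needed to try it.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


class RegistryApp:
    """Minimal WSGI app that stores endpoint descriptions as URI-addressable resources."""

    def __init__(self):
        self._resources = {}  # maps a resource path to its decoded JSON description

    def __call__(self, environ, start_response):
        method = environ["REQUEST_METHOD"]
        path = environ["PATH_INFO"]
        if method == "PUT":
            # Register the resource at this URI, or replace an existing one.
            length = int(environ.get("CONTENT_LENGTH") or 0)
            created = path not in self._resources
            self._resources[path] = json.loads(environ["wsgi.input"].read(length))
            start_response("201 Created" if created else "204 No Content",
                           [("Location", path)])
            return [b""]
        if method == "GET":
            # Resolve the resource with a plain, stateless HTTP GET.
            if path in self._resources:
                body = json.dumps(self._resources[path]).encode("utf-8")
                start_response("200 OK", [("Content-Type", "application/json")])
                return [body]
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"unknown resource"]
        start_response("405 Method Not Allowed", [("Allow", "GET, PUT")])
        return [b""]


def request(app, method, path, payload=None):
    """Call the WSGI app in-process; return (status, decoded JSON body or None)."""
    raw = json.dumps(payload).encode("utf-8") if payload is not None else b""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ.update({
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    })
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], (json.loads(body) if seen["status"].startswith("200") else None)


registry = RegistryApp()
# Register an endpoint (the address below is a placeholder).
request(registry, "PUT", "/services/orders/endpoints/ws-security",
        {"address": "https://example.invalid/orders", "binding": "WS-Security"})
# Resolve it again with a single GET.
status, endpoint = request(registry, "GET", "/services/orders/endpoints/ws-security")
```

Because the registry speaks nothing but HTTP methods and a common media type, any client that can issue a GET request can resolve an endpoint, whatever platform it runs on.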
Figure 9: RESTful registry (diagram showing services that register their endpoints with a RESTful registry and a client that resolves service endpoints from it)

This model represents a lighter, more flexible approach to both design and runtime governance. For example, runtime-governance aspects such as endpoint resolution are reduced to a simple HTTP GET request against the RESTful interfaces. The main advantage of this type of governance is probably the interoperability that is gained by the use of RESTful services, which will allow us to extend our SOA-governance practices beyond .NET and J2EE to heterogeneous technologies such as dynamic or functional languages.

Welcoming the Cloud
The emergence of cloud-computing models is steadily changing the way in which we build distributed systems. Specifically, we believe that the next generations of SOA solutions are going to be a hybrid of cloud and on-premises services and infrastructure components.

The influence of cloud computing is by no means limited to the ability to host enterprise Web services in private or public clouds. Additionally, there are specific SOA-infrastructure components that can be directly enabled from cloud environments as a complement to the traditional on-premises SOA technologies (see Figure 10).

Figure 10: Enhancing on-premises SOAs with cloud infrastructures (diagram showing cloud services, namely an Internet service bus, a security-token service, and data services, above two on-premises SOAs, each with clients, services, and an enterprise service bus)
Let us look at some examples:

• Cloud-based service bus: Can we host an ESB in a cloud infrastructure? Absolutely! Leveraging this type of ESB can enable capabilities such as message routing, publish-subscribe, transformations, and service orchestration, which are the cornerstones of on-premises ESBs.
• Cloud-based security services: In the last few years, we have witnessed the increasing adoption of security services such as Windows Live ID or Facebook Connect. Leveraging cloud security infrastructures can facilitate the implementation of interoperable security mechanisms such as authentication, identity representation, authorization, and federation on Internet Web-service APIs.
• Cloud-based storage services: Arguably, storage services such as Amazon S3 or Azure DB are the most appealing capability of cloud infrastructures. Leveraging these types of service can drastically increase the flexibility and interoperability of the data-exchange mechanisms of our SOA, while removing some of the complexities that are associated with on-premises storage.

Conclusion
The traditional approach to SOA introduces serious challenges that make it impractical for large implementations. This article suggests a series of patterns that can help developers build lighter, more interoperable, and more scalable SOAs that enable true business agility in large enterprise scenarios.

The most important thing to keep in mind when you are building your enterprise services is the mantra “convention over configuration.” By keeping the number of options to a minimum and building only what is required by the business, you can create lighter-weight services that are easier to maintain and enhance.

Transport Abstraction
Consider first standardizing on the HTTP protocol. HTTP is a great lightweight alternative, and it can interoperate with more frameworks.
Do use SOAP & WS-* when transactions, durable messaging, or extreme (TCP/IP) performance is required.

SOAP & WSDL
If you have already decided to standardize on HTTP, there is little need for SOAP and WSDL. Learn to embrace technologies such as REST, JSON, and Atom Pub, and use the Web to the fullest extent.
Do not use SOAP and WSDL unless you are sure that you need the services that they provide.
Consider using REST, JSON, and Atom Pub as lightweight alternatives.
Do not fall into the trap of generating WSDLs as a side effect of creating SOAP-based services. Think contract first.

Governance & Discoverability
If you do not have WSDL, how can you govern your corporate services? By using a service registry, of course! UDDI failed, but that does not mean that a service repository was not needed; it means only that UDDI was too complex and tried to solve the wrong issues. By using a lightweight service registry that is built upon RESTful services, you can still supply governance and discoverability, without the complexity of UDDI.
Do store your service artifacts in a repository of some kind, not just as an option on an endpoint.
Do use your service repository to help govern the services of your corporation.
Consider using a RESTful service repository for SOAP and RESTful services, for governance and discoverability.

Enterprise Service Bus
To ESB or not to ESB: That is the question. Point-to-point communications are strongly coupled and easy to implement. But point-to-point communications, by their very nature, are brittle, tend to stagnate, and limit the business-intelligence opportunities that are embedded in the messages.
Do not confuse ESBs with event-processing systems. They are similar, but have different scales and performance requirements.
Consider federated ESBs, as they address the limitations of a centralized ESB (spoke and hub).
Do not reproduce your strongly coupled point-to-point patterns within your ESB by simply moving the code to the bridge.
Consider using pub-sub over request-response when you are building distributed systems.

Cloud-Based Services
Everything in the cloud: That seems to be where we are headed. Microsoft was a bit early with its My Services concept, but more and more services are headed towards the cloud.
Consider cloud-based security services over local, proprietary security for public-facing services. Security is arguably the most mature of the cloud-based services.
Consider the possibility of future enhancements to take advantage of cloud-based storage and the cloud-based service bus.

About the Authors
Jesus Rodriguez ([email protected]) is the Chief Architect at Tellago, Inc. He is also a Microsoft BizTalk Server MVP, an Oracle ACE, and one of a few architects worldwide who are members of the Microsoft Connected Systems Advisor team.

Don Demsak is a Senior Solution Architect at Tellago, based out of New Jersey, who specializes in building enterprise applications by using .NET. He has a popular blog at www.donxml.com and is a Microsoft Data Platform MVP and a member of the INETA Speakers Bureau.

Follow up on this topic
• WCF JSON support: http://msdn.microsoft.com/en-us/library/bb412173.aspx
• RESTful WCF: http://msdn.microsoft.com/en-us/library/bb412169.aspx
• MS BizTalk ESB Toolkit: http://msdn.microsoft.com/en-us/library/ee529141.aspx
• Windows Server AppFabric: http://msdn.microsoft.com/appfabric
    22 subscribe at www.architecturejournal.net