Embedding Intelligence Throughout the Enterprise
An InforSense White Paper

Submitted by David Menninger, VP Global Marketing & Product Management, InforSense
Monday, November 12, 2007

EXECUTIVE SUMMARY

Historically, business intelligence (BI) has been about bringing data to the analyst. Analysis could only be performed with specialized business intelligence tools, specialized data sources called data warehouses, and the specialized skills needed to design and use business intelligence systems. Instead of bringing the data to the analysts, InforSense provides software that enables enterprises to bring the analysis to the data. The solution is simple: data originates and resides in your existing business systems and processes. That’s where the analysis belongs.

By embedding analytics in business processes, you can ensure that the entire organization benefits from business intelligence technology and that it applies its decision-making policies in a consistent and efficient manner.

This document presents the following key technologies of the InforSense platform and describes how these enable your entire organization to leverage the information generated in your existing business processes to make better business decisions.

• Dynamic data and application integration: Existing applications such as customer relationship management (CRM), supply chain management, and customer service/customer support must all be tied together to create a single horizontal analytical platform. The InforSense platform can combine and analyze data in a flexible way to support analyses ranging from high data volume, single point analyses to high throughput real-time scoring applications.
• In-database processing: The ability to perform data processing and analytics within standard commercial databases such as Oracle, IBM DB2 and Microsoft SQL Server is critical for data warehouse creation and searching. Data integrity is maintained by avoiding the need to extract data unnecessarily for analysis. High performance data processing and analytics are both secure and scalable.
• Interactive analytical workflows: Adopting a workflow paradigm enables domain experts to rapidly and interactively create and optimize streamlined analytic applications that are specific to your organization’s needs.
• Predictive Analytics, Data Mining and Text Mining: Business processes may require very specific types of analyses ranging from simple analytics to the most advanced statistical algorithms or data mining techniques. Embedded analytics must support whatever types of analyses are needed by the business process.
• Enterprise deployment and reporting: The custom workflow applications incorporate standard reporting as well as advanced visualization techniques. These workflow applications can be deployed to any user community from a centrally maintained portal as fully interactive applications, enabling any type of decision making while optimizing standard data processing steps.

EMBEDDED ANALYTICS

Embedded analytics is a paradigm pioneered by InforSense that addresses the challenges faced by IT teams and management across all industries. This approach is based on the use of analytical workflows as a means for rapidly developing and deploying applications that integrate data from multiple sources and coordinate the invocation of various analytical tools. The approach enables domain experts to construct their own analytical workflows within a user-friendly environment, to conduct analysis, and to easily deploy their analyses as interactive applications for use by colleagues. The approach reduces the long cycle times required for developing and deploying analytical processes, while providing IT personnel with the control and flexibility they require to maintain, integrate and deliver the underlying data sources and software tools.

Analytical Workflows

The Embedded Analytics approach uses analytical workflows as the enabling technology for cross-domain data and application integration. Informally, a workflow (see Figure 2) is a high-level description of the steps required for executing a particular real-world process and the flow of information between these steps. Work passes through the flow from start to finish and the activities are executed by people or by system functions. Visually, a workflow is often best represented as a directed graph where tasks are represented as nodes (boxes) and information flow is represented as arcs (arrows).

In InforSense’s Embedded Analytics approach, workflows are used to specify the data processing and analysis steps using data integrated from distributed data sources. The workflows are authored through a visual interface where users can drag-and-drop nodes representing available data sources and processing tools. The workflows are then submitted for execution by a workflow engine that controls the access and data transfer between the distributed applications that implement the processing steps.
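The directed-graph view of a workflow can be illustrated with a minimal sketch. It is not InforSense's engine or file format: the node names, the `TASKS` and `ARCS` tables and the `run_workflow` function are all hypothetical, and the example assumes Python 3.9 or later for the standard `graphlib` module. Each node is a task, each arc names an upstream node whose output it consumes, and a small engine runs the nodes in dependency order.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks. Each receives a dict of its upstream nodes' outputs.
def load_orders(_inputs):
    return [{"customer": "a", "amount": 120.0},
            {"customer": "b", "amount": 40.0},
            {"customer": "a", "amount": 30.0}]

def load_customers(_inputs):
    return {"a": "Acme", "b": "Bolt"}

def join(inputs):
    # Integrate the two sources: attach the customer name to every order.
    orders, customers = inputs["load_orders"], inputs["load_customers"]
    return [{**o, "name": customers[o["customer"]]} for o in orders]

def total_by_name(inputs):
    totals = {}
    for row in inputs["join"]:
        totals[row["name"]] = totals.get(row["name"], 0.0) + row["amount"]
    return totals

# Nodes (boxes) and arcs (arrows): each node maps to the nodes that feed it.
TASKS = {"load_orders": load_orders, "load_customers": load_customers,
         "join": join, "total_by_name": total_by_name}
ARCS = {"load_orders": [], "load_customers": [],
        "join": ["load_orders", "load_customers"],
        "total_by_name": ["join"]}

def run_workflow(tasks, arcs):
    """Execute every node after all of its upstream nodes have finished."""
    results = {}
    for node in TopologicalSorter(arcs).static_order():
        inputs = {dep: results[dep] for dep in arcs[node]}
        results[node] = tasks[node](inputs)
    return results

results = run_workflow(TASKS, ARCS)
print(results["total_by_name"])
```

Keeping the graph as plain data, separate from the task code, is what would let a visual editor author the workflow while an engine decides how and where each step runs.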

In-Database Workflows and Execution

Commercial relational databases dominate the enterprise landscape, and InforSense leverages their specialized in-database analytics capabilities. The InforSense workflow engine provides specialized extensions for dynamically accessing, querying and manipulating data stored in Oracle, DB2, and SQL Server databases. These extensions enable users to access and integrate other in-database functions, including SQL scripts, stored procedures, and database-specific statistics and data mining components. The implementation is engineered to keep the data in the database throughout the whole workflow execution, ensuring fully in-database processing from inception to delivery, for simplified data management, data integrity, performance, security and scalability.
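The difference between extracting data for analysis and processing it in the database can be sketched as follows. This is an illustration only: SQLite (via Python's standard `sqlite3` module) stands in for Oracle, DB2 or SQL Server, and the `sales` table and both function names are hypothetical rather than part of any InforSense extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 100.0), ("north", 50.0), ("south", 70.0)])

def totals_extracted(conn):
    """Pull every row out of the database, then aggregate in application code."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals

def totals_in_database(conn):
    """Let the database aggregate; only the summary rows leave it."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))

# A downstream step can also be materialized without the data ever leaving
# the database, so later workflow steps read from a table, not from a client.
conn.execute("CREATE TABLE region_totals AS "
             "SELECT region, SUM(amount) AS total FROM sales GROUP BY region")

print(totals_in_database(conn))
```

Both functions return the same totals, but the second transfers one row per region instead of one row per sale, which is where the performance and data-integrity arguments for in-database processing come from.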

Interactive Analytical Workflows

InforSense provides an interactive user environment, enabling experts and non-experts alike to rapidly design, construct and execute analytical workflows. It also provides a wide variety of interactive knowledge discovery tools.

The availability of these interactive tools transforms the workflow-authoring tool from a simple workflow scripting environment to a fully-fledged knowledge discovery environment designed to support the dynamic, iterative and interactive decision-making process. It allows users to easily follow a trial and error approach when analyzing their data, as needed, and allows them to interactively inspect the results of the analysis at any point in a workflow.

The basic interactivity features of the environment include:
• Dynamic Data Integration: Users can spontaneously query, analyze and integrate data from multiple data sources with no database or SQL programming expertise.
• Built-in Knowledge Discovery Tools: Users can access and execute a wide range of built-in data processing, transformation and analysis operations that are commonly required.
• Interactive Visualization Tools: Users can access an extensive and extensible set of generic and specialized visualization tools that can be launched from any node in a workflow. These tools enable instant review, comparison and refinement of results to support quick insights and decision-making.
• Visualization Network Technology: Users can launch multiple interactive visualization tools from more than one node in a workflow. Seamless interaction among independent visualization tools enables users to easily define relationships between independent datasets, dynamically transfer data selections among the visualization tools, and persist those selections through the workflow.
• Guided Predictive Model Development: Users have access to wizards that guide them through different analytical tasks, including predictive model development and optimization.

Rapid Interactive Application Deployment

In addition to execution from the main InforSense client interface, InforSense analytical workflows can be deployed easily and rapidly as end-user applications across the enterprise via a variety of methods, including a portal interface or into a third-party application such as a customer service application or supply chain application. The deployment mechanism is also programming-free and requires no extra software or third-party code. The same workflows can be executed as web services, via the command line or via the InforSense Server API, ensuring consistent enterprise-wide dissemination of applications.
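The idea of one workflow reachable through several entry points can be sketched in a few lines. Nothing here reflects the actual InforSense Server API: the scoring rule, the function names and the parameter names are hypothetical, chosen only to show why routing every channel through the same logic keeps results consistent.

```python
import argparse
import json

def score_customer(recency_days, orders):
    """Hypothetical workflow logic: a toy churn-risk rule, not a real model."""
    return "high" if recency_days > 90 and orders < 3 else "low"

def run_from_api(params):
    """Programmatic entry point: a dict of parameters in, a dict out."""
    return {"risk": score_customer(params["recency_days"], params["orders"])}

def run_from_cli(argv):
    """Command-line entry point: parse flags, then reuse the API path."""
    parser = argparse.ArgumentParser(description="Run the workflow from a shell")
    parser.add_argument("--recency-days", type=int, required=True)
    parser.add_argument("--orders", type=int, required=True)
    args = parser.parse_args(argv)
    return run_from_api({"recency_days": args.recency_days,
                         "orders": args.orders})

def run_from_web_request(body):
    """Web-service style entry point: JSON text in, JSON text out."""
    return json.dumps(run_from_api(json.loads(body)))

print(run_from_cli(["--recency-days", "120", "--orders", "1"]))
print(run_from_web_request('{"recency_days": 10, "orders": 5}'))
```

Because the command-line and web wrappers only translate their inputs and delegate to `run_from_api`, a change to the workflow logic reaches every channel at once.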

Furthermore, analytical workflows deployed as applications, via a portal interface, can make use of InforSense's interactive visualization tools which include charts and plots deployed as applets. This feature provides portal users with the same rich interactive analysis and decision-making facilities as the InforSense main client. The portal supports communication between multiple workflow applications ensuring that the deployed interactive applications can easily form part of a comprehensive enterprise-wide solution.

Predictive Model Development & Deployment

Predictive modelling has many applications within business intelligence processes. For example, credit risk, fraud detection, quality control, warranty analysis, lifetime value of a customer and cross-sell/up-sell analyses all require predictive modelling. InforSense provides an extensible toolset for predictive modelling methods including Decision Tree Induction, Decision Rule Induction, Naïve Bayes Classification, Logistic Regression, Neural Net, SVM and an Ensemble Classification approach. It also provides extensions for multivariate analysis techniques including Principal Component Analysis (PCA), Partial Least Squares (PLS), Linear and Polynomial Regression and Neural Net Prediction. Extensions for statistical analysis include ANOVA, F-test, Kolmogorov-Smirnov test, T-test, Wilcoxon test and a wide range of descriptive summary statistics.

In addition, the system offers interfaces to various statistical analysis tools including Weka, R, Matlab and Oracle-based analytics components, including Oracle Data Mining and Oracle statistics.

Furthermore, the InforSense Classification Studio provides an interactive environment that facilitates interaction essential for this kind of data mining while capturing the resulting analysis process in the workflow format so that it is instantly reusable. It provides a wizard environment that guides users through predictive model development and optimization, allowing for the building of workflows that capture important analysis steps from start to finish, including:
• Data partitioning and preparation, feature subset selection methods, building predictive classification models, and model ensembles.
• Assessment of models, with a graphical environment for visualizing attribute importance tables, confusion matrices, lift and gains charts, and ROC (Receiver Operating Characteristic) plots.
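The assessment measures named above can be computed from nothing more than true labels and model scores. The sketch below uses plain Python and invented data; the function names are illustrative and are not Classification Studio functions. It assumes binary labels (1 for the class of interest) and distinct scores, so ties are not handled.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def roc_points(y_true, scores):
    """ROC curve as (false positive rate, true positive rate) pairs,
    obtained by lowering the decision threshold one case at a time."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in sorted(zip(scores, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def cumulative_gain(y_true, scores, fraction):
    """Share of all positives found in the top-scoring `fraction` of cases."""
    ranked = [label for _, label in sorted(zip(scores, y_true), reverse=True)]
    k = round(len(ranked) * fraction)
    return sum(ranked[:k]) / sum(ranked)

# Invented example data: eight cases, four of them positive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

gain = cumulative_gain(y_true, scores, 0.5)
print(confusion_matrix(y_true, y_pred))
print(auc(roc_points(y_true, scores)))
print(gain, gain / 0.5)  # gain, and lift relative to random targeting
```

Lift is simply the gain divided by the fraction of cases contacted, so a lift above 1 means the model finds positives faster than random selection would.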

Text Analytics

Much of the information that influences our analysis and decision-making is contained in documents and publications. By some estimates, 80-85% of all data used in businesses is based on unstructured data sources. Text analytics is necessary if an organization is going to make full and efficient use of the data available in its business processes. It can provide the user with the following:
1. Information, such as concepts, extracted from within documents, not just a list of documents or sections of text from the documents.
2. Information extracted from the full document collection, which may not reside in one single document.

Text analytics in this context includes several techniques. Many applications already exist which provide a subset of these techniques. The InforSense platform has been designed to provide and seamlessly integrate all of them. The techniques involved in text analytics are listed here in three levels, with each level building on the processing undertaken in the previous level.

Level One
• Information Retrieval: Retrieval and ranking of documents from a collection based on a user-defined query.
• Text Processing: Any text-in, text-out operation usually used to get the raw text into a format for further processing. It may include parsing XML documents, document filtering, word stemming, character replacement and document sectioning.

Level Two
• Named Entity Recognition: The identification and extraction from within the text of certain predefined entities such as dates, people’s names, addresses, etc. In the life sciences, these can include gene names, chemical compounds, tissue types, etc.
• Natural Language Processing: The linguistic processing of the text to identify each word's or phrase's part of speech (noun, verb, preposition, etc.).

Level Three
• Document Classification: The categorization/assignment of documents into pre-defined classes.
• Document Clustering: Automatic categorization of documents into groups based on some measure of text similarity. The groups are not predefined, in contrast to document classification.
• Information Extraction: The extraction of specific information items from a document collection, e.g. gene-disease interactions or gene-tissue interactions.
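How the three levels build on one another can be shown with a deliberately small sketch. Every piece is a toy stand-in chosen for brevity: the documents are invented, retrieval uses a basic TF-IDF weighting, entity recognition is a single regular expression for ISO-style dates, and classification is a keyword rule. None of it represents the techniques or components InforSense actually ships.

```python
import math
import re
from collections import Counter

# Invented documents for illustration.
DOCS = {
    "d1": "Customer reported a billing error on 2007-10-02 and requested a refund.",
    "d2": "Shipment delayed at the warehouse; delivery rescheduled for 2007-11-05.",
    "d3": "Refund issued after the billing dispute was resolved.",
}

def tokenize(text):
    """Level one, text processing: lowercase the text and keep word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def rank(query, docs):
    """Level one, information retrieval: rank matching documents by TF-IDF."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n = len(docs)

    def idf(term):
        df = sum(1 for c in counts.values() if term in c)
        return math.log((1 + n) / (1 + df)) + 1

    scores = {doc_id: sum(c[t] * idf(t) for t in tokenize(query))
              for doc_id, c in counts.items()}
    hits = [doc_id for doc_id in scores if scores[doc_id] > 0]
    return sorted(hits, key=lambda doc_id: -scores[doc_id])

DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_dates(text):
    """Level two, named entity recognition: pull out one predefined entity type."""
    return DATE.findall(text)

CLASS_KEYWORDS = {
    "billing": {"billing", "refund", "invoice"},
    "logistics": {"shipment", "delivery", "warehouse"},
}

def classify(text):
    """Level three, document classification: assign the best-matching class."""
    tokens = set(tokenize(text))
    best = max(CLASS_KEYWORDS, key=lambda c: len(tokens & CLASS_KEYWORDS[c]))
    return best if tokens & CLASS_KEYWORDS[best] else None

for doc_id in rank("billing error", DOCS):
    print(doc_id, extract_dates(DOCS[doc_id]), classify(DOCS[doc_id]))
```

The loop at the end mirrors the layering in the text: retrieval narrows the collection, entity extraction pulls structured values out of the retrieved documents, and classification attaches a label that a structured-data workflow can then use.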

InforSense is unique in that it integrates text analytics into the underlying business intelligence platform, enabling users to construct workflows that access and analyze both structured and unstructured data.

InforSense Architecture

The InforSense architecture supporting the development, execution and deployment of analytical workflows is shown in Figure 8.

The InforSense workflow server is based on a high performance and scalable service-based architecture for managing and integrating data from a wide variety of different and distributed sources and for coordinating the execution of distributed data analysis software components. It manages services for accessing and querying remote data sources and employing remote computational services through standardised APIs (Application Programming Interfaces). It also provides services for managing data and control flow operations, notification operations and workflow scheduling operations. The server makes use of a variety of standards, including web services for accessing distributed data and resources, and a well-defined SDK (Software Development Kit) which enables users to rapidly integrate proprietary data sources and applications into their workflows.

The workflow server communicates with different workflow clients (specialized clients, web-based clients, etc.) via a variety of standards-based APIs. Such clients can be used to author workflows, to invoke their execution, or both. The InforSense application deployment mechanism ensures that all the clients have the ability to execute the same workflows.

Knowledge Management Tools

InforSense's embedded analytics approach goes beyond data and application integration. It provides a basis for intellectual property capture and management within an organization. Teams of users can collaboratively author their workflows and re-use them in different applications. Organizations can audit the analytical processes developed using workflows and manage portfolios of such projects more efficiently. The end results include capturing and sharing best practices and SOPs, improved project management and improved decision-making across the organization. InforSense's workflow management technology is supported by various features, including:
• Analytical workflow storage, retrieval and sharing
• Collaborative workflow annotation
• Workflow change history tracking
• Automatic report generation

These knowledge management capabilities make InforSense a natural hub or repository for your organization, and even for an extended community via an extranet. InforSense has used some of these capabilities to create a hub for sharing workflows among its customers at http://chub.inforsense.com.

CONCLUSIONS

It is likely that your organization has implemented some business intelligence initiatives. While those initiatives have provided value, they have not been successful at reaching your entire organization and making BI pervasive. In order for your BI efforts to succeed more broadly they need to be embedded directly into your business processes.

Process-driven BI requires a broad set of analytical functionality as well as new capabilities including Web services, workflow, collaboration, predictive analytics and data mining. Analytical workflows provide the mechanism for repeatable analyses and for collaboration around the analytical process. A rapid application development environment is necessary to quickly identify and respond to changing market conditions.

Predictive modeling or data mining enables analysis of large volumes of data with thousands of potentially relevant factors that would be impossible to evaluate individually or manually. And with these large volumes of data, advanced visualization capabilities are necessary. The actual process of embedding must be cost effective. It cannot require significant modification of your existing systems. Web services and portal deployment technology provide the mechanism for embedding the analyses within your existing IT infrastructure.

A new generation of business intelligence technology is available today from InforSense which provides these capabilities. Your organization can adopt this technology today and begin realizing the benefits of embedding intelligence throughout the enterprise.

For more information or a demonstration, please call us today to talk to one of our consultants.

www.inforsense.com
information@inforsense.com

Europe:
InforSense Limited
Colet Court,
100 Hammersmith Road,
London, W6 7JP
United Kingdom
Phone: +44 (0)20 8237 8440
Fax: +44 (0)20 8237 8441

North America:
InforSense LLC
155 Second Street,
Cambridge, MA 02141
USA
Phone: +1 617 547 2500
Fax: +1 617 547 2772

©2007 InforSense Ltd. All rights reserved. InforSense, the InforSense logo and TextSense are registered trademarks of InforSense Ltd. Open Discovery Workflow is a trademark of InforSense Ltd. All other brands or product names are trademarks of their respective holders.
