Data and Technology
Business Analytics Data
To generate analytics, we need both
 Structured data
 Unstructured data
As a beginning for organizing data into an understandable framework,
statisticians usually categorize data into meaningful groups.
Data can be generated by
 Primary Sources of Data
 Secondary Sources of Data
Categorizing Data
 There are many ways to categorize business analytics data.
 Data is commonly categorized by its source, as coming from either
 Internal sources, or
 External sources.
Internal Source
 Business
 Customer
 Information from ERP system
 CRM system
 Human Resources
 Product
 Production
 Questionnaire
 Web Logs
 Billing and reminder system
External Source
 Customer Satisfaction
 Customer Demographic
 Competition
 Economic
 When firms try to solve internal production or service operations
problems, internally sourced data may be all that is needed.
 Typical external sources of data are numerous and provide great
diversity and unique challenges for BA to process.
 Data can be measured quantitatively (for example, sales dollars) or
qualitatively by preference surveys (for example, products
compared based on consumers preferring one product over
another) or by the amount of consumer discussion (chatter) on the
Web regarding the pluses and minuses of competing products.
 A major portion of the external data sources
are found in the literature.
 For example, the US Census and the
International Monetary Fund (IMF) are useful
data sources at the macroeconomic level
for model building.
 Likewise, audience and survey data sources
might include Nielsen
(www.nielsen.com/us/en.html),
psychographic or demographic data
sourced from Claritas (www.claritas.com),
financial data from Equifax
(www.equifax.com), Dun & Bradstreet
(www.dnb.com), and so forth.
DATA ISSUES
 Two data issues are critical to the usability of any database or data
file: data quality and data privacy.
 Data quality can be defined as data that serves the purpose for
which it is collected.
 It means different things for different applications, but there are
some commonalities of high-quality data.
These qualities usually include
 Accurately representing reality
 Measuring what it is supposed to measure
 Being timely and complete.
 When data is of high quality, it helps ensure competitiveness, aids
customer service, and improves profitability.
 When data is of poor quality, it can provide information that is
contradictory, leading to misguided decision-making.
 For example, missing data in files can prohibit some forms of
statistical modeling, and incorrect coding of information can
render databases completely useless.
 Data quality requires effort on the part of data managers to cleanse
data of erroneous information and repair or replace missing data.
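The cleansing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer records, the field names, and the choice of mean imputation are all hypothetical and do not come from the slides; other repair strategies may suit a given data set better.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"customer_id": 1, "region": "West", "sales": 1200.0},
    {"customer_id": 2, "region": None, "sales": 850.0},
    {"customer_id": 3, "region": "East", "sales": None},
    {"customer_id": 4, "region": "West", "sales": 430.0},
]


def completeness(rows, field):
    """Return the fraction of rows whose value for `field` is present."""
    present = sum(1 for row in rows if row.get(field) is not None)
    return present / len(rows)


def fill_missing_with_mean(rows, field):
    """Return copies of the rows with missing numeric values replaced
    by the mean of the observed values (one assumed repair strategy)."""
    observed = [row[field] for row in rows if row[field] is not None]
    mean = sum(observed) / len(observed)
    return [
        {**row, field: mean if row[field] is None else row[field]}
        for row in rows
    ]


cleaned = fill_missing_with_mean(records, "sales")
print("sales completeness before:", completeness(records, "sales"))
print("sales completeness after:", completeness(cleaned, "sales"))
```

Measuring completeness before and after the repair gives data managers a simple way to report how much of a file had to be imputed.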
DATA PRIVACY
 Data privacy refers to the protection of shared data such that access is
permitted only to those users for whom it is intended.
 It is a security issue that requires balancing the need to know with the
risks of sharing too much. There are many risks in leaving unrestricted
access to a company’s database.
For example, competitors can steal a firm’s customers by accessing
addresses. Data leaks on product quality failures can damage brand
image, and customers can become distrustful of a firm that shares
information given in confidence.
 To avoid these issues, a firm needs to abide by the current legislation
regarding customer privacy and develop a program devoted to data
privacy.
 A large part of what BA personnel do is related to managing information
systems to collect, process, store, and retrieve data from various sources.
 Collecting and retrieving data and computing analytics require the use of
computers and information technology.
BUSINESS ANALYTICS TECHNOLOGY
 Firms need an information
technology (IT) infrastructure that
supports personnel in the conduct
of their daily business operations.
 The general requirements for such
a system are stated in Table
 These types of technology are
elemental needs for business
analytics operations
DATABASE MANAGEMENT SYSTEMS
(DBMS)
 Of particular importance for BA are the data management technologies.
 A database management system (DBMS) is data management software
that permits firms to centralize data, manage it efficiently, and
provide access to stored data by application programs.
 DBMS usually serves as an interface between application programs and the
physical data files of structured data.
 DBMS makes the task of understanding where and how the data is actually
stored more efficient.
 In addition, other DBMSs can handle unstructured data.
 For example, object-oriented DBMSs are able to store and retrieve
unstructured data, like drawings, images, photographs, and voice data.
These types of technology are necessary to handle the load of big data that
most firms currently collect.
 DBMS includes capabilities and tools for organizing, managing, and
accessing data in databases. Four of the more important capabilities
are
 Data Definition Language
 Data Dictionary,
 Database Encyclopedia and
 Data Manipulation Language.
DATA DEFINITION
The data definition language is used to create database tables and to define the
characteristics of the fields that identify content. These tables and characteristics
are critical success factors for search efforts as the database grows in size.
DATA DICTIONARY
Database tables and characteristics are documented in the data
dictionary (an automated or manual file that stores the size, descriptions,
format, and other properties needed to characterize data).
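As a rough illustration of the two capabilities above, the sketch below uses Python's built-in sqlite3 module. A CREATE TABLE statement plays the role of the data definition language, and SQLite's PRAGMA table_info is read back as a small automated data dictionary. The customer table and its columns are hypothetical, and SQLite is only one possible DBMS, not one named in the slides.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Data definition: create a table and declare the characteristics of its fields.
conn.execute(
    """
    CREATE TABLE customer (
        customer_id  INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        region       TEXT,
        credit_limit REAL
    )
    """
)

# Data dictionary: PRAGMA table_info returns one row per column as
# (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [
    {
        "column": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(customer)"
    )
]

for entry in data_dictionary:
    print(entry)
```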
DATABASE ENCYCLOPEDIA
The database encyclopedia is a table of contents listing a firm’s current data
inventory and what data files can be built or purchased.
DATA MANIPULATION LANGUAGE
 Of particular importance for BA are the data manipulation language
tools included in a DBMS.
 These tools are used to search databases for specific information.
 An example is Structured Query Language (SQL), which allows users to
find specific data through a session of queries and responses in a
database.
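A query-and-response session of the kind described above might look like the following sketch, again using Python's sqlite3 module. The sales table, its rows, and the question being asked (total sales per product in one region) are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Widget", "West", 120.0),
        ("Widget", "East", 80.0),
        ("Gadget", "West", 200.0),
        ("Gadget", "West", 50.0),
    ],
)

# Query: total sales per product in the West region, largest first.
# The region is passed as a parameter rather than pasted into the SQL text.
rows = conn.execute(
    """
    SELECT product, SUM(amount) AS total
    FROM sales
    WHERE region = ?
    GROUP BY product
    ORDER BY total DESC
    """,
    ("West",),
).fetchall()

for product, total in rows:
    print(product, total)
```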
THE TYPICAL CONTENT OF THE DATABASE
ENCYCLOPEDIA
DATA WAREHOUSES
 Data warehouses are databases that store current and historical data of
potential interest to decision makers.
 A data warehouse makes data available to anyone who needs
access to it.
 The data in a data warehouse cannot be altered.
 Data warehouses also provide a set of query tools, analytical tools, and
graphical reporting facilities.
 Some firms use intranet portals to make data warehouse information widely
available throughout a firm.
DATA MARTS
 Data marts are focused subsets or smaller groupings within a data warehouse.
Firms often build enterprise-wide data warehouses where a central data
warehouse serves the entire organization and smaller, decentralized data
warehouses (called data marts) serve particular groups of users.
 Each data mart is focused on a limited portion of the organization’s data that is
placed in a separate database for a specific population of users.
 For example, a firm might develop a smaller database on just product quality
to focus efforts on quality customer an
 Once data has been captured and placed into database management
systems, it is available for analysis with BA tools, including online analytical
processing, as well as data, text, and Web mining technologies.
 Online analytical processing (OLAP) is software that allows users to view data in
multiple dimensions.
 For example, employees can be viewed in terms of their age, sex, geographic
location, and so on.
 OLAP would allow identification of the number of employees who are age 35,
male, and in the western region of a country.
 OLAP allows users to obtain online answers to ad hoc questions quickly, even
when the data is stored in very large databases.
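The employee example above can be sketched as a tiny in-memory "cube" in Python. This is not how a production OLAP engine is built; it only illustrates counting along several dimensions at once. The employee records are hypothetical.

```python
from collections import Counter

# Hypothetical employee records with three dimensions: age, sex, region.
employees = [
    {"age": 35, "sex": "male", "region": "West"},
    {"age": 35, "sex": "male", "region": "West"},
    {"age": 35, "sex": "female", "region": "West"},
    {"age": 42, "sex": "male", "region": "East"},
    {"age": 35, "sex": "male", "region": "East"},
]

# Each cell of the cube counts the employees sharing one (age, sex, region) combination.
cube = Counter((e["age"], e["sex"], e["region"]) for e in employees)


def slice_count(cube, age=None, sex=None, region=None):
    """Sum the cells matching the given dimension values; None means 'all values'."""
    wanted = (age, sex, region)
    return sum(
        count
        for key, count in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


west_males_35 = slice_count(cube, age=35, sex="male", region="West")
all_males = slice_count(cube, sex="male")
print(west_males_35, all_males)
```

Leaving a dimension unspecified rolls the count up across that dimension, which is the ad hoc flexibility the slides attribute to OLAP.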
MINING IN BA
DATA MINING
 Data mining is a discovery-driven software process that provides
insights into business data by finding hidden patterns and relationships in big
data or large databases and inferring rules from them to predict future
behavior.
 The observed patterns and rules are used to guide decision-making. They
can also act to forecast the impact of those decisions.
 It is an ideal predictive analytics tool used in the BA process.
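One common way to "infer rules" from data is association-rule analysis, shown below as a minimal Python sketch. The slides do not name a specific technique, so this choice is an assumption, and the market-basket transactions are hypothetical. Support measures how often a set of items appears together; confidence measures how often the consequent appears when the antecedent does.

```python
# Hypothetical market-basket transactions (each set is one customer purchase).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"butter", "jam"},
    {"bread", "butter"},
]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset):
    """Share of all transactions that contain `itemset`."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Share of transactions with `antecedent` that also contain `consequent`."""
    return count(antecedent | consequent) / count(antecedent)


# Candidate rule: customers who buy bread also buy butter.
rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```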
WEB MINING
 Web mining seeks to find patterns, trends, and insights into customer
behavior from users of the Web.
 Marketers, for example, use BA services like Google Trends
(www.google.com/trends/) and Google Insights for Search
(http://google.about.com/od/i/g/google-insights-for-search.htm)
to track the popularity of various words and phrases to learn what consumers are
interested in and what they are buying.
In addition to the general software applications discussed earlier, there are
focused software applications used every day by BA analysts in conducting the
three steps of the BA process:
 Microsoft Excel® spreadsheet applications,
 SAS applications and
 SPSS applications.
 Microsoft Excel (www.microsoft.com/) spreadsheet systems have add-in
applications specifically used for BA analysis.
 These add-in applications broaden the use of Excel into areas of BA.
 Analysis ToolPak is an Excel add-in that contains a variety of statistical tools (for
example, graphics and multiple regression) for the descriptive and predictive BA
process steps.
 Another Excel add-in, Solver, contains operations research optimization tools (for
example, linear programming) used in the prescriptive step of the BA process.
 SAS® Analytics Pro (www.sas.com/) software provides a desktop statistical toolset
allowing users to access, manipulate, analyze, and present information in visual formats.
 It permits users to access data from nearly any source and transform it into meaningful,
usable information presented in visuals that allow decision makers to gain quick
understanding of critical issues within the data.
 It is designed for use by analysts, researchers, statisticians, engineers, and scientists who
need to explore, examine, and present data in an easily understandable way and
distribute findings in a variety of formats. It is a statistical package chiefly useful in the
descriptive and predictive steps of the BA process.
 Other software applications exist to cover the prescriptive step of the BA process. One
that will be used in this book is LINGO® by Lindo Systems (www.lindo.com).
 LINGO is a comprehensive tool designed to make building and solving optimization
models faster, easier, and more efficient.
 LINGO provides a completely integrated package that includes an
understandable language for expressing optimization models, a full-featured
environment for building and editing problems, and a set of built-in solvers
to handle optimization modeling in linear, nonlinear, quadratic, stochastic,
and integer programming models.
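To give a feel for the linear programming models that tools such as Solver and LINGO handle, here is a small Python sketch of a two-product profit problem. It is not how those packages work internally: it simply checks every corner point of the feasible region, which is only practical for toy problems with two variables. All coefficients and constraint names are hypothetical.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (2.0, 1.0, 100.0),  # labor hours available (assumed)
    (1.0, 1.0, 80.0),   # material units available (assumed)
    (-1.0, 0.0, 0.0),   # x >= 0
    (0.0, -1.0, 0.0),   # y >= 0
]
profit = (30.0, 20.0)   # assumed profit per unit of x and y


def intersection(c1, c2):
    """Point where two constraint boundaries cross, or None if they are parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def feasible(point, tol=1e-9):
    """True if the point satisfies every constraint (within a small tolerance)."""
    x, y = point
    return all(a * x + b * y <= c + tol for a, b, c in constraints)


def objective(point):
    return profit[0] * point[0] + profit[1] * point[1]


# A bounded linear program attains its optimum at a corner of the feasible region.
vertices = []
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and feasible(point):
        vertices.append(point)

best = max(vertices, key=objective)
print("best plan:", best, "profit:", objective(best))
```

Real solvers use far more efficient algorithms and scale to many variables and constraints, which is why dedicated software is used for the prescriptive step.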
 In summary, the technology needed to support a BA program in any
organization will entail a general information system architecture, including
database management systems, and will progress in greater specificity down to
the software that BA analysts need to compute their unique contributions to the
organization. Organizations with greater BA requirements will have
substantially more technology to support BA efforts, but all firms that seek to
use BA as a strategy for competitive advantage will need a substantial
investment in technology, because BA is a technology-dependent
undertaking.

Data and types in business analytics process
