Data Mining With Big Data
Guide: Prof. Prashant G. Ahire
Presented by :
Miss Rupa Solapure
Roll no. 259
Agenda
Problem Definition
Objectives
Literature Survey
Architecture/Big Data mining algorithm
Existing System/Mathematical model
Advantages
Disadvantages/Limitations
Characteristics of Big Data
Big Data and its challenges
Big Data mining Tools
Applications of Big Data
References
Problem Definition:
Big Data consists of large-volume, complex, growing data sets with
numerous, independent sources. With the fast development of networking,
data storage, and data collection capacity, Big Data is now expanding
quickly in all science and engineering domains, including the physical,
biological, and biomedical sciences. This paper elaborates a HACE theorem
that characterizes the features of the Big Data revolution, and proposes a
Big Data processing model from the data mining perspective.
Objective:
Mining Big Data requires carefully designed algorithms to analyze model
correlations between distributed sites and to fuse decisions from multiple
sources to obtain the best model from the Big Data. Developing a safe and
sound information-sharing protocol is a major challenge.
To support Big Data mining, high-performance computing platforms are
required, and these call for systematic designs to unleash the full power of
the Big Data. Big Data is an emerging trend, and the need for Big Data mining
is rising in all science and engineering domains.
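One simple way to picture "fusing decisions from multiple sources" is a majority vote over the labels predicted by models trained at different sites. The sketch below is only an illustration under that assumption; the slides do not specify a fusion rule, and `fuse_decisions` and the sample predictions are hypothetical.

```python
from collections import Counter

def fuse_decisions(predictions):
    """Majority vote over the labels predicted by models from different sites."""
    # most_common(1) returns the label with the highest vote count.
    label, votes = Counter(predictions).most_common(1)[0]
    # The second value is the share of sites that agree with the fused label.
    return label, votes / len(predictions)

# Hypothetical predictions for one record from three autonomous sites.
site_predictions = ["fraud", "normal", "fraud"]
decision, agreement = fuse_decisions(site_predictions)
```

Only the predicted labels leave each site in this scheme, which is one reason voting is often used when raw data cannot be shared.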
Literature Survey
Title/Year: “Data Mining With Big Data”, Jan 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex
and evolving associations.
Concept/Abstract: This paper presents a HACE theorem that characterizes the
features of the Big Data revolution and proposes a Big Data processing model
from the data mining perspective.
Author: Xindong Wu, Fellow, IEEE; Xingquan Zhu, Senior Member, IEEE;
Gong-Qing Wu; and Wei Ding

Title/Year: “The Survey of Data Mining Applications And Feature Scope”,
June 2012
Keywords: Data mining task, data mining life cycle, visualization of the data
mining model, data mining methods, data mining applications.
Concept/Abstract: This paper presents a number of data mining applications
and also focuses on the scope of data mining, which will be helpful in
further research.
Author: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi

Title/Year: “Review on Data Mining with Big Data”, Dec 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex
and evolving associations.
Concept/Abstract: This data-driven model involves demand-driven aggregation
of information sources, mining and analysis, and security and privacy
considerations.
Author: Savita Suryavanshi, Prof. Bharati Kale

Title/Year: “SURVEY ON BIG DATA MINING PLATFORMS, ALGORITHMS AND
CHALLENGES”, Sep 2014
Keywords: big data, big data mining platforms, big data mining algorithms,
big data mining challenges, data mining.
Concept/Abstract: This paper reviews various big data mining platforms and
algorithms and also discusses the associated challenges.
Author: SHERIN A, Dr S UMA, SARANYA K, SARANYA VANI M
Architecture:
Fig.: Big data Memory evolution
Data Mining Algorithm
• Decision tree induction classification algorithms
• Evolutionary based classification algorithms
• Partitioning based clustering algorithms
• Hierarchical based clustering algorithms
• Model based clustering algorithms
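As a minimal sketch of one family from the list above, partitioning-based clustering, the following Python code runs a plain k-means loop on a few 2-D points. The slides do not name a specific algorithm; k-means, the function name `kmeans`, the fixed starting centroids, and the sample points are all assumptions chosen for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment

# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

The starting centroids are passed in explicitly so the run is deterministic; practical implementations usually pick them at random or with a seeding heuristic.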
Existing System:
Big Data applications are on the rise, with data collection that has grown
tremendously and is beyond the ability of commonly used software tools to
capture, manage, and process within a “tolerable elapsed time.”
The most fundamental challenge for Big Data applications is to explore the large
volumes of data and extract useful information or knowledge for future actions.
In many situations, the knowledge extraction process has to be very efficient and
close to real time because storing all observed data is nearly infeasible.
The unprecedented data volumes require an effective data analysis and prediction
platform to achieve fast response and real-time classification for such Big Data.
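When storing every observation is infeasible, one common approach is to keep only a bounded summary of the stream. The sketch below uses reservoir sampling, which the slides do not mention and which is swapped in here purely as an illustration; `reservoir_sample` and the simulated stream are hypothetical.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            sample.append(item)       # fill the reservoir with the first k items
        else:
            j = rng.randrange(n)      # uniform integer in [0, n)
            if j < k:
                sample[j] = item      # item n replaces a slot with probability k/n
    return sample

# Simulated stream: values are produced one at a time and never stored in full.
stream = (x * x for x in range(100_000))
sample = reservoir_sample(stream, k=5, rng=random.Random(0))
```

Memory use stays at `k` items no matter how long the stream runs, and each item is inspected exactly once.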
At the model level, each site mines its local data to produce local patterns.
By sharing these local patterns with the other sites, a single global pattern
can be produced.
At the knowledge level, model correlation analysis investigates the relevance
between models generated from various data sources to determine how closely
the data sources are correlated with each other, and how to form accurate
decisions based on models built from autonomous sources.
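The model-level step described above can be sketched as follows, assuming that a "local pattern" is simply a per-item occurrence count and that the global pattern is the set of items whose combined support reaches a threshold. These simplifications, along with the names `mine_local` and `merge_patterns` and the sample transactions, are hypothetical and not taken from the slides.

```python
from collections import Counter

def mine_local(transactions):
    """Local mining: count in how many local transactions each item occurs."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))  # count an item at most once per transaction
    return counts, len(transactions)

def merge_patterns(local_results, min_support):
    """Global synthesis: add up local counts and keep globally frequent items."""
    total_counts, total_n = Counter(), 0
    for counts, n in local_results:
        total_counts += counts
        total_n += n
    return {item: c / total_n for item, c in total_counts.items()
            if c / total_n >= min_support}

# Hypothetical transaction data held at two separate sites.
site_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site_b = [["milk", "eggs"], ["milk", "bread"]]
global_pattern = merge_patterns([mine_local(site_a), mine_local(site_b)], min_support=0.6)
```

Only the counts cross site boundaries here; the raw transactions stay where they were collected.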
Big Data
Big Data is a comprehensive term for any collection of data sets so large and multifarious
that it becomes difficult to process them using conventional data processing applications.
There are two types of Big Data: structured and unstructured.
Structured data
Structured data are numbers and words that can be easily categorized and analyzed.
These data are generated by things like network sensors embedded in electronic
devices, smart phones, and global positioning system (GPS) devices. Structured data
also include things like sales figures, account balances, and transaction data.
Unstructured data
Unstructured data include more complex information, such as customer reviews
from commercial websites, photos and other multimedia, and comments on social
networking sites. These data cannot easily be separated into categories or
analyzed numerically.
Big Data Characteristics (HACE Theorem)
Figure: The blind men and the giant elephant: the limited view
of each blind man leads to a biased conclusion.
The HACE theorem suggests that the key characteristics of
Big Data are:
A. Huge, with heterogeneous and diverse data sources
B. Autonomous sources with distributed and decentralized control
C. Complex and evolving associations
Applications of Data Mining
Marketing
• Analysis of consumer behaviour
• Advertising campaigns
• Targeted mailings
• Segmentation of customers, stores, or products
Finance
• Creditworthiness of clients
• Performance analysis of financial investments
• Fraud detection
Manufacturing
• Optimization of resources
• Optimization of manufacturing processes
• Product design based on customer requirements
Health Care
• Discovering patterns in X-ray images
• Analyzing side effects of drugs
• Effectiveness of treatments
Big Data Mining Algorithm
Big Data applications gather information from many sources.
• To mine the data, all distributed data would have to be gathered at a
centralized site, but this is prohibitive because of high data transmission
costs and privacy concerns.
• Mining and correlation are therefore carried out at several levels, so that
patterns can be discovered from the combined variety of sources.
• Global data mining is done through a two-step process:
• Model level
• Knowledge level
• At the data level, each local site uses its local data to calculate data
statistics and shares this information to obtain a view of the global data
distribution.
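The data-level exchange can be sketched as follows, assuming each site shares only a three-number summary (count, mean, and sum of squared deviations) and a coordinator combines the summaries with the standard pairwise merge formula for mean and variance. The function names `local_stats` and `merge_stats` and the sample values are hypothetical.

```python
def local_stats(values):
    """Summary a site can share: count, mean, and sum of squared deviations (M2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return n, mean, m2

def merge_stats(a, b):
    """Combine two summaries without access to the raw values."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # The cross term accounts for the gap between the two local means.
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2

# Hypothetical measurements held at two sites.
site_1 = [2.0, 4.0, 6.0]
site_2 = [8.0, 10.0]
n, mean, m2 = merge_stats(local_stats(site_1), local_stats(site_2))
variance = m2 / n  # population variance of the combined data
```

Because `merge_stats` takes and returns the same kind of summary, it can be applied repeatedly to fold in any number of sites.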
Data Mining Challenges With Big Data
Fig.: A conceptual view of the Big Data processing framework
DISADVANTAGES OF EXISTING SYSTEM
To explore Big Data, we have analysed several challenges at the
data, model, and system levels.
The challenges at Tier I focus on data accessing and arithmetic
computing procedures. Because Big Data are often stored at
different locations and data volumes may continuously grow, an
effective computing platform will have to take distributed large-scale
data storage into consideration for computing.
PROPOSED SYSTEM
We propose a HACE theorem to model Big Data characteristics. The
characteristics described by HACE make it an extreme challenge to
discover useful knowledge from Big Data.
ADVANTAGES OF PROPOSED SYSTEM
Provides the most relevant and most accurate social sensing feedback to
better understand our society in real time.
Characteristics of Big Data
Fig. Five Vs of BIG DATA
Volume: the quantity of data
Variety: the category (type) of the data
Velocity: the speed at which data is generated or processed
Variability: inconsistency in the data
Complexity: the difficulty of managing the data
BIG Data Mining Tools
Hadoop
Apache S4
Storm
Apache Mahout
MOA
Fig.: Big Data processing
Conclusion:
With the growing amount of data in fields such as genomics, meteorology,
biology, and environmental research, it becomes difficult to handle the data,
to find associations and patterns, and to analyze the large data sets.
As organizations collect more data at this scale, formalizing the process of
Big Data analysis will become paramount. The paper describes different
algorithms used to handle such large data sets and gives an overview of the
architecture and algorithms used for them.
References
• McKinsey Global Institute, Big Data: The next frontier for innovation,
competition, and productivity, May 2011
• Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, 2013, Data Mining with
Big Data
• Rezwan Ahmed, George Karypis, 2012, Algorithms for mining the evolution of
conserved relational states in dynamic networks
• IEEE, Data Mining with Big Data, January 2014
• Oracle, June 2013, Unstructured Data Management with Oracle Database 12c