Handling Big Data Using a Data-Aware HDFS and Evolutionary Clustering Technique


The document proposes a data-aware module for the Hadoop ecosystem and a distributed encoding technique for genetic algorithms. This allows Hadoop to manage data distribution and placement based on cluster analysis of the data itself. The framework can handle different data types and optimize query time and resource usage, as shown through experiments on multiple datasets. It aims to address inefficiencies in existing big data solutions for today's diverse data landscapes and reduce the environmental impact of large-scale computing.

ABSTRACT:
The increased use of cyber-enabled systems and the Internet of Things (IoT) has
produced massive amounts of data with different structures. Most big data
solutions are built on top of the Hadoop ecosystem or use its distributed file
system (HDFS). However, studies have shown that such systems are inefficient
when dealing with today's data. Some research has overcome these problems for
specific types of graph data, but today's data span more than one type.
According to scholars, these inefficiencies lead to large-scale problems,
including more space required in data centers and wasted resources (such as
power), which in turn lead to environmental problems (such as higher carbon
emissions). We propose a data-aware module for the Hadoop ecosystem, along with
a distributed encoding technique for genetic algorithms. Our framework allows
Hadoop to manage the distribution and placement of data based on cluster
analysis of the data itself. It can handle a broad range of data types while
optimizing query time and resource usage. We performed our experiments on
multiple datasets generated via LUBM (the Lehigh University Benchmark).
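The abstract does not describe the encoding or the placement policy in detail,
so the following is only a minimal sketch of the general idea under stated
assumptions, not the authors' method. It assumes records are linked by
"related" pairs (for example, graph edges), that a chromosome is a plain
integer array assigning each record to a cluster, and that each cluster is
stored on one node. The class name, the fitness function, the capacity penalty,
and all parameters are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch, not the paper's algorithm: evolve a record-to-cluster map, then place clusters on nodes. */
final class DataAwarePlacementSketch {

    /** Counts related pairs that share a cluster; each record over capacity costs edges.length. */
    static int fitness(int[] chromosome, int[][] edges, int clusters, int capacity) {
        int[] size = new int[clusters];
        for (int c : chromosome) size[c]++;
        int score = 0;
        for (int[] e : edges) {
            if (chromosome[e[0]] == chromosome[e[1]]) score++;
        }
        for (int s : size) score -= Math.max(0, s - capacity) * edges.length;
        return score;
    }

    /** Binary tournament: returns the fitter of two randomly chosen individuals. */
    static int[] pick(int[][] pop, int[][] edges, int clusters, int capacity, Random rnd) {
        int[] a = pop[rnd.nextInt(pop.length)];
        int[] b = pop[rnd.nextInt(pop.length)];
        return fitness(a, edges, clusters, capacity) >= fitness(b, edges, clusters, capacity) ? a : b;
    }

    /** Simple generational GA with elitism, uniform crossover, and per-gene mutation. */
    static int[] evolve(int records, int[][] edges, int clusters, int capacity,
                        int generations, long seed) {
        Random rnd = new Random(seed);
        int[][] pop = new int[30][records];
        for (int[] c : pop) {
            for (int i = 0; i < records; i++) c[i] = rnd.nextInt(clusters);
        }
        int[] best = pop[0].clone();
        for (int g = 0; g <= generations; g++) {
            for (int[] c : pop) {
                if (fitness(c, edges, clusters, capacity) > fitness(best, edges, clusters, capacity)) {
                    best = c.clone();
                }
            }
            if (g == generations) break;
            int[][] next = new int[pop.length][];
            next[0] = best.clone(); // elitism: the best assignment always survives
            for (int k = 1; k < pop.length; k++) {
                int[] a = pick(pop, edges, clusters, capacity, rnd);
                int[] b = pick(pop, edges, clusters, capacity, rnd);
                int[] child = new int[records];
                for (int i = 0; i < records; i++) {
                    child[i] = rnd.nextBoolean() ? a[i] : b[i];                    // uniform crossover
                    if (rnd.nextDouble() < 0.05) child[i] = rnd.nextInt(clusters); // mutation
                }
                next[k] = child;
            }
            pop = next;
        }
        return best;
    }

    /** Maps each record to a node so that records in one cluster share a node. */
    static int[] placeOnNodes(int[] chromosome, int nodes) {
        int[] node = new int[chromosome.length];
        for (int i = 0; i < chromosome.length; i++) node[i] = chromosome[i] % nodes;
        return node;
    }

    public static void main(String[] args) {
        // Toy input: two triangles of related records (0-1-2 and 3-4-5).
        int[][] edges = {{0, 1}, {1, 2}, {0, 2}, {3, 4}, {4, 5}, {3, 5}};
        int[] best = evolve(6, edges, 2, 3, 50, 42L);
        System.out.println("clusters: " + Arrays.toString(best));
        System.out.println("fitness:  " + fitness(best, edges, 2, 3));
        System.out.println("nodes:    " + Arrays.toString(placeOnNodes(best, 2)));
    }
}
```

The capacity penalty keeps the search from putting every record in a single
cluster, which would otherwise maximize co-location trivially. A real
deployment would also have to account for HDFS replication and block sizes,
which this sketch ignores.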
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
- System: i3 Processor
- Hard Disk: 500 GB
- Monitor: 15" LED
- Input Devices: Keyboard, Mouse
- RAM: 4 GB
SOFTWARE REQUIREMENTS:
- Operating System: Windows 7 / Ubuntu
- Coding Language: Java 1.7, Hadoop 0.8.1
- IDE: Eclipse
- Database: MySQL
REFERENCE:
Mustafa Hajeer, Member, IEEE, and Dipankar Dasgupta, Fellow, IEEE, “Handling
Big Data Using a Data-Aware HDFS and Evolutionary Clustering Technique”,
IEEE Transactions on Big Data, 2019.
