International Journal of Computer Engineering and Technology (IJCET)
ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online)
Volume 5, Issue 1, January (2014), pp. 162-167
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI), www.jifactor.com

AN IMPROVING THE EFFECTIVENESS OF TEXT TO MATRIX
GENERATOR BY USING GENETIC ALGORITHM
Manish Sharma1, Mr. Rahul Patel2

1 PG Scholar, CSE, AITR, Indore
2 Assistant Professor, CSE, AITR, Indore

ABSTRACT
Retrieving relevant documents from a huge document collection is a traditional task. Because
genetic algorithms (GA) are efficient search and optimization techniques, they can be used to search
this large document space. In this field, the aim of the GA is to help an IR system find, in a huge
text collection, a good reply to a query expressed by the user. In this paper, we investigate how to
improve the efficiency of the Text to Matrix Generator (TMG), and we propose a genetic approach
within TMG to increase that efficiency. We present experimental results in terms of the improvement
in the average number of terms.
Keywords: Information Retrieval (IR), Genetic Algorithm (GA), Text to Matrix Generator (TMG),
Vector Space Model (VSM).
1. INTRODUCTION
Information retrieval is generally considered a subfield of computer science that deals with
the representation, storage, and access of information [1]. It is concerned with the organization
and retrieval of information from large database collections [2]. Information Retrieval (IR) is the
process by which a collection of data is represented, stored, and searched for the purpose of
knowledge discovery in response to a user request (query) [3]. This process involves several stages,
beginning with representing the data and ending with returning relevant information to the user.
The intermediate stages include filtering, searching, matching and ranking operations. The main goal
of an information retrieval system (IRS) is “finding relevant information or a document that
satisfies user information needs”. To achieve this goal, IRSs usually implement the following
processes. In the indexing process, documents are represented in a summarized content form. In the
filtering process, all stop words and common words are removed. Searching is the core process of an
IRS, and there are various techniques for retrieving documents that match the user's need. There
are two basic measures for assessing the quality of information retrieval [2].
Precision: the percentage of retrieved documents that are in fact relevant to the query.
Recall: the percentage of documents relevant to the query that were in fact retrieved.
In this paper, we present an approach to improve the performance of the Text to Matrix Generator
(TMG). TMG is a MATLAB toolbox that can be used for various Data Mining (DM) and Information
Retrieval (IR) tasks.
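The precision and recall measures defined in this section can be illustrated with a short Python sketch. The function name and the document identifiers are hypothetical and chosen only for illustration; the sketch returns fractions in [0, 1] rather than percentages.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, as fractions in [0, 1]."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # retrieved documents that are also relevant
    # Guard against empty sets so the sketch never divides by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document ids, made up for illustration only.
    retrieved_ids = ["d1", "d2", "d3", "d4"]
    relevant_ids = ["d2", "d4", "d7"]
    p, r = precision_recall(retrieved_ids, relevant_ids)
    print(f"precision = {p:.2f}, recall = {r:.2f}")
```

In this toy case two of the four retrieved documents are relevant, and two of the three relevant documents were retrieved.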
The structure of this paper is as follows. A brief literature review is presented in Section 2,
followed by the vector space model in Section 3, the genetic algorithm in Section 4, the proposed
method in Section 5, the experiment in Section 6, and the results in Section 7. Finally, Section 8
covers the conclusions.
2. RELATED WORK
Bangorn Klabbankoh and Ouen Pinngern [4] analyzed the vector space model to boost
information retrieval efficiency. In the vector space model, IR is based on the similarity
measurement between the query and the documents.
Md. Abu Kausar and Md. Nasar [14] give a detailed study of information retrieval systems
using genetic algorithms.
Bangorn Klabbankoh and Ouen Pinngern [15] applied a genetic algorithm to information
retrieval.
María J. Martín-Bautista, María-Amparo Vila and Henrik Legind Larsen [13] present a
fuzzy genetic algorithm approach to an adaptive information retrieval agent.
Wafa. Maitah, Mamoun. Al-Rababaa and Ghasan. Kannan [11] address improving the
effectiveness of information retrieval systems using an adaptive genetic algorithm.
Vaclav Snasel, Ajith Abraham et al. [5] optimize information retrieval using evolutionary
algorithms and a fuzzy inference system.
Mohammad Othman Nassar et al. [6] investigate genetic algorithms to optimize the user
query in the vector space model.
S. Siva Sathya and Philomina Simon [7] review the applicability of genetic algorithms to
web search.
Priya I. Borkar and Leena H. Patil [9] address web information retrieval using genetic
algorithm-particle swarm optimization.
S. Siva Sathya and Philomina Simon [8] describe a document retrieval system with
combination terms using a genetic algorithm.
Mohammad Othman Nassar, Feras Al Mashagba, and Eman Al Mashagba [12] improve the
user query for the Boolean model using genetic algorithms.
Praveen Pathak, Michael Gordon and Weiguo Fan [16] address effective information retrieval
using genetic-algorithm-based adaptation of matching functions.
J. Usharani and Dr K Iyakutti [17] present a genetic algorithm based on cosine similarity
for relevant document retrieval.
3. VECTOR SPACE MODEL
The vector space model can best be characterized by its attempt to rank documents by the
similarity between the query and each document [10]. In the Vector Space Model (VSM), documents
and queries are represented as vectors, and the angle between two vectors is measured using the
cosine similarity function. In its standard form, the cosine similarity function can be defined as:

$$\mathrm{sim}(d_j, q) = \frac{\sum_{i=1}^{t} w_{i,j}\, w_{i,q}}{\sqrt{\sum_{i=1}^{t} w_{i,j}^{2}}\;\sqrt{\sum_{i=1}^{t} w_{i,q}^{2}}}$$

where documents and queries are represented as vectors: $d_j = (w_{1,j}, \dots, w_{t,j})$ is the
vector of document $j$, $q = (w_{1,q}, \dots, w_{t,q})$ is the query vector, and $t$ is the number
of index terms.

The vector space model introduced a term weighting scheme known as tf-idf weighting.
These weights have a term frequency (tf) factor, measuring the frequency of occurrence of the terms
in the document or query texts, and an inverse document frequency (idf) factor, measuring the inverse
of the number of documents that contain a query or document term [4].
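As a minimal sketch of how tf-idf weighting and cosine similarity fit together, the following Python example ranks a toy collection against a query. The weighting variant used here (raw term frequency multiplied by log(N/df)) is an assumption made for illustration; it is one common choice and not necessarily the normalization used by TMG. All function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def idf_table(docs):
    """Map each term to log(N / df), where df counts documents containing it."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tf_idf(tokens, idf):
    """Weight a token list as a sparse vector {term: tf * idf}."""
    tf = Counter(tokens)
    # Terms unknown to the collection carry no weight in this sketch.
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy tokenized documents, made up for illustration only.
    docs = [
        ["genetic", "algorithm", "search"],
        ["vector", "space", "model"],
        ["genetic", "search", "query", "query"],
    ]
    query = ["genetic", "query"]
    idf = idf_table(docs)
    q_vec = tf_idf(query, idf)
    for i, doc in enumerate(docs):
        print(i, round(cosine_similarity(tf_idf(doc, idf), q_vec), 3))
```

Documents that share no term with the query score zero, while documents sharing rare query terms score highest, which is the ranking behaviour the model relies on.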
4. GENETIC ALGORITHM
Genetic Algorithm (GA) is a global optimization algorithm derived from evolution and
natural selection. Although a genetic algorithm cannot always provide the optimal solution, it has
its own advantages and is a powerful tool for solving complex problems.
A genetic algorithm is a powerful search mechanism and is suitable for information
retrieval for the following reasons [18].
The document search space is a high-dimensional space. GAs are powerful search mechanisms
known for their robustness and quick search capabilities, so they are suitable for information
retrieval. In comparison with the classical information retrieval models, a GA manipulates a
population of queries rather than a single query. Each query may retrieve a subset of relevant
documents, and these subsets can be merged. The traditional methods of query expansion manipulate
each term independently of the others, whereas a GA helps maintain useful information links
representing a set of terms indexing the relevant documents. The traditional methods of relevance
feedback are not efficient when no relevant documents are retrieved with the initial query.
Genetic algorithm operations can be used to generate new and better generations. The genetic
algorithm operations include:
4.1 Reproduction: the selection of the fittest individuals based on the fitness function.
4.2 Crossover: the exchange of genes between two individual chromosomes that are reproducing.
In one-point crossover, a chunk of connected genes is swapped between two chromosomes.
4.3 Mutation: the process of randomly altering the genes in a particular chromosome. There are
two types of mutation:
4.3.1 Point mutation: a single gene is changed.
4.3.2 Chromosomal mutation: some number of genes are changed completely.
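The crossover and mutation operators listed above can be sketched in Python on binary chromosomes. The binary encoding, the function names, and the reading of chromosomal mutation as flipping several distinct genes are assumptions made for illustration; the paper does not specify an encoding.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    cut = rng.randrange(1, len(parent_a))  # cut lies strictly inside the chromosome
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def point_mutation(chromosome, rng):
    """Return a copy with one randomly chosen gene flipped."""
    mutated = list(chromosome)
    i = rng.randrange(len(mutated))
    mutated[i] = 1 - mutated[i]
    return mutated


def chromosomal_mutation(chromosome, n_genes, rng):
    """Return a copy with n_genes distinct, randomly chosen genes flipped."""
    mutated = list(chromosome)
    for i in rng.sample(range(len(mutated)), n_genes):
        mutated[i] = 1 - mutated[i]
    return mutated


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs behave the same
    a, b = [0] * 6, [1] * 6
    print(one_point_crossover(a, b, rng))
    print(point_mutation(a, rng))
    print(chromosomal_mutation(a, 3, rng))
```

Each operator returns new lists and leaves its inputs untouched, which keeps the parents available for further selection.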

As shown in Figure 1, a simple GA works as follows:
1. Start with a randomly generated population.
2. Evaluate the fitness of each individual in the population.
3. Select individuals to reproduce based on their fitness.
4. Apply crossover.
5. Apply mutation.
6. Replace the population by the new generation of individuals.
7. Go to step 2.

Fig 1: Flowchart of typical Genetic algorithm [19]
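The seven steps of the simple GA can be sketched as a short Python loop. This is an illustrative sketch only: the toy fitness function (the number of genes set to 1), the binary encoding, and all parameter values are assumptions and are not the objective function or settings used with TMG.

```python
import random


def fitness(chromosome):
    """Toy fitness (an assumption): the number of genes set to 1."""
    return sum(chromosome)


def run_ga(n_genes=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Run a simple GA and return the fittest chromosome seen at any point."""
    rng = random.Random(seed)
    # Step 1: start with a randomly generated population.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Step 2: evaluate the fitness of each individual.
        scores = [fitness(individual) for individual in population]
        # Step 3: select parents with probability proportional to fitness + 1
        # (the +1 keeps every weight positive even when a score is zero).
        weights = [score + 1 for score in scores]
        next_generation = []
        while len(next_generation) < pop_size:
            mother, father = rng.choices(population, weights=weights, k=2)
            # Step 4: apply one-point crossover.
            cut = rng.randrange(1, n_genes)
            child = mother[:cut] + father[cut:]
            # Step 5: apply mutation, flipping each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_generation.append(child)
        # Step 6: replace the population by the new generation.
        population = next_generation
        # Remember the best individual seen so far (a record, not elitism).
        best = max(population + [best], key=fitness)
        # Step 7: the loop goes back to step 2.
    return best


if __name__ == "__main__":
    winner = run_ga()
    print(winner, fitness(winner))
```

A fixed number of generations is used as the stopping rule here; the list above loops indefinitely, so any practical implementation has to add a termination condition of this kind.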

5. PROPOSED METHOD
5.1 Research Objective
The aim of the proposed work is to retrieve relevant documents by using the best
combination of terms from the term list, given a set of document collections. Terms are extracted
from the document collections to generate term combinations; once the best combination of terms
has been obtained, it is applied to the information retrieval system to obtain more relevant
documents. The genetic algorithm increases the average number of terms in the Text to Matrix
Generator by optimizing the objective function.
5.2 Proposed approach
The keywords extracted from the document collections are stored in the database, and a
frequency measure is associated with each keyword. We use a genetic algorithm to obtain the best
combination of keywords by optimizing the fitness function, in order to enhance the effectiveness
of the Text to Matrix Generator. The advantage of the proposed approach is that it improves the
effectiveness of TMG and retrieves more relevant documents when a query is given. In this paper,
we propose using a genetic algorithm in the Text to Matrix Generator toolbox to improve the
performance of the system.
5.3 Methodology
We perform the GA operations in the Text to Matrix Generator in MATLAB. To optimize the
objective function, we use the optimization toolbox of MATLAB to obtain the results.
6. EXPERIMENT
In this section, we discuss how the experiment was conducted and the results obtained. In
our experiment, we supplied the objective function of TMG to the optimization tool. A series of
computer experiments was conducted to evaluate the proposed GA-enabled Text to Matrix Generator
(TMG). The experiments were executed using data taken from the TMG collection. The collection was
indexed using normalized term frequency and normalized inverse document frequency. The indexed
collection contained 12 documents and 549 terms.
7. RESULTS
Adding new keywords and calculating the improvement:

Table 1.1: Improvement in average number of terms before and after GA

S.No   Avg. no. of terms before GA   Avg. no. of terms after GA   Improvement
1.     164.417                       173.0010986328125            8.58409
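The improvement reported in Table 1.1 is the difference between the two averages. As a quick check of that arithmetic, the following worked calculation also gives the relative gain; the relative figure is derived here for illustration and is not stated in the original results.

```latex
\begin{align*}
\Delta &= 173.0011 - 164.417 \approx 8.584 \\
\frac{\Delta}{164.417} &\approx 0.0522 \quad (\text{about } 5.2\%)
\end{align*}
```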

8. CONCLUSION
In this paper, we have investigated how to improve the efficiency of the Text to Matrix
Generator (TMG). We have proposed using a genetic algorithm in TMG to improve retrieval
effectiveness. The genetic algorithm is an excellent optimization tool. The algorithm uses a
fitness function, represented by an equation, which gives a more sophisticated result.
By using a genetic algorithm in the Text to Matrix Generator (TMG), the number of iterations
increases, which improves the performance.
REFERENCES
[1] Mohameth-François Sy, Sylvie Ranwez, Jacky Montmain, Armelle Regnault, Michel
Crampes, Vincent Ranwez Pezzoli, User centered and ontology based information Retrieval
system for life sciences, BMC Bioinformatics, 2012, 1471-2105.
[2] R. Sagayam, S. Srinivasan, S. Roshni, A Survey of Text Mining: Retrieval, Extraction and
Indexing Techniques, IJCER, Vol. 2 Issue. 5, Sep 2012, PP: 1443-1444.
[3] Anwar A. Alhenshiri, Web Information Retrieval and Search Engines Techniques, Al-Satil
journal, PP: 55-92.
[4] Djoerd Hiemstra, Arjen P. de Vries, Relating the new language models of information
retrieval to the traditional retrieval models, published as CTIT technical report TR-CTIT-0009, May 2000.
[5] Vaclav Snasel, Ajith Abraham, Suhail Owais, Jan Platos, and Pavel Kromer, Optimizing
Information Retrieval Using Evolutionary Algorithms and Fuzzy Inference System, pages:
1-23.
[6] Mohammad Othman Nassar, Feras Fares Al Mashagba, and Eman Fares Al Mashagba,
Investigating Genetic algorithms to optimize the user query in the vector space model, Australian
Journal of Basic and Applied Sciences, 7(2): 47-53, 2013.
[7] S. Siva Sathya and Philomina Simon, Review on Applicability of Genetic Algorithm to Web
Search, International Journal of Computer Theory and Engineering, Vol. 1, No. 4, October 2009,
1793-8201.
[8] S.Siva Sathya and Philomina Simon, A Document Retrieval System with Combination Terms
Using Genetic Algorithm, International Journal of Computer and Electrical Engineering, Vol. 2,
No. 1, February, 2010 1793-8163.
[9] Priya I. Borkar and Leena H. Patil, Web Information Retrieval Using Genetic Algorithm-Particle
Swarm Optimization, International Journal of Future Computer and Communication, Vol. 2, No.
6, December 2013.
[10] G. Salton and M.J. McGill, editors. Introduction to Modern Information Retrieval. McGraw-Hill
1983
[11] Wafa. Maitah, Mamoun. Al-Rababaa and Ghasan. Kannan, improving the effectiveness of
information retrieval system using adaptive genetic algorithm, International Journal of Computer
Science & Information Technology (IJCSIT) Vol 5, No 5, October 2013
[12] Mohammad Othman Nassar, Feras Al Mashagba, and Eman Al Mashagba, Improving the User
Query for the Boolean Model Using Genetic Algorithms, IJCSI International Journal of
Computer Science Issues, Vol. 8, Issue 5, No 1, September 2011.
[13] María J. Martín-Bautista, María-Amparo Vila and Henrik Legind Larsen, A
Fuzzy Genetic Algorithm Approach to an Adaptive Information Retrieval Agent, Journal of the
American Society for Information Science, 50(9):760–771, 1999.
[14] Md. Abu Kausar and Md. Nasar, the detailed study on Information retrieval system using genetic
algorithm, Journal of Industrial and Intelligent Information Vol. 1, No. 3, September 2013.
[15] Bangorn Klabbankoh and Ouen Pinngern, applied genetic algorithm in information retrieval.
[16] Praveen Pathak Michael Gordon Weiguo Fan, Effective Information Retrieval using Genetic
Algorithms based Matching Functions Adaptation, Proceedings of the 33rd Hawaii International
Conference on System Sciences – 2000.
[17] J. Usharani, and Dr K Iyakutti, A Genetic Algorithm based on Cosine Similarity for Relevant
Document Retrieval, International Journal of Engineering Research & Technology (IJERT) Vol.
2 Issue 2, February- 2013 ISSN: 2278-0181.
[18] M.Boughanem, C. Chrisment, L. Tamine, Multiple query evaluation based on an enhanced
genetic algorithm , Information Processing and Management 39,215–231, 2003.
[19] Priya I. Borkar, Leena H. Patil, A model of Hybrid Genetic Algorithm Particle Swarm
Optimization (HGAPSO) based query optimization for web Information Retrieval, IJRET
Volume: 2 Issue: 1 JAN 2013, ISSN: 2319 – 1163 pages 59-64.
[20] Dimitrios Zeimpekis, Efstratios Gallopoulos, Text to matrix generator.
[21] Purvi Dubey and Asst. Prof. Sourabh Dave, “Effective Web Mining Technique for Retrieval
Information on the World Wide Web”, International Journal of Computer Engineering
& Technology (IJCET), Volume 4, Issue 6, 2013, pp. 156 - 160, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.
[22] Prakasha S, Shashidhar HR and Dr. G T Raju, “A Survey on Various Architectures, Models and
Methodologies for Information Retrieval”, International Journal of Computer Engineering
& Technology (IJCET), Volume 4, Issue 1, 2013, pp. 182 - 194, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.

167

More Related Content

PDF
Ijmet 10 02_050
IAEME Publication
 
PDF
C017510717
IOSR Journals
 
PDF
Improving the effectiveness of information retrieval system using adaptive ge...
ijcsit
 
PDF
Evaluating the efficiency of rule techniques for file classification
eSAT Journals
 
PDF
Evaluating the efficiency of rule techniques for file
eSAT Publishing House
 
PDF
A model of hybrid genetic algorithm particle swarm optimization(hgapso) based...
eSAT Publishing House
 
PDF
An efficient information retrieval ontology system based indexing for context
eSAT Journals
 
PDF
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction
ijtsrd
 
Ijmet 10 02_050
IAEME Publication
 
C017510717
IOSR Journals
 
Improving the effectiveness of information retrieval system using adaptive ge...
ijcsit
 
Evaluating the efficiency of rule techniques for file classification
eSAT Journals
 
Evaluating the efficiency of rule techniques for file
eSAT Publishing House
 
A model of hybrid genetic algorithm particle swarm optimization(hgapso) based...
eSAT Publishing House
 
An efficient information retrieval ontology system based indexing for context
eSAT Journals
 
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction
ijtsrd
 

What's hot (16)

PDF
Correlation of artificial neural network classification and nfrs attribute fi...
eSAT Journals
 
PDF
50120130405011
IAEME Publication
 
PDF
A CONCEPTUAL METADATA FRAMEWORK FOR SPATIAL DATA WAREHOUSE
IJDKP
 
PDF
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
ertekg
 
PDF
Optimal approach for text summarization
IAEME Publication
 
PDF
Text documents clustering using modified multi-verse optimizer
IJECEIAES
 
PDF
11.software modules clustering an effective approach for reusability
Alexander Decker
 
PDF
Optimised Kd-Tree Approach with Dimension Reduction for Efficient Indexing an...
IJCSIS Research Publications
 
PDF
An Improved Similarity Matching based Clustering Framework for Short and Sent...
IJECEIAES
 
PDF
An investigative scheme for keyword search using inverted key tactic
eSAT Publishing House
 
PDF
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
IJNSA Journal
 
PDF
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
IJDKP
 
PDF
Applying genetic algorithms to information retrieval using vector space model
IJCSEA Journal
 
PDF
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
PDF
Development of pattern knowledge discovery framework using
IAEME Publication
 
PDF
K355662
IJERA Editor
 
Correlation of artificial neural network classification and nfrs attribute fi...
eSAT Journals
 
50120130405011
IAEME Publication
 
A CONCEPTUAL METADATA FRAMEWORK FOR SPATIAL DATA WAREHOUSE
IJDKP
 
Re-Mining Association Mining Results Through Visualization, Data Envelopment ...
ertekg
 
Optimal approach for text summarization
IAEME Publication
 
Text documents clustering using modified multi-verse optimizer
IJECEIAES
 
11.software modules clustering an effective approach for reusability
Alexander Decker
 
Optimised Kd-Tree Approach with Dimension Reduction for Efficient Indexing an...
IJCSIS Research Publications
 
An Improved Similarity Matching based Clustering Framework for Short and Sent...
IJECEIAES
 
An investigative scheme for keyword search using inverted key tactic
eSAT Publishing House
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
IJNSA Journal
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
IJDKP
 
Applying genetic algorithms to information retrieval using vector space model
IJCSEA Journal
 
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
Development of pattern knowledge discovery framework using
IAEME Publication
 
K355662
IJERA Editor
 
Ad

Viewers also liked (20)

PDF
20120140503003
IAEME Publication
 
PDF
Diploma
Anthony Cotita, DL
 
PDF
Monumental Development Logo
Chad Shugarman
 
PDF
A Market Appraisal Machine
Leon Flynn
 
PDF
door s 1
michael byrnes
 
PPTX
Presentation_NEW.PPTX
jameschloejames
 
PDF
Presentation1
Rawa sawalha| abu lawi
 
PDF
NaluTribe_Catalog
Tanya Brown
 
PPT
Celula
andresby
 
PDF
print_jobs_1433774631
Scott Hafer
 
PDF
DeansList2013Macomb
Marcus Beal
 
DOCX
La adolescencia normal
Lezti Gao Mora
 
PDF
Modern Love 1
Victoria Russell
 
PDF
UFC reference letter
Mariusz Mistewicz
 
PPT
Reciclados
weatriz
 
PPTX
Castillos
Vale Llanos
 
XLS
Enem pe 2010
Portal NE10
 
PDF
OHS Certificate
Jonathan Carolissen
 
PDF
ID_card[1]
wisak seesuk
 
20120140503003
IAEME Publication
 
Monumental Development Logo
Chad Shugarman
 
A Market Appraisal Machine
Leon Flynn
 
door s 1
michael byrnes
 
Presentation_NEW.PPTX
jameschloejames
 
Presentation1
Rawa sawalha| abu lawi
 
NaluTribe_Catalog
Tanya Brown
 
Celula
andresby
 
print_jobs_1433774631
Scott Hafer
 
DeansList2013Macomb
Marcus Beal
 
La adolescencia normal
Lezti Gao Mora
 
Modern Love 1
Victoria Russell
 
UFC reference letter
Mariusz Mistewicz
 
Reciclados
weatriz
 
Castillos
Vale Llanos
 
Enem pe 2010
Portal NE10
 
OHS Certificate
Jonathan Carolissen
 
ID_card[1]
wisak seesuk
 
Ad

Similar to 50120140501018 (20)

PDF
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
ijcsa
 
PDF
APPLYING GENETIC ALGORITHMS TO INFORMATION RETRIEVAL USING VECTOR SPACE MODEL
IJCSEA Journal
 
PDF
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
PDF
A model of hybrid genetic algorithm particle swarm optimization(hgapso) based...
eSAT Journals
 
PDF
A model of hybrid genetic algorithm particle swarm optimization(hgapso) based...
eSAT Journals
 
PDF
Performance Evaluation of Query Processing Techniques in Information Retrieval
idescitation
 
PDF
An Improved Mining Of Biomedical Data From Web Documents Using Clustering
Kelly Lipiec
 
PDF
A SEMANTIC RETRIEVAL SYSTEM FOR EXTRACTING RELATIONSHIPS FROM BIOLOGICAL CORPUS
AIRCC Publishing Corporation
 
PDF
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
ijcsit
 
PDF
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
AIRCC Publishing Corporation
 
PDF
An efficient algorithm for sequence generation in data mining
ijcisjournal
 
PDF
Applying Soft Computing Techniques in Information Retrieval
IJAEMSJORNAL
 
PDF
A genetic based research framework 3
prj_publication
 
PDF
Correlation Coefficient Based Average Textual Similarity Model for Informatio...
IOSR Journals
 
PDF
C017161925
IOSR Journals
 
PDF
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
IJNSA Journal
 
PDF
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET Journal
 
PDF
Filter Based Approach for Genomic Feature Set Selection (FBA-GFS)
IJCSEA Journal
 
PDF
Filter Based Approach for Genomic Feature Set Selection (FBA-GFS)
IJCSEA Journal
 
PDF
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET Journal
 
IMPACT OF DIFFERENT SELECTION STRATEGIES ON PERFORMANCE OF GA BASED INFORMATI...
ijcsa
 
APPLYING GENETIC ALGORITHMS TO INFORMATION RETRIEVAL USING VECTOR SPACE MODEL
IJCSEA Journal
 
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
A model of hybrid genetic algorithm particle swarm optimization(hgapso) based...
eSAT Journals
 
A model of hybrid genetic algorithm particle swarm optimization(hgapso) based...
eSAT Journals
 
Performance Evaluation of Query Processing Techniques in Information Retrieval
idescitation
 
An Improved Mining Of Biomedical Data From Web Documents Using Clustering
Kelly Lipiec
 
A SEMANTIC RETRIEVAL SYSTEM FOR EXTRACTING RELATIONSHIPS FROM BIOLOGICAL CORPUS
AIRCC Publishing Corporation
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
ijcsit
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
AIRCC Publishing Corporation
 
An efficient algorithm for sequence generation in data mining
ijcisjournal
 
Applying Soft Computing Techniques in Information Retrieval
IJAEMSJORNAL
 
A genetic based research framework 3
prj_publication
 
Correlation Coefficient Based Average Textual Similarity Model for Informatio...
IOSR Journals
 
C017161925
IOSR Journals
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
IJNSA Journal
 
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET Journal
 
Filter Based Approach for Genomic Feature Set Selection (FBA-GFS)
IJCSEA Journal
 
Filter Based Approach for Genomic Feature Set Selection (FBA-GFS)
IJCSEA Journal
 
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET Journal
 

More from IAEME Publication (20)

PDF
IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME Publication
 
PDF
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
IAEME Publication
 
PDF
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
IAEME Publication
 
PDF
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
IAEME Publication
 
PDF
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
IAEME Publication
 
PDF
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
IAEME Publication
 
PDF
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
IAEME Publication
 
PDF
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IAEME Publication
 
PDF
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
IAEME Publication
 
PDF
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
IAEME Publication
 
PDF
GANDHI ON NON-VIOLENT POLICE
IAEME Publication
 
PDF
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
IAEME Publication
 
PDF
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
IAEME Publication
 
PDF
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
IAEME Publication
 
PDF
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
IAEME Publication
 
PDF
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
IAEME Publication
 
PDF
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
IAEME Publication
 
PDF
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
IAEME Publication
 
PDF
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
IAEME Publication
 
PDF
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
IAEME Publication
 
IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME Publication
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
IAEME Publication
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
IAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
IAEME Publication
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
IAEME Publication
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
IAEME Publication
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
IAEME Publication
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IAEME Publication
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
IAEME Publication
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
IAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
IAEME Publication
 

Recently uploaded (20)

PDF
Brief History of Internet - Early Days of Internet
sutharharshit158
 
PDF
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
PDF
The Future of Mobile Is Context-Aware—Are You Ready?
iProgrammer Solutions Private Limited
 
PDF
The Evolution of KM Roles (Presented at Knowledge Summit Dublin 2025)
Enterprise Knowledge
 
PDF
Research-Fundamentals-and-Topic-Development.pdf
ayesha butalia
 
PDF
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
PDF
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
PDF
Software Development Methodologies in 2025
KodekX
 
PPTX
AI and Robotics for Human Well-being.pptx
JAYMIN SUTHAR
 
PDF
Data_Analytics_vs_Data_Science_vs_BI_by_CA_Suvidha_Chaplot.pdf
CA Suvidha Chaplot
 
PDF
The Future of Artificial Intelligence (AI)
Mukul
 
PDF
Security features in Dell, HP, and Lenovo PC systems: A research-based compar...
Principled Technologies
 
PPTX
OA presentation.pptx OA presentation.pptx
pateldhruv002338
 
PDF
REPORT: Heating appliances market in Poland 2024
SPIUG
 
PDF
A Day in the Life of Location Data - Turning Where into How.pdf
Precisely
 
PDF
Orbitly Pitch Deck|A Mission-Driven Platform for Side Project Collaboration (...
zz41354899
 
PDF
Economic Impact of Data Centres to the Malaysian Economy
flintglobalapac
 
PPTX
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
PPTX
What-is-the-World-Wide-Web -- Introduction
tonifi9488
 
PDF
Presentation about Hardware and Software in Computer
snehamodhawadiya
 
Brief History of Internet - Early Days of Internet
sutharharshit158
 
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
The Future of Mobile Is Context-Aware—Are You Ready?
iProgrammer Solutions Private Limited
 
The Evolution of KM Roles (Presented at Knowledge Summit Dublin 2025)
Enterprise Knowledge
 
Research-Fundamentals-and-Topic-Development.pdf
ayesha butalia
 
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
Software Development Methodologies in 2025
KodekX
 
AI and Robotics for Human Well-being.pptx
JAYMIN SUTHAR
 
Data_Analytics_vs_Data_Science_vs_BI_by_CA_Suvidha_Chaplot.pdf
CA Suvidha Chaplot
 
The Future of Artificial Intelligence (AI)
Mukul
 
Security features in Dell, HP, and Lenovo PC systems: A research-based compar...
Principled Technologies
 
OA presentation.pptx OA presentation.pptx
pateldhruv002338
 
REPORT: Heating appliances market in Poland 2024
SPIUG
 
A Day in the Life of Location Data - Turning Where into How.pdf
Precisely
 
Orbitly Pitch Deck|A Mission-Driven Platform for Side Project Collaboration (...
zz41354899
 
Economic Impact of Data Centres to the Malaysian Economy
flintglobalapac
 
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
What-is-the-World-Wide-Web -- Introduction
tonifi9488
 
Presentation about Hardware and Software in Computer
snehamodhawadiya
 

50120140501018

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 5, Issue 1, January (2014), pp. 162-167
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI)
www.jifactor.com

AN IMPROVING THE EFFECTIVENESS OF TEXT TO MATRIX GENERATOR BY USING GENETIC ALGORITHM

Manish Sharma1, Mr. Rahul Patel2
1 PG Scholar, CSE, AITR, Indore
2 Assistant Professor, CSE, AITR, Indore

ABSTRACT
Retrieval of relevant documents from a huge collection of documents is a traditional task. Because genetic algorithms (GA) are efficient search and optimization techniques, they can be used to search the huge document search space. In this field, the aim of the GA is to help an IR system find, in a huge text collection, a good reply to a query expressed by the user. In this paper, we investigate how to improve the efficiency of the Text to Matrix Generator (TMG), and we propose a genetic approach in TMG to increase it. We present experimental results in terms of the improvement in the average number of terms.

Keywords: Information Retrieval (IR), Genetic algorithm (GA), Text to Matrix Generator (TMG), Vector Space Model (VSM).

1. INTRODUCTION
Information retrieval is generally considered a subfield of computer science that deals with the representation, storage, and access of information [1]. It is concerned with the organization and retrieval of information from large database collections [2].
Information Retrieval (IR) is the process by which a collection of data is represented, stored, and searched for the purpose of knowledge discovery in response to a user request (query) [3]. This process involves various stages, beginning with representing data and ending with returning relevant information to the user. Intermediate stages include filtering, searching, matching and ranking operations. The main goal of an information retrieval system (IRS) is “finding relevant information or a document that satisfies user information needs”. To achieve this goal, IRSs usually implement the following processes. In the indexing process, the documents are represented in summarized content form. In the filtering process, all stop words and common words are removed. Searching is the core process of an IRS. There are
various techniques for retrieving documents that match the user's needs. There are two basic measures for assessing the quality of information retrieval [2].
Precision: the percentage of retrieved documents that are in fact relevant to the query.
Recall: the percentage of documents relevant to the query that were in fact retrieved.

In this paper, we present an approach to improve the performance of the Text to Matrix Generator (TMG). TMG is a MATLAB toolbox that can be used for various Data Mining (DM) and Information Retrieval (IR) tasks. The structure of this paper is as follows. A brief literature review is presented in Section 2, followed by the vector space model in Section 3, the genetic algorithm in Section 4, the proposed method in Section 5, the experiment in Section 6, and the results in Section 7. Finally, Section 8 covers conclusions.

2. RELATED WORK
Bangorn Klabbankoh and Ouen Pinngern [4] analyzed the vector space model to boost information retrieval efficiency. In the vector space model, IR is based on the similarity measurement between the query and the documents. Md. Abu Kausar and Md. Nasar [14] give a detailed study of information retrieval systems using genetic algorithms. Bangorn Klabbankoh and Ouen Pinngern [15] applied a genetic algorithm to information retrieval. María J. Martín-Bautista, María-Amparo Vila and Henrik Legind Larsen [13] present a fuzzy genetic algorithm approach to an adaptive information retrieval agent. Wafa Maitah, Mamoun Al-Rababaa and Ghasan Kannan [11] address improving the effectiveness of information retrieval systems using an adaptive genetic algorithm. Vaclav Snasel, Ajith Abraham et al. [5] optimize information retrieval using evolutionary algorithms and a fuzzy inference system.
Mohammad Othman Nassar et al. [6] investigate genetic algorithms to optimize the user query in the vector space model. S. Siva Sathya and Philomina Simon [7] review the applicability of genetic algorithms to web search. Priya I. Borkar and Leena H. Patil [9] address web information retrieval using genetic algorithm-particle swarm optimization. S. Siva Sathya and Philomina Simon [8] describe a document retrieval system with combination terms using a genetic algorithm. Mohammad Othman Nassar, Feras Al Mashagba, and Eman Al Mashagba [12] improve the user query for the Boolean model using genetic algorithms. Praveen Pathak, Michael Gordon and Weiguo Fan [16] address effective information retrieval using genetic-algorithm-based adaptation of matching functions. J. Usharani and Dr. K. Iyakutti [17] present a genetic algorithm based on cosine similarity for relevant document retrieval.

3. VECTOR SPACE MODEL
The vector space model can best be characterized by its attempt to rank documents by the similarity between the query and each document [10]. In the Vector Space Model (VSM), documents and the query are represented as vectors, and the angle between two vectors is measured using the cosine similarity function. The cosine similarity function can be defined as:
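In the standard formulation, which is assumed here, the similarity is the dot product of the document and query vectors divided by the product of their Euclidean norms: sim(d, q) = (d · q) / (|d| |q|). The following is a minimal Python sketch of that definition and of ranking documents by it. The function names and the small example vectors are hypothetical and are not taken from TMG; in practice the vector entries would be term weights.

```python
import math


def cosine_similarity(doc_vector, query_vector):
    """Cosine of the angle between two equal-length term-weight vectors."""
    if len(doc_vector) != len(query_vector):
        raise ValueError("vectors must have the same length")
    dot = sum(d * q for d, q in zip(doc_vector, query_vector))
    doc_norm = math.sqrt(sum(d * d for d in doc_vector))
    query_norm = math.sqrt(sum(q * q for q in query_vector))
    if doc_norm == 0.0 or query_norm == 0.0:
        # Convention chosen for this sketch: an empty vector matches nothing.
        return 0.0
    return dot / (doc_norm * query_norm)


def rank_documents(doc_vectors, query_vector):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(d, query_vector) for d in doc_vectors]
    return sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)


# Made-up weights over a three-term vocabulary (illustrative only).
docs = [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0], [1.0, 1.0, 0.0]]
query = [1.0, 0.0, 2.0]
ranking = rank_documents(docs, query)
```

The first document points in the same direction as the query and ranks first; the second shares no terms with the query and ranks last.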
Here, documents and queries are represented as vectors. The vector space model introduced a term weighting scheme known as tf-idf weighting. These weights have a term frequency (tf) factor measuring the frequency of occurrence of the terms in the document or query texts and an inverse document frequency (idf) factor measuring the inverse of the number of documents that contain a query or document term [4].

4. GENETIC ALGORITHM
A genetic algorithm (GA) is a global optimization algorithm derived from evolution and natural selection. Although a genetic algorithm cannot always provide an optimal solution, it has its own advantages and is a powerful tool for solving complex problems. It is a powerful search mechanism and is suitable for information retrieval for the following reasons [18].

The document search space is a high-dimensional space. GAs are powerful search mechanisms known for their robustness and quick search capabilities, so they are suitable for information retrieval.
In comparison with the classical information retrieval models, a GA manipulates a population of queries rather than a single query. Each query may retrieve a subset of relevant documents that can be merged.
The traditional methods of query expansion manipulate each term independently of the others. A GA helps maintain useful information links representing a set of terms indexing the relevant documents.
The traditional methods of relevance feedback are not efficient when no relevant documents are retrieved with the initial query.

Genetic algorithm operations can be used to generate new and better generations. The genetic algorithm operations include:
4.1 Reproduction: the selection of the fittest individuals based on the fitness function.
4.2 Crossover: the exchange of genes between two individual chromosomes that are reproducing. In one-point crossover, a chunk of connected genes is swapped between the two chromosomes.
4.3 Mutation: the process of randomly altering the genes in a particular chromosome. There are two types of mutation:
4.3.1 Point mutation: a single gene is changed.
4.3.2 Chromosomal mutation: some number of genes are changed completely.
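The operations listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MATLAB implementation used in this work: chromosomes are assumed to be binary lists (for example, one gene per candidate term), fitness is an arbitrary caller-supplied function, and all function names are hypothetical.

```python
import random


def select_fittest(population, fitness, k):
    """Reproduction: keep the k individuals with the highest fitness."""
    return sorted(population, key=fitness, reverse=True)[:k]


def one_point_crossover(parent_a, parent_b, point):
    """Swap the connected run of genes after `point` between two parents."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def point_mutation(chromosome, rng):
    """Point mutation: flip one randomly chosen gene."""
    mutated = list(chromosome)
    index = rng.randrange(len(mutated))
    mutated[index] = 1 - mutated[index]
    return mutated


def chromosomal_mutation(chromosome, rng, count):
    """Chromosomal mutation: flip `count` distinct randomly chosen genes."""
    mutated = list(chromosome)
    for index in rng.sample(range(len(mutated)), count):
        mutated[index] = 1 - mutated[index]
    return mutated


rng = random.Random(0)  # seeded so the sketch is repeatable
population = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]

# Assumed toy fitness: the number of genes set to 1.
parents = select_fittest(population, fitness=sum, k=2)
children = one_point_crossover(parents[0], parents[1], point=2)
mutant = point_mutation(children[0], rng)
```

Passing the crossover point and the random generator explicitly keeps each operator easy to test in isolation; a full GA would normally draw the crossover point at random as well.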
As shown in Figure 1, a simple GA works as follows:
1. Start with a randomly generated population.
2. Evaluate the fitness of each individual in the population.
3. Select individuals to reproduce based on their fitness.
4. Apply crossover.
5. Apply mutation.
6. Replace the population by the new generation of individuals.
7. Go to step 2.

Fig 1: Flowchart of typical Genetic algorithm [19]

5. PROPOSED METHOD
5.1 Research Objective
The aim of the proposed work is to retrieve the relevant documents by using the best combination of the term list, given a set of document collections. Terms are extracted from the document collections to generate combinations of terms. Once the best combination of terms is obtained, it is applied to the information retrieval system to obtain more relevant documents. The genetic algorithm increases the average number of terms in the Text to Matrix Generator by optimizing the objective function.
5.2 Proposed Approach
The keywords extracted from the document collections are stored in the database. A frequency measure is associated with each keyword. We use a genetic algorithm approach to obtain the best combination of keywords by optimizing the fitness function to enhance the
effectiveness of the Text to Matrix Generator. The advantage of the proposed approach is that it improves the effectiveness of TMG and retrieves more relevant documents when a query is given. In this paper, we propose using a genetic algorithm in the Text to Matrix Generator toolbox to improve the performance of the system.
5.3 Methodology
We perform the GA operations in the Text to Matrix Generator in MATLAB. To optimize the objective function, we use the optimization toolbox of MATLAB to obtain the results.

6. EXPERIMENT
In this section we discuss how the experiment was conducted and the results obtained. In our experiment, we put the objective function of TMG into the optimization tool. A series of computer experiments was conducted in order to evaluate the proposed GA-enabled Text to Matrix Generator (TMG). Experiments were executed using data taken from the TMG collection. The collection was indexed based on normalized term frequency and normalized inverse document frequency. The indexed collection contained 12 documents and 549 terms.

7. RESULTS
Adding new keywords and calculating the improvement:

Table 1.1: Improvement in average number of terms before and after GA
S.No   Avg. no. of terms before GA   Avg. no. of terms after GA   Improvement
1.     164.417                       173.0010986328125            8.58409

8. CONCLUSION
In this paper, we have investigated how to improve the efficiency of the Text to Matrix Generator (TMG). We have proposed a genetic algorithm in TMG to improve retrieval effectiveness. The genetic algorithm is an excellent optimization tool. The algorithm uses a fitness function, represented by an equation, which gives a more sophisticated result. By using a genetic algorithm in the Text to Matrix Generator (TMG), the number of iterations increases, which improves the performance.

REFERENCES
[1] Mohameth-François Sy, Sylvie Ranwez, Jacky Montmain, Armelle Regnault, Michel Crampes, Vincent Ranwez Pezzoli, User centered and ontology based information Retrieval system for life sciences, BMC Bioinformatics, 2012, 1471-2105.
[2] R. Sagayam, S. Srinivasan, S. Roshni, A Survey of Text Mining: Retrieval, Extraction and Indexing Techniques, IJCER, Vol. 2, Issue 5, Sep 2012, pp. 1443-1444.
[3] Anwar A. Alhenshiri, Web Information Retrieval and Search Engines Techniques, Al-Satil Journal, pp. 55-92.
[4] Djoerd Hiemstra, Arjen P. de Vries, Relating the new language models of information retrieval to the traditional retrieval models, published as CTIT technical report TR-CTIT-0009, May 2000.
[5] Vaclav Snasel, Ajith Abraham, Suhail Owais, Jan Platos, and Pavel Kromer, Optimizing Information Retrieval Using Evolutionary Algorithms and Fuzzy Inference System, pages: 1-23.
[6] Mohammad Othman Nassar, Feras Fares Al Mashagba, and Eman Fares Al Mashagba, Investigating Genetic algorithms to optimize the user query in the vector space model, Australian Journal of Basic and Applied Sciences, 7(2): 47-53, 2013.
[7] S. Siva Sathya and Philomina Simon, Review on Applicability of Genetic Algorithm to Web Search, International Journal of Computer Theory and Engineering, Vol. 1, No. 4, October 2009, 1793-8201.
[8] S. Siva Sathya and Philomina Simon, A Document Retrieval System with Combination Terms Using Genetic Algorithm, International Journal of Computer and Electrical Engineering, Vol. 2, No. 1, February 2010, 1793-8163.
[9] Priya I. Borkar and Leena H. Patil, Web Information Retrieval Using Genetic Algorithm-Particle Swarm Optimization, International Journal of Future Computer and Communication, Vol. 2, No. 6, December 2013.
[10] G. Salton and M.J. McGill, editors, Introduction to Modern Information Retrieval, McGraw-Hill, 1983.
[11] Wafa Maitah, Mamoun Al-Rababaa and Ghasan Kannan, Improving the effectiveness of information retrieval system using adaptive genetic algorithm, International Journal of Computer Science & Information Technology (IJCSIT), Vol. 5, No. 5, October 2013.
[12] Mohammad Othman Nassar, Feras Al Mashagba, and Eman Al Mashagba, Improving the User Query for the Boolean Model Using Genetic Algorithms, IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 5, No. 1, September 2011.
[13] María J. Martín-Bautista, María-Amparo Vila and Henrik Legind Larsen, A Fuzzy Genetic Algorithm Approach to an Adaptive Information Retrieval Agent, Journal of the American Society for Information Science, 50(9): 760–771, 1999.
[14] Md. Abu Kausar and Md. Nasar, The detailed study on Information retrieval system using genetic algorithm, Journal of Industrial and Intelligent Information, Vol. 1, No. 3, September 2013.
[15] Bangorn Klabbankoh and Ouen Pinngern, Applied genetic algorithm in information retrieval.
[16] Praveen Pathak, Michael Gordon, Weiguo Fan, Effective Information Retrieval using Genetic Algorithms based Matching Functions Adaptation, Proceedings of the 33rd Hawaii International Conference on System Sciences, 2000.
[17] J. Usharani and Dr. K. Iyakutti, A Genetic Algorithm based on Cosine Similarity for Relevant Document Retrieval, International Journal of Engineering Research & Technology (IJERT), Vol. 2, Issue 2, February 2013, ISSN: 2278-0181.
[18] M. Boughanem, C. Chrisment, L. Tamine, Multiple query evaluation based on an enhanced genetic algorithm, Information Processing and Management, 39, 215–231, 2003.
[19] Priya I. Borkar, Leena H. Patil, A model of Hybrid Genetic Algorithm Particle Swarm Optimization (HGAPSO) based query optimization for web Information Retrieval, IJRET, Volume 2, Issue 1, Jan 2013, ISSN: 2319 – 1163, pages 59-64.
[20] Dimitrios Zeimpekis, Efstratios Gallopoulos, Text to matrix generator.
[21] Purvi Dubey and Asst. Prof. Sourabh Dave, “Effective Web Mining Technique for Retrieval Information on the World Wide Web”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 6, 2013, pp. 156 - 160, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[22] Prakasha S, Shashidhar HR and Dr. G T Raju, “A Survey on Various Architectures, Models and Methodologies for Information Retrieval”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 1, 2013, pp. 182 - 194, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.