Database Indexing Framework (Version 1.0)
Overview

Objective
- To index database tables using Solr

Requirements
- To create search-index-relevant views (like database views) by collating data from multiple database tables
- To convert data from the database into XML that can be posted to Solr
- To enable incremental indexing
Overview: possible approaches to creating search-index-relevant views

- At the database level, by creating database views. The views are designed around the search result requirements. For example, the message module and the shopping module have different search result requirements, so one view could probably cater to one module. These views contain only the columns relevant to search. The application layer gets prepared data directly from the database and only has to post it to Solr.
- At the database level, using procedures. Procedures are created to fetch the index-relevant data.
- At the application layer. The work of collating data from the various database tables is given to the application layer. It queries the relevant tables, collects the data and posts it to Solr.
Overview: possible approaches to incremental indexing

- In real time (the push approach). The data is indexed as soon as it is entered into the database. Database listeners listen for changes in the database and queue up the new and updated records to be indexed in a JMS queue. An indexing program consumes this queue, queries the database again using the primary keys from the queue to get the data, converts it to Solr XML and posts it to Solr.
- As a batch process at regular intervals (the pull approach). Instead of indexing immediately, we fetch data from the database at a configurable regular interval. Compared with the real-time approach, which depends on listeners and a queue, the chances of failure here are small.
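The pull approach can be sketched as a query that asks a view only for rows changed since the previous run. This is a minimal sketch, not the framework's actual code: the columns `id`, `last_modified` and `is_deleted`, the table name `message_search`, and the function `fetch_changes` are all assumptions made for illustration, and SQLite stands in for the real database.

```python
import sqlite3


def fetch_changes(conn, view_name, since):
    """Return (changed_rows, deleted_ids, new_watermark) for one view.

    Assumes the view exposes `id`, `last_modified` and `is_deleted`
    columns; these names are hypothetical, not taken from the slides.
    """
    # View names cannot be bound as SQL parameters, so accept only a
    # plain identifier before interpolating it into the statement.
    if not view_name.isidentifier():
        raise ValueError(f"unsafe view name: {view_name!r}")
    cur = conn.execute(
        f"SELECT * FROM {view_name} WHERE last_modified > ? "
        "ORDER BY last_modified",
        (since,),
    )
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    changed = [r for r in rows if not r["is_deleted"]]
    deleted_ids = [r["id"] for r in rows if r["is_deleted"]]
    # The highest timestamp seen becomes the starting point of the next run.
    watermark = rows[-1]["last_modified"] if rows else since
    return changed, deleted_ids, watermark


# Demo data: a plain table plays the role of the search view.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_search "
    "(id INTEGER, title TEXT, last_modified INTEGER, is_deleted INTEGER)"
)
conn.executemany(
    "INSERT INTO message_search VALUES (?, ?, ?, ?)",
    [(1, "hello", 100, 0), (2, "old news", 50, 0), (3, "removed", 120, 1)],
)
changed, deleted_ids, watermark = fetch_changes(conn, "message_search", since=60)
```

Storing the returned watermark between runs is what makes each batch incremental: the next call passes it as `since` and sees only later changes.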
Incremental Indexing Process (the need for Database Views)

The following slides discuss an incremental indexing approach that we thought would work well for our requirements. In this approach the search-index-relevant views are created using Database Views, and the indexing is done as a Batch Process rather than in real time.

First we need to understand the need for the Database Views. When a term is searched for in the index, the result page shows some details and a summary of each result. For instant results these details need to be stored in the index itself, so we don’t have to hit the database just to display collated results on the results page.

When creating the Solr index, it therefore doesn't make much sense to index all the tables individually, because each table has its own dependencies on child and parent tables. We would either have to recreate similar dependencies in the index or design our indexes intelligently, keeping the search needs in mind. The latter involves creating appropriate joins across tables to fetch all the data relevant to a search result in one shot.

A database view can do this job of collating data from the parent and child tables in a representation that exactly matches the requirements of the search index. This makes the job of the application layer hassle free: it just picks everything from the view and indexes it as it is.
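As an illustration of such a view, the sketch below joins a parent and a child table into one search-oriented row per message. The schema (`users`, `messages`), the view name `message_search_view` and the sample data are hypothetical, and SQLite is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        sender_id INTEGER REFERENCES users(id),
        subject TEXT,
        body TEXT,
        internal_flags TEXT
    );
    INSERT INTO users VALUES (1, 'Asha');
    INSERT INTO messages VALUES (10, 1, 'Lunch?', 'Are you free at noon?', 'x');

    -- The view collates parent and child data and exposes only the
    -- columns a search result page needs (internal_flags is left out).
    CREATE VIEW message_search_view AS
    SELECT m.id AS message_id, m.subject, m.body, u.name AS sender_name
    FROM messages m
    JOIN users u ON u.id = m.sender_id;
    """
)

cur = conn.execute("SELECT * FROM message_search_view")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

The application layer never needs to know about the join: it reads the view as if it were a single flat table.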
Incremental Indexing Process (the need for Batch indexing)

Next we need to understand why the Batch Indexing process can work well for us. Most of our search requirements involve searching historic data. Only rarely will we search for data that has just been entered, and even these cases can be handled by setting the Batch Process interval to a very small value.

Real-time indexing can become expensive when a large amount of data is entered at short intervals. The batch process also gives us the flexibility of working on a copy of the database, which makes the whole indexing process an offline one.
Incremental Indexing Batch Process (the flow)

[Flow diagram. Components: Indexing Job Scheduler, Database Indexer (the controller class), Data Fetcher, Result Set to XML Converter, Index Manager, Database, SOLR. The components are explained in detail on the next slide.]

Numbered arrows in the diagram (endpoints as described in the flow slides that follow):
(1) Indexing Job Name: Indexing Job Scheduler to Database Indexer
(2) Database View Name: Database Indexer to Data Fetcher
(3) Query: Data Fetcher to Database
(4) Result Set: Database to Data Fetcher
(5) Result Set: Data Fetcher to Result Set to XML Converter
(6) Solr XML: Result Set to XML Converter to Data Fetcher
(7) Solr XML: Data Fetcher to Database Indexer
(8) Solr XML: Database Indexer to Index Manager
(9) Solr XML: Index Manager to SOLR

Indexing Job - Trigger Config file (Indexing Job Schedules)
  Trigger Time 1 - Indexing Job 1
  Trigger Time 2 - Indexing Job 2
  Trigger Time 3 - Indexing Job 3

Indexing Job – Database View Mapping file
More than one DB view might need to be indexed at the same time, so these can be grouped as an Indexing Job.
  Indexing Job 1 – Database View1, Database View2, Database View3, Database View4
  Indexing Job 2 – Database View5, Database View6

DB View Column name to Solr field mapping
  - Database View 1
      Column 1 - Solr Field 1
      Column 2 - Solr Field 2
      Column 3 - Solr Field 3
  - Database View 2
      Column 1 - Solr Field 3
      Column 2 - Solr Field 2
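The three configuration files could be modelled roughly as follows. The slides do not specify a file format, so this sketch simply uses in-memory dictionaries; the key names, the `start` and `interval_minutes` fields, and the helper functions are assumptions for illustration.

```python
# Indexing Job - Trigger Config: which trigger drives which job.
# The "start"/"interval_minutes" fields are hypothetical.
JOB_TRIGGERS = {
    "indexing_job_1": {"start": "00:00", "interval_minutes": 5},
    "indexing_job_2": {"start": "00:00", "interval_minutes": 60},
}

# Indexing Job - Database View Mapping: views indexed together.
JOB_VIEWS = {
    "indexing_job_1": ["database_view_1", "database_view_2"],
    "indexing_job_2": ["database_view_5"],
}

# DB View Column name to Solr field mapping, one table per view.
VIEW_FIELD_MAP = {
    "database_view_1": {"column_1": "solr_field_1", "column_2": "solr_field_2"},
    "database_view_2": {"column_1": "solr_field_3", "column_2": "solr_field_2"},
}


def views_for_job(job_name):
    """Return the database views that belong to one indexing job."""
    return list(JOB_VIEWS[job_name])


def unmapped_views():
    """List views that a job references but that have no field mapping."""
    return sorted(
        view
        for views in JOB_VIEWS.values()
        for view in views
        if view not in VIEW_FIELD_MAP
    )


def unscheduled_jobs():
    """List jobs that have views but no trigger."""
    return sorted(job for job in JOB_VIEWS if job not in JOB_TRIGGERS)
```

A consistency check such as `unmapped_views()` is worth running at start-up, since a view that is scheduled but has no column mapping would otherwise fail only when its job fires.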
Incremental Indexing Batch Process (the components)

- An Indexing Job is defined as the indexing of the set of Database Views that need to be indexed at the same time and at equal time intervals.
- A Trigger holds the time information: the start time, the time interval and other such time-related details. When an Indexing Job is associated with a trigger, the job runs according to the start time and time interval given in the trigger.
- The Indexing Job - Trigger Config file holds all Indexing Job Schedules. It maps triggers to indexing jobs.
- The Indexing Job – Database View Mapping file defines the Indexing Jobs. It associates Database Views with each Indexing Job. If a database view, such as the one for the messages module, needs to be picked up at a shorter time interval than the one for the shopping module, the two views will belong to different indexing jobs with different Triggers.
- The Database Indexer acts as the controller of the database indexing process. It calls the Data Fetcher to get database records in XML format and sends them to the Index Manager, which posts them to Solr.
- The Data Fetcher communicates with the database to get all the new and updated records for a given database view, along with the records that have been marked for deletion. It then feeds this data to the Result Set to XML Converter to get it converted to the XML format that Solr recognizes.
- The Result Set to XML Converter is a utility class that converts database records to XML format. If a record is new or updated, it is put in the <add> tag. If it is marked for deletion, it is put in the <delete> tag. The converter picks up the Solr field names corresponding to the DB view column names from the DB View Column name to Solr field mapping file.
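The converter's behaviour can be sketched as below, using Solr's XML update message shapes (`<add><doc><field name="...">` and `<delete><id>`). It is a sketch under assumptions: rows arrive as dictionaries, the deletion marker is a column called `is_deleted`, and the function name and field names are hypothetical. Adds and deletes are returned as separate messages to keep each one a single command.

```python
import xml.etree.ElementTree as ET


def rows_to_solr_xml(rows, field_map, id_column="id", deleted_column="is_deleted"):
    """Convert view rows to Solr XML update messages.

    `rows` is a list of dicts keyed by view column name. `field_map`
    maps view column names to Solr field names. Column names used for
    the id and the deletion marker are assumptions of this sketch.
    """
    add = ET.Element("add")
    delete = ET.Element("delete")
    for row in rows:
        if row.get(deleted_column):
            # Records marked for deletion go into the <delete> message.
            ET.SubElement(delete, "id").text = str(row[id_column])
            continue
        # New or updated records become one <doc> inside <add>.
        doc = ET.SubElement(add, "doc")
        for column, solr_field in field_map.items():
            value = row.get(column)
            if value is not None:
                ET.SubElement(doc, "field", name=solr_field).text = str(value)
    # Emit only the messages that actually contain something.
    return [
        ET.tostring(element, encoding="unicode")
        for element in (add, delete)
        if len(element)
    ]


example_rows = [
    {"id": 1, "subject": "Tom & Jerry", "is_deleted": 0},
    {"id": 3, "subject": "gone", "is_deleted": 1},
]
example_map = {"id": "id", "subject": "subject_t"}
messages = rows_to_solr_xml(example_rows, example_map)
```

Building the XML with an element tree rather than string concatenation means special characters in the data, such as `&`, are escaped automatically.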
Incremental Indexing Batch Process (the flow)

- The indexing process is triggered by the Indexing Job Scheduler. An indexing job is triggered based on the settings of the trigger with which it is associated in the Indexing Job - Trigger Config file.
- The Indexing Job Scheduler calls the Database Indexer, passing the name of the job to be done as an argument.
- The Database Indexer acts as the controller for the whole process. From the Indexing Job – Database View Mapping file it picks up the names of the Database Views to be indexed for the Indexing Job sent by the Indexing Job Scheduler.
- The Database Indexer loops over the set of Database Views and calls the Data Fetcher for each view.
- The Data Fetcher queries the database to get all the latest records from the view.
- The result set is sent to the Result Set to XML Converter, which returns the Solr XML.
- This Solr XML is sent back to the Database Indexer, which in turn sends it to the Index Manager for posting to Solr.
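The control flow described above can be sketched as a small controller that receives its collaborators as plain callables. The class and parameter names are illustrative only, and both the Data Fetcher and the Index Manager are replaced by stand-ins here so the example runs without a database or a Solr server.

```python
class DatabaseIndexer:
    """Controller sketch: job name in, Solr XML posted per view."""

    def __init__(self, job_views, fetch_solr_xml, post_to_solr):
        # job_views: mapping of job name to a list of view names.
        # fetch_solr_xml: stands in for the Data Fetcher; takes a view
        #   name and returns a list of Solr XML strings.
        # post_to_solr: stands in for the Index Manager.
        self.job_views = job_views
        self.fetch_solr_xml = fetch_solr_xml
        self.post_to_solr = post_to_solr

    def run(self, job_name):
        """Index every view of the job; return the number of posts made."""
        posted = 0
        for view_name in self.job_views[job_name]:
            for solr_xml in self.fetch_solr_xml(view_name):
                self.post_to_solr(solr_xml)
                posted += 1
        return posted


# Stand-ins for the real components, for demonstration only.
sent = []


def fake_fetcher(view_name):
    return [f"<add><doc><field name=\"source\">{view_name}</field></doc></add>"]


indexer = DatabaseIndexer(
    job_views={"indexing_job_1": ["message_view", "shopping_view"]},
    fetch_solr_xml=fake_fetcher,
    post_to_solr=sent.append,
)
post_count = indexer.run("indexing_job_1")
```

Passing the fetcher and poster in from outside keeps the controller free of database and HTTP details, which also makes it easy to exercise in isolation as shown.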
Incremental Indexing Batch Process (annotated flow diagram)

[Flow diagram. Arrow labels: (1) Indexing Job Name, (2) Database View Name, (3) View Query, (4) Result Set, (5) Result Set, (6) Solr XML, (7) Solr XML, (8) Solr XML, (9) Solr XML. Files shown: Job - Trigger Config file (Indexing Job Schedules), Indexing Job to Database Views mapping file, DB View Column name to Solr field mapping. Other elements: Database, SOLR.]

Indexing Job Scheduler
- Triggers the Database Indexer with an Indexing Job, based on the trigger times in the Job - Trigger Config file.

Database Indexer (@parameter Indexing Job)
- Picks up the list of Database Views corresponding to an Indexing Job from the Indexing Job – Database View Mapping file.
- Loops over the Database View names and calls the Data Fetcher (@parameter Database View Name) for each view to get back the corresponding result.
- Sends the Solr XML to the Index Manager.

Data Fetcher (@parameter Database View Name)
- Fires a generic "Select * from View Name" query to get the relevant data.
- Calls the Result Set to XML Converter (@parameter Result Set) for each result set to get the Solr XML for that result set.

Result Set to XML Converter (@parameter Result Set)
- Loops over the Result Set.
- Based on the Result Set metadata, gets the corresponding Solr field names from the DB View Column name to Solr field mapping.
- Creates the Solr XML file for the result set.

Index Manager
- Posts the Solr XML to SOLR.
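The generic query and the metadata lookup can be sketched together. In Python's `sqlite3` module the result set metadata is available as `cursor.description`, whose first element per column is the column name; this plays the role of the result set metadata mentioned above. The function name, the sample table and the mapping are assumptions for illustration.

```python
import sqlite3


def fetch_view_as_solr_fields(conn, view_name, field_map):
    """Run a generic SELECT * on a view and rename columns to Solr fields.

    Columns that have no entry in `field_map` are skipped.
    """
    # The view name comes from configuration and cannot be bound as a
    # parameter, so accept only a plain identifier.
    if not view_name.isidentifier():
        raise ValueError(f"unsafe view name: {view_name!r}")
    cur = conn.execute(f"SELECT * FROM {view_name}")
    # Result set metadata: one 7-tuple per column, name first.
    column_names = [d[0] for d in cur.description]
    documents = []
    for row in cur.fetchall():
        documents.append(
            {
                field_map[column]: value
                for column, value in zip(column_names, row)
                if column in field_map
            }
        )
    return documents


# Demo: a table stands in for the database view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop_view (item_id INTEGER, title TEXT, cost REAL)")
conn.execute("INSERT INTO shop_view VALUES (7, 'Lamp', 19.5)")
documents = fetch_view_as_solr_fields(
    conn, "shop_view", {"item_id": "id", "title": "title_t"}
)
```

Because the column names are read from the metadata at run time, the same fetcher works for any view; only the mapping changes per view.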

