Lucene - Indexing Classes
The indexing process is one of the core functionalities provided by Lucene. The following diagram illustrates the indexing process and the classes it uses. IndexWriter is the most important component of the indexing process.
We add Document(s) containing Field(s) to IndexWriter, which analyzes the Document(s) using the Analyzer and then creates, opens, or updates indexes as required and stores them in a Directory. IndexWriter is used only to create or update indexes; it is not used to read them.
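The flow described above can be sketched with a toy in-memory model. This is only an illustration of the roles, not Lucene's API: `ToyDirectory`, `ToyIndexWriter`, and `analyze` are hypothetical stand-ins for Directory, IndexWriter, and Analyzer, and the tokenization rule (lower-case, split on non-alphanumeric characters) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy stand-ins for the roles described above; these are NOT Lucene's classes.
final class ToyIndexing {

    /** Analyzer role: lower-case the text and split it on non-alphanumeric characters. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Directory role: holds the stored index, here "field:term" -> sorted document ids. */
    static final class ToyDirectory {
        final Map<String, SortedSet<Integer>> postings = new TreeMap<>();
        int docCount = 0;
    }

    /** IndexWriter role: analyzes documents and updates the directory; it never reads for search. */
    static final class ToyIndexWriter {
        private final ToyDirectory directory;

        ToyIndexWriter(ToyDirectory directory) {
            this.directory = directory;
        }

        /** A document is modelled as field name -> field value. Returns the assigned document id. */
        int addDocument(Map<String, String> document) {
            int docId = directory.docCount++;
            for (Map.Entry<String, String> field : document.entrySet()) {
                for (String token : analyze(field.getValue())) {
                    String term = field.getKey() + ":" + token;
                    directory.postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
                }
            }
            return docId;
        }
    }

    public static void main(String[] args) {
        ToyDirectory directory = new ToyDirectory();
        ToyIndexWriter writer = new ToyIndexWriter(directory);

        Map<String, String> document = new LinkedHashMap<>();
        document.put("filename", "notes.txt");
        document.put("contents", "Lucene indexes text. Lucene is fast.");
        writer.addDocument(document);

        // Print every indexed term with the ids of the documents that contain it.
        System.out.println(directory.postings);
    }
}
```

The writer only ever appends to the postings map, which mirrors the point that IndexWriter creates and updates indexes but is not the component used to read them.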
Indexing Classes
The following is a list of classes commonly used during the indexing process.
S.No. | Class & Description |
---|---|
1 | IndexWriter: This class acts as the core component that creates and updates indexes during the indexing process. |
2 | Directory: This class represents the storage location of the indexes. |
3 | Analyzer: This class is responsible for analyzing a document and extracting the tokens/words from the text that is to be indexed. Without this analysis, IndexWriter cannot create an index. |
4 | Document: This class represents a virtual document made up of Fields, where a Field is an object that can contain the physical document's contents, its metadata, and so on. The Analyzer can work only with a Document. |
5 | Field: This is the lowest unit, or the starting point, of the indexing process. It represents a key-value pair in which the key identifies the value to be indexed. For example, a field used to represent the contents of a document has the key "contents", and its value may contain part or all of the text or numeric content of the document. Lucene can index only text or numeric content. |
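The relationship between a document and its fields can be sketched as follows. This is a minimal, hypothetical model and not Lucene's API: `ToyField` and `ToyDocument` are illustrative names, and the check that a value is a `String` or a `Number` is an assumption standing in for the "text or numeric content" rule in the table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the Document/Field relationship; NOT Lucene's classes.
final class ToyDocumentModel {

    /** A field pairs a key (its name) with the value to be indexed. */
    static final class ToyField {
        final String name;
        final Object value;

        ToyField(String name, Object value) {
            // Mirrors the rule in the table: only text or numeric content is accepted.
            if (!(value instanceof String) && !(value instanceof Number)) {
                throw new IllegalArgumentException("Field value must be text or numeric: " + name);
            }
            this.name = name;
            this.value = value;
        }
    }

    /** A document is an ordered collection of fields. */
    static final class ToyDocument {
        private final List<ToyField> fields = new ArrayList<>();

        ToyDocument add(ToyField field) {
            fields.add(field);
            return this;
        }

        /** Returns the value of the first field with the given name, or null if absent. */
        Object get(String name) {
            for (ToyField field : fields) {
                if (field.name.equals(name)) {
                    return field.value;
                }
            }
            return null;
        }

        List<ToyField> fields() {
            return Collections.unmodifiableList(fields);
        }
    }

    public static void main(String[] args) {
        ToyDocument document = new ToyDocument()
            .add(new ToyField("contents", "full text of the file"))
            .add(new ToyField("sizeInBytes", 2048L));

        System.out.println(document.get("contents"));
        System.out.println(document.get("sizeInBytes"));
    }
}
```

Rejecting unsupported values in the constructor keeps invalid fields from ever reaching the document, so later stages can assume every value is text or a number.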