OpenNLP - Overview
  • Date: 2024-09-17


NLP (Natural Language Processing) is a set of tools used to derive meaningful and useful information from natural language sources such as web pages and text documents.

What is OpenNLP?

Apache OpenNLP is an open-source Java library for processing natural language text. You can use this library to build an efficient text processing service.

OpenNLP provides services such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and co-reference resolution.
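
To make the first two services in this list concrete, here is a minimal sketch of what sentence segmentation and tokenization produce. It uses only the JDK and two naive regular-expression rules, not OpenNLP's own classes; the class name NaiveTextSplitter and its methods are hypothetical names chosen for this illustration. Treat it as a picture of the input and output shapes, not as the way OpenNLP does the work.

```java
import java.util.ArrayList;
import java.util.List;

// Naive, rule-based stand-in for illustration only; this is NOT OpenNLP's API.
class NaiveTextSplitter {

    // Splits the text after '.', '!' or '?' when whitespace follows.
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                result.add(s);
            }
        }
        return result;
    }

    // Splits a sentence on whitespace and peels one trailing
    // punctuation mark off each chunk into its own token.
    static List<String> tokens(String sentence) {
        List<String> result = new ArrayList<>();
        for (String chunk : sentence.trim().split("\\s+")) {
            if (chunk.isEmpty()) {
                continue;
            }
            char last = chunk.charAt(chunk.length() - 1);
            if (chunk.length() > 1 && !Character.isLetterOrDigit(last)) {
                result.add(chunk.substring(0, chunk.length() - 1));
                result.add(String.valueOf(last));
            } else {
                result.add(chunk);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "OpenNLP is a Java library. It processes natural language text.";
        for (String sentence : sentences(text)) {
            System.out.println(sentence + " -> " + tokens(sentence));
        }
    }
}
```

Fixed rules like these break on abbreviations such as "Dr." or on quoted text, which is one reason to prefer a dedicated library for real input.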

Features of OpenNLP

Following are the notable features of OpenNLP −

    Named Entity Recognition (NER) − OpenNLP supports NER, which lets you extract the names of locations, people, and things, even while processing queries.

    Summarize − Using the summarize feature, you can summarize paragraphs, articles, documents, or collections of them.

    Searching − In OpenNLP, a given search string or its synonyms can be identified in the given text, even if the word is altered or misspelled.

    Tagging (POS) − Tagging in NLP is used to divide the text into various grammatical elements for further analysis.

    Translation − In NLP, Translation helps in translating one language into another.

    Information grouping − This option in NLP groups the textual information in the content of the document, just like parts of speech.

    Natural Language Generation − It is used for generating information from a database and automating the information reports such as weather analysis or medical reports.

    Feedback Analysis − As the name implies, various types of feedback about products are collected from people and analyzed by NLP to determine how successful a product is in winning their hearts.

    Speech recognition − Though it is difficult to analyze human speech, NLP has some built-in features for this requirement.

OpenNLP API

The Apache OpenNLP library provides classes and interfaces to perform various tasks of natural language processing such as sentence detection, tokenization, finding a name, tagging the parts of speech, chunking a sentence, parsing, co-reference resolution, and document categorization.

In addition to these tasks, we can also train and evaluate our own models for any of these tasks.
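
Evaluating a model generally means comparing its output with a hand-labeled reference. The sketch below shows one common way to score such a comparison, using precision, recall, and F1 over sets of items. It is plain Java with no OpenNLP calls; EvalSketch and its method names are hypothetical, and the choice of these three metrics is an assumption about what you want to measure.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative scoring helper; not part of OpenNLP.
class EvalSketch {

    // Number of predicted items that also appear in the reference (gold) set.
    static int truePositives(Set<String> gold, Set<String> predicted) {
        Set<String> common = new HashSet<>(predicted);
        common.retainAll(gold);
        return common.size();
    }

    // Fraction of predicted items that are correct.
    static double precision(Set<String> gold, Set<String> predicted) {
        return predicted.isEmpty() ? 0.0
                : (double) truePositives(gold, predicted) / predicted.size();
    }

    // Fraction of reference items that were found.
    static double recall(Set<String> gold, Set<String> predicted) {
        return gold.isEmpty() ? 0.0
                : (double) truePositives(gold, predicted) / gold.size();
    }

    // Harmonic mean of precision and recall.
    static double f1(Set<String> gold, Set<String> predicted) {
        double p = precision(gold, predicted);
        double r = recall(gold, predicted);
        return (p + r == 0.0) ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Made-up example data: names a reviewer marked vs. names a model found.
        Set<String> gold = new HashSet<>(Arrays.asList("Alice", "Paris", "Monday", "Google"));
        Set<String> predicted = new HashSet<>(Arrays.asList("Alice", "Paris", "Berlin"));
        System.out.printf("precision=%.3f recall=%.3f f1=%.3f%n",
                precision(gold, predicted), recall(gold, predicted), f1(gold, predicted));
    }
}
```

Precision drops when the model reports items that are not in the reference, while recall drops when it misses items that are; F1 balances the two.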

OpenNLP CLI

In addition to the library, OpenNLP also provides a Command Line Interface (CLI), where we can train and evaluate models. We will discuss this topic in detail in the last chapter of this tutorial.

OpenNLP Models

To perform various NLP tasks, OpenNLP provides a set of predefined models. This set includes models for different languages.

Downloading the models

You can follow the steps given below to download the predefined models provided by OpenNLP.

Step 1 − Open the index page of OpenNLP models by clicking the following link − http://opennlp.sourceforge.net/models-1.5/.

Step 2 − On visiting the given link, you will see a list of components for various languages, along with the links to download them. This page lists all the predefined models provided by OpenNLP.

Download all these models to the folder C:/OpenNLP_models/ by clicking on their respective links. All of these models are language dependent, so when using them, make sure that the model language matches the language of the input text.
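
The download steps and the language check can be sketched in plain Java. The helper below builds the download URL and the local target path for a model file, and reads the language code from the file name. It assumes that model files are named like en-token.bin or de-sent.bin, that is, a language code, a dash, and a component name; that naming pattern and the ModelFiles class with all of its method names are assumptions made for this illustration, not part of OpenNLP.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical helper for locating and sanity-checking model files.
class ModelFiles {
    // Index page and local folder as given in the steps above.
    static final String BASE_URL = "http://opennlp.sourceforge.net/models-1.5/";
    static final String LOCAL_DIR = "C:/OpenNLP_models/";

    static String downloadUrl(String fileName) {
        return BASE_URL + fileName;
    }

    static String localPath(String fileName) {
        return LOCAL_DIR + fileName;
    }

    // Assumes "<language>-<component>.bin"; returns the part before the first dash.
    static String languageOf(String fileName) {
        int dash = fileName.indexOf('-');
        if (dash <= 0) {
            throw new IllegalArgumentException("No language prefix in: " + fileName);
        }
        return fileName.substring(0, dash).toLowerCase(Locale.ROOT);
    }

    // True when the model's language prefix equals the language of the input text.
    static boolean matchesLanguage(String fileName, String inputLanguage) {
        return languageOf(fileName).equals(inputLanguage.toLowerCase(Locale.ROOT));
    }

    // Needs network access, so main does not call it. Redirects that switch
    // between http and https are not followed by the JDK's URL connection.
    static void download(String fileName) throws IOException {
        Path target = Paths.get(localPath(fileName));
        Files.createDirectories(target.getParent());
        try (InputStream in = URI.create(downloadUrl(fileName)).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) {
        String model = "en-token.bin";
        System.out.println(downloadUrl(model) + " -> " + localPath(model));
        System.out.println("Usable for German text? " + matchesLanguage(model, "de"));
    }
}
```

Checking the language prefix before loading a model is a cheap guard against pairing, for example, an English model with German text.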

History of OpenNLP

    In 2010, OpenNLP entered the Apache incubation.

    In 2011, Apache OpenNLP 1.5.2 Incubating was released, and in the same year, it graduated as a top-level Apache project.

    In 2015, OpenNLP 1.6.0 was released.
