Natural Language Toolkit - Introduction
  • Date: 2024-09-17


What is Natural Language Processing (NLP)?

Language is the means of communication that lets humans speak, read, and write. We think, make plans, and make decisions in our natural language. The big question is whether, in the era of artificial intelligence, machine learning, and deep learning, humans can communicate with computers in natural language. Developing NLP applications is a huge challenge because computers require structured data, whereas human speech is unstructured and often ambiguous.

Natural language processing (NLP) is the subfield of computer science, and more specifically of AI, that enables computers to understand, process, and manipulate human language. In simple words, NLP is a way for machines to analyze, understand, and derive meaning from natural human languages like Hindi, English, French, Dutch, etc.

How does it work?

Before diving deep into how NLP works, we must understand how human beings use language. Every day we use hundreds or thousands of words, and other people interpret them and answer accordingly. It’s simple communication for humans, isn’t it? But words run much deeper than that, and we always derive context from what we say and how we say it. That is why NLP draws on contextual patterns rather than focusing on voice modulation.

Let us understand it with an example −

Man is to woman as king is to what?
We can interpret it easily and answer as follows:
Man relates to king, so woman can relate to queen.
Hence the answer is Queen.

How do humans know what each word means? We learn through experience. But how do machines learn the same thing?

Let us understand it with the following easy steps −

    First, we need to feed the machines with enough data so that machines can learn from experience.

    Then the machine creates word vectors, using deep learning algorithms, from the data we fed it earlier as well as from its surrounding data.

    Then, by performing simple algebraic operations on these word vectors, the machine can provide answers as human beings do.
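
The last step can be sketched in a few lines of Python. The two-dimensional vectors below are invented by hand purely for illustration (real word vectors are learned from data and have hundreds of dimensions), and the function names are hypothetical. The sketch computes vector(woman) − vector(man) + vector(king) and returns the remaining word whose vector points in the most similar direction.

```python
import math

# Toy 2-dimensional vectors, invented for illustration only.
# Dimension 0 loosely encodes "royalty", dimension 1 encodes "gender".
VECTORS = {
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' using vector(b) - vector(a) + vector(c)."""
    target = [vb - va + vc
              for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    # The three question words themselves are excluded from the candidates.
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))

print(analogy("man", "woman", "king"))  # queen
```

With these toy vectors the result is exact; with learned vectors the answer is only the nearest neighbour and can be wrong.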

Components of NLP

The following diagram represents the components of natural language processing (NLP) −

[Figure: Components of NLP]

Morphological Processing

Morphological processing is the first component of NLP. It breaks chunks of language input into sets of tokens corresponding to paragraphs, sentences, and words. For example, a word like “everyday” can be broken into two sub-word tokens, “every” and “day”.
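
As a rough illustration of breaking input into sentence and word tokens, here is a minimal sketch using only Python's standard `re` module. The function names and the splitting rules are assumptions made for this example; they are naive heuristics (abbreviations such as "e.g." would be split incorrectly) and are not how any particular toolkit does it.

```python
import re

def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a naive heuristic)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_words(sentence):
    """Return words (keeping inner hyphens/apostrophes) and punctuation marks."""
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", sentence)

text = "NLP is fun. It isn't magic!"
for sentence in split_sentences(text):
    print(split_words(sentence))
# ['NLP', 'is', 'fun', '.']
# ['It', "isn't", 'magic', '!']
```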

Syntax analysis

Syntax analysis, the second component, is one of the most important components of NLP. Its purposes are as follows −

    To check whether a sentence is well formed.

    To break it up into a structure that shows the syntactic relationships between the different words.

    E.g. a sentence with scrambled word order, like “School the goes to student the”, would be rejected by a syntax analyzer.
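
The well-formedness check can be sketched with a tiny context-free grammar. Everything below (the grammar rules, the five-word lexicon, and the function names) is invented for illustration and covers only a sliver of English. The recognizer looks up each word's part of speech and then checks whether the whole tag sequence can be derived from the start symbol S.

```python
# Toy grammar: each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "PP"], ["V", "NP"], ["V"]],
    "PP": [["P", "NP"]],
}
# Toy lexicon: word -> part-of-speech tag.
LEXICON = {"the": "Det", "student": "N", "school": "N", "goes": "V", "to": "P"}

def parse_ends(symbol, tags, start):
    """Return every position where `symbol` can end if it begins at `start`."""
    if symbol not in GRAMMAR:  # a part-of-speech tag: must match one word
        if start < len(tags) and tags[start] == symbol:
            return {start + 1}
        return set()
    ends = set()
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            positions = {e for p in positions for e in parse_ends(part, tags, p)}
        ends |= positions
    return ends

def is_well_formed(sentence):
    """True if the toy grammar derives the whole sentence from S."""
    tags = [LEXICON.get(word) for word in sentence.lower().split()]
    if None in tags:  # unknown word: outside this toy lexicon
        return False
    return len(tags) in parse_ends("S", tags, 0)

print(is_well_formed("The student goes to the school"))  # True
print(is_well_formed("Student the goes to school the"))  # False
```

This sketch only accepts or rejects a sentence; producing the tree that shows the relationships between words would need the parser to record which rules it used.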

Semantic analysis

Semantic analysis is the third component of NLP; it checks the meaningfulness of the text. It involves drawing the exact meaning, or dictionary meaning, from the text. E.g. a sentence like “It’s a hot ice-cream.” would be discarded by a semantic analyzer.

Pragmatic analysis

Pragmatic analysis is the fourth component of NLP. It fits the actual objects or events that exist in a given context to the object references obtained by the previous component, i.e. semantic analysis. E.g. the sentence “Put the fruits in the basket on the table” can have two semantic interpretations (put the fruits into the basket that is on the table, or put the fruits that are in the basket onto the table), so the pragmatic analyzer must choose between these two possibilities.

Examples of NLP Applications

NLP, an emerging technology, drives many of the forms of AI we see these days. For today’s and tomorrow’s increasingly cognitive applications, using NLP to create a seamless, interactive interface between humans and machines will continue to be a top priority. The following are some very useful applications of NLP.

Machine Translation

Machine translation (MT) is one of the most important applications of natural language processing. MT is the process of translating text from one source language into another language. A machine translation system can be either bilingual or multilingual.

Fighting Spam

Due to the enormous increase in unwanted email, spam filters have become important because they are the first line of defense against this problem. NLP functionality can be used to develop spam filtering systems, with false positives and false negatives as the main issues to address.

N-gram modelling, word stemming, and Bayesian classification are some of the existing NLP techniques that can be used for spam filtering.
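
To make Bayesian classification concrete, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written with the standard library only. The four training messages, the labels "spam" and "ham", and the function names are all invented for this example; a real filter would need far more data and better tokenization.

```python
import math
from collections import Counter, defaultdict

# Tiny invented training set, for illustration only.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

def train(examples):
    """Count words per label and messages per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary

def classify(text, model):
    """Return the label with the highest log-probability for `text`."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total_messages)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING)
print(classify("claim free money", model))  # spam
print(classify("agenda for lunch", model))  # ham
```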

Information retrieval & Web search

Most search engines, such as Google, Yahoo, Bing, and WolframAlpha, base their machine translation (MT) technology on NLP deep learning models. Such models allow algorithms to read the text on a webpage, interpret its meaning, and translate it into another language.

Automatic Text Summarization

Automatic text summarization is a technique that creates a short, accurate summary of longer text documents, helping us get relevant information in less time. In this digital era, we are in serious need of automatic text summarization because the flood of information on the internet is not going to stop. NLP plays an important role in developing automatic text summarization systems.
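
One simple way to approach summarization is extractive: score each sentence by how frequent its words are in the whole document and keep the top-scoring ones. The sketch below is an assumption-laden illustration of that idea, not a production method; the stopword list, the scoring formula (average word frequency), and the function name are all invented here.

```python
import re
from collections import Counter

# Small invented stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "and", "is", "it", "of", "the", "to", "in", "on", "was"}

def summarize(text, max_sentences=1):
    """Keep the sentences whose content words are most frequent overall."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    def content_words(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return [w for w in words if w not in STOPWORDS]

    frequencies = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(frequencies[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original order
    return " ".join(sentences[i] for i in keep)

text = ("NLP helps computers read text. "
        "NLP also helps computers summarize long text. "
        "The weather was nice.")
print(summarize(text))  # NLP helps computers read text.
```

Averaging the word frequencies favours short, on-topic sentences; other scoring choices would rank the sentences differently.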

Grammar Correction

Spelling and grammar correction is a very useful feature of word processing software like Microsoft Word. Natural language processing (NLP) is widely used for this purpose.

Question-answering

Question-answering, another major application of natural language processing (NLP), focuses on building systems that automatically answer questions posed by users in their natural language.

Sentiment analysis

Sentiment analysis is another important application of natural language processing (NLP). As its name implies, sentiment analysis is used to −

    Identify the sentiments among several posts, and

    Identify the sentiment where the emotions are not expressed explicitly.

Online e-commerce companies like Amazon, eBay, etc., use sentiment analysis to identify the opinions and sentiments of their customers online. It helps them understand what their customers think about their products and services.
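
A very basic form of sentiment analysis counts positive and negative words from a lexicon. The sketch below assumes a tiny hand-made lexicon and a crude negation rule (a negation word flips only the word immediately after it); both are invented for illustration. A word-list approach like this cannot detect sentiment that is not expressed explicitly, which is where trained models come in.

```python
import re

# Tiny hand-made lexicons, invented for this sketch.
POSITIVE = {"good", "great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "poor", "hate", "slow", "broken"}
NEGATIONS = {"not", "never", "no"}

def sentiment(review):
    """Label a review 'positive', 'negative' or 'neutral' by word counts."""
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", review.lower()):
        if word in NEGATIONS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1 or 0
        score += -polarity if negate else polarity
        negate = False  # negation only affects the very next word
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("Great phone, I love it"))              # positive
print(sentiment("Not good, the screen arrived broken")) # negative
print(sentiment("It arrived on Monday"))                # neutral
```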

Speech engines

Speech engines like Siri, Google Voice, and Alexa are built on NLP so that we can communicate with them in our natural language.

Implementing NLP

In order to build the above-mentioned applications, we need a specific skill set with a strong understanding of language, along with tools to process language efficiently. To achieve this, various tools are available. Some of them are open source, while others are developed by organizations to build their own NLP applications. The following is a list of some NLP tools −

    Natural Language Toolkit (NLTK)

    Mallet

    GATE

    OpenNLP

    UIMA

    Gensim

    Stanford toolkit

Most of these tools are written in Java.

Natural Language Toolkit (NLTK)

Among the above-mentioned NLP tools, NLTK scores very high for ease of use and explanation of concepts. Python has a gentle learning curve, and since NLTK is written in Python, NLTK also makes a very good learning kit. NLTK covers most common tasks, such as tokenization, stemming, lemmatization, punctuation handling, character count, and word count. It is very elegant and easy to work with.
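
To show what a few of these tasks amount to, here is a plain-Python sketch of punctuation removal, character count, and word count. It deliberately uses only the standard library and does not reproduce NLTK's own API; the function name and the returned dictionary keys are assumptions for this example.

```python
import string
from collections import Counter

def basic_stats(text):
    """Return character count, word count and the two most frequent words."""
    # Strip ASCII punctuation, lowercase, then split on whitespace.
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    words = cleaned.lower().split()
    return {
        "characters": len(text),
        "words": len(words),
        "most_common": Counter(words).most_common(2),
    }

stats = basic_stats("NLTK is elegant. NLTK is easy to work with!")
print(stats["characters"])   # 43
print(stats["words"])        # 9
print(stats["most_common"])  # [('nltk', 2), ('is', 2)]
```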
