Synonym & Antonym Replacement
Replacing words with common synonyms
In NLP tasks such as frequency analysis and text indexing, it helps to compress the vocabulary without losing meaning, because a smaller vocabulary takes less memory. To achieve this, we define a mapping from each word to the common synonym that should replace it. In the example below, we create a class named word_syn_replacer which can be used for replacing words with their common synonyms.
Example
First, import the re package and the WordNet corpus reader. The dictionary-based replacer below does not strictly depend on either import.
import re
from nltk.corpus import wordnet
Next, create the class that takes a word replacement mapping −
class word_syn_replacer(object):
    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        # Return the mapped synonym, or the word itself if none is defined.
        return self.word_map.get(word, word)
Save this Python program (say, as replacesyn.py). You can then import the word_syn_replacer class from the Python prompt whenever you want to replace words with common synonyms. Let us see how.
from replacesyn import word_syn_replacer
rep_syn = word_syn_replacer({'bday': 'birthday'})
rep_syn.replace('bday')
Output
birthday
Complete implementation example
import re
from nltk.corpus import wordnet

class word_syn_replacer(object):
    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        # Return the mapped synonym, or the word itself if none is defined.
        return self.word_map.get(word, word)
Once you have saved the above program, you can import the class and use it as follows −
from replacesyn import word_syn_replacer
rep_syn = word_syn_replacer({'bday': 'birthday'})
rep_syn.replace('bday')
Output
birthday
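To make the memory argument concrete, the following is a minimal sketch of how a dictionary-based replacer shrinks the vocabulary seen by a frequency count. The class name WordSynReplacer, the token list, and the synonym map are hypothetical and chosen only for illustration; the class mirrors the logic of word_syn_replacer but is redefined here so the snippet runs on its own.

```python
from collections import Counter


class WordSynReplacer:
    """Dictionary-based replacer with the same logic as word_syn_replacer."""

    def __init__(self, word_map):
        self.word_map = word_map

    def replace(self, word):
        # Fall back to the word itself when no synonym is mapped.
        return self.word_map.get(word, word)


# Hypothetical tokens and synonym map, for illustration only.
tokens = ["happy", "bday", "to", "you", "happy", "birthday", "congrats"]
replacer = WordSynReplacer({"bday": "birthday", "congrats": "congratulations"})

raw_counts = Counter(tokens)
normalized_counts = Counter(replacer.replace(t) for t in tokens)

print(len(raw_counts), len(normalized_counts))  # 6 5
print(normalized_counts["birthday"])            # 2
```

After replacement, "bday" and "birthday" are counted as one entry, so the vocabulary drops from six distinct words to five.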
The disadvantage of the above method is that the synonyms must be hardcoded in a Python dictionary. Two better alternatives are a CSV file and a YAML file. We can save our synonym vocabulary in either format and construct the word_map dictionary from it. Let us understand the concept with the help of examples.
Using CSV file
To use a CSV file for this purpose, the file should have two columns: the first column holds the word and the second column holds the synonym meant to replace it. Let us save this file as syn.csv. In the example below, we create a class named CSVword_syn_replacer which extends word_syn_replacer in the replacesyn.py file and constructs the word_map dictionary from the syn.csv file.
Example
First, import the necessary packages.
import csv
Next, create the class that builds the word replacement mapping from the CSV file −
class CSVword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        word_map = {}
        # Each row holds a word and the synonym that replaces it.
        with open(fname, newline='') as f:
            for line in csv.reader(f):
                word, syn = line
                word_map[word] = syn
        super(CSVword_syn_replacer, self).__init__(word_map)
After saving it in replacesyn.py, import the CSVword_syn_replacer class when you want to replace words with common synonyms. Let us see how.
from replacesyn import CSVword_syn_replacer
rep_syn = CSVword_syn_replacer('syn.csv')
rep_syn.replace('bday')
Output
birthday
Complete implementation example
import csv

# Added to replacesyn.py, where word_syn_replacer is defined.
class CSVword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        word_map = {}
        # Each row holds a word and the synonym that replaces it.
        with open(fname, newline='') as f:
            for line in csv.reader(f):
                word, syn = line
                word_map[word] = syn
        super(CSVword_syn_replacer, self).__init__(word_map)
Once you have saved the above program, you can import the class and use it as follows −
from replacesyn import CSVword_syn_replacer
rep_syn = CSVword_syn_replacer('syn.csv')
rep_syn.replace('bday')
Output
birthday
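The tutorial does not show what syn.csv itself contains, so here is a minimal sketch of one plausible layout and of how it turns into a word_map dictionary. The two sample rows and the temporary file location are assumptions for illustration; only the two-column format comes from the description above.

```python
import csv
import os
import tempfile

# Hypothetical contents for syn.csv: one "word,synonym" pair per row.
rows = [["bday", "birthday"], ["congrats", "congratulations"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "syn.csv")

    # Write the sample file.
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)

    # Read it back into a word -> synonym dictionary,
    # skipping rows that do not have exactly two columns.
    word_map = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) != 2:
                continue
            word, syn = row
            word_map[word] = syn

print(word_map)  # {'bday': 'birthday', 'congrats': 'congratulations'}
```

Note that a header row such as word,synonym would be read as an ordinary mapping, so the file is probably best kept header-free.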
Using YAML file
Just as we used a CSV file, we can also use a YAML file for this purpose (PyYAML must be installed). The file should contain a flat mapping of each word to its synonym, for example bday: birthday. Let us save the file as syn.yaml. In the example below, we create a class named YAMLword_syn_replacer which extends word_syn_replacer in the replacesyn.py file and constructs the word_map dictionary from the syn.yaml file.
Example
First, import the necessary packages.
import yaml
Next, create the class that builds the word replacement mapping from the YAML file −
class YAMLword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        # safe_load parses plain YAML data without constructing arbitrary objects.
        with open(fname) as f:
            word_map = yaml.safe_load(f)
        super(YAMLword_syn_replacer, self).__init__(word_map)
After saving it in replacesyn.py, import the YAMLword_syn_replacer class when you want to replace words with common synonyms. Let us see how.
from replacesyn import YAMLword_syn_replacer
rep_syn = YAMLword_syn_replacer('syn.yaml')
rep_syn.replace('bday')
Output
birthday
Complete implementation example
import yaml

# Added to replacesyn.py, where word_syn_replacer is defined.
class YAMLword_syn_replacer(word_syn_replacer):
    def __init__(self, fname):
        # safe_load parses plain YAML data without constructing arbitrary objects.
        with open(fname) as f:
            word_map = yaml.safe_load(f)
        super(YAMLword_syn_replacer, self).__init__(word_map)
Once you have saved the above program, you can import the class and use it as follows −
from replacesyn import YAMLword_syn_replacer
rep_syn = YAMLword_syn_replacer('syn.yaml')
rep_syn.replace('bday')
Output
birthday
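As with the CSV case, the contents of syn.yaml are not shown, so the following is a minimal sketch of one plausible layout, parsed from a string rather than a file. The sample entries are assumptions for illustration; yaml.safe_load comes from the PyYAML package.

```python
import yaml  # provided by the PyYAML package

# Hypothetical contents for syn.yaml: a flat mapping of word -> synonym.
sample_yaml = """
bday: birthday
congrats: congratulations
"""

word_map = yaml.safe_load(sample_yaml)

# Guard against files that parse to something other than a mapping.
if not isinstance(word_map, dict):
    raise ValueError("expected a YAML mapping of word -> synonym")

print(word_map["bday"])  # birthday
```

One caveat when building such a file: PyYAML reads unquoted words such as yes, no, on and off as booleans, so entries for those words need to be quoted.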
Antonym replacement
An antonym is a word with the opposite meaning of another word, and antonym replacement is the counterpart of synonym replacement. In this section, we will be dealing with antonym replacement, i.e., replacing words with unambiguous antonyms by using WordNet. In the example below, we create a class named word_antonym_replacer which has two methods, one for replacing a word and the other for removing negations.
Example
First, import the necessary packages.
from nltk.corpus import wordnet
Next, create the class named word_antonym_replacer −
class word_antonym_replacer(object):
    def replace(self, word, pos=None):
        # Collect every antonym across all synsets and lemmas of the word.
        antonyms = set()
        for syn in wordnet.synsets(word, pos=pos):
            for lemma in syn.lemmas():
                for antonym in lemma.antonyms():
                    antonyms.add(antonym.name())
        # Replace only when the antonym is unambiguous.
        if len(antonyms) == 1:
            return antonyms.pop()
        else:
            return None

    def replace_negations(self, sent):
        i, l = 0, len(sent)
        words = []
        while i < l:
            word = sent[i]
            if word == 'not' and i+1 < l:
                ant = self.replace(sent[i+1])
                if ant:
                    # Drop 'not' and the negated word, keep the antonym.
                    words.append(ant)
                    i += 2
                    continue
            words.append(word)
            i += 1
        return words
Save this Python program (say, as replaceantonym.py). You can then import the word_antonym_replacer class from the Python prompt whenever you want to replace words with their unambiguous antonyms. Let us see how.
from replaceantonym import word_antonym_replacer
rep_antonym = word_antonym_replacer()
rep_antonym.replace('uglify')
Output
'beautify'
The replace_negations method works on a tokenized sentence −
sentence = ['Let us', 'not', 'uglify', 'our', 'country']
rep_antonym.replace_negations(sentence)
Output
['Let us', 'beautify', 'our', 'country']
Complete implementation example
from nltk.corpus import wordnet

class word_antonym_replacer(object):
    def replace(self, word, pos=None):
        # Collect every antonym across all synsets and lemmas of the word.
        antonyms = set()
        for syn in wordnet.synsets(word, pos=pos):
            for lemma in syn.lemmas():
                for antonym in lemma.antonyms():
                    antonyms.add(antonym.name())
        # Replace only when the antonym is unambiguous.
        if len(antonyms) == 1:
            return antonyms.pop()
        else:
            return None

    def replace_negations(self, sent):
        i, l = 0, len(sent)
        words = []
        while i < l:
            word = sent[i]
            if word == 'not' and i+1 < l:
                ant = self.replace(sent[i+1])
                if ant:
                    # Drop 'not' and the negated word, keep the antonym.
                    words.append(ant)
                    i += 2
                    continue
            words.append(word)
            i += 1
        return words
Once you have saved the above program, you can import the class and use it as follows −
from replaceantonym import word_antonym_replacer
rep_antonym = word_antonym_replacer()
rep_antonym.replace('uglify')
sentence = ['Let us', 'not', 'uglify', 'our', 'country']
rep_antonym.replace_negations(sentence)
Output
['Let us', 'beautify', 'our', 'country']
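The negation logic can be tried without the WordNet corpus installed. The sketch below replaces the WordNet lookup with a small hand-written antonym table, which is purely an assumption for illustration (the ANTONYMS entries are not WordNet's actual data), and keeps the same rule: a word is replaced only when exactly one antonym candidate exists.

```python
# Hypothetical antonym table standing in for the WordNet lookup.
ANTONYMS = {
    "uglify": {"beautify"},
    "good": {"bad", "evil"},  # ambiguous: more than one candidate
}


def replace(word):
    """Return the antonym only when exactly one candidate exists."""
    candidates = ANTONYMS.get(word, set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None


def replace_negations(sent):
    """Collapse 'not' + word into the word's unambiguous antonym."""
    words = []
    i = 0
    while i < len(sent):
        word = sent[i]
        if word == "not" and i + 1 < len(sent):
            ant = replace(sent[i + 1])
            if ant:
                words.append(ant)
                i += 2  # skip both 'not' and the negated word
                continue
        words.append(word)
        i += 1
    return words


print(replace_negations(["Let us", "not", "uglify", "our", "country"]))
# ['Let us', 'beautify', 'our', 'country']
print(replace_negations(["this", "is", "not", "good"]))
# ['this', 'is', 'not', 'good']
```

The second call leaves the sentence unchanged because the table lists two candidates for "good", so picking either one could change the meaning.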