NLTK Python
NLTK, or Natural Language Toolkit, is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, as well as wrappers for industrial-strength NLP libraries.
Here are a few examples of how to use NLTK:
- Tokenization – Breaking down text into words, sentences, etc.
import nltk
nltk.download('punkt') # download the tokenizer model (older NLTK releases)
nltk.download('punkt_tab') # tokenizer data that NLTK 3.9 and later look for
text = "This is a sentence. This is another sentence."
sentences = nltk.sent_tokenize(text) # Sentence tokenization
print(sentences)
words = nltk.word_tokenize(text) # Word tokenization
print(words)
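If downloading the Punkt model is not an option, NLTK also ships rule-based tokenizers that need no extra data. The sketch below uses `wordpunct_tokenize`, which splits text with a simple regular expression. It is a rougher substitute for `word_tokenize`, not an equivalent: for example, it breaks contractions such as "don't" into three tokens.

```python
from nltk.tokenize import wordpunct_tokenize

text = "This is a sentence. This is another sentence."

# Splits into runs of word characters and runs of punctuation;
# no model download is needed for this tokenizer.
tokens = wordpunct_tokenize(text)
print(tokens)
```

For well-formed English prose the result is usually close to what `word_tokenize` gives, which makes it handy for quick experiments.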
- Part of Speech (POS) Tagging – Assigning word types to tokens, like verb or noun.
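nltk.download('averaged_perceptron_tagger') # download the POS tagger model (older NLTK releases)
nltk.download('averaged_perceptron_tagger_eng') # tagger data that NLTK 3.9 and later look for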
sentence = nltk.word_tokenize("This is a sentence")
tagged = nltk.pos_tag(sentence)
print(tagged)
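`nltk.pos_tag` returns a list of `(token, tag)` tuples with Penn Treebank tags, so ordinary list operations are enough to work with the result. The sketch below picks the nouns out of a hand-written list in that format. The tags are typed in for illustration rather than produced by the tagger, so the snippet runs without any downloads.

```python
# Hand-written sample in the (token, tag) format that nltk.pos_tag returns.
tagged = [("This", "DT"), ("is", "VBZ"), ("a", "DT"), ("sentence", "NN")]

# Penn Treebank noun tags all start with "NN" (NN, NNS, NNP, NNPS).
nouns = [token for token, tag in tagged if tag.startswith("NN")]
print(nouns)  # ['sentence']
```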
- Stemming – Reducing words to their root (stem) by stripping suffixes; the stem is not always a dictionary word.
from nltk.stem import PorterStemmer
stemmer = PorterStemmer()
print(stemmer.stem('running'))
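To see what stemming does and does not handle, it helps to run several forms of a word through the same stemmer. The sketch below is only an illustration with a word list chosen for this purpose: regular inflections collapse to one stem, the stem may not be a real word, and an irregular form such as "ran" is left alone.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Sample words picked for illustration; PorterStemmer needs no downloads.
words = ["running", "runs", "studies", "ran"]
stems = {word: stemmer.stem(word) for word in words}

for word, stem in stems.items():
    print(word, "->", stem)
# running -> run
# runs -> run
# studies -> studi
# ran -> ran
```

Cases like "ran" are where lemmatization, which looks words up in a dictionary, tends to do better than rule-based suffix stripping.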
- Lemmatization – Reducing words to their dictionary base form (lemma), guided by the word's part of speech. WordNetLemmatizer treats a word as a noun unless told otherwise.
nltk.download('wordnet') # download Wordnet, a lexical database for English
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize('running', pos='v')) # Lemmatize as verb
- Named Entity Recognition (NER) – Classifying named entities in text.
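nltk.download('maxent_ne_chunker') # download the NER model (older NLTK releases)
nltk.download('maxent_ne_chunker_tab') # NER model data that NLTK 3.9 and later look for
nltk.download('words') # download the words corpus
tokens = nltk.word_tokenize("Apple Inc. is planning to open a new office in San Francisco")
tagged = nltk.pos_tag(tokens) # ne_chunk expects POS-tagged tokens
named_entities = nltk.ne_chunk(tagged)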
print(named_entities)
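`nltk.ne_chunk` returns an `nltk.Tree`: each recognized entity is a subtree with a label such as `PERSON`, `ORGANIZATION`, or `GPE`, and every other token stays a plain `(token, tag)` tuple. The sketch below shows one way to pull the entities out as text. The helper name `extract_entities` is made up for this example, and the tree is built by hand in the shape `ne_chunk` returns, so its labels are illustrative rather than actual chunker output.

```python
from nltk.tree import Tree


def extract_entities(tree):
    """Return (entity text, label) pairs from a chunk tree."""
    entities = []
    for node in tree:
        # Entity chunks are Tree objects; ordinary tokens are (token, tag) tuples.
        if isinstance(node, Tree):
            text = " ".join(token for token, _tag in node.leaves())
            entities.append((text, node.label()))
    return entities


# Hand-built sample tree (an assumption for illustration, not real output).
sample = Tree("S", [
    Tree("ORGANIZATION", [("Apple", "NNP"), ("Inc.", "NNP")]),
    ("is", "VBZ"),
    ("planning", "VBG"),
    Tree("GPE", [("San", "NNP"), ("Francisco", "NNP")]),
])

entities = extract_entities(sample)
print(entities)  # [('Apple Inc.', 'ORGANIZATION'), ('San Francisco', 'GPE')]
```

The same helper can be called on the tree returned by `nltk.ne_chunk`; which entities the chunker actually finds depends on the model.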
- Stop words – Filtering common words that typically don’t contain useful information for NLP tasks.
nltk.download('stopwords') # download the stopwords corpus
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))
words = nltk.word_tokenize("This is a sentence")
filtered_words = [w for w in words if w.lower() not in stop_words]
print(filtered_words)
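Stop-word filtering is often combined with dropping punctuation tokens, which `word_tokenize` keeps as separate items. The sketch below wraps both steps in a small helper. The name `remove_stop_words` and the tiny stop list are made up for this example so that it runs without downloads; in practice you would pass `set(stopwords.words('english'))`.

```python
def remove_stop_words(tokens, stop_words):
    """Keep alphabetic tokens whose lowercase form is not a stop word."""
    return [t for t in tokens if t.isalpha() and t.lower() not in stop_words]


# Hand-picked stop list for illustration only.
demo_stop_words = {"this", "is", "a"}
tokens = ["This", "is", "a", "sentence", "."]

filtered = remove_stop_words(tokens, demo_stop_words)
print(filtered)  # ['sentence']
```

Note that the `isalpha()` check also discards numbers and tokens containing apostrophes or hyphens, so relax it if those matter for your task.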
These are just a few examples. NLTK provides a wide range of functionalities for natural language processing and understanding.