SPACY for Beginners - NLP

Get started with NLP using spaCy

Pema Grg
Published in EKbana
Sep 6, 2018

spaCy is an open-source library for advanced Natural Language Processing, written in Python and Cython. It is published under the MIT license.

Today we’ll look at how to get started with NLP using spaCy. Before starting, make sure that you have Python and spaCy installed on your system.

To install spaCy and the English model, run "pip install spacy" followed by "python -m spacy download en_core_web_sm" in a terminal.
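
Once the install commands have run, a quick sanity check can confirm that both packages are importable. The sketch below uses only the Python standard library. It assumes the package names spacy and en_core_web_sm used by current spaCy releases, and is_installed is a hypothetical helper name chosen for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named package can be imported in this environment."""
    return importlib.util.find_spec(package_name) is not None


# "spacy" is the library; "en_core_web_sm" is the small English model package
# (an assumption: other model packages can be checked the same way).
for name in ("spacy", "en_core_web_sm"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports "missing", re-run the corresponding install command before continuing.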

In spaCy, the “nlp” object is used to create documents and to access linguistic annotations and other NLP properties.

1. IMPORT SPACY

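The default English model is en_core_web_sm (English core web). Older spaCy versions let you load it through the shortcut “en”; in spaCy v3 the shortcut is no longer supported, so the model is loaded by its full name.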

2. WORD TOKENIZE

Word tokenization breaks the text into individual tokens, i.e. it splits the sentences into words and punctuation marks.
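
Here is a small sketch of word tokenization. Because the tokenizer needs no trained model, it uses spacy.blank("en") rather than the loaded model; that choice and the sample sentence are assumptions made for this example.

```python
import spacy

# A blank English pipeline contains only the tokenizer (no trained model needed).
nlp = spacy.blank("en")

doc = nlp("spaCy splits text into tokens, including punctuation.")

# Each item in a Doc is a Token; .text gives its original string.
tokens = [token.text for token in doc]
print(tokens)
```

Note that punctuation marks such as the comma and the full stop become tokens of their own.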

3. SENTENCE TOKENIZE

Sentence tokenization applies when the text has more than one sentence, i.e. it breaks the text into a list of sentences.
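
One possible way to split sentences is sketched below. It swaps in spaCy's rule-based "sentencizer" component on a blank pipeline, which splits on sentence-final punctuation; a loaded model such as en_core_web_sm also exposes doc.sents, using its own trained components instead. The sample text is made up.

```python
import spacy

nlp = spacy.blank("en")
# Rule-based sentence splitter (spaCy v3 API); no trained model required.
nlp.add_pipe("sentencizer")

text = "This is the first sentence. This is the second one. Is this the third?"
doc = nlp(text)

# doc.sents yields one Span per detected sentence.
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(sentence)
```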

4. STOP WORDS REMOVAL

Remove uninformative words such as “is”, “the” and “a” from the sentences using NLTK's stop word list, as they carry little information on their own.
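
The sketch below shows the same idea with spaCy's built-in English stop word list instead of NLTK's, so no extra library is needed. This is a swapped-in technique, and the two lists are not identical. The sample sentence is an assumption.

```python
import spacy

# Stop word flags are lexical attributes, so a blank pipeline is enough.
nlp = spacy.blank("en")

doc = nlp("The cat is on a mat.")

# Keep tokens that are neither stop words nor punctuation.
filtered = [token.text for token in doc if not token.is_stop and not token.is_punct]
print(filtered)

# The full stop word set is also available for inspection.
print(len(nlp.Defaults.stop_words), "stop words in spaCy's English list")
```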

5. LEMMATIZATION

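Lemmatize the text to reduce each word to its base (dictionary) form, e.g. “functions” becomes “function”. Note that a derived word such as “functionality” is usually its own lemma; reducing it to “function” is closer to stemming.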

7. WORD FREQUENCY

Count word occurrences using NLTK's FreqDist class. Word frequency helps us estimate how important a word is in the document by showing how many times it is used.
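
As a sketch of word counting without NLTK, the example below uses collections.Counter from the standard library in place of FreqDist (a swapped-in technique; both offer a most_common method). The sample text is an assumption.

```python
from collections import Counter

import spacy

nlp = spacy.blank("en")
doc = nlp("NLP is fun. NLP is useful. Learning NLP takes practice.")

# Lowercase the alphabetic tokens so "NLP" and "nlp" count as the same word.
words = [token.text.lower() for token in doc if token.is_alpha]

freq = Counter(words)
print(freq.most_common(2))  # [('nlp', 3), ('is', 2)]
```

Removing stop words first, as in step 4, usually makes the top of the list more informative.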

8. POS TAGS

POS (part-of-speech) tags tell us the grammatical role of each word, e.g. whether it is a noun, an adjective, etc.

9. NER

NER (Named Entity Recognition) is the process of identifying named entities in the text, such as people, organizations and locations.

Voilà! Now you know the basics of NLP 👌

You can now try some mini projects like:

  1. Extracting keywords from documents and articles.
  2. Generating part-of-speech tags for phrases.
  3. Getting the most-used words across all documents.
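
As a starting point for the third idea, here is a rough sketch that counts the most-used words across several documents. The function name top_words and the sample documents are hypothetical, and it combines spaCy's stop word flags with collections.Counter rather than any method from the original code.

```python
from collections import Counter

import spacy


def top_words(texts, n=3):
    """Return the n most common non-stop words across all the given texts."""
    nlp = spacy.blank("en")  # tokenizer and stop word flags only
    counts = Counter()
    # nlp.pipe processes an iterable of texts and yields one Doc per text.
    for doc in nlp.pipe(texts):
        counts.update(
            token.text.lower()
            for token in doc
            if token.is_alpha and not token.is_stop
        )
    return counts.most_common(n)


documents = [
    "The cat sat on the mat.",
    "A cat chased the mouse.",
    "The mouse ate cheese.",
]
print(top_words(documents, n=2))
```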

You can also check out: NLP for beginners using NLTK

GitHub link for more code: https://github.com/pemagrg1/SPACY-for-Beginners


Currently an NLP Engineer @ EKbana (Nepal) | previously worked @ Awesummly (Bangalore) | internship @ Meltwater, Bangalore | LinkedIn: https://www.linkedin.com/in/pemagrg/