ErikaJacobs/Harry-Potter-Text-Mining


A Text Analysis of the Harry Potter Book Series

This project applies sentiment analysis to the text of the Harry Potter book series by J.K. Rowling. Exploratory data analysis and data visualizations, including a themed word cloud, are created in a Jupyter notebook. The VADER package in Python assigns a sentiment score to each sentence of each book, so that trends in sentiment can be tracked over the course of the series. A Naive Bayes classifier is then used to identify the most notable text features of positive and negative sentiment.
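The sentence-level scoring described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`split_sentences`, `label_from_compound`, `score_sentences`, `average_sentiment`, `make_vader_scorer`) are hypothetical, the regex sentence splitter is a simplification, and the ±0.05 cutoffs on the VADER compound score are the commonly cited defaults, assumed here rather than taken from this project.

```python
import re
from statistics import mean


def split_sentences(text):
    """Naive splitter on ., ! and ?; nltk.sent_tokenize is a more robust option."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a label (cutoffs are an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_sentences(text, scorer):
    """Return a (sentence, compound, label) tuple for every sentence in text."""
    rows = []
    for sentence in split_sentences(text):
        compound = scorer(sentence)
        rows.append((sentence, compound, label_from_compound(compound)))
    return rows


def average_sentiment(rows):
    """Mean compound score, e.g. per chapter or per book, for a trend plot."""
    return mean(compound for _, compound, _ in rows) if rows else 0.0


def make_vader_scorer():
    """Build a scorer backed by NLTK's VADER.

    Requires nltk and a one-time nltk.download("vader_lexicon").
    The import is local so the helpers above work without NLTK installed.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda sentence: analyzer.polarity_scores(sentence)["compound"]
```

With NLTK and the VADER lexicon installed, `score_sentences(book_text, make_vader_scorer())` would score one book's text, and calling `average_sentiment` on each chapter's rows gives one point per chapter for a sentiment-over-time plot. Passing the scorer in as a function keeps the splitting and labelling logic independent of VADER.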

Methods Used

  • Sentiment Analysis
  • Naive Bayes Classifier
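As a rough illustration of how a Naive Bayes model can surface notable text features, the sketch below ranks words by the smoothed log ratio of how often they appear in positive versus negative sentences. It is a standard-library stand-in, not the notebook's implementation; NLTK's `NaiveBayesClassifier` offers a comparable view through `show_most_informative_features`. The function names and the toy tokenizer are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; a simplification of a real tokenizer."""
    return re.findall(r"[a-z']+", sentence.lower())


def informative_words(labeled_sentences, top_n=5):
    """Rank words by log(P(word | positive) / P(word | negative)).

    labeled_sentences is an iterable of (sentence, label) pairs with labels
    "positive" or "negative". Word presence per sentence is used as the
    feature, with add-one (Laplace) smoothing so unseen words do not give
    zero probabilities.
    """
    labeled_sentences = list(labeled_sentences)
    totals = Counter(label for _, label in labeled_sentences)
    presence = {"positive": Counter(), "negative": Counter()}
    for sentence, label in labeled_sentences:
        presence[label].update(set(tokenize(sentence)))

    vocab = set(presence["positive"]) | set(presence["negative"])
    scored = []
    for word in vocab:
        p_pos = (presence["positive"][word] + 1) / (totals["positive"] + 2)
        p_neg = (presence["negative"][word] + 1) / (totals["negative"] + 2)
        scored.append((word, math.log(p_pos / p_neg)))

    # Largest absolute log ratio first; ties broken alphabetically.
    scored.sort(key=lambda item: (-abs(item[1]), item[0]))
    return scored[:top_n]
```

A positive score marks a word that leans toward positive sentences, a negative score one that leans toward negative sentences. In this project the labels would come from the VADER step rather than from hand annotation.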

Technologies Used

  • Jupyter Notebook
  • Python
  • R Studio

Packages Used

  • NLTK - Python
  • Pandas - Python
  • WordCloud - Python
  • VADER - Python
  • harrypotter - R Studio

Featured Notebooks

Other Repository Contents

  • Images - Used to compose Word Cloud
    • Background.jpg
    • Footsteps.png
    • HP_WordCloud.png
    • Maurader_Logo.png
    • Mauraders_Map.png
    • Thunderbolt.jpg
  • Font - Used to style Word Cloud
    • Lumos.tff
  • Book Text - Output of Harry Potter text from R Studio
    • HPBook1.txt
    • HPBook2.txt
    • HPBook3.txt
    • HPBook4.txt
    • HPBook5.txt
    • HPBook6.txt
    • HPBook7.txt
  • Output - Files created in Notebooks
    • Exploratory Data Analysis
      • df.xlsx
      • HPavgwords.png
      • HPtotalwords.png
      • HPlongchaps.png
      • HPshortchaps.png
    • Word Cloud
      • HP_WordCloud_FINAL.png
    • Sentiment Analysis
      • HPTimeplot.png
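The image and font files listed above suggest a masked, custom-font word cloud. A minimal sketch of that idea with the WordCloud package might look like the following. The function names, the `min_length` filter, and the white background are assumptions for illustration, and `mask_path`, `font_path`, and `out_path` are placeholders rather than the notebook's actual paths.

```python
import re
from collections import Counter


def word_frequencies(text, stopwords=frozenset(), min_length=3):
    """Count lowercase words, skipping stopwords and very short tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(
        w for w in words if len(w) >= min_length and w not in stopwords
    )


def render_word_cloud(frequencies, mask_path, font_path, out_path):
    """Draw a word cloud shaped by a mask image and styled with a custom font.

    Third-party imports (numpy, Pillow, wordcloud) are kept local so the
    counting helper above works without them installed.
    """
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    # Pure white areas of the mask image are left empty by WordCloud.
    mask = np.array(Image.open(mask_path))
    cloud = WordCloud(font_path=font_path, mask=mask, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)
```

Counting frequencies separately from rendering makes it easy to inspect or filter the word list before drawing, for example by removing character names that would otherwise dominate the cloud.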

Sources

About

NLP text analysis of the Harry Potter book series
