Text-Mining & Social Networks


Text-Mining

This documentation summarises various text-mining techniques in Python.

Contents:

  • 1. Basics in Text-Mining
    • 1.1. Tokenisation
    • 1.2. Vectorization
  • 2. Spell Checker
    • 2.1. Jaccard Distance on Trigram
    • 2.2. Jaccard Distance on 4-gram
    • 2.3. Edit Distance
  • 3. Text Classification
    • 3.1. Add New Features to Vectorizer
    • 3.2. Multinomial Naive Bayes & CountVectorizer
    • 3.3. Multinomial Naive Bayes & TfidfVectorizer
    • 3.4. Support Vector Machine & TfidfVectorizer
    • 3.5. Logistic Regression & TfidfVectorizer
    • 3.6. Logistic Regression & CountVectorizer
  • 4. Document Similarity
    • 4.1. Create a Similarity Function Between Two Documents
    • 4.2. Assign Scores to New Documents
    • 4.3. Calculate Accuracy Score
  • 5. Topic Modelling
    • 5.1. Latent Dirichlet Allocation
  • 6. Resources
  • 7. Glossary

© Copyright 2017, Jake Teo. Revision c99fbbb3.
