yernaz-togizbayev/natural_language_processing

🧠 Natural Language Processing

This repository contains my solutions and implementations for a university-level Natural Language Processing (NLP) course.

The project consists of 10 exercise sheets, each covering fundamental and advanced NLP concepts using Python, NLTK, and related libraries.


📚 Course Topics Covered

Throughout the exercises, the following NLP concepts were explored:

📝 Text Processing

  • Tokenization
  • Normalization
  • Stopword removal
  • Stemming & Lemmatization
  • Regular expressions
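
As a rough illustration of these steps, the sketch below chains tokenization, normalization, stopword removal, and a very crude stemmer using only the standard library. It is not taken from the exercise sheets: the stopword list, the `naive_stem` rule set, and all function names are assumptions made for this example, and NLTK's own tokenizers and stemmers are considerably more thorough.

```python
import re

# Tiny illustrative stopword list (an assumption; NLTK ships a much larger one).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text (normalization) and extract word tokens with a regex."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def naive_stem(token):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words stay intact.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


tokens = tokenize("The cats are chasing the mice in the garden.")
content = remove_stopwords(tokens)
stems = [naive_stem(t) for t in content]
print(stems)
```

Note that a suffix stripper like this produces non-words such as "chas" and leaves irregular forms such as "mice" untouched, which is the kind of gap lemmatization is meant to close.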

📊 Language Modeling

  • N-gram models
  • Probability estimation
  • Perplexity evaluation
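
To make the three items above concrete, here is a minimal sketch of a bigram model with add-one (Laplace) smoothing and a perplexity function. The toy corpus, the choice of smoothing, and the function names are assumptions for illustration and may differ from what the exercise sheets use.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentence, unigrams, bigrams, vocab_size):
    """Exponentiated average negative log-probability per predicted token."""
    padded = ["<s>"] + sentence + ["</s>"]
    log_prob = sum(
        math.log(bigram_prob(p, w, unigrams, bigrams, vocab_size))
        for p, w in zip(padded, padded[1:])
    )
    return math.exp(-log_prob / (len(padded) - 1))


# Hypothetical toy corpus, already tokenized.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
unigrams, bigrams = train_bigram(corpus)
# <s> is never predicted, so it is excluded from the vocabulary size.
vocab = set(unigrams) - {"<s>"}

seen = perplexity(["the", "cat", "sat"], unigrams, bigrams, len(vocab))
unseen = perplexity(["sat", "the", "ran"], unigrams, bigrams, len(vocab))
print(seen, unseen)
```

A sentence made of bigrams that occur in the training data receives a lower perplexity than one that scrambles the same words, which is the behavior perplexity evaluation is meant to capture.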

🏷 Part-of-Speech Tagging

  • POS tagging with NLTK
  • Tagging accuracy evaluation
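
Tagging accuracy is the fraction of tokens whose predicted tag matches the gold tag. The sketch below measures it for a most-frequent-tag baseline written in plain Python; the tiny tagged sentences, the tag names, and the `NOUN` fallback for unknown words are assumptions for this example, and NLTK's taggers are not used here.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sentences:
        for word, tag in sent:
            counts[word.lower()][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(words, model, default="NOUN"):
    """Tag each word with its most frequent tag, or the default if unseen."""
    return [(w, model.get(w.lower(), default)) for w in words]


def accuracy(gold, predicted):
    """Token-level accuracy over parallel lists of tagged sentences."""
    pairs = [
        (g_tag, p_tag)
        for g_sent, p_sent in zip(gold, predicted)
        for (_, g_tag), (_, p_tag) in zip(g_sent, p_sent)
    ]
    return sum(g == p for g, p in pairs) / len(pairs)


# Hypothetical training and test data.
train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
gold = [[("the", "DET"), ("cat", "NOUN"), ("barks", "VERB"), ("loudly", "ADV")]]

model = train_baseline(train)
predicted = [tag([w for w, _ in sent], model) for sent in gold]
acc = accuracy(gold, predicted)
print(acc)
```

The unseen word "loudly" falls back to the default tag and is the only error, so three of the four test tokens are tagged correctly.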

🌳 Syntax & Parsing

  • Context-Free Grammars (CFG)
  • Constituency parsing
  • Tree representations
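
One standard way to parse with a context-free grammar is the CKY algorithm, which fills a chart bottom-up over spans of increasing length. The following is a minimal recognizer sketch, assuming a grammar already in Chomsky normal form; the toy grammar and lexicon are hypothetical, and the exercise sheets may instead rely on NLTK's grammar and parser classes.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Binary rules map a pair of child symbols to the possible parent symbols.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules map a word to its possible preterminal symbols.
LEXICAL = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}


def cky_recognize(words, start="S"):
    """Return True if the grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds every nonterminal that spans words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICAL.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return start in chart[0][n]


print(cky_recognize("the dog chased the cat".split()))
print(cky_recognize("dog the chased".split()))
```

Storing back-pointers alongside each chart entry would turn this recognizer into a parser that can return the constituency tree.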

📚 Word Representations

  • Word frequency analysis
  • Distributional semantics
  • Vector representations
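
The distributional idea is that words occurring in similar contexts tend to have similar meanings. A minimal sketch of that idea, assuming raw co-occurrence counts within a fixed window as the vector representation and cosine similarity as the comparison (the corpus, window size, and function names are illustrative assumptions):

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Represent each word by counts of the words within `window` positions."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus, already tokenized.
corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "car", "needs", "fuel"],
]
vectors = cooccurrence_vectors(corpus)
cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_car = cosine(vectors["cat"], vectors["car"])
print(cat_dog, cat_car)
```

Here "cat" and "dog" share two of their three context words, while "cat" and "car" share only "the", so the first pair scores higher.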

🤖 Machine Learning for NLP

  • Text classification
  • Feature extraction
  • Evaluation metrics (accuracy, precision, recall, F1)
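
The four metrics listed above can be computed directly from counts of true positives, false positives, and false negatives. The sketch below does this by hand for one positive class; the labels and predictions are made-up data, and in practice scikit-learn's metrics module offers equivalent functions.

```python
def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision_recall_f1(gold, predicted, positive):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold labels and classifier output.
gold = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham"]

acc = accuracy(gold, predicted)
precision, recall, f1 = precision_recall_f1(gold, predicted, positive="spam")
print(acc, precision, recall, f1)
```

With two true positives, one false positive, and one false negative, precision, recall, and F1 all come out to 2/3, and accuracy is 4/6.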

🔍 Advanced Topics

  • Named Entity Recognition (NER)
  • Sequence labeling
  • Corpus processing
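
NER is commonly framed as sequence labeling with BIO tags, where `B-` marks the first token of an entity, `I-` a continuation, and `O` a token outside any entity. The sketch below converts such a tag sequence back into entity spans. The example sentence, the `PER`/`LOC` labels, and the decision to ignore an `I-` tag that does not continue a matching entity are assumptions for illustration.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current:
                spans.append((label, " ".join(current)))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            # Continuation of the currently open entity.
            current.append(token)
        else:
            # "O" or a stray I- tag: close the open entity, if any.
            if current:
                spans.append((label, " ".join(current)))
            current, label = [], None
    if current:
        spans.append((label, " ".join(current)))
    return spans


# Hypothetical tagged sentence.
tokens = ["Angela", "Merkel", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

Evaluating NER at the span level rather than the token level usually starts from a conversion like this one, applied to both the gold and the predicted tag sequences.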

Each exercise sheet is provided in two formats:

  • 📓 Notebook version (.ipynb) – interactive exploration
  • 🐍 Python script version (.py) – standalone implementation

🚀 Getting Started

1️⃣ Requirements

  • Python 3.9+
  • Jupyter Notebook
  • Required libraries:
pip install nltk numpy pandas scikit-learn matplotlib

If needed, download the NLTK resources. Note that 'all' fetches every NLTK corpus and model, which is a large download:

import nltk
nltk.download('all')

2️⃣ Run Notebooks

jupyter notebook

Then open any Exercise_Sheet_X.ipynb in the Jupyter file browser, where X is the sheet number.


🎯 Learning Outcomes

  • Practical experience with core NLP pipelines
  • Understanding of probabilistic language models
  • Working with real corpora
  • Implementing ML models for text classification
  • Evaluating NLP systems properly

📄 License

This repository contains coursework implementations and is shared for educational purposes only.
