Course Profile

Description

This course offers a comprehensive introduction to Natural Language Processing (NLP), focusing on both basic methods and modern techniques. Students will explore essential topics such as preprocessing, feature extraction, and classic algorithms like Logistic Regression and Naive Bayes, used for text classification. Core concepts in language modeling and vector space models will be covered, including Minimum Edit Distance for text similarity.

Later in the course, students will delve into advanced NLP approaches, including word embeddings and sequence models, such as recurrent neural networks (RNNs) and their improvements with attention mechanisms. The course balances theory and practical application, preparing students to build and understand NLP systems in real-world contexts. Towards the end of the course, we will also touch on large language models, prompt engineering, and generative AI.

Learning Objectives

By the end of this course, students will be able to:

  • Understand the fundamentals of Natural Language Processing (NLP) and its key applications.
  • Apply various preprocessing techniques and feature extraction methods to text data.
  • Analyze and implement basic models for text classification, such as Logistic Regression and Naive Bayes.
  • Understand and utilize advanced NLP techniques, including word embeddings and sequence models.
  • Explore the concepts behind large language models, prompt engineering, and generative AI.

Assessment

The course grade is based on a written 90-minute exam at the end of the semester. To be admitted to the exam, you need to complete the assignments and give a presentation. You can earn bonus points for the exam through contributions to the course repository.

  • Exam: graded
  • Presentation: ungraded, but mandatory to be admitted to the exam
  • Assignments: ungraded, but eligible for bonus points
  • Contributions: ungraded, but eligible for bonus points

You can earn a maximum of 10 bonus points throughout the semester by completing assignments and making contributions.

Course Language

All course materials are provided in English. Lectures will be delivered in German, but can also be conducted in English if we have international students.

Course Format

This course will be delivered in a hybrid format, consisting of both in-person and online lectures. Note that in-person lectures will not be streamed, and online lectures will not be recorded.

Prerequisites

The following skills are recommended for participating in the course. Let me know if you have any doubts or questions regarding these prerequisites. My goal is to keep the entry barrier as low as possible!

  • Basic Programming Skills

    To complete the course, you will need basic programming skills. If you have taken an introductory programming course, you should be good to go.

    Info

    I tried to design the coding exercises so that students with little programming experience can solve them. If you know the basics of object-oriented programming, you will be well equipped. Any advanced concepts will be explained during the lecture. In the end, we don't want to bother with advanced programming concepts but get excited about NLP! So don't worry if you have just started your programming journey; you are still encouraged to take the course!

  • Basic Python Skills

    The code for this lecture is written in Python, so it is definitely an advantage if you have worked with Python before. However, if you are coming from a different language, you should be able to follow along. I tried to keep the language-specific parts to a minimum and will provide explanations where necessary.

    Tutorial

    Microsoft provides a nice beginner course that you can take to get up to speed with Python.

  • Knowledge of the Linux Command Line

    Since the course is designed for a Linux development environment, it is recommended to have some basic skills with the Linux command line, i.e., the Bash shell.

    However, all required commands will be provided in the instructions, so it is not necessary to have extensive Linux command line skills.

    Tutorial

    Here is a basic Bash tutorial, which may help if you are new to Bash and the Linux command line.

    Info

    We use make commands to automate the setup process and simplify the commands. You can check out our Makefile to see what is actually executed. If you have not used a Makefile before, check out this guide.

  • Basic Knowledge of Git

    To participate in the course, you need basic Git knowledge, such as cloning a repository, committing and pushing changes, and pulling updates.

    Since the repository is hosted on GitHub, it is an advantage to be familiar with processes like forking and pull requests.

    Tutorial

    If you have not worked much with Git before, please check out this Git tutorial.

    If you are new to GitHub (a popular Git hosting service), you might want to check out this module.

Tip

In general, Microsoft Learn offers some great tutorials for all kinds of technologies.

Literature

Here is a list of recommended literature for this course: