Development Setup¶
This page describes how to set up your development environment for this course.
Install Python¶
The recommended Python version for this course is 3.12. in a virtual environment.
In case this doesn't work, try to add the deadsnakes PPA to your system, and try again.
On Mac, you can use Homebrew to install Python.
On Windows, it is recommended to use the Windows Subsystem for Linux (WSL). Then you can follow the instructions for Linux.
If you are a VS Code user, you need to install the WSL extension.
There is currently no setup guide for native Windows, but I'm happy to accept a pull request for this issue. 😉
Warning
You are free to use another Python version if you wish, but be aware that this may cause problems with the provided code. Also if you are using Python outside a virtual environment or with a distribution like Anaconda, the described setup may not work.
Forking the Repository (Optional)¶
You can decide between forking the course repository, or just clone it:
- When forking the repository, you act as a conributor to the course repository. You will go through the full development setup and can use your remote repository to manage your work. Also, you can contribute back changes to earn bonus points for the exam. 🏅
- When cloning the repository, you act as a user of the course repository. While this is a bit more leightweight, you work fully locally and cannot contribute to the course repository. 🙁
Tip
Forking is a very common practice in open source developlment. If you are new to open source development and have not forked a repository before, this may be a good learning opportunity for you! 🤓
If you decide to clone the repository, you can directly continue with the next step.
If you decide to fork the repository, you can follow the official GitHub documentation on how to fork a repository.
Info
If you fork a repository, a copy of the repository will be created in your personal GitHub user space.
Clone the Repository¶
Make sure you have Git installed on your system.
Cloning the repository is straightforward, no matter if you work on a fork or not. You only need to watch out where you clone from:
# When cloning your fork, make sure to clone it from your personal GitHub user space
git clone https://github.com/<your-username>/htwg-practical-nlp.git
# cloning the course repository directly
git clone https://github.com/pkeilbach/htwg-practical-nlp.git
Execute the Setup Script¶
The setup script is provided as a Makefile
.
Change into the repository directory and execute the setup script.
This should create a virtual environment and install all required dependencies.
# go to the project directory
cd htwg-practical-nlp
# use plain make to install all required dependencies
make
# if you plan to contribute, you need to install the dev dependencies
make install-dev
This may take a few minutes. ☕
If everything went well, you should be good to go.
Acticate the virtual environment
From now, make sure that you have the virtual environment activated. Usually, the IDE should automatically suggest you to activate it (e.g. VSCode). If that is not the case, you can activate the virtual environment with the following command
Test your Installation¶
You can test your installation by running the tests for the first assignment.
In your terminal, you should see lots of failed tests. 😨
But this is exactly what we want to see, since we haven't implemented anything yet! 🤓
Info
You can find more details on how we handle assignments in the assignments guide.
If you came this far, your initial setup was successful and you are ready to go! 🚀
Now we can take a look at some other components of the repository.
Jupyter¶
Some of the assignments are accompanied by Jupyter notebooks.
If your IDE supports it, you can execute the Jupyter notebooks natively in your IDE (e.g. using the VSCode Jupyter extension).
If you prefer the web UI, you can start the Jupyter lab server with the following command.
Jupyter is now accessible at http://localhost:8888/.
Serve the Lecture Notes¶
If you want, you can bring up the lecture notes on your local machine.
The lecture notes are now accessible at http://localhost:8000/.
Fetching Updates¶
During the semester, it is very likely that the course repository will be updated.
You can incorporate the updates as follows:
# if you work on a fork, you need to fetch your updates from the course repository (aka 'upstream')
git fetch upstream
git checkout main
git merge upstream/main
# make sure the correct upstream repository is set:
git remote -v
# if not, add the upstream:
git remote add upstream https://github.com/pkeilbach/htwg-practical-nlp.git
# otherwise (if you have cloned the repository), you need to fetch from origin
git fetch origin
git checkout main
git merge origin/main
Pull updates regularly
It is good practice to pull the latest changes from main
every now and then (just in case you are wondering why your assignment tests suddenly fail 😅).
However, important updates will be announced in the lecture.
Note
Find more details about syncing a fork in the official GitHub docs.