This project demonstrates how a Large Language Model (LLM) can be fine-tuned to perform specialised tasks. In this example, the base model used is Gemma 3 1B (pretrained), fine-tuned specifically to translate between English and Morse code.
Note: This LLM is intended solely for demonstration and educational purposes. It does not have practical real-world applications beyond being a teaching example.
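For context on the task itself: Morse code maps each letter and digit to a fixed sequence of dots and dashes, so a rule-based translator is trivial to write; the point of this project is teaching an LLM to learn the mapping instead. A minimal reference implementation (ITU table, letters and digits only, punctuation omitted):

```python
# ITU Morse code table for letters and digits (punctuation omitted)
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..", "0": "-----", "1": ".----", "2": "..---",
    "3": "...--", "4": "....-", "5": ".....", "6": "-....", "7": "--...",
    "8": "---..", "9": "----.",
}
REVERSE = {code: char for char, code in MORSE.items()}

def encode(text: str) -> str:
    # letters separated by spaces, words by " / "
    return " / ".join(
        " ".join(MORSE[c] for c in word if c in MORSE)
        for word in text.upper().split()
    )

def decode(morse: str) -> str:
    return " ".join(
        "".join(REVERSE[code] for code in word.split())
        for word in morse.split(" / ")
    )

print(encode("SOS"))          # → ... --- ...
print(decode("... --- ..."))  # → SOS
```

The letter-space/word-separator convention (`" / "` between words) is one common choice; the project's own library may use a different one.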
- Python >3.11 and <3.13 (there are known compatibility issues with Python 3.13)
- VS Code
- VS Code Jupyter Notebook extension
- A Hugging Face account
- Visit Hugging Face and log in to your account.
- Navigate to your profile settings by clicking on your avatar in the top-right corner and selecting "Settings."
- In the settings menu, select "Access Tokens."
- Click on "New Token," provide a name for the token, and set the role to "write."
- Copy the generated token.
Run the following command in your terminal to log in:

```shell
huggingface-cli login
```

Paste your token when prompted.
The notebooks in this project are best run in a Linux or WSL2 environment. Running them natively on Windows can present challenges. I used WSL2 with Debian.
It is recommended to create a Python virtual environment before installing the project requirements. This ensures that dependencies are isolated and do not interfere with other projects.
To create and activate a virtual environment, follow these steps:
- Create a virtual environment:

  ```shell
  python -m venv .venv
  ```

- Activate the virtual environment:

  ```shell
  source .venv/bin/activate
  ```

- Install the requirements:

  ```shell
  pip install -r requirements.txt
  ```
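Before installing anything, you can confirm the virtual environment is actually active: inside a venv, Python's `sys.prefix` differs from `sys.base_prefix`. A quick check:

```python
import sys

# inside an activated virtual environment, sys.prefix points at .venv
# while sys.base_prefix still points at the system interpreter
in_venv = sys.prefix != sys.base_prefix
print("virtual environment active:", in_venv)
```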
To use the virtual environment in a Jupyter Notebook within VS Code:
- Open the Command Palette (`Ctrl+Shift+P`, or `Cmd+Shift+P` on Mac).
- Search for and select `Python: Select Interpreter`.
- Choose the interpreter located in the `.venv` directory (e.g., `./.venv/bin/python` or `.\.venv\Scripts\python.exe`).
- Open your notebook, and in the top-right corner, select the kernel corresponding to the virtual environment.
Tip: Run the notebooks sequentially, from `00-test-env.ipynb` through `02-fine-tune-bi.ipynb`.
- `00-test-env.ipynb`: Verifies that the custom Morse code library is installed and confirms that the Jupyter Notebook widgets are functioning as expected.
- `01-build-dataset.ipynb`: Creates the training dataset by preparing English phrases and their Morse code translations. This notebook includes data normalisation, encoding, deduplication, and uploading the dataset to Hugging Face.
- `02-fine-tune-bi.ipynb`: Fine-tunes the Gemma 3 model for bidirectional translation between English and Morse code. It uses the dataset prepared in `01-build-dataset.ipynb` and trains the model for both directions of translation.
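The notebook's exact code isn't reproduced here, but the dataset steps described above (normalisation, encoding, deduplication) can be sketched in plain Python. In this sketch, `to_morse`, its abbreviated table, and the example phrases are illustrative placeholders; the real notebook uses the project's Morse library and also uploads the result to Hugging Face:

```python
# Hedged sketch of the dataset-building steps; to_morse stands in for
# the project's Morse encoding library.
MORSE = {"H": "....", "I": "..", "O": "---", "K": "-.-"}  # abbreviated table

def to_morse(text: str) -> str:
    return " ".join(MORSE[c] for c in text.upper() if c in MORSE)

def build_dataset(phrases):
    seen = set()
    rows = []
    for phrase in phrases:
        # normalise: collapse whitespace, uppercase
        normalised = " ".join(phrase.upper().split())
        if normalised in seen:  # deduplicate
            continue
        seen.add(normalised)
        rows.append({"english": normalised, "morse": to_morse(normalised)})
    return rows

print(build_dataset(["hi", "HI", "ok"]))
```

The duplicate `"HI"` is dropped, leaving two rows of English/Morse pairs ready to be pushed as a dataset.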
To evaluate the model using TensorBoard, follow these steps:

- Run the following command to start TensorBoard:

  ```shell
  tensorboard --logdir outputs
  ```

- Ensure your training configuration includes the following settings:

  ```python
  SFTConfig(
      ...
      output_dir="outputs",
      report_to="tensorboard",
  )
  ```
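Assuming the snippet above comes from TRL's `SFTConfig` (which inherits Hugging Face's `TrainingArguments`), a slightly fuller, hypothetical version with explicit logging settings might look like:

```python
from trl import SFTConfig

# hypothetical values; only output_dir and report_to come from this guide
config = SFTConfig(
    output_dir="outputs",      # TensorBoard reads event files from here
    report_to="tensorboard",   # write training metrics as TensorBoard events
    logging_steps=10,          # how often scalars (loss, grad_norm) are logged
)
```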
Once TensorBoard is running, watch these metrics:

- Training Loss: This metric indicates how well the model is learning during training. A decreasing training loss generally signifies that the model is improving. However, if the loss plateaus or increases, it may indicate overfitting or learning issues.
- Gradient Norm (grad_norm): This measures the magnitude of gradients during backpropagation. Large gradient norms can lead to instability, while very small norms may indicate vanishing gradients. Monitor this value to ensure stable and effective training.
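The `grad_norm` value shown in TensorBoard is the global L2 norm taken over all parameter gradients together. As a quick illustration of the arithmetic (not the trainer's actual code):

```python
import math

def global_grad_norm(grad_tensors):
    # L2 norm across every gradient value in every parameter tensor
    return math.sqrt(sum(g * g for tensor in grad_tensors for g in tensor))

# two toy "parameter gradients", flattened: sqrt(3^2 + 4^2)
print(global_grad_norm([[3.0], [4.0]]))  # → 5.0
```

Gradient clipping (e.g. the `max_grad_norm` training argument) caps exactly this quantity to keep updates stable.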
After creating a GGUF file and hosting it on Hugging Face, you can download and use it in LM Studio. LM Studio is a user-friendly interface for interacting with language models, allowing you to test and deploy your fine-tuned model efficiently. Simply follow the instructions in LM Studio to load the GGUF file and start using your model.
It is also possible to build a GGUF file and serve it with Ollama. Ollama provides a platform for deploying and managing language models with ease. For detailed instructions on how to set this up, refer to the ollama/README.md file included in this repository.
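As a rough sketch of that workflow (the authoritative steps are in ollama/README.md; the GGUF file name and model name below are placeholders):

```shell
# write a minimal Modelfile pointing at the downloaded GGUF (path is a placeholder)
cat > Modelfile <<'EOF'
FROM ./model.gguf
EOF

# register the model with Ollama and run it (model name is illustrative)
ollama create morse-gemma -f Modelfile
ollama run morse-gemma
```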