Welcome to the repository for the McGill University COMP550 Natural Language Processing project "llmbda". This repository contains all the code and resources required to replicate the findings and experiments presented in the report.
Since late 2022, large language models (LLMs) like ChatGPT have gained popularity in research and industry due to their ability to perform diverse natural language processing tasks with human-like proficiency. In this study, we investigate the ability of various large language models to understand logic, including open-source models like Llama 2 and closed-source models such as GPT and Gemini, on the task of parsing semi-formal natural language into propositional logic. Our analysis compares different alignment methods: zero-shot prompting, few-shot prompting, and supervised fine-tuning.
We observe that LLMs perform well on this task: with appropriate training data, and especially when fine-tuned on a dedicated dataset, LLMs can understand logical semantics. However, chat-instruct and general-purpose LLMs suffer from inconsistent performance on this task.
Our findings suggest that there may be potential downstream engineering and research use cases for LLMs on semantics-related tasks, given their understanding of logic.
The GPT and Gemini scripts make calls to the official OpenAI and Google APIs, and do not have specific system requirements.
System requirements:
- Ubuntu (some libraries used by Hugging Face's Transformers API require Linux)
- CUDA (torch devices are set to `cuda`)
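To verify that PyTorch can see your GPU before running the local scripts, a quick sanity check (assuming dependencies are already installed as described below):

```bash
# Should print True if torch was built with CUDA support and a GPU is visible
poetry run python3 -c "import torch; print(torch.cuda.is_available())"
```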
Dependency management is done using Poetry; dependencies can be installed via:

`poetry install`
To run a single script, use:
`poetry run python3 <script_name>.py`
To activate the virtual environment for your shell, run:

`poetry shell`
Experiment scripts are located under `/experiments`; each experiment has its own script. Only the Llama fine-tuning requires local training.
Make note of the Llama fine-tuning, which uses the `sft_finetune.py` script from Hugging Face. To run it with the same options as the experiment, run the bash script:

`experiments/llama2/finetune.sh`
Additionally, you can experiment with different options in the bash script. The script already specifies memory-efficient training (4-bit quantization + PEFT); our setup ran on a single RTX 3060 12GB. To further reduce memory usage, try reducing the batch size in the bash script.
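As a rough illustration, a memory-efficient invocation might look like the sketch below. The flag names follow Hugging Face's TRL SFT example script and are assumptions; check `finetune.sh` for the options actually used.

```bash
# Hypothetical sketch, not the actual finetune.sh: flag names are assumed
# from Hugging Face's TRL SFT example and may differ in this repository.
python3 sft_finetune.py \
  --model_name meta-llama/Llama-2-7b-chat-hf \
  --load_in_4bit \
  --use_peft \
  --per_device_train_batch_size 1 \
  --gradient_accumulation_steps 8
```

Lowering the batch size while raising gradient accumulation keeps the effective batch size constant while reducing peak VRAM usage.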
Inference with GPT and Gemini can be done with their corresponding scripts under `/experiments`; ensure that you have a valid API key.
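For example, assuming the scripts pick up keys from the environment variables conventionally used by the OpenAI and Google client libraries (check the scripts for the actual mechanism):

```bash
# Assumed convention: API keys supplied via environment variables.
export OPENAI_API_KEY="sk-..."   # used by the GPT scripts
export GOOGLE_API_KEY="..."      # used by the Gemini scripts
```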
Llama inference can be done locally by running `llama2_chat.py` for the chat-instruct Llama-2-chat model, and `llama2_finetuned.py` for the fine-tuned model. The latter script assumes that you have a fine-tuned model locally, with the default model path being the same directory.
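For instance, assuming the scripts sit under `experiments/llama2/` alongside `finetune.sh` (adjust the paths if they live elsewhere):

```bash
# Chat-instruct model
poetry run python3 experiments/llama2/llama2_chat.py

# Fine-tuned model (expects fine-tuned weights at the default local path)
poetry run python3 experiments/llama2/llama2_finetuned.py
```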
Evaluation is pipelined with the `eval.py` script. Simply run `eval.py` with the desired options:
- `model_name`: the name of the model
- `label_path`: the path to the CSV containing the correct labels
- `pred_path`: the path to the CSV containing the predictions
- `--log_result`: whether to save the result in a text file. Default is `False`.
- `--ans_text_field`: column name of the correct labels in the labeled CSV. Default is `'object_tree'`.
- `--pred_text_field`: column name of the predictions in the prediction CSV. Default is `'predictions'`.
- `--join_on`: the key column name to join the two CSVs on. Default is `'index'`.
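A full invocation might look like the following, where the model name and CSV paths are hypothetical placeholders and the optional flags are shown with their default values:

```bash
# Hypothetical example: substitute your own model name and CSV paths.
poetry run python3 eval.py my_model labels.csv predictions.csv \
  --ans_text_field object_tree \
  --pred_text_field predictions \
  --join_on index
```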
A sample eval recipe is located in the `justfile`; you can run it with:

`just eval {{model_name}} {{pred_path}}`
For a comprehensive understanding of our project, methodologies, and detailed results, please refer to our project report.