A course search and review platform for McGill University.
You'll need Docker, Cargo, and pnpm installed on your machine to spawn the various components the project needs to run locally.
First, join the Discord server to get access to the development environment variables.
In `.env` within the root directory you'll have to set:

```
MS_CLIENT_ID=
MS_CLIENT_SECRET=
MS_REDIRECT_URI=http://localhost:8000/api/auth/authorized
```

...and then in `client/.env` you'll have to set the server URL:

```
VITE_API_URL=http://localhost:8000
```
Second, mount a local MongoDB instance with Docker and initiate the replica set:

```bash
docker compose up --no-recreate -d
sleep 5
docker exec mongodb mongosh --quiet --eval 'rs.initiate()' > /dev/null 2>&1 || true
```
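This assumes the repository's `docker-compose.yml` defines the `mongodb` container and starts `mongod` as a one-member replica set, so that the `rs.initiate()` call above succeeds. A sketch of the rough shape (the actual file in the repository is authoritative):

```yaml
# Sketch only; see the repository's docker-compose.yml for the real definition.
services:
  mongodb:
    image: mongo
    container_name: mongodb # matches the `docker exec mongodb ...` above
    command: ["mongod", "--replSet", "rs0"]
    ports:
      - "27017:27017"
```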
Spawn the server with a data source (in this case the `/seed` directory) and initialize the database (note that seeding may take some time on slower machines):

```bash
cargo run -- --source=seed serve --initialize --db-name=mcgill-courses
```
Finally, spawn the React frontend:

```bash
pnpm install
pnpm run dev
```
n.b. If you have `just` installed, we provide a `dev` recipe for doing all of the above in addition to running a watch on the server:

```bash
just dev
```
See the `justfile` for more recipes.
The server command-line interface provides a `load` subcommand for scraping all courses from various McGill course information websites and building a JSON data source, for example:
```bash
RUST_LOG=info cargo run -- --source=seed \
  load \
  --batch-size=200 \
  --scrape-vsb \
  --user-agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36"
```
The current defaults include scraping all 10,000+ courses currently offered by McGill, current schedule information from the official visual schedule builder, and courses offered in previous terms going back as far as the 2009-2010 term.
For full usage information, see the output below:
```
Usage: server load [OPTIONS] --user-agent <USER_AGENT>

Options:
      --user-agent <USER_AGENT>      A user agent
      --course-delay <COURSE_DELAY>  Time delay between course requests in milliseconds [default: 0]
      --retries <RETRIES>            Number of retries [default: 10]
      --batch-size <BATCH_SIZE>      Number of pages to scrape per concurrent batch [default: 20]
      --mcgill-terms <MCGILL_TERMS>  The mcgill terms to scrape [default: 2009-2010 2010-2011 2011-2012 2012-2013 2013-2014 2014-2015 2015-2016 2016-2017 2017-2018 2018-2019 2019-2020 2020-2021 2021-2022 2022-2023 2023-2024 2024-2025]
      --vsb-terms <VSB_TERMS>        The schedule builder terms to scrape [default: 202405 202409 202501]
      --scrape-vsb                   Scrape visual schedule builder information
  -h, --help                         Print help
```
Alternatively, if you have `just` installed, you can run:

```bash
just load
```
We have a few tools that we use throughout the project, some of which are documented below. You can find them all under the `/tools` directory in the project root.
For Python-based tools, we highly recommend you install uv on your system. On macOS or Linux, you can do so as follows:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Follow the uv documentation for installation on other systems.
Our changelog page (https://mcgill.courses/changelog) is automated by this tool.
We feed PR titles and descriptions to a large language model (in this case hard-coded to GPT-3.5) to generate a user-friendly summary using this prompt.
The tool assumes you have an OpenAI API key set in the environment, and you can use it from the project root like so:

```bash
cargo run --manifest-path tools/changelog-generator/Cargo.toml \
  -- \
  --output client/src/assets/changelog.json
```
This will run the changelog generator on all up-to-date merged PRs from our GitHub repository, populating `changelog.json` with the results.
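The exact schema is defined by the tool, but each generated entry pairs PR metadata with the model's summary, shaped roughly like this (field names here are illustrative, not the tool's actual output):

```jsonc
// Illustrative shape only; the tool defines the real schema.
[
  {
    "pr_number": 1234,
    "title": "Add course schedule visualizations",
    "summary": "A user-friendly summary generated from the PR title and description.",
    "merged_at": "2024-01-15T00:00:00Z"
  }
]
```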
There are a few other options the tool supports:
```
Usage: changelog-generator [OPTIONS]

Options:
      --output <OUTPUT>               [default: ../../client/src/assets/changelog.json]
      --regenerate [<REGENERATE>...]
      --regenerate-all
      --repo <REPO>                   [default: mcgill.courses]
      --user <USER>                   [default: terror]
  -h, --help                          Print help
```
For instance, you can regenerate individual entries by passing their pull request numbers to `--regenerate`, as in the following (the PR numbers here are placeholders):
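```bash
cargo run --manifest-path tools/changelog-generator/Cargo.toml \
  -- \
  --regenerate 1234 1235
```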
This tool is used to populate a JSON file with course average information we display on course pages.
We read and parse a crowdsourced Google Sheet with historical course averages, generously provided by the McGill Enhanced team.
We parse prerequisites and corequisites using a fine-tuned large language model with custom examples; all the code lives in `/tools/requirement-parser`.
If you need to run the requirement parser on a file, simply:

```bash
cd tools/requirement-parser
uv sync
uv run main.py <file>
```
n.b. This will require an OpenAI API key and the name of the fine-tuned model to be set in the environment.
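The exact variable names are defined by the tool's code; as a sketch, assuming it reads the standard `OPENAI_API_KEY` along with a model-name variable (the latter name is hypothetical):

```bash
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=ft:... # hypothetical variable name; check tools/requirement-parser for the real one
```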
For more information about how this works, check out our research project.
This tool selectively includes only the JSON fields (from database seed files) required by the search component, significantly reducing payload size and improving resource efficiency.
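The idea is essentially a projection over the seed data. A minimal sketch of the approach (the field names, paths, and even the implementation language here are illustrative; the actual tool lives under `/tools`):

```python
import json
from pathlib import Path

# Hypothetical field list: whatever subset the search component actually reads.
KEEP_FIELDS = {"_id", "subject", "code", "title"}

def trim(seed_file: Path, out_file: Path) -> None:
    courses = json.loads(seed_file.read_text())
    # Keep only the fields the search component needs, dropping everything else.
    trimmed = [{k: c[k] for k in KEEP_FIELDS if k in c} for c in courses]
    out_file.write_text(json.dumps(trimmed, separators=(",", ":")))

trim(Path("seed/courses.json"), Path("trimmed-courses.json"))
```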
We continuously deploy our site with Render using a Docker image, and have a MongoDB instance hosted on Atlas.
We also use an S3 bucket to store a hash we consult when deciding whether or not to seed courses in our production environment, and Microsoft's identity platform for handling our OAuth 2.0 authentication flow.
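In broad strokes, that hash acts like a cache key: hash the current seed data, compare it with the hash stored in S3, and re-seed only on a mismatch. A minimal sketch of the idea (the bucket and key names are made up, and the real logic lives in the Rust server, not in Python):

```python
import hashlib
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "mcgill-courses", "seed-hash"  # hypothetical names

def should_seed(seed_bytes: bytes) -> bool:
    current = hashlib.sha256(seed_bytes).hexdigest()
    try:
        stored = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read().decode()
    except s3.exceptions.NoSuchKey:
        stored = None
    if stored == current:
        return False  # seed data unchanged since the last deploy
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=current.encode())
    return True
```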
A few notable projects similar in nature to mcgill.courses have inspired or informed its functionality and design, namely:
- uwflow.com - A course search and review platform for the University of Waterloo
- cloudberry.fyi - A post-modern schedule builder for McGill students
- mcgill.wtf - A fast full-text search engine for McGill courses