cyan

C-to-Assembly (x86-64) compiler for a basic subset of C.

Why

Simply to learn more about compilers, assembly, and how not to design languages. :)

Features

Anything missing from the list below is not planned.

Optimizations:

Grammar

Defined using EBNF-like notation.

Definition

<program>     = <function>
<function>    = "int" <identifier> "(" "void" ")" <block>
<block>       = "{" { <block-item> } "}"
<block-item>  = <declaration> | <statement>
<declaration> = "int" <identifier> [ "=" <expression> ] ";"
<statement>   = "return" <expression> ";"
              | <expression> ";"
              | <identifier> ":" <statement>
              | "if" "(" <expression> ")" <statement> [ "else" <statement> ]
              | "break" ";"
              | "continue" ";"
              | "switch" "(" <expression> ")" <statement>
              | "while" "(" <expression> ")" <statement>
              | "do" <statement> "while" "(" <expression> ")" ";"
              | "for" "(" <initializer> [ <expression> ] ";" [ <expression> ] ")" <statement>
              | "goto" <identifier> ";"
              | <block>
              | ";"
<initializer> = <declaration> | [ <expression> ] ";"
<expression>  = <factor>
              | <expression> <binary-op> <expression>
              | <expression> "?" <expression> ":" <expression>
<factor>      = <unary-op> <factor> | <postfix>
<postfix>     = <primary> { <postfix-op> }
<primary>     = <int> | <identifier> | "(" <expression> ")"
<unary-op>    = "-" | "~" | "!" | "++" | "--"
<postfix-op>  = "++" | "--"
<binary-op>   = "+" | "-" | "*" | "/" | "%"
              | "<<" | ">>" | "&" | "|" | "^"
              | "&&" | "||" | "==" | "!=" | "<" | "<=" | ">" | ">="
              | "=" | "+=" | "-=" | "*=" | "/=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>="

<identifier>  = ? An identifier token ?
<int>         = ? A constant token ?
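
As written, the `<expression>` rule is ambiguous: it does not say how `1 + 2 * 3` should group. One common way to resolve this in a hand-written parser is precedence climbing. The following is a minimal sketch of that technique, not cyan's actual parser. The `Token`, `Expr`, and `Parser` types and the precedence values are hypothetical, and only integer constants, parentheses, and the five arithmetic operators are handled.

```rust
// Hypothetical sketch of precedence climbing; names and precedence values are
// assumptions for illustration and are not taken from cyan's source.

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Int(i64),
    Op(char), // one of '+', '-', '*', '/', '%'
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Binary(char, Box<Expr>, Box<Expr>),
}

/// Binding power of a binary operator; a higher value binds tighter.
fn precedence(op: char) -> Option<u8> {
    match op {
        '*' | '/' | '%' => Some(50),
        '+' | '-' => Some(45),
        _ => None,
    }
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    /// <primary>, restricted here to <int> and "(" <expression> ")".
    fn parse_primary(&mut self) -> Result<Expr, String> {
        match self.tokens.get(self.pos).cloned() {
            Some(Token::Int(n)) => {
                self.pos += 1;
                Ok(Expr::Int(n))
            }
            Some(Token::LParen) => {
                self.pos += 1;
                let inner = self.parse_expr(0)?;
                if self.peek() != Some(&Token::RParen) {
                    return Err("expected ')'".to_string());
                }
                self.pos += 1;
                Ok(inner)
            }
            other => Err(format!("expected a primary expression, found {:?}", other)),
        }
    }

    /// Parses binary operators whose precedence is at least `min_prec`.
    fn parse_expr(&mut self, min_prec: u8) -> Result<Expr, String> {
        let mut left = self.parse_primary()?;
        while let Some(&Token::Op(op)) = self.peek() {
            let prec = match precedence(op) {
                Some(p) if p >= min_prec => p,
                _ => break,
            };
            self.pos += 1;
            // Requiring `prec + 1` on the right makes the operator left-associative.
            let right = self.parse_expr(prec + 1)?;
            left = Expr::Binary(op, Box::new(left), Box::new(right));
        }
        Ok(left)
    }
}
```

Calling `Parser::new(tokens).parse_expr(0)` on the tokens for `1 + 2 * 3` groups the multiplication first, and `8 - 3 - 2` groups as `(8 - 3) - 2`. Right-associative operators such as assignment and the ternary would recurse with `prec` instead of `prec + 1`.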

Trees and IRs

AST

The AST represents the syntactic structure of the program and is the tree on which semantic analysis is performed.
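
To make the idea of semantic analysis over an AST concrete, here is a minimal sketch of one such check: rejecting duplicate declarations and uses of undeclared variables in a single flat block. The `Expr` and `BlockItem` types and the `check_block` function are hypothetical and only loosely follow the grammar above; they are not cyan's actual types or passes.

```rust
use std::collections::HashSet;

// Hypothetical AST shapes loosely following the grammar; not cyan's actual types.
#[derive(Debug)]
enum Expr {
    Int(i64),
    Var(String),
    Assign(Box<Expr>, Box<Expr>),
}

#[derive(Debug)]
enum BlockItem {
    Declaration { name: String, init: Option<Expr> },
    Return(Expr),
    Expression(Expr),
}

/// Checks an expression against the set of names declared so far.
fn check_expr(expr: &Expr, declared: &HashSet<String>) -> Result<(), String> {
    match expr {
        Expr::Int(_) => Ok(()),
        Expr::Var(name) => {
            if declared.contains(name) {
                Ok(())
            } else {
                Err(format!("use of undeclared variable `{}`", name))
            }
        }
        Expr::Assign(target, value) => {
            // In this sketch only a variable may appear on the left-hand side.
            if !matches!(**target, Expr::Var(_)) {
                return Err("invalid assignment target".to_string());
            }
            check_expr(target, declared)?;
            check_expr(value, declared)
        }
    }
}

/// Walks a flat block, rejecting duplicate declarations and undeclared uses.
fn check_block(items: &[BlockItem]) -> Result<(), String> {
    let mut declared = HashSet::new();
    for item in items {
        match item {
            BlockItem::Declaration { name, init } => {
                if !declared.insert(name.clone()) {
                    return Err(format!("duplicate declaration of `{}`", name));
                }
                if let Some(expr) = init {
                    check_expr(expr, &declared)?;
                }
            }
            BlockItem::Return(expr) | BlockItem::Expression(expr) => {
                check_expr(expr, &declared)?;
            }
        }
    }
    Ok(())
}
```

A block such as `int a = 1; return a;` passes this check, while `return b;` with no prior declaration of `b` is rejected. Nested scopes, labels, and loop-related checks are left out to keep the sketch short.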

Three Address Code (TAC)

This IR sits between the AST and the assembly code. It lets us handle structural transformations separately from the details of assembly language (to be done), and it is well suited to compile-time optimizations (also to be done).
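
As an illustration of what lowering to three address code looks like, the sketch below flattens a nested expression into a list of instructions, each with at most two operands and one destination. The `Expr`, `Value`, `Instruction`, and `Lowering` types are hypothetical and chosen for brevity; cyan's actual IR may be shaped differently.

```rust
// Hypothetical AST and TAC shapes for illustration; not cyan's actual IR.
#[derive(Debug)]
enum Expr {
    Int(i64),
    Binary(char, Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Constant(i64),
    Temp(usize),
}

#[derive(Debug, PartialEq)]
enum Instruction {
    /// dst = lhs op rhs
    Binary { op: char, lhs: Value, rhs: Value, dst: Value },
    Return(Value),
}

#[derive(Default)]
struct Lowering {
    instructions: Vec<Instruction>,
    next_temp: usize,
}

impl Lowering {
    /// Allocates a new temporary with a unique index.
    fn fresh_temp(&mut self) -> Value {
        let temp = Value::Temp(self.next_temp);
        self.next_temp += 1;
        temp
    }

    /// Emits instructions for `expr` and returns the value holding its result.
    fn lower_expr(&mut self, expr: &Expr) -> Value {
        match expr {
            Expr::Int(n) => Value::Constant(*n),
            Expr::Binary(op, lhs, rhs) => {
                // Operands are lowered first, so nested expressions become
                // earlier instructions that write to temporaries.
                let lhs = self.lower_expr(lhs);
                let rhs = self.lower_expr(rhs);
                let dst = self.fresh_temp();
                self.instructions.push(Instruction::Binary {
                    op: *op,
                    lhs,
                    rhs,
                    dst: dst.clone(),
                });
                dst
            }
        }
    }
}

/// Lowers `return <expr>;` into a flat instruction list.
fn lower_return(expr: &Expr) -> Vec<Instruction> {
    let mut lowering = Lowering::default();
    let value = lowering.lower_expr(expr);
    lowering.instructions.push(Instruction::Return(value));
    lowering.instructions
}
```

For `return 1 + 2 * 3;` this produces three instructions: the multiplication into a first temporary, the addition into a second temporary, and a return of the second temporary. Because every intermediate result has a name, passes such as constant folding can work on the flat list without walking a tree.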
