Regex to DFA Converter

A Python implementation of Thompson's construction algorithm to convert regular expressions to deterministic finite automata (DFA).

Overview

This project implements a complete pipeline for converting regular expressions to deterministic finite automata:

Regular expression parsing
Conversion to postfix notation (using Shunting Yard algorithm)
NFA construction (using Thompson's algorithm)
DFA conversion (using subset construction)
DFA simulation

Features

Supports basic regex operations: concatenation, alternation (|), Kleene star (*), plus (+), and optional (?)
Handles character escaping with backslash
Supports direct testing of regex patterns against input strings
Includes a test framework for validating regex pattern matching

Components

regex_to_postfix.py: Converts infix regex notation to postfix (Polish) notation
nfa_builder.py: Implements Thompson's construction algorithm to build NFAs
nfa_to_dfa.py: Converts NFAs to DFAs using subset construction
main.py: CLI interface for testing regex patterns

Usage

Run the main script with a test file:

python main.py [test_file.json]

If no test file is provided, it will use the default test file.

Test File Format

The test file should be a JSON array of objects with the following structure:

[
  {
    "name": "Test Case 1",
    "regex": "a(b|c)*",
    "test_strings": [
      {"input": "abcbc", "expected": true},
      {"input": "ad", "expected": false}
    ]
  }
]

Implementation Details

Regex to Postfix

The infix regex is first converted to postfix notation using the Shunting Yard algorithm, with explicit concatenation symbols added. Operator precedence is:

*, +, ? (highest)
. (concatenation)
| (alternation) (lowest)

Thompson's Construction

The postfix regex is converted to an NFA using Thompson's construction algorithm. Each regex operator is handled by specific NFA constructions with ε-transitions used to combine sub-NFAs.

Subset Construction

The NFA is converted to a DFA using the subset construction algorithm. This involves computing ε-closures and creating DFA states that represent sets of NFA states.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
Assignment #2 - LFA.pdf		Assignment #2 - LFA.pdf
LFA-Assignment2_Regex_DFA_v2.json		LFA-Assignment2_Regex_DFA_v2.json
README.md		README.md
main.py		main.py
nfa_builder.py		nfa_builder.py
nfa_to_dfa.py		nfa_to_dfa.py
regex_to_postfix.py		regex_to_postfix.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Regex to DFA Converter

Overview

Features

Components

Usage

Test File Format

Implementation Details

Regex to Postfix

Thompson's Construction

Subset Construction

About

Releases

Packages

Languages

infernosalex/regex2dfa

Folders and files

Latest commit

History

Repository files navigation

Regex to DFA Converter

Overview

Features

Components

Usage

Test File Format

Implementation Details

Regex to Postfix

Thompson's Construction

Subset Construction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages