Mathkicker-Nougat-LaTeX-based is fine-tuned from facebook/nougat-base on im2latex-100k and a custom dataset to improve its ability to generate LaTeX code from images.
- Prepare your dataset in this format
- Change config/base.yaml
- Run the training script
python tools/train_experiment.py --config_file config/base.yaml --phase 'train'
- Download the model
- Install dependencies
pip install -r all_requirements.txt
- You can find an example in the examples folder
python examples/run_latex_ocr.py --img_path "examples/test_data/eq1.png"
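To run the example on several images, the command above can be wrapped in a small loop. This is a minimal sketch, assuming it is started from the repository root and that the script takes only the --img_path flag shown above. The helper names build_ocr_command and run_on_folder are hypothetical and not part of this repo.

```python
import subprocess
import sys
from pathlib import Path

# Script path and flag are copied from the example command above.
EXAMPLE_SCRIPT = "examples/run_latex_ocr.py"


def build_ocr_command(img_path):
    """Return the argument list for one call to the example script (hypothetical helper)."""
    return [sys.executable, EXAMPLE_SCRIPT, "--img_path", str(img_path)]


def run_on_folder(folder, pattern="*.png"):
    """Run the example script once per matching image, in sorted order (hypothetical helper)."""
    for img in sorted(Path(folder).glob(pattern)):
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(build_ocr_command(img), check=True)
```

Calling run_on_folder("examples/test_data") would process every PNG in the test data folder. Each call starts a new Python process, so this is convenient for a handful of images but probably slow for large batches.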
- Q: Why did you copy the image_processor_nougat.py file into the repository rather than simply importing it from the transformers library, if there are no changes compared to the one in huggingface/transformers?
- A: transformers 4.34.0 is the first version that natively supports Nougat. However, there is a bug in the Nougat processor in this version that can cause a run failure. You can review the details of this issue here. Fortunately, the developers have already fixed this bug, and I expect that you will be able to import it directly from transformers in the next released version.
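If you want to check which case applies to your environment, a version comparison is enough. This is a minimal sketch, assuming that any transformers release newer than 4.34.0 contains the fix (the answer above only says the fix is expected in the next release). All function names here are hypothetical and not part of this repo.

```python
from importlib import metadata

# First release with native Nougat support, per the answer above;
# it is also the release that carries the processor bug.
BUGGY_VERSION = (4, 34, 0)


def parse_version(version):
    """Turn '4.35.0' or '4.35.0.dev0' into a tuple of three leading integers."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '4.35' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def native_processor_usable(version):
    """Assumption: every release after 4.34.0 ships the fixed processor."""
    return parse_version(version) > BUGGY_VERSION


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None
```

If native_processor_usable returns False for your installed version, keep using the copy of image_processor_nougat.py in this repository.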
Please consider leaving me a star if you find this repo helpful :)