GitHub - x22x22/grpo-graph-extraction: Qwen GRPO Graph Extraction RL Finetune

x22x22/grpo-graph-extraction

 
 


Qwen GRPO Graph Extraction RL Finetune

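This is an initial exploration of GRPO fine-tuning of Qwen for graph extraction. The training data is synthetic chain-of-thought (CoT) graph extraction data generated by a reasoning model (DeepSeek R1), and an LLM is involved in the reward function.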
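To make the idea concrete, here is a minimal sketch of how a reward with an LLM in the loop could feed GRPO's group-relative advantages. It is not taken from the training notebook. The completion format (a JSON list of `[head, relation, tail]` triples), the `judge` callable, the `judge_weight` blend, and all function names are assumptions made for illustration. The only part that is GRPO itself is the last function, which normalizes each reward against the mean and standard deviation of its own sampling group (population standard deviation here; implementations differ on this detail).

```python
import json
import statistics
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]


def parse_triples(completion: str) -> Set[Triple]:
    """Parse a JSON list of [head, relation, tail] triples (assumed format).

    Returns an empty set when the completion is not valid JSON.
    """
    try:
        data = json.loads(completion)
    except json.JSONDecodeError:
        return set()
    if not isinstance(data, list):
        return set()
    triples: Set[Triple] = set()
    for item in data:
        if (isinstance(item, list) and len(item) == 3
                and all(isinstance(x, str) for x in item)):
            # Normalize case and whitespace so trivial differences do not count.
            triples.add(tuple(x.strip().lower() for x in item))
    return triples


def triple_f1(predicted: Set[Triple], reference: Set[Triple]) -> float:
    """F1 overlap between predicted and reference triples."""
    overlap = len(predicted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def combined_reward(completion: str, reference: Set[Triple],
                    judge: Callable[[str], float],
                    judge_weight: float = 0.5) -> float:
    """Blend a rule-based score with an LLM judge score.

    `judge` is a hypothetical callable wrapping an LLM call; its score is
    clamped to [0, 1] before blending.
    """
    rule_score = triple_f1(parse_triples(completion), reference)
    judge_score = min(1.0, max(0.0, judge(completion)))
    return (1 - judge_weight) * rule_score + judge_weight * judge_score


def group_relative_advantages(rewards: List[float],
                              eps: float = 1e-6) -> List[float]:
    """GRPO-style advantages: standardize rewards within one sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Usage: three sampled completions for the same prompt form one group.
reference = {("alice", "works_at", "acme")}
group = [
    '[["Alice", "works_at", "Acme"]]',
    '[["Alice", "lives_in", "Paris"]]',
    "not json",
]
stub_judge = lambda text: 0.0  # placeholder standing in for a real LLM call
rewards = [combined_reward(c, reference, stub_judge, judge_weight=0.0)
           for c in group]
print(rewards)  # [1.0, 0.0, 0.0]
print(group_relative_advantages(rewards))
```

Because advantages are computed relative to the group, only the ranking and spread of rewards within a group matter; a group whose completions all receive the same reward contributes (near) zero advantage.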

.
├── LICENSE
├── ground_truth_gen # data gen via DeepSeek R1
│   ├── polished_rl_training_data.csv
│   └── r1_distill_reasoning_graph_extraction.ipynb
└── train
    └── Qwen_GRPO_Graph_Extraction.ipynb # training process

Update: the training notebook does not seem to render properly on GitHub; open it in Colab instead:

Data Gen: Open In Colab
Training: Open In Colab

Credits
