	---
	base_model:
	- beyoru/EvolLLM
	tags:
	- text-generation-inference
	- transformers
	- qwen3
	- code
	- tool
	- agent
	- evolution
	- merge
	- RL
	- grpo
	- rlvr
	license: apache-2.0
	language:
	- en
	library_name: transformers
	---

	MaxCoder-4B is a fine-tuned Qwen model trained with a custom reinforcement learning (RL) framework. The framework rewards the model for producing solutions that pass automated test cases, much like the way programming submissions are judged on LeetCode.

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
	</p>


	Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, which promotes generalization and reasoning ability in algorithmic problem-solving.

	<p align="center">

	# Support me at:

	<a href="https://www.buymeacoffee.com/ductransa0g" target="_blank">
	<img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" width="150px">
	</a>
	</p>