Zhihe Yang

zhyang2226

AI & ML interests

Trustworthy RL & Offline RL

Recent Activity

upvoted a paper 3 days ago

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Paper • 2602.23008 • Published 4 days ago • 33

liked a model 2 months ago

NbAiLabArchive/whisper-large-v2-nob

Automatic Speech Recognition • 2B • Updated Sep 13, 2023 • 10 • 13

liked a model 5 months ago

tencent/HunyuanImage-3.0

Text-to-Image • Updated Jan 28 • 692k • 644

liked a model 8 months ago

tencent/HunyuanVideo

Text-to-Video • Updated Mar 6, 2025 • 1.14k • 2.13k

authored 2 papers 8 months ago

Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key

Paper • 2501.09695 • Published Jan 16, 2025 • 1

Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs

Paper • 2505.12929 • Published May 19, 2025 • 3

upvoted a paper 10 months ago

Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs

Paper • 2505.12929 • Published May 19, 2025 • 3

liked a dataset 11 months ago

BytedTsinghua-SIA/DAPO-Math-17k

Viewer • Updated Apr 18, 2025 • 1.79M • 5.81k • 156

upvoted a paper 12 months ago

Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key

Paper • 2501.09695 • Published Jan 16, 2025 • 1

liked a Space 12 months ago

AI Deadlines ⚡

Space • 662 • Track upcoming AI conference and workshop deadlines

liked a dataset about 1 year ago

openbmb/RLAIF-V-Dataset

Preview • Updated Oct 14, 2025 • 931 • 206

upvoted a paper about 1 year ago

Region-Adaptive Sampling for Diffusion Transformers

Paper • 2502.10389 • Published Feb 14, 2025 • 53

liked a model about 1 year ago

lmms-lab/llava-onevision-qwen2-7b-ov

Text Generation • 8B • Updated Sep 2, 2024 • 141k • 62

updated a model about 1 year ago

zhyang2226/opadpo-lora_llava-v1.5-13b

Updated Jan 16, 2025

published a model about 1 year ago

zhyang2226/opadpo-lora_llava-v1.5-13b

Updated Jan 16, 2025

updated a model about 1 year ago

zhyang2226/opadpo-lora_llava-v1.5-7b

Updated Jan 16, 2025

published a model about 1 year ago

zhyang2226/opadpo-lora_llava-v1.5-7b

Updated Jan 16, 2025

liked a model over 1 year ago

openbmb/RLHF-V

Text Generation • Updated May 28, 2024 • 26 • 18
