jp1924

AI & ML interests

Audio, Image, Text

Recent Activity

upvoted a paper about 11 hours ago

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published Sep 2, 2025 • 26

upvoted a paper 4 days ago

Self-Improving Language Models with Bidirectional Evolutionary Search

Paper • 2605.28814 • Published 6 days ago • 56

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B 5 days ago

Update chat_template.jinja

#12 opened 2 months ago by jp1924

upvoted a paper 6 days ago

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning

Paper • 2605.25604 • Published 8 days ago • 133

upvoted a paper 7 days ago

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published 11 days ago • 214

upvoted 2 papers 12 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 21 days ago • 195

Process Rewards with Learned Reliability

Paper • 2605.15529 • Published 18 days ago • 53

liked a dataset 14 days ago

TeichAI/DeepSeek-v4-Pro-Agent

Traces • Updated 11 days ago • 4.01k • 7.91k • 74

upvoted a paper about 1 month ago

RLPR: Extrapolating RLVR to General Domains without Verifiers

Paper • 2506.18254 • Published Jun 23, 2025 • 35

liked 2 datasets about 1 month ago

nvidia/Nemotron-Personas-Korea

Viewer • Updated Apr 23 • 1M • 34k • 481

allenai/RLVR-IFeval

Viewer • Updated Nov 21, 2024 • 15k • 1.25k • 33

upvoted a paper about 1 month ago

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Paper • 2602.22495 • Published Feb 26 • 5

upvoted a paper about 2 months ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 45

liked a Space about 2 months ago

LLM Embeddings Explained: A Visual and Intuitive Guide

🚀

346

How Language Models Turn Text into Meaning, From Traditional

liked a dataset about 2 months ago

llamaindex/ParseBench

Benchmark • Updated Apr 19 • 169k • 53.1k • 89

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B 2 months ago

Request to change the `model_type` of the HyperCLOVAX-SEED-32B model to `hyperclovax_vision_v2` (linked to a transformers PR)

#11 opened 2 months ago by jp1924

test

#10 opened 2 months ago by jp1924

liked a dataset 3 months ago

aiqwe/FinShibainu

Viewer • Updated Dec 18, 2024 • 87.4k • 182 • 7

liked a Space 3 months ago

CircleCI Test Collection Helper Space

📊

Query test results for a PR

updated a dataset 3 months ago

jp1924/PatternedUtteranceWithNumber

Preview • Updated Feb 25 • 55
