alkinun

AtAndDev

AI & ML interests

LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN, and more.

Recent Activity

updated a model about 6 hours ago

aethercompute/aether0-2b

published a model about 6 hours ago

aethercompute/aether0-2b

reacted to aufklarer's post with 🔥 about 16 hours ago

Voice cloning models measured across five languages: OmniVoice, Chatterbox, VoxCPM2, Fish Audio

I published a new Soniqo benchmark post for local voice cloning models across five languages: https://www.soniqo.audio/blog/voice-cloning-benchmarks

Models:
- OmniVoice int8
- Chatterbox Multilingual fp16
- VoxCPM2 bf16
- Fish Audio S2 Pro fp16

Languages:
- English
- German
- Modern Standard Arabic
- Spanish
- Mandarin Chinese

The benchmark uses Google FLEURS test clips as dataset references. Each row includes the reference audio, generated audio, speaker similarity, WER/CER, generated audio length, and RTF.

Main result in this run: OmniVoice was the strongest all-around row set, with 0.707 mean speaker cosine across all five languages, 0.0% ASR error, and mean RTF 0.45. VoxCPM2 bf16 was especially strong on Arabic speaker match. Fish Audio S2 Pro showed strong German/Arabic similarity but slower RTF. Chatterbox Multilingual was competitive on Arabic and Spanish.

This is an engineering benchmark, not a human MOS study. The speaker-similarity values should be compared within this table because every row uses the same local speaker-embedding pipeline.

Try the stack locally with Speech Studio:
https://www.soniqo.audio/speech-studio
https://github.com/soniqo/speech-studio

Underlying Swift library/CLI:
https://github.com/soniqo/speech-swift

Soniqo models and exports:
https://huggingface.co/soniqo
https://huggingface.co/aufklarer

What model or language should I add next?

liked 2 models 1 day ago

amalia-llm/AMALIA-VL-DPO

Image-Text-to-Text • 10B • Updated 4 days ago • 145 • 1

amalia-llm/AMALIA-VL-SFT

Image-Text-to-Text • 10B • Updated 4 days ago • 98 • 3

liked a dataset 1 day ago

amalia-llm/DPO-Dataset

Viewer • Updated 9 days ago • 424k • 29 • 1

liked 2 models 1 day ago

amalia-llm/AMALIA-9B-0626-SFT

Text Generation • 9B • Updated 2 days ago • 1.19k • 15

amalia-llm/AMALIA-9B-0626-DPO

Text Generation • 9B • Updated 1 day ago • 520 • 7

liked a dataset 1 day ago

XDOF/ABC-130k

Updated 2 days ago • 621k • 74

liked a dataset 2 days ago

nvidia/Nemotron-Pretraining-Code-v3

Viewer • Updated about 1 month ago • 146M • 3.65k • 56

liked 2 models 4 days ago

LiquidAI/LFM2.5-230M

Text Generation • 0.2B • Updated 9 days ago • 33.2k • 206

Chunjiang-Intelligence/DeepSeek-v4-Fable

Text Generation • 149B • Updated 4 days ago • 2.15k • 150

liked 2 datasets 6 days ago

epfml/FineWeb2-HQ

Viewer • Updated Feb 19, 2025 • 380M • 10.6k • 71

LLM360/MegaMath

Viewer • Updated Apr 9, 2025 • 217M • 30.7k • 126

liked a Space 6 days ago

The Smol Training Playbook

The secrets to building world-class LLMs

liked 8 datasets 6 days ago

allenai/peS2o

Updated Oct 13, 2024 • 11.3k • 197

OpenCoder-LLM/opc-fineweb-code-corpus

Viewer • Updated Nov 24, 2024 • 101M • 3.38k • 54

mlfoundations/dclm-baseline-1.0-parquet

Viewer • Updated Jul 19, 2024 • 2.73B • 32.3k • 53

HuggingFaceTB/stack-edu

Viewer • Updated Mar 20, 2025 • 167M • 3.89k • 75

OpenCoder-LLM/opc-annealing-corpus

Viewer • Updated May 29, 2025 • 15.6M • 1.3k • 44

HuggingFaceTB/smollm-corpus

Viewer • Updated Sep 6, 2024 • 237M • 36.8k • 469

HuggingFaceFW/finephrase

Viewer • Updated Mar 31 • 1.02B • 331k • 133

incredible45/Gutenberg-BookCorpus-Cleaned-Data-English

Viewer • Updated Apr 10, 2025 • 51.4k • 853 • 15