👋 Open to Work

RDTvlokip PRO

https://rdtvlokip.fr

AI & ML interests

None yet

Recent Activity

replied to their post 1 day ago

I spent a week optimizing my 15M-parameter French LLM. Not one line of new architecture. And that was the whole point.

After building it from scratch (custom crawler, BPE, LLaMA-style arch, 3-phase trainer), the model wrote perfect French but hallucinated facts and drifted off-topic. So I went hunting for the bottleneck, convinced it was the architecture. It wasn't. It never is.

The wins came from boring places: a data pipeline that cut documents mid-sentence, two special tokens silently sabotaging generation, and one decoding hyperparameter that doubled coherence (38 → 76 tokens before drift). The flashy research techniques (contrastive decoding, DoLa) gave the smallest gains. One of those results was even a false negative caused by my own buggy eval harness.

The real lesson isn't about French LLMs: architecture is a threshold, not a lever. Once you clear it, the bottleneck is everywhere except the architecture. Measure first. Read your own data. Verify your code before you trust your conclusion.

The model was never the problem.

Full write-up here 👇 🔗 https://huggingface.co/blog/RDTvlokip/what-i-learned-optimizing-a-15m-french

published an article 3 days ago

Article

🔧 L'architecture est un seuil, pas un levier — ce que j'ai appris en optimisant un LLM français de 15M de paramètres 🇫🇷

RDTvlokip • 3 days ago • 1

published an article 3 days ago

Article

🔧 Architecture is a threshold, not a lever — what I learned optimizing a 15M French LLM 🇫🇷

RDTvlokip • 3 days ago • 1

published an article about 2 months ago

Article

🧠 I trained my own French LLM from scratch — alone, with a 1080 Ti, and the power went out ⚡🇫🇷

RDTvlokip • May 5 • 6

published an article about 2 months ago

Article

🧠 J'ai entraîné mon propre LLM français from scratch — seul, avec une 1080 Ti, et le courant a coupé ⚡🇫🇷

RDTvlokip • May 5 • 2

published an article 4 months ago

Article

🧲 Embeddings — When AI turns words into GPS coordinates! 📍🧠

RDTvlokip • Mar 9 • 1

published an article 4 months ago

Article

🧲 Embeddings — Quand l'IA transforme les mots en coordonnées GPS ! 📍🧠

RDTvlokip • Mar 9 • 1

published an article 4 months ago

Article

🎯 PCA (Principal Component Analysis) — Compresser les dimensions comme un boss ! 📊🔥

RDTvlokip • Feb 17 • 1

published an article 4 months ago

Article

🎯 PCA (Principal Component Analysis) — Compressing dimensions like a boss! 📊🔥

RDTvlokip • Feb 17 • 1

published an article 5 months ago

Article

🎯 K-Means — Quand l'IA organise le chaos en boîtes bien rangées ! 📦✨

RDTvlokip • Jan 29 • 1

published an article 5 months ago

Article

🎯 K-Means — When AI organizes chaos into neat boxes! 📦✨

RDTvlokip • Jan 29 • 1

published an article 6 months ago

Article

🎯 F1-Score — Quand l'Accuracy te ment en pleine face ! 📊💥

RDTvlokip • Jan 16 • 1

published an article 6 months ago

Article

🎯 F1-Score — When Accuracy lies to your face! 📊💥

RDTvlokip • Jan 16 • 1

published an article 6 months ago

Article

🎯 Precision & Recall — Les métriques jumelles qui ne sont jamais d'accord ! ⚖️🔍

RDTvlokip • Jan 8 • 1

published an article 6 months ago

Article

🎯 Precision & Recall — The twin metrics that never agree! ⚖️🔍

RDTvlokip • Jan 8 • 1

published an article 6 months ago

Article

🎆 AI 2026 — The 9 trends that will EXPLODE this year! 🚀💥

RDTvlokip • Jan 1 • 2

published an article 6 months ago

Article

🎆 IA 2026 — Les 9 tendances qui vont exploser cette année ! 🚀💥

RDTvlokip • Jan 1 • 2

published an article 6 months ago

Article

📊 Cross-Entropy — The loss function that KNOWS how to punish! 🎯🔥

RDTvlokip • Dec 29, 2025 • 1

published an article 6 months ago

Article

📊 Cross-Entropy — La fonction de perte qui SAIT punir ! 🎯🔥

RDTvlokip • Dec 29, 2025 • 2

published an article 6 months ago

Article

🎯 Learning Rate — L'accélérateur de ton réseau de neurones ! 🚗💨

RDTvlokip • Dec 21, 2025 • 1

published an article 6 months ago

Article

🎯 Learning Rate — The gas pedal of your neural network! 🚗💨

RDTvlokip • Dec 21, 2025 • 1