Open to Collab

s3nh PRO

s3nhxx
s3nh

AI & ML interests

Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh

Recent Activity

reacted to Tonic's post with 🔥 about 8 hours ago

🙋🏻‍♂️ hello my lovelies, it is with great pleasure i present to you my working one-click deploy: a completely free, 16GB RAM Hugging Face Spaces deployment.

repo: https://huggingface.co/spaces/Tonic/hugging-claw/tree/main (use git clone to inspect)

literally the one-click link: https://huggingface.co/spaces/Tonic/hugging-claw?duplicate=true

you can also run it locally and see for yourself:

    docker run -it -p 7860:7860 --platform=linux/amd64 \
      -e HF_TOKEN="YOUR_VALUE_HERE" \
      -e OPENCLAW_GATEWAY_TRUSTED_PROXIES="YOUR_VALUE_HERE" \
      -e OPENCLAW_GATEWAY_PASSWORD="YOUR_VALUE_HERE" \
      -e OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS="YOUR_VALUE_HERE" \
      registry.hf.space/tonic-hugging-claw:latest

just a few quite minor details i'll take care of, but i wanted to share here first

reacted to MonsterMMORPG's post with 🔥 2 days ago

SECourses Musubi Trainer upgraded to V27: FLUX 2, FLUX Klein, and Z-Image training added with demo configs, amazingly VRAM optimized. Read the news.

App is here: https://www.patreon.com/posts/137551634

Full tutorial on how to use and train: https://youtu.be/DPX3eBTuO_Y

reacted to codelion's post with 🔥 4 days ago

Introducing Dhara-70M: a diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:
→ Depth beats width: 32 layers outperform 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m

s3nh's datasets (1)

s3nh/alpaca-dolly-instruction-only-polish

Viewer • Updated May 2, 2023 • 23.7k • 34 • 6