Multi🤖Transformers

community

https://www.tonic-ai.com

josephpollack

tonic-ai


AI & ML interests

🤖🤗multi media inputs and outputs to create augmented culture and better outcomes for humans everywhere.❤️🚀

Recent Activity

peaceAsh authored a paper about 2 months ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Narchethan updated a Space about 1 year ago

MultiTransformer/Orion

Tonic updated a Space about 1 year ago

MultiTransformer/tonic_gradio_bot


Tonic

posted an update 13 days ago

Post

4131

🙋🏻‍♂️ Hey there folks,

since everyone liked my previous announcement post ( https://huggingface.co/posts/Tonic/338509028435394 ) so much, I'm back with more high-quality procedural datasets in the geospatial domain for SFT training!

Check this one out :
NuTonic/sat-bbox-metadata-sft-v1

the goal is to be able to train vision models on multiple images for one-shot remote sensing analysis.

hope you like it ! 🚀

2 replies

Tonic

posted an update 18 days ago

Post

3560

🙋🏻‍♂️ Hey there folks ,

I'm sharing Hugging Face's largest dataset of annotated satellite images today.

Check it out here: NuTonic/sat-image-boundingbox-sft-full

I hope you like it; the idea is to be able to use this with small vision models 🚀

Parveshiiii

posted an update 27 days ago

Post

536

🚀 Sonic: A lightweight Python audio processing library with tempo matching, BPM detection, time-stretching, resampling & track blending — now with GPU (CUDA) acceleration for 10x speed!

Perfect for quick remixes, batch edits or syncing tracks.
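
For readers new to tempo matching, here is a minimal sketch of the arithmetic behind it. It is not Sonic's API: the function names `estimate_bpm`, `stretch_rate`, and `stretched_duration` are hypothetical, and the beat timestamps are assumed to come from some beat detector that is out of scope here.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo from beat timestamps (seconds) via the median beat interval."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beat timestamps")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return 60.0 / median(intervals)


def stretch_rate(source_bpm, target_bpm):
    """Playback-rate factor that brings a track from source_bpm to target_bpm."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("BPM values must be positive")
    return target_bpm / source_bpm


def stretched_duration(duration_s, source_bpm, target_bpm):
    """Duration of the track after time-stretching to the target tempo."""
    return duration_s / stretch_rate(source_bpm, target_bpm)


if __name__ == "__main__":
    bpm = estimate_bpm([0.0, 0.5, 1.0, 1.5])
    print(bpm, stretch_rate(bpm, 128), stretched_duration(240, bpm, 128))
```

A rate above 1 speeds the track up and shortens it; a time-stretching step then applies that rate without changing pitch.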

👉 https://github.com/Parveshiiii/Sonic

#Python #AudioProcessing #OpenSource #PyTorch

Parveshiiii

posted an update about 1 month ago

Post

1617

Excited to announce my latest open-source release on Hugging Face: Parveshiiii/breast-cancer-detector.

This model has been trained and validated on external datasets to support medical research workflows. It is designed to provide reproducible benchmarks and serve as a foundation for further exploration in healthcare AI.

Key highlights:
- Built for medical research and diagnostic study contexts
- Validated against external datasets for reliability
- Openly available to empower the community in building stronger, more effective solutions

This release is part of my ongoing effort to make impactful AI research accessible through **Modotte**. A detailed blog post explaining the methodology, dataset handling, and validation process will be published soon.

You can explore the model here: Parveshiiii/breast-cancer-detector

#AI #MedicalResearch #DeepLearning #Healthcare #OpenSource #HuggingFace

Reubencf

authored a paper about 2 months ago

Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language

Paper • 2603.23529 • Published Mar 7

Parveshiiii

posted an update about 2 months ago

Post

2914

Just did something I’ve been meaning to try for ages.

In only 3 hours, on 10 billion+ tokens, I trained a custom BPE + tiktoken-style tokenizer using my new library microtok — and it hits the same token efficiency as Qwen3.

Tokenizers have always felt like black magic to me. We drop them into every LLM project, but actually training one from scratch? That always seemed way too complicated.

Turns out it doesn’t have to be.

microtok makes the whole process stupidly simple — literally just 3 lines of code. No heavy setup, no GPU required. I built it on top of the Hugging Face tokenizers library so it stays clean, fast, and actually understandable.
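
To make the "under the hood" part concrete, here is a minimal pure-Python sketch of what BPE training does. It is not microtok's API and not the Hugging Face tokenizers implementation; `train_bpe`, `merge_pair`, and `encode` are hypothetical names, and the tie-breaking rule is an arbitrary choice made only to keep the result deterministic.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words."""
    # Each distinct word starts as a tuple of single characters with its frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often the word occurs.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties go to the lexicographically largest pair.
        best = max(pairs, key=lambda p: (pairs[p], p))
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[tuple(merge_pair(list(symbols), best))] += freq
        words = new_words
        merges.append(best)
    return merges


def encode(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


if __name__ == "__main__":
    rules = train_bpe(["low", "low", "lower", "lowest", "new", "newer"], 3)
    print(rules)
    print(encode("lowest", rules))
```

Production tokenizers add byte-level fallback, pre-tokenization, and much faster pair counting, but the merge loop is the same idea.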

If you’ve ever wanted to look under the hood and build your own optimized vocabulary instead of just copying someone else’s, this is the entry point you’ve been waiting for.

I wrote up the full story, threw in a ready-to-run Colab template, and dropped the trained tokenizer on Hugging Face.

Blog → https://parveshiiii.github.io/blogs/microtok/
Trained tokenizer → https://huggingface.co/Parveshiiii/microtok
GitHub repo → https://github.com/Parveshiiii/microtok

Nymbo

posted an update about 2 months ago

Post

6944

We should really have a release date range slider on the /models page. Tired of "trending/most downloaded" being the best way to sort and still seeing models from 2023 on the first page just because they're embedded in enterprise pipelines and get downloaded repeatedly. "Recently Created/Recently Updated" don't solve the discovery problem considering the amount of noise to sift through.

Slight caveat: Trending actually does have some recency bias, but it's not strong/precise enough.
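
A rough sketch of the filter being asked for, assuming model listings are available as simple records. `ModelRecord` and `filter_by_release` are hypothetical names and the sample data is made up for illustration; nothing here reflects how the /models page is implemented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical listing entry: repo id, creation date, download count."""
    model_id: str
    created: date
    downloads: int


def filter_by_release(models, start, end):
    """Keep models created within [start, end], most downloaded first."""
    in_range = [m for m in models if start <= m.created <= end]
    return sorted(in_range, key=lambda m: m.downloads, reverse=True)


if __name__ == "__main__":
    catalog = [
        ModelRecord("org/old-but-embedded", date(2023, 3, 1), 9_000_000),
        ModelRecord("org/fresh-small", date(2025, 11, 2), 40_000),
        ModelRecord("org/fresh-large", date(2025, 12, 15), 250_000),
    ]
    for m in filter_by_release(catalog, date(2025, 10, 1), date(2025, 12, 31)):
        print(m.model_id, m.downloads)
```

Applying the date window before sorting is what keeps heavily downloaded older models from crowding the first page.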

3 replies

peaceAsh

authored a paper about 2 months ago

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 24

Reubencf

posted an update 2 months ago

Post

2801

🚀 I am thrilled to announce the release of a new Konkani LLM!

We've seen some fantastic results for both translation and transliteration tasks, and I'm excited to share this progress with the community.

📖 Read the launch article and see the results: https://huggingface.co/blog/Reubencf/konkani-llm
🤖 Explore the model and collection:

konkani

I would love to hear your feedback or see what you build with it! #Konkani #LLM #NLP #HuggingFace #IndicNLP

Tonic

posted an update 3 months ago

Post

3714

🤔 Who would win ?

- a fully subsidized ai lab
OR
- 3 random students named kurakurai ?

demo : Tonic/fr-on-device

if you like it, give the demo a little star and send a shoutout to @MaxLSB, @jddqd and @GAD-cell for absolutely obliterating the Pareto frontier of French language understanding.

4 replies

Tonic

posted an update 3 months ago

Post

3429

🙋🏻‍♂️hello my lovelies ,

it is with great pleasure that I present to you my working one-click deployment on Hugging Face Spaces: 16 GB of RAM, completely free.

repo : Tonic/hugging-claw (use git clone to inspect)
literally the one-click link : Tonic/hugging-claw

you can also run it locally and see for yourself :

docker run -it -p 7860:7860 --platform=linux/amd64 \
-e HF_TOKEN="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_TRUSTED_PROXIES="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_PASSWORD="YOUR_VALUE_HERE" \
-e OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS="YOUR_VALUE_HERE" \
registry.hf.space/tonic-hugging-claw:latest

just a few quite minor details i'll take care of but i wanted to share here first

2 replies

Parveshiiii

posted an update 3 months ago

Post

343

Introducing Seekify — a truly non‑rate‑limiting search library for Python

Tired of hitting rate limits when building search features? I’ve built Seekify, a lightweight Python library that lets you perform searches without the usual throttling headaches.

🔹 Key highlights

- Simple API — plug it in and start searching instantly

- No rate‑limiting restrictions

- Designed for developers who need reliable search in projects, scripts, or apps

📦 Available now on PyPI:

pip install seekify

👉 Check out the repo: https://github.com/Parveshiiii/Seekify
I’d love feedback, contributions, and ideas for real‑world use cases. Let’s make search smoother together!

posted an update 3 months ago

Post

253

finding https://github.com/meta-introspector/monster/blob/9a368b1dd58e72ed4a466f81f74ab2ea95c26927/experiments/bott_periodicity/monster_walk.tex#L82 I sent this to my old math prof
You can remove these primes in groups from the monster in the 10fold way
1 & 0 & 8080 & 4 & 8 \\
2 & 4 & 1742 & 4 & 4 \\
3 & 8 & 479 & 3 & 4 \\
4 & 11 & 451 & 3 & 4 \\
5 & 14 & 2875 & 4 & 4 \\
https://x.com/introsp3ctor/status/2018078520321179935

Parveshiiii

posted an update 4 months ago

Post

1644

🚀 Wanna train your own AI Model or Tokenizer from scratch?

Building models isn’t just for big labs anymore — with the right data, compute, and workflow, you can create **custom AI models** and **tokenizers** tailored to any domain. Whether it’s NLP, domain‑specific datasets, or experimental architectures, training from scratch gives you full control over vocabulary, embeddings, and performance.

✨ Why train your own?
- Full control over vocabulary & tokenization
- Domain‑specific optimization (medical, legal, technical, etc.)
- Better performance on niche datasets
- Freedom to experiment with architectures

⚡ The best part?
- Tokenizer training (TikToken / BPE) can be done in **just 3 lines of code**.
- Model training runs smoothly on **Google Colab notebooks** — no expensive hardware required.

📂 Try out my work:
- 🔗 https://github.com/OE-Void/Tokenizer-from_scratch
- 🔗 https://github.com/OE-Void/GPT

Reubencf

posted an update 4 months ago

Post

2216

📢 New release! World_events Dataset now available featuring global events spanning 2023 through 2025
🌍 https://huggingface.co/collections/Reubencf/world-events

🚀 2026 dataset dropping soon

1 reply

Reubencf

posted an update 4 months ago

Post

1901

Now Live: The Reubencf/Nano_Banana_Editor now includes 10 free requests/day! 🍌 I'm personally sponsoring these credits to help make open AI accessible to all.
(Note: Limits are subject to change based on funding).

Enjoy !

Parveshiiii

posted an update 4 months ago

Post

266

📢 The Announcement
Subject: XenArcAI is now Modotte – A New Chapter Begins! 🚀

Hello everyone,

We are thrilled to announce that XenArcAI is officially rebranding to Modotte!

Since our journey began, we’ve been committed to pushing the boundaries of AI through open-source innovation, research, and high-quality datasets. As we continue to evolve, we wanted a name that better represents our vision for a modern, interconnected future in the tech space.

What is changing?

The Name: Moving forward, all our projects, models, and community interactions will happen under the Modotte banner.

The Look: You’ll see our new logo and a fresh color palette appearing across our platforms.

What is staying the same?

The Core Team: It’s still the same people behind the scenes, including our founder, Parvesh Rawal.

Our Mission: We remain dedicated to releasing state-of-the-art open-source models and datasets.

Our Continuity: All existing models, datasets, and projects will remain exactly as they are—just with a new home.

This isn’t just a change in appearance; it’s a commitment to our next chapter of growth and discovery. We are so grateful for your ongoing support as we step into this new era.

Welcome to the future. Welcome to Modotte.

Best regards, The Modotte Team

mindchain

posted an update 4 months ago

Post

3085

Claude Code Self & Continual Learning

Hey everyone! 👋

30 GitHub Stars in 4 Days - Thank You!

I'm really grateful for the positive response to the Claude Reflect System. In just 4 days, 30 developers have shown interest by starring the project. Thank you so much!

What Is Claude Reflect?

Correct once, never again. Claude Reflect helps Claude Code remember your corrections and preferences across sessions. Instead of repeating the same feedback, the system learns and applies it automatically.

Main Features:

🧠 Learning System
- Detects corrections and preferences from conversations
- Stores them permanently in skill files
- Applies learnings in future sessions

🔒 Safety First
- Automatic backups before changes
- YAML validation
- Git version control

⚡ Two Modes
- Manual: Run /reflect when you want
- Auto: Reflects automatically at session end

How It Works

If you correct Claude to use pytest instead of unittest, this preference gets saved. Next time, Claude will remember and use pytest automatically. It's that simple.
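
As an illustration only, the save-and-reuse idea with a backup before each change could look like the sketch below. This is not the Claude Reflect implementation: the project stores learnings in skill files with YAML validation, whereas this sketch uses a JSON file from the standard library as a stand-in, and `save_preference` and `load_preferences` are hypothetical names.

```python
import json
import shutil
from pathlib import Path


def load_preferences(store: Path) -> dict:
    """Return saved preferences, or an empty dict if nothing was stored yet."""
    return json.loads(store.read_text()) if store.exists() else {}


def save_preference(store: Path, key: str, value: str) -> dict:
    """Record one correction, backing up the previous file before overwriting it."""
    prefs = load_preferences(store)
    if store.exists():
        # Keep a copy of the last known-good state next to the store.
        shutil.copy2(store, store.with_suffix(store.suffix + ".bak"))
    prefs[key] = value
    store.write_text(json.dumps(prefs, indent=2, sort_keys=True))
    return prefs


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "preferences.json"
        save_preference(path, "test_framework", "pytest")
        print(load_preferences(path))
```

A later session would call `load_preferences` at startup and apply whatever it finds, which is the "correct once" behaviour described above.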

Getting Started

1. Clone the repository
2. Install dependencies
3. Activate the skill
4. Try it out!

The python-project-creator example shows how the system learns from your feedback.

Give It a Try

https://github.com/haddock-development/claude-reflect-system

Feel free to check it out, give feedback, or contribute. Every bit of input helps improve the project!

Thank you so much for your support!

---
#ClaudeCode #AI #MachineLearning #ContinualLearning #OpenSource #Developer #Coding #Python #Productivity #DevTools #GitHub #SoftwareDevelopment #Programming #AIAssistant #DeveloperTools #CodeQuality #Tech

Feel free to give it a try by yourself.
https://github.com/haddock-development/claude-reflect-system

2 replies

Nymbo

posted an update 4 months ago

Post

3111

Genuine recommendation: You should really use this AutoHotkey macro. Save the file as macros.ahk and run it. Before sending a prompt to your coding agent, press Ctrl + Alt + 1 and paste your prompt to any regular chatbot. Then send the output to the agent. This is the actual, boring, real way to "10x your prompting". Use the other number keys to avoid repeating yourself over and over again. I use this macro prolly 100-200 times per day. AutoHotkey isn't as new or hype as a lot of other workflows, but there's a reason it's still widely used after 17 years. Don't overcomplicate it.

; Requires AutoHotkey v1.1+

; All macros are `Ctrl + Alt + <variable>`

^!1::
    Send, Please help me more clearly articulate what I mean with this message (write the message in a code block):
return

^!2::
    Send, Please make the following changes:
return

^!3::
    Send, It seems you got cut off by the maximum response limit. Please continue by picking up where you left off.
return

In my experience the past few months, Ctrl + Alt + 1 works best with Instruct models (non-thinking). Reasoning causes some models to ramble and miss the point. I've just been using GPT-5.x for this.

mindchain

posted an update 4 months ago

Post

1886

Scaling Physical AI: SAM 3D, NVIDIA Cosmos, and Unreal Engine!

The "Sim-to-Real" gap is officially history. In early 2026, we are no longer just rendering data; we are simulating reality. By bridging Meta’s SAM 3D, Unreal Engine, and the NVIDIA Cosmos suite, we’ve built an autonomous pipeline for Physical AI that evolves itself.

The 2026 Tech Stack:
SAM 3D: Generates high-fidelity digital twins from 2D photos in seconds.

Unreal Engine + MCP: The AI "Director" orchestrates environments via the Model Context Protocol, providing perfect Ground Truth.

NeMo Data Designer: The orchestration hub on GitHub. Following NVIDIA’s acquisition of Gretel in early 2025, its leading generative privacy and tabular tech are now fully integrated here.

NVIDIA Cosmos Transfer: Neural rendering that adds hyper-realism to Unreal Engine outputs.

NVIDIA Cosmos Predict: Predicts physically accurate motion (falling, sliding) without manual animation.

NVIDIA Cosmos Reason: The automated supervisor checking every frame for logical and physical consistency.

The Workflow:
Asset Capture: SAM 3D turns real-world photos into Nanite meshes for Unreal Engine.

Orchestration: NeMo Data Designer (with Gretel-powered integrity) defines the data schema, while AI builds the world in Unreal Engine.

Completion: NVIDIA Cosmos (Transfer & Predict) adds photorealism and physics, while NVIDIA Cosmos Reason guarantees quality.

By combining Gretel’s data heritage with the visual power of Unreal Engine, we generate 100,000 perfect frames per hour. Weights and tools are on Hugging Face. Stop labeling. Start simulating.

#PhysicalAI #SAM3D #NVIDIACosmos #UnrealEngine #NeMo #Gretel #SyntheticData #HuggingFace #Robotics #AI #ComputerVision

Team members 106
