If 'fun at parties' means ignoring the potential of a 146 trillion parameter model, then yeah, I'm the most boring person you'll ever meet. I'll let the results do the talking from here.
I'm not saying a 140-whatever-trillion-parameter model can't exist; I'm saying your "paper" misleads users into believing someone single-handedly made an AGI.
Just be realistic: try training a 140 billion parameter model once and tell me how long it took to train from scratch.
Training a 140B model is a calculation of compute; designing a 146T architecture is a matter of engineering. While you're stuck on the 'time' it takes others, I'm focused on the MoE scaling and dataset curation for SKT AI. If you're so concerned about realism, do Go And Check Out Our Repo Lol
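For context on the "calculation of compute" claim: the widely used rule of thumb estimates dense-transformer pre-training cost at roughly 6 FLOPs per parameter per token. A minimal sketch of that arithmetic follows; the GPU count, per-GPU throughput, and utilization figures are illustrative assumptions, not measurements from either party.

```python
# Back-of-envelope training-time estimate via the common ~6*N*D FLOPs rule.
# Hardware numbers below (989 TFLOP/s peak BF16, 40% utilization) are
# assumed H100-class figures for illustration only.

def training_days(params, tokens, gpus, peak_flops=989e12, mfu=0.40):
    """Estimated wall-clock days to pre-train a dense model from scratch."""
    total_flops = 6 * params * tokens       # ~6 FLOPs per parameter per token
    effective = gpus * peak_flops * mfu     # sustained cluster throughput
    return total_flops / effective / 86400  # seconds -> days

# 140B parameters, Chinchilla-style ~20 tokens per parameter, 1024 GPUs
days = training_days(140e9, 20 * 140e9, gpus=1024)
print(f"~{days:.0f} days")
```

Under these assumptions a 140B run lands on the order of a couple of months on a 1024-GPU cluster, which is why "how long did it take" is a fair sanity check on any from-scratch training claim.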
I have better things to do in my free time than look at a ""paper"" written by artificial intelligence.
That's the difference: you have 'free time' to argue, while I'm busy engineering the future of Indian AI. If you can't tell the difference between a roadmap and a chatbot output, that's on you. Enjoy your free time while I keep building. Do go and check out our repo lol
SKT AI Labs
Shrijanagain
AI & ML interests
🚩 Jai Shree Ram 🚩 ("Victory to Lord Ram")
Mission: Collecting 200 Trillion Tokens 🇮🇳
🔥 Vibe: "Jo Ram ka nahi, wo mere kaam ka nahi." ("Whoever is not Ram's is of no use to me.") 🏹
Recent Activity
new activity about 2 hours ago
Shrijanagain/SKT_OMNI_SUPREME: Danger
replied to their post about 6 hours ago
Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: Shrijan Kumar Tiwari
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion
Wanna collaborate, friends? Let's start the journey. We have collected 146 trillion tokens and done pre-training, but we need to make it more powerful.
Whitepaper - https://github.com/SHRIJANAGAIN/PROFF
replied to their post about 7 hours ago