SigmaNOVA

SigmaNOVA is a custom 18-million-parameter language model trained from scratch using a novel hierarchical predictive coding architecture. Despite its very small footprint, it achieves 100% recall on the factual and identity questions in its benchmark testing.

Because of its recurrent Gated Context Memory architecture, its cost scales linearly, O(N), with sequence length. This makes it fast and memory-efficient compared with traditional Transformer models, whose self-attention cost grows quadratically with sequence length.
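To illustrate why a recurrent memory scales linearly, here is a minimal toy sketch in plain Python. The update rule, the `gate_weight` parameter, and the function names are assumptions made for illustration; this is not SigmaNOVA's actual Gated Context Memory. It only shows the general idea: each token is folded into a fixed-size state in a single pass.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def run_gated_memory(inputs, gate_weight=1.0):
    """Fold a sequence of vectors into one fixed-size state.

    Hypothetical update rule (illustration only):
        g = sigmoid(gate_weight * x)
        h = g * h + (1 - g) * x
    """
    dim = len(inputs[0])
    state = [0.0] * dim  # memory size does not depend on sequence length
    steps = 0
    for x in inputs:  # one pass over the N tokens: O(N) time
        for i in range(dim):
            g = sigmoid(gate_weight * x[i])  # gate value in (0, 1)
            state[i] = g * state[i] + (1.0 - g) * x[i]
        steps += 1
    return state, steps


state, steps = run_gated_memory([[1.0, -1.0]] * 8)
print(len(state), steps)
```

Doubling the number of input vectors doubles `steps`, while the state stays the same size. Standard self-attention, by contrast, compares every token with every other token.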

How to Use This Model

This repository contains the model weights and the custom tokenizer.

To actually run and interact with SigmaNOVA, please head over to the official GitHub repository for the codebase and setup instructions: GitHub Repository: SigmaNOVA Codebase

Kaggle Notebook

You can also try out the model directly in your browser, without any installation, using our public Kaggle notebook.

Quick Start

  1. Clone the GitHub repository.
  2. Install the required dependencies.
  3. Download the files from this Hugging Face repository (nova_v6_weights_instruct.pt and nova_tokenizer.json).
  4. Create a folder named model inside the GitHub project directory and place the downloaded files inside it.
  5. Run python chat.py to launch the interactive terminal!
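Before launching the chat, it can help to confirm that steps 3 and 4 put the files where the code expects them. The helper below is a small sketch, not part of the official codebase; `missing_model_files` is a hypothetical name, and it assumes the `model` folder sits directly inside the project directory as described above.

```python
from pathlib import Path

# File names as listed in the Quick Start steps.
REQUIRED_FILES = ("nova_v6_weights_instruct.pt", "nova_tokenizer.json")


def missing_model_files(project_dir):
    """Return the required files that are not present in <project_dir>/model."""
    model_dir = Path(project_dir) / "model"
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]
```

Calling `missing_model_files(".")` from the project root returns an empty list once both files are in place.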

Training Data & Purpose

This version of SigmaNOVA is primarily a proof-of-concept test model to demonstrate the capabilities of its novel architecture.

  • Base Brain (Grammar): Pre-trained on the TinyStories dataset to learn basic English grammar and sentence structure (~35,302,519 words).
  • Instruction Tuning: Fine-tuned on a micro-dataset of just 298 question-answer pairs (~4,000 words).

Even with this extremely limited fine-tuning data, the model's new architecture allows it to achieve 100% recall on the factual and identity questions in its benchmark testing.
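To put the two dataset sizes in proportion, the short calculation below uses only the figures quoted above. The fine-tuning word count is the approximate value given, so the results are rough.

```python
PRETRAIN_WORDS = 35_302_519  # TinyStories word count quoted above
FINETUNE_PAIRS = 298         # question-answer pairs
FINETUNE_WORDS = 4_000       # approximate figure quoted above

words_per_pair = FINETUNE_WORDS / FINETUNE_PAIRS
finetune_share = FINETUNE_WORDS / (PRETRAIN_WORDS + FINETUNE_WORDS)

print(f"{words_per_pair:.1f} words per pair")             # 13.4 words per pair
print(f"{finetune_share:.4%} of all training words")      # 0.0113% of all training words
```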

Example Interactions

Here are some real examples of the model's responses during benchmark testing:

Greetings & Identity

  • User: What is your name?
  • SigmaNOVA: My name is SigmaNOVA. I was built to help you.

Factual Recall

  • User: What is the sun?
  • SigmaNOVA: The sun is a star that gives Earth light and heat.
  • User: What is the moon?
  • SigmaNOVA: The moon is a rocky body that orbits the Earth.

Reasoning & Concepts

  • User: What is the opposite of hot?
  • SigmaNOVA: The opposite of hot is cold.
  • User: What color is the sky?
  • SigmaNOVA: The sky is blue during the day.
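The benchmark procedure itself is not described here, so the following is only one plausible way a recall score over such question-answer pairs could be computed: exact match after normalizing case and whitespace. The names `normalize`, `recall`, and `answer_fn` are hypothetical, and a dictionary lookup stands in for the model.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def recall(answer_fn, qa_pairs) -> float:
    """Fraction of questions whose answer matches the reference exactly."""
    hits = sum(normalize(answer_fn(q)) == normalize(a) for q, a in qa_pairs)
    return hits / len(qa_pairs)


# Reference pairs taken from the examples above.
QA_PAIRS = [
    ("What is the opposite of hot?", "The opposite of hot is cold."),
    ("What color is the sky?", "The sky is blue during the day."),
    ("What is the moon?", "The moon is a rocky body that orbits the Earth."),
]

# Stand-in for the model: a lookup table that answers every question correctly.
lookup = dict(QA_PAIRS)
print(recall(lambda q: lookup.get(q, ""), QA_PAIRS))
```

Replacing the lambda with a function that calls the real model would score its actual responses against the same references.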

Architecture Highlights

  • Parameters: 18,221,568
  • Vocab Size: 32,000 (Custom BPE Tokenizer)
  • Context Dimension: 512
  • Topology: Gated Context Memory + 3-Layer Predictive Coding Hierarchy (512 -> 1024 -> 512)
  • Training: Full Backpropagation Through Time (BPTT) with <|sep|> and <|end|> boundary tokens.
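As a rough sanity check on where the parameters sit, the arithmetic below combines the figures listed above. It rests on assumptions that are not confirmed here: that token embeddings are 512-wide (equal to the context dimension), that there is a single embedding table, and that the 512 -> 1024 -> 512 hierarchy uses two plain weight matrices without biases. Treat it as an estimate, not a description of the actual layer layout.

```python
TOTAL_PARAMS = 18_221_568
VOCAB_SIZE = 32_000
CONTEXT_DIM = 512
HIDDEN_DIM = 1_024

# Assumption: one embedding table of shape (vocab, 512).
embedding = VOCAB_SIZE * CONTEXT_DIM

# Assumption: two dense maps, 512 -> 1024 and 1024 -> 512, no biases.
hierarchy = CONTEXT_DIM * HIDDEN_DIM + HIDDEN_DIM * CONTEXT_DIM

# Whatever is left would belong to the gated memory, biases, and anything else.
remainder = TOTAL_PARAMS - embedding - hierarchy

print(f"embedding: {embedding:,} ({embedding / TOTAL_PARAMS:.1%} of total)")
print(f"hierarchy: {hierarchy:,}")
print(f"remainder: {remainder:,}")
```

Under these assumptions the embedding table alone would hold 16,384,000 parameters, roughly 90% of the total, which is typical for very small models with a 32,000-token vocabulary.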

License

Please see the GitHub repository for proper attribution guidelines if you use this model or code in your project.
