---
license: apache-2.0
base_model: BEE-spoke-data/smol_llama-220M-GQA
datasets:
- BEE-spoke-data/pypi_clean-deduped
- bigcode/the-stack-smol-xl
- EleutherAI/proof-pile-2
language:
- en
tags:
- python
- codegen
- markdown
- smol_llama
metrics:
- accuracy
inference:
  parameters:
    max_new_tokens: 64
    min_new_tokens: 8
    do_sample: true
    epsilon_cutoff: 0.0008
    temperature: 0.3
    top_p: 0.9
    repetition_penalty: 1.02
    no_repeat_ngram_size: 8
    renormalize_logits: true
widget:
- text: |
    def add_numbers(a, b):
        return
  example_title: Add Numbers Function
- text: |
    class Car:
        def __init__(self, make, model):
            self.make = make
            self.model = model
        def display_car(self):
  example_title: Car Class
- text: |
    import pandas as pd
    data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
    df = pd.DataFrame(data).convert_dtypes()
    # eda
  example_title: Pandas DataFrame
- text: |
    def factorial(n):
        if n == 0:
            return 1
        else:
  example_title: Factorial Function
- text: |
    def fibonacci(n):
        if n <= 0:
            raise ValueError("Incorrect input")
        elif n == 1:
            return 0
        elif n == 2:
            return 1
        else:
  example_title: Fibonacci Function
- text: |
    import matplotlib.pyplot as plt
    import numpy as np
    x = np.linspace(0, 10, 100)
    # simple plot
  example_title: Matplotlib Plot
- text: |
    def reverse_string(s:str) -> str:
        return
  example_title: Reverse String Function
- text: |
    def is_palindrome(word:str) -> bool:
        return
  example_title: Palindrome Function
- text: |
    def bubble_sort(lst: list):
        n = len(lst)
        for i in range(n):
            for j in range(0, n-i-1):
  example_title: Bubble Sort Function
- text: |
    def binary_search(arr, low, high, x):
        if high >= low:
            mid = (high + low) // 2
            if arr[mid] == x:
                return mid
            elif arr[mid] > x:
  example_title: Binary Search Function
pipeline_tag: text-generation
---

# BEE-spoke-data/beecoder-220M-python

This is `BEE-spoke-data/smol_llama-220M-GQA` fine-tuned for code generation on:

- filtered version of stack-smol-XL
- deduped version of 'algebraic stack' from proof-pile-2
- cleaned and deduped pypi (last dataset)

This model and its base model were both trained with a context length of 2048 tokens.
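
Because generated tokens share the 2048-token window with the prompt, a long prompt needs trimming before generation. The following is a minimal sketch, assuming the prompt is already tokenized into a list of ids and that keeping the most recent tokens is acceptable for code completion. The helper name `fit_prompt` is hypothetical, and the default of 64 new tokens is borrowed from this card's widget settings.

```python
CTX_LENGTH = 2048  # context length stated on this card


def fit_prompt(token_ids, max_new_tokens=64, ctx_length=CTX_LENGTH):
    """Keep the most recent prompt tokens so prompt + generation fits the window.

    `fit_prompt` is a hypothetical helper for illustration, not part of any library.
    """
    if not 0 < max_new_tokens < ctx_length:
        raise ValueError("max_new_tokens must be between 1 and ctx_length - 1")
    budget = ctx_length - max_new_tokens  # tokens left over for the prompt
    return list(token_ids[-budget:])


long_prompt = list(range(3000))  # stand-in for real token ids
print(len(fit_prompt(long_prompt)))  # 1984
```

Trimming from the left also drops a leading BOS token if the tokenizer prepends one, so a real pipeline may want to re-add it after trimming.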

## examples

> Example script for inference testing: [here](https://gist.github.com/pszemraj/c7738f664a64b935a558974d23a7aa8c)

At 220M parameters the model has clear limitations, but it seems decent for single-line completion or docstring generation, and for use in speculative decoding for those tasks.
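
To try short completions like these locally, the sketch below mirrors the sampling settings from this card's `inference.parameters` metadata using the Transformers `pipeline` API. It assumes `transformers` and a backend such as PyTorch are installed. The helper name `complete` is hypothetical, and output will vary between runs because sampling is enabled.

```python
MODEL_ID = "BEE-spoke-data/beecoder-220M-python"

# Sampling settings copied from the `inference.parameters` block of this card.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "min_new_tokens": 8,
    "do_sample": True,
    "epsilon_cutoff": 0.0008,
    "temperature": 0.3,
    "top_p": 0.9,
    "repetition_penalty": 1.02,
    "no_repeat_ngram_size": 8,
    "renormalize_logits": True,
}


def complete(prompt: str) -> str:
    """Return the prompt plus the model's continuation (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without loading the model.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    result = pipe(prompt, **GENERATION_KWARGS)
    return result[0]["generated_text"]


# Example call (downloads the model on first use):
# print(complete("def add_numbers(a, b):\n    return"))
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to construct it once and reuse it.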
|  | |
| The screenshot is on CPU on a laptop. | |
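
For readers unfamiliar with the speculative decoding use mentioned above, here is a toy sketch of the greedy draft-and-verify idea, assuming a small model like this one plays the draft role. It works on plain integer tokens with stand-in functions (`toy_target`, `toy_draft`, both hypothetical) rather than real models, and it omits the extra "bonus" token that full implementations take from the target after a fully accepted draft. It is not the Transformers implementation.

```python
from typing import Callable, List, Sequence

NextToken = Callable[[Sequence[int]], int]  # maps a token sequence to the next token id


def speculative_greedy(
    target: NextToken,
    draft: NextToken,
    prompt: Sequence[int],
    max_new_tokens: int,
    k: int = 4,
) -> List[int]:
    """Greedy draft-and-verify decoding; output matches plain greedy decoding with `target`."""
    tokens = list(prompt)
    goal = len(tokens) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks each proposed position (one batched pass in a real system).
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if guess != expected:
                # First mismatch: keep the accepted prefix plus the target's own token.
                tokens += proposal[:i] + [expected]
                break
        else:
            tokens += proposal  # every draft token was accepted
    return tokens


def toy_target(seq: Sequence[int]) -> int:
    """Stand-in for a large model: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: Sequence[int]) -> int:
    """Stand-in for a small model: agrees with the target except after token 4."""
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10


print(speculative_greedy(toy_target, toy_draft, [0], max_new_tokens=8))
# [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

The speed-up in practice comes from the target verifying several draft tokens in a single forward pass. In Transformers, assisted generation is exposed through the `assistant_model` argument of `generate`; which target models pair well with this one is not stated on this card.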
| --- |