Instructions for using BEE-spoke-data/beecoder-220M-python with libraries and local apps.

Libraries

How to use BEE-spoke-data/beecoder-220M-python with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="BEE-spoke-data/beecoder-220M-python")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BEE-spoke-data/beecoder-220M-python")
model = AutoModelForCausalLM.from_pretrained("BEE-spoke-data/beecoder-220M-python")
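
The snippets above load the model but do not generate anything. As a minimal sketch, the settings below are copied from the `inference.parameters` block of the model card and passed to the pipeline call. The helper name `generate` is hypothetical, and `transformers` is imported inside the function so the settings can be inspected without loading the library.

```python
# Sampling settings copied from `inference.parameters` in the model card.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "min_new_tokens": 8,
    "do_sample": True,
    "epsilon_cutoff": 0.0008,
    "temperature": 0.3,
    "top_p": 0.9,
    "repetition_penalty": 1.02,
    "no_repeat_ngram_size": 8,
    "renormalize_logits": True,
}


def generate(prompt: str, model_id: str = "BEE-spoke-data/beecoder-220M-python") -> str:
    """Hypothetical helper: complete `prompt` using the model card's settings."""
    # Lazy import: building the settings above does not require transformers.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_id)
    # The text-generation pipeline forwards these keyword arguments to generate().
    result = pipe(prompt, **GENERATION_KWARGS)
    return result[0]["generated_text"]
```

Calling `generate("def add_numbers(a, b):\n    return")` downloads the weights on first use and returns the prompt followed by the sampled completion.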

Local Apps

vLLM

How to use BEE-spoke-data/beecoder-220M-python with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "BEE-spoke-data/beecoder-220M-python"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BEE-spoke-data/beecoder-220M-python",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
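
The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the default `max_tokens` and `temperature` are taken from the model card's inference settings rather than from the curl example.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = "BEE-spoke-data/beecoder-220M-python",
    max_tokens: int = 64,
    temperature: float = 0.3,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request to a running server and return the generated text."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Completions responses carry the generated text in choices[0].text.
    return body["choices"][0]["text"]
```

With a server running, `complete("def add_numbers(a, b):\n    return")` returns the completion text. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.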

Use Docker

docker model run hf.co/BEE-spoke-data/beecoder-220M-python

SGLang

How to use BEE-spoke-data/beecoder-220M-python with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "BEE-spoke-data/beecoder-220M-python" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BEE-spoke-data/beecoder-220M-python",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "BEE-spoke-data/beecoder-220M-python" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BEE-spoke-data/beecoder-220M-python",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use BEE-spoke-data/beecoder-220M-python with Docker Model Runner:
```
docker model run hf.co/BEE-spoke-data/beecoder-220M-python
```

beecoder-220M-python / README.md

pszemraj

Super-squash branch 'main' using huggingface_hub

9c2dc09 verified 5 months ago

	---
	license: apache-2.0
	base_model: BEE-spoke-data/smol_llama-220M-GQA
	datasets:
	- BEE-spoke-data/pypi_clean-deduped
	- bigcode/the-stack-smol-xl
	- EleutherAI/proof-pile-2
	language:
	- en
	tags:
	- python
	- codegen
	- markdown
	- smol_llama
	metrics:
	- accuracy
	inference:
	  parameters:
	    max_new_tokens: 64
	    min_new_tokens: 8
	    do_sample: true
	    epsilon_cutoff: 0.0008
	    temperature: 0.3
	    top_p: 0.9
	    repetition_penalty: 1.02
	    no_repeat_ngram_size: 8
	    renormalize_logits: true
	widget:
	- text: |
	    def add_numbers(a, b):
	        return
	  example_title: Add Numbers Function
	- text: |
	    class Car:
	        def __init__(self, make, model):
	            self.make = make
	            self.model = model

	        def display_car(self):
	  example_title: Car Class
	- text: |
	    import pandas as pd
	    data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
	    df = pd.DataFrame(data).convert_dtypes()
	    # eda
	  example_title: Pandas DataFrame
	- text: |
	    def factorial(n):
	        if n == 0:
	            return 1
	        else:
	  example_title: Factorial Function
	- text: |
	    def fibonacci(n):
	        if n <= 0:
	            raise ValueError("Incorrect input")
	        elif n == 1:
	            return 0
	        elif n == 2:
	            return 1
	        else:
	  example_title: Fibonacci Function
	- text: |
	    import matplotlib.pyplot as plt
	    import numpy as np
	    x = np.linspace(0, 10, 100)
	    # simple plot
	  example_title: Matplotlib Plot
	- text: |
	    def reverse_string(s:str) -> str:
	        return
	  example_title: Reverse String Function
	- text: |
	    def is_palindrome(word:str) -> bool:
	        return
	  example_title: Palindrome Function
	- text: |
	    def bubble_sort(lst: list):
	        n = len(lst)
	        for i in range(n):
	            for j in range(0, n-i-1):
	  example_title: Bubble Sort Function
	- text: |
	    def binary_search(arr, low, high, x):
	        if high >= low:
	            mid = (high + low) // 2
	            if arr[mid] == x:
	                return mid
	            elif arr[mid] > x:
	  example_title: Binary Search Function
	pipeline_tag: text-generation
	---

	# BEE-spoke-data/beecoder-220M-python




	This is `BEE-spoke-data/smol_llama-220M-GQA` fine-tuned for code generation on:

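	- a filtered version of stack-smol-XL (`bigcode/the-stack-smol-xl`)
	- a deduplicated version of the 'algebraic stack' subset of proof-pile-2
	- cleaned and deduplicated PyPI code (`BEE-spoke-data/pypi_clean-deduped`), the last dataset trained on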

	This model and the base model were both trained with a context length of 2048 tokens.
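
Because of the 2048-token context length, a long prompt plus the generated tokens has to fit inside that window. A minimal sketch of one way to handle this is to keep only the most recent prompt tokens. The helper name `fit_to_context` is hypothetical, and left-truncation is an assumption about what suits code completion, not something the model card prescribes.

```python
from typing import List

CONTEXT_LENGTH = 2048  # training context length stated in the model card


def fit_to_context(
    token_ids: List[int],
    max_new_tokens: int = 64,
    context_length: int = CONTEXT_LENGTH,
) -> List[int]:
    """Keep the most recent tokens so that prompt + generation fits the window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than context_length")
    # Slicing from the end drops the oldest tokens; short prompts pass through unchanged.
    return token_ids[-budget:]
```

Here `token_ids` would typically be the `input_ids` list produced by the tokenizer. Keeping the tail of the prompt preserves the code nearest to the completion point.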

	## examples

	> Example script for inference testing: [here](https://gist.github.com/pszemraj/c7738f664a64b935a558974d23a7aa8c)

	At 220M parameters the model has clear limitations, but it seems decent for single-line completion and docstring generation, and as a draft model for speculative decoding on those tasks.
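
To illustrate the speculative decoding use case, here is a toy sketch of the greedy draft-and-verify loop. It is not this model's or any library's implementation: `toy_target` and `toy_draft` are hypothetical stand-ins for a large model and a small draft model, and each is reduced to a function that maps a token sequence to the next token id.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # token sequence -> next token id


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n_new: int, k: int = 4
) -> List[int]:
    """Greedy speculative decoding: the draft proposes, the target verifies."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1) The cheap draft model proposes up to k tokens.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(seq))):
            proposal.append(draft(seq + proposal))
        # 2) The target checks each position. A real system scores all
        #    positions in one batched forward pass; here it is a loop.
        for tok in proposal:
            expected = target(seq)
            seq.append(expected)  # always keep the target's choice
            if expected != tok:
                break  # first mismatch: discard the rest of the draft
    return seq


def toy_target(seq: List[int]) -> int:
    """Hypothetical stand-in for the large model."""
    return (seq[-1] * 3 + 1) % 7


def toy_draft(seq: List[int]) -> int:
    """Hypothetical stand-in for the draft model: agrees with the target most of the time."""
    return toy_target(seq) if len(seq) % 3 else 0
```

The output always matches greedy decoding with the target alone; the draft only affects how many target steps can be verified together. The bonus token that real implementations emit after a fully accepted draft is omitted for brevity.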



	![image/png](https://cdn-uploads.huggingface.co/production/uploads/60bccec062080d33f875cd0c/bLrtpr7Vi_MPvtF7mozDN.png)

	The screenshot above shows inference running on a laptop CPU.

	---