File size: 2,359 Bytes
{
  "architectures": [
    "TorchMultiOmicsModel"
  ],
  "auto_map": {
    "AutoConfig": "chatNT.ChatNTConfig",
    "AutoModel": "chatNT.TorchMultiOmicsModel"
  },
  "custom_pipelines": {
    "ChatNT-text-generation": {
      "impl": "text_generation.TextGenerationPipeline",
      "pt": [
        "AutoModel"
      ],
      "tf": []
    }
  },
  "bio_pad_token_id": 1,
  "english_pad_token_id": 2,
  "gpt_config": {
    "add_bias_attn": false,
    "add_bias_ffn": false,
    "add_bias_lm_head": false,
    "embed_dim": 4096,
    "eos_token_id": 2,
    "ffn_activation_name": "silu",
    "ffn_embed_dim": 11008,
    "norm_type": "RMS_norm",
    "num_heads": 32,
    "num_kv_heads": 32,
    "num_layers": 32,
    "parallel_attention_ff": false,
    "rms_norm_eps": 1e-06,
    "rope_config": {
      "dim": 128,
      "max_seq_len": 2048,
      "theta": 10000.0
    },
    "use_glu_in_ffn": true,
    "use_gradient_checkpointing": false,
    "vocab_size": 32000
  },
  "model_type": "ChatNT",
  "nt_config": {
    "add_bias_ffn": false,
    "add_bias_kv": false,
    "alphabet_size": 4107,
    "attention_heads": 16,
    "attention_maps_to_save": [],
    "bias_word_embedding": false,
    "emb_layer_norm_before": false,
    "embed_dim": 1024,
    "embed_scale": 1.0,
    "embeddings_layers_to_save": [
      21
    ],
    "ffn_activation_name": "swish",
    "ffn_embed_dim": 4096,
    "key_size": 64,
    "layer_norm_eps": 1e-05,
    "lm_head": "roberta",
    "mask_before_attention": false,
    "mask_token_id": 2,
    "masking_prob": 0.0,
    "masking_ratio": 0.0,
    "max_positions": 2048,
    "num_layers": 29,
    "pad_token_id": 1,
    "positional_embedding": null,
    "pre_layer_norm": true,
    "rescaling_factor": null,
    "token_dropout": false,
    "use_glu_in_ffn": true,
    "use_gradient_checkpointing": false,
    "use_rotary_embedding": true
  },
  "perceiver_resampler_config": {
    "add_bias_ffn": true,
    "add_bias_kv": false,
    "attention_heads": 32,
    "emb_layer_norm_before": false,
    "embed_dim": 4096,
    "ffn_activation_name": "gelu-no-approx",
    "ffn_embed_dim": 11008,
    "key_size": 128,
    "num_layers": 3,
    "resampled_length": 64,
    "use_glu_in_ffn": false,
    "use_gradient_checkpointing": false
  },
  "seq_token_id": 32000,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.1"
}
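The dimensions in the three sub-configs are related, and a short script can make those relations explicit. The sketch below copies a subset of the values above into a Python dict, checks that head counts and per-head sizes multiply out to the embedding widths, and gives a rough parameter estimate for the `gpt_config` decoder. The helper names (`head_dims`, `estimate_gpt_params`) are hypothetical and not part of the ChatNT code. The estimate assumes a Llama-style decoder layout (bias-free attention and GLU feed-forward as the config flags indicate, two scale-only RMS norms per layer, one final norm, and an output head that is not tied to the input embedding); the actual `TorchMultiOmicsModel` implementation may differ.

```python
# Subset of the config above, copied verbatim; only the fields used below.
config = {
    "gpt_config": {
        "embed_dim": 4096,
        "ffn_embed_dim": 11008,
        "num_heads": 32,
        "num_kv_heads": 32,
        "num_layers": 32,
        "vocab_size": 32000,
        "rope_config": {"dim": 128},
    },
    "nt_config": {"embed_dim": 1024, "attention_heads": 16, "key_size": 64},
    "perceiver_resampler_config": {
        "embed_dim": 4096,
        "attention_heads": 32,
        "key_size": 128,
    },
}


def head_dims(cfg):
    """Return the per-head width implied by each sub-config."""
    gpt = cfg["gpt_config"]
    return {
        # The decoder has no explicit key_size, so derive it from embed_dim.
        "gpt": gpt["embed_dim"] // gpt["num_heads"],
        "nt": cfg["nt_config"]["key_size"],
        "perceiver": cfg["perceiver_resampler_config"]["key_size"],
    }


def estimate_gpt_params(gpt):
    """Rough decoder parameter count under the ASSUMED Llama-style layout."""
    d = gpt["embed_dim"]
    f = gpt["ffn_embed_dim"]
    v = gpt["vocab_size"]
    kv_dim = d // gpt["num_heads"] * gpt["num_kv_heads"]
    attention = 2 * d * d + 2 * d * kv_dim  # query/output + key/value projections
    feed_forward = 3 * d * f                # GLU: gate, up and down projections
    norms = 2 * d                           # two RMS norm scales per layer
    per_layer = attention + feed_forward + norms
    # Token embedding + layers + final norm + untied output head (assumption).
    return v * d + gpt["num_layers"] * per_layer + d + v * d


dims = head_dims(config)
nt = config["nt_config"]
perceiver = config["perceiver_resampler_config"]

# Heads times per-head width should reproduce each embedding width.
assert dims["gpt"] == config["gpt_config"]["rope_config"]["dim"]
assert nt["attention_heads"] * nt["key_size"] == nt["embed_dim"]
assert perceiver["attention_heads"] * perceiver["key_size"] == perceiver["embed_dim"]

print(dims)
print(f"{estimate_gpt_params(config['gpt_config']):,}")
```

Under these assumptions the decoder comes out at roughly 6.7 billion parameters. The perceiver resampler shares the decoder's `embed_dim` of 4096, while the `nt_config` encoder is narrower at 1024.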