ztz
aabbccddwasd
AI & ML interests
LLM
Recent Activity
new activity 3 days ago
deepseek-ai/DeepSeek-V4-Flash-Base: why only 13b active on the flash?
new activity about 2 months ago
Qwen/Qwen3.5-397B-A17B: I asked it who it is, and with a small probability it says it was trained by Google
Organizations
None yet
why only 13b active on the flash?
🤯😔 3
4
#2 opened 25 days ago
by
szilard995
BF16 or FP8/FP4 mixed .safetensors version?
1
#3 opened 8 days ago
by
aabbccddwasd
I asked it who it is, and with a small probability it says it was trained by Google
👀 1
3
#62 opened 2 months ago
by
Zhoudaxia2024
Support SM120
❤️👍 19
6
#2 opened 3 months ago
by
darkstar3537
Anyone try this on 4x RTX 6000 Pro yet?
52
#1 opened 3 months ago
by
zenmagnets
not working on vllm
👍 1
11
#1 opened 3 months ago
by
aabbccddwasd
missing think tag
9
#2 opened 3 months ago
by
fouvy
OOM when deploying with vLLM
13
#22 opened 3 months ago
by
Chris2me
W4A16 quant
👍 2
5
#1 opened 3 months ago
by
timroethig
fake knowledge
6
#21 opened 3 months ago
by
aabbccddwasd
we still need some Air
🚀👍 65
15
#1 opened 5 months ago
by
jacek2024
Aww Man!
20
#1 opened 6 months ago
by
mtcl
FP8 and FP4 please?
➕ 1
#4 opened 9 months ago
by
aabbccddwasd
An Improvement, But Q3 30b Still Has Very Little General Knowledge
👍❤️ 3
11
#2 opened 10 months ago
by
phil111
🚀[Fine-tuning] 8x80GiB GPUs LoRA finetuning Qwen3-235B-A22B-Instruct-2507
🤗 4
1
#25 opened 10 months ago
by
study-hjt
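Just deployed the full DeepSeek R1 0528 version. Is the inference performance really improved this much? Didn't the architecture stay the same?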
12
#75 opened 12 months ago
by
jakyer
quantize deepseek-r1-0528 please
👍 2
3
#14 opened 12 months ago
by
aabbccddwasd
benchmark please?
👍 1
#1 opened 12 months ago
by
aabbccddwasd
In complex reasoning tasks Qwen3 is far behind QwQ
12
#32 opened about 1 year ago
by
AdamF92