
ztz

aabbccddwasd


AI & ML interests

LLM

Recent Activity

new activity 3 days ago

deepseek-ai/DeepSeek-V4-Flash-Base: why only 13b active on the flash?

new activity 8 days ago

cyberneurova/CyberNeurova-DeepSeek-V4-Flash-abliterated-GGUF: BF16 or FP8/FP4 mixed .safetensors version?

new activity about 2 months ago

Qwen/Qwen3.5-397B-A17B: When I ask it who it is, it occasionally says it was trained by Google


Organizations

None yet

New activity in deepseek-ai/DeepSeek-V4-Flash-Base 3 days ago

why only 13b active on the flash?

🤯😔 3

#2 opened 25 days ago by

szilard995

New activity in cyberneurova/CyberNeurova-DeepSeek-V4-Flash-abliterated-GGUF 8 days ago

BF16 or FP8/FP4 mixed .safetensors version?

#3 opened 8 days ago by

aabbccddwasd

New activity in Qwen/Qwen3.5-397B-A17B about 2 months ago

When I ask it who it is, it occasionally says it was trained by Google

👀 1

#62 opened 2 months ago by

Zhoudaxia2024

New activity in nvidia/Qwen3.5-397B-A17B-NVFP4 3 months ago

Support SM120

❤️👍 19

#2 opened 3 months ago by

darkstar3537

New activity in vincentzed-hf/Qwen3.5-397B-A17B-NVFP4 3 months ago

Anyone try this on 4x RTX 6000 Pro yet?

#1 opened 3 months ago by

zenmagnets

New activity in Sehyo/Qwen3.5-397B-A17B-NVFP4 3 months ago

not working on vllm

👍 1

#1 opened 3 months ago by

aabbccddwasd

missing think tag

#2 opened 3 months ago by

fouvy

New activity in Qwen/Qwen3.5-397B-A17B 3 months ago

OOM when deploying with vLLM

#22 opened 3 months ago by

Chris2me

New activity in RedHatAI/Qwen3.5-397B-A17B-FP8-dynamic 3 months ago

W4A16 quant

👍 2

#1 opened 3 months ago by

timroethig

New activity in Qwen/Qwen3.5-397B-A17B 3 months ago

fake knowledge

#21 opened 3 months ago by

aabbccddwasd

New activity in zai-org/GLM-4.7 5 months ago

we still need some Air

🚀👍 65

#1 opened 5 months ago by

jacek2024

New activity in QuantTrio/DeepSeek-V3.2-AWQ 5 months ago

Aww Man!

#1 opened 6 months ago by

mtcl

New activity in OpenGVLab/InternVL3_5-241B-A28B 9 months ago

FP8 and FP4 please?

➕ 1

#4 opened 9 months ago by

aabbccddwasd

New activity in JimmyFoxx/Qwen3-30B-A3B-Instruct-2507-SAT-FP8-Dynamic 10 months ago

what's the difference between https://huggingface.co/JimmyFoxx/Qwen3-30B-A3B-Instruct-2507-FP8-Dynamic/tree/main

#1 opened 10 months ago by

aabbccddwasd

New activity in Qwen/Qwen3-30B-A3B-Instruct-2507 10 months ago

An Improvement, But Q3 30b Still Has Very Little General Knowledge

👍❤️ 3

#2 opened 10 months ago by

phil111

New activity in Qwen/Qwen3-235B-A22B-Instruct-2507 10 months ago

🚀[Fine-tuning] 8x80GiB GPUs LoRA finetuning Qwen3-235B-A22B-Instruct-2507

🤗 4

#25 opened 10 months ago by

study-hjt

New activity in deepseek-ai/DeepSeek-R1-0528 12 months ago

Just deployed the full DeepSeek R1 0528 version; is the inference performance really improved this much? Wasn't the architecture unchanged?

#75 opened 12 months ago by

jakyer

New activity in nvidia/DeepSeek-R1-NVFP4 12 months ago

quantize deepseek-r1-0528 please

👍 2

#14 opened 12 months ago by

aabbccddwasd

New activity in QuantTrio/DeepSeek-R1-0528-GPTQ-Int4-Int8Mix-Compact 12 months ago

benchmark please?

👍 1

#1 opened 12 months ago by

aabbccddwasd

New activity in Qwen/Qwen3-235B-A22B about 1 year ago

In complex reasoning tasks Qwen3 is far behind QwQ

#32 opened about 1 year ago by

AdamF92
