From RTX to Spark: NVIDIA Accelerates Gemma 4 for Local Agentic AI
["google-gemma","nvidia","ai-pc","qwen3"]
Qwen3 is a family of large language models developed by Alibaba for natural-language tasks.
…The ROCm vLLM backend running Qwen/Qwen3.5-122B-A10B-FP8; a vLLM Semantic Router instance in front of that backend; a reference routing profile from deploy/recipes/balance.yaml; a dashboard for…
…This includes a variety of advanced AI models, among them Kimi-K2 Thinking, DeepSeek-V3.2, Mistral Large 3, Meta Llama 4 Maverick, Qwen3, and OpenAI gpt-oss-120b. “NVIDIA GB300 is typically…
["qwen3"]
Qwen3.6-35B-A3B speculative decoding is net-negative on RTX 3090
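The headline above reports speculative decoding losing throughput. The standard expected-speedup model from Leviathan et al. (2023) shows how that can happen when the draft model's per-step cost is too high relative to its acceptance rate. A minimal sketch; the parameter values below are illustrative, not measurements from that post.

```python
def spec_decode_speedup(alpha: float, gamma: int, c: float) -> float:
    """Expected speedup of speculative decoding over plain autoregressive decoding.

    alpha: per-token acceptance rate of draft tokens (0 < alpha < 1)
    gamma: number of draft tokens proposed per target-model step
    c:     cost of one draft-model step relative to one target-model step
    (Formula from Leviathan et al., 2023.)
    """
    # Expected tokens accepted per iteration: geometric series in alpha.
    expected_tokens = (1 - alpha ** (gamma + 1)) / (1 - alpha)
    # Cost of one iteration: gamma draft steps plus one target verification step.
    cost_per_iteration = gamma * c + 1
    return expected_tokens / cost_per_iteration

# Illustrative numbers only: a cheap draft model (c = 0.2) helps,
# but the same acceptance rate with c = 0.5 is net-negative (< 1.0x).
print(spec_decode_speedup(alpha=0.6, gamma=4, c=0.2))
print(spec_decode_speedup(alpha=0.6, gamma=4, c=0.5))
```

The break-even point shifts with hardware: on a memory-bound consumer GPU like an RTX 3090, the draft model's relative cost `c` can be high enough that the speedup drops below 1.0, which is consistent with the "net-negative" finding in the headline.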
We got 207 tok/s with Qwen3.5-27B on an RTX 3090
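Throughput figures like the 207 tok/s above are conventionally computed as completion tokens divided by wall-clock decode time. A minimal sketch of that arithmetic; the numbers in the example are chosen to reproduce a 207 tok/s figure, not taken from the post.

```python
def decode_throughput(completion_tokens: int, start_s: float, end_s: float) -> float:
    """Tokens generated per second of wall-clock decode time."""
    elapsed = end_s - start_s
    if elapsed <= 0:
        raise ValueError("end time must be after start time")
    return completion_tokens / elapsed

# Example: 2070 tokens generated over a 10-second window.
print(decode_throughput(2070, start_s=0.0, end_s=10.0))  # -> 207.0
```

Note that prefill (prompt processing) time is often excluded from this number, so two reports of "tok/s" are only comparable if they measure the same window.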
https://w418ufqpha7gzj-80.proxy.runpod.net Started it for myself, but since I'm not using it continuously, I'm sharing it: Open Access Qwen3.6-35B-A3B-UD-Q5_K_M with TurboQuant (TheTom/llama-cpp-turboquant) on RTX 3090 (Runpod spo…
Burned about 20 hours of side-by-side compute on my two RTX PRO 6000 Blackwells trying to get a definitive answer on which of these two models was clearly better. As with many things in life, after many tokens and kWhs l…
Club-3090 Recipes for serving QWEN3.6 27B locally on RTX 3090s
…A “Chinese Communist Party Alignment” feature found in the Qwen3-8B and DeepSeek-R1-0528-Qwen3-8B models. This controls pro-government censorship and propaganda in these Chinese-developed models, and is…
…Muon training performance on NVIDIA GB300 NVL72 Table 1 summarizes training throughput of the Kimi K2 and Qwen3 30B models with Muon and the AdamW optimizer on the NVIDIA GB300 NVL72 system…
…The Qwen3-VL vision-language submission used the vLLM open source framework, showing how the community is rapidly building advanced multimodal optimizations to accelerate image-heavy inference workloads on the latest GPUs…
…robotic systems, NVIDIA Nemotron speech models are used for fast and accurate natural voice interactions. Qwen3 4B, served locally via vLLM, interprets requests and generates responses with low latency, no cloud link…
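The snippet above describes Qwen3 4B served locally via vLLM with no cloud link. vLLM's `vllm serve` command exposes an OpenAI-compatible endpoint (by default `/v1/chat/completions` on port 8000), so a local client only needs to assemble a standard chat payload. A hedged sketch; the model ID, prompt, and sampling values are illustrative assumptions, not the configuration from the article.

```python
def build_chat_request(model: str, user_text: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat-completions payload for a local vLLM server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,
        "temperature": 0.7,  # illustrative default, not from the article
    }

# Illustrative request; POST this as JSON to
# http://localhost:8000/v1/chat/completions on the machine running vLLM.
payload = build_chat_request("Qwen/Qwen3-4B", "Pick up the red block.")
print(payload["model"])
```

Because the request never leaves the local network, latency is bounded by model decode speed rather than a round trip to a hosted API, which is the low-latency, no-cloud-link property the snippet highlights.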