Remove quant_mode #438

Open
apsonawane wants to merge 1 commit into main from asonawane/qwen-recipes
@apsonawane
Contributor

This pull request updates the quantization configuration in the text.json files for several Qwen3.5 model sizes and hardware backends. The main change is the removal of the quant_mode option from the extra_options section. The 0.8B configs also gain an explicit int4_algo_config setting.

Quantization configuration updates:

  • Removed the quant_mode parameter from the extra_options in all affected text.json files for the 0.8B, 2B, 4B, 9B, and 27B models, across CPU, CUDA, and WebGPU backends. This simplifies the quantization settings and avoids redundancy.

Algorithm configuration enhancements (0.8B models):

  • Added "int4_algo_config": "k_quant_linear" to the 0.8B model configs for CPU, CUDA, and WebGPU, making the quantization algorithm explicit.

Minor cleanup:

  • Cleaned up trailing commas and ensured consistent formatting in the extra_options sections where options were removed.

These changes make the quantization settings more consistent and explicit, reducing potential confusion or misconfiguration.
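As an illustration of the edit described above, here is a minimal Python sketch that applies the same change to a config held in memory. It is not the tooling used for this PR. The nesting around extra_options (the `passes` and `builder` keys) and the `some_other_option` entry are hypothetical, so the helper searches for extra_options at any depth rather than assuming a fixed layout.

```python
import json


def find_extra_options(node):
    """Yield every dict stored under an "extra_options" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "extra_options" and isinstance(value, dict):
                yield value
            else:
                yield from find_extra_options(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_extra_options(item)


def update_config(config, int4_algo=None):
    """Drop quant_mode and optionally set int4_algo_config, in place."""
    for options in find_extra_options(config):
        options.pop("quant_mode", None)  # no error if the key is already gone
        if int4_algo is not None:
            options["int4_algo_config"] = int4_algo
    return config


# Hypothetical 0.8B-style config; only the extra_options keys come from the PR.
before = {
    "passes": {
        "builder": {
            "extra_options": {"quant_mode": "default", "some_other_option": True}
        }
    }
}
after = update_config(before, int4_algo="k_quant_linear")
print(json.dumps(after, indent=2))
```

For the 2B, 4B, 9B, and 27B configs the same helper would be called without `int4_algo`, which only removes quant_mode.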

Copilot AI review requested due to automatic review settings May 29, 2026 18:19
Contributor

Copilot AI left a comment

Pull request overview

Removes the quant_mode extra option from Qwen3.5 model text.json configs across CPU, CUDA, and WebGPU backends, and makes the int4 algorithm explicit for the 0.8B variants.

Changes:

  • Remove quant_mode from extra_options in all affected text.json files (0.8B, 2B, 4B, 9B, 27B).
  • Add "int4_algo_config": "k_quant_linear" to all three 0.8B backend configs.
  • Clean up trailing commas after option removal.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| Qwen-Qwen3.5-9B/builtin/webgpu/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-9B/builtin/cuda/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-9B/builtin/cpu_and_mobile/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-4B/builtin/webgpu/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-4B/builtin/cuda/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-4B/builtin/cpu_and_mobile/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-2B/builtin/webgpu/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-2B/builtin/cpu_and_mobile/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-27B/builtin/cuda/text.json | Drop quant_mode: int4 from extra_options. |
| Qwen-Qwen3.5-0.8B/builtin/webgpu/text.json | Add int4_algo_config: k_quant_linear; drop quant_mode: default. |
| Qwen-Qwen3.5-0.8B/builtin/cuda/text.json | Add int4_algo_config: k_quant_linear; drop quant_mode: default. |
| Qwen-Qwen3.5-0.8B/builtin/cpu_and_mobile/text.json | Add int4_algo_config: k_quant_linear; drop quant_mode: default. |
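
A reviewer who wants to confirm the per-file summary above could run a check like the following Python sketch. It is an assumption-laden illustration, not part of the PR: it treats any path containing "0.8B" as a 0.8B config, searches for extra_options at any depth because the surrounding JSON layout is not shown here, and the demo builds throwaway sample files instead of reading the real repository.

```python
import json
import tempfile
from pathlib import Path


def iter_extra_options(node):
    """Yield every dict stored under an "extra_options" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "extra_options" and isinstance(value, dict):
                yield value
            else:
                yield from iter_extra_options(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_extra_options(item)


def audit(root):
    """Return {relative_path: [problems]} for each text.json under root."""
    base = Path(root)
    report = {}
    for path in sorted(base.rglob("text.json")):
        config = json.loads(path.read_text(encoding="utf-8"))
        rel = path.relative_to(base).as_posix()
        problems = []
        for options in iter_extra_options(config):
            if "quant_mode" in options:
                problems.append("quant_mode still present")
            # Assumption: the model size is recognisable from the path.
            if "0.8B" in rel and "int4_algo_config" not in options:
                problems.append("int4_algo_config missing")
        if problems:
            report[rel] = problems
    return report


# Demo on temporary sample files (contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "Qwen-Qwen3.5-9B/builtin/cuda/text.json":
            {"extra_options": {"quant_mode": "int4"}},
        "Qwen-Qwen3.5-0.8B/builtin/cuda/text.json":
            {"extra_options": {"int4_algo_config": "k_quant_linear"}},
    }
    for rel, cfg in samples.items():
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True)
        target.write_text(json.dumps(cfg), encoding="utf-8")
    report = audit(tmp)

print(report)
```

An empty report would mean every config found matches the state this PR describes; in the demo, only the sample 9B file is flagged because it still carries quant_mode.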
