move float8 blockwise kernels out of prototype #3256

vkuzo · 2025-10-29T11:05:21Z

Summary:

These kernels will serve as a fallback path for a_1_128_w_128_128 (DeepSeek)
scaling support in float8 inference, so this PR moves them out of the prototype folder.
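For readers unfamiliar with the naming, here is a minimal pure-Python sketch of what a_1_128_w_128_128 scaling is understood to mean: one scale per 1x128 tile of the activation and one scale per 128x128 tile of the weight. This is an illustration under stated assumptions, not the Triton kernels moved by this PR. The helper name `blockwise_scales`, the zero-tile guard, and the absmax-over-fp8-max scale formula are assumptions made for the example.

```python
# Illustrative sketch only; NOT the torchao Triton kernels.
# Assumption: scales are absmax / fp8_max, with float8_e4m3fn as the target dtype.
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn


def blockwise_scales(matrix, block_rows, block_cols):
    """Return a grid with one scale per (block_rows x block_cols) tile."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    assert n_rows % block_rows == 0 and n_cols % block_cols == 0
    scales = []
    for r0 in range(0, n_rows, block_rows):
        row_scales = []
        for c0 in range(0, n_cols, block_cols):
            absmax = max(
                abs(matrix[r][c])
                for r in range(r0, r0 + block_rows)
                for c in range(c0, c0 + block_cols)
            )
            # Guard against all-zero tiles so the scale is never zero.
            row_scales.append(max(absmax, 1e-12) / FP8_E4M3_MAX)
        scales.append(row_scales)
    return scales


# Activation of shape (2 tokens, 256 features), scaled in 1x128 tiles.
activation = [[float((i + j) % 7 - 3) for j in range(256)] for i in range(2)]
act_scales = blockwise_scales(activation, 1, 128)

# Weight of shape (256, 256), scaled in 128x128 tiles.
weight = [[float((i * j) % 5 - 2) for j in range(256)] for i in range(256)]
w_scales = blockwise_scales(weight, 128, 128)
```

With these shapes, both scale grids come out as 2x2: the activation gets one scale per token per 128-feature group, and the weight gets one scale per 128x128 block.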

Test Plan:

pytest test/kernel/test_blockwise_triton.py -s -x
python benchmarks/benchmark_blockwise_scaled_linear_triton.py


vkuzo · 2025-10-29T11:05:23Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2025-10-29T11:05:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3256

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Long queue for ROCM runners, also B200 and XPU queueing is observed

This comment was automatically generated by Dr. CI and updates every 15 minutes.


danielvegamyhre · 2025-10-29T16:04:15Z

test/kernel/test_blockwise_triton.py

 triton = pytest.importorskip("triton", reason="Triton required to run this test")

-from torchao.prototype.blockwise_fp8_inference.blockwise_quantization import (
+from torchao.kernel.blockwise_quantization import (


note for later: let's rename this torchao.kernels* (plural)
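For downstream code that has to run against releases from both before and after this move, one option is to try the new module path first and fall back to the old one. The sketch below is a hedged illustration: `import_first_available` is a hypothetical helper, not part of torchao, and whether the old prototype path remains importable after this PR is not stated here.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that imports cleanly, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


# Module paths are taken from the diff above; the helper itself is hypothetical.
blockwise = import_first_available(
    [
        "torchao.kernel.blockwise_quantization",  # new location in this PR
        "torchao.prototype.blockwise_fp8_inference.blockwise_quantization",  # old location
    ]
)
```

If neither path can be imported (for example, torchao or Triton is not installed), `blockwise` is `None`, so callers should check for that before using it.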


vkuzo added 2 commits October 29, 2025 04:05

Update

990ef89


Update

cce08f0


This was referenced Oct 29, 2025

properly skip float8 inference tests without fbgemm #3255

Merged

add a_1_128_w_128_128 (DeepSeek) float8 scaling for inference #3257

Merged

meta-cla bot added the CLA Signed label Oct 29, 2025

vkuzo requested review from danielvegamyhre and jerryzh168 October 29, 2025 11:05

vkuzo added the topic: improvement label Oct 29, 2025

Update

f76e10b


danielvegamyhre approved these changes Oct 29, 2025


vkuzo mentioned this pull request Oct 29, 2025

add bias handling for a_1_128_w_128_128 float8 scaling #3259

Merged

Update

57b8876


vkuzo changed the base branch from gh/vkuzo/156/head to main October 30, 2025 11:17

vkuzo merged commit 1e473ed into main Oct 30, 2025
36 of 45 checks passed

namgyu-youn pushed a commit to namgyu-youn/ao that referenced this pull request Nov 21, 2025

move float8 blockwise kernels out of prototype (pytorch#3256)

c170f2a



3 participants
