
Conversation

@jerryzh168 jerryzh168 (Contributor) commented Nov 18, 2025

Summary:
As titled: we want to make sure the outputs of

`F.conv3d(input, weight, ...)` and `F.conv3d(input, fp8_weight, ...)` have the same memory_format.
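For illustration, a minimal sketch of the invariant (plain PyTorch; the quantized path and the name `fp8_out` are hypothetical stand-ins, not torchao API):

```python
import torch
import torch.nn.functional as F

# channels_last_3d input, default-contiguous weight
x = torch.randn(2, 8, 4, 16, 16).to(memory_format=torch.channels_last_3d)
w = torch.randn(16, 8, 3, 3, 3)

hp_out = F.conv3d(x, w, padding=1)

# The invariant this PR enforces: an fp8_out produced by the quantized path
# (hypothetical here) should report the same memory format as hp_out, i.e.
#   fp8_out.is_contiguous(memory_format=torch.channels_last_3d)
#       == hp_out.is_contiguous(memory_format=torch.channels_last_3d)
print(hp_out.is_contiguous(memory_format=torch.channels_last_3d))
```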

Test Plan:
python test/quantization/quantize_/workflows/float8/test_float8_tensor.py -k test_fp8_conv_variants

Reviewers:

Subscribers:

Tasks:

Tags:

@pytorch-bot bot commented Nov 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3352

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fe4167f with merge base ff0e461:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the `CLA Signed` label Nov 18, 2025
@jerryzh168 jerryzh168 changed the title from "Align memory_format for conv2d and conv3d in Float8Tensor with high p…" to "Align memory_format for conv2d/3d in Float8Tensor with hp Tensor" Nov 18, 2025
@jerryzh168 jerryzh168 added the `topic: improvement` label Nov 18, 2025
```python
# output should use channels_last format as long as any of the
# input or weight is channels_last
if is_input_channels_last or is_weight_channels_last:
    output = output.to(memory_format=torch.channels_last_3d)
```
@jbschlosser jbschlosser commented Nov 18, 2025

I think this is the right thing to do semantics-wise, but note that this will incur a copy if the output isn't already in channels_last. Ideally, the kernel itself would output into channels_last directly to avoid the copy.

Edit: oh I think you're already aware of this :)

@jerryzh168 jerryzh168 (Contributor, Author) replied:

yes, the fbgemm kernel should output a tensor that's already in this format, so this becomes a no-op when either input or weight is in channels_last format
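A small sketch of the no-op vs. copy behavior discussed above (plain PyTorch; relies on `Tensor.to` returning `self` when no conversion is needed):

```python
import torch

x = torch.randn(2, 4, 3, 8, 8).to(memory_format=torch.channels_last_3d)
y = x.to(memory_format=torch.channels_last_3d)
print(y.data_ptr() == x.data_ptr())  # True: already channels_last_3d, no copy

z = torch.randn(2, 4, 3, 8, 8)       # default contiguous (N, C, D, H, W)
w = z.to(memory_format=torch.channels_last_3d)
print(w.data_ptr() == z.data_ptr())  # False: the layout change incurs a copy
```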

Comment on lines 555 to 556
```python
act_qdata = act_qdata.contiguous()
weight_qdata = weight_qdata.contiguous()
```
@jbschlosser jbschlosser commented Nov 18, 2025

I don't think these `contiguous()` calls are the right thing to do here - note that this will clobber channels_last for the activation and weight. Calling `contiguous(memory_format=torch.channels_last)` would be more correct.

Edit: from offline discussion, we can't forget the `permute()`! We want a contiguous (N, D, H, W, C_in) tensor, which is equivalent to a properly-permuted, `contiguous(channels_last)` (N, C_in, D, H, W) tensor.

@jerryzh168 jerryzh168 (Contributor, Author) commented Nov 19, 2025

Refactored this to first call `contiguous(memory_format=torch.channels_last_3d)` and then `permute()`, to make it easier to follow.
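A minimal sketch of the equivalence behind that refactor (plain PyTorch; shapes chosen arbitrarily):

```python
import torch

x = torch.randn(2, 4, 3, 8, 8)  # (N, C, D, H, W)

# Route A: permute to (N, D, H, W, C), then force a row-major copy.
a = x.permute(0, 2, 3, 4, 1).contiguous()

# Route B (the refactor): make the tensor channels_last_3d first, then
# permute. The permuted result is already row-major contiguous, so it
# matches Route A in both values and layout.
b = x.contiguous(memory_format=torch.channels_last_3d).permute(0, 2, 3, 4, 1)

print(b.is_contiguous())  # True
print(torch.equal(a, b))  # True
```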

@jerryzh168 jerryzh168 force-pushed the conv-memory-format branch 2 times, most recently from 58bf5d1 to 642e3d0 on November 19, 2025 at 00:45
@jerryzh168 jerryzh168 merged commit 4f5bc7a into main Nov 19, 2025
20 checks passed
namgyu-youn pushed a commit to namgyu-youn/ao that referenced this pull request Nov 21, 2025: "Align memory_format for conv2d and conv3d in Float8Tensor with high precision Tensors (pytorch#3352)"