Hi @Carlofkl
I'd like to ask about the on-device deployment described in your paper.
1. Was your W8A8 model created using PTQ, QAT, quantization-aware distillation, or mixed precision?
2. Which layers/operations were kept at higher precision than W8A8?
- Could you share some details on how you did it?
I am interested in research on quantizing U-Nets, diffusion models, etc.