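FX Graph Mode Quantization
Post-training quantization comes in several types (weight-only, dynamic, and static). The type is selected through qconfig_mapping, an argument of the prepare_fx function.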
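As a minimal sketch of how a qconfig_mapping reaches prepare_fx, the snippet below applies post-training dynamic quantization to a toy model. TinyModel and example_inputs are hypothetical stand-ins introduced only for illustration, and the choice of default_dynamic_qconfig as the global qconfig is an assumption rather than the only valid setup. It also assumes a PyTorch build with a quantized backend available (for example fbgemm on x86).

```python
import copy

import torch
import torch.ao.quantization.quantize_fx as quantize_fx
from torch.ao.quantization import QConfigMapping, default_dynamic_qconfig


class TinyModel(torch.nn.Module):
    """Hypothetical stand-in for a user model: a single linear layer."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)


model_fp = TinyModel().eval()
example_inputs = (torch.randn(1, 4),)  # assumed input shape for the toy model

# The quantization type is chosen by the qconfig placed in the mapping;
# here a dynamic qconfig is applied to every supported module.
qconfig_mapping = QConfigMapping().set_global(default_dynamic_qconfig)

# Work on a copy so that model_fp stays a plain floating-point model.
model_prepared = quantize_fx.prepare_fx(
    copy.deepcopy(model_fp), qconfig_mapping, example_inputs
)

# Dynamic quantization needs no calibration pass, so convert directly.
model_quantized = quantize_fx.convert_fx(model_prepared)
output = model_quantized(*example_inputs)
```

Swapping the qconfig in the mapping is what changes the quantization type; the prepare_fx and convert_fx calls keep the same shape.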
FX PTQ API example:
import torch
from torch.ao.quantization import (
    get_default_qconfig_mapping,
    get_default_qat_qconfig_mapping,
    QConfigMapping,
)
import torch.ao.quantization.quantize_fx as quantize_fx
import copy
model_fp = UserModel()
#
# post training dynamic/weight_only quantization
#
# deepcopy the model first if model_fp must stay unchanged after quantization,
# because the quantization APIs modify the input model
model_