FrameworkInfo Class

The following API can be used to pass framework-related information that MCT uses when optimizing the network.

class model_compression_toolkit.FrameworkInfo(kernel_ops, activation_ops, no_quantization_ops, activation_quantizer_mapping, weights_quantizer_mapping, kernel_channels_mapping, activation_min_max_mapping, layer_min_max_mapping)

A class to wrap all information about a specific framework that the library needs in order to quantize a model. Specifically, FrameworkInfo holds lists of layers grouped by how they should be quantized, and multiple mappings, such as a mapping from a layer to its kernel channel indices and from a layer to its min/max output values. The layer lists are divided into three groups:

  • kernel_ops: Layers that have coefficients and need to get quantized (e.g., Conv2D, Dense, etc.)

  • activation_ops: Layers whose outputs should get quantized (e.g., Add, ReLU, etc.)

  • no_quantization_ops: Layers that should not get quantized (e.g., Reshape, Transpose, etc.)

Parameters
  • kernel_ops (list) – A list of operators that are in the kernel_ops group.

  • activation_ops (list) – A list of operators that are in the activation_ops group.

  • no_quantization_ops (list) – A list of operators that are in the no_quantization_ops group.

  • activation_quantizer_mapping (Dict[QuantizationMethod, Callable]) – A dictionary mapping from QuantizationMethod to a quantization function (a short sketch follows this list).

  • weights_quantizer_mapping (Dict[QuantizationMethod, Callable]) – A dictionary mapping from QuantizationMethod to a quantization function.

  • kernel_channels_mapping (DefaultDict) – Dictionary from a layer to a tuple of its kernel in/out channels indices.

  • activation_min_max_mapping (Dict[str, tuple]) – Dictionary from an activation function to its min/max output values.

  • layer_min_max_mapping (Dict[Any, tuple]) – Dictionary from a layer to its min/max output values.
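
The two quantizer mappings are passed as empty dictionaries in the examples below. As a minimal sketch of how one might be populated, assuming QuantizationMethod is importable from the top-level package (the exact import path varies across MCT versions) and using a hypothetical quantizer function my_power_of_two_quantizer:

>>> from model_compression_toolkit import QuantizationMethod  # import path may vary by MCT version
>>> def my_power_of_two_quantizer(tensor, n_bits):  # hypothetical quantizer; MCT ships its own
...     pass
>>> weights_quantizer_mapping = {QuantizationMethod.POWER_OF_TWO: my_power_of_two_quantizer}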

Examples

When quantizing a Keras model, if we want to quantize only the kernels of Conv2D layers, and we know the kernel's out/in channel indices are (3, 2) respectively, we can set:

>>> import tensorflow as tf
>>> from model_compression_toolkit import DefaultDict  # exact import path may vary between MCT versions
>>> kernel_ops = [tf.keras.layers.Conv2D]
>>> kernel_channels_mapping = DefaultDict({tf.keras.layers.Conv2D: (3, 2)})

Then, we can create a FrameworkInfo object:

>>> FrameworkInfo(kernel_ops, [], [], {}, {}, kernel_channels_mapping, {}, {})

and pass it to keras_post_training_quantization().
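
For instance, here is a minimal sketch, assuming a trained Keras model model and a representative dataset generator representative_data_gen are already defined (the exact keras_post_training_quantization signature may differ between MCT versions):

>>> import model_compression_toolkit as mct
>>> fw_info = mct.FrameworkInfo(kernel_ops, [], [], {}, {}, kernel_channels_mapping, {}, {})
>>> quantized_model, quantization_info = mct.keras_post_training_quantization(model, representative_data_gen, fw_info=fw_info)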

To quantize the activations of ReLU, we can create a new FrameworkInfo instance:

>>> activation_ops = [tf.keras.layers.ReLU]
>>> FrameworkInfo(kernel_ops, activation_ops, [], {}, {}, kernel_channels_mapping, {}, {})

If we don’t want to quantize a layer (e.g., Reshape), we can add it to the no_quantization_ops list:

>>> no_quantization_ops = [tf.keras.layers.Reshape]
>>> FrameworkInfo(kernel_ops, activation_ops, no_quantization_ops, {}, {}, kernel_channels_mapping, {}, {})

If an activation layer (tf.keras.layers.Activation) should be quantized and we know its min/max output range in advance, we can add it to activation_min_max_mapping to save statistics collection time. For example:

>>> activation_min_max_mapping = {'softmax': (0, 1)}
>>> FrameworkInfo(kernel_ops, activation_ops, no_quantization_ops, {}, {}, kernel_channels_mapping, activation_min_max_mapping, {})

If a layer’s activations should be quantized and we know its min/max output range in advance, we can add it to layer_min_max_mapping to save statistics collection time. For example:

>>> layer_min_max_mapping = {tf.keras.layers.Softmax: (0, 1)}
>>> FrameworkInfo(kernel_ops, activation_ops, no_quantization_ops, {}, {}, kernel_channels_mapping, activation_min_max_mapping, layer_min_max_mapping)