Model Optimization: Pruning
21. Pruning
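- This is a demo document for the Ironman contest article series, adapted from the official TensorFlow Lite example.
- Reference source for the TF Lite evaluation function.
- Pruning sets unimportant weights to zero, which noticeably shrinks the model once it is compressed.
- A model that is both pruned and quantized can shrink to about 1/10 of its original size.
- `prune_low_magnitude()` in the TensorFlow Model Optimization module zeroes out the low-impact weights of a Keras model during training.
- In this example you optimize the same baseline model that was used to demonstrate post-training quantization.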
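As a rough illustration of the idea behind magnitude-based pruning, the sketch below zeroes the smallest-magnitude entries of a plain Python list. The helper name `prune_by_magnitude` and the sample weights are made up for this example; the real `prune_low_magnitude()` works on Keras layers during training and is considerably more involved.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to 0.0.

    `sparsity` is the fraction of entries to zero out (0.0 to 1.0).
    Hypothetical helper for illustration, not part of tfmot.
    """
    num_to_prune = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned


# Made-up weights, only to show the effect.
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.02, 0.4, -0.1, 0.6, 0.03]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)
```

The large weights survive untouched, while the half with the smallest absolute values becomes exactly zero.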
# Dicts that collect model sizes and accuracies for the final comparison
MODEL_SIZE = {}
ACCURACY = {}
!pip install -q -U tensorflow_model_optimization
import tensorflow as tf
import tensorflow_model_optimization as tfmot
import numpy as np
import os
Build the baseline model
- The model is a CNN trained on `tf.keras.datasets.mnist`.
# Load MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images = test_images / 255.0
def model_builder():
    keras = tf.keras
    model = keras.Sequential([
        keras.layers.InputLayer(input_shape=(28, 28)),
        keras.layers.Reshape(target_shape=(28, 28, 1)),
        keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation='relu'),
        keras.layers.MaxPooling2D(pool_size=(2, 2)),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation='softmax')
    ])
    return model
baseline_model = model_builder()
baseline_model.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
baseline_model.summary()
baseline_model.save_weights('baseline_weights.h5')
baseline_model.fit(
train_images,
train_labels,
epochs=1,
shuffle=False
)
# Save the unpruned model
baseline_model.save('non_pruned.h5', include_optimizer=False)
# Evaluate the model and record its accuracy
_, ACCURACY['baseline Keras model'] = baseline_model.evaluate(test_images, test_labels)
# Record the model size
MODEL_SIZE['baseline h5'] = os.path.getsize('non_pruned.h5')
ACCURACY
MODEL_SIZE
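Fine-tune the model with pruning
- Apply pruning. Because the pruning API puts a wrapper around each layer, the summary shows more parameters than the baseline model.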
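To see where the extra parameters might come from, here is a back-of-the-envelope count for the layers defined in `model_builder()`. It rests on an assumption about the wrapper internals (one mask per kernel plus two scalars, and a single step counter for layers without weights), which may differ between tfmot versions, so treat the totals as an estimate and compare them with what `summary()` prints.

```python
# Layer shapes taken from model_builder() in this article.
conv_kernel = 3 * 3 * 1 * 12        # Conv2D: 3x3 kernel, 1 input channel, 12 filters
conv_bias = 12
dense_kernel = 13 * 13 * 12 * 10    # Flatten output (13*13*12) times 10 classes
dense_bias = 10

baseline_params = conv_kernel + conv_bias + dense_kernel + dense_bias

# ASSUMPTION: a wrapper around a layer with a kernel adds a mask of the
# kernel's shape plus two scalars (threshold, step counter); a wrapper around
# a layer without weights adds only the step counter.
extra_conv = conv_kernel + 2
extra_dense = dense_kernel + 2
extra_weightless = 3 * 1            # Reshape, MaxPooling2D, Flatten

wrapped_params = baseline_params + extra_conv + extra_dense + extra_weightless
print("baseline:", baseline_params, "wrapped:", wrapped_params)
```

Under these assumptions the wrapped model reports roughly twice as many parameters, and all of the extra ones are non-trainable bookkeeping.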
# Get the pruning method
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
# Compute end step to finish pruning after 2 epochs.
batch_size = 128
epochs = 2
validation_split = 0.1 # 10% of training set will be used for validation set.
num_images = train_images.shape[0] * (1 - validation_split)
end_step = np.ceil(num_images / batch_size).astype(np.int32) * epochs
# Define pruning schedule.
pruning_params = {
'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
initial_sparsity=0.50,
final_sparsity=0.80,
begin_step=0,
end_step=end_step)
}
# Pass in the trained baseline model
model_for_pruning = prune_low_magnitude(
baseline_model,
**pruning_params
)
# `prune_low_magnitude` requires a recompile.
model_for_pruning.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
model_for_pruning.summary()
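The `end_step` arithmetic and the shape of the sparsity schedule above can be sketched in plain Python. The function `polynomial_sparsity` is a hypothetical stand-in written for this article; it assumes a polynomial decay with a power of 3 and ignores how often the real `PolynomialDecay` schedule actually updates the masks.

```python
import math


def polynomial_sparsity(step, initial, final, begin_step, end_step, power=3):
    """Target sparsity at `step` under a polynomial-decay schedule (sketch)."""
    # Clamp so the schedule holds its end points outside [begin_step, end_step].
    step = min(max(step, begin_step), end_step)
    progress = (step - begin_step) / (end_step - begin_step)
    return final + (initial - final) * (1.0 - progress) ** power


# Same numbers as the training setup in this article.
num_train_images = 60000            # size of the MNIST training split
batch_size = 128
epochs = 2
validation_split = 0.1

steps_per_epoch = math.ceil(num_train_images * (1 - validation_split) / batch_size)
end_step = steps_per_epoch * epochs

for step in (0, end_step // 2, end_step):
    print(step, round(polynomial_sparsity(step, 0.50, 0.80, 0, end_step), 4))
```

With these settings the target starts at 50% sparsity and climbs quickly at first, then flattens out as it approaches 80% at the final step.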
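- Inspect the weights of one layer in the model.
- Before pruning, some of the weights are small in magnitude.
- After pruning, many of them will be zeroed out.
# Model weights before pruning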
model_for_pruning.weights[1]
- Retrain the model, adding `tfmot.sparsity.keras.UpdatePruningStep()` to the callbacks.
# Callback to update pruning wrappers at each step
callbacks=[tfmot.sparsity.keras.UpdatePruningStep()]
# Train and prune the model
model_for_pruning.fit(
train_images,
train_labels,
epochs=epochs,
validation_split=validation_split,
callbacks=callbacks
)
- After retraining, the model is pruned. Looking at the same layer again, many unimportant weights are now zero.
# Model weights after pruning
model_for_pruning.weights[1]
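Reading a printed tensor by eye only goes so far, so a small helper that reports the fraction of zeros can make the before/after comparison concrete. This is a minimal sketch: the `sparsity` function and the sample values are invented for illustration, and the comment about feeding it a real tensor is an assumption about how you would flatten the weights.

```python
def sparsity(values):
    """Fraction of entries in a flat sequence that are exactly zero."""
    values = list(values)
    if not values:
        return 0.0
    zeros = sum(1 for v in values if v == 0)
    return zeros / len(values)


# Stand-in values; with the model above one could pass something like
# model_for_pruning.weights[1].numpy().ravel() instead (assumption: that
# index holds the kernel you want to inspect).
example_kernel = [0.0, 0.42, 0.0, 0.0, -0.31, 0.0, 0.0, 0.0, 0.57, 0.0]
print(f"sparsity: {sparsity(example_kernel):.0%}")
```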
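Remove the wrappers after pruning
- After pruning, use `tfmot.sparsity.keras.strip_pruning()` to remove the wrappers, leaving a model with the same layers and parameters as the baseline.
- This step also makes it easier to save the model and export it to the `*.tflite` file format.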
# Remove pruning wrappers
model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)
model_for_export.summary()
- With the wrappers removed, the same weight tensor now sits at index [0].
model_for_export.weights[0]
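- Save the pruned model as `*.h5`. At this point the file is the same size as before pruning, but once the model is compressed the improvement is obvious.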
# Save Keras model
model_for_export.save('pruned_model.h5', include_optimizer=False)
# Get uncompressed model size of baseline and pruned models
MODEL_SIZE['pruned non quantized h5'] = os.path.getsize('pruned_model.h5')
MODEL_SIZE
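Compressing the model 3x
- Compress the pruned model.
- The compressed file is about 1/3 of the original size, because the weights that pruning set to zero compress far more efficiently.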
import tempfile
import zipfile
_, zipped_file = tempfile.mkstemp('.zip')
with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write('pruned_model.h5')
# Record the size of the zipped file, not the uncompressed .h5
MODEL_SIZE['pruned non quantized zipped h5'] = os.path.getsize(zipped_file)
MODEL_SIZE
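Why zeroed weights compress so well can be shown without TensorFlow at all. The sketch below is a toy experiment, not a measurement of the model above: it serializes made-up float32 "weights" with `struct`, zeroes the 80% with the smallest magnitude, and compares the DEFLATE-compressed sizes using `zlib`.

```python
import random
import struct
import zlib

random.seed(0)
n = 20000

# Dense "weights": random float32 values, which compress poorly.
dense = [random.gauss(0.0, 0.1) for _ in range(n)]

# Sparse "weights": zero everything below the 80th-percentile magnitude.
threshold = sorted(abs(w) for w in dense)[int(n * 0.8)]
sparse = [w if abs(w) >= threshold else 0.0 for w in dense]


def compressed_size(values):
    raw = struct.pack(f"<{len(values)}f", *values)  # serialize as float32
    return len(zlib.compress(raw, 9))


dense_size = compressed_size(dense)
sparse_size = compressed_size(sparse)
print(f"dense: {dense_size} bytes, sparse: {sparse_size} bytes")
```

Both lists occupy the same number of bytes before compression, which mirrors why the pruned `.h5` file is no smaller until it is zipped.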
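Compressing the model 10x
- Now quantize the pruned model.
- Quantization alone shrinks a model by about 4x. Pruning, quantizing, and compressing together make the model about 10x smaller than the baseline.
- Even at one-tenth the size, accuracy holds up.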
# Quantize the pruned model (use the stripped model, not the baseline)
converter = tf.lite.TFLiteConverter.from_keras_model(model_for_export)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open('pruned_quantized.tflite', 'wb') as f:
    f.write(tflite_model)
MODEL_SIZE['pruned quantized tflite'] = os.path.getsize('pruned_quantized.tflite')
MODEL_SIZE
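The "about 4x" figure comes from storing each weight in 1 byte instead of 4. The sketch below shows one simple way this can work, symmetric per-tensor quantization to 8-bit integers. It is a simplification written for this article with invented helper names, and the TF Lite converter's actual scheme may differ in its details.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Made-up pruned weights: zeros stay exactly zero after quantization.
weights = [0.9, 0.0, 0.3, 0.0, -0.7, 0.0, 0.4, 0.0, 0.6, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(quantized)
print(f"bytes as float32: {4 * len(weights)}, as int8: {len(weights)}")
print(f"largest round-trip error: {max_error:.5f}")
```

The round-trip error is bounded by half the scale, which is one intuition for why accuracy can survive the size reduction.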
- Even at one-tenth the size, accuracy stays close to the baseline level.
# A helper function to evaluate the TF Lite model using "test" dataset.
# from: https://www.tensorflow.org/lite/performance/post_training_integer_quant_16x8#evaluate_the_models
def evaluate_model(filename):
    # Load the model into the interpreter
    interpreter = tf.lite.Interpreter(model_path=str(filename))
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    # Run predictions on every image in the "test" dataset.
    prediction_digits = []
    for test_image in test_images:
        # Pre-processing: add batch dimension and convert to float32 to match with
        # the model's input data format.
        test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
        interpreter.set_tensor(input_index, test_image)
        # Run inference.
        interpreter.invoke()
        # Post-processing: remove batch dimension and find the digit with highest
        # probability.
        output = interpreter.tensor(output_index)
        digit = np.argmax(output()[0])
        prediction_digits.append(digit)
    # Compare prediction results with ground truth labels to calculate accuracy.
    accurate_count = 0
    for index in range(len(prediction_digits)):
        if prediction_digits[index] == test_labels[index]:
            accurate_count += 1
    accuracy = accurate_count * 1.0 / len(prediction_digits)
    return accuracy
# Get accuracy of pruned Keras and TF Lite models
_, ACCURACY['pruned model h5'] = model_for_pruning.evaluate(test_images, test_labels)
ACCURACY['pruned and quantized tflite'] = evaluate_model('pruned_quantized.tflite')
Results
ACCURACY
MODEL_SIZE
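Raw byte counts are hard to compare at a glance, so it can help to print each entry relative to the baseline. The helper `size_report` below is a hypothetical convenience for this article, and the byte counts in the example are placeholders, not measurements from the runs above.

```python
def size_report(model_size, baseline_key):
    """Return one line per entry with its size relative to the baseline."""
    baseline = model_size[baseline_key]
    lines = []
    for name, size in model_size.items():
        lines.append(f"{name}: {size} bytes ({baseline / size:.1f}x smaller than baseline)")
    return lines


# Placeholder byte counts for illustration only, NOT measurements.
example_sizes = {"baseline h5": 100000, "example compressed": 25000}
for line in size_report(example_sizes, "baseline h5"):
    print(line)
```

In the notebook, the same call would take the `MODEL_SIZE` dict and the key `'baseline h5'`.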
References
- Official TensorFlow Lite example.
- Reference source for the TF Lite evaluation function.