What is Pruning?
AI Infrastructure
Removing unnecessary weights or neurons from neural networks to reduce model size without significant accuracy loss.
Pruning identifies and removes connections in neural networks that contribute little to model output. Combined with quantization and distillation, it enables deploying large models on resource-constrained devices.
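As a rough illustration of removing low-contribution connections, the sketch below applies magnitude-based pruning, one common criterion in which the weights with the smallest absolute values are set to zero. The entry above does not name a specific criterion, so the choice of magnitude as the importance measure, the function name `magnitude_prune`, the `sparsity` parameter, and the toy `layer` values are all assumptions made for this example.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    weights  -- a layer's weight matrix as a list of rows (lists of floats)
    sparsity -- fraction of weights to remove, between 0.0 and 1.0
    (Both names are hypothetical, chosen for this illustration.)
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    # Collect (magnitude, row, column) for every weight in the layer.
    entries = [
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    ]

    # Number of connections to remove, rounded down.
    n_prune = int(len(entries) * sparsity)

    # Sort by magnitude so the least important weights come first.
    entries.sort(key=lambda e: e[0])

    # Copy the matrix so the original layer is left untouched.
    pruned = [list(row) for row in weights]
    for _, r, c in entries[:n_prune]:
        pruned[r][c] = 0.0
    return pruned


# Toy 2x3 layer (made-up values); prune half of its six weights.
layer = [[0.8, -0.05, 0.3],
         [0.01, -0.9, 0.2]]
pruned_layer = magnitude_prune(layer, 0.5)
print(pruned_layer)
```

Zeroing weights this way yields a sparse matrix of the same shape. Actual savings in size or latency depend on storing the result in a sparse format or on hardware and kernels that exploit sparsity, and in practice a pruned model is usually fine-tuned afterwards to recover accuracy.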