What is LoRA (Low-Rank Adaptation)?
AI EngineeringLast updated:
A parameter-efficient fine-tuning method that trains small adapter matrices instead of updating all model weights.
LoRA freezes the original model weights and injects small trainable low-rank matrices into transformer layers. This reduces GPU memory requirements by 10x or more while preserving fine-tuning quality, making it practical to customize large models on consumer hardware.