What are Evaluation Metrics?
AI Engineering
Measurements used to assess AI model quality, including accuracy, perplexity, and human preference.
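AI evaluation combines automated metrics with human evaluation. Common automated metrics include BLEU and ROUGE, which score n-gram overlap between model output and reference text, and perplexity, which measures how well a language model predicts held-out text (lower is better). For LLMs, human preference ratings and task-specific benchmarks are usually the most meaningful, because overlap-based scores often track perceived quality only loosely on open-ended tasks.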
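As a rough illustration of the automated side, the sketch below computes accuracy, perplexity, and a simplified unigram-recall score in the spirit of ROUGE-1. The function names (`accuracy`, `perplexity`, `unigram_recall`) are hypothetical and chosen for this example, and whitespace tokenization is an assumption. Real BLEU and ROUGE implementations add details such as stemming, brevity penalties, and multiple references that are not covered here.

```python
import math
from collections import Counter
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that exactly match their labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def perplexity(token_probs: Sequence[float]) -> float:
    """exp of the mean negative log-probability the model gave each true token."""
    if not token_probs:
        raise ValueError("token_probs must be non-empty")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def unigram_recall(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style recall: share of reference words found in the candidate.

    Assumes lowercase whitespace tokenization; counts are clipped so a repeated
    candidate word cannot match more times than it appears in the reference.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        raise ValueError("reference must contain at least one token")
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / total
```

For intuition, a model that assigns probability 0.25 to every token has a perplexity of 4, as if it were choosing uniformly among four options at each step. None of these scores captures human preference, which is typically collected separately through ratings or pairwise comparisons.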