OCRacle Beta

A Benchmarking Tool for Vision LLMs

Introducing a benchmark for comparing the performance of state-of-the-art (SOTA) vision language models (VLMs) on historical document images, based on the GT4HistOCR dataset.

DISCLAIMER: This tool is currently in development and provided as a beta version. Results may be incomplete, inconsistent, or subject to change. Use at your own discretion, and do not rely on the tool for critical evaluations or decisions.


Evaluation Metrics

1. Accuracy

Percentage of characters that match exactly between the OCR output and the ground-truth text. Higher values indicate better performance.
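As a sketch, the match rate between an OCR output and the ground truth can be computed from an alignment of the two strings; the implementation below uses Python's standard-library SequenceMatcher and is illustrative, not OCRacle's actual code.

```python
from difflib import SequenceMatcher

def char_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Percentage of ground-truth characters with an exact match in the OCR output."""
    matcher = SequenceMatcher(None, ground_truth, ocr_output, autojunk=False)
    # Sum the lengths of all maximal matching blocks in the alignment.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(ground_truth)
```

A perfect transcription scores 100; any missed, spurious, or substituted character lowers the score.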
2. Character Error Rate (CER)

Ratio of character-level errors (insertions, deletions, substitutions) to the total number of characters in the ground truth. Lower values indicate better performance.
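A minimal sketch of how CER can be computed, assuming a standard Levenshtein edit distance over characters; the function names are illustrative, not the tool's actual code.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    # Dynamic-programming table, kept as one rolling row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free if characters match)
            ))
        prev = curr
    return prev[-1]

def cer(ground_truth: str, ocr_output: str) -> float:
    """Character Error Rate: edit distance divided by ground-truth length."""
    return levenshtein(ground_truth, ocr_output) / len(ground_truth)
```

Note that CER can exceed 1.0 when the OCR output contains many spurious characters.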
3. Word Error Rate (WER)

Ratio of word-level errors (insertions, deletions, substitutions) to the total number of words in the ground truth. Lower values indicate better performance.
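WER is the same edit-distance idea applied to word tokens instead of characters. A minimal sketch, assuming whitespace tokenization (the tool's actual tokenization may differ):

```python
def wer(ground_truth: str, ocr_output: str) -> float:
    """Word Error Rate: word-level edit distance over the ground-truth word count."""
    ref, hyp = ground_truth.split(), ocr_output.split()
    # Levenshtein dynamic program over word tokens, one rolling row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deleted word
                curr[j - 1] + 1,         # inserted word
                prev[j - 1] + (r != h),  # substituted word (free on exact match)
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, transcribing "the quick brown fox" as "the quik brown fox" is one substitution out of four words, a WER of 0.25.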
4. Execution Time

Average wall-clock time taken by each model to process and transcribe an image, measured in seconds. Lower values indicate faster processing.
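Per-image timing can be sketched with a monotonic clock around the transcription call; `transcribe` here is a hypothetical callable standing in for whichever model is being benchmarked.

```python
import time

def timed_transcription(transcribe, image):
    """Run one transcription and return (text, elapsed_seconds).

    `transcribe` is any callable mapping an image to its transcription;
    time.perf_counter() is monotonic, so the difference is a valid duration.
    """
    start = time.perf_counter()
    text = transcribe(image)
    elapsed = time.perf_counter() - start
    return text, elapsed
```

Averaging `elapsed` over the whole test set gives the per-model execution-time score.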

Leaderboard


Results by Category
