Introducing a benchmark for comparing the performance of various state-of-the-art (SOTA) Vision-Language Models (VLMs) on historical document images, based on the GT4HistOCR dataset.
DISCLAIMER: This tool is currently in development and provided as a beta version. Results may be incomplete, inconsistent, or subject to change. Use at your own discretion, and do not rely on the tool for critical evaluations or decisions.
Percentage of characters that match exactly between the OCR output and ground truth text. Higher values indicate better performance.
Ratio of character-level errors (insertions, deletions, substitutions) to the total number of characters in the ground truth. Lower values indicate better performance.
Ratio of word-level errors (insertions, deletions, substitutions) to the total number of words in the ground truth. Lower values indicate better performance.
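Both error rates above can be derived from the Levenshtein edit distance between the OCR output and the ground truth. A minimal sketch in Python (function names are illustrative, not the benchmark's actual implementation):

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences via dynamic programming,
    counting insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(ground_truth: str, ocr_output: str) -> float:
    """Character Error Rate: character-level edits / ground-truth length."""
    return levenshtein(ground_truth, ocr_output) / len(ground_truth)

def wer(ground_truth: str, ocr_output: str) -> float:
    """Word Error Rate: word-level edits / ground-truth word count."""
    ref, hyp = ground_truth.split(), ocr_output.split()
    return levenshtein(ref, hyp) / len(ref)
```

For example, `cer("abcd", "abxd")` yields 0.25 (one substitution over four characters), and `wer("the quick fox", "the quik fox")` yields one error over three words.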
Average time taken by each model to process and transcribe an image, measured in seconds. Lower values indicate faster processing.