Olena  User documentation 2.1
An Image Processing Platform
Toolchains

Modules

 Documents
 Pictures

Functions

QSet< QString > scribo::toolchain::nepomuk::text_extraction (const QImage &input, const QString &language)

Detailed Description

Full toolchains performing content analysis and extraction.

Function Documentation

QSet<QString> scribo::toolchain::nepomuk::text_extraction (const QImage &input, const QString &language)

Extract text from a document.

This is a convenience routine intended for use in Nepomuk.

Parameters
[in] input     A document image.
[in] language  The main language used in the input document image. An accurate value improves text recognition quality.
Returns
A set of recognized words.

Don't forget to define NDEBUG for compilation to disable debug checks.

Depending on your version of Tesseract (OCR), you may define HAVE_TESSERACT_2 or HAVE_TESSERACT_3.
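
The macros mentioned above are ordinary preprocessor definitions, usually supplied on the compiler command line (for instance -DNDEBUG -DHAVE_TESSERACT_3 with GCC or Clang). The following sketch shows one way to check at compile time which of them are in effect. Both helper functions are hypothetical and only illustrate the mechanism; they are not part of Olena or Scribo.

```cpp
#include <string>

// Hypothetical helper: reports which Tesseract API version was selected
// through the HAVE_TESSERACT_* macros when this file was compiled.
inline std::string selected_tesseract_api()
{
#if defined(HAVE_TESSERACT_3)
  return "tesseract-3";
#elif defined(HAVE_TESSERACT_2)
  return "tesseract-2";
#else
  return "none";
#endif
}

// Hypothetical helper: true when NDEBUG was defined at compile time,
// that is, when debug checks are disabled.
inline bool debug_checks_disabled()
{
#ifdef NDEBUG
  return true;
#else
  return false;
#endif
}
```

Printing both values once at program start-up is a cheap way to confirm that the build flags actually reached the translation unit that includes the Scribo headers.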