Modules
Documents
Pictures
Functions
QSet<QString> | scribo::toolchain::nepomuk::text_extraction (const QImage &input, const QString &language)
Full toolchains performing content analysis and extraction.
QSet<QString> scribo::toolchain::nepomuk::text_extraction (const QImage &input, const QString &language)
Extract text from a document.
This is a convenience routine intended for use in Nepomuk.
[in] | input | A document image. |
[in] | language | The main language used in the input document image. An accurate value improves text recognition quality. |
Remember to define NDEBUG when compiling in order to disable debug checks.
Depending on your version of Tesseract (OCR), you may need to define HAVE_TESSERACT_2 or HAVE_TESSERACT_3.
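
The macros mentioned above are ordinary preprocessor definitions, typically passed on the compiler command line (for example `-DNDEBUG -DHAVE_TESSERACT_3` with GCC or Clang). The following is a minimal sketch of how such definitions can be inspected at compile time. It is not code from Scribo: the helper names `tesseract_backend` and `debug_checks_enabled` are hypothetical and exist only for illustration.

```cpp
#include <string>

// Hypothetical helper (not part of Scribo): reports which Tesseract
// macro, if any, was defined when this translation unit was compiled.
std::string tesseract_backend()
{
#if defined(HAVE_TESSERACT_3)
  return "tesseract-3";
#elif defined(HAVE_TESSERACT_2)
  return "tesseract-2";
#else
  return "none";
#endif
}

// Hypothetical helper (not part of Scribo): returns false when NDEBUG
// was defined, which is also the condition under which the standard
// assert() macro expands to nothing.
bool debug_checks_enabled()
{
#ifdef NDEBUG
  return false;
#else
  return true;
#endif
}
```

Compiling this with `-DNDEBUG -DHAVE_TESSERACT_3` makes `tesseract_backend()` return `"tesseract-3"` and `debug_checks_enabled()` return `false`. With no definitions, the functions return `"none"` and `true`.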