VintaSoft Imaging .NET SDK and Plug-ins Discussions

Questions, comments and suggestions concerning VintaSoft Imaging .NET SDK.


Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4




Post by David_karlsson »

Hi,
Does the Vintasoft.Imaging.Ocr.Tesseract plugin use Tesseract 4? If not, is it possible to make it do so?

Can I use multiple languages at once? In Tesseract we can use eng+latin etc.


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by Alex »

Hi David,
> Does Vintasoft.Imaging.Ocr.Tesseract plugin use Tesseract 4? If not, is it possible to do so?
The current version of the Vintasoft OCR .NET Plug-in uses Tesseract OCR 3.04. We plan to move to Tesseract OCR 4 in the near future.

> Can I use multiple languages? In tesseract we can use eng+latin etc.
Please read how to recognize text in two languages here:
https://www.vintasoft.com/docs/vsimagin ... uages.html

Best regards, Alexander


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by David_karlsson »

> Current version of Vintasoft OCR .NET Plugin uses Tesseract OCR 3.04. We plan to use Tesseract OCR 4 in near time.
Near time? 1 month? 1 year?

> Please read how to recognize text in two languages here:
> https://www.vintasoft.com/docs/vsimagin ... uages.html
I have already read the documentation. It describes how to OCR different sections of a PDF with different languages.
In Tesseract, it is possible to combine several languages to better interpret the same section (page), e.g. eng+deu. Several traineddata languages can be used at a time, e.g. English and German:
tesseract myscan.png out -l eng+deu
https://github.com/tesseract-ocr/tesseract/wiki
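
For anyone scripting the command above, here is a minimal Python sketch of how the combined-language argument can be built. It only wraps the `tesseract <image> <outputbase> -l <langs>` command line quoted in this post; the function names `build_tesseract_command` and `run_ocr` are hypothetical helpers made up for illustration, and this is not the Vintasoft API.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, languages):
    """Build the argument list for: tesseract <image> <outputbase> -l a+b.

    Hypothetical helper for illustration only.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Several traineddata names are combined by joining them with '+'.
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]


def run_ocr(image_path, output_base, languages):
    """Run the tesseract CLI if it is on PATH; return True on success."""
    if shutil.which("tesseract") is None:
        # Assumption: a missing executable is reported as a plain failure.
        return False
    command = build_tesseract_command(image_path, output_base, languages)
    return subprocess.run(command, check=False).returncode == 0


if __name__ == "__main__":
    # Shows the command that would be executed for English + German.
    print(" ".join(build_tesseract_command("myscan.png", "out", ["eng", "deu"])))
```

Each language code still needs its matching traineddata file installed, otherwise the command line tool will fail for that language.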


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by Alex »

> Near time? 1 month ? 1 year?
Tesseract 4 will be available in version 8.7.2 in 2 months.

> I have already read the documentation. In the documentation it says how to OCR interpret different sections of a pdf with different languages.
> In tesseract to do a better interpretation of same section (page) it is possible to combine different languages ex. eng+deu.
> It can even be used with multiple languages traineddata at a time eg. English and German:
> tesseract myscan.png out -l eng+deu
> https://github.com/tesseract-ocr/tesseract/wiki
Thank you for the information. We will analyze it and try to provide the best solution.


Best regards, Alexander


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by David_karlsson »

Perfect. I will happily wait for the release of the Tesseract 4 plugin.


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by David_karlsson »

Hi!
> Tesseract 4 will be available in version 8.7.2 in 2 months.
Is there a preview of version 8.7.2? We have started developing our system, and I need the Vintasoft.Imaging.Ocr.Tesseract API for Tesseract 4.
Is it possible to get access to Vintasoft.Imaging.Ocr.Tesseract 8.7.2 in advance?


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by Alex »

Hi David,
>> Tesseract 4 will be available in version 8.7.2 in 2 months.
> Is there any preview of version 8.7.2 ? We have started to develop our system. I need Vintasoft.Imaging.Ocr.Tesseract API for tesseract 4.
> Is it possible to access Vintasoft.Imaging.Ocr.Tesseract 8.7.2 in advance?
I think a preview version will be available in 2 weeks.

Best regards, Alexander


Re: Vintasoft.Imaging.Ocr.Tesseract and Tesseract 4

Post by Alex »

Hi David,
>> Tesseract 4 will be available in version 8.7.2 in 2 months.
> Is there any preview of version 8.7.2 ? We have started to develop our system. I need Vintasoft.Imaging.Ocr.Tesseract API for tesseract 4.
> Is it possible to access Vintasoft.Imaging.Ocr.Tesseract 8.7.2 in advance?
Version 8.7.2.1 has been released today. In this version the Tesseract OCR engine used by the plug-in has been updated to version 4.0.

Version 8.7.2.1 also lets you specify that text must be recognized in several languages at once. Here is an example that shows how to recognize text written in English and German: https://www.vintasoft.com/docs/vsimagin ... uages.html

Best regards, Alexander

