Based on our records, OCRopus appears to be more popular than PDFCrowd. It has been mentioned 2 times since March 2021. We track product recommendations and mentions on various public social media platforms and blogs. These can help you identify which product is more popular and what people think of it.
To follow on with this, I've used https://pdfcrowd.com/ too. Source: about 3 years ago
I know that ocropy and kraken can be trained. There's a guide written a few years back for training with ocropy, and kraken has documentation on model training. Source: 12 months ago
Here is yet another OCR project, but it requires some training to get right. It doesn't work out of the box, so it may have trouble with some images. Source: about 2 years ago
pdflayer - Free, powerful HTML to PDF API supporting both URL and raw HTML conversion. Unlimited document size, lightning-fast, and compatible with PHP, Python, Ruby, etc.
Tesseract - Tesseract is an optical character recognition engine for various operating systems
DocRaptor - As the only API powered by the Prince HTML-to-PDF engine, DocRaptor provides the best support for complex PDFs with powerful support for headers, page breaks, page numbers, flexbox, watermarks, accessible PDFs, and much more
GOCR - GOCR is an OCR (Optical Character Recognition) program, developed under the GNU General Public License.
PDFShift - Convert any HTML documents to high-fidelity PDF using a single POST request
Onlineocr.net - Free online OCR service that lets you convert PDF documents to MS Word files, convert scanned images to editable text formats, and extract text from JPEG/TIFF/BMP files