Software Alternatives & Reviews

OpenAI is too cheap to beat

  1. Tesseract
    Tesseract is an optical character recognition (OCR) engine for various operating systems.
    > Does Android even have native OCR? Tesseract? https://github.com/tesseract-ocr/tesseract

    #OCR #Image Recognition #PDF Editor 72 social mentions

  2. Common Crawl
    > Common Crawl claims to have 82% of the tokens used to train GPT-3, and it's available to anyone. Add all the downloadable material at archive.org and you've got a formidable corpus. https://commoncrawl.org/

    #Search Engine #Web Scraping #Data Extraction 90 social mentions
