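After unpacking, run tesseract-ocr-setup-3.02.02.exe to install. The tessdata directory holds the language data files; this installer includes the English data by default. To recognize Chinese, copy chi_sim.traineddata from the archive into tessdata.
2019-12-21 18:49:33 29.91MB OCR recognition, Chinese OCR recognition, tesseract, OCR installer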
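To confirm that a language file such as chi_sim.traineddata actually landed in the tessdata directory, something like the following Java sketch could be used. The class name `TessdataCheck` and the default path `tessdata` are assumptions made for illustration; they are not part of the installer, and the real install location depends on where Tesseract was set up.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: inspects a tessdata directory for language files.
public class TessdataCheck {
    private static final String SUFFIX = ".traineddata";

    // Returns the language codes (file names without the suffix), sorted.
    public static List<String> installedLanguages(Path tessdataDir) throws IOException {
        List<String> langs = new ArrayList<>();
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(tessdataDir, "*" + SUFFIX)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                langs.add(name.substring(0, name.length() - SUFFIX.length()));
            }
        }
        Collections.sort(langs);
        return langs;
    }

    // True if <lang>.traineddata exists as a regular file in the directory.
    public static boolean hasLanguage(Path tessdataDir, String lang) {
        return Files.isRegularFile(tessdataDir.resolve(lang + SUFFIX));
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; pass the real tessdata path as the first argument.
        Path tessdata = Paths.get(args.length > 0 ? args[0] : "tessdata");
        if (!Files.isDirectory(tessdata)) {
            System.out.println("Not a directory: " + tessdata);
            return;
        }
        System.out.println("Installed languages: " + installedLanguages(tessdata));
        System.out.println("chi_sim present: " + hasLanguage(tessdata, "chi_sim"));
    }
}
```

Run it with the tessdata path as its only argument; if chi_sim is reported as missing, the file was probably copied to the wrong directory.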
Based on Google tesseract-ocr 3.02 (2012-12); compiled on Linux, 2014-12-01. Import the jar directly and copy the libs directory into your project to use it.
OCR library download: https://code.google.com/p/tesseract-ocr/downloads/list
Sample calling code (getTextImage, TESSBASE_PATH and DEFAULT_LANGUAGE are not defined in this snippet and must come from the surrounding test class):

    public void testGetUTF8Text() {
        // First, make sure the eng.traineddata file exists.
        final String inputText = "hello";
        final Bitmap bmp = getTextImage(inputText, 640, 480);

        // Attempt to initialize the API.
        final TessBaseAPI baseApi = new TessBaseAPI();
        baseApi.init(TESSBASE_PATH, DEFAULT_LANGUAGE);
        baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_LINE);
        baseApi.setImage(bmp);

        // Ensure that the result is correct.
        final String outputText = baseApi.getUTF8Text();
        assertEquals("\"" + outputText + "\" != \"" + inputText + "\"", inputText, outputText);

        // Ensure that getHOCRText() produced a result.
        final String hOcr = baseApi.getHOCRText(0);
        assertNotNull("HOCR result found", hOcr);

        // Ensure getRegions() works (expected value first, then actual).
        final Pixa regions = baseApi.getRegions();
        assertEquals("Found one region", 1, regions.size());

        // Ensure getWords() works.
        final Pixa words = baseApi.getWords();
        assertEquals("Found one word", 1, words.size());

        // Iterate through the results word by word.
        final ResultIterator iterator = baseApi.getResultIterator();
        String lastUTF8Text;
        float lastConfidence;
        int[] lastBoundingBox;
        int count = 0;
        iterator.begin();
        do {
            lastUTF8Text = iterator.getUTF8Text(PageIteratorLevel.RIL_WORD);
            lastConfidence = iterator.confidence(PageIteratorLevel.RIL_WORD);
            lastBoundingBox = iterator.getBoundingBox(PageIteratorLevel.RIL_WORD);
            count++;
        } while (iterator.next(PageIteratorLevel.RIL_WORD));

        // Attempt to shut down the API and release the bitmap.
        baseApi.end();
        bmp.recycle();
    }
2014-12-02 00:00:00 2.95MB tesseract ocr
Tess4J 1.0, a Java API wrapper for the open-source OCR engine Tesseract.
2013-01-19 00:00:00 5.24MB Tess4J Tesseract OCR
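Tess4J, a Java API wrapper for the open-source OCR engine Tesseract. After downloading, you need to compile it yourself to produce the jar file, then import it into your program following the instructions. Its recognition accuracy is considerably higher than Asprise's. Note, however, that this engine must run under a 32-bit JVM and supports Windows only.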
2012-02-16 00:00:00 3.56MB Tesseract OCR Java Tess4J
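Because the entry above says this Tess4J build needs a 32-bit JVM on Windows, a program could check its environment before attempting OCR. The sketch below uses only standard JDK calls; the class name `Tess4JEnvCheck` is hypothetical, and the `sun.arch.data.model` property is an assumption that holds on HotSpot-style JVMs but may be absent elsewhere, in which case the check reports the environment as unsupported.

```java
import java.util.Locale;

// Hypothetical pre-flight check for the 32-bit, Windows-only restriction
// described in the listing; not part of Tess4J itself.
public class Tess4JEnvCheck {

    // True only for a Windows OS name combined with a 32-bit data model.
    public static boolean isSupported(String osName, String dataModel) {
        if (osName == null) {
            return false;
        }
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows && "32".equals(dataModel);
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        // JVM-specific property; may be null on some JVMs (assumption).
        String dataModel = System.getProperty("sun.arch.data.model");
        if (isSupported(osName, dataModel)) {
            System.out.println("Environment matches the stated requirements.");
        } else {
            System.out.println("Unsupported environment: os.name=" + osName
                    + ", data model=" + dataModel);
        }
    }
}
```

Failing early with a clear message is usually easier to diagnose than a native-library load error raised later.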