Commit 49cf9346 authored by aszlig

pyocr: Add patch to support Tesseract 3.05.00



This is from the commit message I've written for the upstream pull
request (jflesch/pyocr#62):

    This is a bit more involved, because Tesseract 3.05.00 comes not
    only with improvements but also with a few quirks we need to deal
    with.

    The first quirk is that the order of the arguments of the
    `tesseract' command now matters: the list of configurations has to
    be at the end of the command line. So we add a new attribute,
    tesseract_flags, to the BaseBuilder class that contains a list of
    all the flags to pass to `tesseract'. The tesseract_configs
    attribute remains pretty much the same, but now contains only a
    list of configs instead of being mixed with flag arguments.
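
The ordering described above can be sketched roughly as follows. This is only an illustration: build_tesseract_command is a hypothetical helper, not pyocr's actual code, and the flag and config values are example inputs.

```python
def build_tesseract_command(input_name, output_base, flags, configs):
    """Assemble a tesseract command line with config names placed last.

    Hypothetical helper for illustration only. `flags` stands in for
    something like a builder's tesseract_flags and `configs` for its
    tesseract_configs.
    """
    # Input image and output base come first, then all flag arguments,
    # and only then the config names.
    return ["tesseract", input_name, output_base] + list(flags) + list(configs)


# Example values: "-l eng" selects the language, "hocr" is a config name.
cmd = build_tesseract_command("input.bmp", "output", ["-l", "eng"], ["hocr"])
```

Keeping the two lists separate means the ordering rule is enforced in one place instead of depending on how callers happened to mix flags and configs.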

    Another quirk has to do with Leptonica >= 1.74, which Tesseract
    3.05.00 now requires. Leptonica has special handling of files that
    reside in /tmp and assumes that they are internal temporary files
    of Leptonica. To deal with this, we now run Tesseract in a
    temporary directory that contains the input/output files, and we
    refer to these files by their relative names, because Leptonica
    only checks for path names beginning with /tmp.
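
A minimal sketch of that workaround, assuming the input is copied into the working directory and that plain-text output lands in output.txt. The function name, the file names and the injectable runner parameter are illustrative assumptions, not the code from the patch.

```python
import os
import shutil
import subprocess
import tempfile


def run_tesseract_in_tempdir(image_path, flags=(), configs=(),
                             runner=subprocess.call):
    """Run tesseract from inside a scratch directory using relative names.

    Hypothetical helper: `runner` defaults to subprocess.call and is only
    a parameter so the call can be replaced in tests.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # Copy the input next to where the output will be written.
        input_name = "input" + os.path.splitext(image_path)[1]
        shutil.copy(image_path, os.path.join(workdir, input_name))

        # Only relative names appear on the command line, so no argument
        # starts with /tmp even if workdir itself lives there.
        cmd = ["tesseract", input_name, "output", *flags, *configs]
        status = runner(cmd, cwd=workdir)

        # Read the result before the directory is cleaned up.
        text = None
        output_path = os.path.join(workdir, "output.txt")
        if os.path.exists(output_path):
            with open(output_path, encoding="utf-8") as handle:
                text = handle.read()
    return status, text
```

The key point is the cwd argument: the process is started inside the scratch directory, so the relative names resolve there without the absolute path ever being passed to Tesseract.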

    Fortunately, the last item we need to address is not really a
    quirk, but an API change. In Tesseract 3.05.00 there is now a new
    function called TessBaseAPIDetectOrientationScript(), which no
    longer fills the OSResults object but instead allows us to pass the
    values we're interested in directly by reference. We need to use
    this new function because the old function TessBaseAPIDetectOS()
    now *always* returns false.
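
To illustrate what passing values "by reference" looks like from Python, here is a small ctypes sketch. It deliberately uses a Python-defined stand-in callback (fake_detect) rather than the real library, and its two-parameter list is made up for the example; it is not the actual signature of TessBaseAPIDetectOrientationScript(), which should be taken from Tesseract's capi.h.

```python
import ctypes

# Stand-in prototype: returns an int status and writes into two
# caller-provided output parameters. The parameter list is illustrative.
DetectFn = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_int),    # orientation in degrees (out)
    ctypes.POINTER(ctypes.c_float),  # orientation confidence (out)
)


@DetectFn
def fake_detect(orient_deg, orient_conf):
    # A real C function would write through the pointers the same way.
    orient_deg[0] = 90
    orient_conf[0] = 0.5
    return 1


# The caller allocates the result variables and passes references to them,
# instead of receiving a filled-in results structure.
deg = ctypes.c_int(0)
conf = ctypes.c_float(0.0)
ok = fake_detect(ctypes.byref(deg), ctypes.byref(conf))
```

After the call, the results are read from deg.value and conf.value, and the return value only signals success or failure.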

I've tested this specifically on NixOS in conjunction with Paperwork
(the only package that's using pyocr so far), and all the tests of the
dependency chain are now succeeding. However, I didn't do manual tests
of Paperwork.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
parent 121751e1