| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE pkgmetadata SYSTEM "http://www.gentoo.org/dtd/metadata.dtd"> |
| <pkgmetadata> |
| <maintainer type="person"> |
| <email>tomka@gentoo.org</email> |
| </maintainer> |
| <longdescription> |
| pdfsandwich generates "sandwich" OCR PDF files: PDF files that |
| contain only images (no text) are processed by optical character |
| recognition (OCR), and the recognized text is added to each page |
| invisibly "behind" the images. |
| |
| pdfsandwich is a command-line tool intended for OCR of scanned |
| books or journals. It can recognize the page layout, even for |
| multicolumn text. |
| |
| Essentially, pdfsandwich is a wrapper script that calls the following |
| binaries: convert, cuneiform, gs, and hocr2pdf. It is known to run on |
| Unix systems and has been tested on Linux and Mac OS X. It supports |
| parallel processing on multiprocessor systems. |
| </longdescription> |
| <upstream> |
| <remote-id type="sourceforge">pdfsandwich</remote-id> |
| </upstream> |
| </pkgmetadata> |