
Indexing Multiple Binary File Types

Wednesday 20 September 2006 10:35:00 pm



eZ publish ships with the ability to index PDF files and Word documents (assuming you have installed the pstotext and wvware utilities). However, we found that this functionality didn't meet our needs, so we did an extensive search for other parsing tools. Our solution is based on the tools listed below.

  • pdftotext (for parsing PDFs): part of Xpdf, a full-blown PDF reader that also provides numerous PDF and PS utilities.
  • catdoc (for parsing Word documents): a set of parsers and utilities including:
    • catppt (for parsing PowerPoint documents)
    • xls2csv (for parsing Excel documents): by default, this parses XLS files into comma-delimited format, but it also provides options to specify other output formats.

These parsers handle PDFs, Word documents, PowerPoint presentations, and Excel spreadsheets. Our solution is customizable, so you can add other parsers as needed, but this set covers the most common file formats.
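To make the division of labour between the parsers concrete, here is a minimal shell sketch that picks a parser by file extension and prints the extracted text to standard output. The function names parser_for and extract_text are hypothetical helpers written for this illustration; they are not part of eZ publish, and the extension-to-parser mapping is an assumption based on the list above.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the parser that handles a file,
# chosen by its (case-insensitive) extension. Returns 1 if none matches.
parser_for() {
    case "$1" in
        *.*) ;;              # the name has an extension, carry on
        *)   return 1 ;;     # no extension, nothing to match
    esac
    ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        pdf) echo "pdftotext" ;;
        doc) echo "catdoc" ;;
        ppt) echo "catppt" ;;
        xls) echo "xls2csv" ;;
        *)   return 1 ;;
    esac
}

# Hypothetical helper: write the plain text of a file to stdout.
extract_text() {
    parser=$(parser_for "$1") || { echo "no parser for $1" >&2; return 1; }
    case "$parser" in
        pdftotext) pdftotext "$1" - ;;   # "-" sends the text to stdout
        *)         "$parser" "$1" ;;     # the catdoc tools print to stdout
    esac
}
```

Running extract_text on a sample of each file type from the command line is a quick way to confirm that a parser produces usable text before wiring it into the site configuration.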

Install these parsers in a location where they can be executed by your web server user / group.
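One way to check the installation is to ask the shell whether each parser can be found on the search path. The sketch below assumes the tools are on PATH; missing_tools is a hypothetical helper name, not an existing utility.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Any name printed here still needs to be installed or added to PATH.
missing_tools pdftotext catdoc catppt xls2csv
```

For a meaningful result, run the check as the web server user rather than as your own login, since the two accounts may have different search paths and permissions.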


Place the following code in your settings/override/binaryfile.ini.append.php file (in the siteaccess folder of choice):

# Here you can add handlers for new datatypes.

# The full path to your log file (used for debugging/testing)

Note that this configuration example is for eZ publish version 3.8. If you are using an earlier version of eZ publish (we tried it on 3.6), remove "ez" from the "ezbinaryfile" strings.

Save this file and clear the cache. Next, touch the log file path that you specified in the configuration to create an empty log file at that location. (Make sure that this file is writable by your web server user / group.)
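Creating the log file could look like the sketch below. The path /tmp/binaryfile_index.log is only a placeholder for illustration; substitute the log file path from your own configuration. The group name in the commented line is likewise an assumption that depends on your system.

```shell
#!/bin/sh
# Placeholder path: replace with the log file path from your configuration.
LOG_FILE="${LOG_FILE:-/tmp/binaryfile_index.log}"

touch "$LOG_FILE"        # create the empty log file if it does not exist
chmod 664 "$LOG_FILE"    # let the owner and the group write to it

# Assumption: the web server group is called www-data on your system.
# chgrp www-data "$LOG_FILE"
```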
