Fulltext search with Ruby and groonga - Ranguba

README: An introduction to ChupaText, a text extraction utility

Name

ChupaText

Author

  • Nobuyoshi Nakada <nakada@clear-code.com>

  • Kouhei Sutou <kou@clear-code.com>

License

What's this?

ChupaText is a text extraction utility. It can extract text and metadata from PDF and office documents. You can use it as a library, from the command line, or as a Web service.

Dependencies

Required:

  • GLib >= 2.24

  • libgsf

Optional:

  • Poppler

  • wv

  • libgoffice

  • Gnumeric

  • LibreOffice, OpenOffice.org or unoconv

  • Ruby >= 1.9.2

Repository

The ChupaText repository is hosted on GitHub.

% git clone git://github.com/ranguba/chupatext.git

Install

See install.

Usage

% chupatext [OPTION ...] FILE ...

FILE is a file that you want to extract text from.

See chupatext for more details.
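
Since the command accepts several FILE arguments, a small wrapper can be handy for batch extraction. The following is a minimal sketch, not part of ChupaText itself: the `extract_all` function and the `CHUPATEXT` variable are hypothetical names, and it assumes that chupatext writes its result to standard output and exits with a non-zero status on failure, which this README does not confirm. No options are passed, because none are documented here.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ChupaText).
# Assumption: chupatext prints the extracted text to standard output
# and returns a non-zero exit status when extraction fails.

# Command to run; override it to point at a specific installation.
CHUPATEXT="${CHUPATEXT:-chupatext}"

# Run the extractor once per file and save the result as FILE.txt.
extract_all() {
  for file in "$@"; do
    if "$CHUPATEXT" "$file" > "$file.txt"; then
      echo "extracted: $file"
    else
      # Drop the partial output and report the failure on stderr.
      rm -f "$file.txt"
      echo "failed: $file" >&2
    fi
  done
}

# When the script is given arguments, treat them as files to process.
if [ "$#" -gt 0 ]; then
  extract_all "$@"
fi
```

Saved as, for example, extract-all.sh, it could be run as `sh extract-all.sh report.pdf notes.doc`, leaving report.pdf.txt and notes.doc.txt next to the inputs if the assumptions above hold.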

Thanks

  • Yuto Hayamizu