I18N_Arabic

Procedural File: CompressStr.php

Source Location: /Arabic/CompressStr.php



Classes:

I18N_Arabic_CompressStr
This PHP class compresses Arabic strings using Huffman-like coding


Page Details:

----------------------------------------------------------------------

Copyright (c) 2006-2016 Khaled Al-Sham'aa.

http://www.ar-php.org

PHP Version 5

----------------------------------------------------------------------

LICENSE

This program is open source product; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (LGPL) as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see <http://www.gnu.org/licenses/lgpl.txt>.

----------------------------------------------------------------------

Class Name: Compress string using Huffman-like coding

Filename: CompressStr.php

Original Author(s): Khaled Al-Sham'aa <khaled@ar-php.org>

Purpose: This class compresses a given string into a binary format. Each source symbol is encoded using a variable-length code table derived from the estimated probability of occurrence of each possible source symbol.
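As a rough illustration of how a variable-length code table can be derived from symbol probabilities, the sketch below builds a classic Huffman table. It is not taken from CompressStr.php, which is described only as "Huffman-like"; the function name buildHuffmanTable() and the sample frequencies are hypothetical.

```php
<?php
// Illustrative sketch only: classic Huffman construction, not the
// library's own code. buildHuffmanTable() is a hypothetical helper.

/**
 * Build a code table from estimated symbol frequencies.
 *
 * @param array $freq symbol => estimated frequency (integer)
 * @return array symbol => bit string such as "101"
 */
function buildHuffmanTable(array $freq)
{
    if (count($freq) === 0) {
        return array();
    }

    // Each queue entry is array(weight, array(symbol => code so far)).
    $queue = array();
    foreach ($freq as $symbol => $weight) {
        $queue[] = array($weight, array($symbol => ''));
    }

    // A single symbol still needs a one-bit code.
    if (count($queue) === 1) {
        return array(key($queue[0][1]) => '0');
    }

    while (count($queue) > 1) {
        // Bring the two lightest subtrees to the front.
        usort($queue, function ($a, $b) {
            return $a[0] - $b[0];
        });
        list($w1, $left)  = array_shift($queue);
        list($w2, $right) = array_shift($queue);

        // Merging prepends one bit to every code in each subtree.
        $merged = array();
        foreach ($left as $s => $code) {
            $merged[$s] = '0' . $code;
        }
        foreach ($right as $s => $code) {
            $merged[$s] = '1' . $code;
        }
        $queue[] = array($w1 + $w2, $merged);
    }

    return $queue[0][1];
}

// Made-up frequencies: the most frequent symbol gets the shortest code.
$sample = buildHuffmanTable(
    array('a' => 45, 'b' => 13, 'c' => 12, 'd' => 16, 'e' => 9, 'f' => 5)
);
foreach ($sample as $symbol => $code) {
    echo $symbol, ' => ', $code, "\n";
}
```

Frequent symbols end up near the root of the tree and so receive short codes, which is where the size reduction comes from.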

----------------------------------------------------------------------

Arabic Compress String Class

Compress string using Huffman-like coding

This class compresses text strings to roughly 70% of their original size by using compact codes for the most frequent letters in a given language. Because the algorithm is tied to the language of the text, there are six separate classes, one each for Arabic, English, French, German, Italian, and Spanish.
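To make the idea of compact codes for frequent letters concrete, here is a minimal sketch that encodes text with a small fixed prefix-free table and packs the bits into bytes. The table, the function names, and the sample text are all hypothetical, and the sketch assumes single-byte symbols; the real class uses its own per-language table and storage format, which are not documented here.

```php
<?php
// Hypothetical prefix-free code table; NOT the table used by the library.
$table = array('e' => '0', 't' => '10', 'a' => '110', ' ' => '1110', 'n' => '1111');

// Encode text into a string of '0'/'1' characters using the table.
function encodeToBits($text, array $table)
{
    $bits = '';
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        if (!isset($table[$text[$i]])) {
            throw new InvalidArgumentException('No code for symbol: ' . $text[$i]);
        }
        $bits .= $table[$text[$i]];
    }
    return $bits;
}

// Pack a bit string into bytes, zero-padding the final byte on the right.
function packBits($bits)
{
    $bytes = '';
    for ($i = 0, $n = strlen($bits); $i < $n; $i += 8) {
        $bytes .= chr(bindec(str_pad(substr($bits, $i, 8), 8, '0')));
    }
    return $bytes;
}

// Reverse of packBits(); $bitCount tells us how many bits are not padding.
function unpackBits($bytes, $bitCount)
{
    $bits = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $bits .= str_pad(decbin(ord($bytes[$i])), 8, '0', STR_PAD_LEFT);
    }
    return substr($bits, 0, $bitCount);
}

// Decode by accumulating bits until they match a code. This is
// unambiguous because no code in the table is a prefix of another.
function decodeBits($bits, array $table)
{
    $reverse = array_flip($table);
    $text    = '';
    $buffer  = '';
    for ($i = 0, $n = strlen($bits); $i < $n; $i++) {
        $buffer .= $bits[$i];
        if (isset($reverse[$buffer])) {
            $text  .= $reverse[$buffer];
            $buffer = '';
        }
    }
    return $text;
}

$text   = 'a neat tea';
$bits   = encodeToBits($text, $table);
$packed = packBits($bits);
echo strlen($text), ' bytes before, ', strlen($packed), " bytes after\n";
echo decodeBits(unpackBits($packed, strlen($bits)), $table), "\n";
```

The saving in this toy case is larger than the figure quoted above because the sample text contains only the five symbols in the table; real text needs codes for every character, so rarer ones get longer codes.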

Benefits of this compression algorithm include:

  • It is written in pure PHP, so no PHP extensions are needed to use it.
  • You can search the compressed string directly, without decompressing it first.
  • You can get the original string length directly, without decompressing the text.

Note: Unfortunately, text compressed with this algorithm loses the structure that ordinary zip algorithms exploit, so applying ZLib functions to it brings less benefit.

Another drawback is that the algorithm works only on text in a given language; it does not work well on binary files such as images or PDFs.

Example:

    include('./I18N/Arabic.php');
    $obj = new I18N_Arabic('CompressStr');

    // Input and output text use the windows-1256 code page.
    $obj->setInputCharset('windows-1256');
    $obj->setOutputCharset('windows-1256');

    // Read the sample text file into a string.
    $file = 'Compress/ar_example.txt';
    $fh   = fopen($file, 'r');
    $str  = fread($fh, filesize($file));
    fclose($fh);

    $zip = $obj->compress($str);

    // Compressed size as a percentage of the original size.
    $before = strlen($str);
    $after  = strlen($zip);
    $rate   = round($after * 100 / $before);

    echo "String size before was: $before Byte<br>";
    echo "Compressed string size after is: $after Byte<br>";
    echo "Rate $rate %<hr>";

    $str = $obj->decompress($zip);

    // $word must hold the search term; it is not defined in this excerpt.
    if ($obj->search($zip, $word)) {
        echo "Search for $word in zipped string and find it<hr>";
    } else {
        echo "Search for $word in zipped string and do not find it<hr>";
    }

    // Original length, read without decompressing.
    $len = $obj->length($zip);
    echo "Original length of zipped string is $len Byte<hr>";

    echo '<div dir="rtl" align="justify">'.nl2br($str).'</div>';




Tags:

author:  Khaled Al-Sham'aa <khaled@ar-php.org>
copyright:  2006-2016 Khaled Al-Sham'aa
link:  http://www.ar-php.org
filesource:  Source Code for this file
license:  LGPL








Documentation generated on Fri, 01 Jan 2016 10:25:54 +0200 by phpDocumentor 1.4.0