I18N_Arabic

Procedural File: Identifier.php

Source Location: /Arabic/Identifier.php



Classes:

I18N_Arabic_Identifier
This PHP class identifies Arabic text segments


Page Details:

----------------------------------------------------------------------

Copyright (c) 2006-2016 Khaled Al-Sham'aa.

http://www.ar-php.org

PHP Version 5

----------------------------------------------------------------------

LICENSE

This program is open source product; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License (LGPL) as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see <http://www.gnu.org/licenses/lgpl.txt>.

----------------------------------------------------------------------

Class Name: Identify Arabic Text Segments

Filename: Identifier.php

Original Author(s): Khaled Al-Sham'aa <khaled@ar-php.org>

Purpose: This class identifies Arabic text in a given UTF-8 multilingual document. It returns an array of start and end positions for the Arabic text segments.

----------------------------------------------------------------------

Identify Arabic Text Segments

Using this PHP class, you can take a fully automated approach to processing Arabic text by quickly and accurately determining the Arabic text segments within multilingual documents.

Understanding the language and encoding of a given document is an essential step in working with unstructured multilingual text. Without this basic knowledge, applications such as information retrieval and text mining cannot process data accurately, and important information may be missed completely or misrouted.

Any application that works with Arabic in multilingual documents can benefit from the I18N_Arabic_Identifier class. Using this class, applications can locate the Arabic text segments within a multilingual document automatically, without manual markup.

Example:

    include('./I18N/Arabic.php');
    $obj = new I18N_Arabic('Identifier');

    // $str is assumed to hold the UTF-8 text to process.
    $hStr = $obj->highlightText($str, '#80B020');

    // Print the original text followed by the highlighted version.
    echo $str . '<hr />' . $hStr . '<hr />';

    $taggedText = $obj->tagText($str);

    // Each entry is a (word, tag) pair.
    foreach ($taggedText as $wordTag) {
        list($word, $tag) = $wordTag;

        if ($tag == 1) {
            echo "$word is Noun, ";
        }

        if ($tag == 0) {
            echo "$word is not Noun, ";
        }
    }
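
The idea of returning start and end positions for Arabic segments can be illustrated with a minimal, standalone sketch. It does not use or reproduce the library's implementation: the function name findArabicSegments, the regular-expression approach, and the [start, end) pair layout are all assumptions made for illustration. Positions are byte offsets into the UTF-8 string, so they can be passed directly to substr().

```php
<?php
// Hypothetical helper for illustration only; NOT part of the I18N_Arabic API.
// Returns an array of array(start, end) byte offsets, one per Arabic segment.
function findArabicSegments($text)
{
    // One or more Arabic-script words, joined by whitespace into one segment.
    $pattern = '/\p{Arabic}+(?:\s+\p{Arabic}+)*/u';
    preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);

    $segments = array();
    foreach ($matches[0] as $match) {
        list($segment, $start) = $match;
        // PREG_OFFSET_CAPTURE reports byte offsets, even with the /u modifier,
        // so strlen() (bytes) is the matching way to compute the end offset.
        $segments[] = array($start, $start + strlen($segment));
    }
    return $segments;
}

// Usage: print each Arabic segment with its byte range.
$text = 'Hello سلام world';
foreach (findArabicSegments($text) as $range) {
    list($start, $end) = $range;
    echo substr($text, $start, $end - $start), " [$start, $end)\n";
}
```

Because the offsets are in bytes rather than characters, they should be used with byte-oriented functions such as substr() and strlen(), not with mb_substr().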




Tags:

author:  Khaled Al-Sham'aa <khaled@ar-php.org>
copyright:  2006-2016 Khaled Al-Sham'aa
link:  http://www.ar-php.org
filesource:  Source Code for this file
license:  LGPL








Documentation generated on Fri, 01 Jan 2016 10:26:04 +0200 by phpDocumentor 1.4.0