Al-Kashi Project
This is one of the experimental products developed in the Ar-PHP project labs.
Al-Kashi was one of the best mathematicians in the Islamic world.

In French, the law of cosines is named Théorème d'Al-Kashi (Theorem of Al-Kashi), as al-Kashi was the first to provide an explicit statement of the law of cosines in a form suitable for triangulation. In one of his numerical approximations of pi, he correctly computed 2π to 16 decimal places. This was far more accurate than any earlier estimate.


In the Al-Kashi project we aim to provide a rich PHP package of statistical functions useful for online business intelligence and data mining. Possible applications include online log file analysis, ad and campaign statistics, and on-the-fly analysis of survey or voting results. It is published under the GPL license; you can download it from the PHPClasses.org website, and you can check the change log here.

Khaled Al-Sham'aa
E-Mail account

Would you like to know more about statistical concepts and procedures implemented in this project? Please download this free electronic book assembled from Wikipedia articles to get detailed background information.

For more information about this project in Arabic, please see these blog posts.

Example Data

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

    $sep = "\t"; $nl  = "\n";

    $content = file_get_contents('data.txt');

    $records = explode($nl, $content);
    $header  = explode($sep, trim(array_shift($records)));
    $data    = array_fill_keys($header, array());

    foreach ($records as $id=>$record) {
        $record = trim($record);
        if ($record == '') continue;
    
        $fields = explode($sep, $record);
        $titles = $header;
        
        foreach ($fields as $field) {
            $title = array_shift($titles);
            $data[$title][] = $field;
        }
    }

    $x = $data['wt'];
    $y = $data['mpg'];

    require('kashi.php');

    $kashi = new Kashi();

Summary Statistics:

Mean (x): 3.21725
Median (x): 3.325
Mode (x): Array ( [0] => 3.44 )
Variance (x): 0.95737896774194
SD (x): 0.9784574429897
%CV (x): 30.412850819479
Skewness (x): 0.46591610679299
Kurtosis (x): 0.41659466963493

    // $x is an array of values
    echo 'Mean: ' . $kashi->mean($x) . "\n";

    // Each method reuses the previous data set if you don't pass one as an argument
    // print_r() needs true as its second argument to return the text instead of printing it
    echo 'Mode: ' . print_r($kashi->mode(), true) . "\n";
    echo 'Median: ' . $kashi->median() . "\n";
    echo 'Variance: ' . $kashi->variance() . "\n";
    echo 'SD: ' . $kashi->sd() . "\n";
    echo '%CV: ' . $kashi->cv() . "\n";
    echo 'Skewness: ' . $kashi->skew() . "\n";
    echo 'Kurtosis: ' . $kashi->kurt() . "\n";
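To make the first few summary figures concrete, here is a minimal sketch in plain PHP of how a mean, median, sample variance, SD and %CV can be computed. The `sketch_*` function names are hypothetical and are not part of `kashi.php`; the n-1 denominator is an assumption chosen because it is the usual sample-variance convention, and the toy data is made up for illustration.

```php
<?php
// Hypothetical helpers for illustration only; not the kashi.php implementation.

function sketch_mean(array $values): float
{
    return array_sum($values) / count($values);
}

function sketch_median(array $values): float
{
    sort($values);                       // sorts a local copy
    $n   = count($values);
    $mid = intdiv($n, 2);
    // Odd count: middle element; even count: average of the two middle elements
    return ($n % 2) ? (float) $values[$mid] : ($values[$mid - 1] + $values[$mid]) / 2;
}

function sketch_variance(array $values): float
{
    $mean = sketch_mean($values);
    $sumSquares = 0.0;
    foreach ($values as $value) {
        $sumSquares += ($value - $mean) ** 2;
    }
    return $sumSquares / (count($values) - 1);   // assumption: sample variance (n-1)
}

$toy = [2, 4, 4, 4, 5, 5, 7, 9];                 // made-up data
$sd  = sqrt(sketch_variance($toy));
$cv  = 100 * $sd / sketch_mean($toy);            // coefficient of variation in percent

echo 'Mean: '     . sketch_mean($toy)     . "\n";
echo 'Median: '   . sketch_median($toy)   . "\n";
echo 'Variance: ' . sketch_variance($toy) . "\n";
echo 'SD: '       . $sd                   . "\n";
echo '%CV: '      . $cv                   . "\n";
```

The %CV line follows the same relationship visible in the results above, where SD divided by the mean and multiplied by 100 gives the reported value.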

Correlation, Regression, and t-Test:

Covariance (x, y): -5.1166846774194
Correlation (x, y): -0.86765937651723
Significance of correlation: 1.2939593840855E-10
Regression (y = a + b*x):
Array
(
    [intercept] => 37.285126167342
    [slope] => -5.3444715727227
    [r-square] => 0.75283279365826
)
t-Test unpaired: -15.632569384303
Test of the null hypothesis that mean of x = mean of y, probability: 5.5511151231258E-16
t-Test paired: -13.847209446072
Test of the null hypothesis that mean of x-y = 0, probability: 8.1046280797636E-15

    echo 'Covariance: '  . $kashi->cov($x, $y) . "\n";
    echo 'Correlation: ' . $kashi->cor($x, $y) . "\n";

    $r = $kashi->cor($x, $y);
    $n = count($x);
    echo 'Significance of Correlation: ' . $kashi->corTest($r, $n) . "\n";

    echo 'Regression: ' . print_r($kashi->lm($x, $y), true) . "\n";

    echo 't-Test unpaired: ' . $kashi->tTest($x, $y, false) . "\n";
    echo 'Test: ' . $kashi->tDist($kashi->tTest($x, $y, false), (count($x)-1)*(count($y)-1)) . "\n";
    echo 't-Test paired: ' . $kashi->tTest($x, $y, true) . "\n";
    echo 'Test: ' . $kashi->tDist($kashi->tTest($x, $y, true), count($x)-1) . "\n";
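As a rough illustration of what the covariance, correlation and regression figures represent, the following sketch fits y = a + b*x by ordinary least squares in plain PHP. The function `sketch_linear_fit` is hypothetical and not part of `kashi.php`; the returned keys simply mirror the ones printed above, the covariance uses an assumed n-1 denominator, and the data is a made-up toy set.

```php
<?php
// Hypothetical helper for illustration only; not the kashi.php implementation.
function sketch_linear_fit(array $x, array $y): array
{
    $n     = count($x);
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    // Sums of squares and cross-products around the means
    $sxx = $syy = $sxy = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $dx   = $x[$i] - $meanX;
        $dy   = $y[$i] - $meanY;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
        $sxy += $dx * $dy;
    }

    $slope = $sxy / $sxx;
    $r     = $sxy / sqrt($sxx * $syy);   // Pearson correlation coefficient

    return [
        'covariance'  => $sxy / ($n - 1), // assumption: sample covariance (n-1)
        'correlation' => $r,
        'intercept'   => $meanY - $slope * $meanX,
        'slope'       => $slope,
        'r-square'    => $r * $r,         // for a single predictor, r-square is r squared
    ];
}

// Made-up toy data
print_r(sketch_linear_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]));
```

The same relationship holds in the results above: squaring the reported correlation (-0.86765937651723) gives the reported r-square (0.75283279365826).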

Distributions:

Normal distribution (x=0.5, mean=0, sd=1): 0.3520653267643
Probability for the Student t-distribution (t=3, n=10) one-tailed: 0.01334365502257
Probability for the Student t-distribution (t=3, n=10) two-tailed: 0.0066718275112848
Probability for F distribution (f=2, df1=12, df2=15): 0.10268840717083
Inverse of the standard normal cumulative distribution, with a probability of (p=0.95): 1.6448536251337
t-value of the Student's t-distribution for probability p and n degrees of freedom (p=0.05, n=29): 2.0452296438589

Standardize (x) (mean=0 & variance=1):
-0.21856996516336, -0.18157645907083, -0.5232223682783, -0.19028081344554, 0.2340564623217, 0.32400145752706, 0.89123521761251, -0.62912534650397, -0.40281213276144, 0.16152017586576, 0.36462177794239, 0.6591191009535, 0.47922911054277, 0.79113514230331, 1.7007401744608, 1.7259828021474, 1.0907100053663, -1.9333277769817, -1.7280500863114, -2.2038881254624, -0.31359250042064, 0.70989450147266, 0.74108510464871, 1.0754773852106, 0.22027456789507, -1.2319018869528, -1.0135676647204, -1.7428474887485, 0.61559732907994, -0.0082147344411346, 0.78968441657419, -0.25338738266221

    echo 'Normal distribution (x=0.5, mean=0, sd=1): ' . $kashi->norm(0.5, 0, 1) . "\n";

    echo 'Probability for the Student t-distribution (t=3, n=10) one-tailed: ';
    echo $kashi->tDist(3, 10, 1) . "\n";

    echo 'Probability for the Student t-distribution (t=3, n=10) two-tailed: ';
    echo $kashi->tDist(3, 10, 2) . "\n";

    echo 'F probability distribution (f=2, df1=12, df2=15): ' . $kashi->fDist(2, 12, 15) . "\n";

    echo 'Inverse of the standard normal cumulative distribution (p=0.95): ';
    echo $kashi->inverseNormCDF(0.95) . "\n";

    echo 't-value of the Student\'s t-distribution (p=0.05, n=29): ';
    echo $kashi->inverseTCDF(0.05, 29) . "\n";

    echo 'Standardize (x) (i.e. mean=0 & variance=1): ';
    echo implode(', ', $kashi->standardize()) . "\n";
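The normal distribution value reported above (0.3520653267643 at x=0.5) is the height of the standard normal density curve at that point. The sketch below reproduces it in plain PHP and also shows one common way to standardize a data set. Both `sketch_*` functions are hypothetical and not part of `kashi.php`, and dividing by the sample SD (n-1) in the standardization is an assumption.

```php
<?php
// Hypothetical helpers for illustration only; not the kashi.php implementation.

// Normal probability density at $x for the given mean and standard deviation
function sketch_norm_pdf(float $x, float $mean = 0.0, float $sd = 1.0): float
{
    $z = ($x - $mean) / $sd;
    return exp(-0.5 * $z * $z) / ($sd * sqrt(2 * M_PI));
}

// Rescale values to mean 0 and variance 1 (z-scores)
function sketch_standardize(array $values): array
{
    $n    = count($values);
    $mean = array_sum($values) / $n;

    $sumSquares = 0.0;
    foreach ($values as $value) {
        $sumSquares += ($value - $mean) ** 2;
    }
    $sd = sqrt($sumSquares / ($n - 1));   // assumption: sample SD (n-1)

    return array_map(function ($value) use ($mean, $sd) {
        return ($value - $mean) / $sd;
    }, $values);
}

echo sketch_norm_pdf(0.5, 0, 1) . "\n";
echo implode(', ', sketch_standardize([2, 4, 6])) . "\n";   // made-up data
```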

Chi-square test or Contingency tables (A/B testing):

Probability that the distribution of the number of cylinders is the same in automatic and manual transmission cars: 0.012646605046107

    $table['Automatic'] = array('4 Cylinders' => 3, '6 Cylinders' => 4, '8 Cylinders' => 12);
    $table['Manual']    = array('4 Cylinders' => 8, '6 Cylinders' => 3, '8 Cylinders' => 2);

    // Keep the variable name consistent: the result of chiTest() is read back below
    $result      = $kashi->chiTest($table);
    $probability = $kashi->chiDist($result['chi'], $result['df']);
    echo 'Chi-square test probability: ' . $probability . "\n";
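For readers who want to see where that probability comes from, here is a minimal sketch of the Pearson chi-square computation for the same contingency table in plain PHP. The function `sketch_chi_square` is hypothetical and not part of `kashi.php`; its `chi` and `df` keys only mirror the names used in the example above. The p-value line uses the closed form exp(-chi/2), which is valid only for 2 degrees of freedom, as is the case for this 2x3 table.

```php
<?php
// Hypothetical helper for illustration only; not the kashi.php implementation.
function sketch_chi_square(array $table): array
{
    $rowTotals = [];
    $colTotals = [];
    $grand     = 0;

    // Marginal totals
    foreach ($table as $rowKey => $row) {
        foreach ($row as $colKey => $count) {
            $rowTotals[$rowKey] = ($rowTotals[$rowKey] ?? 0) + $count;
            $colTotals[$colKey] = ($colTotals[$colKey] ?? 0) + $count;
            $grand += $count;
        }
    }

    // Sum of (observed - expected)^2 / expected over all cells
    $chi = 0.0;
    foreach ($table as $rowKey => $row) {
        foreach ($row as $colKey => $count) {
            $expected = $rowTotals[$rowKey] * $colTotals[$colKey] / $grand;
            $chi += ($count - $expected) ** 2 / $expected;
        }
    }

    $df = (count($rowTotals) - 1) * (count($colTotals) - 1);
    return ['chi' => $chi, 'df' => $df];
}

$table = [
    'Automatic' => ['4 Cylinders' => 3, '6 Cylinders' => 4, '8 Cylinders' => 12],
    'Manual'    => ['4 Cylinders' => 8, '6 Cylinders' => 3, '8 Cylinders' => 2],
];

$result = sketch_chi_square($table);

// Upper-tail probability; this closed form holds only when df == 2
$probability = exp(-$result['chi'] / 2);

echo 'chi = ' . $result['chi'] . ', df = ' . $result['df'] . "\n";
echo 'p   = ' . $probability . "\n";   // approximately 0.0126466, matching the value above
```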

Diversity index:

Shannon index for number of forward gears: 1.0130227035447
Simpson index for number of cylinders: 0.357421875

    // Category => number of cars in that category
    $gear = array('3' => 15, '4' => 12, '5' => 5);
    $cyl  = array('4' => 11, '6' => 7, '8' => 14);

    echo 'Shannon index for gear: ' . $kashi->diversity($gear) . "\n";
    echo 'Simpson index for cyl: ' . $kashi->diversity($cyl, 'simpson') . "\n";
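Both indices can be reproduced by hand from the category proportions. The sketch below, in plain PHP, uses the natural-log Shannon index (minus the sum of p*ln(p)) and the Simpson index in its sum-of-squared-proportions form; these definitions were chosen because they reproduce the two values shown above. The `sketch_*` functions are hypothetical and not part of `kashi.php`.

```php
<?php
// Hypothetical helpers for illustration only; not the kashi.php implementation.

// Shannon index: -sum(p * ln p) over the category proportions
function sketch_shannon(array $counts): float
{
    $total = array_sum($counts);
    $index = 0.0;
    foreach ($counts as $count) {
        if ($count > 0) {                 // empty categories contribute nothing
            $p = $count / $total;
            $index -= $p * log($p);
        }
    }
    return $index;
}

// Simpson index: sum(p^2) over the category proportions
function sketch_simpson(array $counts): float
{
    $total = array_sum($counts);
    $index = 0.0;
    foreach ($counts as $count) {
        $p = $count / $total;
        $index += $p * $p;
    }
    return $index;
}

$gear = ['3' => 15, '4' => 12, '5' => 5];
$cyl  = ['4' => 11, '6' => 7, '8' => 14];

echo 'Shannon index for gear: ' . sketch_shannon($gear) . "\n";
echo 'Simpson index for cyl: '  . sketch_simpson($cyl)  . "\n";   // 0.357421875
```

In this form a larger Simpson value means less diversity (one category dominates), while a larger Shannon value means more diversity.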

Analysis of Variance (ANOVA):