Al-Kashi Project
This is one of the experimental products developed in the Ar-PHP project labs.
Al-Kashi was one of the best mathematicians in the Islamic world.

In French, the law of cosines is named Théorème d'Al-Kashi (Theorem of Al-Kashi), as al-Kashi was the first to provide an explicit statement of the law of cosines in a form suitable for triangulation. In one of his numerical approximations of pi, he correctly computed 2π to 16 decimal places, far more accurately than any earlier estimate.

The Al-Kashi project aims to provide a rich PHP package of statistical functions useful for online business intelligence and data mining. Possible applications include online log file analysis, ad and campaign statistics, and on-the-fly analysis of survey or voting results. It is published under the GPL license; you can download it from the PHPClasses.org website, and you can check the change log here.

Khaled Al-Sham'aa
E-Mail account

Would you like to know more about statistical concepts and procedures implemented in this project? Please download this free electronic book assembled from Wikipedia articles to get detailed background information.

For more information about this project in Arabic, I refer you to these blog posts.

Example Data

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models). You can download the example data file from here.

    $sep = "\t"; $nl  = "\n";

    $content = file_get_contents('data.txt');

    $records = explode($nl, $content);
    $header  = explode($sep, trim(array_shift($records)));
    $data    = array_fill_keys($header, array());

    foreach ($records as $id=>$record) {
        $record = trim($record);
        if ($record == '') continue;
    
        $fields = explode($sep, $record);
        $titles = $header;
        
        foreach ($fields as $field) {
            $title = array_shift($titles);
            $data[$title][] = $field;
        }
    }

    $x = $data['wt'];
    $y = $data['mpg'];

    require('kashi.php');

    $kashi = new Kashi();

Summary Statistics:

Mean (x): 3.21725
Mean (x, "geometric"): 3.0701885671208
Mean (x, "harmonic"): 2.9182632148104
Median (x): 3.325
Mode (x): Array ( [0] => 3.44 )
Variance (x): 0.95737896774194
SD (x): 0.9784574429897
%CV (x): 30.412850819479
Skewness (x): 0.46591610679299
Is it significant (i.e. test it against 0)? bool(false)
Kurtosis (x): 0.41659466963493
Is it significant (i.e. test it against 0)? bool(false)

Rank (x)
9, 12, 7, 16, 18, 21, 23, 15, 13, 18, 18, 29, 25, 26, 30, 32, 31, 6, 2, 3, 8, 22, 17, 27, 28, 4, 5, 1, 14, 10, 23, 11

    // $x is an array of values
    echo 'Arithmetic Mean: ' . $kashi->mean($x) . "\n";
    echo 'Geometric Mean: ' . $kashi->mean($x, "geometric") . "\n";
    echo 'Harmonic Mean: ' . $kashi->mean($x, "harmonic") . "\n";
    // Pass true so print_r() returns the text instead of printing it directly
    echo 'Mode: ' . print_r($kashi->mode($x), true) . "\n";
    echo 'Median: ' . $kashi->median($x) . "\n";
    echo 'Variance: ' . $kashi->variance($x) . "\n";
    echo 'SD: ' . $kashi->sd($x) . "\n";
    echo '%CV: ' . $kashi->cv($x) . "\n";
    echo 'Skewness: ' . $kashi->skew($x) . "\n";
    echo 'Is it significant (i.e. test it against 0)? ';
    var_dump($kashi->isSkew($x));
    echo 'Kurtosis: ' . $kashi->kurt($x) . "\n";
    echo 'Is it significant (i.e. test it against 0)? ';
    var_dump($kashi->isKurt($x));
    echo 'Rank (x): ' . implode(', ', $kashi->rank($x)) . "\n";
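As background for the three mean variants listed above, here is a minimal plain-PHP sketch of the textbook definitions. It does not use kashi.php, and the helper names (arithmeticMean, geometricMean, harmonicMean) are hypothetical, not part of the Al-Kashi API. The sample values are the first five wt entries of the example data.

```php
<?php
// Hypothetical helpers illustrating the textbook definitions;
// they are NOT part of kashi.php.

function arithmeticMean(array $v): float
{
    return array_sum($v) / count($v);
}

function geometricMean(array $v): float
{
    // exp of the mean of the logs; assumes strictly positive values
    return exp(array_sum(array_map('log', $v)) / count($v));
}

function harmonicMean(array $v): float
{
    // n divided by the sum of reciprocals; assumes non-zero values
    $reciprocals = array_map(function ($x) { return 1 / $x; }, $v);
    return count($v) / array_sum($reciprocals);
}

// First five wt values from the example data
$sample = [2.62, 2.875, 2.32, 3.215, 3.44];

echo 'Arithmetic: ' . arithmeticMean($sample) . "\n";
echo 'Geometric: ' . geometricMean($sample) . "\n";
echo 'Harmonic: ' . harmonicMean($sample) . "\n";
```

For positive data the three values are always ordered harmonic <= geometric <= arithmetic, which is also the ordering seen in the summary output for the full wt column.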

Statistical Graphics:

Boxplot
Array
(
    [min] => 1.513
    [q1] => 2.62
    [median] => 3.325
    [q3] => 3.73
    [max] => 5.282
    [outliers] => Array
        (
            [0] => 5.345
            [1] => 5.424
        )

)
Histogram
Array
(
    [1.513-2.002] => 4
    [2.002-2.491] => 4
    [2.491-2.98] => 4
    [2.98-3.469] => 9
    [3.469-3.957] => 7
    [3.957-4.446] => 1
    [4.446-4.935] => 0
    [4.935-5.424] => 3
)
Normal Q-Q Plot
x = -0.62609901275838, -0.36012989155586, -0.83051087731871, -0.039176085543034, 0.11776987461046, 0.36012989155586, 0.53340970683585, -0.11776987461046, -0.27769043950814, 0.19709908415753, 0.27769043950814, 1.2298587580185, 0.72451438304624, 0.83051087731871, 1.417797139161, 2.1538746917937, 1.6759397215193, -0.94678175657479, -1.6759397215193, -1.417797139161, -0.72451438304624, 0.44509652516901, 0.039176085543034, 0.94678175657479, 1.0775155681381, -1.2298587580185, -1.0775155681381, -2.1538746917937, -0.19709908415753, -0.53340970683585, 0.62609901275838, -0.44509652516901

y = 2.62, 2.875, 2.32, 3.215, 3.44, 3.46, 3.57, 3.19, 3.15, 3.44, 3.44, 4.07, 3.73, 3.78, 5.25, 5.424, 5.345, 2.2, 1.615, 1.835, 2.465, 3.52, 3.435, 3.84, 3.845, 1.935, 2.14, 1.513, 3.17, 2.77, 3.57, 2.78
Ternary Plot
x = 0.729, 0.722, 0.734, 0.706, 0.695, 0.675, 0.659, 0.723, 0.701, 0.692, 0.679, 0.663, 0.676, 0.654, 0.577, 0.574, 0.625, 0.779, 0.785, 0.788, 0.716, 0.667, 0.664, 0.645, 0.691, 0.763, 0.766, 0.796, 0.689, 0.723, 0.672, 0.718

y = 0.356, 0.36, 0.369, 0.382, 0.376, 0.419, 0.407, 0.364, 0.406, 0.387, 0.408, 0.398, 0.395, 0.422, 0.463, 0.459, 0.403, 0.312, 0.317, 0.31, 0.394, 0.407, 0.417, 0.41, 0.368, 0.34, 0.323, 0.3, 0.375, 0.354, 0.381, 0.377

    echo "Boxplot:\n";
    print_r($kashi->boxplot($x));
    echo "\n\n";

    echo "Histogram:\n";
    print_r($kashi->hist($x, 8));
    echo "\n\n";

    echo "Normal Q-Q Plot:\n";
    $qq = $kashi->qqnorm($x);
    echo 'x = ' . implode(', ', $qq['x']) . "\n";
    echo 'y = ' . implode(', ', $qq['y']) . "\n";

    echo "Ternary Plot:\n";
    $xy = $kashi->ternary($data['wt'], $data['mpg'], $data['qsec']);
    echo 'x = ' . implode(', ', $xy['x']) . "\n";
    echo 'y = ' . implode(', ', $xy['y']) . "\n";

Correlation, Regression, and t-Test:

Covariance (x, y): -5.1166846774194
Correlation (x, y): -0.86765937651723
Significance of Correlation: 1.2939593840855E-10
Path Analysis
Array
(
    [1] => -0.70763801614376
    [2] => -0.20274707094052
    [3] => 0.15145821845688
)
Regression (y = a + b*x)
Array
(
    [intercept] => 37.285126167342
    [slope] => -5.3444715727227
    [r-square] => 0.75283279365826
    [adj-r-square] => 0.74459388678021
    [intercept-se] => 1.8776273372559
    [intercept-2.5%] => 33.450499570026
    [intercept-97.5%] => 41.119752764658
    [slope-se] => 0.55910104509932
    [slope-2.5%] => -6.486308238383
    [slope-97.5%] => -4.2026349070623
    [F-statistic] => 91.375325003762
    [p-value] => 1.2939604943085E-10
)
Multiple Regression (y = a + b1*x1 + b2*x2)
Array
(
    [intercept] => 37.227270116447
    [b1] => -3.8778307424046
    [b2] => -0.031772946982161
    [r-square] => 0.82678545188279
    [adj-r-square] => 0.81483962097816
    [intercept-se] => 0
    [intercept-2.5%] => 37.227270116447
    [intercept-97.5%] => 37.227270116447
    [b1-se] => 0
    [b1-2.5%] => -3.8778307424046
    [b1-97.5%] => -3.8778307424046
    [b2-se] => 0
    [b2-2.5%] => -0.031772946982161
    [b2-97.5%] => -0.031772946982161
    [F-statistic] => 69.211213391777
    [p-value] => 9.1090543852236E-12
)
t-Test unpaired: -15.632569384303
Test of null hypothesis that mean of x = mean of y (assumed equal variances): 2.2204460492503E-16
Test of null hypothesis that mean of x = mean of y (assumed unequal variances): 0
t-Test paired: -13.847209446072
Test of null hypothesis that mean of x-y = 0. Probability is 8.1046280797636E-15

    echo 'Covariance: ' . $kashi->cov($x, $y) . "\n";
    echo 'Correlation: ' . $kashi->cor($x, $y) . "\n";

    $r = $kashi->cor($x, $y);
    $n = count($x);
    echo 'Significance of Correlation: ' . $kashi->corTest($r, $n) . "\n";

    echo 'Path Analysis: ' . print_r($kashi->path($y, array(1=>$x, $data['hp'], $data['qsec'])), true) . "\n";
    echo 'Regression: ' . print_r($kashi->lm($y, $x), true) . "\n";
    // The closing parenthesis of lm() must come before the ", true" of print_r()
    echo 'Multiple Regression: ' . print_r($kashi->lm($data['mpg'], $data['wt'], $data['hp']), true) . "\n";

    echo 't-Test unpaired: ' . $kashi->tTest($x, $y, false) . "\n";
    echo 'Test (assumed equal variances): ' . $kashi->tDist($kashi->tTest($x, $y, false), $kashi->tTestDf($x, $y, true, false)) . "\n";
    echo 'Test (assumed unequal variances): ' . $kashi->tDist($kashi->tTest($x, $y, false), $kashi->tTestDf($x, $y, false, false)) . "\n";
    echo 't-Test paired: ' . $kashi->tTest($x, $y, true) . "\n";
    echo 'Test: ' . $kashi->tDist($kashi->tTest($x, $y, true), $kashi->tTestDf($x, $y, false, true)) . "\n";
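To make the covariance, correlation, and simple regression figures above easier to interpret, the following is a minimal sketch of the standard least-squares formulas in plain PHP. The function name simpleLinearFit and its return keys are hypothetical and are not the kashi.php implementation; the sample points are made up for illustration and are not taken from data.txt.

```php
<?php
// Hypothetical helper, NOT part of kashi.php: least-squares fit of y = a + b*x
// together with the sample covariance and Pearson correlation.
// Assumes both arrays are 0-indexed lists of numbers with equal length.
function simpleLinearFit(array $x, array $y): array
{
    $n = count($x);
    if ($n < 2 || $n !== count($y)) {
        throw new InvalidArgumentException('Need two equal-length arrays with at least 2 points');
    }

    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    // Sums of squared deviations and cross-products around the means
    $sxy = $sxx = $syy = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $dx = $x[$i] - $meanX;
        $dy = $y[$i] - $meanY;
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }

    $slope = $sxy / $sxx;
    $r     = $sxy / sqrt($sxx * $syy);

    return [
        'covariance'  => $sxy / ($n - 1),   // sample covariance (n - 1 denominator)
        'correlation' => $r,                // Pearson r
        'slope'       => $slope,
        'intercept'   => $meanY - $slope * $meanX,
        'r-square'    => $r * $r,           // equals r^2 for a single predictor
    ];
}

// Made-up illustrative points
print_r(simpleLinearFit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]));
```

With a single predictor, r-square is simply the squared correlation, which is consistent with the output above: (-0.86765937651723)^2 is approximately 0.7528.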

Distributions:

Normal distribution (x=0.5, mean=0, sd=1): 0.3520653267643
Probability for the Student t-distribution (t=3, n=10) one-tailed: 0.01334365502257
Probability for the Student t-distribution (t=3, n=10) two-tailed: 0.0066718275112848
Probability for F distribution (f=2, df1=12, df2=15): 0.10268840717083
Inverse of the standard normal cumulative distribution, with a probability of (p=0.95): 1.6448536251337
t-value of the Student's t-distribution for the probability $p and $n degrees of freedom (p=0.05, n=29): 2.0452296438589

Standardize (x)
(mean=0 & variance=1)
-0.61039956748153, -0.34978526910097, -0.91700462439985, -0.002299537926887, 0.22765425476185, 0.24809459188973, 0.36051644609311, -0.027849959336746, -0.068730633592521, 0.22765425476185, 0.22765425476185, 0.8715248742903, 0.52403914311621, 0.57513998593593, 2.0775047648356, 2.2553356978483, 2.1745963661931, -1.0396466471672, -1.6375265081579, -1.4126827997511, -0.76881218022266, 0.3094156032734, 0.22254417047987, 0.63646099731959, 0.64157108160156, -1.3104811141117, -1.1009676585508, -1.7417722275101, -0.048290296464633, -0.45709703902238, 0.36051644609311, -0.44687687045844

    echo 'Normal distribution (x=0.5, mean=0, sd=1): ' . $kashi->norm(0.5, 0, 1) . "\n";
    echo 'Probability for the Student t-distribution (t=3, n=10) one-tailed: ' . $kashi->tDist(3, 10, 1) . "\n";
    echo 'Probability for the Student t-distribution (t=3, n=10) two-tailed: ' . $kashi->tDist(3, 10, 2) . "\n";
    echo 'F probability distribution (f=2, df1=12, df2=15): ' . $kashi->fDist(2, 12, 15) . "\n";
    echo 'Inverse of the standard normal cumulative distribution (p=0.95): ' . $kashi->inverseNormCDF(0.95) . "\n";
    echo 't-value of the Student\'s t-distribution (p=0.05, n=29): ' . $kashi->inverseTCDF(0.05, 29) . "\n";
    echo 'Standardize (x) (i.e. mean=0 & variance=1): ' . implode(', ', $kashi->standardize($x)) . "\n";
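Standardizing rescales each value to (value - mean) / SD. The sketch below shows that calculation in plain PHP, assuming the sample standard deviation (n - 1 denominator); the helper name zScores is hypothetical and is not the kashi.php implementation. The assumption looks consistent with the output above, since (2.62 - 3.21725) / 0.9784574429897 is approximately -0.6104, the first standardized value listed.

```php
<?php
// Hypothetical helper, NOT part of kashi.php: z-scores using the sample SD.
// Assumes at least two values that are not all identical.
function zScores(array $v): array
{
    $n    = count($v);
    $mean = array_sum($v) / $n;

    // Sum of squared deviations from the mean
    $ss = 0.0;
    foreach ($v as $value) {
        $ss += ($value - $mean) ** 2;
    }
    $sd = sqrt($ss / ($n - 1));

    return array_map(function ($value) use ($mean, $sd) {
        return ($value - $mean) / $sd;
    }, $v);
}

// First five wt values from the example data
echo implode(', ', zScores([2.62, 2.875, 2.32, 3.215, 3.44])) . "\n";
```

Note that the z-scores depend on the mean and SD of whatever array is passed in, so standardizing a five-value subset gives different numbers than standardizing the full column.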

Chi-square test or Contingency tables (A/B testing):

Calculate the probability that the distribution of the number of cylinders is the same in automatic and manual transmission cars.
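As a rough illustration of what such a test computes, here is a minimal plain-PHP sketch of Pearson's chi-square test of independence on a contingency table. The function name chiSquareIndependence is hypothetical and this is not the kashi.php implementation. The counts are illustrative values assumed for this sketch (rows: automatic, manual; columns: 4, 6, 8 cylinders) rather than read from data.txt. The p-value is only computed for 2 degrees of freedom, where a closed form exists.

```php
<?php
// Hypothetical helper, NOT part of kashi.php: Pearson chi-square test of
// independence for a contingency table given as an array of rows.
function chiSquareIndependence(array $table): array
{
    // Marginal totals
    $rowTotals = array_map('array_sum', $table);
    $colTotals = [];
    foreach ($table as $row) {
        foreach ($row as $j => $count) {
            $colTotals[$j] = ($colTotals[$j] ?? 0) + $count;
        }
    }
    $total = array_sum($rowTotals);

    // Sum of (observed - expected)^2 / expected over all cells
    $chi2 = 0.0;
    foreach ($table as $i => $row) {
        foreach ($row as $j => $observed) {
            $expected = $rowTotals[$i] * $colTotals[$j] / $total;
            $chi2 += ($observed - $expected) ** 2 / $expected;
        }
    }

    $df = (count($rowTotals) - 1) * (count($colTotals) - 1);

    // For df = 2 the upper-tail probability is exp(-chi2 / 2).
    // Other df need the incomplete gamma function, which is omitted here.
    $p = ($df === 2) ? exp(-$chi2 / 2) : null;

    return ['chi-square' => $chi2, 'df' => $df, 'p-value' => $p];
}

// Illustrative counts (an assumption, not read from data.txt):
// rows = automatic, manual; columns = 4, 6, 8 cylinders.
print_r(chiSquareIndependence([
    [3, 4, 12],
    [8, 3, 2],
]));
```

A small p-value suggests that the cylinder distribution differs between the two transmission types; a large one means the data give no evidence of a difference.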