In statistics, linear regression is an approach to modeling the relationship between a scalar dependent
variable y and one or more explanatory variables denoted x, while correlation is a statistical measure
of the dependence between the two variables.
This tool computes the linear relationship between the x and y values (the formula y = ax + b)
and the Pearson correlation coefficient (r), which describes the degree of that linear dependence.
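For reference, the slope a, the intercept b, and the coefficient r can be written as sums over the n pairs of values. The expressions below restate what the `correlation_regression` function in the source computes; the summation notation is added here for illustration, with every sum running over i = 1, ..., n.

```latex
\begin{aligned}
a &= \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} \\[4pt]
b &= \frac{\sum y_i}{n} - a\,\frac{\sum x_i}{n} \\[4pt]
r &= \frac{\sum x_i y_i - \frac{1}{n}\sum x_i \sum y_i}
          {\sqrt{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2}\;
           \sqrt{\sum y_i^2 - \frac{1}{n}\left(\sum y_i\right)^2}}
\end{aligned}
```

The slope is undefined when all x values are identical, and r is undefined when either all x values or all y values are identical, because the corresponding denominator is zero.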
To use this tool, enter the x values and the corresponding dependent y values in the form.
Each value for x and y must be on its own line, and the same number of values is required for x and y.
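The input rule above can be sketched as follows. This is a minimal illustration, not the tool's own parsing code: `parse_values` is a hypothetical helper, and the sample strings stand in for the text submitted through the form.

```php
<?php
// Hypothetical helper (not part of the original tool): turn text with
// one number per line into an array of floats, or null if a line is invalid.
function parse_values(string $text): ?array {
    $values = [];
    // \R matches any line-break style (\r\n, \n, or \r).
    foreach (preg_split('/\R/', trim($text)) as $line) {
        $line = trim($line);
        if ($line === '') { continue; }          // skip blank lines
        if (!is_numeric($line)) { return null; } // reject non-numeric input
        $values[] = (float) $line;
    }
    return $values;
}

// Made-up sample input, one value per line.
$xs = parse_values("1\n2\n3");
$ys = parse_values("2.5\r\n4\r\n5.5");

// Both lists must be valid and contain the same number of values.
$ok = $xs !== null && $ys !== null && count($xs) === count($ys);
var_dump($ok);
```

Rejecting the whole input on the first non-numeric line is one possible design choice; silently skipping such lines would make the x and y lists drift out of step.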
Non-linear relationships between two variables are often linearized by replacing the x or y values
with their base-10 logarithms or their squares. Check the corresponding checkboxes to apply these transformations when required.
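To see why such a transformation helps, consider data that follow y = 2 * 10^x. Taking log10 of y gives log10(y) = x + log10(2), which is a straight line. The sketch below uses made-up data and a hypothetical helper `fit_line` that applies the same least-squares formulas as the tool.

```php
<?php
// Hypothetical helper for illustration: least-squares fit of y = a*x + b.
function fit_line(array $xs, array $ys): array {
    $n = count($xs);
    $sx = array_sum($xs);
    $sy = array_sum($ys);
    $sxx = 0.0;
    $sxy = 0.0;
    foreach ($xs as $i => $x) {
        $sxx += $x * $x;
        $sxy += $x * $ys[$i];
    }
    $a = ($n * $sxy - $sx * $sy) / ($n * $sxx - $sx * $sx);
    $b = $sy / $n - $a * $sx / $n;
    return ['a' => $a, 'b' => $b];
}

// Made-up data that follow y = 2 * 10^x exactly.
$xs = [1, 2, 3, 4];
$ys = [20, 200, 2000, 20000];

// Replace each y value with its base-10 logarithm before fitting.
$log_ys = array_map('log10', $ys);

$fit = fit_line($xs, $log_ys);
printf("log10(y) = %.3f * x + %.3f\n", $fit['a'], $fit['b']);
// prints: log10(y) = 1.000 * x + 0.301
```

The fitted intercept 0.301 is log10(2), so the original relationship can be recovered as y = 10^b * 10^(a*x). Note that log10 is only defined for positive values, so this transformation assumes every transformed value is greater than zero.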
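// logx: replace each x value with its base-10 logarithm
if(($_POST["logx"] ?? 0)==1){
foreach($vals_x_array as $k => $v){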
$vals_x_array[$k]=log10($v);
}}
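// logy: replace each y value with its base-10 logarithm
if(($_POST["logy"] ?? 0)==1){
foreach($vals_y_array as $k => $v){
$vals_y_array[$k]=log10($v);
}}
// x^2: replace each x value with its square
if(($_POST["x2"] ?? 0)==1){
foreach($vals_x_array as $k => $v){
$vals_x_array[$k]=$v*$v;
}}
// y^2: replace each y value with its square
if(($_POST["y2"] ?? 0)==1){
foreach($vals_y_array as $k => $v){
$vals_y_array[$k]=$v*$v;
}}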
// compute correlation_regression
$curve=correlation_regression ($vals_x_array,$vals_y_array);
// print results
print_form($vals_x,$vals_y);
if ($curve){
print_results($curve["a"],$curve["b"],$curve["r"]);
}else{
print "Error: input data is not correct";
}
}
function correlation_regression ($vals_x,$vals_y){
if (sizeof($vals_x)!= sizeof($vals_y)){return;}
$sum_x=0;
$sum_x2=0;
$sum_y=0;
$sum_y2=0;
$sum_xy=0;
$n=sizeof($vals_x);
foreach($vals_x as $key => $val){
$val_x=$val;
$val_y=$vals_y[$key];
$sum_x+=$val_x;
$sum_x2+=$val_x*$val_x;
$sum_y+=$val_y;
$sum_y2+=$val_y*$val_y;
$sum_xy+=$val_x*$val_y;
//print "$val_x\t$val_y\n";
}
//print " sum_x\t$sum_x\nsum_y\t$sum_y\nsum_x2\t$sum_x2\nsum_y2\t$sum_y2\nsum_xy\t$sum_xy\n";
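// y=ax+b
// the slope is undefined when all x values are identical (or fewer than two pairs are given)
$den_x=$n*$sum_x2-$sum_x*$sum_x;
if ($den_x==0){return;}
// calculate a (slope)
$curve["a"]=($n*$sum_xy-$sum_x*$sum_y)/$den_x;
// calculate b (intercept)
$curve["b"]=($sum_y/$n)-($curve["a"]*$sum_x/$n);
// calculate r (Pearson correlation coefficient); NAN when all y values are identical
$den_r=sqrt($sum_x2-(1/$n)*$sum_x*$sum_x)*sqrt($sum_y2-(1/$n)*$sum_y*$sum_y);
$curve["r"]=($den_r==0)?NAN:($sum_xy-(1/$n)*$sum_x*$sum_y)/$den_r;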
return $curve;
}
//########print form
function print_form($vals_x,$vals_y,$a,$b,$r){
?>
2";}
$y="y";
if(($_POST["logy"] ?? 0)==1){$y="logy";}
if(($_POST["y2"] ?? 0)==1){$y="y2";}
print "
Values for curve $y=a$x+b
a = $a
b = $b
Correlation (r) = $r
";
}
//########print example
function print_example($a,$b,$r){
$apples=round($a*35+$b);
print "
Example: The number of apples in each box arriving at the restaurant and the weight of each box
in kilograms were recorded. The data are shown in the form above.
We want to estimate the number of apples in a new box when it arrives at the restaurant,
so that we can decide how many menus with apples to offer our clients.
We have computed the linear regression between both parameters and obtained the
values a=$a and b=$b to be used in the formula y=ax+b.
When a 35-kilogram box arrives at the restaurant, the number of apples in the box
is easily estimated by applying the formula:
y= $a *35 + $b = $apples apples
Because the correlation coefficient is high (r = $r), the computed number of apples
should be a good estimate.
";
}
?>