This page serves as an introduction to some basic statistics and shows how Perl can be used to compute them, along with a general overview of Perl.
Much of basic statistics is built around the normal distribution, which is represented
by the Gaussian function, or bell curve. This is the distribution you
would see if you dropped thousands of balls from the same spot and had them
cascade through a lattice of square pegs. For a perfect, infinite, continuous
distribution you can define concepts such as the mean, median, mode, and
standard deviation. In reality our data sets are always
finite, so it is necessary to make some slight modifications to the definitions.
Mean = Sum of all values / N (number of Values)
Median = Value point where 50% of the values are above and 50% are below.
For a symmetric curve such as the bell curve this coincides with the mean, which is the knife edge that you could balance the curve on.
Mode = The most frequently occurring value.
Sample Standard Deviation = SQRT( Sum_of( (X - Mean)**2 ) / (N - 1) )
So, putting this in words: take the difference between each value and the mean and square it. Sum these squares over all values, then divide the result by N - 1, the number of values minus one, and take the square root. Note that we divide by N - 1 rather than N because the mean is itself estimated from the same sample, which uses up one degree of freedom. This is called Bessel's correction. If we had the entire population rather than a finite sample we would divide by N, but we are finite beings, remember!
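As a rough sketch of the definitions above (this is not the perlstat script itself; the sub names and the sample data are made up for illustration), the four quantities could be computed in Perl along these lines:

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Mean: sum of all values divided by N.
sub mean {
    my @values = @_;
    return sum(@values) / @values;
}

# Median: middle value of the sorted list, or the average of the
# two middle values when N is even.
sub median {
    my @sorted = sort { $a <=> $b } @_;
    my $mid = int(@sorted / 2);
    return @sorted % 2 ? $sorted[$mid]
                       : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Mode: the most frequently occurring value(s). A hash counts how
# often each value appears; ties are all returned, in numeric order.
sub mode {
    my %count;
    $count{$_}++ for @_;
    my $highest = max(values %count);
    return sort { $a <=> $b } grep { $count{$_} == $highest } keys %count;
}

# Sample standard deviation with Bessel's correction (divide by N - 1).
# Needs at least two values.
sub sample_stddev {
    my @values = @_;
    my $mean   = mean(@values);
    my $sum_sq = sum(map { ($_ - $mean) ** 2 } @values);
    return sqrt($sum_sq / (@values - 1));
}

# Hypothetical sample data, just to exercise the subs.
my @data = (2, 4, 4, 4, 5, 5, 7, 9);
printf "Mean   = %.3f\n", mean(@data);
printf "Median = %.3f\n", median(@data);
printf "Mode   = %s\n",   join(", ", mode(@data));
printf "StdDev = %.3f\n", sample_stddev(@data);
```

Call mode in list context, since more than one value can tie for most frequent.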
Here is the
perlstat script to do everything in the last paragraph.
Note that it has one fancy construct in it: a hash built from an array. Other than that, all the
routines are very straightforward. You can review how the
map command generates a hash table from an array in the Perl tutorials
referenced in the next section. It is under:
Data Structures: Scalars, Array, Hashes
There are numerous good Perl tutorials on the web. Please look at
this one for starters:
Here are some of my favorite scripts which I use:
A web server can additionally run Perl scripts to generate web pages.
This is called CGI, the Common Gateway Interface.
To enable CGI you must have Perl installed on your Linux box or Mac OS
X system. See
Note that on Mac OS X you must additionally edit a
file called /etc/httpd/users/victor.conf . This assumes your
username is victor. Yours may be called something different.
Here is a sample:
If you would rather skip the configuration and
just try my global httpd.conf file from the /etc/httpd
directory, here it is:
The point is that there are two levels to running CGI Perl
scripts. First, the Apache Setup tells you how to modify
/etc/httpd/httpd.conf . This file says how your Apache web
server works on a global level. Second, each user has
their own configuration file, called username.conf, which is read in addition to the global
one. This is where we allow victor
to run CGI scripts in the directory /Users/victor/Sites
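To give a feel for what the server actually runs once CGI is enabled, here is a minimal sketch of a CGI script. The sub name cgi_response and the page text are made up for illustration. The one essential rule is that the script prints a Content-type header followed by a blank line before any HTML.

```perl
use strict;
use warnings;

# Build the complete CGI response: a header line, a blank line,
# and then the HTML body. (Hypothetical helper, not from the samples.)
sub cgi_response {
    my ($title, $message) = @_;
    my $header = "Content-type: text/html\n\n";
    my $body   = "<html><head><title>$title</title></head>"
               . "<body><h1>$message</h1></body></html>\n";
    return $header . $body;
}

# The web server captures whatever the script prints to standard
# output and sends it back to the browser.
print cgi_response("Hello", "Hello from Perl CGI");
```

Saved under a name ending in .cgi and made executable, a script like this can also be run from the command line first to check its output.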
Following are some sample CGI Scripts to try:
Is there a way to test these scripts on my own localhost, that is
without being connected to the Internet?
Of course there is. Remember that the built-in loopback address, 127.0.0.1, is also called localhost. So just type the following in your browser on Mac OS X:
Note, this assumes the username is victor and the *.cgi files live in /Users/victor/Sites
or on your Linux host:
Note, this assumes you have copied the *.cgi files to the default directory of /var/www/cgi-bin
For the Linux host, all you have to do is install Apache. You do not have to edit any httpd.conf files! To install Apache, just type urpmi apache in the Terminal Konsole, which you should be very familiar with by now. Feed it the CDs, let it update, and you are done.
Let me finally leave you with one advanced Perl gem. Suppose
you want to change all the *.html files in a directory, replacing
"converttolinux.com" with "localhost". That is, you are testing your
web page on the localhost before uploading it with the new changes.
Click here to see the Perl file. Note that it is based on the sed command for globally
editing masses of files. It uses the command:
sed -e 's/foo/bar/g' myfile.txt which you probably already know. Note that you have to have the g in there to make it global; otherwise it will only replace the first occurrence on each line.
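The same global replacement can also be sketched in pure Perl, without calling sed. This is only an illustration and not the script linked above; the sub names replace_all and replace_in_file are hypothetical. Note the \Q...\E, which makes the dots in "converttolinux.com" match literally instead of matching any character.

```perl
use strict;
use warnings;

# Replace every occurrence of $from with $to in a string.
# The /g flag makes the substitution global, as with sed.
sub replace_all {
    my ($text, $from, $to) = @_;
    $text =~ s/\Q$from\E/$to/g;
    return $text;
}

# Read a whole file, apply the replacement, and write it back.
# This overwrites the file, so try it on copies first.
sub replace_in_file {
    my ($path, $from, $to) = @_;
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my $text = do { local $/; <$in> };   # slurp the entire file
    close($in);
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} replace_all($text, $from, $to);
    close($out);
}

# Process every file named on the command line.
replace_in_file($_, 'converttolinux.com', 'localhost') for @ARGV;
```

If this were saved as, say, replace.pl (a made-up name), it would be run as perl replace.pl *.html from the directory holding the pages.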
A Perl script can also help decipher the output
from tcpdump when troubleshooting network problems. Suppose
you wanted to see all the HTTP traffic on port 80 between you and a
certain host, for example:
tcpdump -xls 1500 port 80 and host converttolinux.com | ./tcpdump-data-filter.pl > tcpdump.out
The tcpdump-data-filter.pl Perl script prints the ASCII characters at the end of each output line so you can make sense of the hexadecimal values that tcpdump generates. Here is the source for the Perl file.
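The core idea of such a filter can be sketched as follows. This is not the source of tcpdump-data-filter.pl; the sub names are hypothetical, and the layout of the hex lines (groups of hex digits separated by spaces, optionally preceded by an offset such as 0x0000:) is an assumption that may differ between tcpdump versions.

```perl
use strict;
use warnings;

# Turn a string of hex digits into printable ASCII, showing any
# non-printable byte as a dot.
sub hex_to_ascii {
    my ($hex) = @_;
    $hex =~ s/\s+//g;                    # drop spaces between groups
    my $ascii = '';
    for my $byte ($hex =~ /([0-9a-fA-F]{2})/g) {
        my $code = hex($byte);
        $ascii .= ($code >= 0x20 && $code <= 0x7e) ? chr($code) : '.';
    }
    return $ascii;
}

# Append the ASCII form to lines that look like hex dumps and pass
# all other lines through unchanged.
sub annotate_line {
    my ($line) = @_;
    chomp $line;
    if ($line =~ /^\s*(?:0x[0-9a-f]+:)?\s*((?:[0-9a-f]{2,4}\s+)*[0-9a-f]{2,4})\s*$/i) {
        return "$line  " . hex_to_ascii($1) . "\n";
    }
    return "$line\n";
}

# Sample hex line (made up); its bytes spell "GET / HTTP".
print annotate_line("\t0x0000:  4745 5420 2f20 4854 5450\n");
```

To use it as a filter in a pipeline like the one above, the last line would be replaced with a loop such as: print annotate_line($_) while <STDIN>;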
Now that you know quite a bit about Perl, you may be tempted to
write really fancy code. However, you need to think about the person who has to maintain it after you. Here is an example of structured Perl. Note that
it uses the -w switch on the initial invocation of perl, as in:
This enables warnings. Additionally, we use the parameter:
Note that all subroutines and functions are declared in advance, as are all variables and arrays. This may seem tedious, but it really pays off in the long run. Click here to see the text.
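A skeleton in this style might look like the sketch below. It is not the linked example. The interpreter path /usr/bin/perl and the use of "use strict" to force variable declarations are my assumptions about a typical setup, and the average sub is made up.

```perl
#!/usr/bin/perl -w
# The -w switch above turns on warnings.
use strict;            # assumed: forces every variable to be declared

# Subroutines are declared in advance...
sub average;

# ...and so are all variables and arrays.
my @scores = (90, 80, 70);
my $result;

# Main program.
$result = average(@scores);
print "Average = $result\n";

# Return the average of a list of numbers, or 0 for an empty list.
sub average {
    my @values = @_;
    my $total  = 0;
    $total += $_ for @values;
    return @values ? $total / @values : 0;
}
```

With strict in effect, a misspelled variable name is caught when the script is compiled instead of quietly creating a new, empty variable.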
For a second structured example, click here to see the text. This program filters a list that was downloaded from a website to turn it into a US Postal Service mailing list. Here are the input and the output files.
Making use of some of the functions in the previous script, you can use this perlstopproc.txt script to kill processes stuck in memory on a Mac, based upon a search pattern. Use with caution. Type ./perlstopproc.txt with no parameters and read the caution statement. I use it every day to stop the HPscanjet program, using ./perlstopproc HP to kill the process. Note that if you are using Ubuntu, you should use this script instead.
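The pattern-matching half of such a script can be sketched safely, without killing anything. This is not the perlstopproc script; the sub name matching_pids is hypothetical, and the assumed input is a list of lines with the process ID first and the command after it, such as the output of ps axo pid,command.

```perl
use strict;
use warnings;

# Return the process IDs of all lines whose command contains $pattern.
# Each line is assumed to look like "  PID COMMAND ...".
sub matching_pids {
    my ($pattern, @ps_lines) = @_;
    my @pids;
    for my $line (@ps_lines) {
        # Skip anything without a leading number, such as the header.
        my ($pid, $command) = $line =~ /^\s*(\d+)\s+(.*)$/ or next;
        push @pids, $pid if $command =~ /\Q$pattern\E/;
    }
    return @pids;
}

# Made-up sample lines; a real script might instead read them with
#   my @lines = `ps axo pid,command`;
my @lines = (
    "  PID COMMAND",
    "  101 /Applications/HPScanjet",
    "  202 /usr/sbin/httpd",
);
print "Would stop: ", join(" ", matching_pids("HP", @lines)), "\n";
```

Printing the list first, and only then passing the IDs to Perl's kill function, is one way to respect the "use with caution" warning.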
For a third structured example, click here to see the text. This program converts a Roman Numeral under MMMM to its Arabic Value.
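One common way to do this conversion is sketched below: add up the value of each letter, but subtract it when a larger letter follows (so IV is 4). This is a generic sketch, not the linked program, and the sub name is made up. It does not check that the numeral is well formed.

```perl
use strict;
use warnings;

# Value of each Roman letter.
my %roman_value = (I => 1, V => 5, X => 10, L => 50,
                   C => 100, D => 500, M => 1000);

# Convert a Roman numeral to its Arabic value; returns undef if the
# string contains anything other than Roman letters.
sub roman_to_arabic {
    my ($roman) = @_;
    return undef unless $roman =~ /^[IVXLCDM]+$/i;
    my @digits = map { $roman_value{$_} } split //, uc($roman);
    my $total  = 0;
    for my $i (0 .. $#digits) {
        # A smaller value in front of a larger one is subtracted.
        if ($i < $#digits && $digits[$i] < $digits[$i + 1]) {
            $total -= $digits[$i];
        } else {
            $total += $digits[$i];
        }
    }
    return $total;
}

print "MCMXCIV = ", roman_to_arabic("MCMXCIV"), "\n";   # prints 1994
```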
For a fourth structured example, click here to see the text. This program converts an Arabic Number less than 4000 to a Roman Numeral. Note, both this and the previous example are based on Ozawa Sakuro's program available on the web. See the comments of this last example for more details.
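The reverse direction is often done with a table of values, largest first, that includes the subtractive pairs such as CM and IV. The sketch below uses that generic approach; it is not claimed to be how the linked program or Ozawa Sakuro's original works, and the names are made up.

```perl
use strict;
use warnings;

# Value/symbol pairs from largest to smallest, including the
# subtractive forms.
my @roman_table = (
    [1000, 'M'], [900, 'CM'], [500, 'D'], [400, 'CD'],
    [100,  'C'], [90,  'XC'], [50,  'L'], [40,  'XL'],
    [10,   'X'], [9,   'IX'], [5,   'V'], [4,   'IV'], [1, 'I'],
);

# Convert a whole number from 1 to 3999 to a Roman numeral;
# returns undef for anything else.
sub arabic_to_roman {
    my ($number) = @_;
    return undef unless $number =~ /^\d+$/ && $number > 0 && $number < 4000;
    my $roman = '';
    for my $pair (@roman_table) {
        my ($value, $symbol) = @$pair;
        # Take as many of this symbol as still fit.
        while ($number >= $value) {
            $roman  .= $symbol;
            $number -= $value;
        }
    }
    return $roman;
}

print "1994 = ", arabic_to_roman(1994), "\n";   # prints MCMXCIV
```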
For a fifth structured example, click here
to see the text. This program takes a time string and adds
minutes to it. The result is output in the original format. For example,
perltimadd "9:13 AM" , 20
will yield "9:33 AM" as the output. Note that there needs to be a space on both sides of the comma.
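The arithmetic behind such a program can be sketched as follows: convert the time to minutes since midnight, add, wrap around at 24 hours, and convert back. This is not the perltimadd source; the sub name add_minutes is made up, and the "H:MM AM" input format is assumed from the example above.

```perl
use strict;
use warnings;

# Add minutes to a time string such as "9:13 AM" and return the
# result in the same format. Returns nothing on unparseable input.
sub add_minutes {
    my ($time, $minutes) = @_;
    my ($hour, $min, $ampm) = $time =~ /^\s*(\d{1,2}):(\d{2})\s*([AP]M)\s*$/i
        or return;
    # Convert to minutes since midnight (12 AM is hour 0).
    $hour %= 12;
    $hour += 12 if uc($ampm) eq 'PM';
    my $total = ($hour * 60 + $min + $minutes) % (24 * 60);
    # Convert back to a 12-hour clock.
    my $new_hour = int($total / 60);
    my $new_min  = $total % 60;
    my $new_ampm = $new_hour >= 12 ? 'PM' : 'AM';
    $new_hour %= 12;
    $new_hour = 12 if $new_hour == 0;
    return sprintf("%d:%02d %s", $new_hour, $new_min, $new_ampm);
}

print add_minutes("9:13 AM", 20), "\n";   # prints 9:33 AM
```

Working in minutes since midnight keeps the AM/PM and midnight rollover cases in one place.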
Now that you are in a mathematical vein, here is a program using recursion to calculate the Fibonacci series, i.e. [0, 1, 1, 2, 3, 5, 8, 13 ...]. Click here to see the text. Note that I have based it loosely on this web tutorial. I tightened up the code and added lots of comments.
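The recursive idea fits in a few lines. This is a bare-bones sketch and not the linked program; the sub name fib is made up.

```perl
use strict;
use warnings;

# Recursive Fibonacci: fib(0) = 0, fib(1) = 1, and every later
# term is the sum of the two before it.
sub fib {
    my ($n) = @_;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Prints: 0, 1, 1, 2, 3, 5, 8, 13
print join(", ", map { fib($_) } 0 .. 7), "\n";
```

Plain recursion like this recomputes the same terms many times, so it slows down quickly as n grows.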
Here is a non-recursive way to calculate pi from a series. It is based on the well-known Gregory-Leibniz series. I have added a timer so you can go up to a billion iterations if you like. This took 10 minutes on my Linux host. Click here to see it.
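The Gregory-Leibniz series is pi/4 = 1 - 1/3 + 1/5 - 1/7 + ... A simple loop with a timer might look like the sketch below. It is not the linked program; the sub name and the choice of one million terms are made up.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Sum the first $terms terms of 1 - 1/3 + 1/5 - 1/7 + ... and
# multiply by 4 to approximate pi.
sub leibniz_pi {
    my ($terms) = @_;
    my $sum  = 0;
    my $sign = 1;
    for my $k (0 .. $terms - 1) {
        $sum += $sign / (2 * $k + 1);
        $sign = -$sign;
    }
    return 4 * $sum;
}

my $start = time();
my $pi    = leibniz_pi(1_000_000);
printf "pi is roughly %.8f (%.2f seconds)\n", $pi, time() - $start;
```

The series converges slowly: the error is roughly one over the number of terms, which is why a billion iterations are needed for about nine correct decimals.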
Here is a non-recursive way to calculate the beautiful number e, discovered and used by Euler and De Moivre among others. This is an original derivation. Please see: this.
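For comparison only, here is a sketch using the standard textbook series e = 1/0! + 1/1! + 1/2! + 1/3! + ... Note that this is not the original derivation mentioned above, which I am not reproducing here; the sub name is made up.

```perl
use strict;
use warnings;

# Sum the first $terms terms of 1/0! + 1/1! + 1/2! + ...
# Each term is the previous one divided by k, so no factorial
# ever has to be computed directly.
sub euler_e {
    my ($terms) = @_;
    my $sum  = 0;
    my $term = 1;              # 1/0!
    for my $k (1 .. $terms) {
        $sum  += $term;
        $term /= $k;           # becomes 1/k!
    }
    return $sum;
}

printf "e is roughly %.12f\n", euler_e(20);
```

This series converges very fast, so twenty terms already reach the limit of ordinary floating point precision.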