Convert Microsoft to LINUX: Benchmark Testing

Benchmarking is the process of testing software and hardware under stress to see how it will perform in the real world. First I will explain how to test hard drive performance for file I/O. Web servers and databases can also be tested to see how they hold up under load. Note that today many servers and databases run on virtual machines, using platforms such as VMware, which can span several physical devices.

Hadoop is an open source framework from the Apache Software Foundation. It comes from the same organization as the Apache web server but is a separate project. It stores and processes large data sets across clusters of servers instead of on just one machine. It is designed to scale easily from a single machine to thousands of servers. The library detects failures at the application layer and handles them appropriately, so if a few servers in a cluster go down your jobs can still run. Here is an excellent document for installing it.
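
To make the idea of handling failures in software more concrete, here is a toy sketch in Python. It is not Hadoop's API; the node names, the set of failed nodes, and the function run_on_cluster are all made up for illustration. It only shows the principle that work assigned to a dead node gets moved to a healthy one.

```python
# Toy illustration only: this is NOT Hadoop's API. Node names and the
# failure set are hypothetical.
def run_on_cluster(tasks, nodes, failed_nodes):
    """Place each task on a healthy node, skipping nodes known to be down."""
    placement = {}
    for i, task in enumerate(tasks):
        # Start at a different node for each task to spread the load.
        for attempt in range(len(nodes)):
            node = nodes[(i + attempt) % len(nodes)]
            if node in failed_nodes:
                continue  # failure detected in software; try the next node
            placement[task] = node
            break
        else:
            raise RuntimeError(f"no healthy node left for task {task}")
    return placement


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]
    print(run_on_cluster(["a", "b", "c", "d"], nodes, failed_nodes={"node2"}))
```

With node2 marked as failed, every task still ends up on one of the remaining nodes.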

MongoDB is an open source database. Unlike MySQL, which is relational, it is a document-oriented database designed for humongous data sets. It is written in the C++ programming language and manages collections of BSON documents. It is easy to query and index, which allows many applications to store data in a natural way that matches their native data types and structures.
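
As a rough picture of what "collections of documents" means, here is a small sketch in plain Python. It does not use MongoDB or its driver; the functions find and build_index and the sample data are hypothetical stand-ins. Note that the documents in one collection do not all need the same fields.

```python
# Toy illustration only: plain Python, not the MongoDB API.
from collections import defaultdict


def find(collection, query):
    """Return the documents whose fields equal every key/value in query."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in query.items())]


def build_index(collection, field):
    """Map each value of `field` to the documents that carry it."""
    index = defaultdict(list)
    for doc in collection:
        if field in doc:
            index[doc[field]].append(doc)
    return index


# Hypothetical sample collection; each document has its own shape.
servers = [
    {"host": "web1", "role": "web", "disks": ["sda", "sdb"]},
    {"host": "db1", "role": "database", "ram_gb": 8},
    {"host": "web2", "role": "web"},
]

if __name__ == "__main__":
    print(find(servers, {"role": "web"}))
    print(build_index(servers, "role")["database"])
```

The index trades some memory for faster lookups: instead of scanning every document, a query on the indexed field goes straight to the matching list.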

Memcached is an open source, general purpose distributed memory caching system. It was developed by Danga Interactive for LiveJournal, but is now used by many other sites such as YouTube and Facebook. It is used to speed up dynamic database-driven websites by caching data and objects in RAM. This reduces the number of times an external data source (such as a database or API) must be read. Memcached runs on Unix, Linux, Windows and Mac OS X and is distributed under a BSD license. Memcached's APIs provide a giant hash table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order. Applications using Memcached typically check the cache in RAM first. If the information is not cached, the normal call to the database or API is made and the result is added to the cache.
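
The two ideas in that paragraph, LRU eviction and checking the cache before the database, can be sketched in a few lines of Python. This is a single-process toy, not Memcached or its client library; the LRUCache class and the slow_lookup function are invented for the example.

```python
from collections import OrderedDict


class LRUCache:
    """A small in-memory cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


def slow_lookup(key):
    """Hypothetical stand-in for a database or API call."""
    return key.upper()


def cached_lookup(cache, key):
    value = cache.get(key)
    if value is None:  # cache miss: fall back to the slow source
        value = slow_lookup(key)
        cache.set(key, value)
    return value


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    for key in ["a", "b", "a", "c"]:
        cached_lookup(cache, key)
    print(list(cache.items))  # "b" was used least recently, so it is gone
```

In the demo the cache holds two entries. Reading "a" a second time refreshes it, so when "c" arrives it is "b" that gets purged.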

When you are doing your tests, make sure the data you are testing is at least twice the size of your RAM. Otherwise you are mostly testing how fast your RAM and the file cache are, not the disk. Let us say you want to test your hard disk. If your RAM is only 1 GB, you could set up a sample partition of, say, 3 GB and perform tests on that. Below are examples of the most common tests. You want to be root when you run these. For all testing I first created a 3 GB /dev/sda4 partition and formatted it as ext2 using gparted, as described in the commands page.
Or you can use the command
mke2fs /dev/sda4
to create the file system and format it as ext2. (Adding the -j option would also create a journal, which gives you ext3 rather than ext2.)
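
To show what a basic sequential write test looks like, here is a minimal Python sketch. It is not one of the standard benchmark tools; the function names and the 8 MB demo size are my own choices, and ram_bytes relies on sysconf names that are available on Linux. For a real test you would point it at a file on the partition under test and write at least twice your RAM.

```python
import os
import tempfile
import time


def ram_bytes():
    """Total physical RAM in bytes (these sysconf names exist on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def write_throughput(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes to path sequentially and return MB per second."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device, not just the cache
    elapsed = time.perf_counter() - start
    return written / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Small demo size so the sketch finishes quickly. For a real test use
    # 2 * ram_bytes() and a path on the partition you want to measure.
    with tempfile.TemporaryDirectory() as tmp:
        rate = write_throughput(os.path.join(tmp, "testfile"), 8 * 1024 * 1024)
        print(f"sequential write: {rate:.1f} MB/s")
```

The fsync call matters: without it the timer may stop while much of the data is still sitting in RAM, which is exactly the effect the twice-your-RAM rule is meant to avoid.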

To test web servers there are a lot of packages out there, such as SPECweb, created by the Standard Performance Evaluation Corporation (SPEC). SPEC is a non-profit organization that aims to "produce, establish, maintain and endorse a standardized set" of performance benchmarks for computers, as stated on their web site. SPEC was founded in 1988, and its benchmarks are widely used to evaluate computer system performance. They recently released a beta version of their tool called the Server Efficiency Rating Tool (SERT). Please see their website for more information. Finally, here is an interesting paper from Gabriel Kerneis and Juliusz Chroboczek discussing benchmarking web servers. The focus is on threaded versus event-driven servers. I got it from
http://hal.archives-ouvertes.fr/hal-00434374/fr/
and am including it here.
Enjoy!
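
If the threaded versus event-driven distinction is new to you, this small Python sketch may help. It is not taken from the paper and it is not a real web server: each "request" is just a short sleep standing in for network I/O, and the delay, worker count, and function names are arbitrary choices of mine. The threaded version parks one thread per waiting request, while the event-driven version uses a single thread that switches between requests whenever one is waiting.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # simulated I/O wait per request, in seconds (arbitrary)


def handle_blocking(i):
    time.sleep(DELAY)  # the thread blocks while "waiting on the network"
    return i


async def handle_async(i):
    await asyncio.sleep(DELAY)  # the event loop serves other requests meanwhile
    return i


def run_threaded(n, workers=20):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle_blocking, range(n)))
    return results, time.perf_counter() - start


def run_event_driven(n):
    async def main():
        return await asyncio.gather(*(handle_async(i) for i in range(n)))

    start = time.perf_counter()
    results = asyncio.run(main())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for name, fn in (("threaded", run_threaded), ("event-driven", run_event_driven)):
        results, elapsed = fn(100)
        print(f"{name}: {len(results)} requests in {elapsed:.2f}s")
```

The timings from a toy like this say little about real servers, but it gives you a harness to experiment with before moving on to a proper benchmark tool.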