Following one too many posts talking about increasing the speed of PHP scripts by using single quotes instead of double quotes, preincrementing rather than postincrementing variables and the like, I wrote 7 tips for lightning fast PHP sites. This post was supposed to be a spoof celebrating the worst aspects of these types of posts. I suggested that a hundred nanoseconds or so saved was going to make a practical difference. I based my findings on just one run and justified this by looping over a function a million times.
I had thought that, with aliases of the same function being compared seven times over, people would realise what I was doing, but apparently my post was just too close to the sad reality and lacking in sufficient humour for people to catch on.
Here I again present benchmarks for the seven pairs of functions I compared in my last post. The difference is that this time the benchmark I use is my best attempt. If you think you can do better I would like to hear from you.
Test Environment
For this more rigorous test I switched from my local development server to a remote shared hosting account running PHP 5.2.3 on a Linux system.
Benchmark Methodology
As before, the code used to run the test is available for download. Instead of just running each function one million times and timing it, multiple rounds of replication are now used. Each function is run one thousand times and then its partner is run one thousand times. This process is repeated one thousand times during the execution of a single script, which gives the same one million runs per function performed in the previous post. This is considered to be a single test. Tests are run in blocks of ten with two second intervals between each request, and a block of ten is run every five minutes via a cron job. This allows 120 'tests', and 120 million executions of each function, to be run an hour without any supervision.
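The interleaved structure of a single test can be sketched roughly as follows. This is a minimal illustration rather than the downloadable script: the function name `benchmark_pair` and its parameters are hypothetical, and it uses modern PHP syntax rather than what would have run on PHP 5.2.3.

```php
<?php
// Sketch of one "test": $rounds rounds, each timing $callsPerRound calls
// of one function and then the same number of calls of its alias.
// benchmark_pair() is a hypothetical helper, not the original script.
function benchmark_pair(callable $a, callable $b, array $args, int $rounds = 1000, int $callsPerRound = 1000): array
{
    $totalA = 0.0;
    $totalB = 0.0;

    for ($r = 0; $r < $rounds; $r++) {
        // Time a batch of calls to the first function.
        $start = microtime(true);
        for ($i = 0; $i < $callsPerRound; $i++) {
            $a(...$args);
        }
        $totalA += microtime(true) - $start;

        // Immediately time the same number of calls to its partner.
        $start = microtime(true);
        for ($i = 0; $i < $callsPerRound; $i++) {
            $b(...$args);
        }
        $totalB += microtime(true) - $start;
    }

    // Total seconds spent in each function across all rounds.
    return ['a' => $totalA, 'b' => $totalB];
}

// Small run (10 rounds) so the example finishes quickly.
$result = benchmark_pair('count', 'sizeof', [range(1, 100)], 10, 1000);
printf("count: %.6f s, sizeof: %.6f s\n", $result['a'], $result['b']);
```

Alternating the two functions within each round means that any drift in server load during a run affects both members of the pair in roughly the same way.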
After leaving it to run for an hour or so I got to work processing the stats.
I opened up Minitab 14 and started doing some t-tests. A t-test gives the probability of seeing a difference at least as large as the one observed if both sets of data originate from the same underlying population. Generally speaking a probability below 0.05 (5%) is considered indicative of a statistically significant difference.
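For readers without Minitab, the calculation behind a two-sample t-test can be sketched in PHP. This is an illustration under assumptions, not a reproduction of Minitab's output: it uses Welch's form of the test (no equal-variance assumption), and because core PHP has no t-distribution function it falls back on a normal approximation for the p-value, which is only reasonable for large samples. The function names and the timings are made up for the example.

```php
<?php
// Welch's two-sample t-test. Assumes each sample has at least two values
// and a non-zero variance. All function names here are hypothetical.
function sample_variance(array $values, float $mean): float
{
    $sum = 0.0;
    foreach ($values as $v) {
        $sum += ($v - $mean) ** 2;
    }
    return $sum / (count($values) - 1);
}

function welch_t(array $x, array $y): array
{
    $nx = count($x);
    $ny = count($y);
    $mx = array_sum($x) / $nx;
    $my = array_sum($y) / $ny;

    // Squared standard error of each sample mean.
    $sx = sample_variance($x, $mx) / $nx;
    $sy = sample_variance($y, $my) / $ny;

    $t  = ($mx - $my) / sqrt($sx + $sy);
    // Welch-Satterthwaite degrees of freedom.
    $df = ($sx + $sy) ** 2 / ($sx ** 2 / ($nx - 1) + $sy ** 2 / ($ny - 1));

    return ['t' => $t, 'df' => $df];
}

// Two-sided p-value from the normal approximation, using the
// Abramowitz-Stegun 7.1.26 polynomial for the complementary error function.
function approx_two_sided_p(float $t): float
{
    $x = abs($t) / sqrt(2);
    $k = 1 / (1 + 0.3275911 * $x);
    $poly = $k * (0.254829592 + $k * (-0.284496736 + $k * (1.421413741
          + $k * (-1.453152027 + $k * 1.061405429))));
    return $poly * exp(-$x * $x);
}

// Illustrative timings in seconds (invented for this example).
$countTimes  = [0.751, 0.760, 0.748, 0.755, 0.749, 0.758];
$sizeofTimes = [0.753, 0.757, 0.750, 0.752, 0.756, 0.751];

$stats = welch_t($countTimes, $sizeofTimes);
printf("t = %.4f, df = %.1f, approx p = %.3f\n",
    $stats['t'], $stats['df'], approx_two_sided_p($stats['t']));
```

With only a handful of values per sample the normal approximation understates the p-value, so for small samples a proper t-distribution from a statistics package is the safer choice.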
I plugged in the numbers and got nothing even close to 0.05. I did get a 0.946 though which I believe takes the record as being the closest result to 1 I've ever seen. So far everything is pointing towards no detectable differences.
It's possible, though, that the sample size was just too small for the differences to be detected above the noise. Minitab has a useful function, however, which allows us to answer this very question. Using a power calculation it is possible to work out what sample size you need to have a set chance of detecting a set difference if it exists. A power of 0.8 - 0.9 is usually considered sufficient to justify an experiment, but because I want to be sure of these results I'll use 0.99. I'll set the difference I'm looking for at 5% of the mean. This means that there is a 99% chance of detecting a difference of just 5% if it exists.
The sample sizes are listed below:
| Test | Required sample size | Mean | Difference (5% of mean) | Power | SD |
| --- | --- | --- | --- | --- | --- |
| 1 | 3167 | 0.7528 | 0.0376 | 0.99 | 0.3495 |
| 2 | 3467 | 0.6132 | 0.0307 | 0.99 | 0.2977 |
| 3 | 2446 | 1.1113 | 0.0556 | 0.99 | 0.4532 |
| 4 | 1511 | 1.2591 | 0.0630 | 0.99 | 0.4035 |
| 5 | 2004 | 17.5610 | 0.8781 | 0.99 | 6.4826 |
| 6 | 1280 | 4.2624 | 0.2131 | 0.99 | 1.2572 |
| 7 | 353 | 19.5333 | 0.9767 | 0.99 | 3.0226 |
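As a rough cross-check on the table, the usual normal-approximation formula for a two-sample comparison can be sketched in PHP. This is not Minitab's exact method, which works from the t distribution, so the results can differ by a few samples. The function name is hypothetical, and the two z values are standard normal quantiles for a two-sided 5% significance level and a power of 0.99, hard-coded here as an assumption about the settings used.

```php
<?php
// Normal-approximation sample size for detecting a difference between two
// means: n = 2 * (z_alpha + z_power)^2 * sd^2 / difference^2.
// required_sample_size() is a hypothetical helper, not a Minitab function.
function required_sample_size(
    float $sd,
    float $difference,
    float $zAlpha = 1.959964,  // two-sided significance level of 0.05 (assumed)
    float $zPower = 2.326348   // power of 0.99
): int {
    $n = 2 * (($zAlpha + $zPower) ** 2) * ($sd ** 2) / ($difference ** 2);
    return (int) ceil($n);
}

// SD and difference taken from the row for test 7 in the table.
echo required_sample_size(3.0226, 0.9767), "\n";
```

For test 7 this lands within a couple of samples of the 353 reported in the table, which suggests the table values follow the same basic relationship: the required sample size grows with the square of the ratio of SD to the difference being looked for.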
The Results
After about a day of letting things stew the results are in. The probability from the t-test for each pair was:
Test 1 - sizeof vs count: 0.209
Test 2 - is_int vs is_integer: 0.539
Test 3 - chop vs rtrim: 0.857
Test 4 - doubleval vs floatval: 0.553
Test 5 - fwrite vs fputs: 0.735
Test 6 - ini_alter vs ini_set: 0.259
Test 7 - implode vs join: 0.454
Nothing is close to 0.05.
What does this mean?
Overall I conclude that there is no statistically significant difference between the aliases of a function in the tests that have been run. Although this was the expected result, I hope that the analysis presented is sufficiently rigorous to discount the possibility of personal bias in the benchmark.
I've spoken a lot about statistical significance and essentially ignored what I would call practical significance. I've done this because my focus here was demonstrating what I view as a robust benchmark. If you really want to find that 1-5% difference in a single function this is what you should be doing.
If you want to find that 1+% difference in the speed of your entire script this type of analysis is irrelevant. Even on a good day, using count rather than sizeof, or even fputs instead of fwrite, will make less than 1% difference in virtually all projects.
If you want to save that 1+% you'll need to analyse your code, figure out which parts are slow and then tackle those aspects. Whether that analysis is as simple as echo-ing microtime at points throughout your script or using a profiler such as Xdebug is up to you.
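The simple end of that spectrum might look something like the sketch below. The `checkpoint` function is a hypothetical helper written for this example, and the `usleep` calls are stand-ins for whatever real work your script does between checkpoints.

```php
<?php
// Print the time elapsed since the previous checkpoint and return it.
// checkpoint() is a hypothetical helper, not part of PHP.
function checkpoint(string $label): float
{
    static $last = null;

    $now = microtime(true);
    $elapsed = ($last === null) ? 0.0 : $now - $last;
    $last = $now;

    printf("%-20s %.6f s\n", $label, $elapsed);
    return $elapsed;
}

checkpoint('start');
usleep(10000);              // stand-in for a database query
checkpoint('after query');
usleep(2000);               // stand-in for rendering a template
checkpoint('after render');
```

Whichever section shows the largest elapsed time is the place to start looking, and a profiler such as Xdebug gives the same kind of breakdown per function without editing the script.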