Saturday, July 21, 2007
Posted by Jonathan Street
in AJAX, PHP Programming, Programming, Web Tools at 22:58
Xenu : Stats aggregation for any site
I've previously mentioned popuri.us as one of the better examples of a website stats aggregator, but I think my loyalty is switching to Xenu.
TechCrunch first flagged it about a week ago. At the time the site was struggling under the load of being highlighted on TechCrunch and elsewhere. A few days later SEOmoz brought to my attention that the load had become so heavy that the creator felt unable to cope and released the source code to the community. There are now 13 mirrors you can use, including Italian, German, Bulgarian, French and Dutch versions.
Source Code
The backend is all PHP, while the frontend relies heavily on JavaScript. The source code is an interesting, though far from easy, read. Some of the functionality is, in my opinion, needlessly excessive; I really don't see the point of being able to drag the stats boxes around the screen, for instance. The source code would also benefit from descriptive filenames, principally in the results folder, where stats are returned by 46 files cunningly named 1-46.
Despite this, there are some real insights to be had for anyone interested in scraping these sorts of stats. For instance, it uses a stunningly simple method to grab the Alexa rank which I hadn't come across before. It doesn't involve paying to access the API, and you don't need to wrestle a CSS file into submission to extract the rank from the Alexa site.
The Spoiler
The Alexa data is returned by file number 7. In case you're not overly thrilled by the notion of opening all 46 files to find the one that accesses the service you're interested in, there is a useful key in the following file: js/general_without_encryption.js
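If you would rather keep your own notes than reopen that key each time, a small lookup table does the job. The snippet below is only a sketch in Python and is not taken from the Xenu source: the names RESULT_FILES and describe_result_file are made up for illustration, and the only entry that comes from this post is file 7 for Alexa. The rest would have to be filled in by hand from the key file.

```python
# Hypothetical lookup table for the numbered files in the results folder.
# Only the entry for file 7 comes from the post; add the others yourself
# after reading the key in js/general_without_encryption.js.
RESULT_FILES = {
    7: "alexa",
}

TOTAL_FILES = 46  # the results folder holds files named 1-46


def describe_result_file(number: int) -> str:
    """Return a readable label such as '7: alexa' for a numbered results file."""
    if not 1 <= number <= TOTAL_FILES:
        raise ValueError(f"expected a file number between 1 and {TOTAL_FILES}")
    # Files that have not been identified yet are reported as 'unknown'.
    service = RESULT_FILES.get(number, "unknown")
    return f"{number}: {service}"


if __name__ == "__main__":
    print(describe_result_file(7))
```

Anything you haven't mapped yet comes back as "unknown", so the table can be built up one file at a time.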
