Google cruft

Am I the only one in the world wondering why the hell Google can’t merge Analytics, Webmaster Tools, AdWords and maybe even AdSense into a single, centralized control panel?
That way they could also improve how the data is displayed (the usage figures reported by Analytics, Webmaster Tools and AdWords are all different, for good reasons, I know, yet comparing them across three different websites is a pain in the butt!), greatly simplifying users’ lives in the process… Sure, we don’t pay for the service (at least not in cold hard cash, though we pay in countless other ways), but still…

Page suggestions via user tracking

For a site I’ve been working on, I’m developing a PHP module that displays page suggestions based not on the page content but on which pages past visitors requested.

The rationale is, intuitively, that the pages visited by the majority of past visitors are probably the pages that most future visitors will be interested in. It may sound more complicated than it really is: when a user requests a certain page, the module extracts from the server log which other pages were visited most often by past visitors of that page.
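To make the idea concrete, here is a minimal sketch in Python (the actual module is PHP and is not shown here). It assumes the server log has already been parsed into `(visitor, page)` pairs; the function name `suggest_pages`, the `limit` parameter and the sample data are all made up for illustration.

```python
from collections import Counter, defaultdict


def suggest_pages(visits, current_page, limit=3):
    """Rank other pages by how many past visitors of current_page also requested them.

    visits is an iterable of (visitor, page) pairs, assumed to come
    from an already-parsed server log.
    """
    # Collect the set of pages each visitor requested.
    pages_by_visitor = defaultdict(set)
    for visitor, page in visits:
        pages_by_visitor[visitor].add(page)

    # Count the other pages seen by visitors who also saw current_page.
    counts = Counter()
    for pages in pages_by_visitor.values():
        if current_page in pages:
            counts.update(pages - {current_page})

    # Highest count first; ties are broken by page name to keep the order stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:limit]]


# Hypothetical sample data: four visitors and the pages they requested.
sample_visits = [
    ("a", "/home"), ("a", "/about"), ("a", "/contact"),
    ("b", "/home"), ("b", "/about"),
    ("c", "/about"), ("c", "/blog"),
    ("d", "/home"), ("d", "/blog"), ("d", "/about"),
]

print(suggest_pages(sample_visits, "/home"))
```

Each visitor counts at most once per page here, so a single visitor reloading a page many times does not inflate its score; whether that is the right choice depends on the site.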

Starting from this simple idea, I’ve been adding a bunch of refinements that try to improve the quality of the suggestions, such as filtering out pages already linked from the current page. As a bonus, I implemented a (rather basic) visualization of the scores computed by the module. This can be quite a handy way to spot immediately whether some pages are not performing well.

[Image: Visualization of the pairwise "relatedness"]

This is an early screenshot of the visualization (the column on the left contains the names of all the pages on the site and the column on the right contains the corresponding number of page views). It already highlights some problems, namely that the two most visited pages are poorly linked to the rest of the site (their rows are almost entirely red).
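The "already linked" refinement can be sketched as a small filtering step. This is a Python illustration under assumptions, not the module's code: the `links` mapping (page to the set of pages it links to) is hypothetical and would have to be built by some other means, for instance by crawling the site.

```python
def filter_linked(suggestions, current_page, links):
    """Drop suggestions that current_page already links to.

    links is a hypothetical mapping from a page to the set of pages it
    links to; pages missing from the mapping are treated as having no links.
    """
    already_linked = links.get(current_page, set())
    return [page for page in suggestions if page not in already_linked]


# Hypothetical example: /home already links to /about.
example_links = {"/home": {"/about"}}
print(filter_linked(["/about", "/blog", "/contact"], "/home", example_links))
```

The filter keeps the original ranking order, so it can be applied to whatever scoring produced the suggestions.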

Obviously this approach is far from perfect, but I think it is an interesting concept nonetheless. I already have some ideas for improving the method further, for example by taking into account not only the page the user is currently viewing but all the pages the user has visited so far. Also, since the website has (for now) very low traffic, scalability is not (yet) a problem, but for this to be really useful it should be made as scalable as possible.
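One possible way to use the whole session, sketched in Python purely as an illustration: each past visitor votes for the pages the current user has not seen yet, weighted by how many pages that visitor shares with the current session. The weighting scheme, the function name `suggest_for_session` and the sample data are assumptions, not something the module does today.

```python
from collections import Counter, defaultdict


def suggest_for_session(visits, session_pages, limit=3):
    """Rank unseen pages using every page the current user has visited so far.

    A past visitor's vote is weighted by the number of pages they share
    with the current session (an assumed, deliberately simple weighting).
    """
    pages_by_visitor = defaultdict(set)
    for visitor, page in visits:
        pages_by_visitor[visitor].add(page)

    seen = set(session_pages)
    scores = Counter()
    for pages in pages_by_visitor.values():
        overlap = len(pages & seen)
        if overlap:
            # Only pages the current user has not visited yet can be suggested.
            for page in pages - seen:
                scores[page] += overlap

    # Highest score first; ties are broken by page name.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:limit]]


# Hypothetical sample data: four past visitors and the pages they requested.
session_sample = [
    ("a", "/home"), ("a", "/about"), ("a", "/contact"),
    ("b", "/home"), ("b", "/about"),
    ("c", "/about"), ("c", "/blog"),
    ("d", "/home"), ("d", "/blog"), ("d", "/about"),
]

print(suggest_for_session(session_sample, ["/home", "/blog"]))
```

This version rescans every past visitor on each request, which is fine at very low traffic; on a busier site the pairwise counts would more likely be precomputed and cached.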