Part II: Google Analytics V2 is AWESOME, but still falls short for my complex needs…
I began my career in “information retrieval” back in the 1990s, when there was no Google, at a little startup out of a think tank. You can still use the technology that I was very proud to work on. It is called InQuery, and it is still installed on the Library of Congress web site and used by Clinton Bush to search the Congressional Record (if he does that… maybe he has people to do that, is all I’m saying).
We didn’t call it Search then. When I spoke about what I did, people thought I worked for that evil organization in that De Niro movie Brazil (called “information retrieval” for those who haven’t seen it). Since I lived in a place called the “Happy Valley,” I understood why people thought that. Then I would explain how it all started with Ranganathan, moving those who cared enough to listen all the way through to tf-idf.
The day the engineering team told me to check out what some guys at Stanford were doing with IR was very memorable. “Stop using Northern Light, Google is more precise,” they said. I dig minimalist art, so the interface instantly impressed me (Krug must be proud). From that day forward, I’ve been hooked on its recall and precision. Honestly, I started using Google about three weeks after it came out on the Internet. I still use it (apparently mostly on Tuesdays in March around 11pm this year).
Google has revolutionized 21st century business. I have only respect and admiration for their accomplishments, organizational brainpower, and ethics. I’ve reconciled the Great Firewall of China stuff. And for the record, you can get through the Great Firewall using RSS (for now). You can even still use Google.com, I hear (though I have never been to China to test that out…).
I’m a very happy investor as well. In retrospect, I should’ve been less risk averse: sold everything I owned, maxed out my credit cards, and eaten rice and beans to buy more shares. But I digress…
Right now, however, I wouldn’t use Google Analytics V2 (yet) for an advanced web analytics implementation that required deep integration with a data warehouse and business intelligence software across 100 sites with an amazing amount of cardinality and heterogeneous design patterns. Here’s why:
-
No control over processing. Some sites only need a daily batch; other sites are built around the notion of real-time. As media companies begin to realize they need to bake in automatic content targeting, driven by analytics data, to do real-time site optimization based on personas, can Google Analytics stand up to the challenge right now? Could performance-based models based on reader intent survive with GA? Imagine if Digg were updated once a day?!!
-
No scalable method for rolling-up. As anyone who has ever had to manage more than one site and had a boss knows, aggregate reporting is a must-have, and I don’t have it with GA. I realize I can do it if I don’t care who has access to which domains. I’d give the same tracking code to all sites and use filters to separate them, leaving one profile that isn’t separated out. The challenge here is first-party cookies: as people jump domains, a new visit starts. So I could then treat the various domains as third-party shopping carts and have them carry cookies: http://www.google.com/support/analytics/bin/answer.py?answer=26915&topic=7282
So then I might have duplicate naming issues, like on the home page, right? So I can use urchinTracker() to fix that. Then if I didn’t want site owners to be able to administer each other’s sites, I’d have to create different accounts for each site, and then put a second set of GA code on all the sites. With that method, I now have two code bases to maintain. Did I ever mention that for a time I ran SCM for a publicly traded $100M software company (using SourceSafe and then StarTeam, when Starbase owned it! Heh)? I learned to really like having a single baseline.
-
No ability to extend the data model. I am somewhat well-versed with urchinTracker(), but that only goes so far. How can I define Events, Interactions, Contributions, and Personae as part of my model? At a more basic level, can I define or create custom metrics? Better yet, if I have an overwhelming amount of structured data with custom dimensions that must be integrated with the web analytics data mart, can GA do it?
-
No published, public schema. How do I extend or integrate my data model if I don’t know Google’s?
-
Limited features for decoding and looking-up. You can only access/manipulate the data that is currently in GA. I hear lookup tables do exist but they are not publicly available. They were in the old Urchin on Demand, right? I realize you can do some of this stuff with Custom Advanced Filters, but can a small team easily maintain those methods across many sites?
URI stems are just fine to examine when you “get it,” but to other folks they are just G(r)eek ruminations. Does an editor want to see ../article/45ty893kld323jdq2.html or the editorially-endowed title “Web Analytics Geek”? Say I have a cookie in the format 112456787.4958423452.210342943: what could I look up that might help me understand that visitor better? Perhaps an email address? I realize that with urchinTracker() this can be done to some degree, but I am concerned about such hard-coded methods scaling and remaining maintainable across millions of pages. And if I use the urchinTracker() function to replace stems with page titles, then I can’t use the Overlay.
-
No alerting. It’s very useful to let the right people know when some metric goes up (or down) based on a pre-defined threshold.
-
No bot detection. SEOers tell me “if we can’t identify how bots crawl the site and where they exit, then we can’t do our jobs to the best of our ability.”
-
No public API to key interfaces using open standards. I dig standards. My old, amazingly cool, brilliant, ethical boss, a guy named Nate T, who is now an SVP at FAST, sent me to Amsterdam in 2000 for the 9th W3C convention at the RAI. At that time XML was just coming into vogue. I quickly realized, like 7 years ago, that without XML-based APIs, I would limit my ability to reach what was then called the Semantic Web. We still aren’t even close to being semantic in web analytics (and honestly I am still working on what that really means in my head).
If there were an API, you could pull the data out of GA and then manipulate it in some other application. I realize you can shred the URL and easily decode the URL structure to do some custom coding to extract data, but what happens to my implementation if/when the URL structure changes? Other vendors do have APIs.
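To make the decoding and looking-up point above concrete, here is a minimal sketch in Python of what I mean by a lookup table applied after collection rather than hard coded into the tracking call. Everything in it is an assumption for illustration: the `TITLE_LOOKUP` table, the function names, and the idea that report rows arrive as (URI stem, page views) pairs from some export. None of this is a GA feature or a GA API.

```python
from collections import Counter

# Hypothetical lookup table: URI stem -> editorial title.
# In practice this might come from a CMS export; the single entry here
# reuses the example stem and title from the text above.
TITLE_LOOKUP = {
    "/article/45ty893kld323jdq2.html": "Web Analytics Geek",
}


def decode_stem(stem, lookup=TITLE_LOOKUP):
    """Return the editorial title for a stem, or the stem itself if unknown."""
    return lookup.get(stem, stem)


def pageviews_by_title(rows, lookup=TITLE_LOOKUP):
    """Sum page views per decoded title.

    `rows` is an iterable of (uri_stem, pageviews) pairs, assumed to have
    been parsed from an exported report. The raw stems are never rewritten
    at collection time; decoding happens only in this reporting step.
    """
    totals = Counter()
    for stem, views in rows:
        totals[decode_stem(stem, lookup)] += views
    return dict(totals)


# Made-up rows standing in for an exported report.
exported_rows = [
    ("/article/45ty893kld323jdq2.html", 120),
    ("/article/45ty893kld323jdq2.html", 30),
    ("/about.html", 7),
]
report = pageviews_by_title(exported_rows)
```

The design choice is the whole point: because the stems stay untouched in the collected data, the editor gets titles in the report while anything that depends on the real URLs keeps working, and there is one table to maintain instead of millions of pages.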
I have helped deploy Google Analytics on a number of non-profit sites and friends’ sites that don’t have enterprise-level needs, and they love it. In fact, I totally dig it too. I would even say I “wicked” dig it for all the New Englanders out there.
I have only the utmost respect for the team and how they’ve literally helped to push forward the industry and raise the bar of web analytics technology, but I only use it within context and based on client needs. The issue I’m getting at here is that I can deploy Google Analytics in an enterprise, but would I want to right now, especially if I have complex data integration needs or need to maintain one technology across many sites?
One of the areas of GA I *really* like is the services model. By delegating services and support work to Google Analytics Partners, like Robbin Steif’s team at LunaMetrics and Justin Cutroni’s team at EpikOne, you get some of the best minds and most dedicated professionals working in web analytics helping you implement, solve problems, and extend your web analytics implementation. These folks know their stuff and are consummate, ethical professionals. If you are so inclined, you can even work with Avinash Kaushik (ever heard of him?).
To align it all with your strategic needs, you can even call my good friend Eric Peterson for some strategic consulting.
While other vendors have a preferred partner ecosystem, I commend Google for helping out entrepreneurs and bringing together the best in the business to help you. High fives!
Imho, the best thing Google could do next is raise the bar even higher and think “integration, integration, integration” in V3. V2 is an amazing aesthetic improvement by a very smart, capable team (hello Jeff G!), but the technology is still not a total match for complex, enterprise needs. But it could be. One of my geek fantasies is sitting down with the great GOOG and helping them figure that out.
My eye is on how GOOG extends the technology behind the tool beyond SMB in the future. I’m also keeping an eye on that super-duper, amazingly cool Graphing feature (shown below)… WOWZERS!













