Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Director at a large multichannel media company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.


Archive for May, 2007

Part II: Google Analytics V2 is AWESOME, but still falls short for my complex needs…

I began my career in “information retrieval” back in the 1990s, when there was no Google, at a little startup out of a think tank. You can still use the technology that I was very proud to work on. It is called InQuery and is still installed on the Library of Congress’ web site and used by Clinton Bush to search the congressional record (if he does that… maybe he has people to do that is all I’m saying ;) ).

We didn’t call it Search then. When I spoke about what I did, people thought I worked for that evil organization in that De Niro movie Brazil (called “information retrieval” for those who haven’t seen it). Since I lived in a place called the “Happy Valley,” I understood why people thought that. Then I would explain how it all started with Ranganathan, moving those who cared enough to listen all the way through to tf-idf.

The day the engineering team told me to check out what some guys at Stanford were doing with IR was very memorable. “Stop using Northern Light, Google is more precise,” they said. I dig minimalist Art, so the interface instantly impressed me (Krug must be proud). From that day forward, I’ve been hooked on its recall and precision. Honestly, I started using Google about three weeks after it came out on the Internet. I still use it (apparently mostly on Tuesdays in March around 11pm this year).

Google has revolutionized 21st century business.   I have only respect and admiration for their accomplishments, organizational brainpower, and ethics.  I’ve reconciled the Great Firewall of China stuff.  And for the record, you can get through the Great Firewall using RSS (for now) . You can even still use Google.com I hear (though I have never been to China to test that out…). 

I’m a very happy investor as well. In retrospect, I should’ve been less risk averse and sold everything I owned, maxed my credit cards, and eaten rice and beans to buy more shares. But I digress…

Right now, however, I wouldn’t use Google Analytics V2 (yet :) ) for an advanced web analytics implementation that required deep integration with a data warehouse and business intelligence software across 100 sites with an amazing amount of cardinality and heterogeneous design patterns. Here’s why:

  • No control over processing. Some sites only need daily batch processing; other sites are driven by the need for real-time data. As media companies begin to realize they need to bake in automatic content targeting based on analytics data to do real-time site optimization based on personas, can Google Analytics stand up right now to the challenge? Could performance-based models based on reader intent survive with GA? Imagine if Digg were updated once a day?!!

  • No scalable method for rolling-up. As anyone who has ever had to manage more than one site and had a boss knows, aggregate reporting is a must-have, and I don’t have it with GA. I realize I can do it if I don’t care who has access to which domains. I’d give the same tracking code to all sites and use filters to separate them, leaving one profile that isn’t separated out. The challenge here is first-party cookies: as people jump domains, a new visit starts. So I could then treat the various domains as third-party shopping carts and have them carry cookies: http://www.google.com/support/analytics/bin/answer.py?answer=26915&topic=7282
    So then I might have duplicate naming issues, like on the home page, right? So I can use urchintracker() to fix that. Then if I didn’t want site owners to be able to administer each other’s sites, I’d have to create different accounts for each site, and then put a second set of GA code on all the sites. With that method, I now have two code bases to maintain. Did I ever mention that for a time I ran SCM for a publicly traded $100M software company (using SourceSafe and then StarTeam, when Starbase owned it)? Heh. I learned to really like having a single baseline.

  • No ability to extend the data model. I am somewhat well-versed with urchintracker(), but that only goes so far. How can I define Events, Interactions, Contributions, and Personae as part of my model? At a more basic-level, can I define or create custom metrics? Better yet, if I have an overwhelming amount of structured data with custom dimensions that must be integrated with the web analytics data mart, can GA do it?

  • No published, public schema.  How do I extend or integrate my data model if I don’t know Google’s?

  • Limited features for decoding and looking-up. You can only access/manipulate the data that is currently in GA.  I hear lookup tables do exist but they are not publicly available.  They were in the old Urchin on Demand, right?  I realize you can do some of this stuff with Custom Advanced Filters, but can a small team easily maintain those methods across many sites?
    URI stems are just fine to examine when you “get it,” but to other folks they are just G(r)eek ruminations. Does an editor want to see ../article/45ty893kld323jdq2.html or the editorially-endowed title “Web Analytics Geek?” Say I have a cookie in the format 112456787.4958423452.210342943: what could I look up that might help me understand that visitor better? Perhaps an email address? I realize that with urchintracker() this can be done to some degree, but I am concerned about such hard-coded methods scaling and remaining maintainable across millions of pages. If I use the urchintracker() function to replace stems with page titles, then I can’t use the Overlay.

  • No alerting. It’s very useful to let the right people know when some metric goes up (or down) based on a pre-defined threshold.

  • No bot detection. SEOers tell me “if we can’t identify how bots crawl the site and where they exit, then we can’t do our jobs to the best of our ability.”

  • No public API to key interfaces using open standards. I dig standards. My old, amazingly cool, brilliant, ethical boss, a guy named Nate T, who is now an SVP at FAST, sent me to Amsterdam in 2000 for the 9th w3c convention at the RAI. At that time XML was just coming into vogue. I quickly realized, like 7 years ago, that without XML-based APIs, I would limit my ability to reach what was then called the Semantic Web. We still aren’t even close to being semantic in web analytics (and honestly I am still working on what that really means in my head).
    If there were an API, you could pull the data out of GA and then manipulate it in some other application. I realize you can shred the URL and easily decode the URL structure to do some custom coding to extract data, but what happens to my implementation if/when the URL structure changes? Other vendors do have APIs.
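To make the decoding and looking-up point concrete, here is a minimal sketch in Python of what a lookup table does once the data is outside the tool. Everything in it is an assumption for illustration: the `TITLE_LOOKUP` table, the `decode_stem` and `decode_report` helpers, and the sample rows are hypothetical and are not part of Google Analytics.

```python
# Hypothetical lookup table mapping URI stems to editorial titles.
# In practice this might be exported from a CMS; here it is hard-coded.
TITLE_LOOKUP = {
    "/article/45ty893kld323jdq2.html": "Web Analytics Geek",
}


def decode_stem(stem, lookup=TITLE_LOOKUP):
    """Return the editorial title for a URI stem, or the stem itself if unknown."""
    return lookup.get(stem, stem)


def decode_report(rows, lookup=TITLE_LOOKUP):
    """Add a 'title' field to each (stem, page_views) row of a page report."""
    return [
        {"stem": stem, "title": decode_stem(stem, lookup), "page_views": views}
        for stem, views in rows
    ]


if __name__ == "__main__":
    report = [("/article/45ty893kld323jdq2.html", 1200), ("/about.html", 30)]
    for row in decode_report(report):
        print(row["title"], row["page_views"])
```

Keeping the original stem next to the decoded title is a deliberate choice in this sketch: the editor reads the title, while anything that still depends on the raw URI can keep using the stem.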

I have helped deploy Google Analytics on a number of non-profit sites and friends’ sites that don’t have enterprise-level needs, and they love it. In fact, I totally dig it too. I would even say I “wicked” dig it for all the New Englanders out there.

I have only the utmost respect for the team and how they’ve literally helped to push forward the industry and raise the bar of web analytics technology, but I only use it within context and based on client needs. The issue I’m getting at here is that I can deploy Google Analytics in an enterprise, but would I want to right now, especially if I have complex data integration needs or need to maintain one technology across many sites?

One of the areas of GA I *really* like is the services model.  By delegating services and support work to Google Analytics Partners, like Robbin Steif’s team at LunaMetrics and Justin Cutroni’s team at EpikOne, you get some of the best minds and most dedicated professionals working in web analytics helping you implement, solve problems, and extend your web analytics implementation.   These folks know their stuff and are consummate, ethical professionals.  If you are so inclined, you can even work with Avinash Kaushik (ever heard of him? :) ). 

To align it all with your strategic needs, you can even call my good friend Eric Peterson for some strategic consulting. 

While other vendors have a preferred partner ecosystem, I commend Google for helping out entrepreneurs and bringing together the best in the business to help you. High fives!

Imho, the best thing Google could do next is raise the bar even higher and think “integration, integration, integration” in V3. V2 is an amazing aesthetic improvement by a very smart, capable team (hello Jeff G!), but the technology is still not a total match for complex, enterprise needs. But it could be. One of my geek fantasies is sitting down with the great GOOG and helping them figure that out.

My eye is on how GOOG extends the technology behind the tool in the future beyond SMB. I’m also keeping an eye on that super-duper, amazingly cool Graphing feature (shown below)… WOWZERS!

google_analytics_v2_context_from_time_renamed1.jpg

Hot Tamale! Let’s Highlight the Web Analytics Association’s 2007 Plan

On Sunday night May 6, the Web Analytics Association had their annual meeting.  Attended by about 100 hardcore members who trekked from all over the Earth to participate, we went over the business at hand.

Some of the highlights for me:

  • Positive cash flow.  The WAA’s budget is looking strong.
  • Solid executive management.  I spent some time with Bryan Induni and was impressed with his plans for the organization.  I also learned that in Idaho, if you like to hike in the woods (I do), it’s a good idea to bring a gun (to prevent cougar bites).
  • Strong strategic leadership. Directors Emeritus, best-selling author Bryan Eisenberg, and savvy WebTrends CEO, Greg Drew, were presented with honors for their significant contributions to the industry. Jim Sterne presented them with “talking sticks” and a plaque commemorating their achievements. In Native American culture, tribe leaders presented these foot-long intricately carved totems to individuals who wanted to address the tribe. Since Jim didn’t take them back, and I saw the guys leave with them, that means they’re sticking around, if just a bit more “personamously!”

Jim Novo, emperor of all things Direct Marketing and beyond, and Raquel Collins, analytics educator and master planner, have some forward-thinking plans for the UBC WA course.  We’re going to see partnerships with other institutions of higher learning and more significant credentialing of graduates.  

  • Membership survey. Check it out. Audience questions showed we’re “wicked” geeks–New England slang for “awesome.” My favorite: “Have we looked at how satisfaction levels across length of membership.” Hear that WAA? Let’s cross dimensions! Heh.

So the future is bright for the WAA. If you aren’t a member, you are certainly missing out on what Greg Drew said is, and I agree, “the most exciting time in your career!”

So join!  Membership has its rewards… and discounts… and hot tamales!

And, for regular readers, I’ll be back next week, once I return from VACAY, with an update all about EMetrics intertwined with some uniqueness of experience from my time out West. Today, I’m off to Muir Woods.

The DOMAINS Report, Ugh.

We’ve all seen the Domains report, sometimes known as the “Most Frequent Organizations” report.   It looks like this:

domains_report.jpg 

Notice that it’s pretty much all ISPs. This fact is due to the way companies have structured their internet access or outsourced the infrastructure.

This data must be used very carefully by marketing or sales experts who, at first glance, may consider this data cold-call material. Companies must use caution! And they must employ savvy analysts (or consultants) who use extreme care and have deep domain expertise (no pun intended) to interpret it wisely and correctly. It wasn’t until row 72 that I found something kind of interesting and useful about the way Clint Ivy’s old company visits a site.

Eric Peterson Announces Web Analytics Demystified INCORPORATED!

My friend and a person whom I admire greatly has achieved the highest echelon of Maslow:

  • Self-actualization - The fulfillment of the self through our efforts in developing our potential, the essence we are born with, and the acceptance of our limitations. Our life purpose unfolded, integrated into the self and lived.

Eric made the announcement at EMetrics after presenting very compelling research proving that a process-centric approach to web analytics yields:

  • Quicker and clearer return on investment
  • More satisfaction
  • Increased value generation
  • Higher salaries

Everyone in the industry is proud of Eric, for his graciousness, gentle nature, brilliant thought, and visionary leadership in helping drive the industry forward. 

The cosmos is the limit!  Go, go, go Eric!

 

maslow.jpg

 

No! Sleep! till EMetrics! EMetrics!

I am San Francisco bound for an excellent adventure. 

san-francisco-real-estate-appraiser.jpg

I’m going to California for four days of fun, sun, and trolleys to attend EMetrics.  I’m particularly excited for this gig because the lineup is just phenomenal.  And because I’m honored to speak with my friend Ian Houston.  I’m looking forward to seeing *you* on Wednesday at 11am for a discussion on Measuring Web 2.0: From Page Views to Events.

I’m also eagerly anticipating sponging up the vast knowledge I know will be presented in the following sessions (and every session):

If you’re going to EMetrics, you’re fortunate.  Not just because of the cerebral power you will have attained at the denouement, and all the good folks you’ll meet and connect with who dig what you dig: web metrics.  Mother Nature, you see, seems to dig her analytics too.  She’s bestowed upon us the gift of clear blue skies, sun, sea air, and little breeze all week.  For a Bostonian just dethawing from winter hibernation, it will sure be mighty fine to open my sails in the California blue, green, and gold outside San Francisco’s most historic hotel: The Palace.  It was the first hotel in the city to install “rising rooms.” We now call them elevators, with names like Otis and Schmacher.

So tomorrow I will be on the righteous Left Coast as a little flash in the history of that grand establishment, pushing the buttons in those rising rooms like thousands of souls in the distant past have before me. Then I’ll wander (hello Dave) on my way to find and take in what apparently is “one of the world’s most beautiful public spaces,” like the inexplicable splendor of white and gold in the Grand Platz.

See you there! 

jetplane.gif

Non-Transactional Web Analytics: A Methodology I presented at a cool ClickZ Conference

New York City is always amazing, even if it’s full of Yankees fans (go Red Sox!). Pitching always beats hitting, btw. And I’m looking forward to my Green Monster seats versus the Tigers the Monday after EMetrics.

NYC Gotham is just so full of diversity, life, and energy, I just can’t help staying out really late and soaking it all in. Where else can you see a public installation of an Alexander Calder mobile and then walk down the street to check out a Hans Hofmann mural? Did you know the New York Public Library has the original manuscript for T.S. Eliot’s The Waste Land? And they let you look at it (if you are persistent and just keep asking for years). The original title, before Ezra Pound crafted it, was “He Do the Police in Different Voices.” Ahhh, that helps understanding, ay?

So I was in and out of the world’s Great Metropolis for a whirlwind evening and day, thanks to an invitation to speak with the smart folks at ClickZ and Incisive Media.

Rebecca Lieb, EIC at ClickZ, impaneled me with the master of monetization, Jason Burby, and the savvy applier of insights, Neil Mason, to present on “Non-Transactional Analytics.” Y’know the kind when you don’t really sell stuff, and instead produce content and/or sites that are informational and navigational.

Neil riffed on context and the strategic approach to non-transactional analytics. Jason riffed on what I’ll call “monetization modeling.” Both had some brilliant thoughts and imparted actionable knowledge to an audience hungry for knowledge about metrics. I’m certainly looking forward to Jason’s forthcoming book, and reading Neil’s next column.

Me? Well, I was nicely sandwiched in the middle, and presented a tactical, seven-phase methodology for “non-transactional analytics.” The process has measurement endemic throughout and is iterative and recursive (self-referential). Here it is:

  • Phase 1: Identify
  • Phase 2: Discover
  • Phase 3: Understand
  • Phase 4: Segment
  • Phase 5: Create/Optimize
  • Phase 6: Test
  • Phase 7: Validate

In Phase 1, the analyst identifies:

  • Internet ecosystem
  • Tools
  • Processes impacted
  • Goals:
    • As I like to say “a metric/KPI has meaning in position and relation to a goal (the signified).”
  • Revenue and contribution
  • Impact on value chain

identify.jpg

In Phase 2: Discover, the analyst discovers:

discover3.jpg

In Phase 3: Understand, the analyst reconciles:

  • Origin of the traffic
  • Sequences of events that generate value
  • On-site success events
  • Visitor population and how it responds to:
    • Recency – how recent is the content for the site’s audience?
    • Frequency – how frequent are visitors visiting?
    • Monetary – what’s the monetization impact? Jason Burby will know.
    • Engagement – I refer to my esteemed friend, Eric Peterson.
    • Attention – Stephane Hamel has some excellent thoughts on Davenport’s Attention economy.
    • Currency – How current is the content to your audience?
    • Relevancy – How relevant is the content to your audience?

rfm.jpg 
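For the geeks, here is a rough sketch of how the frequency and recency parts of that list might be computed from a raw visit log. It is written in Python and rests on assumptions of mine: visits arrive as hypothetical (visitor_id, visit_date) pairs, and recency is measured per visitor as days since the last visit (the classic RFM reading), whereas the list above frames recency in terms of content.

```python
from datetime import date


def recency_frequency(visits, today):
    """Summarize visits per visitor.

    visits: iterable of (visitor_id, visit_date) pairs.
    Returns {visitor_id: {"frequency": count, "recency_days": days since last visit}}.
    """
    summary = {}
    for visitor_id, visit_date in visits:
        entry = summary.setdefault(
            visitor_id, {"frequency": 0, "last_visit": visit_date}
        )
        entry["frequency"] += 1
        if visit_date > entry["last_visit"]:
            entry["last_visit"] = visit_date
    return {
        visitor_id: {
            "frequency": entry["frequency"],
            "recency_days": (today - entry["last_visit"]).days,
        }
        for visitor_id, entry in summary.items()
    }


if __name__ == "__main__":
    sample = [
        ("a", date(2007, 5, 1)),
        ("a", date(2007, 5, 10)),
        ("b", date(2007, 5, 3)),
    ]
    print(recency_frequency(sample, today=date(2007, 5, 15)))
```

The monetary, engagement, and attention dimensions are left out on purpose, since how to score them depends on the site and its goals.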

In Phase 4: Segment, the analyst segments by:

  • Behavior
  • Demographics
  • Referrers
  • Time-based metrics
  • Event orientation
  • Psychographics
  • Topographics

segment1.jpg
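As one small illustration of the Segment phase, the sketch below groups visits by referrer. It is Python, and it assumes (my assumption, not part of the methodology as presented) that each visit carries a raw referrer URL string; the `segment_by_referrer` name and the "(direct)" label for empty referrers are made up for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def segment_by_referrer(referrer_urls):
    """Count visits per referring host; empty referrers are counted as '(direct)'."""
    counts = Counter()
    for url in referrer_urls:
        # urlparse returns an empty netloc for empty or scheme-less strings.
        host = urlparse(url).netloc.lower() if url else ""
        counts[host or "(direct)"] += 1
    return counts


if __name__ == "__main__":
    sample = ["http://www.google.com/search?q=x", "", "http://www.google.com/"]
    for host, visits in segment_by_referrer(sample).most_common():
        print(host, visits)
```

The same counting pattern would apply to the other segment types in the list, given a field to group on.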

In Phase 5: Create/Optimize, the analyst should work with his geeks and other gurus to create and optimize:

  • New dimensions
  • Groupings
  • Filters
  • Content types
  • Audience development strategies
  • Metadata
  • Self-describing naming conventions

create.jpg

In Phase 6: Test, you guessed it, the analyst tests using:

test1.jpg

In Phase 7: Validate, the analyst watches the numbers:

  • Key metrics from your data collection methods
  • Surveying (in context)
  • Audience panel data (in context)
  • Backtesting:
    • Comparative reporting
    • AB reporting
  • Counting the money

validate.jpg
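Comparative reporting in the Validate phase boils down to putting two periods side by side and noticing what moved. Here is a minimal Python sketch under my own assumptions: metrics come as plain dictionaries, and a change is flagged when it moves by at least a threshold percentage (10% here, an arbitrary choice). The function names are hypothetical.

```python
def percent_change(previous, current):
    """Percent change from previous to current; None when previous is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


def compare_periods(previous, current, threshold_pct=10.0):
    """Compare two {metric: value} dicts and flag changes at or beyond the threshold."""
    report = {}
    # Only metrics present in both periods can be compared.
    for metric in sorted(previous.keys() & current.keys()):
        change = percent_change(previous[metric], current[metric])
        flagged = change is not None and abs(change) >= threshold_pct
        report[metric] = {"change_pct": change, "flagged": flagged}
    return report


if __name__ == "__main__":
    last_week = {"visits": 1000, "page_views": 5000}
    this_week = {"visits": 1200, "page_views": 5100}
    for metric, row in compare_periods(last_week, this_week).items():
        print(metric, row)
```

The flag is only a prompt to go look; whether a change matters still depends on the goal the metric is tied to.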

Then, you just keep on rinsing, lathering, and repeating across your business processes:

  • Identifying goals
  • Discovering new data/ideas
  • Understanding online behavior
  • Segmenting the data
  • Creating new content and/or optimizing the delivery of existing content
  • Testing hypotheses
  • Validating your strategies to your goals

So there it is: a simple model for “doing web analytics” for non-transactional sites.

Thanks for visiting! Please come again!

idea_light_bulb_1.jpg