Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Senior Director at a large, global Internet company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.


Archive for 'Web 2.0'


Web Analytics Wiki! The times they are a-changing!

Awesome news.  Thanks to my friend, Dylan Lewis -some call him Bob or Meriwether- the web analytics industry has a WIKI.  According to the almighty “define:” operator at Google via Answers.com, a Wiki is:

  • A website or similar online resource which allows users to add and edit content collectively.
  • A collection of websites of hypertext, each of them can be visited and edited by anyone. “Wiki wiki” means “rapidly” in the Hawaiian language.
  • Online collaboration model and tool that allows any user to edit some content of webpages through a simple browser.
  • A web application that allows users to add content, as on an Internet forum, but also allows anyone to edit the content. Wiki also refers to the collaborative software used to create such a website.

In true New England diction, it’s a wicked wiki.  Wicked awesome that is.

Here’s the word from the Passionate Analyst, himself:

I am pleased to announce that WikiWebAnalytics.com is now up and running. WikiWebAnalytics.com is THE place to provide details, articles, lore, and information about the world of web analytics.

http://www.wikiwebanalytics.com/

This wiki is meant to provide an online resource for web analytics professionals and people wanting to know more about web analytics. Contributing to it will help shape the web analytics industry, community, and future web analysts.

Here is the goal - create 300 articles in 3 months. 300 articles will help the wiki become THE resource for new and existing web analytics professionals.

Check it out at http://www.wikiwebanalytics.com.  Have fun starting an article or editing one. 

It may be high time for the Standards Committee at the Web Analytics Association to add currently-approved definitions, methinks.

I stumbled upon the Open Web Analytics Project… interesting…

Found this site in blogistan:  The Open Web Analytics Project

Peter Adams, former CTO of LookSmart (NASDAQ:LOOK) wants to “make analytics free.”  While I already thought we had a rather awesome free tool, it looks like Peter may also want to “make analytics open.”  That’s inspired me to alert you about his work in my blog.

I quote:

“Open Web Analytics (OWA) is an open source web analytics framework written in PHP. OWA was born out of the need for an open source framework that could be used to easily add web analytics features to web sites and applications. The OWA framework also comes with built-in support for popular web applications such as Wordpress and MediaWiki. As a generic web analytics framework, OWA can be extended to track and analyze any web application.”

While I haven’t dug into this project deeply, I’m intrigued on the surface for a number of reasons:

1) Free.  OWA even has a wiki.

2) Open and Interoperable.  Supports a PHP API, PHP invocation, HTTP API, and Javascript.

3) Integrated with WordPress and MediaWiki. New media features are provided out of the box.  RSS tracking is present.  There’s Google Maps integration (visitor plotting), and it outputs Google KML files (for Google Earth).

4) Event-based framework.  Composed of ”event types and event handlers“ that perform a specific analytic or logging function. “Events are composed of an Event type and a message. An Event’s message could be an array, object or any other data type.“ 

5) Provides developers with a feature set including a full model-view-controller based framework, an extensible module and plugin framework, an object relational mapping layer, and a lite templating layer.  Database-driven configuration.  There’s even a heatmap (ClickHeat project).
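To make the event-based idea in point 4 concrete, here is a minimal sketch of event types, messages, and handlers. It is written in Python purely for illustration and is not OWA's actual PHP API; the names `Event`, `EventDispatcher`, and the `page_request` event type are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, DefaultDict, List


@dataclass
class Event:
    event_type: str  # hypothetical type name, e.g. "page_request"
    message: Any     # an array, object, or any other data type


class EventDispatcher:
    """Routes each event to the handlers registered for its type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def register(self, event_type: str, handler: Callable[[Event], None]) -> None:
        """Attach a handler that performs one analytic or logging function."""
        self._handlers[event_type].append(handler)

    def dispatch(self, event: Event) -> int:
        """Call every handler for the event's type and return how many ran."""
        handlers = self._handlers.get(event.event_type, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Usage: one logging handler that records the requested URL.
page_log: List[str] = []
dispatcher = EventDispatcher()
dispatcher.register("page_request", lambda e: page_log.append(e.message["url"]))
handled = dispatcher.dispatch(Event("page_request", {"url": "/about"}))
```

The point of the pattern is that adding a new kind of measurement means registering another handler, not changing the code that fires the events.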

OWA provides an interesting model for how vendors can move toward technical openness.  To me, OWA is another sign of how innovation outside of the “top vendors” pushes our industry forward to adapt to the rapidly-evolving internet and the future need for system and business actuation from integrated analytics.  

If this innovation can generate scale, it has the potential to be disruptive, but right now it still seems a bit esoterically technical and overly dependent on one person (but that’s how Linux started, isn’t it…).  The average marketer wouldn’t know how to get started with it, but the Web 2.0 geek would know how to use it.

I’m looking forward to seeing if new mashups provide open access to their analytics using OWA… 

One to watch…


Second Life, World of Warcraft, and other Virtual Worlds need Web Analytics APIs… or else they may be “DOOM”ed by Open 3D Environments

Virtual Worlds and Web Analytics… Y’all ever play around with Second Life or World of Warcraft?  I have.  I think the concepts and worlds are very, very interesting and fun.  I find their messaging around the analytics of their user base even more entertaining though.  It’s like looking at ComScore and NNR for accurate web analytics data… really fascinating demographic stuff of questionable accuracy outside the frame of their audience panel and technology.  For example, I have three avatars, but have only downloaded one client. The trends are compelling though…

Some CMO’s I know won’t touch Second Life with a virtual ten foot, paisley, polygon pole.  Some finance folks I know laugh over beers about Linden Dollars.  Does that mean specific corporations become a central bank setting monetary policy subordinate to the central bank in the server’s home country?  How do International Fisher Relations apply when you have no interest rate?  My friends who have physical bodies say “virtual worlds are for when you have no friends in the real one.” Harsh criticisms, but they don’t negate the fact that something is happening and people are participating on some scale.  We’re all going to “do web analytics” on virtual worlds some day (maybe sooner than we think).

Where are the APIs for analytics data from these companies?  I believe Linden Lab announcing an analytics API would help push adoption by marketers forward and increase spend rates.  When I look at emerging technologies for 3D online collaboration, like OpenCroquet, I see the end of walled gardens like Second Life and WoW unless they open up the platform:

“Second Life doesn’t create a computational environment that belongs to its users - it uses a constrained computational environment (its servers) to capture “eyeballs” for a variety of schemes to derive revenue from them. With Croquet, users/developers may freely share, modify and view the source code (due to Croquet’s liberal license), the technology is not hosted on a single organization’s server (and hence governed by that organization as was the case with ViOS and now with Second Life), and it provides a complete professional programmer’s language (Smalltalk/Squeak), integrated development environment (IDE), and class library in every distributed, running participant’s copy (the programming development environment itself is simultaneously shareable and extensible). Croquet based worlds can also be updated while the system is live and running.”

Other online collaboration environments that would benefit from an open source of verifiable measurement include:

  • Uni-verse.  An “open source Internet platform for multi-user, interactive, distributed, high-quality 3D graphics and audio for home, public and personal use.”
  • Muse. A “software platform allowing organizations to create collaborative custom solutions that utilize rich media, 3D environments, and multi-user capabilities. Using Muse, developers can create immersive 3D environments that unite video and animation, audio, html, 3D models and much more.”
  • Virtual Object System.  A “free and open platform for multiuser 3D virtual reality and interactive, collaborative 3D virtual spaces, and collaborative data systems in general.”

And the big guys and gals over at Microsoft and Sun are experimenting too (where are Google and Yahoo? - do tell me!):

  • Microsoft’s Task Gallery.  A “novel approach to bring existing, unmodified Windows applications into a running 3D virtual environment. The result is a working platform for experimentation in 3D user interfaces, in which the user retains all familiar productivity tools. This also allows for a smooth transition between traditional 2D interfaces and our new 3D territory.”
  • Sun’s Looking Glass Project. A project based on “Java technology” that “explores bringing a richer user experience to the desktop and applications via 3D windowing and visualization capabilities.”

Notice what all of these visionary ideas have in common: openness.  It’s only through open standards to key interfaces in these systems that we web analysts will be able to do what we do.  

So that raises the rhetorical question: which web analytics tools right now could even work with extended data models for 3D virtual collaboration environments?

I’m looking forward to how management at the following companies evolves their business models to focus on openness through analytics enabling their sustainable growth rate:

As Marshall Sponder forms the Web Analytics Association’s Social Media working group, I’m looking forward to hearing your voice on the phone calls.  Make sure you also read my good friend Eric Peterson’s take on some of this area as well.


Part II: Google Analytics V2 is AWESOME, but still falls short for my complex needs…

I began my career in “information retrieval” back in the 1990s, when there was no Google, at a little startup out of a think tank.  You can still use the technology that I was very proud to work on.  It is called InQuery and is still installed on the Library of Congress’ web site and used by Clinton Bush to search the congressional record (if he does that… maybe he has people to do that is all I’m saying ;) ).

We didn’t call it Search then. When I spoke about what I did, people thought I worked for that evil organization in that De Niro movie Brazil (called “information retrieval” for those who haven’t seen it). Since I lived in a place called the “Happy Valley,” I understood why people thought that. Then I would explain how it all started with Ranganathan, moving those who cared enough to listen all the way through to tf-idf.

The day the engineering team told me to check out what some guys at Stanford were doing with IR was very memorable. “Stop using Northern Light, Google is more precise,” they said. I dig minimalist Art, so the interface instantly impressed me (Krug must be proud). From that day forward, I’ve been hooked on the recall and its precision.  Honestly, I started using Google about three weeks after it came out on the Internet.  I still use it (apparently mostly on Tuesdays in March around 11pm this year).

Google has revolutionized 21st century business.  I have only respect and admiration for their accomplishments, organizational brainpower, and ethics.  I’ve reconciled the Great Firewall of China stuff.  And for the record, you can get through the Great Firewall using RSS (for now). You can even still use Google.com I hear (though I have never been to China to test that out…).

I’m a very happy investor as well.  In retrospect, I should’ve been less risk averse and sold everything I owned, maxed my credit cards, and eaten rice and beans to buy more shares.  But I digress…

Right now, however, I wouldn’t use Google Analytics V2 for an advanced web analytics implementation (yet :) ) that required deep integration with a data warehouse and business intelligence software across 100 sites with an amazing amount of cardinality and heterogeneous design patterns. Here’s why:

  • No control over processing. Some sites only need daily batch; other sites are driven by the need for real-time. As media companies begin to realize they need to bake in automatic content targeting based on analytics data to do real-time site optimization based on personas, can Google Analytics stand up to the challenge right now? Could performance-based models based on reader intent survive with GA? Imagine if Digg were updated once a day?!!

  • No scalable method for rolling-up. As anyone who has ever had to manage more than one site and had a boss knows, aggregate reporting is a must-have that I don’t have with GA.  I realize I can do it if I don’t care who has access to which domains.  I’d give the same tracking code to all sites and use filters to separate them, leaving one profile that isn’t separated out.  The challenge here is first party cookies: as people jump domains, a new visit starts.  So I could then treat the various domains as third party shopping carts and have them carry cookies: http://www.google.com/support/analytics/bin/answer.py?answer=26915&topic=7282
    So then I might have duplicate naming issues, like on the home page, right?  So I can use urchintracker() to fix that.  Then if I didn’t want site owners to be able to administer each other’s sites, I’d have to create different accounts for each site, and then put a second set of GA code on all the sites.  With that method, I now have two code bases to maintain.  Did I ever mention that for a time I ran SCM for a publicly traded $100M software company (using SourceSafe and then StarTeam, when Starbase owned it)? Heh.  I learned to really like having a single baseline.

  • No ability to extend the data model. I am somewhat well-versed with urchintracker(), but that only goes so far. How can I define Events, Interactions, Contributions, and Personae as part of my model? At a more basic-level, can I define or create custom metrics? Better yet, if I have an overwhelming amount of structured data with custom dimensions that must be integrated with the web analytics data mart, can GA do it?

  • No published, public schema.  How do I extend or integrate my data model if I don’t know Google’s?

  • Limited features for decoding and looking-up. You can only access/manipulate the data that is currently in GA.  I hear lookup tables do exist but they are not publicly available.  They were in the old Urchin on Demand, right?  I realize you can do some of this stuff with Custom Advanced Filters, but can a small team easily maintain those methods across many sites?
    URI stems are just fine to examine when you “get it,” but to other folks they are just G(r)eek ruminations. Does an editor want to see ../article/45ty893kld323jdq2.html or the editorially-endowed title “Web Analytics Geek?” Say I have a cookie in the format 112456787.4958423452.210342943: what could I look up that might help me understand that visitor better? Perhaps an email address? I realize that with urchintracker() this can be done to some degree, but I am concerned about such hard-coded methods scaling and remaining maintainable across millions of pages. If I use the urchintracker() function to replace stems with page titles, then I can’t use the Overlay.

  • No alerting. It’s very useful to let the right people know when some metric goes up (or down) based on a pre-defined threshold.

  • No bot detection. SEOers tell me “if we can’t identify how bots crawl the site and where they exit, then we can’t do our jobs to the best of our ability.”

  • No public API to key interfaces using open standards.  I dig standards.  My old, amazingly cool, brilliant, ethical boss, a guy named Nate T, who is now an SVP at FAST, sent me to Amsterdam in 2000 for the 9th w3c convention at the RAI.  At that time XML was just coming into vogue.  I quickly realized, like 7 years ago, that without XML-based APIs, I would limit my ability to reach what was then called the Semantic Web. We still aren’t even close to being semantic in web analytics (and honestly I am still working on what that really means in my head).
    If there were an API, you could pull the data out of GA and then manipulate it in some other application.  I realize you can shred the URL and easily decode the URL structure to do some custom coding to extract data, but what happens to my implementation if/when the URL structure changes? Other vendors do have APIs.
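The decoding and look-up wish from the list above can be sketched outside the tool, once report data has been exported. This is a minimal sketch in Python, not a GA feature: the `TITLE_LOOKUP` table and the functions `decode_stem` and `relabel_report` are hypothetical names, and the page-view numbers are made up for illustration.

```python
from typing import Dict

# Hypothetical lookup table mapping a URI stem to its editorial title.
TITLE_LOOKUP: Dict[str, str] = {
    "../article/45ty893kld323jdq2.html": "Web Analytics Geek",
}


def decode_stem(uri_stem: str, lookup: Dict[str, str]) -> str:
    """Return the editorial title for a stem, or the stem itself if unknown."""
    return lookup.get(uri_stem, uri_stem)


def relabel_report(page_views: Dict[str, int], lookup: Dict[str, str]) -> Dict[str, int]:
    """Re-key a {stem: page views} report by title, summing stems that share a title."""
    relabeled: Dict[str, int] = {}
    for stem, views in page_views.items():
        title = decode_stem(stem, lookup)
        relabeled[title] = relabeled.get(title, 0) + views
    return relabeled


# Illustrative numbers only.
report = relabel_report(
    {"../article/45ty893kld323jdq2.html": 120, "/about.html": 30},
    TITLE_LOOKUP,
)
```

Keeping the table separate from the page code is the design choice that matters here: the titles can change without touching millions of pages, and the raw stems stay intact for features that depend on them.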

I have helped deploy Google Analytics on a number of non-profit sites and friends’ sites that don’t have enterprise-level needs, and they love it. In fact, I totally dig it too.  I would even say I “wicked” dig it for all the New Englanders out there.

I have only the utmost respect for the team and how they’ve literally helped to push forward the industry and raise the bar of web analytics technology, but I only use it within context and based on client needs.  The issue I’m getting at here is that I can deploy Google Analytics in an enterprise, but would I want to right now, especially if I have complex data integration needs or need to maintain one technology across many sites?

One of the areas of GA I *really* like is the services model.  By delegating services and support work to Google Analytics Partners, like Robbin Steif’s team at LunaMetrics and Justin Cutroni’s team at EpikOne, you get some of the best minds and most dedicated professionals working in web analytics helping you implement, solve problems, and extend your web analytics implementation.   These folks know their stuff and are consummate, ethical professionals.  If you are so inclined, you can even work with Avinash Kaushik (ever heard of him? :) ). 

To align it all with your strategic needs, you can even call my good friend Eric Peterson for some strategic consulting. 

While other vendors have a preferred partner ecosystem, I commend Google for helping out entrepreneurs and bringing together the best in the business to help you.  High fives!

Imho, the best thing Google could do next is raise the bar even higher and think “integration, integration, integration” in V3. V2 is an amazing aesthetic improvement by a very smart, capable team (hello Jeff G!), but the technology is still not a total match for complex, enterprise needs. But it could be. One of my geek fantasies is sitting down with the great GOOG and helping them figure that out.

My eye is on how GOOG extends the technology behind the tool in the future beyond SMB. I’m also keeping an eye on that super-duper, amazingly cool Graphing feature (shown below)… WOWZERS!

google_analytics_v2_context_from_time_renamed1.jpg

Non-Transactional Web Analytics: A Methodology I presented at a cool ClickZ Conference

New York City is always amazing, even if it’s full of Yankees fans (go Red Sox!). Pitching always beats hitting, btw. And I’m looking forward to my Green Monster seats versus the Tigers the Monday after Emetrics.

NYC Gotham is just so full of diversity, life, and energy, I just can’t help staying out really late and soaking it all in. Where else can you see a public installation of an Alexander Calder mobile and then walk down the street to check out a Hans Hofmann mural? Did you know the New York Public Library has the original manuscript for T.S. Eliot’s The Waste Land? And they let you look at it (if you are persistent and just keep asking for years). The original title, before Ezra Pound crafted it, was “He Do the Police in Different Voices.” Ahhh, that helps understanding, ay?

So I was in and out of the world’s Great Metropolis for a whirlwind evening and day, thanks to an invitation to speak with the smart folks at ClickZ and Incisive Media.

Rebecca Lieb, EIC at ClickZ, impaneled me with the master of monetization, Jason Burby, and the savvy applier of insights, Neil Mason, to present on “Non-Transactional Analytics.” Y’know the kind when you don’t really sell stuff, and instead produce content and/or sites that are informational and navigational.

Neil riffed on context and the strategic approach to non-transactional analytics. Jason riffed on what I’ll call “monetization modeling.” Both had some brilliant thoughts and imparted actionable knowledge to an audience hungry for knowledge about metrics. I’m certainly looking forward to Jason’s forthcoming book, and reading Neil’s next column.

Me? Well, I was nicely sandwiched in the middle, and presented a tactical, seven-phase methodology for “non-transactional analytics.” The process has measurement endemic throughout and is iterative and recursive (self-referential). Here it is:

  • Phase 1: Identify
  • Phase 2: Discover
  • Phase 3: Understand
  • Phase 4: Segment
  • Phase 5: Create/Optimize
  • Phase 6: Test
  • Phase 7: Validate

In Phase 1, the analyst identifies:

  • Internet ecosystem
  • Tools
  • Processes impacted
  • Goals:
    • As I like to say “a metric/KPI has meaning in position and relation to a goal (the signified).”
  • Revenue and contribution
  • Impact on value chain

identify.jpg

In Phase 2: Discover, the analyst discovers:

discover3.jpg

In Phase 3: Understand, the analyst reconciles:

  • Origin of the traffic
  • Sequences of events that generate value
  • On-site success events
  • Visitor population and how it responds to:
    • Recency – how recent is the content for the site’s audience?
    • Frequency – how frequent are visitors visiting?
    • Monetary – what’s the monetization impact? Jason Burby will know.
    • Engagement – I refer to my esteemed friend, Eric Peterson.
    • Attention – Stephane Hamel has some excellent thoughts on Davenport’s Attention economy.
    • Currency – How current is the content to your audience?
    • Relevancy – How relevant is the content to your audience?

rfm.jpg 

In Phase 4: Segment, the analyst segments by:

  • Behavior
  • Demographics
  • Referrers
  • Time-based metrics
  • Event orientation
  • Psychographics
  • Topographics

segment1.jpg

In Phase 5: Create/Optimize, the analyst should work with his geeks and other gurus to create and optimize:

  • New dimensions
  • Groupings
  • Filters
  • Content types
  • Audience development strategies
  • Metadata
  • Self-describing naming conventions

create.jpg

In Phase 6: Test, you guessed it, the analyst tests using:

test1.jpg

In Phase 7: Validate, the analyst watches the numbers:

  • Key metrics from your data collection methods
  • Surveying (in context)
  • Audience panel data (in context)
  • Backtesting:
    • Comparative reporting
    • AB reporting
  • Counting the money

validate.jpg

Then, you just keep on rinsing, lathering, and repeating across your business processes:

  • Identifying goals
  • Discovering new data/ideas
  • Understanding online behavior
  • Segmenting the data
  • Creating new content and/or optimizing the delivery of existing content
  • Testing hypotheses
  • Validating your strategies to your goals

So there it is: a simple model for “doing web analytics” for non-transactional sites.

Thanks for visiting! Please come again!


Inspired by User Generated Content, Web Analytics, and Wine…

This past Thursday evening I went to an event at Mistral in Boston hosted by an Internet consultancy named Molecular. Molecular began its web business in the mid-1990’s. They “did” Fidelity’s first website. That’s cool stuff in my book. Since then, they’ve done so much, and are now linked by Isobar.

My favorite part of a Molecular event is the opportunity to listen to smart people speaking about Internet innovation. Not to mention the fine food, good wine, and bright sommelier (who digs the first growth)… From a semiotic perspective, such a well-planned and engaging evening tells me a lot about Molecular as a company: focused, creative, organized, smart, connected, and successful.

So why am I telling you this on an analytics blog? Well, the evening’s topic was “user generated content”:

    Today, technology has given customers control to determine what messages they will listen to and when they will listen – as well as a means to let their own voice be heard. This may be difficult, as it is much different than what we are accustomed, but denial of the customer voice will not make it go away - it’s only getting louder. Only marketers who can learn to adapt will remain successful in the jungles of untamable content.
    Effective marketers must learn to utilize user-generated content to their benefit by creating authentic, positive, and valuable ways to engage customers in a “conversation” and incorporate their voice. During this provocative discussion, panelists will share their insight on this concept, its challenges, and benefits. These marketing experts will share their real world experiences and insight into such issues as managing, surviving, and spinning negative content, as well as maximizing the advantages of the positive.

UGC is powerful stuff. The meshing of the mainstream internet and media has made it unavoidable. Has what I said influenced your opinion about Molecular? Made you want to eat at Mistral the next time you’re in Boston?

So how do you measure User Generated Content? That was the question I asked the speakers from Reebok and TripAdvisor at Thursday’s party.

The good news is that both companies claim to use web analytics to measure UGC and, like everyone it seems, are looking to do it even better. That means making better use of existing data, deploying or upgrading technology, and/or extending their data model.

So I was thinking about making better use of existing data by working with and segmenting metrics and dimensions.

UGC dimensions could include:

  • Event:
    • Post
    • Comment
    • Interaction (with types: play, pan, zoom, edit)
    • Contribution (with types: mashup and file)
  • Visitor
  • Persona

UGC metrics could include:

  • Value scores.
  • Counts of inbound/outbound links and new/return/repeat visitors.
  • Search metrics, like organic search visits and visit rate.
  • Time-based metrics, like total time online per visitor and average visit frequency and duration. 

When the web analyst creates this type of mental model for measuring UGC, selecting new technology or working with your geeks to extend the data model becomes a more lucid, focused activity.

For example, I could take a look at some cool UGC and:

  • Value score events subordinate to the page view.
  • Value score an engagement level of those events.
  • Multiply the two together to generate a type of engagement metric.
  • Identify the “Event Path” with the highest engagement.
  • Identify the “visitor” or “visit” with the highest engagement. 
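The scoring steps above can be sketched roughly as follows. This is a minimal sketch in Python under stated assumptions: the value scores in `EVENT_VALUE`, the 1-to-3 engagement scale, and all function and class names are hypothetical, since real weights are a business decision.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical value scores per event type (assumed weights, not from any tool).
EVENT_VALUE: Dict[str, float] = {
    "post": 5.0,
    "comment": 3.0,
    "interaction": 1.0,
    "contribution": 4.0,
}


@dataclass(frozen=True)
class UgcEvent:
    visitor_id: str
    event_type: str
    engagement_level: float  # hypothetical scale: 1 (low) to 3 (high)


def event_score(event: UgcEvent) -> float:
    """Multiply the event's value score by its engagement level."""
    return EVENT_VALUE.get(event.event_type, 0.0) * event.engagement_level


def engagement_by_visitor(events: Iterable[UgcEvent]) -> Dict[str, float]:
    """Sum the per-event engagement scores for each visitor."""
    totals: Dict[str, float] = {}
    for event in events:
        totals[event.visitor_id] = totals.get(event.visitor_id, 0.0) + event_score(event)
    return totals


def most_engaged_visitor(events: Iterable[UgcEvent]) -> Tuple[str, float]:
    """Return the (visitor_id, score) pair with the highest total engagement."""
    totals = engagement_by_visitor(events)
    return max(totals.items(), key=lambda item: item[1])


sample_events = [
    UgcEvent("v1", "post", 2.0),          # 5 * 2 = 10
    UgcEvent("v1", "comment", 1.0),       # 3 * 1 = 3
    UgcEvent("v2", "interaction", 3.0),   # 1 * 3 = 3
    UgcEvent("v2", "contribution", 3.0),  # 4 * 3 = 12
]
top_visitor = most_engaged_visitor(sample_events)
```

Grouping by a visit identifier or by an ordered sequence of event types instead of `visitor_id` would give the most engaged visit or “Event Path” in the same way.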

Then I could wield the extended data model in my analytics tool to identify the following online behavior and better understand my UGC during a period:

  • Ratio of:
    • events:visitors
    • events:visits
    • contributions:visitors
    • contributions:cookies set
    • visitors:personas
    • comments:new posts
    • comments:existing posts
    • posts:visitors
    • comments:visitors
    • mashups:visitors
  • Percent of:
    • high/medium/low contributing visitors
    • high/medium/low interacting visitors
    • high/medium/low engaged visitors
    • new posts
    • new comments
    • new mashups
  • Number of:
    • events per page
    • interactions per contribution
    • comments per post
    • mashups created
    • linked posts
    • contributions per persona
    • visitors per persona
    • total events by post, comment, contribution, interaction
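A few of the ratios above can be computed from a simple event log. This is a minimal sketch in Python, assuming the log is a list of (visitor_id, event_type) pairs for the reporting period; the function names and event type labels are hypothetical, and “visitors” here means only visitors who generated at least one event.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning 0.0 when there is nothing to divide by."""
    return numerator / denominator if denominator else 0.0


def ugc_ratios(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute a handful of UGC ratios from (visitor_id, event_type) pairs."""
    event_list = list(events)
    counts = Counter(event_type for _, event_type in event_list)
    # Assumption: only visitors with at least one event appear in the log.
    visitors = len({visitor_id for visitor_id, _ in event_list})
    return {
        "events:visitors": ratio(len(event_list), visitors),
        "posts:visitors": ratio(counts["post"], visitors),
        "comments:visitors": ratio(counts["comment"], visitors),
        "comments per post": ratio(counts["comment"], counts["post"]),
    }


sample_log = [
    ("v1", "post"),
    ("v1", "comment"),
    ("v2", "comment"),
    ("v2", "comment"),
]
ratios = ugc_ratios(sample_log)
```

To use total visitors as the denominator instead, pass in the visitor count from the analytics tool rather than deriving it from the event log.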

I know I could create other derivatives and use other metrics too. Events, like Interaction and Contribution, need more edification in future posts, but I think the beginnings of this model are clear.

The User Generated Content revolution doesn’t just affect Web business. It’s becoming part of modern capitalism whether you make sneakers or sell ads. This revolution is making web analytics an even more critical process in your value chain.

Page Tagging Web 2.0 Events with Google Analytics and Unica NetInsight

One of the web analytics bloggers in my blogroll is Robbin Steif.  She runs LunaMetrics, leads the Marketing Committee for the Web Analytics Association, and was even recently elected to the Board of Directors for the Web Analytics Association.  Her history includes an impressive set of initialisms: IBM, HBS, MBA, CFO, CEO. 

A few weeks ago she invited me to guest blog on how to measure Rich Internet Applications and enable event tracking. The post evolved into a three-part series:

Check Robbin out at Emetrics in San Francisco.  She’s going to tell you how to put your best foot forward and understand the wild, wide world of web analytics.  And, if you’re lucky enough to attend Emetrics, also come see me and Ian Houston on Wednesday, May 9th.  We’re going to speak about Web 2.0 analytics and present a conceptual framework for measuring all the new and cool stuff going on today.
