Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Senior Director at a large, global Internet company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.


Archive for 'Data Collection'


Thinking about Key Performance Indicators…

The infoglut in web analytics is enormous.  So much data.  Companies report that 69% of all people who consume the data don’t understand it.  How does a business go about making sense of it all?  Formulating a comprehensive KPI (Key Performance Indicator) strategy is a big part of differentiating signal from noise and directing appropriate tool usage.  We’ve all heard about KPI’s before.  They are ratios or derivatives of metrics that pinpoint critical, business relevant web performance.   My good friend, Eric, even wrote a book (a BIG one) about it. 

The process of moving an organization through KPI Change Management starts with a well formulated plan for doing so.  Here are some tips for formulating your KPI plan: 

  • Educate senior management and get managerial buy-in.  Education and buy-in can take shape via a number of methods.  Maybe you publish and circulate an internal-only white paper about the importance of KPI measurement.  Maybe you leave Eric’s book on the chair of your C-level executives.  Perhaps you hold a meeting and present the web site optimization process and how measurement via KPI’s provides the foundational information on which to make site optimization decisions.  Perhaps you take your boss out to lunch and explain that basic reporting and tool access is helpful, but “Web analytics is hard” and that KPI’s give context to the data for staff who are otherwise somewhat confused about what they pull from the tool.  You explain that KPI’s provide a focal point for centering analysis around business goals.  Whatever the method, the goal is managerial approval that “yes, you can do KPI’s.”
  • Determine the audience for the KPI’s and train them.  The importance of KPI’s will vary by stakeholder, and your KPI strategy needs to take that into account. Different segments of stakeholders will be interested in specific KPI’s, and you must accommodate that need.  As an analyst, you should identify the functional roles and job responsibilities of the people who are going to receive KPI reports.   Everyone may not be the right choice (though it could be), and it may make sense to concentrate a KPI rollout on the needs of the few or it may make sense to “go broad.”  Follow up with comprehensive training about your KPI project and how KPI’s can most effectively be used.
  • Start with simple, well-qualified, highly relevant KPI’s.  Some folks will want to throw a “kitchen sink” strategy at KPI’s.  That’s a mistake.  If you report more than 5 to 10 KPI’s (imho) per stakeholder group, you may end up with a set of unworkable, confusing, and neglected reports.  It’s better to report just a few well-qualified, highly relevant KPI’s.  How do you qualify them? By mapping KPI’s to important business objectives.  How do you know they are highly relevant? Because you’ve compelled management to buy in and to agree that they are critical indicators of site success.
  • Elicit the business goals for the KPI’s, compare KPI’s to goals, and report associated variances (i.e. deviations). Make sure you have determined business performance goals for KPI’s.  Goals give context for performance. It’s that simple.  Without goals, you have no context for determining what’s good and what’s bad.  If your conversion rate KPI is 5%, great!  So what, though?  If you know your goal is 3%, awesome job.  If you know your goal is 10%, stop reading now and get back to work - you have your work cut out for you.
  • Identify the frequency and format for reporting.  You need to determine a frequency that is timely and sustainable, and the format in which you present KPI reports needs to be common enough that people can easily examine the data. Perhaps you deliver the reporting in Excel, make it available directly in your tool, use Xcelsius, or create reports using a BI tool.
  • Automate the delivery of the reporting.  Without automation, you may put on the Report Monkey suit and enter Excel hell.  Critical to the successful rollout of any KPI reporting is an automation plan.  Do you email reports, put them in a shared directory, create a set of reports in the tool and provide access, or deliver them in weekly presentations?  The best choice is the option that gets people to use them, listen, and understand what you are trying to do with KPI’s.
  • Follow the reporting up with analysis and guidance.  Depending on the size and scale of your organization and the resources you have to work with, it may not be possible to provide every stakeholder with detailed analysis.  But you need to do your best to follow up KPI reporting with true analysis and guidance.  Why are KPI’s going up or down?  What are the drivers of the changes?
  • Segment, segment, segment. Site level KPI’s are helpful in understanding overall audience and customer behavior, but they hide important details.  When you slice a KPI by a specific segment, you will realize insights that help you conclude what action to take next.  Overall site repeat visit rate is 37%, but the repeat visit rate for customers who use your “product lookup tool” is 96%.  What does that data indicate about how you market the site, or about why people are coming to the site? 
  • Test, test, test.  As you measure > report > analyze > guide based on KPI’s you will undoubtedly determine actions to take on the site.  You should be testing the hypothesis behind these actions via controlled experimentation.   
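To make the "compare KPI’s to goals and report variances" tip concrete, here is a minimal sketch of the arithmetic using the conversion-rate numbers from the list above. The function name `kpiVariance` and the shape of the result are my own assumptions for illustration, and it assumes a KPI where higher is better.

```javascript
// Hypothetical helper: compare a KPI's actual value to its goal.
// Both values are plain numbers in the same unit (here, percent).
// Assumes "higher is better"; flip the comparison for KPI's like bounce rate.
function kpiVariance(actual, goal) {
  const absolute = actual - goal; // difference in percentage points
  const relative = goal === 0 ? null : absolute / goal; // as a fraction of the goal
  return { actual, goal, absolute, relative, metGoal: actual >= goal };
}

// The conversion-rate example: the actual rate is 5%.
const versusLowGoal = kpiVariance(5, 3); // goal of 3%
const versusHighGoal = kpiVariance(5, 10); // goal of 10%

console.log(versusLowGoal.absolute, versusLowGoal.metGoal); // 2 true
console.log(versusHighGoal.absolute, versusHighGoal.metGoal); // -5 false
```

The same 5% reads as a win against one goal and a shortfall against the other, which is the whole point of reporting the variance next to the KPI.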

There’s obviously a lot more to talk about here - from what constitutes a good KPI, to what types of KPI’s different stakeholders should examine, to what are the best KPI’s for particular site types and more.  I guess there are more blog posts for that, but in the meantime I hope you’ve found this blogviation useful.  Let me know if you have any thoughts to share.

Questions to Ask When Assessing Web Analytics and some Random Thoughts…

At some point in the career of a web analyst, you will be asked to investigate, assess, and possibly judge the current state of how a company “does” web analytics.  What are some of the areas you should ask about?  Here are some thoughts and a few questions to ask to help inform your analysis (and grease your mental gears):

  • Business strategy.  Why does the organization do web analytics?  What’s the goal of having a web analytics team?  Who defines the strategy?  What is the strategy?
  • Analytics organization and team structure.  Who is the chief owner of web analytics?  What does the analytics team look like?  How has the team structure been formalized in the organization?  Is the web analytics team effectively staffed, and does it have enough control over resources to do the job?
  • Process.  What analytics processes have been defined?  How does a site or site feature progress from not being measured to being effectively measured?
  • Data collection. What methods for data collection are being used?  How much data is being collected, and for how long is it stored, and at what level (i.e. detail, aggregate)?
  • Reporting.  What data is reported?  What do the reports look like?  Who creates them?  How are they distributed, and in what format?  To whom?  When?  How?
  • Analysis.  What’s the difference in this company between reporting and analysis?  How is analysis communicated to stakeholders?  When?  How?
  • KPI’s.  What Key Performance Indicators are you measuring?  How are they relevant to the business?  What actions have people taken from KPI analysis that improved business performance?
  • Segmentation.  What audience and customer segments exist?  What audience and customer dimensions and attributes are segmented?  Why are they meaningful to the business?  What has the business learned and what action has been taken from the current segmentation analysis strategy?
  • Technology.  What analytics technologies are being used?  What does the schema for web analytics look like?  What homegrown technologies are used?  What external technologies have you bought or deployed for analytics?
  • Integration.  How is web analytics data integrated with other internal and external data?  Is it integrated with other systems, and if so, how?
  • Site Optimization.  Does the company test landing pages, and/or use AB or Multivariate testing software?  If so, whose software, and what business gains have been realized?
  • Advertising/Advertisers. How is analytics used to inform or enable advertisers and advertising?
  • Privacy.  What safeguards does the company take in protecting analytics data? 
  • Qualitative Data.  Is qualitative data contextualized with web analytics data? Do you capture voice-of-customer data?  Use Net Promoter Scores?  Have a research department?  Does web analytics collaborate with research? 

Those are just a few questions to ask.  Many others can be asked.  What would you want to know, and what would you ask?  Please leave a comment.  I’d love to hear your thoughts.

Now for some random thoughts:

  • News from Orem.  API / Fusion / Video Tracking… cool.  I’m pretty psyched that Omniture announced a web services API.  That’s fantastic, and confirms how truly important integration is now and will be in the future for analytics data (as I’ve been saying for years… Google will be next). 

Omniture has announced a new methodology, Fusion, and improved capabilities for tracking video.  All sounds very exciting.  But, like Eric, I’m wondering what revolutionary new methodology Fusion really is.  Or is it just what Eric’s been saying for the last 4 years, branded by Omniture and delivered by the Great Belkin?

Regarding the video capabilities, I haven’t seen a real demo yet, but I wasn’t immediately impressed with what I saw on my friend Marshall’s blog.  Instead of quartile tracking, it seems like you track the playhead (the part of the video playing) across audience aggregates in increments of one-twelfth, and you get some bubbly visualization (what would that look like with 10,000 videos on your site?), and better access to forums.

I’m hoping I haven’t seen the whole ball of wax, and I look forward to Omniture giving me the grand tour. 

But for a playhead visualization, I was much more impressed with what I saw from Visible Measures and their engagement curve.  And what the heck are those folks at Divinity Metrics up to for measuring video? 

  • News from Novato.  One of my favorite gangs of web analytics folks resides in Northern California.  My colleagues at Semphonic have just released a rather impressive “Omniture Implementation Toolkit.”

I was able to procure a copy, and I’m totally impressed.  It’s full of hard-learned and hard-earned real world practitioner knowledge.  If you are trying to implement Omniture, it is well worth the money. 

Now I’m not sure if this document competes with or acts as a companion to Fusion.  All I can say is that I know the folks at Semphonic are smart, savvy, and very experienced, and there are thousands of Omniture customers out there who could benefit from this document.

  • X Change Conference.  I am totally excited for X Change, brought to us this year by Semphonic and Web Analytics Demystified.  The last X Change in Napa at COPIA was one of the most intimate, educational, stimulating, and enjoyable conferences that I’ve been to (and did I mention the wine?).  It was pure “class” all the way (in both the sense of style and learning, and did I mention the wine? ;-).

This year attendance is limited to 100 folks (99 if you count me ;).  Last year, I huddled on “Deploying Measurement Systems in Globally Distributed Enterprises.”  

If you aren’t familiar with X Change or Semphonic, check them out, and make sure to read a few of my favorite bloggers - the prolific deep thinker and expert Gary Angel, the always impressive (and fun) June D(ershewitz), and the bright author and web analytics veteran Phil Kemelor.

Web Analytics Data Collection for Beginners

I’ll get back to talking about the web analytics team soon, but I’ve been getting a few emails from folks just starting out who are a bit confused about data collection.  So I figured I’d blog about it…

When web analysts talk about data collection, they are referring to the method by which counts and measures of things, like page views and durations, are captured by a web analytics tool.  If you’re new to web analytics, data collection can be slightly confusing.  There are three “generally-accepted” methods for data collection in the web analytics industry: 

  • Page tags.  Client-side data collection involves using little snippets of HTML code that reference a JS file and communicate via a beacon to a “page tag server” - the machine that collects the data so it can be sessionized by the web analytics tool (it may not be called that by your vendor).  As a web analyst, if you are using page tags you will have lots of fun tagging every page on your web site and instrumenting the tags with custom variables and campaign codes.  Reasons why people like page tags are numerous, and include the fact that they are fairly efficient in filtering out non-human traffic (as long as the robot doesn’t execute javascript) and can count proxy cached pages (improving accuracy). Page tags are probably the most ubiquitous method for collecting web data today.
  • Log files. Server-side data collection involves parsing text-based log files generated by Web servers.  The server, when instructed to do so, logs every request received from clients in a file called the “log file.”  There are many formats for log files.   Each line in a log file is called a “hit” and contains lots of different stuff - the ip address, a request date/time stamp, the item requested, user agent, referrer, and more.  Many “hits” make up a single page view - that’s why it’s incorrect to use the term “hits” to refer to “page views.”  As a web analyst you will be defining the format of the log file within your tool and moving and synchronizing log files so that they can be processed by your tool.  Some people will claim log file analysis is dated (historic may be more appropriate), or less accurate than page tags (due to caching issues).  Other people like logs because they can reprocess their data.
  • Packet sniffers.  Network data collection involves deploying either software or hardware that intercepts and logs traffic coming over a network.  Every packet is captured and decoded according to a configuration you define.  Your web analytics tool can be configured to process the data captured and decoded by the sniffer.  Packet sniffers are a less common approach for data collection by web analytics vendors.  
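To give beginners a feel for what a log file “hit” looks like, here is a rough sketch that splits one line into the fields mentioned above. It assumes the widely used Combined Log Format (your server may be configured differently), and the function name `parseLogLine`, the field names, and the sample line are all made up for illustration.

```javascript
// Assumption: the server writes the "combined" log format, i.e.
// host ident user [time] "method path protocol" status bytes "referrer" "user agent"
const COMBINED_LOG_PATTERN =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+) ?(\S*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

// Hypothetical parser: returns an object per hit, or null if the line does not match.
function parseLogLine(line) {
  const m = COMBINED_LOG_PATTERN.exec(line);
  if (m === null) return null;
  return {
    ip: m[1],
    timestamp: m[4],
    method: m[5],
    path: m[6],
    status: Number(m[8]),
    bytes: m[9] === "-" ? 0 : Number(m[9]),
    referrer: m[10] === "-" ? "" : m[10],
    userAgent: m[11],
  };
}

// Invented sample line, not taken from a real server.
const sampleLine =
  '192.0.2.10 - - [15/Jan/2008:10:12:01 -0500] "GET /products/index.html HTTP/1.1" ' +
  '200 5120 "http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"';

const hit = parseLogLine(sampleLine);
console.log(hit.ip, hit.path, hit.status);
```

A real log-based tool does this kind of parsing for you once you tell it the log format, which is why defining the format correctly matters so much.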

Interestingly some vendors offer “hybrid” data collection, which combines multiple data collection methods.  This mode could be considered a “fourth type” of data collection.  Most commonly hybrid data collection means using logs and page tags to collect different data elements, but other combinations are possible as well. 

As you investigate the best data collection method for your implementation, ensure you deeply consider the pros and cons of each method.   For example, page tags capture information about the browser (like screen resolution) that logs just can’t.  But what if you need to measure non-javascript executing clients, like some mobile devices?  Log files capture information about crawlers (i.e. robotic traffic) that page tags just can’t.  But can you adequately filter robotic traffic and maintain host exclusions?  Packet sniffers capture pretty much everything, but can be challenging to customize to your exact data needs (and you’ll need a fair amount of IT support).
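The questions about filtering robots, maintaining host exclusions, and telling “hits” from “page views” can be sketched in a few lines. This is only an illustration under loud assumptions: the robot tokens, the excluded address, and the list of “page” extensions are invented and far shorter than what a real tool maintains.

```javascript
// Assumed, deliberately tiny lists for illustration only.
const ROBOT_TOKENS = ["bot", "crawler", "spider", "slurp"];
const EXCLUDED_IPS = new Set(["10.0.0.5"]); // e.g. an internal office address
const PAGE_EXTENSIONS = /(\/|\.html?|\.php|\.aspx?)$/i;

function isRobot(userAgent) {
  const ua = userAgent.toLowerCase();
  return ROBOT_TOKENS.some((token) => ua.includes(token));
}

// A hit counts as a page view only if it succeeded and looks like a page,
// not an image, stylesheet, or script.
function isPageView(hit) {
  const pathOnly = hit.path.split("?")[0];
  return hit.status === 200 && PAGE_EXTENSIONS.test(pathOnly);
}

function countPageViews(hits) {
  return hits.filter(
    (hit) => !isRobot(hit.userAgent) && !EXCLUDED_IPS.has(hit.ip) && isPageView(hit)
  ).length;
}

// Invented sample hits: one human page load (3 hits), one robot, one internal user.
const sampleHits = [
  { ip: "192.0.2.10", path: "/index.html", status: 200, userAgent: "Mozilla/5.0 (Windows)" },
  { ip: "192.0.2.10", path: "/img/logo.gif", status: 200, userAgent: "Mozilla/5.0 (Windows)" },
  { ip: "192.0.2.10", path: "/css/site.css", status: 200, userAgent: "Mozilla/5.0 (Windows)" },
  { ip: "192.0.2.44", path: "/index.html", status: 200, userAgent: "Googlebot/2.1" },
  { ip: "10.0.0.5", path: "/index.html", status: 200, userAgent: "Mozilla/5.0 (Macintosh)" },
];

console.log(sampleHits.length, countPageViews(sampleHits)); // 5 1
```

Five hits, one page view: that gap is exactly why “hits” and “page views” are not interchangeable, and why filter rules change your numbers.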

Which one is correct for your implementation?  It depends on your business goals defining what you need to measure…  


Tracking Rich Internet Applications with Google Analytics

About a year ago, I wrote a guest blog post over on Robbin Steif’s blog about using Google Analytics for tracking Javascript and Flash events.  This weekend Jeremy Geelan, SVP over at Sys-Con Media, asked if he could republish the work.  Of course I said “yes.”  Then I noticed that a lot has happened to GA in a year (and more to come, ahem, API’s!).  What I had written was now incomplete, so what you’ll find below is my attempt to sum up “event tracking” using ga.js and the Great Google’s Event Tracking Data Model.  Let me know how I did covering it, and if you think I should clarify or expand on anything.

Since we all know about page tags, let’s get down to business with “the Google” and how it tracks “the Rich Media.”  Google Analytics currently has two different javascript page tags:

  • urchin.js.  The legacy version of the Google Analytics page tag.
  • ga.js.  The current, rebranded version of the Google Analytics page tag.

How you track rich media depends on which page tag you are using.  I’ll discuss using urchin.js first, then ga.js.  I’ll also provide some information about Google’s Event Tracking function for capturing specific “events” within their event architecture.

Tracking Rich Media using Urchin.js

In the legacy version of Google Analytics, the smarties at Google created a little JavaScript function called urchinTracker() that enables event tracking.  Use the JavaScript function with an argument specifying a name for the event. For example, the function:

javascript:urchinTracker('/mysite/flashrichmedia/playbutton');

logs each occurrence of that Flash event as a page view of:

/mysite/flashrichmedia/playbutton

Some caveats:

  1. Always use a forward slash to begin the argument.
  2. Actual pages with these filenames do not need to exist.
  3. You can organize your events into any structure or hierarchy you want.

Important: Google says to place your tracking code “between the opening <body> tag and the JavaScript call” if your pages include a call to urchinTracker(), utmLinker(), utmSetTrans(), or utmLinkPost(). As for structure, if the page view is the major event and the “play” event is a minor event, then your hierarchy would be Page View > Event, where the page contains the event, such that:

/mysite/ria_buttons/playbutton
/mysite/ria_buttons/pausebutton
/mysite/ria_buttons/playbutton
/mysite/ria_clips/clip
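If you build these virtual page names in code, a tiny helper can enforce the first caveat (always start with a forward slash) and keep the hierarchy consistent. This is a sketch in modern JavaScript; `virtualPath` is a hypothetical helper of mine, not part of urchin.js.

```javascript
// Hypothetical helper: join hierarchy segments into a virtual page path
// that always begins with exactly one forward slash.
function virtualPath(...segments) {
  const cleaned = segments
    .map((segment) => String(segment).trim().replace(/^\/+|\/+$/g, ""))
    .filter((segment) => segment.length > 0);
  return "/" + cleaned.join("/");
}

console.log(virtualPath("mysite", "flashrichmedia", "playbutton"));
// /mysite/flashrichmedia/playbutton
```

You would then pass the result to urchinTracker() instead of typing each path by hand.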

Some examples of the code (from Google Help):

on (release) {
// Track with no action
getURL("javascript:urchinTracker('/folder/file');");
}

This one above tracks when you click and release (although technically, it just notices the release) of a flash button (and records the file you specify as a page view).

on (release) {
// Track with action
getURL("javascript:urchinTracker('/folder/file');");
_root.gotoAndPlay(3);
myVar = "Flash Track Test";
}

The second one is the same, but by using a function, passing it a parameter, and identifying the instance you want to track, you can measure when your file was used in a specific scene in a little flash movie. So it is a more specific method for handling event tracking in Flash.

onClipEvent (enterFrame) {
getURL("javascript:urchinTracker('/folder/file');");
}

And the third one repeats the action throughout the movie so that each time the file is loaded, it gets tracked as an event. If you were to pass a unique file at the end of the movie, you could recognize it using this method (or the other methods) to know that the whole movie was watched (as long as your session doesn’t time out). Next, wait until Google updates your analytics, then check the Top Content report to see if it all worked. Now let’s discuss how to do the exact same thing using the new _trackPageview function released with ga.js.

Tracking Rich Media using ga.js

In the current version of Google Analytics, the brainiacs at Google created a little JavaScript function called _trackPageview() that enables event tracking.  Use the JavaScript function with an argument specifying a name for the event. For example, the function:

javascript:pageTracker._trackPageview("/mysite/flashrichmedia/playbutton");

logs each occurrence of that Flash event as a page view of:

/mysite/flashrichmedia/playbutton

Some caveats:

  1. Always use a forward slash to begin the argument and use quotes around the argument.
  2.  Actual pages with these filenames do not need to exist.
  3. You can organize your events into any structure or hierarchy you want.

You must put the calls to _gat._getTracker and _initData above the call to _trackPageview.  For example, you would insert the following code:

<script type="text/javascript">
var pageTracker = _gat._getTracker("UA-xxxxxx-x");
pageTracker._initData();
pageTracker._trackPageview();
</script>

Here are some examples of the ga.js code (from Google Help) that replicate what I described above using the most recent code:

on (release) {
// Track with no action
getURL("javascript:pageTracker._trackPageview('/folder/file.html');");
}

This one above tracks when you click and release (although technically, it just notices the release) of a flash button (and records the file you specify as a page view).

on (release) {
// Track with action
getURL("javascript:pageTracker._trackPageview('/folder/file.html');");
_root.gotoAndPlay(3);
myVar = "Flash Track Test";
}

The second one is the same, but by using a function, passing it a parameter, and identifying the instance you want to track, you can measure when your file was used in a specific scene in a little flash movie. So it is a more specific method for handling event tracking in Flash.

onClipEvent (enterFrame) {
getURL("javascript:pageTracker._trackPageview('/folder/file.html');");
}

And the third one repeats the action throughout the movie so that each time the file is loaded, it gets tracked as an event. If you were to pass a unique file at the end of the movie, you could recognize it using this method (or the other methods) to know that the whole movie was watched (as long as your session doesn’t time out).
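Because the tracker has to exist before _trackPageview is called, one defensive option is to route Flash calls through a small wrapper on the page. The sketch below is my own illustration: `trackVirtualPageview` is a hypothetical name, and the `fakeTracker` object is only a stand-in so the example runs without ga.js.

```javascript
// Hypothetical wrapper that Flash could call instead of calling
// pageTracker._trackPageview directly.
function trackVirtualPageview(tracker, path) {
  if (!tracker || typeof tracker._trackPageview !== "function") {
    return false; // tracking code not loaded yet, so skip quietly
  }
  if (typeof path !== "string" || path.charAt(0) !== "/") {
    return false; // virtual paths must begin with a forward slash
  }
  tracker._trackPageview(path);
  return true;
}

// Stand-in tracker for demonstration only (not Google's implementation).
const recorded = [];
const fakeTracker = { _trackPageview: (path) => recorded.push(path) };

trackVirtualPageview(fakeTracker, "/folder/file.html"); // recorded
trackVirtualPageview(fakeTracker, "folder/file.html"); // rejected: no leading slash
trackVirtualPageview(undefined, "/folder/file.html"); // rejected: no tracker

console.log(recorded); // [ '/folder/file.html' ]
```

On a real page you would pass the pageTracker object created by the tracking code in place of the stand-in.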

Tracking Rich Media using Google Analytics Event Tracking

When Google released ga.js in fourth quarter 2007, Google also released a data model for tracking events.  It provides more flexibility and ease of customization than the methods I described above.   The data model makes use of:

  • Objects. These are named instances of the eventTracker class and appear within the reporting interface.

var videoTracker = pageTracker._createEventTracker("Movies");

  • Actions. A string you pass to an event tracker class instance as a parameter.

videoTracker._trackEvent("Stop");

  • Labels. An optional parameter you can supply for a named object.

downloadTracker._trackEvent("Movies", "/mymovies/movie1.mpg");

  • Values. A numerical value assigned to a tracked object.

To set up event tracking you should:

1. Identify the events you want to track.
2. Create an event tracker instance for each set of events.
3. Call the _trackEvent() method on your page.
4. Enable “event tracking” in your profile.

To instantiate an event tracker object, you might do something like this:

var myEventObject = pageTracker._createEventTracker("Object Name");
myEventObject._trackEvent("Required Action Name", "Optional Label", optionalValue);

_createEventTracker() is order dependent and must be called after the main tracking code (ga.js) has been loaded. Next you would call the _trackEvent() method in your source code, either on every page that contains the event or as part of the tracking code for every page:

_trackEvent(action, optional_label, optional_value)

If you wanted to track interaction with the Flash UI, such as the button on a Flash Video Player, you would create a videoTracker object with name “Video”:

var videoTracker = pageTracker._createEventTracker('Video');

Then, in your Flash code for the video player, you would call the videoTracker object and pass a value for the action and label for the event:

onRelease (button) { 
   // ExternalInterface.call invokes a JavaScript function on the page with the given arguments
   ExternalInterface.call("videoTracker._trackEvent", "Play", "MyVideo");
}

You could also use the ExternalInterface ActionScript function as an eval() function to parse FlashVars and attach them to every Flash UI element that needs a tracking action.  For example, the code below associates a Stop action for the Video object and retrieves the provided label and value from the FlashVars:

onRelease (button) { 
   // label and value come from FlashVars and are passed as separate arguments
   ExternalInterface.call("videoTracker._trackEvent", "Stop", label, value);
}

Adding event tracking code would generate event reports in the Content section of the Google Analytics Interface.  Pretty cool stuff, Google!
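To see how objects, actions, labels, and values fit together, here is a toy model you can run anywhere. To be clear, this is not Google's code: `createMockEventTracker` and `summarize` are hypothetical names, the rollup is my guess at the kind of grouping a report could do, and the value of 42 is an invented number of seconds watched.

```javascript
// A toy stand-in that mimics the call shape _trackEvent(action, label, value).
function createMockEventTracker(objectName, sink) {
  return {
    _trackEvent(action, label, value) {
      sink.push({
        object: objectName,
        action: action,
        label: label === undefined ? null : label,
        value: value === undefined ? null : value,
      });
    },
  };
}

// Roll recorded events up by object and action.
function summarize(events) {
  const totals = {};
  for (const e of events) {
    const key = e.object + " > " + e.action;
    if (!totals[key]) totals[key] = { count: 0, valueSum: 0 };
    totals[key].count += 1;
    totals[key].valueSum += e.value === null ? 0 : e.value;
  }
  return totals;
}

const events = [];
const videoTracker = createMockEventTracker("Video", events);
videoTracker._trackEvent("Play", "MyVideo");
videoTracker._trackEvent("Stop", "MyVideo", 42); // 42 = assumed seconds watched
videoTracker._trackEvent("Play", "OtherVideo");

console.log(summarize(events));
```

The point of the model is that one object ("Video") can carry many actions, and the optional label and value let you slice each action further without inventing fake page names.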


Web Analytics Prognostications for 2008

What’s the future hold for Web Analytics in 2008?  Here are a few predictions:

  • Google Analytics releases a real API for getting (and perhaps setting) data.  As you know, I think GA is a fine tool for web analytics, but it has severe limitations when you want control over your data or want to feed data into other systems.  Thus, I predict Google Analytics will go beyond the “Tracking API” and release a real API that allows you to at least get data out of the tool (if not set data as well).  Think of what Feedburner does with their REST-based Awareness API.  Wouldn’t that be nice to have with GA?!
  • HBX Analytics goes away.  I’d be more than a bit nervous if I were an HBX customer because Omniture is going to sunset HBX and migrate everyone to SiteCatalyst, then try to aggressively sell them the rest of the suite. 
  • Long live Visual Sciences.  VS is a powerful tool, quite superior in some regards and very different from anything else Omniture offers.  It’s also real in-house software, not some blackbox.  VS’ extensible schema, flexibility in reporting, scalability, and performance are quite unparalleled in the industry.  I can’t envision Omniture killing it like they will HBX (unless they peel it apart in order to create Discover 3).
  • WebTrends rebrands.  I’m not sure if you agree, but imho WebTrends Marketing Lab was an attempt to rebrand WebTrends.  I expect that interim management will continue attempting to differentiate WebTrends by rebranding products and perhaps the entire company.
  • New and updated standards are released.  As a member of the IAB’s Measurement Council I can tell you that the IAB is getting ready to release the IAB Audience Measurement Reach Guidelines, which attempt to clarify and take a stand on various aspects of server/client-side analytics and audience measurement.  I also envision the WAA increasing the number of terms they define.  But standards are just dandy and quite meaningless unless they are adopted… thus…
  • Standards enforcement is attempted in order to propel adoption. Existing and forthcoming standards will be enforced in 2008.  Enforcement from the WAA will probably come in the form of a publication of a matrix or documentation citing which vendors adhere to the standards and to what degree, what’s missing, what’s different, and so on.  If decision-makers who control budgets believe in standards, this type of document will cause the question ”do you adhere?” to be asked.  If vendors start losing deals because the answer is “no, not at all,” vendors will adopt the standards. 
  • Internal data integration becomes more important for companies and problematic for ASP’s.  When we talk about “integration” I often think people can be a bit shortsighted.  They want to integrate data from other third-party services and tools (like Salesforce.com and their ad server).  While there is certainly real value in integrating external data with web analytics data, significant value comes from integrating web analytics with internal data, such as data residing in internally-hosted CRM systems, finance, subscription, and lead generation databases. Most vendors have barely figured out how to deal with detail-level external data integration in 2007, even though many customers are demanding it.  I expect that in 2008, internal data integration will be more commonly demanded and even more problematic for ASP’s. 
  • BI tools provide better support for and integration with Web Analytics tools.  The current allotment of “enterprise” level web analytics tools are inferior to the capabilities provided by business intelligence tools from companies like Business Objects or Cognos.  Expect these BI vendors to create features for dealing with web analytics data in 2008.  Either that, or these web analytics tools need to grow up and learn a few things from BI. 
  • Web Analytics as performance management.  KPI-based site optimization means using data to guide the modification of user experience to deliver on goals.   Since goals are measurable and can be plotted against performance, it’s totally logical to use web analytics as a performance management tool.  Expect to see that gestalt in tool usage come into vogue and be discussed more in 2008. 
  • Web Analytics as part of business process automation.  Having the marketing department fielding page tags with campaign codes may work for some (small) companies, but when you work for an enterprise with thousands of clients and simultaneous campaigns across multiple channels, endemic tagging and subsequent tool configuration becomes challenging.  As part of the web analytics process, I expect to see tools support some level of business process automation enabling web analytics.
  • Features for measuring the Mobile Web.  Right now, with a log file based tool, I can segment out Mobile traffic based on user agent.  If I want to use a page tag, I have to consider js limitations.  The mobile web is the next frontier, and I only know of one web analytics vendor who is doing a decent job measuring it right now, so I expect to see more features released this year for measuring Mobile.  
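On the mobile point, segmenting by user agent in log data boils down to string matching. Here is a rough sketch; the token list is an assumption of mine, it is nowhere near complete, and the sample user agents are invented.

```javascript
// Assumed, non-exhaustive tokens that show up in some mobile user agents.
const MOBILE_TOKENS = ["iphone", "blackberry", "windows ce", "symbian", "opera mini", "palm"];

function isMobileUserAgent(userAgent) {
  const ua = (userAgent || "").toLowerCase();
  return MOBILE_TOKENS.some((token) => ua.includes(token));
}

// Count hits per segment; each hit is an object with a userAgent field.
function segmentByDevice(hits) {
  const counts = { mobile: 0, other: 0 };
  for (const hit of hits) {
    counts[isMobileUserAgent(hit.userAgent) ? "mobile" : "other"] += 1;
  }
  return counts;
}

const mobileSample = [
  { userAgent: "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en)" },
  { userAgent: "BlackBerry8100/4.2.0" },
  { userAgent: "Mozilla/5.0 (Windows; U; Windows NT 5.1)" },
];

console.log(segmentByDevice(mobileSample)); // { mobile: 2, other: 1 }
```

With page tags you never see the hits from devices that don't run the javascript, which is why this kind of segmentation is easier from logs.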

So that’s that.  Like a band named PIL once said in the song called Rise, “I could be wrong, could be right!”  Am I off-base, misguided, or accurate?  Whether you agree or disagree, let me know… I’d love to hear your thoughts and your predictions for Web Analytics 2008…


The Yin and Yang of Online Metrics: Audience Measurement and Web Analytics

I write a monthly column for Mediapost’s Metrics Insider.  This month I wanted to talk about the different schools of thought in online metrics because at the end of the day we are all in Internet measurement together. Hope you enjoy the read:

Audience measurement and Web analytics systems are like the yin and yang of online metrics. Yin and yang are different, opposing forces, but they also complement each other. Think of Web analytics and audience measurement data in the same way: different, sometimes in opposition, but complementary.

The major difference between these systems is data collection:

  • Audience measurement companies don’t collect data directly from the sites being measured. They all rely on proprietary methods. Hitwise gets data from ISPs. Compete uses a toolbar that you can download as well as ISP and panel information. Nielsen and comScore use data collected from panels to create online metrics that they believe accurately represent overall Internet usage. Due to all these different data collection methods and no shared standards across companies, metrics from audience measurement firms are never identical with each other.
  • In Web analytics, data is collected directly from actual site activity. Methods include client-side data collection via javascript page tagging, server-side data collection via log file processing, or network data collection via packet sniffing. Sometimes methods such as page tagging and log file processing are combined in what’s called “hybrid data collection.” Vendors include Coremetrics, Webtrends, Unica, Visual Sciences, Omniture, Google, and others. The challenge with Web analytics tools is that each tool will calculate different numbers from the same source for identical metrics. In other words, Omniture numbers won’t match Google’s. That’s because each tool has its own “secret sauce” for “sessionization” — the fancy term for the way metrics are counted and measured by analytics technology. For example, certain tools may be configured to include or exclude certain filetypes or server responses. Robotic traffic may or may not be filtered.
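To show how one ingredient of the “secret sauce” can change the numbers, here is a minimal sessionization sketch. It assumes a simple inactivity-timeout rule; real tools layer many more rules on top, and the function name and sample timestamps are mine.

```javascript
// Group one visitor's hit timestamps (in milliseconds) into sessions:
// a new session starts whenever the gap since the previous hit exceeds the timeout.
function countSessions(timestamps, timeoutMinutes) {
  const sorted = [...timestamps].sort((a, b) => a - b);
  const timeoutMs = timeoutMinutes * 60 * 1000;
  let sessions = 0;
  let previous = null;
  for (const t of sorted) {
    if (previous === null || t - previous > timeoutMs) sessions += 1;
    previous = t;
  }
  return sessions;
}

const minute = 60 * 1000;
// Invented hits at 0, 5, 30, and 70 minutes after the first request.
const visitorHits = [0, 5 * minute, 30 * minute, 70 * minute];

console.log(countSessions(visitorHits, 30)); // 2
console.log(countSessions(visitorHits, 20)); // 3
```

Same visitor, same hits, different visit counts: change only the timeout and two tools will already disagree.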

It’s worth noting that a company named Quantcast uses panel data and also enables a site to add page tags to collect actual site data, which are then merged together in a completely different type of “hybrid” model.

All these different approaches to data collection lead to opposition when these systems are used for the same purpose. For example, conflict arises between the yin and yang when identifying reach using unique visitor metrics. Audience measurement firms may cry “cookie deletion” when analytics tools are used to count unique visitors, and Web analytics firms may shout back “coverage error” and “selection bias” at the unique visitor numbers from panel-based firms. Another area of opposition is demographics. I’ve been told that only audience measurement firms provide demographic data, and that you can’t get demographic data from Web analytics systems. That’s not true at all.

All enterprise-level Web analytics systems provide demographic location information at the country, city, state, and MSA levels. This information will be different than that provided by audience measurement companies.

Demographics that are harder to elicit from a Web analytics system, but are easily provided by audience measurement, include attributes like a visitor’s age, gender, occupation, income, and education.

But it is possible to integrate very detailed demographic attributes per visitor into a Web analytics system! Once demographic information is captured in a registration database, it can be joined with behavioral data in the Web analytics system and reported on. For a real-world example of analytics/demographic integration, take a look at what Microsoft is doing with Gatineau, the company’s free Web analytics offering currently in beta. Microsoft is joining Web site behavioral data with rich demographic data from MS Live profiles.
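As a rough sketch of that join, assume the registration database and the Web analytics data share a visitor ID. The field names (visitor_id, age_band, page_views) and the sample records are made up for illustration and do not reflect any vendor’s schema.

```python
from collections import defaultdict

# Hypothetical registration records keyed by visitor ID.
registrations = {
    "v1": {"age_band": "25-34", "gender": "F"},
    "v2": {"age_band": "35-44", "gender": "M"},
}

# Hypothetical behavioral records from the Web analytics system.
visits = [
    {"visitor_id": "v1", "page_views": 5},
    {"visitor_id": "v2", "page_views": 2},
    {"visitor_id": "v1", "page_views": 3},
    {"visitor_id": "v3", "page_views": 1},  # never registered
]


def page_views_by_attribute(visits, registrations, attribute):
    """Join each visit to its registration profile and total page views."""
    totals = defaultdict(int)
    for visit in visits:
        profile = registrations.get(visit["visitor_id"])
        key = profile[attribute] if profile else "unknown"
        totals[key] += visit["page_views"]
    return dict(totals)


by_age = page_views_by_attribute(visits, registrations, "age_band")
print(by_age)
```

Visitors who never registered fall into an “unknown” bucket, which is a reminder that this kind of demographic reporting only covers the registered part of the audience.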

Even with differences and oppositions between these online metrics systems, companies find ways to use the data in complementary ways:

  • Audience measurement data is useful for competitive intelligence. All the paid and free services provide data for comparing the performance of a site to other sites, for understanding audience behavior across one or more sites by demographics, and for understanding generalized Internet traffic trends and search terms.
  • Web analytics data is useful for understanding site effectiveness, for defining key performance indicators, for determining conversion rates for marketing campaigns by channel (such as search, email, RSS), for understanding what sites and keywords are driving traffic to your site, and for segmenting and reporting online metrics.

You can even use both data sources as part of the same site optimization activity. For example, you could use audience measurement data to determine that a competitor is gaining ground on a particular product or search term. Then you could look at your Web analytics tool to see how you’re doing for the same term and how visitors who searched for that keyword behave on your site. You may find a high bounce rate and low conversion rate for the keyword, so you segment that data perhaps by demographics! Next you suggest a hypothesis to minimize bounce and maximize conversion for each segment. Then you test your hypothesis, and reexamine the data. Based on the results, you then continuously improve your online performance through controlled experimentation. At the end of the day, you will drive more online revenue by understanding how the yin of audience measurement and the yang of Web analytics complement each other, than by worrying about how they differ and oppose.
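The segmentation step in that scenario can be sketched as follows. This assumes a bounce is a visit with exactly one page view and that each visit record already carries a demographic segment and a conversion flag; the records and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical visits that arrived on one keyword.
KEYWORD_VISITS = [
    {"segment": "18-24", "page_views": 1, "converted": False},
    {"segment": "18-24", "page_views": 1, "converted": False},
    {"segment": "18-24", "page_views": 1, "converted": False},
    {"segment": "18-24", "page_views": 4, "converted": False},
    {"segment": "35-44", "page_views": 1, "converted": False},
    {"segment": "35-44", "page_views": 6, "converted": True},
    {"segment": "35-44", "page_views": 3, "converted": True},
    {"segment": "35-44", "page_views": 2, "converted": False},
]


def rates_by_segment(visits):
    """Return bounce rate and conversion rate for each segment."""
    counts = defaultdict(lambda: {"visits": 0, "bounces": 0, "conversions": 0})
    for visit in visits:
        c = counts[visit["segment"]]
        c["visits"] += 1
        if visit["page_views"] == 1:  # assumed definition of a bounce
            c["bounces"] += 1
        if visit["converted"]:
            c["conversions"] += 1
    return {
        segment: {
            "bounce_rate": c["bounces"] / c["visits"],
            "conversion_rate": c["conversions"] / c["visits"],
        }
        for segment, c in counts.items()
    }


rates = rates_by_segment(KEYWORD_VISITS)
print(rates)
```

In this made-up data the keyword looks weak overall, but the problem is concentrated in one segment, which is the kind of finding that leads to a testable hypothesis.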


Web Analytics and Targeting: A Quick Blogviation

Targeting refers to the process of identifying characteristics of a segment so that relevant content may be matched to it and delivered at a time when the segment is most open to the message. The idea is the right content to the right visitor at the right time (optimally in real time). 

For example, you may visit a site, and see some type of ad unit calling out at you to “meet singles in <insert_your_city>.” When browsing real estate you may see ad units for realtors and mortgage companies.  After entering a keyword such as “car prices” and clicking through the SERP, you may see an ad for a local car dealer.   That’s targeting in a nutshell.  It’s simple: 

  1. Visitor X has these attributes. 
  2. We have content that we think will appeal to Visitor X’s attributes. 
  3. Let’s show that content. 

While targeting has helped to increase ad clickthrough rates, it’s far from an ideal science.  Current methods for targeting have inefficiencies.  What if Visitor X just bought a new car after his recent marriage?  Unless the targeting engine is made aware of the visitor’s current state, the targeting may be off and not yield desired results. 

Even with limitations around “current awareness,” targeting is perceived in the Internet industry as a crucial activity for maximizing the effectiveness of advertising and content.  Targeting is the next stage after A/B and multivariate testing.  Once you determine the preference of segments based on testing, you identify content to target. 

In new media, targeting is something associated with paid search campaigning, ad serving, and content optimization.  It’s not uncommon for targeting activities to be based on:

  • Category and sub-category.  Conceptual constructs like “categories” of topics on a media web site or products on an ecommerce site can be targeted to include certain types of ads or messages.  The notion of a “zone” fits in here as well.  The idea is that if visitors are browsing in your category for “hardwood floors” you could offer them an ad or content specific to “flooring installation services.” 
  • Geography.  Country, region, city, state, DMA are all targetable constructs.  You may choose to target people surfing from 02141 (Cambridge, MA) with an ad for pre-sale Red Sox tix or content about Mike Lowell’s recent contract.
  • Browsing environment such as the connection speed, type of browser, operating system, user software, domain, and ISP.  An ad network serves an ad for Verizon DSL to a modem-based surfer by detecting the visitor’s browsing environment.
  • Time.  The idea of only showing content during specific periods of time is called “parting.”  Common types include day-parting and season-parting.  For example, a B2B site only choosing to show ads for a particular manufacturer’s product during business hours – the site’s busiest time of day – would be an example of day-parting.
  • Keyword.  There are many different types of keyword targeting.  Google does fantastic things with targeting ads based on the keywords in queries.  Content Management Systems can target content based on on-site search keywords or referring keywords.  “Keywords” may be associated as metadata with site sections or pages, similar to a zone or a category targeting on an ad server.  Once a page is associated with “keyword” metadata, you can tell your server to target that keyword (and all pages where it exists as metadata).  If two categories each with different content share a targetable keyword, I can target ads across both categories to pages tagged with that specific keyword.
  • Language.  When a language is set, you can target ads to visitors with that setting. Think Google.  Keep in mind that when you target by language, the creative copy is not translated. 
  • Demographics. If the ad server is aware of a segment’s demographics, such as age, gender, income, title, purchasing power, and so on, an ad can be targeted on that basis.  Sometimes this is called “profile targeting.”
  • Context.  Think of Google AdSense and how it matches ads based on the semantics in site content.  Now you understand content targeting based on context.
  • Profile.  Targeting is possible based on conclusions drawn and rules created from the known attributes (such as purchasing propensity) about an individual or segment.
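Most of the targeting types above boil down to rules that match visitor attributes to content. Here is a minimal sketch of that idea covering geography, day-parting, and keyword targeting. The Visitor and Rule classes, the rule order, and the “house ad” fallback are all hypothetical; a real ad server would expose its own configuration instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Visitor:
    zip_code: str
    hour: int       # local hour of the visit, 0-23
    keyword: str    # referring or on-site search keyword
    category: str   # site category being browsed


@dataclass(frozen=True)
class Rule:
    content: str
    zip_code: Optional[str] = None
    hours: Optional[range] = None   # day-parting window
    keyword: Optional[str] = None
    category: Optional[str] = None

    def matches(self, visitor):
        """A rule matches when every attribute it specifies agrees with the visitor."""
        return (
            (self.zip_code is None or self.zip_code == visitor.zip_code)
            and (self.hours is None or visitor.hour in self.hours)
            and (self.keyword is None or self.keyword in visitor.keyword.lower())
            and (self.category is None or self.category == visitor.category)
        )


def select_content(visitor, rules, default="house ad"):
    """Return the content of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(visitor):
            return rule.content
    return default


RULES = [
    Rule("Red Sox pre-sale tickets", zip_code="02141"),
    Rule("B2B manufacturer ad", hours=range(9, 17), category="b2b"),
    Rule("local car dealer", keyword="car prices"),
]

print(select_content(Visitor("02141", 20, "", "sports"), RULES))
print(select_content(Visitor("10001", 10, "used car prices", "autos"), RULES))
print(select_content(Visitor("10001", 22, "", "b2b"), RULES))
```

The first-match-wins ordering is only one possible design choice; a production system might score or rank competing rules instead.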

Enter one of the holy grails of online advertising and new media: “behavioral targeting” – an advanced form of targeting. Behavioral targeting refers to the process in which content is shown to a visitor based on the web sites they visit (or have visited) and the actions they take on those sites.  

Behavioral targeting involves:

  1. Knowing where a visitor “comes from” and what they’ve done in the past. 
  2. Determining the context of the visitor on the site. 
  3. Detecting the visitor’s current behavior.
  4. Serving relevant content and/or ads matched to the behavior.

By understanding the visitor’s past history, current state, and most recent behavior, the marketer can target content in order to influence some point in the customer buying cycle, often at the stages of awareness and consideration.
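A toy sketch of those four steps follows. The action names, the two buying stages, and the content table are invented for illustration; real behavioral targeting platforms use far richer models than this.

```python
from dataclasses import dataclass, field


@dataclass
class VisitorState:
    referrer_category: str                                # step 1: where the visitor came from
    past_actions: list = field(default_factory=list)      # step 1: what they did in the past
    current_section: str = ""                             # step 2: context on the site
    current_actions: list = field(default_factory=list)   # step 3: behavior in this visit


# Hypothetical signals that suggest the visitor is past the awareness stage.
CONSIDERATION_SIGNALS = {"compared_models", "viewed_pricing"}


def buying_stage(state):
    """Guess the buying-cycle stage from past and current behavior."""
    signals = set(state.past_actions) | set(state.current_actions)
    if CONSIDERATION_SIGNALS & signals:
        return "consideration"
    return "awareness"


# Step 4: content matched to (site context, buying stage).
CONTENT = {
    ("autos", "awareness"): "new model overview video",
    ("autos", "consideration"): "local dealer price quote",
}


def serve(state, default="general promotion"):
    return CONTENT.get((state.current_section, buying_stage(state)), default)


returning = VisitorState("search", ["compared_models"], "autos", [])
first_time = VisitorState("search", [], "autos", ["read_article"])
print(serve(returning))
print(serve(first_time))
```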

So where does web analytics come in?  You would think web analytics data from “web analytics” technology would provide the seed data for enabling “targeting.”  It can, but in most cases targeting is a function provided by the ad server or network or by another technology called the “behavioral targeting platform,” not the analytics tool, so the data does not come directly from the web analytics tool.  I’d love to hear how well (or if at all) Omniture TouchClarity is integrated with Omniture Discover or other offerings. 

In order to make web analytics data useful for targeting (if you can at all), you will need to use your web analytics data to:

  1. Define segments to target (hard to export from web analytics tools)
  2. Feed those segments and associated behavioral data to another tool (achievable if you own your data and run a tool in-house.  Harder and more costly if not).
  3. Report on segment performance after targeting (that requires employing the right people and enabling them with the right tools).
  4. Analyze segment performance after targeting (again employ the right people and enable them with the appropriate tools and resources).
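Steps 1 and 2 can be sketched like this, assuming you can get visitor-level data out of your tool at all. The segment definition, the field names, and the CSV layout are hypothetical; the targeting tool on the receiving end would dictate the real format.

```python
import csv
import io

# Hypothetical visitor-level data exported from a web analytics tool.
VISITORS = [
    {"visitor_id": "v1", "visits": 5, "orders": 0},
    {"visitor_id": "v2", "visits": 4, "orders": 2},
    {"visitor_id": "v3", "visits": 3, "orders": 0},
    {"visitor_id": "v4", "visits": 1, "orders": 0},
]


def define_segment(visitors, predicate, name):
    """Step 1: label every visitor that satisfies the segment definition."""
    return [
        {"visitor_id": v["visitor_id"], "segment": name}
        for v in visitors
        if predicate(v)
    ]


def export_csv(rows):
    """Step 2: serialize the segment so another tool can load it."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["visitor_id", "segment"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


segment = define_segment(
    VISITORS,
    lambda v: v["visits"] >= 3 and v["orders"] == 0,
    "frequent_non_buyers",
)
feed = export_csv(segment)
print(feed)
```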

While I’ve only covered a very little bit about “targeting” and even less about “behavioral targeting” in the context of web analytics, I hope that my simple description of current methods for targeting and some thinking about “what is BT” will help you understand the emerging ecosystem in which analytics tools are interoperating now and will interoperate in the future.


Video Analytics? Thoughts on Web Analytics for Internet Video…

Measuring video content with web analytics isn’t super difficult, but it has its nuances and challenges.  I’ve been thinking a bit about it lately, and have had some good conversations with a few people.  Folks I know are playing around with the likes of Joost, Vuze, Hulu, and TVUNetworks, as well as using BrightCove and Videoegg.  And, man, the popularity of BitTorrent and other swarm structure 4th gen P2P networks is larger than ever.

Simply speaking, video measurement can be divided into the following types:

  • Instream measurement.  Refers to measuring the video itself and the various abstract elements of the video experience, such as duration metrics (average viewing time) and interaction metrics (number of stops, plays, pauses, rewinds, fast forwards, and clicks on video content).
  • Outstream measurement.  Refers to measuring the content environment and user experience surrounding the video, such as the conversion metrics (percentage of visits downloading or viewing a video), behavioral metrics (referrers to the video page, players used), and content metrics (percentage videos per channel, percentage videos viewed by topic, percent videos viewed by file type). 

By categorizing the web video analytics into these two buckets, you are better able to answer meaningfully the following questions, which must be considered prior to any rollout:

  1. What are the business objectives for rolling out video features on the site?
  2. What format are the videos in?
  3. Are the videos downloads or streams?
  4. Am I using a content distribution network or streaming video network?
  5. Does my web analytics tool have the features necessary for video measurement? Or should I look for a third party, niche vendor?
  6. What data collection method should I use?
  7. Do I understand event models?
  8. What KPI’s are relevant and important based on my business goals?

To help you formulate answers to those questions, here’s some thinking:

  • Business objectives.  You, the analyst, must understand why your company is rolling out video.  In other words, what’s the goal and what strategy underpins the goal?  While video is “the rage” right now, simply rolling out video because “everyone is doing it” is no strategy (though doing so may yield a strategy ;).  A goal for video deployment could be “to generate leads,” thus you measure the scenario conversion rate for the funnel resulting in the lead generation and video download (outstream video analysis).  The objective might be “to keep visitors on the site longer,” then you would measure duration and interaction (instream video analysis).  As you all know, I firmly believe that it is the business goal that allows you to contextualize what you’re measuring so that you may build KPI’s.
  • Video format. Lots of different video file types exist: mpegs, qt, mov, swf, flv, avi, wma, ra, wmv, mp4 and more.  You’ll need to identify the video types you want to track so you can configure your web analytics tool to measure them.  Removing or adding filters or changing your tag’s JavaScript might be necessary. 
  • Download or streams.  Videos can be downloaded (by right clicking) or spawned in a media player.  They can also exist embedded on the page or in another object for on-page streaming.  Thus, the way you instrument your pages will differ based on the way you present the video content. For example, if you are streaming videos, you may want to use JavaScript (or a vendor provided scripting language) to instrument your pages to track the video.  If you are just hosting downloads, you may simply want to run your logs to detect the number of times videos were downloaded.
  • Content distribution network or video network. If your video content is distributed by a CDN or a video network, you will have to apply page tags on all the pages rendered by combining your server’s content with the content served by the CDN. Some video networks provide basic reporting that you can extend with a client-side page tagging solution.  Alternatively, you can process the logs provided by a CDN. The challenge with CDN log file processing is that you will most likely not be able to merge the data with your log files for the same site, resulting in two “profiles” of analytics data related to one site: one profile with the site analytics data and one with the CDN analytics data.
  • Data collection method.  If you’ve read this far in my blogivation, you probably picked up that the data collection method you have at your disposal will constrain or enable the way you measure video.  Page tags will enable you to instrument your pages with onclick functions that pass values to the JavaScript and in turn to the analytics server.  Packet sniffers and log files enable you to measure downloads without modifying code.   If you need to modify your web analytics tool or tag configuration to track video filetypes, you can reprocess logs to access the data.  With tags, any data related to downloads or interactions with the video object prior to the config change will be lost.
  • Web analytics tool features. Many web analytics tools will allow you to track a video play or download in your page view reports, but only two tools support true event models: Unica NetInsight and Google Analytics.  At Emetrics San Fran in May 2007, Ian Houston and I gave a preso on “from page views to events.”  It looks like the vendors agreed, ay? ;)
  • Third party tools.  With the convergence of internet and television, we’re not many years away from having a single-screen for viewing the internet, tv, and movies.  Many of us already connect our TV’s to our computers (Windows Media Server), use Slingbox, have had Tivo for years, use BitTorrent and perhaps even consume content from the sites I listed at the beginning of this post.  Companies like Visible Measures, Zango, VidMetrix, and Maven Networks already provide some flavor of a video measurement solution too.
  • Event models provide the conceptual and logical framework for measuring interactions that are subordinate to, equal to, or a replacement for the page view.  Without getting into much detail, “events” are interactions such as the play, stop, pause in a video stream, or the pan, zoom events in an online mapping experience.  In order to articulate the instream video experience, you should understand what an event model is and how it applies in Web Analytics 2.0.
  • KPI’s.  Based on business goals resulting from site strategy, you can build KPI’s related to instream and outstream video measurement.  For example:

Instream:

  • Percentage high duration streams
  • Percentage medium duration streams
  • Percentage low duration streams
  • Average viewing time per stream/overall across all streams
  • Percentage visits that complete stream
  • Percentage visits that stop stream within 10 seconds
  • Percentage visits when this stream was the last video viewed
  • Percentage visits when this stream was the first video viewed
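Several of these instream KPI’s fall out of a simple event stream. The sketch below assumes one stream per visit, no seeking (so the furthest playhead position equals seconds viewed), a single 120-second video, and duration buckets at 25% and 75% of the video length. All of those, plus the event names, are assumptions for illustration.

```python
from collections import defaultdict

VIDEO_LENGTH = 120  # seconds, hypothetical

# (stream_id, event, playhead_seconds)
EVENTS = [
    ("s1", "play", 0), ("s1", "pause", 40), ("s1", "play", 40), ("s1", "complete", 120),
    ("s2", "play", 0), ("s2", "stop", 6),
    ("s3", "play", 0), ("s3", "stop", 60),
    ("s4", "play", 0), ("s4", "stop", 30),
]


def summarize(events):
    """Return seconds viewed per stream and the set of completed streams."""
    viewed = {}
    completed = set()
    for stream_id, event, position in events:
        viewed[stream_id] = max(position, viewed.get(stream_id, 0))
        if event == "complete":
            completed.add(stream_id)
    return viewed, completed


def bucket(seconds, length):
    """Assumed thresholds: high >= 75% watched, medium >= 25%, else low."""
    share = seconds / length
    if share >= 0.75:
        return "high"
    if share >= 0.25:
        return "medium"
    return "low"


def instream_kpis(events, length):
    viewed, completed = summarize(events)
    n = len(viewed)
    buckets = defaultdict(int)
    for seconds in viewed.values():
        buckets[bucket(seconds, length)] += 1
    return {
        "pct_high_duration": 100 * buckets["high"] / n,
        "pct_medium_duration": 100 * buckets["medium"] / n,
        "pct_low_duration": 100 * buckets["low"] / n,
        "avg_viewing_seconds": sum(viewed.values()) / n,
        "pct_completed": 100 * len(completed) / n,
        "pct_stopped_within_10s": 100 * sum(1 for s in viewed.values() if s <= 10) / n,
    }


kpis = instream_kpis(EVENTS, VIDEO_LENGTH)
print(kpis)
```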

Outstream:

  • Conversion rates by video filetype, video topic, channel, taxonomy node, referrer, geography, keyword, and so on
  • Average streams per visit
  • Percent visits/views from different channels (such as email, organic search, paid search, direct, offline)
  • Average time since last stream/video downloads
  • Average time between stream/video downloads
  • Repeat visit rate for visits involving a stream/video download
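A few of the outstream KPI’s can be sketched the same way from visit-level records. The channel names and the records are made up, and “conversion” here is assumed to mean a visit that viewed at least one stream.

```python
from collections import Counter

# Hypothetical visit records: acquisition channel and streams viewed.
VISITS = [
    {"channel": "email", "streams": 2},
    {"channel": "email", "streams": 1},
    {"channel": "organic search", "streams": 0},
    {"channel": "organic search", "streams": 1},
    {"channel": "paid search", "streams": 0},
    {"channel": "paid search", "streams": 0},
    {"channel": "direct", "streams": 4},
    {"channel": "direct", "streams": 0},
]


def outstream_kpis(visits):
    video_visits = [v for v in visits if v["streams"] > 0]
    by_channel = Counter(v["channel"] for v in video_visits)
    return {
        # Share of all visits that viewed at least one stream.
        "pct_visits_viewing_video": 100 * len(video_visits) / len(visits),
        # Streams averaged over all visits, not just video visits.
        "avg_streams_per_visit": sum(v["streams"] for v in visits) / len(visits),
        # Where the video visits came from.
        "pct_video_visits_by_channel": {
            channel: 100 * count / len(video_visits)
            for channel, count in by_channel.items()
        },
    }


outstream = outstream_kpis(VISITS)
print(outstream)
```

Whether to average streams over all visits or only over video visits is a judgment call; whichever you pick, state it next to the KPI so readers interpret it the same way.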

The Internet has come a long way since I saw my first streaming video over 9 years ago (VIVO for those old timers out there).  The options for consuming video content over the web are growing every day (and not at all limited to YouTube, ay?).  I firmly believe video on the Internet is still in its infancy, and video measurement technologies both inside and outside of “web analytics” are quite embryonic.  What a huge space for growth! 

As internet-originated video becomes even more pervasive for home entertainment and for business communication, companies will need to employ analysts who know how to create frameworks for measuring video content.  Do you? 
