Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Director at a large multichannel media company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.


Archive for 'Web 2.0'


AVG LinkScanner Bot Executes JavaScript?!?

The  well-researched answer is “no.”  The AVG LinkScanner Bot appears to prefetch the js and the gif (and pretty much everything else on the page), which for certain tools and their tag configurations generates false page views and visits (and the derivatives thereof), just like it’s “legitimate” traffic. 

If your tag configuration is set up with noscript tags, AVG will fetch the content in the tags, including the gif, which means that:

  • The bot may be infesting the data of customers of web analytics vendors who configure page tag-based data collection in this way. 
  • The bot may be inflating the data in such products/services offered by various web analytics companies.
  • Customers may be paying for server calls generated by this bot.

Vendors, of course, could easily filter the user agent to protect their customers:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813) 
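As a rough illustration of that filter, the sketch below drops hits whose user agent exactly matches the string quoted above. The function names and the hit shape (`{ url, userAgent }`) are assumptions made up for this example, not any vendor's API, and exact matching is a simplification: a real filter would have to keep pace with whatever user agents the scanner actually sends.

```javascript
// User agent string quoted in this post for the AVG LinkScanner prefetcher.
const AVG_LINKSCANNER_UA =
  "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)";

// Returns true when a hit's user agent matches the LinkScanner string exactly.
function isLinkScannerHit(hit) {
  return typeof hit.userAgent === "string" &&
    hit.userAgent.trim() === AVG_LINKSCANNER_UA;
}

// Hypothetical helper: keep only the hits that did not come from the scanner.
function filterLinkScannerHits(hits) {
  return hits.filter((hit) => !isLinkScannerHit(hit));
}

const sampleHits = [
  { url: "/home", userAgent: AVG_LINKSCANNER_UA },
  { url: "/home", userAgent: "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0" },
];
console.log(filterLinkScannerHits(sampleHits).length); // 1
```

The same predicate could be applied to data that has already been collected, provided the tool stores the raw user agent with each hit.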

But I haven’t heard a peep from any SaaS vendors about excluding the user agent, filtering already collected data, or refunding customers the cost of robotically generated server calls (regardless of AVG). Have you?

Think about this: many SaaS page tag vendors don’t provide detailed visitor-level data and user agent reporting.  That means that their customers have no ability to investigate this bot or detect it by filtering their reported data by the true user agent.

I’ve been talking about JS executing bots screwing with web data for about a year now.  SEOMoz and the folks at SlickSurface confirmed it quite recently (quoting me no less in their fantastic analysis).  So they do exist…

Now let me tell you a little story.  Once upon a time I was at a conference called eMetrics when the CEO of a company came up to me and said “hey I read your blog about bot detection, and I looked in my web metrics tool for traffic with high page view to visit ratios.”  Then he narrated a story to me about how he found a bunch of traffic that had page view to visit ratios of 5,000 to 1.  I said “do you use page tags?”  He said “that’s all my vendor provides, so yeah.”  And I said “you’ve found a javascript executing bot in your data.”  “I know” he said.  “Well did you call your vendor and let them know?”  I said.  Now for the punch line:  he told me that the vendor (who shall remain nameless) told him “well, the traffic executed server calls.”  And they wouldn’t give him a refund!
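If you want to run the same check the CEO did, here is a minimal sketch. It assumes you can export rows with a page view count and a visit count per segment (the `segment`, `pageViews`, and `visits` field names are hypothetical), and the threshold of 100 is an arbitrary number picked for illustration, not a recommended cutoff.

```javascript
// Flags rows whose page-view-to-visit ratio exceeds a threshold.
// The default threshold of 100 is an assumption for illustration only.
function findSuspiciousSegments(rows, maxRatio = 100) {
  return rows
    .filter((row) => row.visits > 0) // avoid dividing by zero
    .map((row) => ({ ...row, ratio: row.pageViews / row.visits }))
    .filter((row) => row.ratio > maxRatio);
}

const report = [
  { segment: "network-a", pageViews: 1200, visits: 400 },  // 3 to 1
  { segment: "network-b", pageViews: 50000, visits: 10 },  // 5,000 to 1
];
const flagged = findSuspiciousSegments(report);
console.log(flagged.map((row) => row.segment)); // only network-b is flagged
```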

It’s worth mentioning that this bot definitely affects log file tools and packet sniffer tools.  Both must be configured to filter the AVG LinkScanner user agent.

Now here’s the rub for me.  I use AVG!!!  But I now find it increasingly difficult to support the company or continue using their products.  Why?  Because they are wearing a “bad hat” here:

  • First, they are fully aware of the effect of this bot on web analytics systems. They just don’t seem to care (yet).  UPDATE:  They have set up a Google Group to discuss this issue.  They must understand how companies of all types in all sectors use web analytics data to optimize their sites, set their marketing budgets, determine expected server load, and much more.  What do their Internet Marketers think? 
  • Second, the Link Scanner tool may have a short shelf life and may offer limited protection.  Malware creators will easily adjust. Check out what my friend Steve McInerney, a very smart security expert, said on the Web Analytics Association’s Yahoo Forum:
What strikes me about this particular solution by AVG is how
incredibly … stupid it is on several fronts.
1. Noticeably impacting a users bandwidth is, technically, a security
breach in the first place, aka Denial of Service Attack.
2. Some of us live in countries that have rather severe bandwidth
charges/limits and the like, whom shall I send my excess bandwidth
bill to?
…(this) method is fundamentally
flawed. ie malware ignores any first request and only infects on a
second request - alternate cloaking. Whatever. This type of “solution”
only provides weak protection for a strictly limited period of time.
…not just “no security” but bad
security. Because folk feel they are being protected when they are
not, and hence will take greater risks and hence inflict greater harm
on themselves. :-( 
Ignoring the balance of positive to harm that this problem inflicts on
the users who use this product.
  • Third, AVG just doesn’t seem to “get it” yet.  They are potentially messing with the ability to drive commerce via data driven decision making, e-commerce analytics, site optimization, and online media measurement!  To quote The Register: “chief of research Roger Thompson - who designed the AVG LinkScanner - indicated he may do away with that unique user agent. His chief concern is security, and he doesn’t want webmasters or malware writers gaming his scanner. ‘In order to detect the really tricky - and by association, the most important - malicious content, we need to look just like a browser driven by a human being,’ he argues.”

WebMasterWorld has some good stuff to say here.  Read the Register’s first article here.  And check out the blog of the dude who broke the news first and responses from AVG here and here.

Interesting stuff. So what do you all think? Have you seen evidence of this bot in user agent data from your page tag solutions that use the noscript tag for the image? 

Sunday Night Thinking on Mobile Analytics…

Mobile analytics for Internet-enabled wireless devices is a fairly hot topic for companies seeking to acquire customers, extend their brand, or expose content in “innovative” ways.  Obviously, the iPhone and Blackberry are pushing development in this area forward, but there really aren’t a lot of players in this space. 

Nedstat, CoreMetrics, and Omniture offer capabilities mixed into their current offerings.  Nedstat even carves out some mobile specific reporting.  You can gain some insight into mobile activity from companies that enable log file processing, like Unica and WebTrends, but be prepared to configure a bunch of filters to isolate the data.

Lesser known companies pushing mobile offerings include: Amethon, Mobilytics, Bango, TigTags, Xiti, and AdMob.  Some of these mobile players are even offering capabilities where they cross-sell analytics as an integrated part of their ad networks, content delivery  and transactional processing systems, marketing and barcoding services, and even as infrastructure or network appliances.

On the audience measurement side, we’ve seen comScore acquire M:Metrics, which was no surprise to me.

On the multivariate testing side, we see my friends at SiteSpect offering mobile MVT testing capabilities. 

And I’ll bet we see Google get into this space within the next 6 months…  I’d even wager an announcement at eMetrics DC…

From what I can gather, when we’re talking about “mobile analytics” we’re talking about “mobile browser” activity across a variety of handsets, not everything that happens on the device. 

Measurement issues in this area include:

  • Data Collection.  As many of you know, not all mobile browsers will execute javascript.  They cached the images.  Thus, vendors offer us choices.  Folks like Mobilytics and Bango use an image-based data collection method, while Amethon offers a packet sniffer (they call it wireline detection), and we even have Omniture and Coremetrics talking about “no tag” implementations - what my good friend Phil Kemelor mentioned on his CMS Watch blog (“To compensate, you need to stuff the image tag with query strings that will collect the data you require for reporting.”)  Then we have Unica and WebTrends with log files.  Interestingly, packet sniffing has some advantages here because some devices pass unique id’s (such as the phone number) in the HTTP header or other unique id’s.
  • Unique visitor identification due to lack of cookie support and IP addresses changing.  IP addresses change, I’m told, as they switch from tower to tower.   In addition many mobile devices will take the IP address of the gateway, making all the devices look like the same “person.”  I’ve certainly seen evidence of the host changing pretty quickly during a mobile session. Compounding the difficulty in assessing “uniqueness” is that not all mobile devices support cookies.  In web analytics, cookies are used to define uniqueness.  The fallback method when you can’t use a cookie is IP address/user agent.  If you can’t set cookies and the IP address and user agents are the same, how do you identify uniqueness?   However, when you can detect a unique value in the header, you can easily detect uniqueness.
  • Handset capability detection.  Does the device support WAP pushing, streaming video, ringtones, downloading video clips, and so on?
  • Phone and Manufacturer identification.  Databases from WURFL and DeviceAtlas can be used to identify phone and manufacturer device attributes.  Larger vendors are further behind on integrating this data into their current offerings, whereas the smaller niche players are making use of it. 
  • Screen resolution detection.  The Mobile Marketing Association’s (MMA) standards for the four “standard” screen sizes may carry enough weight to push this disdained piece of metrics trivia available from javascript based tagging in web analytics into a brighter spotlight.
  • Traffic source detection.  Capabilities for traffic sources seem rudimentary.  I don’t just want to know about search and direct entry.  But I want detection of sources from my marketing and advertising campaigns, rss feeds, and email newsletters, if mobile visitors are coming in from those channels.   Interestingly, Bango solves the campaign tracking issue by pushing you to a Bango-specific URL.
  • Geographic identification.  Where are the visitors viewing your site coming from?  And what does the mobile audience environment “look like” in each country?  From this information you can extrapolate country-specifics for site optimization.  But not all devices enable geographic detection because the gateway’s IP address is used or the IP address from the network is used, not a GPS signal.
  • No standards.  There are few, if any, commonly supported mobile standards and no web data standards, so the problem is no standards for the devices and no standards for the tools.  There are no standards.  Did I mention that there are no standards?
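To make the unique visitor identification problem in the list above a bit more concrete, here is a minimal sketch of the fallback order described there: a cookie if one exists, then a unique value from the HTTP header, and only then IP address plus user agent. The field names (`cookieId`, `headerId`, `ip`, `userAgent`) are hypothetical and stand in for whatever your collection method actually captures.

```javascript
// Picks the best available identifier for a hit, in order of preference:
// 1. a cookie value, 2. a unique ID found in the HTTP headers,
// 3. the IP address plus user agent fallback.
// All field names here are assumptions for illustration.
function visitorKey(hit) {
  if (hit.cookieId) return "cookie:" + hit.cookieId;
  if (hit.headerId) return "header:" + hit.headerId;
  return "ipua:" + hit.ip + "|" + hit.userAgent;
}

function countUniqueVisitors(hits) {
  return new Set(hits.map(visitorKey)).size;
}

// Four hits arriving through the same gateway with the same browser.
const gatewayHits = [
  { ip: "10.0.0.1", userAgent: "PhoneBrowser/1.0", headerId: "subscriber-1" },
  { ip: "10.0.0.1", userAgent: "PhoneBrowser/1.0", headerId: "subscriber-2" },
  { ip: "10.0.0.1", userAgent: "PhoneBrowser/1.0" },
  { ip: "10.0.0.1", userAgent: "PhoneBrowser/1.0" },
];
console.log(countUniqueVisitors(gatewayHits)); // 3
```

The last two hits collapse into one "person" because nothing distinguishes them, which is exactly the gateway problem: they may or may not be the same device.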

So I was thinking, what would I want to see in a mobile analytics solution?  Allow me to riff here.

  • Dashboards for KPI and specific-metric reporting.  Views, visits, visitors, referrers, popular pages, traffic sources, resolutions, geography, time-based reporting and custom defined KPI’s….
  • Support for multiple data collection methods.  Logs, no-js image tags, and packet sniffers.  Let me pick what I need for whatever application fits my goals.
  • Support for mobile-specific constructs not present in historic web analytics data.  Manufacturers, operators, handsets, and device capabilities.
  • Advertising-based reports.  CTR, CPM, eCPM, that stuff…
  • Tracking for mobile downloads, installed applications, SMS, and MMS.  Seems like a no-brainer.
  • API’s.  Closed systems are dead ends for integrated marketing, so give me an API or enable pre-built integrations with other systems, like CRM.
  • Segmentation.  By country, by device, by network, by manufacturer, and so on.  It’s necessary.
  • Repeat or return visitor identification.  Simple measures of recency and frequency, core to media buying and planning and to site optimization, should be a data point available in mobile analytics.
  • Conversion and goal metrics.  Do visitors on mobile devices convert better, worse, the same?  Do they reach site goals?  Without tying performance data  and outcomes to mobile visitor activity, I’m left wondering…
  • Value scoring for engagement or proxy scoring for revenue and ROI analysis.  I want to be able to score attributes or actions to approximate an engagement score or to identify value or indicate revenue. 
  • Non-human traffic and web-browser based detection and reporting.  Mobile pages are full of links.  The ads are links.  Mobile vendors must support detecting, filtering, and reporting, non human and web-based agents from pure mobile agents - otherwise the mobile data gets muddled and skewed.
  • Data Export.  Must be able to export reports to Excel or Word, and email them.
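For the advertising-based reports on the wish list above, the arithmetic is simple enough to sketch. The function names are my own; the definitions are the usual ones (CTR as clicks over impressions, CPM as cost per thousand impressions, eCPM as revenue earned per thousand impressions).

```javascript
// Click-through rate: clicks as a fraction of impressions.
function ctr(clicks, impressions) {
  return impressions > 0 ? clicks / impressions : 0;
}

// CPM: what an advertiser pays per thousand impressions.
function cpm(cost, impressions) {
  return impressions > 0 ? (cost * 1000) / impressions : 0;
}

// eCPM: revenue earned per thousand impressions, whatever the pricing model.
function ecpm(revenue, impressions) {
  return impressions > 0 ? (revenue * 1000) / impressions : 0;
}

console.log(ctr(50, 10000));  // 0.005, i.e. 0.5%
console.log(cpm(20, 10000));  // 2
console.log(ecpm(35, 10000)); // 3.5
```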

So there’s a quick blogviation on Mobile.  Am I right, wrong, what did I miss?  Let me know…

Some More Thinking about Key Performance Indicators for Web Analytics

Web Analytics Key Performance Indicators (KPI’s) are critical for breaking through the dataglut spewing forth from your web analytics tool.   I mean there’s just a ton of data in web analytics, and the majority of it tends not to be very useful or applicable for improving your business performance.  While it’s wonderful to have a tool that lets you cut, cross, and slice loads of data every which way but loose, it can be a real challenge to frame the data or put it in context in a way that helps your business optimize the web site.   That’s why I like KPI’s - they identify meaningful, business-focused relationships in your analytics data.  By understanding KPI drivers, setting expectations for KPI performance, and analyzing your KPI’s toward defined goals for those KPI’s, you increase understanding of data, alleviate data confusion, and provide focus for the usage of your web analytics tool.

For those of you who don’t have a KPI strategy or who are just getting into analytics, an easy way to understand a KPI is to consider the example of when you are driving somewhere and trying to get there within a certain period of time.  If your goal is to drive 60 miles (kilometers, my European friends) in exactly 60 minutes, you know that you need to drive 60 miles per hour (or KPH).  If you go faster, you will arrive early; if you go slower, you won’t meet your goal and will arrive past your deadline.   So as you travel along the road, you measure the KPI of your speed. That’s what is important to measure on your trip.  Of course you may measure other KPI’s like the amount of fuel left or the miles you’ve traveled… those certainly may be KPI’s you measure.  But you definitely don’t need to measure your compression ratio or oil pressure even though it’s available data from your car.  In the same way, when you are looking at web analytics data, you don’t want to track everything, only those things that are important to your business performance toward goals. 
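The driving example boils down to a target and a comparison, which is all a KPI report really does. Here is a tiny sketch of that idea; the function names and status labels are made up for illustration.

```javascript
// Speed (distance units per hour) needed to cover the remaining distance
// in the remaining number of minutes.
function requiredSpeed(remainingDistance, remainingMinutes) {
  return (remainingDistance * 60) / remainingMinutes;
}

// Compares the measured KPI against its target.
function kpiStatus(actual, target) {
  if (actual > target) return "ahead of goal";
  if (actual < target) return "behind goal";
  return "on goal";
}

const target = requiredSpeed(60, 60);
console.log(target);                // 60
console.log(kpiStatus(55, target)); // behind goal
console.log(kpiStatus(60, target)); // on goal
```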

Several activities can assist the creation of KPI’s.  Here are a few of them:

  • Determine the Business Strategy.  Why is the company funding and developing an online mission?  What is the strategy?  KPI’s can help you figure out if it’s working.  To find the KPI’s that will help, the web analyst should be asking the question how can web analytics be used to formulate, implement and evaluate cross-functional decisions that will enable an organization to achieve objectives? How will web analytics be used in the process of specifying the organization’s objectives, developing policies and plans to achieve these objectives, and allocating resources to implement the policies and plans to achieve the organization’s objectives?
  • Define the Site’s Goals and why the Site Exists.  I covered this in a post a few months ago.  An understanding of why your site exists enables you to effectively use online metrics.  You need to define the purpose of your site in order to create effective KPI’s.  Once you’ve defined your site’s purpose, you are positioned to examine Web data in a way that helps you determine whether your site delivers on its purpose: does it exist effectively?   Create your KPI’s, identify goals for your KPI’s, and track your performance against those goals.
  • Recognize Value Drivers.  How does the business make money on the site? Monetization, in cases where profitability is important, influences what you should be measuring.  If you run a media site, you probably make money from content consumption (the recency and frequency of content consumption), conversation (social media, such as contributions or comments), and conversion (the rate at which people complete certain value driving actions, like signing up for newsletters, rss feed, webcasts, print subscriptions, or downloading certain content types, like white papers).  So you create goals for and measure KPI performance around those value drivers.
  • Map Organizational Roles.  Classify your organization into audiences for your KPI’s based on what they do on your web site.  You may create KPI’s around function or action of the actors who receive your KPI reports.  Function defines the group that KPI’s are focused for, such as product development or editorial.  Action defines what those people do on the site to make it successful.  By understanding function and action of key actors on your sites, you gain insight into the type of data needed in KPI’s and the number of different KPI reports you may need to roll out.
  • Understand the Customer.  While KPI’s purely focused on internal function and actions are important, they also need to be customer focused.   If you think measuring conversion is important, while your customers tend to come to your site for informational or non-transactional purposes and then go elsewhere to convert, you may be disconnected from the reality of why your site exists.   Learn customer goals from VOC (voice of customer) data and by examining historic behavioral data of key segments.  Make sure you don’t create KPI’s that are vain or inane.  Instead create KPI’s that help you guide action internally so that your business meets the needs of your customers.

Framing your KPI development around the five bullet points I listed above will help you create KPI’s that assist your team in guiding business performance toward goals - while not forgetting to consider some of the core elements of online business: business strategy, site performance goals, value drivers, the human organization, and the customer. 

Now segment, segment, segment your KPI’s!

Tracking Rich Internet Applications with Google Analytics

About a year ago, I wrote a guest blog post over on Robbin Steif’s blog about using Google Analytics for tracking Javascript and Flash events.  This weekend Jeremy Geelan, SVP over at Sys-Con Media, asked if he could republish the work.  Of course I said “yes.”  Then I noticed that a lot has happened to GA in a year (and more to come, ahem, API’s!).  What I had written was now incomplete, so what you’ll find below is my attempt to sum up “event tracking” using ga.js and the Great Google’s Event Tracking Data Model.  Let me know how I did covering it, and if you think I should clarify or expand on anything.

Since we all know about page tags, let’s get down to business with “the Google” and how it tracks “the Rich Media.”  Google Analytics currently has two different javascript page tags:

  • urchin.js.  The legacy version of the Google Analytics page tag.
  • ga.js.  The current, rebranded version of the Google Analytics page tag.

How you track rich media depends on which page tag you are using.  I’ll discuss using urchin.js first, then ga.js.  I’ll also provide some information about Google’s Event Tracking function for capturing specific “events” within their event architecture.

Tracking Rich Media using Urchin.js

In the legacy version of Google Analytics, the smarties at Google created a little JavaScript function called urchinTracker() that enables event tracking.  Use the JavaScript function with an argument specifying a name for the event. For example, the function:

javascript:urchinTracker('/mysite/flashrichmedia/playbutton');

logs each occurrence of that Flash event as a page view of:

/mysite/flashrichmedia/playbutton

Some caveats:

  1. Always use a forward slash to begin the argument.
  2. Actual pages with these filenames do not need to exist.
  3. You can organize your events into any structure or hierarchy you want.
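Since the leading slash is easy to forget, here is a small sketch of a helper that builds the event name from its parts. Both function names are hypothetical; `track` is whatever function records the page view (on a page that has loaded urchin.js you would pass `urchinTracker`), and the stand-in below just collects the paths so the example runs on its own.

```javascript
// Builds a virtual page path from segments, guaranteeing exactly one
// leading slash and no empty or doubled slashes between segments.
function virtualPath(...segments) {
  const cleaned = segments
    .map((segment) => String(segment).replace(/^\/+|\/+$/g, ""))
    .filter((segment) => segment.length > 0);
  return "/" + cleaned.join("/");
}

// Hypothetical wrapper: builds the path and hands it to the tracking function.
function trackVirtualPageview(track, ...segments) {
  const path = virtualPath(...segments);
  track(path);
  return path;
}

// Stand-in for urchinTracker so the sketch runs outside a browser.
const recorded = [];
trackVirtualPageview((path) => recorded.push(path), "mysite", "flashrichmedia", "playbutton");
console.log(recorded[0]); // /mysite/flashrichmedia/playbutton
```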

Important: Google says to place your tracking code “between the opening <body> tag and the JavaScript call” if your pages include a call to urchinTracker(), utmLinker(), utmSetTrans(), or utmLinkPost(). For example, if the page view is the major event and the “play” event is a minor event, then your hierarchy would be Page View > Event, where the page contains an event, such that:

/mysite/ria_buttons/playbutton
/mysite/ria_buttons/pausebutton
/mysite/ria_buttons/playbutton
/mysite/ria_clips/clip

Some examples of the code (from Google Help):

on (release) {
// Track with no action
getURL("javascript:urchinTracker('/folder/file');");
}

This one above tracks when you click and release (although technically, it just notices the release) of a flash button (and records the file you specify as a page view).

on (release) {
//Track with action
getURL("javascript:urchinTracker('/folder/file');");
_root.gotoAndPlay(3);
myVar = "Flash Track Test";
}

The second one is the same, but by using a function, passing it a parameter, and identifying the instance you want to track, you can measure when your file was used in a specific scene in a little flash movie. So it is a more specific method for handling event tracking in Flash.

onClipEvent (enterFrame) {
getURL("javascript:urchinTracker('/folder/file');");
}

And the third one repeats the action throughout the movie so that each time the file is loaded, it gets tracked as an event. If you were to pass a unique file at the end of the movie, you could recognize it using this method (or the other methods) to know that the whole movie was watched (as long as your session doesn’t time out). Next, wait until Google updates your analytics, then check the Top Content report to see if it all worked. Now let’s discuss how to do the exact same thing using the new _trackPageview function released with ga.js.

Tracking Rich Media using ga.js

In the current version of Google Analytics, the brainiacs at Google created a little JavaScript function called _trackPageview() that enables event tracking.  Use the JavaScript function with an argument specifying a name for the event.  For example, the function:

javascript:pageTracker._trackPageview("/mysite/flashrichmedia/playbutton");

logs each occurrence of that Flash event as a page view of:

/mysite/flashrichmedia/playbutton

Some caveats:

  1. Always use a forward slash to begin the argument and use quotes around the argument.
  2. Actual pages with these filenames do not need to exist.
  3. You can organize your events into any structure or hierarchy you want.

You must put calls to _gat._getTracker and _initData above the call to _trackPageview.  For example, you would insert the following code:

<script type="text/javascript">
var pageTracker = _gat._getTracker("UA-xxxxxx-x");
pageTracker._initData();
pageTracker._trackPageview();
</script>

Here are some examples of the ga.js code (from Google Help) that replicate what I described above using the most recent code:

on (release) {
// Track with no action
getURL("javascript:pageTracker._trackPageview('/folder/file.html');");
}

This one above tracks when you click and release (although technically, it just notices the release) of a flash button (and records the file you specify as a page view).

on (release) {
//Track with action
getURL("javascript:pageTracker._trackPageview('/folder/file.html');");
_root.gotoAndPlay(3);
myVar = "Flash Track Test";
}

The second one is the same, but by using a function, passing it a parameter, and identifying the instance you want to track, you can measure when your file was used in a specific scene in a little flash movie. So it is a more specific method for handling event tracking in Flash.

onClipEvent (enterFrame) {
getURL("javascript:pageTracker._trackPageview('/folder/file.html');");
}

And the third one repeats the action throughout the movie so that each time the file is loaded, it gets tracked as an event. If you were to pass a unique file at the end of the movie, you could recognize it using this method (or the other methods) to know that the whole movie was watched (as long as your session doesn’t time out).
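Once a unique file is logged at the end of the movie, working out how many viewers watched the whole thing is a matter of comparing visits that contain the start path with visits that also contain the end path. Here is a minimal sketch, assuming you can export the page paths per visit; the two path names and the data shape are hypothetical.

```javascript
// Hypothetical virtual paths logged when the movie starts and when it ends.
const START_PATH = "/folder/movie_start.html";
const END_PATH = "/folder/movie_end.html";

// sessions: an array of visits, each visit being an array of page paths.
// Returns the share of visits that started the movie and also reached the end.
function completionRate(sessions) {
  const started = sessions.filter((paths) => paths.includes(START_PATH));
  if (started.length === 0) return 0;
  const finished = started.filter((paths) => paths.includes(END_PATH));
  return finished.length / started.length;
}

const visits = [
  ["/home", START_PATH, END_PATH],
  ["/home", START_PATH],
  ["/about"],
  [START_PATH, END_PATH],
  [START_PATH],
];
console.log(completionRate(visits)); // 0.5
```

A visit that times out mid-movie would split the start and end into separate visits and count as incomplete here, which is the session caveat mentioned above.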

Tracking Rich Media using Google Analytics Event Tracking

When Google released ga.js in fourth quarter 2007, Google also released a data model for tracking events.  It provides more flexibility and ease of customization than the methods I described above.   The data model makes use of:

  • Objects. These are named instances of the eventTracker class and appear within the reporting interface.

var videoTracker = pageTracker._createEventTracker("Movies");

  • Actions. A string you pass to an event tracker class instance as a parameter.

videoTracker._trackEvent("Stop");

  • Labels. An optional parameter you can supply for a named object.

downloadTracker._trackEvent("Movies", "/mymovies/movie1.mpg");

  • Values. A numerical value assigned to a tracked object.

To set up event tracking you should:

1. Identify the events you want to track.
2. Create an event tracker instance for each set of events.
3. Call the _trackEvent() method on your page.
4. Enable “event tracking” in your profile.

To instantiate an event tracker object, you might do something like this:

var myEventObject = pageTracker._createEventTracker("Object Name");
myEventObject._trackEvent("Required Action Name", "Optional Label", optionalValue);

_createEventTracker() is order dependent and must be called after the main tracking code (ga.js) has been loaded.  Next you would call the _trackEvent() method in your source code either on every page that contains the event or as part of the tracking code for every page:

_trackEvent(action, optional_label, optional_value)
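Because the action is required while the label and value are optional, it can help to check the arguments before they ever reach the tracker. The wrapper below is a sketch of that idea only: `safeTrackEvent` is a name I made up, and `stubTracker` is a stand-in object so the example runs without ga.js. On a real page you would pass the object returned by `_createEventTracker()`.

```javascript
// Validates arguments against the data model described above (required
// action string, optional label string, optional numeric value) before
// forwarding them to any object that exposes a _trackEvent method.
function safeTrackEvent(tracker, action, label, value) {
  if (typeof action !== "string" || action.length === 0) {
    throw new Error("action is required and must be a non-empty string");
  }
  if (label !== undefined && typeof label !== "string") {
    throw new Error("label, when given, must be a string");
  }
  if (value !== undefined && (typeof value !== "number" || !Number.isFinite(value))) {
    throw new Error("value, when given, must be a finite number");
  }
  tracker._trackEvent(action, label, value);
}

// Stand-in for a real event tracker object; it only records the calls.
const stubTracker = {
  calls: [],
  _trackEvent(action, label, value) {
    this.calls.push({ action, label, value });
  },
};

safeTrackEvent(stubTracker, "Play", "MyVideo", 30);
console.log(stubTracker.calls.length); // 1
```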

If you wanted to track interaction with the Flash UI, such as the button on a Flash Video Player, you would create a videoTracker object with name “Video”:

var videoTracker = pageTracker._createEventTracker('Video');

Then, in your Flash code for the video player, you would call the videoTracker object and pass a value for the action and label for the event:

onRelease (button) { 
   ExternalInterface ("javascript:videoTracker._trackEvent('Play', 'MyVideo');")
}

You could also use the ExternalInterface ActionScript function as an eval() function to parse FlashVars and attach them to every Flash UI element that needs a tracking action.  For example, the code below associates a Stop action for the Video object and retrieves the provided label and value from the FlashVars:

onRelease (button) { 
   ExternalInterface ("javascript:videoTracker._trackEvent('Stop', '" + label + "', " + value + ");")
}
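Gluing strings together by hand like this is where quoting mistakes creep in, so here is a sketch of a helper that assembles the call string and escapes the label for you. It is written in plain JavaScript for illustration (the same logic would port to ActionScript), and both function names are my own, not part of Google's code.

```javascript
// Escapes backslashes and single quotes so a value can sit safely inside a
// single-quoted JavaScript string literal, then wraps it in quotes.
function quoteJs(text) {
  return "'" + String(text).replace(/\\/g, "\\\\").replace(/'/g, "\\'") + "'";
}

// Builds the javascript: call string for an event tracker variable.
// The value is only included when a label is also supplied, because the
// arguments are positional.
function buildTrackEventCall(trackerName, action, label, value) {
  const args = [quoteJs(action)];
  if (label !== undefined) {
    args.push(quoteJs(label));
    if (value !== undefined) args.push(String(Number(value)));
  }
  return "javascript:" + trackerName + "._trackEvent(" + args.join(", ") + ");";
}

console.log(buildTrackEventCall("videoTracker", "Stop", "MyVideo", 42));
// javascript:videoTracker._trackEvent('Stop', 'MyVideo', 42);
```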

Adding event tracking code would generate event reports in the Content section of the Google Analytics Interface.  Pretty cool stuff, Google!


The Yin and Yang of Online Metrics: Audience Measurement and Web Analytics

I write a monthly column for Mediapost’s Metrics Insider.  This month I wanted to talk about the different schools of thought in online metrics because at the end of the day we are all in Internet measurement together. Hope you enjoy the read:

Audience measurement and Web analytics systems are like the yin and yang of online metrics. Yin and yang are different, opposing forces, but they also complement each other. Think of Web analytics and audience measurement data in the same way: different, sometimes in opposition, but complementary.

The major difference between these systems is data collection:

  • Audience measurement companies don’t collect data directly from the sites being measured. They all rely on proprietary methods. Hitwise gets data from ISPs. Compete uses a toolbar that you can download as well as ISP and panel information. Nielsen and comScore use data collected from panels to create online metrics that they believe accurately represent overall Internet usage. Due to all these different data collection methods and no shared standards across companies, metrics from audience measurement firms are never identical with each other.
  • In Web analytics, data is collected directly from actual site activity. Methods include client-side data collection via javascript page tagging, server-side data collection via log file processing, or network data collection via packet sniffing. Sometimes methods such as page tagging and log file processing are combined in what’s called “hybrid data collection.” Vendors include Coremetrics, Webtrends, Unica, Visual Sciences, Omniture, Google, and others. The challenge with Web analytics tools is that each tool will calculate different numbers from the same source for identical metrics. In other words, Omniture numbers won’t match Google’s. That’s because each tool has its own “secret sauce” for “sessionization” — the fancy term for the way metrics are counted and measured by analytics technology. For example, certain tools may be configured to include or exclude certain filetypes or server responses. Robotic traffic may or may not be filtered.
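To show why two tools can disagree on identical data, here is a minimal sessionization sketch. It groups hits into visits using an inactivity timeout; the 30-minute default is a common convention that I am assuming here, not a statement about how any particular vendor does it, and the hit shape (`visitorId`, `time` in milliseconds) is made up for the example.

```javascript
// Groups hits into visits: a new visit starts whenever the same visitor has
// been inactive for longer than the timeout. The 30-minute default is an
// assumed convention; real tools differ, which is the point.
function sessionize(hits, timeoutMinutes = 30) {
  const timeoutMs = timeoutMinutes * 60 * 1000;
  const sorted = [...hits].sort((a, b) => a.time - b.time);
  const lastSeen = new Map(); // visitorId -> time of that visitor's previous hit
  let visits = 0;
  for (const hit of sorted) {
    const previous = lastSeen.get(hit.visitorId);
    if (previous === undefined || hit.time - previous > timeoutMs) {
      visits += 1;
    }
    lastSeen.set(hit.visitorId, hit.time);
  }
  return { pageViews: sorted.length, visits, visitors: lastSeen.size };
}

const minute = 60 * 1000;
const hits = [
  { visitorId: "a", time: 0 },
  { visitorId: "a", time: 10 * minute },
  { visitorId: "a", time: 50 * minute }, // 40-minute gap since the last hit
  { visitorId: "b", time: 5 * minute },
];
console.log(sessionize(hits, 30).visits); // 3
console.log(sessionize(hits, 60).visits); // 2
```

Same four hits, same two visitors, yet the visit count changes with one configuration setting. Add differences in bot filtering and filetype exclusions and the numbers drift further apart.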

It’s worth noting that a company named Quantcast uses panel data and also enables a site to add page tags to collect actual site data, which are then merged together in a completely different type of “hybrid” model.

All these different approaches to data collection lead to opposition when these systems are used for the same purpose. For example, conflict arises between the yin and yang when identifying reach using unique visitor metrics. Audience measurement firms may cry “cookie deletion” when analytics tools are used to count unique visitors, and Web analytics firms may shout back “coverage error” and “selection bias” at the unique visitor numbers from panel-based firms. Another area of opposition is demographics. I’ve been told that only audience measurement firms provide demographic data, and that you can’t get demographic data from Web analytics systems. That’s not true at all.

All enterprise-level Web analytics systems provide demographic location information at the country, city, state, and MSA levels. This information will be different than that provided by audience measurement companies.

Demographics that are harder to elicit from a Web analytics system, but are easily provided by audience measurement, include attributes like a visitor’s age, gender, occupation, income, and education.

But it is possible to integrate very detailed demographic attributes per visitor into a Web analytics system! Once demographic information is captured in a registration database, it can be joined with behavioral data in the Web analytics system and reported on. For a real-world example of analytics/demographic integration, take a look at what Microsoft is doing with Gatineau, the company’s free Web analytics offering currently in beta. Microsoft is joining Web site behavioral data with rich demographic data from MS Live profiles.
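The join itself is conceptually simple. The sketch below assumes both data sets share a visitor ID (for example, one set at registration and stored in a cookie); the field names and the "unknown" placeholder are hypothetical, and a real integration would live in your data warehouse or the tool's import feature rather than in a script like this.

```javascript
// Joins behavioral rows from the analytics tool to registration profiles
// on a shared visitor ID. Visitors without a profile are marked "unknown".
function joinDemographics(behaviorRows, profiles) {
  const byId = new Map(profiles.map((profile) => [profile.visitorId, profile]));
  return behaviorRows.map((row) => {
    const profile = byId.get(row.visitorId);
    return {
      ...row,
      ageGroup: profile ? profile.ageGroup : "unknown",
      gender: profile ? profile.gender : "unknown",
    };
  });
}

const behavior = [
  { visitorId: "v1", pageViews: 12 },
  { visitorId: "v2", pageViews: 3 },
];
const registrations = [{ visitorId: "v1", ageGroup: "25-34", gender: "F" }];

const joined = joinDemographics(behavior, registrations);
console.log(joined[0].ageGroup); // 25-34
console.log(joined[1].ageGroup); // unknown
```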

Even with differences and oppositions between these online metrics systems, companies find ways to use the data in complementary ways:

  • Audience measurement data is useful for competitive intelligence. All the paid and free services provide data for comparing the performance of a site to other sites, for understanding audience behavior across one or more sites by demographics, and for understanding generalized Internet traffic trends and search terms.
  • Web analytics data is useful for understanding site effectiveness, for defining key performance indicators, for determining conversion rates for marketing campaigns by channel (such as search, email, rss), for understanding what sites and keywords are driving traffic to your site, and for segmenting and reporting online metrics.

You can even use both data sources as part of the same site optimization activity. For example, you could use audience measurement data to determine that a competitor is gaining ground on a particular product or search term. Then you could look at your Web analytics tool to see how you’re doing for the same term and how visitors who searched for that keyword behave on your site. You may find a high bounce rate and low conversion rate for the keyword, so you segment that data perhaps by demographics! Next you suggest a hypothesis to minimize bounce and maximize conversion for each segment. Then you test your hypothesis, and reexamine the data. Based on the results, you then continuously improve your online performance through controlled experimentation. At the end of the day, you will drive more online revenue by understanding how the yin of audience measurement and the yang of Web analytics complement each other, than by worrying about how they differ and oppose.

yin_yang.jpg

A Few Tips on Web Analytics Segmentation

Market segmentation existed long before web analytics.  It’s a method for dividing a population into specific groups (segments) that share one or more characteristics.  The goal of segmentation is to maximize future value of that segment by optimizing your marketing mix.

Segment analysis will tell you different things about your audience than you will realize from studying overall population metrics.  In traditional market research, segments are created from demographics (such as age), psychographics (such as attitude), geography (such as zip code), behavior (such as usage patterns), and value (revenue earned and cost).

Using a web analytics tool to segment your online audience requires a bit of upfront thinking and requirements gathering before getting down to business.  Like all things web analytics, segmentation requires process.  Here are some tips that may help you create a process for web analytics segmentation:

  • Determine your business objectives.  Like everything in web analytics, you can’t optimize what you haven’t defined as a goal.  A business objective driving segmentation might be to “increase conversion rate (over historical numbers)” or “to improve frequency” by offering something valuable to that segment.
  • Define segments. Basic dimensions for segmentation in web analytics include: new visitors, repeat visitors, geography, time, referrer, keyword, and campaign type.  Many more dimensions and attributes can be used for segmentation too.
  • Identify expected segment behavior.  By reconciling goals, historic performance data, and VOC research, you should be able to identify the expected behavior of the segment.  If your business objective is to “increase conversion rate,” your expected segment behavior might be to “complete the form” or “click on a link.”
  • Measure current segment behavior. Sounds easy, right? But it will take system configuration and the right tool.  Pages may need to be (re)instrumented, tracking codes may need to be applied, query string parameters may need to be parsed, and in the worst case the dimensions you want to segment by or the metrics you want to measure against may not be available in your web analytics tool.  For example, how would you use your tool to identify the “conversion rate” for a segment of repeat visitors from newsletter X who came from Tokyo and previously downloaded a whitepaper?
  • Create “optimization hypotheses.”  Once you’ve measured current behavior, create a hypothesis or hypotheses to test in order to optimize the behavior.  You may want to perform controlled experimentation, whether a simple A/B test (i.e., champion/challenger), multivariate test, or experience test.  For example, I may have detected that repeat visitors from Newsletter X responded better to Y offer after being exposed to a certain element than those visitors in the same segment who were not exposed.  That element could have been a content theme, offer, call to action, creative, and so on.  Thus, I might create a hypothesis to test that combines elements of the user experience that I feel are key to persuading the behavior and thus fulfilling the business objective.
  • Optimize content, offerings, user experience, and other site elements.  Based on your hypothesis, make subsequent changes to the elements that you think will drive the desired segment behavior.  For example, you may split traffic to two landing pages each with a completely different offer, creative, and call to action.  Or you may choose to switch out specific elements on one landing page (such as an image or call to action) using multivariate methods just to get Visitor X to “complete that form” or “click that link” to improve your “conversion rate.”
  • Analyze segment behavior against hypothesis.  How did the segment perform against expected behavior and business objectives based on testing your hypotheses?  Tools that provide drill-down/drill-up and cross-dimensional capability allow you to analyze segments and answer such questions. The tools I’m talking about are advanced and powerful, such as Unica NetInsight, Visual Sciences Visual Site, Omniture Discover, and WebTrends Marketing Warehouse.  Capabilities for segmentation analytics vary by tool, so make sure to dig deep into the offerings because not all tools will let you correlate metrics like “conversion rate” with dimensions like “keyword,” let alone build complex multi-dimensional segments.  In fact, some free web analytics tools don’t allow you to segment data at all (just filter it)!
  • Go with what works.  Web analytics segmentation analysis will let you know what appeals to and works for a segment.  Success with web analytics segmentation means that you met your business goals and improved key performance of that segment.  Rinse, lather, and repeat the segmentation analysis and optimization process so your campaign outperforms and margins soar!
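To show what the “measure current segment behavior” tip asks of a tool, here is a small Python sketch of the newsletter X / Tokyo / whitepaper question. It assumes you can get at visit-level records; the field names and data are hypothetical, not any vendor's schema.

```python
# Hypothetical visit-level records; field names are illustrative assumptions.
visits = [
    {"visitor_type": "repeat", "campaign": "newsletter_x", "city": "Tokyo",
     "downloaded_whitepaper": True, "converted": True},
    {"visitor_type": "repeat", "campaign": "newsletter_x", "city": "Tokyo",
     "downloaded_whitepaper": True, "converted": False},
    {"visitor_type": "new", "campaign": "newsletter_x", "city": "Tokyo",
     "downloaded_whitepaper": False, "converted": False},
    {"visitor_type": "repeat", "campaign": "search", "city": "Boston",
     "downloaded_whitepaper": True, "converted": True},
    {"visitor_type": "new", "campaign": "search", "city": "Boston",
     "downloaded_whitepaper": False, "converted": False},
]


def conversion_rate(visits, **criteria):
    """Conversion rate for visits matching every criterion; None if the segment is empty."""
    segment = [v for v in visits
               if all(v.get(key) == value for key, value in criteria.items())]
    if not segment:
        return None
    return sum(1 for v in segment if v["converted"]) / len(segment)


overall = conversion_rate(visits)
segment = conversion_rate(
    visits,
    visitor_type="repeat",
    campaign="newsletter_x",
    city="Tokyo",
    downloaded_whitepaper=True,
)
print(overall, segment)
```

The filter itself is trivial. The hard part, as the tip says, is getting every one of those dimensions captured on the same record in the first place.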

As a result of well-executed web analytics segment analysis and hypothesis testing you can realize new value in your customers and new opportunities in your campaigns. 
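For the hypothesis testing step, one common way to check whether a challenger really beat the champion is a two-proportion z-test. The sketch below is a generic statistics example, not how any particular tool does it, and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rate, B minus A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both pages convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test: champion converts 100 of 1,000 visitors, challenger 130 of 1,000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be noise, but it says nothing about whether the lift is big enough to matter to the business objective you defined up front.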

gatineau_segmentation12.JPG

Online Metrics need an XML Standard

I’m contributing a monthly article to MediaPost’s Metrics Insider Column.  My first contribution was published last week, and I’m reposting it here to get your thoughts.  Soon I hope to describe what I mean in more detail… The article was called “The Most Measurable Medium needs an XML Standard.” In case you missed it, here it is:

OVER THE LAST TWO WEEKS, my fellow Metrics Insider columnists have correctly pointed out that online metrics are neither standardized nor easily integrated across systems. Vocabulary is muddled. Numbers do not match. Data exists in silos and is isolated from related data. Systems do not adequately or easily talk to each other. Research services, ad servers, and Web analytics tools report similarly named, overlapping and often conflicting metrics. Unfortunately, these problems will not disappear anytime soon, even with emerging “standards” and continued attention paid by the industry to these important issues.

Current industry standards for Web metrics are limited, basic, and come from independent entities. Most recently, the Web Analytics Association released a set of “standards.” The WAA’s standards are elementary definitions of concepts from various periods of Internet measurement. Web 2.0 concepts like “events” are mingled with dated measurements like “hits.” Regardless, these definitions provide a very useful starting point for framing a discussion about metrics. Recently, I’ve learned that the IAB and MRC are developing a set of IAB Reach Measurement Guidelines. Let’s hope the IAB and WAA align their work efforts.

The IAB and MRC are also currently auditing “audience measurement” firms, like Comscore and Nielsen. It’s rather unclear to practitioners what standards the IAB/MRC are applying to the audit. But the hope is that auditing will expose issues of coverage error and selection bias in the black box methodologies used to create the panels and generate the audience measurement data.

It is important to note that the IAB’s audit has two parts. The first is certification, which indicates the company being audited is applying the “standards,” and the second is accreditation, which demonstrates adherence to the IAB standards.

Only time will tell if companies like Hitwise, Compete, and Quantcast will be asked to submit to auditing. It’s worth mentioning that legacy metrics “standards” (and audits) from historic organizations like ABCe still occur and carry weight with publishers and advertisers (especially outside of the United States). It’s entirely possible that newly formed organizations, like the Association for Downloadable Media, will offer their perspective on “standards” for online metrics.

The idea of “standards for the standards,” however absurd it sounds on the surface, starts to seem like a good idea when considering that all these parallel efforts aren’t intersecting. Honestly though, I question whether “standards” that are purely “definitional,” even if agreed upon, will solve many of the measurement challenges companies have when trying to understand Web data and take action from it.

Standard definitions are helpful for promoting understanding and creating a controlled vocabulary for discussing online metrics, but they don’t help with what I see as a huge challenge in today’s metrics technologies. The problem is this: currently available online metrics systems do not adequately separate data from presentation. That’s a huge limitation preventing Web data from being easily integrated with other systems.

Detailed-level Web data (the raw data) is often costly to extract, if available at all. It is nearly impossible to deliver detailed data in real time from Web analytics, ad serving, and research-based technologies in order to feed other systems. The majority of hosted (ASP) metrics systems are closed and do not allow access to key interfaces using open software standards. For the most part, today’s metrics technologies are black boxes where data goes in, but can only be extracted in various file formats after creating a report. Common export formats include csv, pdf, and doc. While XML exports are often available from many vendors, there is no standard XML schema for describing the same type of Web data across different sources!

The industry must begin collaborating and creating a standard XML schema for describing Web data. Creating a widely used, consensus-based, published, and maintained XML standard for online metrics would make it possible to more easily share, transform, and use Web data in other systems.
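To illustrate what sharing Web data through a common XML format might look like, here is a rough Python sketch. No such standard exists, so the element and attribute names (onlineMetrics, metric, source, period) are purely hypothetical, as are the numbers.

```python
import xml.etree.ElementTree as ET


def metrics_to_xml(source, period, metrics):
    """Serialize a dict of metric counts into a hypothetical shared XML format."""
    root = ET.Element("onlineMetrics", {"source": source, "period": period})
    for name, value in metrics.items():
        metric = ET.SubElement(root, "metric", {"name": name})
        metric.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_metrics(document):
    """Read the same hypothetical format back into a dict of integer counts."""
    root = ET.fromstring(document)
    return {m.get("name"): int(m.text) for m in root.findall("metric")}


# Made-up numbers from an imaginary Web analytics export.
document = metrics_to_xml(
    "web_analytics_tool", "2007-10", {"visits": 12000, "page_views": 48000}
)
restored = xml_to_metrics(document)
print(document)
```

The point is that any system able to parse the agreed format could consume the data without knowing which vendor produced it. That only works if everyone publishes against the same schema.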

I firmly believe that current metrics standards must go beyond simple definitions and tackle issues pertaining to data portability and system interoperability. Then we’ll all be in a better position to reuse Web data across the enterprise value chain. Once we all agree on “standard” definitions, I encourage us to start working together to develop a standard Online Metrics Markup Language.

Thoughts on Deploying Measurement and Web Analytics Systems (as I discussed at Semphonic XChange)

Last week I attended the 1st Annual SEMPhonic XChange and led a collaborative discussion on “deploying measurement systems in distributed companies.” In case you hadn’t heard about SEMPhonic, the company is a boutique consulting firm in Novato, CA (near one of my favorite towns on Earth: San Francisco).  They do some unique work with web analytics (Functionalism) and other facets of new media technology.  A fine gentleman named Gary Angel leads the group. He’s recently hired smart folks, like long-time web analytics expert, June Dershewitz, and notable blogger and author of the Web Analytics Report, Phil Kemelor.

A few months ago the idea came up for me to lead a “huddle” at their Xchange conference.  I was intrigued with this huddle idea.  It was different. New.  I’d facilitate a discussion on a topic of my choice in a small, Socratic, group setting.  No PPT!

Fast forward to last week, and I found myself sitting in Napa, CA at COPIA: The American Center for Wine, Food, and the Arts. COPIA was a brilliant place for an industry colloquy.  Being a guy who likes culture (and conferences), this venue offered the best of both worlds.  Instead of a hotel, I found myself surrounded by gardens, vineyards, art, food, wine, web analytics, and the brightest in the industry.  Cool!

Thinking back on the event, SEMPhonic Xchange, in my opinion, is a “must-attend conference.”  Not only is the “huddle” format unique and fun to participate in, it’s also a format that promotes deep discussion that leads to truly actionable insights. It’s a conference based on dialog and collaborative discussions between participants.  The huddle format provides 10 hours of free collaborative consulting (5 huddles) with employed people you can’t hire (and some consultants you can)!  Compare a daily consulting rate to the conference cost, and it’s a no-brainer.

My discussion on best practices for the successful deployment of measurement systems lasted a little over two hours.  My group collaborated nicely (so I thought).  It included very intelligent folks like Jared from Intuit, Scott and Christel from Xilinx, Sami and Fred from Adobe, Renata and Matthew from American Express, Aaron from Webtrends, Rupa from Cisco, Amy from JPMorganChase, Jeff from New England Journal of Medicine, and Kevin from Charles Schwab.  Thanks to anyone reading this blog who attended.  Your valuable insights and knowledge sharing contributed to the success of the huddle.

At the end of the huddle I went over a few tips for a successful deployment.  Here are a few:

  • Identify business goals, match those goals to business strategy, and align metrics and KPI’s to support those goals.  I always say “metrics and kpi’s have significance in position and relation to a goal.”  You can’t measure against performance unless you’ve identified your business goals.  Goals are supported by strategy.  You create KPI’s for measuring against goals and for guiding work toward objectives.  Every business unit could have different goals.  Management must align and support the standardized goals that you bake into your KPI’s (and any specific derivatives you create to support the goals of differentiated business units).
  • Verify the technology implementation and the metrics collected.  A web analytics or other measurement system must be verified to ensure conformance with “standard” best practices based on the company’s goals.  You need to make sure the numbers can be trusted.  That means validating data and reconciling differences with historic reporting systems or whatever the business demands, understands (and possibly pays for in real dollars, time, or resources allocated), such as ABCe, WAA, IAB, MRC, XYZ, 123, FBCINSA whatever it takes… ;-)
  • Provide education and training.  While some analysts want to keep all the data within their analytics group, I don’t think that practice scales across a large enterprise unless you are lucky enough to have a large analytics team.   I know I can’t individually and alone satisfy the metrics needs for thousands of stakeholders or hundreds of brands.  Successful companies deploying on a large scale adopt a “train the trainer” approach.  The trainers guide their business units, become local experts, and help foment a data-driven culture.   Let the data be free and the people educated, I say.  ;-)
  • Consider the corporate culture.  A metrics tool changes organizational culture (for the better or worse).  Suddenly, everyone is being measured and perhaps evaluated on goals implicit in the measurements.  Some will greet the tool with open arms while others will see it as a threat.  Solid management needs to foster buy-in and support for the tool.  That way, organizational resistance is overcome and clarity of mission is realized.
  • Help business units “use the metrics” to improve performance.  My friend Eric Peterson likes to say “web analytics is hard.”  Yes it is!  That means the expert needs to work with global business teams to mentor and teach how to separate signal from noise.   As the tool is used, business units will identify “pain points” that will need to be addressed.  The analytics team should work with business units to heal the pain and improve performance.
  • Plan to “get granular” and then “get integrated” with the measurement system.   Additional requirements will come out of the woodwork due to organizational learning after you go live (even if you think you’ve elicited all reqs prior to deploying your tool and building reports).  New requirements are an early sign of success (people are “getting it” after all).   As the business learns, it will become necessary to extend the system.  Consider brainstorming about the possible ways in which the system should be extended prior to rollout, and perhaps create a basic plan for extending the system. When the system is launched, modify that system enhancement plan and execute on it to support business goals.  Employ a project manager to plan and help execute!
  • Manage and guide stakeholder expectations to minimize risk.  Be realistic when guiding and setting expectations so that you can minimize risk (to your overall mission, your job, the company ;).  Risk comes from incorrect metrics, organizational issues with adoption, changes in business goals, shifting managerial priorities, technology problems, and issues resulting from resource allocation.  Make sure you manage around these issues or find ways to directly deal with them. 
  • Generously explain why the measurement system is being deployed and more.  You must create a well-thought-out communication plan for promoting adoption and use of a metrics tool.  The communication plan should focus on answering:
    • Why the tool is being deployed.
    • What people are supposed to do with the tool.
    • How people should use the tool.
    • What key metrics/KPI’s people and business units should be looking at to manage their performance.
    • How to go from “insight to action” based on analysis of the metrics.
  • Share best practices and lessons learned across the enterprise.  As new insights are realized and the company starts taking action from the metrics, you should provide a way for the enterprise to communicate the “highest and best use” of the tool.  By promoting collaboration and knowledge sharing, the company is more likely to succeed quickly with the measurement tool and realize a demonstrable ROI from it.
  • Realize that “premature optimization is the root of all bugs.”  When deploying a measurement system, you need to establish a baseline system before extending the system to get more “granular.”  An implementation must be granular before integrating data from other systems.  While these concepts will overlap in deployment and may occur in close proximity or in a waterfall (i.e., granularity may be enabled via integration), you need to ensure you don’t put the cart before the horse.  Make sure you are correctly measuring and understanding the basics (like recency, frequency, clickstream, referrers, bounced visits, and depth) before moving forward with more advanced and necessary measurements and reporting (like bounce rate, conversions, view thrus, voice of the customer, and “engagement”).   Set clear expectations and guidelines with the consultants.  Don’t move too fast with development.  QA, then segment away, I say… ;-) 
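The verification tip above talks about reconciling numbers with historic reporting systems. Here is a minimal Python sketch of one way to do that. It assumes you can pull the same metrics for the same period from both systems; the 5% tolerance and the counts are arbitrary illustrations, so pick a threshold the business agrees on.

```python
def reconcile(new_counts, legacy_counts, tolerance=0.05):
    """Return metrics whose relative difference from the legacy system exceeds tolerance."""
    flagged = {}
    # Only compare metrics that both systems report.
    for metric in sorted(new_counts.keys() & legacy_counts.keys()):
        new = new_counts[metric]
        legacy = legacy_counts[metric]
        if legacy == 0:
            difference = 0.0 if new == 0 else float("inf")
        else:
            difference = abs(new - legacy) / legacy
        if difference > tolerance:
            flagged[metric] = difference
    return flagged


# Hypothetical monthly totals from the new tool and the historic reporting system.
new_tool = {"page_views": 10300, "visits": 4000, "unique_visitors": 2500}
legacy_system = {"page_views": 10000, "visits": 5000, "unique_visitors": 2500}

flagged = reconcile(new_tool, legacy_system)
for metric, difference in flagged.items():
    print(f"{metric}: differs by {difference:.0%}")
```

Differences between systems are normal, since they collect and define things differently. The goal is to flag the gaps worth explaining before stakeholders find them for you.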

A lot more was discussed and shared over in Napa and throughout the SEMphonic XChange conference.  Please share your thoughts, if you feel like it!  Thanks for reading!

measurement_systems.jpg