Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Director at a large multichannel media company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.

Archive for 'Measurement'

Let’s Use Web Analytics Data for Targeting

I’ve been thinking a bit about targeting, and how we in the web analytics industry have just a ton of visitor or segment-level data that can be used for targeting ads or content, but most tools don’t let you use the data or easily feed it to other systems to do any targeting.  It’s rather odd, don’t you think?  Even Omniture Test and Target isn’t using, as far as I’ve learned, a single data model or the data collected from their behavioral tools, like HBX or SiteCatalyst, for targeting.  All their data models, and thus their data, are unique to the products in their platform.  So I decided to resuscitate/revise a blogviation and offer it as food for thought on MediaPost.  When I reread this post, it’s more of an informational post for product managers on how I’d begin thinking about targeting with analytics data and what types of targeting are possible, so here it goes.

Targeting refers to the process of delivering content or ads to segments or visitors based on their known attributes.  The goal of targeting is simple to understand: maximizing the performance of content or an ad by serving it to visitors at a time when they are most open to receiving the message.

For example, you may visit a site, and see some type of ad unit calling out at you to “meet singles in <insert_your_city>.”  When browsing a real estate site, you may see ad units for realtors and mortgage companies.  After entering a keyword such as “car insurance” and clicking through the search results, you may land on a site and see an ad for a car insurance company or land on a page that persuades you to begin the process for creating an insurance price quote.  That’s targeting in a nutshell.  It’s simple for a site owner to understand:

  1. Visitor X has these attributes.  
  2. We have content or an ad that we think will appeal to Visitor X’s attributes. 
  3. Let’s show the relevant content or ad. 

In online media, targeting is associated with paid search campaigning, ad serving, and content optimization based on recognizing and responding to the following attributes:

  • Category and sub-category.  Conceptual constructs like “categories” of topics on a media web site or products on an ecommerce site can be targeted to include certain types of ads or messages.  The idea is that if visitors are browsing your category for “hardwood floors,” you could offer them an ad or content specific to “flooring installation services.”
  • Geography.  Country, region, city, state, DMA are all targetable constructs.  You may run a sports site and choose to target people surfing in from 02116 (Boston) an ad for Red Sox tickets or content about Manny Ramirez’s recent trade to the Dodgers.
  • Browsing environment such as the connection speed, type of browser, operating system, user software, domain, and ISP.  An ad network could serve an ad for DSL to a modem-based surfer by detecting the visitor’s browsing environment.
  • Time.  The idea of only showing content during specific periods of time is called “parting.”  Common types include day-parting and season-parting.  For example, a B2B site only choosing to show ads for a particular manufacturer’s product during business hours — the site’s busiest time of day — would be an example of day-parting.
  • Keyword.  There are many different types of keyword targeting.  Search engines target ads based on keywords in queries.  Content Management Systems target content based on site search keywords or referring keywords.  “Keywords” may be associated as metadata with site sections or pages, similar to zone or category targeting on an ad server.  Once a page is associated with keyword metadata in an ad tag, you can tell your ad server to target ads to that keyword on whatever page or pages the tag was placed. 
  • Language.  When a language can be detected or known in advance, you can target ads to visitors in their language.
  • Demographics. If the ad server is aware of a segment’s demographics, such as age, gender, income, title, purchasing power, and so on, an ad can be targeted on that basis. 
  • Context.  Think of AdSense and how it matches text ads based on the semantics in site content.  Or when, after adding a product to your cart, a site offers you “free shipping” if your total purchase exceeds a certain price.  This is content targeting based on context.
  • Profile.  Targeting is possible based on conclusions drawn and rules created from attributes about an individual or segment (such as purchasing propensity or job title).
  • Rules.  Serve an interstitial ad only to visitors who don’t have a cookie set for the site.
  • Events.  Someone deposits a large sum of money into his bank account, so the online banking site offers him a CD product on his next login.
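To make a few of these attribute types concrete, here is a minimal sketch of rule-based targeting. Everything in it is an assumption for illustration: the `Visitor` fields, the `Rule` class, the ad names, and the 9-to-5 day-part are hypothetical and not taken from any real ad server.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class Visitor:
    # Hypothetical attribute names; a real tool exposes its own schema.
    category: str = ""
    postal_code: str = ""
    has_site_cookie: bool = False
    local_time: time = time(12, 0)


@dataclass
class Rule:
    ad: str
    # Each condition is a function that takes a visitor and returns True or False.
    conditions: list = field(default_factory=list)

    def matches(self, visitor):
        return all(condition(visitor) for condition in self.conditions)


def pick_ad(visitor, rules, default_ad="house_ad"):
    """Return the ad of the first rule whose conditions all match."""
    for rule in rules:
        if rule.matches(visitor):
            return rule.ad
    return default_ad


RULES = [
    # Rules: serve an interstitial only to visitors without a site cookie.
    Rule("interstitial", [lambda v: not v.has_site_cookie]),
    # Geography: visitors from 02116 (Boston) get the Red Sox ad.
    Rule("red_sox_tickets", [lambda v: v.postal_code == "02116"]),
    # Category plus day-parting: flooring ad during assumed business hours.
    Rule("flooring_installation", [
        lambda v: v.category == "flooring",
        lambda v: time(9, 0) <= v.local_time < time(17, 0),
    ]),
]
```

Rule order acts as priority here, which is a design choice of this sketch; a production system would more likely score or weight competing rules.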

We’ve all heard, of course, about a very specific type of often-discussed targeting in online advertising: “behavioral targeting.”  Behavioral targeting refers to the technology and process in which an ad or content is shown to a visitor based on their past actions and reactions.

Behavioral targeting involves:

  1. Collecting behavioral data about visitors.
  2. Identifying when those visitors visit a site.
  3. Determining the current context of visitors on the site.  
  4. Detecting the visitor’s current behavior.
  5. Serving relevant ads (or content) matched to the behavior.

The goal is to use past behavioral data to influence the customer buying cycle or marketing lifecycle, in order to more effectively and more quickly deliver on advertiser and site goals.
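The five steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `history` store, the `ADS_BY_CATEGORY` mapping, and the "most viewed category wins" heuristic are all hypothetical, and real behavioral targeting platforms use far richer models.

```python
from collections import Counter, defaultdict

# Step 1: behavioral data collected per visitor ID (hypothetical in-memory store).
history = defaultdict(Counter)

# Hypothetical mapping from a content category to an ad.
ADS_BY_CATEGORY = {
    "real_estate": "mortgage_ad",
    "autos": "car_insurance_ad",
}


def record_view(visitor_id, category):
    """Collect one piece of behavioral data: a page view in a category."""
    history[visitor_id][category] += 1


def choose_ad(visitor_id, current_category, default_ad="run_of_site_ad"):
    """Steps 2 to 5: recognize the visitor, read context and behavior, pick an ad."""
    # The current page view is itself behavior, so record it first.
    record_view(visitor_id, current_category)
    # Prefer the category this visitor has viewed most often that has an ad.
    for category, _count in history[visitor_id].most_common():
        if category in ADS_BY_CATEGORY:
            return ADS_BY_CATEGORY[category]
    return default_ad
```

A visitor who browsed autos pages twice and then lands on a sports page would still get the car insurance ad, which is the point: past behavior, not just the current page, drives the choice.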

So where does Web analytics come in?  You would think Web analytics data from “Web analytics” technology would be used to enable “targeting.”  After all, the best Web analytics systems store detailed visitor-level data about past behavior.  Web analytics data certainly can be used, but in most cases, targeting is a function provided by the ad server or network, perhaps the ISP, or another technology called the “behavioral targeting platform,” not from data collected by the Web analytics tool.

In order to make Web analytics data useful for targeting, you will need to use your data to:

  1. Define segments to target or identify visitors to target.
  2. Feed past behavioral data about segments or visitors to the targeting technology.
  3. Analyze segment and visitor performance against site or advertiser goals after targeting.
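As a rough sketch of the first and third steps, the snippet below assigns visits to segments and then compares conversion rates by segment. The record fields (`source`, `prior_visits`, `converted`) and the segment naming are assumptions made for the example, not the export format of any particular analytics tool.

```python
from collections import defaultdict


def segment_of(visit):
    """Assign a visit to a segment (hypothetical rule: source plus new/returning)."""
    kind = "returning" if visit["prior_visits"] > 0 else "new"
    return f'{visit["source"]}/{kind}'


def conversion_by_segment(visits):
    """Return {segment: conversion rate} for a list of visit records."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for visit in visits:
        segment = segment_of(visit)
        totals[segment] += 1
        if visit["converted"]:
            conversions[segment] += 1
    return {segment: conversions[segment] / totals[segment] for segment in totals}
```

Running the same calculation before and after a targeting campaign gives a simple, if crude, read on whether the targeted segments performed better against the goal.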

Targeting has a proven ability and amazing potential to generate tremendous returns, especially when combined with the rich, detailed behavioral data available in Web analytics.  As a method for optimizing site content and advertising, targeting technologies that integrate with Web analytics data will only become more important and a necessary “must have” for innovative companies that want to maximize business opportunities on the Internet. 

X Change: X Citing X Cogitation!!

Alright, I had to have fun with the title. :) We’re about 4 weeks away from the newest and most unique analytics conference on the scene: X Change, hosted this year by Semphonic and Web Analytics Demystified.

If you missed the first year in Napa, you gotta head to San Fran this year!  Allow me to explain how X Change differentiates as I see it:

  • Conversational. You don’t sit in a room and listen to people drone on in front of their powerpoints.  People sit in Socratic circles and talk about a topic of interest in “huddles.”  The huddle leader will bring up a topic, perhaps riff on some hard-learned experience or data point related to the topic, and ask for commentary from the participants.  The conversation then flows, like Jazz, until there’s a cadence, then the huddle leader phrases a few more notes and progression begins again…  Its atypical format depends on participants for success.  No one is going to sit there and read you slides and provide one-sided opinions.  You won’t just be sitting there listening (unless you want to).  The best huddles are interactive and encourage active participation in the pursuit of shared knowledge, not passive reception of an individual’s knowledge.
  • Focused.  The huddle topics are highly specific and deeply relevant to the real world practice of web analytics today - from attribution to mobile measurement to integration to privacy to team structure, the huddle leaders selected topics that interest them to share with the participants. The focused conversational format should lead to symbiotic exchanges of information directly relevant to your job.
  • Small.  100 people, 20 huddle leaders.  You get to meet interesting people and build working relationships with them.  Cool folks like Bob Page, Rachel Scotto, Marshall Sponder, John Lovett, Jared Waxman, “Bob” Dylan Lewis will be leading huddles and hanging out.  The Web Analytics Tuesday event will probably be bigger than the whole X Change conference!
  • Exclusive.  The huddle leaders were hand selected.  In attendance will be industry leaders, corporate executives, industry analysts.  All of the attendees work with analytics.  And for gosh sake, it is at the Ritz in one of America’s most beautiful and eccentric cities. 

I think X Change is a unique experience and a worthwhile event where you get to really connect, and well, exchange (!) expertise with your peers and go home with new knowledge.  At least I did last year.  I’ll be leading a couple of huddles, one on the web analytics team and one on knowing when you’ve outgrown your analytics tool, so say hello when you see me.

Make sure you check out the official web site at Semphonic and sign up today.  The event will sell out soon.  15% discounts are available for Web Analytics Association members. 

AVG Fixes LinkScanner!!

AVG has released an updated version that corrects the LinkScanner bot issue (build 138, July 4), which we’ve all noticed slamming our servers and analytics data over the last several weeks:

We have modified the Search-Shield component of the product to
only notify users of malicious sites. Search-Shield no longer
scans each search result online for new exploits, which was
causing the spikes that web masters addressed with us. However,
it is important to note that AVG still offers full protection
against potential exploits through the Active Surf-Shield
component of our product, which checks every page for malicious
content as it is visited, but before it is opened.

As you’ve just read in the quote above, AVG has stopped scanning each page that returns in a SERP for users of their free tool.  Instead pages will be scanned by proxy after a user clicks on the link. 

For paid users, it’s a little different.  SERP’s will still be scanned but via a pure database approach (not the DDOS approach :), which means the sites listed in SERP’s will be compared to a blacklist of known “bad” sites.  The blacklist is based on internal AVG research and on the real-time results reported by users who have opted into AVG’s “prevalence reporting system” (a feature of AVG 8).  This means AVG is still scanning sites, but on a very limited basis, thus the detrimental effects on analytics should be very minimal and only caused by users who participate in prevalence reporting.  Still some data pollution will occur…

AVG hasn’t confirmed that they’ve released a fix to the “noscript” issue I mentioned.  I do know they are working on it and have fixed the problem in internal builds.  Regardless, if the LinkScanner is working in the way they say it is, the problem will be negligible (but some data pollution will still occur ;).

Kudos to AVG Corporate, Roger Thompson, Pat Bitton, Greg Mosher, and all the other engineers who listened to the community on the web and worked quickly to fix the problem.  Now let’s hope the build 138 update works as described. Time will tell.

AVG LinkScanner Bot Executes JavaScript?!?

The well-researched answer is “no.”  The AVG LinkScanner Bot appears to prefetch the js and the gif (and pretty much everything else on the page), which for certain tools and their tag configurations generates false page views and visits (and the derivatives thereof), just as if it were “legitimate” traffic.

If your tag configuration is set up with noscript tags, AVG will fetch the content in the tags, including the gif, which means that:

  • The bot may be infesting the data of customers of web analytics vendors who configure page tag-based data collection in this way.
  • The bot may be inflating the data in such products/services offered by various web analytics companies.
  • Customers may be paying for server calls generated by this bot.

Vendors, of course, could easily filter the user agent to protect their customers:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813) 

But I haven’t heard a peep from any SaaS vendors about excluding the user agent, filtering already collected data, or refunding customers the cost of robotically generated server calls (regardless of AVG). Have you?

Think about this: many SaaS page tag vendors don’t provide detailed visitor-level data and user agent reporting.  That means that their customers have no ability to investigate this bot or detect it by filtering their reported data by the true user agent.

I’ve been talking about JS executing bots screwing with web data for about a year now.  SEOMoz and the folks at SlickSurface confirmed it quite recently (quoting me no less in their fantastic analysis).  So they do exist…

Now let me tell you a little story.  Once upon a time I was at a conference called eMetrics when the CEO of a company came up to me and said “hey I read your blog about bot detection, and I looked in my web metrics tool for traffic with high page view to visit ratios.”  Then he narrated a story to me about how he found a bunch of traffic that had page view to visit ratios of 5,000 to 1.  I said “do you use page tags?”  He said “that’s all my vendor provides, so yeah.”  And I said “you’ve found a javascript executing bot in your data.”  “I know” he said. “Well did you call your vendor and let them know?”  I said.  Now for the punch line:  he told me that the vendor (who shall remain nameless) told him “well, the traffic executed server calls.”  And they wouldn’t give him a refund!
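The check in that story, looking for traffic with an abnormally high page view to visit ratio, can be sketched in a few lines if you have hit-level data. The input shape (one `(user_agent, visit_id)` pair per page view) and the `max_ratio` threshold of 100 are assumptions for illustration; pick a threshold that suits your own site.

```python
from collections import defaultdict


def suspicious_agents(hits, max_ratio=100.0):
    """Return {user_agent: page views per visit} for agents above max_ratio.

    `hits` is an iterable of (user_agent, visit_id) pairs, one per page view.
    """
    page_views = defaultdict(int)
    visits = defaultdict(set)
    for user_agent, visit_id in hits:
        page_views[user_agent] += 1
        visits[user_agent].add(visit_id)
    ratios = {ua: page_views[ua] / len(visits[ua]) for ua in page_views}
    return {ua: ratio for ua, ratio in ratios.items() if ratio > max_ratio}
```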

It’s worth mentioning that this bot definitely affects log file tools and packet sniffer tools.  Both must be configured to filter the AVG LinkScanner user agent.
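If your tool gives you hit-level records with the user agent, a filter along these lines is one way to exclude the bot. It is a minimal sketch: the record shape (a dict with a `user_agent` key) is assumed, and the match is exact against the user agent string quoted above, so it only works for as long as AVG keeps sending that string.

```python
# The LinkScanner user agent string quoted in this post.
LINKSCANNER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"


def drop_linkscanner(hits):
    """Return only the hits whose user agent is not the LinkScanner string."""
    return [
        hit for hit in hits
        if hit.get("user_agent", "").strip() != LINKSCANNER_UA
    ]
```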

Now here’s the rub for me.  I use AVG!!!  But I now find it increasingly difficult to support the company or continue using their products.  Why?  Because they are wearing a “bad hat” here:

  • First, they are fully aware of the effect of this bot on web analytics systems. They just don’t seem to care (yet).  UPDATE:  They have set up a Google Group to discuss this issue.  They must understand how companies of all types in all sectors use web analytics data to optimize their sites, set their marketing budgets, determine expected server load, and much more.  What do their Internet Marketers think?
  • Second, the Link Scanner tool may have a short shelf life and may offer limited protection.  Malware creators will easily adjust. Check out what my friend Steve McInerney, a very smart security expert, said on the Web Analytics Association’s Yahoo Forum:
What strikes me about this particular solution by AVG is how
incredibly … stupid it is on several fronts.
1. Noticeably impacting a users bandwidth is, technically, a security
breach in the first place, aka Denial of Service Attack.
2. Some of us live in countries that have rather severe bandwidth
charges/limits and the like, whom shall I send my excess bandwidth
bill to?
…(this) method is fundamentally
flawed. ie malware ignores any first request and only infects on a
second request - alternate cloaking. Whatever. This type of “solution”
only provides weak protection for a strictly limited period of time.
…not just “no security” but bad
security. Because folk feel they are being protected when they are
not, and hence will take greater risks and hence inflict greater harm
on themselves. :-( 
Ignoring the balance of positive to harm that this problem inflicts on
the users who use this product.
  • Third, AVG just doesn’t seem to “get it” yet.  They are potentially messing with the ability to drive commerce via data driven decision making, e-commerce analytics, site optimization, and online media measurement!  To quote The Register: “chief of research Roger Thompson - who designed the AVG LinkScanner - indicated he may do away with that unique user agent. His chief concern is security, and he doesn’t want webmasters or malware writers gaming his scanner. ‘In order to detect the really tricky - and by association, the most important - malicious content, we need to look just like a browser driven by a human being,’ he argues.”

WebMasterWorld has some good stuff to say here.  Read the Register’s first article here.  And check out the blog of the dude who broke the news first and the responses from AVG here and here.

Interesting stuff. So what do you all think? Have you seen evidence of this bot in user agent data from your page tag solutions that use the noscript tag for the image? 

Sunday Night Thinking on Mobile Analytics…

Mobile analytics for Internet-enabled wireless devices is a fairly hot topic for companies seeking to acquire customers, extend their brand, or expose content in “innovative” ways.  Obviously, the iPhone and Blackberry are pushing development in this area forward, but there really aren’t a lot of players in this space. 

Nedstat, CoreMetrics, and Omniture offer capabilities mixed into their current offerings.  Nedstat even carves out some mobile specific reporting.  You can gain some insight into mobile activity from companies that enable log file processing, like Unica and WebTrends, but be prepared to configure a bunch of filters to isolate the data.

Lesser known companies pushing mobile offerings include: Amethon, Mobilytics, Bango, TigTags, Xiti, and AdMob.  Some of these mobile players are even offering capabilities where they cross-sell analytics as an integrated part of their ad networks, content delivery  and transactional processing systems, marketing and barcoding services, and even as infrastructure or network appliances.

On the audience measurement side, we’ve seen comScore acquire M:Metrics, which was no surprise to me.

On the multivariate testing side, we see my friends at SiteSpect offering mobile MVT testing capabilities. 

And I’ll bet we see Google get into this space within the next 6 months…  I’d even wager an announcement at eMetrics DC…

From what I can gather, when we’re talking about “mobile analytics” we’re talking about “mobile browser” activity across a variety of handsets, not everything that happens on the device. 

Measurement issues in this area include:

  • Data Collection.  As many of you know, not all mobile browsers will execute javascript.  They cached the images.  Thus, vendors offer us choices.  Folks like Mobilytics and Bango use an image-based data collection method, while Amethon offers a packet sniffer (they call it wireline detection), and we even have Omniture and Coremetrics talking about “no tag” implementations - what my good friend Phil Kemelor mentioned on his CMS Watch blog (“To compensate, you need to stuff the image tag with query strings that will collect the data you require for reporting.”)  Then we have Unica and WebTrends with log files.  Interestingly, packet sniffing has some advantages here because some devices pass unique id’s (such as the phone number) in the HTTP header or other unique id’s.
  • Unique visitor identification due to lack of cookie support and IP addresses changing.  IP addresses change, I’m told, as they switch from tower to tower.  In addition many mobile devices will take the IP address of the gateway, making all the devices look like the same “person.”  I’ve certainly seen evidence of the host changing pretty quickly during a mobile session. Compounding the difficulty in assessing “uniqueness” is that not all mobile devices support cookies.  In web analytics, cookies are used to define uniqueness.  The fallback method when you can’t use a cookie is IP address/user agent.  If you can’t set cookies and the IP address and user agents are the same, how do you identify uniqueness?  However, when you can detect a unique value in the header, you can easily detect uniqueness.
  • Handset capability detection.  Does the device support WAP pushing, streaming video, ringtones, downloading video clips, and so on?
  • Phone and Manufacturer identification.  Databases from WURFL and DeviceAtlas can be used to identify phone and manufacturer device attributes.  Larger vendors are further behind on integrating this data into their current offerings, whereas the smaller niche players are making use of it.
  • Screen resolution detection.  The Mobile Marketing Association’s (MMA) standards for the four “standard” screen sizes may carry enough weight to push this disdained piece of metrics trivia available from javascript based tagging in web analytics into a brighter spotlight.
  • Traffic source detection.  Capabilities for traffic sources seem rudimentary.  I don’t just want to know about search and direct entry.  But I want detection of sources from my marketing and advertising campaigns, rss feeds, and email newsletters, if mobile visitors are coming in from those channels.   Interestingly, Bango solves the campaign tracking issue by pushing you to a Bango-specific URL.
  • Geographic identification.  Where are the visitors viewing your site coming from?  And what does the mobile audience environment “look like” in each country.  From this information you can extrapolate country-specifics for site optimization.  But not all devices enable geographic detection because the gateway’s IP address is used or the IP address from the network is used, not a GPS signal.
  • No standards.  There are few, if any, commonly supported mobile standards and no web data standards, so the problem is no standards for the devices and no standards for the tools.  There are no standards.  Did I mention that there are no standards. 
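The unique visitor problem above boils down to a fallback chain: use a unique header value if the device or gateway passes one, else a cookie, else IP address plus user agent. Here is a minimal sketch of that chain. The header name `X-Subscriber-Id` and the cookie name `visitor_id` are made up for the example; which headers actually carry a unique value varies by operator and device, and hashing the header value is my own precaution since it may be a phone number.

```python
import hashlib


def visitor_key(headers, cookies, ip_address, id_header="X-Subscriber-Id"):
    """Return (method, key) identifying a visitor, using the best signal available."""
    # 1. A unique value in the HTTP header, when the device or gateway passes one.
    header_id = headers.get(id_header)
    if header_id:
        return "header", hashlib.sha256(header_id.encode()).hexdigest()
    # 2. A cookie, when the device supports cookies.
    cookie_id = cookies.get("visitor_id")
    if cookie_id:
        return "cookie", cookie_id
    # 3. Fallback: IP address plus user agent. Devices behind the same
    #    gateway with the same user agent will collapse into one "person".
    user_agent = headers.get("User-Agent", "")
    combined = f"{ip_address}|{user_agent}"
    return "ip_ua", hashlib.sha256(combined.encode()).hexdigest()
```

Returning the method alongside the key lets a report show how much of the "unique visitor" count rests on the weakest signal.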

So I was thinking, what would I want to see in a mobile analytics solution?  Allow me to riff here.

  • Dashboards for KPI and specific-metric reporting.  Views, visits, visitors, referrers, popular pages, traffic sources, resolutions, geography, time-based reporting and custom defined KPI’s….
  • Support for multiple data collection methods.  Logs, no-js image tags, and packet sniffers.  Let me pick what I need for whatever application fits my goals.
  • Support for mobile-specific constructs not present in historic web analytics data.  Manufacturers, operators, handsets, and device capabilities.
  • Advertising-based reports.  CTR, CPM, eCPM, that stuff…
  • Tracking for mobile downloads, installed applications, SMS, and MMS.  Seems like a no-brainer.
  • API’s.  Closed systems are dead ends for integrated marketing, so give me an API or enable pre-built integrations with other systems, like CRM.
  • Segmentation.  By country, by device, by network, by manufacturer, and so on.  It’s necessary.
  • Repeat or return visitor identification.  Simple measures of recency and frequency, core to media buying and planning and to site optimization, should be a data point available in mobile analytics.
  • Conversion and goal metrics.  Do visitors on mobile devices convert better, worse, the same?  Do they reach site goals?  Without tying performance data  and outcomes to mobile visitor activity, I’m left wondering…
  • Value scoring for engagement or proxy scoring for revenue and ROI analysis.  I want to be able to score attributes or actions to approximate an engagement score or to identify value or indicate revenue. 
  • Non-human traffic and web-browser based detection and reporting.  Mobile pages are full of links.  The ads are links.  Mobile vendors must support detecting, filtering, and reporting, non human and web-based agents from pure mobile agents - otherwise the mobile data gets muddled and skewed.
  • Data Export.  Must be able to export reports to Excel or Word, and email them.

So there’s a quick blogviation on Mobile.  Am I right, wrong, what did I miss?  Let me know…

Why Don’t the Numbers Match?!?

A question any practitioner of Internet-based analytics will be asked by many different stakeholders is “why don’t the numbers match?”  Counts of the identically named metrics from ad servers don’t match the web analytics tool, which don’t match the for-pay third party audience measurement tools, which don’t match the free audience measurement tools, which never match any of the homegrown internal measurement tools.  And none of them ever match each other.

So it’s a good question, and certainly a valid one to ask.  The answers are even fairly easy to understand, but the root causes are often difficult to pinpoint and even harder, if possible at all, to remedy.  The fact of the matter is that data discrepancies in analytics result for a multitude of reasons, such as:

  • Different data collection methods.  We have a bunch of tools and services that collect web data using various, non-standardized, proprietary data collection methods.  Ad servers use javascript page tags.  Many web analytics tools use page tags too, but it’s not uncommon in web analytics to use additional methods, such as log files or packet sniffers.  Or perhaps a combination of these methods, called hybrid data collection.  And all the tools have different algorithms for processing the data collected.

On the audience measurement side, data is collected from self-selecting panels who install proprietary software (i.e. toolbars and so on) on their computers, perhaps at work or at their university, but most likely at home.  Then, the collected data from different panels is rolled-up and combined, and the limited subset of the Internet population that chooses to be monitored, in exchange for some incentive, is inflated and projected to the entire Internet audience using proprietary statistical methods.  We also have data collected from a limited set of geographically specific ISP’s.  And regardless of whether we’re talking about audience measurement or web analytics, the different data collection methods often, but not always, involve cookies and all their inherent issues of cookie deletion.  

  • Unique data models.  Ad servers aren’t focused on counting page views and the other dimensions of web analytics (visits, time, and so on).  Rather ad servers focus on serving and counting impressions served (and loads of related derivative calculations, like CTR, CPC, and view–thru).  Metrics are based on an ad request and an ad code.  Ads may or may not be targeted to a page, and instead to various constructs, like a “zone” or “keyword.”  What that means is that the “page” dimension may not even exist in your ad server’s data model.  In other words, you aren’t looking at impressions measured on a page, but rather at the number of impressions served in a different conceptual construct.  That’s one of the reasons why people say metrics and ad-serving systems “don’t measure the same thing.”
  • Untagged pages.  Specific to technologies that collect data or serve ads using javascript page tags, there are challenges to ensuring and verifying complete coverage of page tags across every page on a site.  When the pages aren’t all tagged with the different tags for the assorted technologies, guess what?  The numbers won’t come close to falling within tolerable variances.  And questions and skepticism will ensue.
  • Non-JS executing clients and ad blocking software.  Let’s imagine for the moment, your site is perfectly tagged for all technologies, so the numbers between your ad server will be close to your web analytics system, right?  Nope, regardless of data model issues, not all browsers execute javascript and many Firefox users have installed Ad Block Plus. 
  • Cookie issues.  When you’re counting based on cookies, third-party cookies get blocked (often by privacy software).  Many ad servers and web analytics tools still serve third party cookies, and many corporations have not tricked out their DNS to accommodate this issue.  And we all know how cookie deletion affects unique visitor counts, even if you use first-party cookies.
  • Many other issues.  Latency from visitors moving off the page prior to the tag executing to latency in the call to pick up an ad from a third party while your ad server counts the traffic (so your ad count differs from the agency’s count), to refresh rates making it hard to correlate page views and impressions, to no rich media installed and no fallback, to robotic traffic not being filtered from logs or tags, to certain types of user agents (such as mobile devices) not executing javascript… there’s a whole host of other factors that cause data discrepancies.

And of course, there’s always the nebulous issue around the complete lack of consensus-based, enforceable standards for online measurement.  No industry organization can say what vendors or companies “must” do, only what they “should” do… And no industry body is going to get successful companies to change their secret sauce just because they said so…

So what’s a practitioner to do?  Understand the potential sources of discrepancies.  Work with your team (from IT to vendors) to prevent and minimize the root causes when possible.  Educate your team when discrepancies are not remediable.  Ensure you use the different sources of metrics judiciously in the context of your business goals.  Finally, realize that none of the tools are more “correct” than any other.  All of our analytics tools serve different, and sometimes overlapping, business purposes - from counting ads, to influencing media buying, to sizing audiences, to measuring business performance, and to optimizing the site.
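One practical way to keep an eye on discrepancies is to track how far each source sits from a chosen baseline and flag the ones outside a variance you consider tolerable. The sketch below assumes you already have comparable counts per source for the same period; the 10% tolerance and the source names are placeholders, not an industry standard.

```python
def relative_differences(counts, baseline):
    """Return each source's difference from the baseline source as a fraction."""
    base = counts[baseline]
    return {
        name: (value - base) / base
        for name, value in counts.items()
        if name != baseline
    }


def outside_tolerance(counts, baseline, tolerance=0.10):
    """Return the sorted names of sources that differ from the baseline by more than tolerance."""
    diffs = relative_differences(counts, baseline)
    return sorted(name for name, diff in diffs.items() if abs(diff) > tolerance)
```

Picking a baseline does not mean that tool is "correct"; it just gives the team one consistent reference point for spotting when a gap has grown.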

Five Rules for and some Thoughts on Deep Packet Inspection

One of the many things on my mind in the online world these days is “deep packet inspection.” 

First, let me digress, packet sniffing isn’t new to web analytics.  From Accrue to Omniture (Visual Discover Sensor?) to AuriQ to Metronome Labs.  Packet sniffers are used to “do web analytics.”  It’s an uncommon method when compared to javascript page tags.

Web analytics packet sniffers are used to write logs for sessionizing (and thus measuring) the traffic on behalf of site owners (who don’t want to use tags or logs).  Once you’ve logged and sessionized you know what content people have looked at or downloaded on your site.
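For readers who haven’t seen sessionization up close, here is a minimal sketch of the idea: group logged hits by visitor and start a new session whenever the gap between hits exceeds a timeout. The 30-minute timeout is a common convention that I’m assuming here, and the `(visitor_key, timestamp, url)` record shape is hypothetical; each vendor has its own rules.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessionize(hits, timeout=timedelta(minutes=30)):
    """Group (visitor_key, timestamp, url) hits into sessions per visitor.

    A new session starts when the gap since the visitor's previous hit
    exceeds `timeout`. Returns a list of (visitor_key, [urls]) tuples.
    """
    by_visitor = defaultdict(list)
    for visitor_key, timestamp, url in sorted(hits, key=lambda h: (h[0], h[1])):
        by_visitor[visitor_key].append((timestamp, url))

    sessions = []
    for visitor_key, visitor_hits in by_visitor.items():
        current = []
        last_seen = None
        for timestamp, url in visitor_hits:
            if last_seen is not None and timestamp - last_seen > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((visitor_key, current))
                current = []
            current.append(url)
            last_seen = timestamp
        sessions.append((visitor_key, current))
    return sessions
```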

“Deep packet inspection,” like WA sniffers, looks at the entire payload of packets in real-time across a huge number of simultaneous sessions.  Deep packet inspection, like regular packet sniffing, examines the files downloaded and the content of the pages viewed - the whole ball of wax.

Deep packet inspection is being offered as a hardware/software technology by companies like FrontPorch and Sandvine (in the US) and Phorm (in the UK).  These companies are selling the technology to ISP’s (like Charter, Comcast, and Virgin Media) so that they can monitor the sites visited and the keywords used by customers, and then use the data collected for behavioral targeting.  The ISP’s want a slice of the juicy, lucrative online ad business.

What’s the difference?  Site owners collect data about what you do on ONE site (or a portfolio of their sites).  ISP’s collect data about what you do on EVERY site you visit.  As I understand it, some of these companies create an anonymous profile of your surfing activity by assigning a unique key to your browser.  Then they monitor the sites visited by your browser, and use that data so that the ISP, or the companies to which they sell your data, can serve you what they conclude to be relevant, behaviorally targeted ads.

Get it?  Packet sniffing by site owners = knowing about one site you visit.  Deep packet inspection by ISP’s = knowing about every site you visit.

Now to digress… In web analytics, we know that web analytics data is collected anonymously.  Unless there’s a login, you don’t know exactly who is coming from that IP address.  And in many cases, most companies’ data warehouses only contain purchase information, not the entire clickstream.  Once the data is collected, if you have the right architectures you can decode cookie values to people, and make that data non-anonymous (i.e., PII).  Not difficult to do with some smart BI folks on your side.  

An ISP already knows who you are and can already identify the sites you visit.  Probably not that easily, though, on an individual level.  They can dig through the logs, etc… 

So what’s the big deal and all the hoo-hah about the “deep packet inspection” Phorm and FrontPorch are doing?   It’s the data they are collecting and the repository they are building containing data about every site you visit and all the content you view and download… Of course, these companies say that it’s all done anonymously and that your “privacy” is preserved “to the greatest extent possible.” 

Now let me quote Sir Tim Berners-Lee about the data collected from Phorm’s ISP tracking: “It’s mine - you can’t have it. If you want to use it for something, then you have to negotiate with me. I have to agree, I have to understand what I’m getting in return.”

And that’s the point of the blogviation, Tim is correct.  In web analytics, we do this - we try to operate within Tim’s constraints.  We enable opt-in with P3P statements and disclosures when you register/login.  Privacy policies disclose what we are doing with the data.  It’s just ethical and smart business practice to do so.

Thus, I think FrontPorch and Phorm and all the ISP’s who want a piece of online advertising should adhere to the following five rules for their services.

  1. Move to an obvious “opt-in” model with full disclosure.  Tracking via “deep packet inspection” should be an all opt-in model.  If you want anonymous data from your browser collected so that you can be behaviorally targeted, then you should opt-in to be.  Right now, it seems to be all opt-out.  You probably don’t know if it’s being done to you.  It’s buried in fine print you’ve probably never read.  Is it your fault you didn’t read the fine print? Yeah, but the point is it shouldn’t be buried in the fine print…
  2. Provide me with access to the data collected.  If I opt-in, I should be able to see the data collected from my browser.  It’s very simple.  I demand to see what you are collecting about my browser.  If you are building a profile, then I demand to see the data collected in the profile.  If it’s all anonymous, then explain how it is in detail, and then follow rule #1.
  3. Enable me to edit or prevent the data from being collected.  If I opt-in, I want to be able to edit or prevent certain types of data from being collected.  If you’re tracking my browser, alert me before the data is transmitted, so I can decide if I want to share it.  If a profile is built, I want to be able to edit it!
  4. Let me opt-out at any time EASILY. If I’ve opted in, and I’m unhappy with the service, allow me to opt-out simply.  Having to set an opt-out cookie on my browser is absolutely and completely absurd.  I want to be able to fully opt-out at the ISP level, just once forever, not at the browser level every time cookies are deleted.  Make it easy and permanent, not easily deletable.
  5. Disclose who you sell my data to.  Like online list rentals, the next step in all this ISP profiling is selling the data to third-parties.  Let me know what you’re doing with my data - before you do it - so I can opt out or prevent it from being sold to parties to which I don’t want it being sold.

Consumers must be given a choice for preserving their privacy.  Anonymity to the “greatest extent possible” is not enough and neither are short-sighted opt-out cookies.  Companies like Phorm and Front Porch would be wise to apply these rules to regulate themselves.  Otherwise freedom-loving governments will almost certainly regulate them.

And I haven’t even mentioned the issues with net neutrality and deep packet inspection (i.e. traffic shaping and access restrictions, called “throttling” as Clint points out in the comment), have I?

Some More Thinking about Key Performance Indicators for Web Analytics

Web Analytics Key Performance Indicators (KPI’s) are critical for breaking through the dataglut spewing forth from your web analytics tool.   I mean there’s just a ton of data in web analytics, and the majority of it tends not to be very useful or applicable for improving your business performance.  While it’s wonderful to have a tool that lets you cut, cross, and slice loads of data every which way but loose, it can be a real challenge to frame the data or put it in context in a way that helps your business optimize the web site.   That’s why I like KPI’s - they identify meaningful, business-focused relationships in your analytics data.  By understanding KPI drivers, setting expectations for KPI performance, and analyzing your KPI’s toward defined goals for those KPI’s, you increase understanding of data, alleviate data confusion, and provide focus for the usage of your web analytics tool.

For those of you who don’t have a KPI strategy or who are just getting into analytics, an easy way to understand a KPI is to consider the example of when you are driving somewhere and trying to get there within a certain period of time.  If your goal is to drive 60 miles (kilometers, my European friends) in exactly 60 minutes, you know that you need to drive 60 miles per hour (or KPH).  If you go faster, you will arrive early; if you go slower, you won’t meet your goal and will arrive past your deadline.   So as you travel along the road, you measure the KPI of your speed. That’s what is important to measure on your trip.  Of course you may measure other KPI’s like the amount of fuel left or the miles you’ve traveled… those certainly may be KPI’s you measure.  But you definitely don’t need to measure your compression ratio or oil pressure even though it’s available data from your car.  In the same way, when you are looking at web analytics data, you don’t want to track everything, only those things that are important to your business performance toward goals. 

Several activities can assist the creation of KPI’s.  Here are a few of them:

  • Determine the Business Strategy.  Why is the company funding and developing an online mission?  What is the strategy?  KPI’s can help you figure out if it’s working.  To find the KPI’s that will help, the web analyst should be asking: how can web analytics be used to formulate, implement, and evaluate cross-functional decisions that will enable an organization to achieve its objectives? How will web analytics be used in the process of specifying the organization’s objectives, developing policies and plans to achieve these objectives, and allocating resources to implement those policies and plans?
  • Define the Site’s Goals and why the Site Exists.  I covered this in a post a few months ago.  An understanding of why your site exists enables you to effectively use online metrics.  You need to define the purpose of your site in order to create effective KPI’s.  Once you’ve defined your site’s purpose, you are positioned to examine Web data in a way that helps you determine whether your site delivers on its purpose - does it exist effectively?   Create your KPI’s, identify goals for your KPI’s, and track your performance against those goals.
  • Recognize Value Drivers.  How does the business make money on the site? Monetization, in cases where profitability is important, influences what you should be measuring.  If you run a media site, you probably make money from content consumption (the recency and frequency of content consumption), conversation (social media, such as contributions or comments), and conversion (the rate at which people complete certain value driving actions, like signing up for newsletters, rss feed, webcasts, print subscriptions, or downloading certain content types, like white papers).  So you create goals for and measure KPI performance around those value drivers.
  • Map Organizational Roles.  Classify your organization into audiences for your KPI’s based on what they do on your web site.  You may create KPI’s around the function or action of the actors who receive your KPI reports.  Function defines the group that KPI’s are focused on, such as product development or editorial.  Action defines what those people do on the site to make it successful.  By understanding the function and action of key actors on your sites, you gain insight into the type of data needed in KPI’s and the number of different KPI reports you may need to roll out.
  • Understand the Customer.  While KPI’s purely focused on internal function and actions are important, they also need to be customer focused.   If you think measuring conversion is important, while your customers tend to come to your site for informational or non-transactional purposes and then go elsewhere to convert, you may be disconnected from the reality of why your site exists.   Learn customer goals from VOC (voice of customer) data and by examining historic behavioral data of key segments.  Make sure you don’t create KPI’s that are vain or inane.  Instead create KPI’s that help you guide action internally so that your business meets the needs of your customers.

Framing your KPI development around the five bullet points I listed above will help you create KPI’s that assist your team in guiding business performance toward goals - while not forgetting to consider some of the core elements of online business: business strategy, site performance goals, value drivers, the human organization, and the customer. 

Now segment, segment, segment your KPI’s!

So What Else Does/Could a Web Analyst Do beyond Web Analysis?

Wow!  It’s been a few weeks since I’ve had any time to blogviate. 

What other things do web analysts do?  Besides blog and do WAA stuff… And ensure tool configuration/administration, data collection, data verification/validation, reporting, KPI generation, conversion optimization, deep site analysis, stakeholder guidance, outcomes evaluation and so on… Well the fun answer is “it depends” on things like your boss, the organization you work for and the holy org chart, your recognized skill set, and what you want to do.   But as I talk to my colleagues in the industry, I’ve noticed some web analysts do a lot of different things.  Here’s a few beyond the norms (or in some cases maybe part of the norm, but not often discussed):

  • Write business requirements.  You may be writing biz reqs for the extension and maintenance of your own tool, or you may be asked to participate in the definition of the metrics strategy for product or site features.  The analyst may define the attributes, capability, and characteristics that are necessary to accomplish given business objectives.  Generally these biz reqs will be functional (the system must do this in this way and look like this) and not technical (but every so often you may need to justify why you keep saying “ah, page tags, not logs” or vice-versa or packet sniffers or hybrid).  Fun!  And time consuming! 

  • Participate in product development and usability discussions.  A rich topic here for sure.  As web analysis sort of fractures into those who study how the site routes visitors, navigational elements, and information architecture, and into those who prepare AB and MV tests and report the results, it’s not uncommon for analysts to be called in to determine what should go where and what functionality should or should not exist on the site in order to drive business or conversion goals.

  • Contribute to the keyword set.  As I explained in my last post, web analytics is morphing into multichannel analytics.  Analysts are increasingly leveraged to participate in and analyze the outcomes of SEO and SEM.  Based on keyword data, I have a few friends who spend a ton of time selecting and managing the keyword portfolio and even the bids! 

  • Have a say in “strategy”.  Analysis informs tactical decision making, which is guided by strategy (and analysis and decision making and strategy again).  When fully leveraged, a web analyst has much to offer the strategic decision making process.  Think about something as simple as using referrers to establish content syndication and affiliate partnerships…  Cool.

  • Guide the content agenda.  For those who work in what my buddy, Alex Langshur (who runs a boutique consultancy in the public sector), calls “content-rich” and “mission driven” sites, the web analytics tool has utility as an editorial or content research tool.  From understanding what keywords/phrases are driving traffic to determining whether the editorial plan is actually mapped to the information demands of site visitors, web analysts can have a lot to say, if asked.  But be wary: the last thing an editor wants is some hot shot web jockey telling them what to write. That’s not what I’m saying to do; rather, some analysts work with content and editorial teams to ensure frequently demanded content topics are rounded out on the site, expanded on/developed, put on the content plan, or simply just known about, so the content folks can do what they do… 

  • Code. Yeah, some of us know how to do it, and many of us just don’t tell anybody.  Because “that’s not what I want to do anymore” as my friend who works at a local agency told me the other night.  My personal opinion is that code is better left to the coders, but any web analyst who can throw down with web development and talk about things like X-Forwarded-For headers will only make themselves more valuable to the organization.  Then again, some analysts would rather analyze data than futz around with overly esoteric tags and variables and the plumbing of web pages.  Then again some of us love that.

  • Direct IT.  Those of us fortunate enough to have control over our web analytics technology already know we’ll be spending perhaps inordinate amounts of time with our good buddies in IT.  They may be the audience for your business requirements, or you just may need to connect with them to ensure your technology is factored into the larger plan for next generation integrated, service oriented architectures.

  • Due diligence on acquisitions.   A fun one for you MBA’ers is when you get drafted into the acquisition or merger process, having to examine the target’s web traffic.  You gain real insight into the core of their web business, and may even find things, I’ve heard, like page view inflation from not filtering bots or from counting things like favicon.ico as page views.  Heh!
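
For the curious, here’s a rough Python sketch of the kind of sanity check that last bullet hints at: count raw requests, then recount after dropping obvious bots and non-page files like favicon.ico.  The record layout (dicts with `path` and `user_agent` keys), the bot tokens, and the list of file suffixes are all assumptions for illustration, and substring matching on the user agent is a crude heuristic, not how any particular tool filters robots.

```python
# Illustrative only: real bot lists are much longer and change constantly.
BOT_TOKENS = ("bot", "crawler", "spider", "slurp")
# Assumed set of requests that should not count as page views.
NON_PAGE_SUFFIXES = (".ico", ".gif", ".jpg", ".png", ".css", ".js")


def is_bot(user_agent):
    """Crude heuristic: flag user agents containing a known bot token."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def is_page(path):
    """Treat a request as a page unless it ends in a non-page suffix."""
    path = path.split("?", 1)[0].lower()  # ignore the query string
    return not path.endswith(NON_PAGE_SUFFIXES)


def count_page_views(requests):
    """Return (raw request count, count after filtering bots and assets)."""
    raw = len(requests)
    filtered = sum(
        1 for r in requests
        if is_page(r["path"]) and not is_bot(r["user_agent"])
    )
    return raw, filtered


requests = [
    {"path": "/index.html", "user_agent": "Mozilla/5.0 (Windows; U)"},
    {"path": "/favicon.ico", "user_agent": "Mozilla/5.0 (Windows; U)"},
    {"path": "/index.html", "user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"path": "/article?id=7", "user_agent": "Mozilla/5.0 (Macintosh; U)"},
]
raw, filtered = count_page_views(requests)
print(f"raw: {raw}, filtered: {filtered}, inflation: {raw / filtered:.1f}x")
```

In this toy sample half the “page views” disappear once the favicon request and the crawler hit are removed, which is exactly the sort of gap worth asking about during diligence.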

And more!  So yeah, it’s not all about spending all day just thinking about who comes to the site, why, what do they do, and do they complete their purpose according to specific goals.  While that is all a big and important part of it, the role of web analyst can go far beyond tradition, if you are capable and you work for the right business that lets you excel!


Thinking about Key Performance Indicators…

The infoglut in web analytics is enormous.  So much data.  Companies report that 69% of all people who consume the data don’t understand it.  How does a business go about making sense of it all?  Formulating a comprehensive KPI (Key Performance Indicator) strategy is a big part of differentiating signal from noise and directing appropriate tool usage.  We’ve all heard about KPI’s before.  They are ratios or derivatives of metrics that pinpoint critical, business relevant web performance.   My good friend, Eric, even wrote a book (a BIG one) about it. 

The process of moving an organization through KPI Change Management starts with a well formulated plan for doing so.  Here are some tips for formulating your KPI plan: 

  • Educate senior management and get managerial buy-in.  Education and buy-in can take shape via a number of methods.  Maybe you publish and circulate an internal-only white paper about the importance of KPI measurement.  Maybe you leave Eric’s book on the chair of your C-level executives.  Perhaps you hold a meeting and present the web site optimization process and how measurement via KPI’s provides the foundational information on which to make site optimization decisions.  Perhaps you take your boss out to lunch and explain that basic reporting and tool access is helpful, but “Web analytics is hard” and that KPI’s give context to the data for staff who are otherwise somewhat confused about what they pull from the tool.  You explain that KPI’s provide a focal point for centering analysis around business goals.  Whatever the method, the goal is managerial approval that “yes, you can do KPI’s.”
  • Determine the audience for the KPI’s and train them.  The importance of KPI’s will vary by stakeholder, and your KPI strategy needs to take that into account. Different segments of stakeholders will be interested in specific KPI’s, and you must accommodate that need.  As an analyst, you should identify the functional roles and job responsibilities of the people who are going to receive KPI reports.   Everyone may not be the right choice (though it could be), and it may make sense to concentrate a KPI rollout on the needs of the few or it may make sense to “go broad.”  Follow up with comprehensive training about your KPI project and how KPI’s can most effectively be used.
  • Start with simple, well-qualified, highly relevant KPI’s.  Some folks will want to throw a “kitchen sink” strategy at KPI’s.  That’s a mistake.  If you report more than 5 to 10 KPI’s (imho) per stakeholding group you may end up with a set of unworkable, confusing, and neglected reports.  It’s better to report just a few, well qualified, highly relevant KPI’s.  How do you qualify them? By mapping KPI’s to important business objectives.  How do you know they are highly-relevant? Because you’ve compelled management to buy-in and to agree that they are critical indicators of site success. 
  • Elicit the business goals for the KPI’s, compare KPI’s to goals, and report associated variances (i.e. deviations). Make sure you have determined business performance goals for KPI’s.  Goals give context for performance. It’s that simple.  Without goals, you have no context for determining what’s good and what’s bad.  If your conversion rate KPI is 5%.  Great!  So what though?  If you know your goal is 3%.  Awesome job.  If you know your goal is 10%.  Stop reading now, and get back to work - you have much work cut out for you. 
  • Identify the frequency and format for reporting.  You need to determine a frequency that is timely and sustainable, and the format in which you present KPI reports needs to be common enough that people can easily examine the data. Perhaps you deliver the reporting in Excel, make it available directly in your tool, use Xcelsius, or create reports using a BI tool. 
  • Automate the delivery of the reporting.  Without automation, you may put on the Report Monkey suit and enter Excel hell.  Critical to the successful rollout of any KPI reporting is an automation plan.  Do you email reports, put them in a shared directory, create a set of reports in the tool and provide access, or deliver them in weekly presentations?  The best choice is the option that gets people to use them, listen, and understand what you are trying to do with KPI’s.
  • Follow the reporting up with analysis and guidance.  Depending on the size and scale of your organization and the resources you have to work with, it may not be possible to provide every stakeholder with detailed analysis.  But you need to do your best to follow up KPI reporting with true analysis and guidance.  Why are KPI’s going up or down?  What are the drivers of the changes? 
  • Segment, segment, segment. Site level KPI’s are helpful in understanding overall audience and customer behavior, but they hide important details.  When you slice a KPI by a specific segment, you will realize insights that help you conclude what action to take next.  Overall site repeat visit rate is 37%, but the repeat visit rate for customers who use your “product lookup tool” is 96%.  What does that data indicate about how you market the site, or about why people are coming to the site? 
  • Test, test, test.  As you measure > report > analyze > guide based on KPI’s you will undoubtedly determine actions to take on the site.  You should be testing the hypothesis behind these actions via controlled experimentation.   
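
To put numbers on the “compare to goals” and “segment” tips, here’s a minimal Python sketch.  It treats a KPI as a plain ratio of two metrics, reports the variance against a goal, and then shows how a site-wide rate can hide a very different segment-level rate.  All of the counts are hypothetical, chosen only so the results line up with the percentages used in the bullets above, and the function names `rate` and `variance_vs_goal` are made up for illustration.

```python
def rate(numerator, denominator):
    """A KPI as a simple ratio of two metrics; 0.0 when there is no traffic."""
    return numerator / denominator if denominator else 0.0


def variance_vs_goal(actual, goal):
    """Relative deviation of a KPI from its goal (positive = ahead of goal)."""
    return (actual - goal) / goal


# Hypothetical site-wide counts for a conversion rate KPI.
visits, orders = 20_000, 1_000
conversion = rate(orders, visits)
for goal in (0.03, 0.10):
    variance = variance_vs_goal(conversion, goal)
    print(f"conversion {conversion:.1%} vs goal {goal:.0%}: {variance:+.0%}")

# Hypothetical segment counts: (visitors, repeat visitors).
segments = {
    "used product lookup tool": (500, 480),
    "everyone else": (9_500, 3_220),
}
total_visitors = sum(visitors for visitors, _ in segments.values())
total_repeat = sum(repeat for _, repeat in segments.values())
print(f"site-wide repeat visit rate: {rate(total_repeat, total_visitors):.0%}")
for name, (visitors, repeat) in segments.items():
    print(f"  {name}: {rate(repeat, visitors):.0%}")
```

The same 5% conversion rate is well ahead of a 3% goal and well behind a 10% goal, and the small lookup-tool segment behaves nothing like the site-wide average.  Relative variance is just one way to express the deviation; absolute percentage points work too, as long as the report says which one it uses.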

There’s obviously a lot more to talk about here - from what constitutes a good KPI, to what types of KPI’s different stakeholders should examine, to what are the best KPI’s for particular site types and more.  I guess there’s more blog posts for that, but in the meantime I hope you’ve found this blogviation useful.  Let me know if you have any thoughts to share.
