Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Director at a large multichannel media company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.

Archive for 'Due Diligence'

AVG LinkScanner Obfuscates User Agent!

AVG has obfuscated their user agent.  One of the current agents for customers of their free and paid tool now cloaks itself as IE6:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

This is in addition to the easily detectable user agent:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)

This news is not good.  If you filter the SV1 agent, you risk filtering legitimate traffic from the IE6 browser.  A few folks have commented to me that one should filter the user agent anyway, because 1) IE6 is in decline and 2) most IE6 users have .NET installed, which will show in the user agent.  Still, filtering it makes me a little uneasy.

Is this the death knell for log file analysis and services provided by ABCe (since they can’t filter this user agent either)?  Maybe it is.  AVG is touting that the agent sends no HTTP Accept-Encoding header, which is just dandy, but that information isn’t normally captured in logs.

So the current situation is this:

  1. AVG has two user agents.  Both are filterable, but the SV1 agent is problematic to filter because you risk filtering legitimate traffic.
  2. Both agents in the current version request gifs in noscript tags, inflating counts in page tag implementations with noscript configurations.  AVG claims they will fix this issue.
  3. The bot uses “mad” bandwidth.  I’ve heard stories of bandwidth increasing to 100x normal levels.  Some webmasters are serving dummy files to the recognizable user agents, some aren’t serving content to IE 6 browsers (crazy), and some are redirecting the bot back to AVG (thus inflating AVG’s bandwidth, LOL!).
  4. Evidence points to this bot NOT inflating clicks from paid search (i.e. PPC) and thus NOT committing click fraud.   But it doesn’t remain out of the realm of possibility that the scanner may be accessing an ad vendor click redirector and causing a click.  Not trying to spread FUD here, just making a point. 
  5. AVG is looking at the option of checking either an external db (hosted by AVG) or a local cache to verify that sites in SERP’s have been “scanned by AVG,” instead of repeatedly scanning sites every time they are listed in a SERP, to reduce the bandwidth issue and minimize fraudulent entries in log files.
  6. AVG is thinking about enabling white listing of sites, so they are skipped by the scanner.
  7. AVG is thinking about exposing a meta-tag that instructs the scanner to ignore the site.
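
For those processing their own logs, here is a minimal sketch of the filtering dilemma in item 1.  It uses only the two user agent strings quoted above; the labels and function names are mine (hypothetical), and exact string matching is an assumption, not something AVG has published.

```python
from collections import Counter

# Exact user agent strings quoted in this post.
AVG_1813 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"
AVG_SV1 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"


def classify_user_agent(user_agent: str) -> str:
    """Return 'avg', 'ambiguous', or 'other' (hypothetical labels)."""
    ua = user_agent.strip()
    if ua == AVG_1813:
        return "avg"        # the easily detectable LinkScanner agent
    if ua == AVG_SV1:
        return "ambiguous"  # identical to a plain IE6 string
    return "other"


def tally(user_agents):
    """Count how many requests fall into each bucket."""
    return Counter(classify_user_agent(ua) for ua in user_agents)
```

Because the match is exact, an IE6 string carrying extra tokens (such as a .NET CLR version) lands in "other" and is never filtered.  Only the bare SV1 string ends up "ambiguous," and whether to drop that bucket is the judgment call described above.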

Good luck with this nasty bot!  Interestingly, here’s how you smurf a site with the AVG LinkScanner. 

Update on AVG LinkScanner

Here’s the deal.  AVG LinkScanner neither executes javascript nor accepts cookies.  I had that confirmed by the Chief Research Officer at AVG, Roger Thompson. 

So why is the AVG user agent showing up in the data collected from certain page tag configurations?  The AVG LinkScanner currently requests gifs in noscript tags!

A best practice in web analytics page tag configuration is to use the noscript tag to serve the gif to non-javascript executing browsers.  Here’s some commonly seen (obscured) code for doing that:

<noscript>
<div><img alt="foo" id="bar" width="1" height="1"
src="http://foo.bar.com/xyzab57yw10000s1s8g0boozt_9t1x/foo.gif?baruri=/nojavascript&amp;xy.js=No&amp;xy.tv=1.2.3" /></div>
</noscript>

<noscript>
<img src="//foo.bar.com/xyz.gif?Log=1&amp;URL=/javascript_disabled"
border="0" width="1" height="1" />
</noscript>

<noscript>
<img src="http://pt.foobar.com/images/xyz.gif?js=0" height="1" width="1"
border="0" hspace="0" vspace="0" alt="" />
</noscript>

Thus, if you are using noscript tags in your page tag *and* someone with the AVG LinkScanner views a SERP (search engine results page) from Google/Yahoo/MSN that lists your site, the traffic from the LinkScanner will be counted. 

Of course the simple solution to fix this problem is to exclude the user agent: 

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)

If you don’t have full control over your page tag based web analytics implementation (i.e. hosted), you need to verify that your vendor has excluded this agent.   And you should have them audit your data going back to April, and refund/credit you any money.  Good luck with that though! :)

How big is the problem?  Well, it depends! :)

The amount of AVG traffic will vary dramatically by site.  Your site must show up in the SERP’s on computers of visitors that have AVG LinkScanner installed, and you must be using noscript tags to serve the gif.

I’ve made AVG aware of this issue.  And frankly, they’ve been a fantastic company to work with, so I’m sticking with them (for now ;).  First they allowed me to join a private Google group to discuss my findings, then both the Head of Global Communications and Chief Research Officer quickly responded to all my emails (good social media response), and now their engineers are looking into this issue so that they can fix it…  That’s a pretty impressive and quick response.  So cheers to them!

It’s worth mentioning that the LinkScanner isn’t _supposed_ to request images, so I do think this issue will get fixed.

Only time will tell whether or not AVG obfuscates the user agent so it looks just like a “normal” browser.  Let’s hope not! 

What I do find interesting is that I’m already hearing that an agent exists with the string Mozillia/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813).  Note the “ia” misspelling of Mozilla, as incorrectly documented here.  And it accepts cookies.  So AVG’s agent is already being spoofed.  Not good, not good.

Sunday Night Thinking on Mobile Analytics…

Mobile analytics for Internet-enabled wireless devices is a fairly hot topic for companies seeking to acquire customers, extend their brand, or expose content in “innovative” ways.  Obviously, the iPhone and Blackberry are pushing development in this area forward, but there really aren’t a lot of players in this space. 

Nedstat, CoreMetrics, and Omniture offer capabilities mixed into their current offerings.  Nedstat even carves out some mobile specific reporting.  You can gain some insight into mobile activity from companies that enable log file processing, like Unica and WebTrends, but be prepared to configure a bunch of filters to isolate the data.

Lesser known companies pushing mobile offerings include: Amethon, Mobilytics, Bango, TigTags, Xiti, and AdMob.  Some of these mobile players are even offering capabilities where they cross-sell analytics as an integrated part of their ad networks, content delivery  and transactional processing systems, marketing and barcoding services, and even as infrastructure or network appliances.

On the audience measurement side, we’ve seen comScore acquire M:Metrics, which was no surprise to me.

On the multivariate testing side, we see my friends at SiteSpect offering mobile MVT testing capabilities. 

And I’ll bet we see Google get into this space within the next 6 months…  I’d even wager an announcement at eMetrics DC…

From what I can gather, when we’re talking about “mobile analytics” we’re talking about “mobile browser” activity across a variety of handsets, not everything that happens on the device. 

Measurement issues in this area include:

  • Data Collection.  As many of you know, not all mobile browsers will execute javascript, and some cache images.  Thus, vendors offer us choices.  Folks like Mobilytics and Bango use an image-based data collection method, while Amethon offers a packet sniffer (they call it wireline detection), and we even have Omniture and Coremetrics talking about “no tag” implementations, which my good friend Phil Kemelor described on his CMS Watch blog (“To compensate, you need to stuff the image tag with query strings that will collect the data you require for reporting.”).  Then we have Unica and WebTrends with log files.  Interestingly, packet sniffing has some advantages here because some devices pass unique id’s (such as the phone number) in the HTTP header.
  • Unique visitor identification due to lack of cookie support and IP addresses changing.  IP addresses change, I’m told, as devices switch from tower to tower.   In addition, many mobile devices will take the IP address of the gateway, making all the devices look like the same “person.”  I’ve certainly seen evidence of the host changing pretty quickly during a mobile session.  Compounding the difficulty in assessing “uniqueness” is that not all mobile devices support cookies.  In web analytics, cookies are used to define uniqueness.  The fallback method when you can’t use a cookie is IP address/user agent.  If you can’t set cookies and the IP address and user agents are the same, how do you identify uniqueness?   However, when you can detect a unique value in the header, you can easily detect uniqueness.
  • Handset capability detection.  Does the device support WAP pushing, streaming video, ringtones, downloading video clips, and so on?
  • Phone and Manufacturer identification.  Databases from WURFL and DeviceAtlas can be used to identify phone and manufacturer device attributes.  Larger vendors are further behind on integrating this data into their current offerings, whereas the smaller niche players are making use of it. 
  • Screen resolution detection.  The Mobile Marketing Association’s (MMA) standards for the four “standard” screen sizes may carry enough weight to push this disdained piece of metrics trivia available from javascript based tagging in web analytics into a brighter spotlight.
  • Traffic source detection.  Capabilities for traffic sources seem rudimentary.  I don’t just want to know about search and direct entry.  I also want detection of sources from my marketing and advertising campaigns, RSS feeds, and email newsletters, if mobile visitors are coming in from those channels.   Interestingly, Bango solves the campaign tracking issue by pushing you to a Bango-specific URL.
  • Geographic identification.  Where are the visitors viewing your site coming from?  And what does the mobile audience environment “look like” in each country?  From this information you can extrapolate country-specifics for site optimization.  But not all devices enable geographic detection because the gateway’s IP address is used or the IP address from the network is used, not a GPS signal.
  • No standards.  There are few, if any, commonly supported mobile standards and no web data standards, so the problem is no standards for the devices and no standards for the tools.  There are no standards.  Did I mention that there are no standards? 
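
To make the unique-visitor bullet above concrete, here is a minimal sketch of the fallback chain: cookie first, then a unique header value if the device passes one, then IP address plus user agent.  The header name `X-Subscriber-Id`, the function name, and the ordering are my assumptions for illustration; real gateways and carriers differ.

```python
import hashlib
from typing import Mapping, Optional, Tuple

# Hypothetical header name; real gateways and carriers differ.
SUBSCRIBER_HEADER = "X-Subscriber-Id"


def visitor_key(headers: Mapping[str, str], ip: str,
                cookie_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (method, key) using the first identifier available."""
    if cookie_id:
        return ("cookie", cookie_id)
    subscriber = headers.get(SUBSCRIBER_HEADER)
    if subscriber:
        return ("header", subscriber)
    # Last resort: hash of IP address plus user agent.
    raw = f"{ip}|{headers.get('User-Agent', '')}"
    return ("ip_ua", hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16])
```

Two cookieless handsets behind the same gateway, sending the same user agent, get the same "ip_ua" key, which is exactly the same-“person” problem described above.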

So I was thinking, what would I want to see in a mobile analytics solution?  Allow me to riff here.

  • Dashboards for KPI and specific-metric reporting.  Views, visits, visitors, referrers, popular pages, traffic sources, resolutions, geography, time-based reporting and custom defined KPI’s….
  • Support for multiple data collection methods.  Logs, no-js image tags, and packet sniffers.  Let me pick what I need for whatever application fits my goals.
  • Support for mobile-specific constructs not present in historic web analytics data.  Manufacturers, operators, handsets, and device capabilities.
  • Advertising-based reports.  CTR, CPM, eCPM, that stuff…
  • Tracking for mobile downloads, installed applications, SMS, and MMS.  Seems like a no-brainer.
  • API’s.  Closed systems are dead ends for integrated marketing, so give me an API or enable pre-built integrations with other systems, like CRM.
  • Segmentation.  By country, by device, by network, by manufacturer, and so on.  It’s necessary.
  • Repeat or return visitor identification.  Simple measures of recency and frequency, core to media buying and planning and to site optimization, should be a data point available in mobile analytics.
  • Conversion and goal metrics.  Do visitors on mobile devices convert better, worse, the same?  Do they reach site goals?  Without tying performance data  and outcomes to mobile visitor activity, I’m left wondering…
  • Value scoring for engagement or proxy scoring for revenue and ROI analysis.  I want to be able to score attributes or actions to approximate an engagement score or to identify value or indicate revenue. 
  • Non-human traffic and web-browser based detection and reporting.  Mobile pages are full of links.  The ads are links.  Mobile vendors must support detecting, filtering, and reporting non-human and web-based agents separately from pure mobile agents - otherwise the mobile data gets muddled and skewed.
  • Data Export.  Must be able to export reports to Excel or Word, and email them.

So there’s a quick blogviation on Mobile.  Am I right, wrong, what did I miss?  Let me know…

A Few Thoughts After Another Awesome eMetrics….

Back from another excellent eMetrics.  I’m a very big fan of the eMetrics Marketing Optimization Summit…  Props go to Jim Sterne for growing this event from a little seed into an incredible, blogworthy blossom.  How involved is Jim in eMetrics?  I’d say he’s completely immersed in every little piece - he even came up to me at the SF WAW (way to go June D!) to find out about the renegade AV work I did in one of the sessions, and to get my take on how it could have been avoided.  He’s that intimately connected to what’s going on.  Macro and micro, micro and macro.  And when you have one of the best Internet Marketers in the world keeping a tight rein on the Clydesdale of conferences, you know you’re in for one heck of a fun ride. 

And so it was for 500+ of the top web analytics practitioners in the beautiful Palace Hotel.  Props to consummate conference organizers Matt Finlay and his crew at Rising Media for keeping the road smooth as we all trotted on it.  Fanny, you are one helpful polyglot of a marketing manager!  I never knew German keyboards were so wild… Thanks.

The eMetrics sessions were informative and actionable.  The lobby bar and after-hours parties were fun and enlightening.  You really can’t ask for more out of a conference.  As I flew home thinking back on it all, there was a lot to blog about, including:

  • It’s all about attitude, dude – as in attitudinal data.  Like my father says “it’s all about your attitude.”  And so it is on the Internet in 2008.  From ForeSeeResults, to iPerceptions, to OpinionLab, to CRMMetrix, the often missing link in customer analytics is attitudinal data.  I’m talking here about Voice of Customer (VOC) technology that allows you to ask a question set to site visitors and then apply some sort of algorithm or model to express the meaningfulness of the data in quantifiable terms.  From the American Customer Satisfaction Index to 4Q.  VOC technology enables you to participate in a continuous, automated dialog with your customers in order to identify problem points on your web site and enable you to measure purpose and success of your most valuable segments.  Expect to see some of the big players gobble up these smaller companies.  Omniture, Unica, WebTrends, and CoreMetrics should be thinking about acquisition in this space to round out their offerings.
  • Testing, 123… as in multivariate, MVT.  The rage is site optimization technologies beyond the simple A/B, champion/challenger test.  In this category you find folks like SiteSpect (the only non-intrusive multivariate testing solution!).  I’m a big fan of these guys (and was in 2006 long before they ever sponsored a WAW, thanks to a nice demo from Larry at my old job).  Eric Hansen and his crew have specialized software that you install in your data center.  No futzing with damned tags.  Swap out your variations, create different recipes, determine what’s statistically significant in giving you a lift to your macro or micro conversion goal, and you’re off to the races.  The good folks at Google are doing it and doing it well with Google Website Optimizer (thanks for the t-shirts!).  Interwoven is baking Optimost into the CMS, and Omniture has their Test and Target integrated with the Business Optimization Suite.  Accenture has MemetricsKefta too. And what ever happened to Verster?

In a nutshell, these technologies enable you to test variations of content themes, colors, creative, calls to action, points of resolution, buttons, navigational elements (whatever you want to call the stuff on the screen) to determine what combination performs best against your goals.  But of course, this is all just software, so don’t get too excited.  The tests are about as good as the people creating them…  And complex tests that take a long time to execute may not finish in time to matter.  Imagine 1-800-Flowers starting a test in January and not finishing until March, missing Valentine’s Day.  Or Intuit running a test beyond April 15th for a tax product.  Go humbly and carefully into this space, my friends, or you may end up optimizing for everyone and appealing to none.

  • Tying it all back to the dollar for profit-generating sites and to the mission of non-profit sites…  It seems like a “no, duh” moment but metrics for the sake of metrics can be a big waste of time.  If you can’t tie metrics or visitor actions back to value on a revenue-producing site or to the betterment of a non-profit site’s core mission, then what’s really the point of the measurement…  That’s why I’m a big fan of the stuff ZaaZ does.  They totally get how actionable metrics turn the wheel of Internet commerce and ad-based models, and they can model it all to prove out the ROI.  Folks like newly elected WAA Director Alex Langshur’s company Public InSite do similar stuff for content driven sites.  That is, they know how to use metrics to optimize the channel to goals, not to just puke confusing data, like most web analytics tools do.  Again, it’s all about the people you hire, not the tools you use… My good friend Avinash, right again!
  • The emergence and rise of deeply psychological and neuro-behavioral methods for automating persuasion and conversion.   Anyone who knows my good friend Joseph Carrabis, over at NextStage Evolution, knows that besides being one heck of a giant-kite-flying music master, he’s also got the models and the patents to help target and respond to human behavior across programmable devices.  We’re already seeing some companies, like Seven Billion Joe’s, er People, taking what he’s been saying for years and going to market with it.  The idea here being that if you can identify the affective, behavioral, and motivational drivers of site visitors, you can maximize cognition in elements on the site (like pictures, text, informational flow) to appeal to target segments and persuade/provoke desired behavior.  It’s like a higher rung on the optimization ladder.  It’s not test what they see, it’s figure out how they think, then make the site better because of it.  Cool stuff.  Blows my mind.
  • Integrated, multichannel marketing.  Just ask my good friend Akin Arikan, author of the newly released Multichannel Marketing.  (Disclaimer: I was a technical editor on the book.  It’s easy to do when you edit brilliance).  Make sure to check it out!  Marketing in general will become more Internet-centric, but will continue to clutch the roots of broadcast and print.  You will have the database marketer and statistical modelers working with a union of web channel and offline data.  What’s preventing it now?  The lack of a unified marketing database.  You see companies like Salford Systems circulating in this space.  And take a look at Unica’s blend of Enterprise Marketing Management…  I’d stay tuned to see what Unica has up their sleeve for bringing together online and offline.  When you can segment and target across online and offline campaigns, if I were a pure web channel player, like Omniture or CoreMetrics, I’d be a bit concerned that people are waking up to open systems, not closed black boxes.  WebTrends is already moving in this direction…  But they all remain far behind Unica when it comes to multichannel marketing.

And that’s just a few of the things the phenomenal eMetrics got me thinking about…  I hope to see you in Washington DC in October! 

What Questions would you ask “the experts” about Web Analytics and Audience Measurement?

Next Sunday afternoon I am moderating a panel at eMetrics San Fran.  The panel is called “Web Analytics -vs- Audience Measurement.”  This panel idea was the brainchild of Andrea Hadley at NetSetGo (and yes that is her picture on her site :).  In fact, I was a panelist on the same panel at eMetrics Toronto, filling in for my friend Marshall Sponder.  Since he’s going to be in San Fran, I yielded my seat on the panel and decided to stand up at the podium.   Other panelists include Jodi McDermott, Director of Product Management at ClearSpring, and some other surprise guests (from comScore and IAB maybe)… You’ll have to show up and find out… :)

The panel description is as follows:

Are you confused about the number of customers visiting your website? Are the metrics reported by your web analytics tool different from the metrics reported by your online media, or by audience measurement organizations? The WAA invites eMetrics Marketing Optimization Summit attendees and the local San Francisco business community of web marketers, publishers and agencies to attend this community meeting. A panel of experts will discuss the value of the metrics, methods and tools used by web analytics practitioners, online advertising media and audience measurement organizations. Find out how to use these metrics and tools to better understand your customers, your website’s competitive standing and overall website value.

The goals for this panel include:

  • Adding clarity around the tools and data associated with each set of technology and metrics - web analytics technologies and website data, ad servers and ad data, and audience measurement tools and data.
  • Learning how each data source can be used to expand our understanding of customers, how effective our website is as a business channel, the website’s competitive standing and value, and so on.
  • Providing insight into the role of the web analytics practitioner and how this role is growing in importance and influence over business, marketing, product, and strategic decisions.
  • Discussing the role of the Web Analytics Association (WAA) and how the WAA serves the practitioner.  The WAA is an unbiased organization that doesn’t serve advertisers, publishers, or technology vendors; rather, it serves and exists for the benefit and betterment of the practitioner and the web marketer/strategist.
  • Articulating the announcement made at eMetrics Toronto on the important collaboration between the IAB and the WAA for standards review.

My goal as the moderator is not to critique, demean, or criticize audience measurement, Internet advertising technologies, or to embellish or hype up web analytics tools.  Rather I hope to clarify the differences between the technologies and speak about the value they hold together - like I did in my article for MediaPost called the Yin and Yang of Online Metrics.

So why am I telling you all of this on my blog???  Well it’s because I really want your help, whether you are going to eMetrics or not…  Since I’m the moderator, I get to ask the questions, and I don’t want to just ask “my” questions, I want to know what questions YOU would ask if you had the chance to ask.  Of course, those of you reading this and attending the panel will be given the microphone if you raise your hand.

Please help me crowdsource by telling me in comments or via email to judah (at) webanalyticsdemystified.com:

What questions would you ask to clarify the differences and value between web analytics and audience measurement tools?

Any questions you think worth asking, from the simple “why don’t the numbers match?” to the more complex “what are the differences between audience measurement and web analytics systems in terms of data collection?”, would be awesome and appreciated.  Thanks in advance for your help!  I’m eager to see if this social media experiment in blog-based crowdsourcing actually works! :)

So What Else Does/Could a Web Analyst Do beyond Web Analysis?

Wow!  It’s been a few weeks since I’ve had any time to blogviate. 

What other things do web analysts do?  Besides blog and do WAA stuff… And ensure tool configuration/administration, data collection, data verification/validation, reporting, KPI generation, conversion optimization, deep site analysis, stakeholder guidance, outcomes evaluation and so on… Well the fun answer is “it depends” on things like your boss, the organization you work for and the holy org chart, your recognized skill set, and what you want to do.   But as I talk to my colleagues in the industry, I’ve noticed some web analysts do a lot of different things.  Here’s a few beyond the norms (or in some cases maybe part of the norm, but not often discussed):

  • Write business requirements.  You may be writing biz reqs for the extension and maintenance of your own tool, or you may be asked to participate in the definition of the metrics strategy for product or site features.  The analyst may define the attributes, capability, and characteristics that are necessary to accomplish given business objectives.  Generally these biz reqs will be functional (the system must do this in this way and look like this) and not technical (but every so often you may need to justify why you keep saying “ah, page tags, not logs” or vice-versa or packet sniffers or hybrid).  Fun!  And time consuming! 

  • Participate in product development and usability discussions.  A rich topic here for sure.  As web analysis sort of fractures into those who study how the site routes visitors, navigational elements, and information architecture, and those who prepare AB and MV tests and report the results, it’s not uncommon for analysts to be called in to determine what should go where and what functionality should or should not exist on the site in order to drive business or conversion goals.

  • Contribute to the keyword set.  As I explained in my last post, web analytics is morphing into multichannel analytics.  Analysts are increasingly leveraged to participate in and analyze the outcomes of SEO and SEM.  I have a few friends who, based on keyword data, spend a ton of time selecting and managing the keyword portfolio and even the bids! 

  • Have a say in “strategy”.  Analysis informs tactical decision making, which is guided by strategy (and analysis and decision making and strategy again).  When fully leveraged, a web analyst has much to offer the strategic decision making process.  Think about something as simple as using referrers to establish content syndication and affiliate partnerships…  Cool.

  • Guide the content agenda.  For those who work in what my buddy, Alex Langshur (who runs a boutique consultancy in the public sector), calls “content-rich” and “mission driven” sites, the web analytics tool has utility as an editorial or content research tool.  From understanding what keywords/phrases are driving traffic to determining whether the editorial plan is actually mapped to the information demands of site visitors, web analysts can have a lot to say, if asked.  But be wary, the last thing an editor wants is some hot shot web jockey telling them what to write.  That’s not what I’m saying to do; rather, some analysts work with content and editorial teams to ensure frequently demanded content topics are rounded out on the site, expanded on/developed, put on the content plan, or simply just known about, so the content folks can do what they do… 

  • Code. Yeah, some of us know how to do it, and many of us just don’t tell anybody.  Because “that’s not what I want to do anymore” as my friend who works at a local agency told me the other night.  My personal opinion is that code is better left to the coders, but any web analyst who can throw down with web development and talk about things like X-Forwarded-For headers will only make themselves more valuable to the organization.  Then again, some analysts would rather analyze data than futz around with overly esoteric tags and variables and the plumbing of web pages.  Then again some of us love that.

  • Direct IT.  Those of us fortunate enough to have control over our web analytics technology already know we’ll be spending perhaps inordinate amounts of time with our good buddies in IT.  They may be the audience for your business requirements, or you just may need to connect with them to ensure your technology is factored into the larger plan for next generation integrated, service oriented architectures.

  • Due diligence on acquisitions.   A fun one for you MBA’ers is when you get drafted into the acquisition or merger process, having to examine the target’s web traffic.  You gain real insight into the core of their web business, and may even find things, I’ve heard, like page view inflation from not filtering bots or from counting requests for things like favicon.ico as page views.  Heh!

And more!  So yeah, it’s not all about spending all day just thinking about who comes to the site, why, what do they do, and do they complete their purpose according to specific goals.  While that is all a big and important part of it, the role of web analyst can go far beyond tradition, if you are capable and you work for the right business that lets you excel!

Why Does Your Site Exist?

That’s the first question to answer when determining strategy for using online metrics.  You should be able to answer in 10 seconds.  If you don’t know, or if key stakeholders can’t agree on your site’s purpose, then you are unable to use online metrics efficiently.  And, worse yet, you are missing chances for improving your business performance. 

Your web site exists for a purpose, perhaps multiple purposes, such as:

  • Providing information or data.  Many sites entice people to visit for access to valuable, differentiated information or data.  Traffic is then monetized primarily through site advertising.  Many internal and external analytics packages will tell you where visitors come from and what they do onsite, which, when combined with demographic information, can be used to qualify a specific audience to an advertiser.
  • Generating leads.  A content asset is placed on a site and gated using a form.  People fill out the form and download the asset.  The information captured in the form is stored and used by the company that generated the leads or profitably sold to another company.
  • Selling products.  The typical ecommerce model involves acquiring customers via some method or offer, providing a product catalog or landing page, and creating a strong call to action and funnel that persuades people to purchase a product.
  • Connecting people.  The explosion of social networking sites where people connect to other people, interact with each other, and use widgets, apps, and data services is a modern phenomenon in which many of us participate. 

Understanding why your site exists enables you to effectively use online metrics.  Once you’ve defined your site’s purpose, you are positioned to examine web data in a way that helps you determine whether your site delivers on its purpose – does it effectively exist? 

Metrics and ratios that help you assess if your site fulfills its purpose are called Key Performance Indicators (KPI’s) – see Eric Peterson’s Big Book of KPI’s for a detailed review of the topic:

  • For information or data driven sites, you may want to look at KPI’s that measure goal or task completion and conversion rates.  For example, if your site’s purpose is to expose video content to an audience, then a relevant KPI would be the percentage of all visitors that streamed a video or the number of streams per visit. 
  • For lead generation sites, a key KPI you will track is the lead conversion rate.  In other words, of all the visitors that came to your site, what percentage successfully filled out a form and generated a lead?
  • For ecommerce sites, a key KPI that you might track is average order value, which is how much money the average visitor who purchases a product spends on a single transaction.
  • For social networking sites, you may want to measure the average time between visits (latency) and the repeat visitor rate. 
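
To make a few of these KPI’s concrete, here is a minimal Python sketch.  The `Visit` record and its fields are hypothetical stand-ins for whatever your analytics tool exports, and the definitions (visitor-based conversion, order value counted per purchasing visit) are assumptions you should match to your own tool’s definitions.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical visit record; field names are illustrative only."""
    visitor_id: str
    leads: int = 0             # forms completed during the visit
    order_total: float = 0.0   # revenue from the visit, 0 if no purchase


def lead_conversion_rate(visits):
    """Share of unique visitors who generated at least one lead."""
    visitors = {v.visitor_id for v in visits}
    converters = {v.visitor_id for v in visits if v.leads > 0}
    return len(converters) / len(visitors) if visitors else 0.0


def average_order_value(visits):
    """Mean revenue per transaction, counting only visits with a purchase."""
    orders = [v.order_total for v in visits if v.order_total > 0]
    return sum(orders) / len(orders) if orders else 0.0


def repeat_visitor_rate(visits):
    """Share of unique visitors who made more than one visit."""
    counts = {}
    for v in visits:
        counts[v.visitor_id] = counts.get(v.visitor_id, 0) + 1
    repeaters = sum(1 for n in counts.values() if n > 1)
    return repeaters / len(counts) if counts else 0.0
```

Each function returns 0.0 for an empty input rather than raising, which keeps a dashboard from breaking on a day with no traffic.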

But here’s the challenge with KPI’s: they are all academic, unless you have business goals for KPI’s.  KPI’s help you track progress toward predefined business goals.  What are the business goals associated with your site’s purpose?  For your informational site, what’s the goal for video streams per visit or time spent?  For your lead generation site, what’s the goal for the lead conversion rate?  By comparing business goals for KPI’s to actual KPI’s, you can begin to answer the question: “is my site successfully existing and fulfilling its purpose?”

You will continue to answer that question by segmenting your KPI’s, investigating distributions beyond averages, and using other techniques for data analysis.  You may ask: do certain referring sites have a higher lead generation conversion rate than other referring sites, and why?  Do certain audience segments spend more time on site?  If so, where do they go on the site and what do they do?  If my goal for average time between visits (latency) to my site is five days, and certain customer segments haven’t visited in ten days (recency), what does that indicate about current business performance?
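
As a rough illustration of segmenting a KPI and comparing it to a business goal, the Python sketch below breaks a conversion rate out by referring site.  The `(referrer, converted)` input shape and the function names are assumptions for illustration, and the rate here is per visit rather than per visitor.

```python
from collections import defaultdict


def conversion_rate_by_referrer(visits):
    """visits: iterable of (referrer, converted) tuples, one per visit."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for referrer, converted in visits:
        totals[referrer] += 1
        if converted:
            conversions[referrer] += 1
    return {ref: conversions[ref] / totals[ref] for ref in totals}


def segments_below_goal(rates, goal):
    """Referrers whose conversion rate falls short of the business goal."""
    return sorted(ref for ref, rate in rates.items() if rate < goal)
```

The segments that come back from `segments_below_goal` are where the “and why?” investigation starts.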

By defining why your site exists, creating KPI’s based on your site’s purpose, establishing business goals for KPI’s, and investigating what’s driving those KPI’s, you can enhance your online business performance in a way that increases bottom-line profit – from optimizing user experience and landing pages, to more efficiently allocating your marketing budget, to improving your product mix, and much more.

Web Analytics Data Collection for Beginners

I’ll get back to talking about the web analytics team soon, but I’ve been getting a few emails from folks just starting out who are a bit confused about data collection.  So I figured I’d blog about it…

When web analysts talk about data collection, they are referring to the method by which counts and measures of things, like page views and durations, are captured by a web analytics tool.  If you’re new to web analytics, data collection can be slightly confusing.  There are three “generally-accepted” methods for data collection in the web analytics industry: 

  • Page tags.  Client-side data collection involves using little snippets of HTML code that reference a JS file and communicate via a beacon to a “page tag server” - the machine that collects the data so it can be sessionized by the web analytics tool (it may not be called that by your vendor).  As a web analyst, if you are using page tags you will have lots of fun tagging every page on your web site and instrumenting the tags with custom variables and campaign codes.  Reasons why people like page tags are numerous, and include the fact that they are fairly efficient in filtering out non-human traffic (as long as the robot doesn’t execute javascript) and can count proxy cached pages (improving accuracy). Page tags are probably the most ubiquitous method for collecting web data today.
  • Log files. Server-side data collection involves parsing text-based log files generated by Web servers.  The server, when instructed to do so, logs every request received from clients in a file called the “log file.”  There are many formats for log files.   Each line in a log file is called a “hit” and contains lots of different stuff: the IP address, a request date/time stamp, the item requested, user agent, referrer, and more.  Many “hits” make up a single page view - that’s why it’s incorrect to use the term “hits” to refer to “page views.”  As a web analyst you will be defining the format of the log file within your tool and moving and synchronizing log files so that they can be processed by your tool.  Some people will claim log file analysis is dated (historic may be more appropriate), or less accurate than page tags (due to caching issues).  Other people like logs because they can reprocess their data.
  • Packet sniffers.  Network data collection involves deploying either software or hardware that intercepts and logs traffic coming over a network.  Every packet is captured and decoded according to a configuration you define.  Your web analytics tool can be configured to process the data captured and decoded by the sniffer.  Packet sniffers are a less common approach for data collection by web analytics vendors.  
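
To show why “hits” and “page views” are different counts, here is a small Python sketch that parses log lines and separates the two.  It assumes the common NCSA Combined Log Format, and the list of asset extensions is a hypothetical rule of thumb; your server’s format and your tool’s page view definition may well differ.

```python
import re

# Assumes NCSA Combined Log Format; real servers may be configured differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical rule: requests for these file types are assets, not pages.
ASSET_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def parse_hit(line):
    """Return the fields of one log line as a dict, or None if it doesn't parse."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_page_view(hit):
    """A successful request for something that is not a page asset."""
    path = hit["path"].split("?", 1)[0].lower()
    return hit["status"] == "200" and not path.endswith(ASSET_EXTENSIONS)


def count_hits_and_page_views(lines):
    """Count every parseable line as a hit and a subset of them as page views."""
    hits = [h for h in map(parse_hit, lines) if h is not None]
    return len(hits), sum(1 for h in hits if is_page_view(h))
```

A single page that loads one image and one stylesheet produces three hits but only one page view under this rule.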

Interestingly some vendors offer “hybrid” data collection, which combines multiple data collection methods.  This mode could be considered a “fourth type” of data collection.  Most commonly hybrid data collection means using logs and page tags to collect different data elements, but other combinations are possible as well. 

As you investigate the best data collection method for your implementation ensure you deeply consider the pros and cons of each method.   For example page tags capture information about the browser (like screen resolution) that logs just can’t.  But what about if you need to measure non-javascript executing clients, like some mobile devices?  Log files capture information about crawlers (i.e. robotic traffic) that page tags just can’t.  But can you adequately filter robotic traffic and maintain host exclusions?  Packet sniffers capture pretty much everything, but can be challenging to customize to your exact data needs (and you’ll need a fair amount of IT support). 

Which one is correct for your implementation?  It depends on your business goals defining what you need to measure…  

Thinking about Measuring Internet Video?

Every month I write a column for MediaPost’s Metrics Insider.  This month I wanted to tackle my evolving take on Internet video measurement.  Very few companies offer solutions in this space.  Only a few are really differentiated.  Check out Visible Measures, NedStat, TubeMogul, Divinity Metrics, and the usual suspects, Omniture, Unica, WebTrends, ComScore, and Nielsen NetRatings.

Here’s my column:

IN LATE 2007, THE DIGITAL Video Barometer Executive Survey indicated that more than 80% of media and entertainment executives believe tracking, measuring, and monitoring Internet video content is critical to bottom-line profit.  That’s not surprising. Accurate measurement informs decision-making and improves business performance, and Internet video is more mainstream and popular than ever before.  What may be surprising to those executives is that technology for measuring Internet video generally focuses on video content served on-site, not off-site.  It’s fairly straightforward for a Web analytics tool to tell you how people are consuming and interacting with on-site video, but consumption and interaction of videos distributed across multiple sites, perhaps virally or via social media campaigning, aren’t directly measurable by Web analytics tools.  Panel-based technologies can approximate certain off-site measures of video consumption and distribution, but don’t provide very deep on-site metrics. Measurements of Internet video consumption, interaction, and distribution may be categorized as follows:

  • Instream measurement.  Refers to measuring the video itself and the various events and behaviors that occur during a video viewing experience, such as time-based duration metrics and interaction and behavioral metrics (for example, the number of stops, plays, pauses, rewinds, fast-forwards, sites that posted or syndicated the video, clicks on hotspots and social media features).
  • Outstream measurement.  Refers to measuring the content environment and user experience surrounding the video on the site or in the skin, such as the conversion metrics (percentage of visitors downloading or viewing a video), source metrics (referrers to the video page, players used), and content metrics (percentage of videos viewed by topic, percentage of videos viewed by file type).

Those categories form a framework for Key Performance Indicators (KPI’s) that help to identify how people interact with videos, how videos perform when compared to other videos, and against pre-defined business goals.  Analysis of KPIs enables video content to be tailored to maximize performance.  Example KPI’s include:

Instream KPI’s:

  • Percent high, medium, and low duration video views
  • Average viewing time per video
  • Percent visitors who complete the video
  • Percent visitors that stop the video within 10 seconds
  • Percent visits when this video was the last video viewed
  • Percent visits when this video was the first video viewed

Outstream KPI’s:

  • Conversion rates by video, topic, channel, taxonomy node, referrer, geography, keyword, and so on
  • Average video views per visit
  • Percent visits/views from different channels (such as email/rss, organic search, paid search, direct)
  • Average time between visits that include a video view
  • Repeat visit rate for visits involving a video view or download
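
As a rough sketch of how a few of the instream KPI’s could be computed, the Python below works from per-view records.  The `(seconds_watched, video_length_seconds)` input shape is an assumption for illustration, and the percentages are per video view rather than per visitor, so treat it as a starting point rather than the way any particular tool calculates them.

```python
def instream_kpis(views, early_stop_seconds=10):
    """views: list of (seconds_watched, video_length_seconds), one per video view."""
    if not views:
        return {}
    total = len(views)
    # A view counts as complete when the viewer watched the full length.
    completed = sum(1 for watched, length in views if watched >= length)
    # A view counts as an early stop when it ended within the threshold.
    early_stops = sum(1 for watched, _ in views if watched <= early_stop_seconds)
    return {
        "average_viewing_time": sum(watched for watched, _ in views) / total,
        "percent_completed": 100.0 * completed / total,
        "percent_stopped_early": 100.0 * early_stops / total,
    }
```

Running the same function per video, topic, or channel gives the comparison between videos that the framework above calls for.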

These KPIs are measurable using a Web analytics tool, and perhaps a few of them are possible using traditional panel-based measurement.  But if off-site video distribution creates a whole new set of challenges to using current analytics and audience measurement tools to track instream and outstream metrics and KPIs, what are publishers and advertisers to do?  It’s a business problem that demands a new technology solution for understanding audience behavior, consumption, and distribution patterns of off-site syndicated or viral video content.

So what would a new technology solution for measuring Internet video and audience behavior do?  First it would have to fill the gap between panel and census-based measurement systems in a way that helps both publishers and advertisers  – not just one or the other — understand audience reach, frequency, and behavior.  The technology must enable tracking and actionable reporting and dashboarding of key metrics and KPIs, distribution patterns, behaviors, and interactions regardless of where the video “goes” on the Internet.  Audience characteristics from external databases (like OpenID for example) and internal company databases (like subscription and registration dbs) should be able to be integrated with data collected about behavior, video metadata, and instream and outstream metrics. 

If measuring digital video is as important as eight out of 10 media and entertainment executives believe it to be, there are some huge money-making opportunities on the horizon — for companies that are already providing technology for tackling this emerging business need, for advertisers using Internet video to drive awareness and response, and for measurement professionals who can help make sense of the Internet video ecosystem, solve measurement challenges, identify significant business opportunities, and use video metrics to improve business performance.  We’re certainly at the beginning of the J-curve for Internet video measurement for both publishers and advertisers.  After all, Forrester predicts Internet video advertising spend to increase from $471 million last year to $7.1 billion in 2012. 

Web Analytics needs IT and the Business needs Web Analytics

I’ve been so busy folks, I’ve had no time to blog, so forgive me for my two week hiatus.   

The classic problem of “marketing versus IT” is real.  If you are lucky and work with an excellent IT team (like me!), this problem will be minimal, if it exists at all.  But in most cases, based on what I hear from my industry colleagues, the analytics team struggles to get sufficient IT resources delegated to supporting a web analytics implementation and program.

The classic problem goes something like this:

  1. Marketing:  We need advanced customizations, deep integrations, increased scalability, better performance, and more control overall over Web Analytics.
  2. IT: We don’t have resources, time, or budget to help you right now.  Fill out these forms and in the future maybe we can help.

In a nutshell, this is one of the reasons why hosted solutions exist (SaaS, ASP, on-demand, whatever).  While it’s hard to do web analytics, it’s even harder to do it internally using actual software that you run.

Wouldn’t we prefer it to go something like this:

  1. Marketing: We need advanced customizations, deep integrations, increased scalability, better performance, and more control overall over Web Analytics.
  2. IT: Yes.  Can do.  Will do.  What do you need and when do you need it by?

My belief is that to “do web analytics” the right way, you need an allocation of IT resources to support your implementation and extend it to fulfill strategy and improve business performance.   After all, I firmly believe web analytics is for optimizing business performance, guiding strategy, and supporting tactical decisions.   And to do all that, you need resources when you need them.  The larger your site or portfolio of sites, the more resources you need.  It’s all pretty logical.  Getting back to IT, if you’re using a hosted solution, you need fewer IT resources.  The vendor takes care of a lot of IT stuff.  If you are running your analytics in-house, you need a team of IT resources because you will be doing it all yourself.  

I would prefer those technical resources report into Web Analytics, but I’m not sure if the general business world (as in non-Internet companies) sees the ROI of Web Analytics clearly enough to immediately delegate a full-time “mini IT” team to support analytics at phase zero (i.e. when you first get hired and plan the rollout).  And that’s why you need to be very wary of what vendors tell you about IT requirements and web analytics. 

If management expects that you just need to tag the pages and you the analyst can do that yourself, your company will be in for a surprise.  It’s never that simple.  Smaller companies with one or a few sites that use the same technology may be able to pull off the solo cowboy analyst including tags and doing all the tech work.  Google has made that fairly easy.  But larger companies that have many sites and many different technologies serving those sites are a much different animal.

My advice: don’t be fooled by vendor messaging that claims “you don’t need IT.”  That’s bull$4!+.  Marketers can’t do Web Analytics alone and in isolation.  You will need IT to help you extend your web analytics solution.   And as I’ve already stated, the level at which you need IT will vary depending on how you “do” web analytics.  It differs greatly if you are running an in-house proprietary solution, an internal vendor solution, or a hosted solution.

If you are doing web analytics using a proprietary solution you created internally, you probably already understand what I mean when I say “web analytics needs IT.”  Chances are you are using an OLAP-based solution that has huge BI infrastructure behind it and the cubes contain latent information.  Your data model may be limited compared to the major vendors.  Your tool may be overly complex, hard for business users to use, and limited in terms of features, or it may be the coolest thing since sliced bread, and the people who created it may know more than the vendors.  Still, unless resources are adequately delegated to supporting analytics and extending the implementation, your tool users and report consumers will make thousands of requests to IT, and those requests will go unfulfilled, leading to user frustration.

If you are running an in-house software solution, such as that provided by Unica, WebTrends or Visual Sciences, you will rely on IT for all sorts of things, like hardware and software maintenance, database administration, network support, and will need to leverage help desk and ticketing systems.  In addition, web analytics projects become part of the IT project planning cycle with budget requests and consideration.

If you are having your web analytics tool hosted, IT may be the ones who actually put the tags you field on your web site.  Modifications to any javascript may need to be done by IT.  You will need to reach out to IT for help with setting up cookies, changing the DNS, and writing any code that assists with web analytics.  “Change management” will be required. ;)

If a business wants to succeed with Web Analytics, it must determine how to effectively resource the implementation and ongoing extension of an analytics platform.  Here are some tips for ensuring you get the resources you need:

  • Factor web analytics resource needs into the capital budgeting and yearly planning process.  Business stakeholders must identify the IT resources they need in advance, and then align the IT team according to business goals.  Resources must be allocated according to financial guidelines that maintain corporate profitability. 
  • Document your web analytics projects and business requirements and share the documentation with IT.  Whether your web analytics projects are related to implementation, campaign optimization, data description, or integration, you need to share that information with IT so they can determine how to support analytics.
  • Identify and document why you need IT resources.   In other words, identify and document what IT will be doing for web analytics and how their work is necessary for improving corporate performance.  On the business side, explain that you won’t be able to fulfill X business goal without IT resources.
  • Leverage a project manager.  Project managers are critical to cross-functional team success.  They focus work, align people, determine tasks, monitor completion, and allow a multifaceted team of business marketers and IT to do what they do without worrying about managing the project.
  • Share your analytics success with IT and let stakeholders know how IT has helped you.  Often times corporations forget that these very talented IT folks are working really hard behind the scenes, often without getting much (or any) credit for the complex work they do.  When you have an analytics success, share it with the folks that helped you tag the pages or configure your servers.  When people are singing your praises in the cafeteria because they now have the data they need to do their jobs and/or you’ve improved their business performance, let them know IT backed you up and helped you deliver.  There’s enough glory to go around.

If you do what I’m saying in this blogviation, the problem of “marketing versus IT” will be minimized.  IT will be able to keep up with all of your constantly-evolving business requirements and the dynamic, high-maintenance nature of your web analytics program.   And your marketing department, business stakeholders, and executive team will be very happy with the results.
