Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Director at a large multichannel media company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.


Archive for 'Audience Measurement'

Performance, Performance, Performance

From an article I wrote for MediaPost a few weeks ago:

Reach and frequency are the core concepts of traditional media planning and advertising.  For a given site, program, channel, radio station, billboard, or newspaper section, a target audience (the reach) is exposed to a certain number of occurrences of the media (the frequency).  On the web, these concepts manifest themselves in metrics collected and reported from a number of recognizable services.  From audience measurement firms, like comScore and Nielsen, to web analytics firms, like Omniture and Unica, to companies somewhere in between, like Quantcast and Google, all have reach and frequency data.  Many new media metrics can be used to proxy frequency, from time-based measures, espoused by audience measurement firms, to concepts like visitor retention or the repeat visitor rate cited by web analytics firms.  On the reach side, companies refer to concepts like “unique visitors.”
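To make the two terms concrete, here is a minimal sketch of how reach and average frequency could be derived from a log of ad exposures.  The visitor IDs and the `reach_and_frequency` function are made up for illustration; real tools identify visitors with cookies or panels, with all the dirtiness that implies.

```python
from collections import Counter

# Hypothetical exposure log: one visitor ID per ad impression served.
impressions = ["v1", "v2", "v1", "v3", "v1", "v2"]


def reach_and_frequency(visitor_ids):
    """Return (reach, average frequency) for a list of exposure records.

    Reach is the number of distinct visitors exposed; frequency is the
    average number of exposures per reached visitor.
    """
    exposures = Counter(visitor_ids)
    reach = len(exposures)
    frequency = len(visitor_ids) / reach if reach else 0.0
    return reach, frequency


reach, frequency = reach_and_frequency(impressions)
print(f"reach={reach}, average frequency={frequency:.1f}")
```

Here three distinct visitors account for six impressions, so the reach is 3 and the average frequency is 2.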

These data, whether available in free tools or in for-pay tools, are certainly helpful for planning campaigns.  But reach measures can be dirty (cookies, unduplicated unique users, estimates from panels, coverage error).  Frequency measures can be just as dirty (problems recording time in single-page visits or on the last page of a visit, whether page views really matter with AJAX and rich media, cookies again, and so on).  We all are aware of the challenges.

Thus using basic reach and frequency measures for planning or evaluating a campaign does not suffice.   So advertisers and agencies target demographics, like gender, age, income, education, and job title.  It’s a given that advertising in the Robb Report reaches a different audience segment than advertising in Popular Mechanics. 

These brave new days we have “behavioral” tracking too.  By taking into account visitor activity across sessions, such as past actions taken on a site or a roster of previous purchases, we can attempt to deduce what a person or segment responds to or is interested in based on their behavior.

Even with reach, frequency, demographics, and behavioral data to help guide advertising and media buying, we are missing an important attribute for maximizing the potential success of our campaigns.  We do not have an available tool, whether free or paid, for advertising or buying media on or across sites according to measures of past performance.  Such measures include ad clickthrough rates, conversion rates, goal completion rates, delivered impressions, and perhaps even harder to quantify financial measures such as ROI, ROAS, and ROMI.

Sure, historic, tacit knowledge of campaign performance exists and is used by agencies or publishers.  However, there is no shared industry source that can help us answer “how has a site for display advertisement historically performed toward goals based on the reach, frequency, demographic and behavior of its audience segments?”  Interestingly, a company minting money right now, named Google, can masterfully demonstrate performance in paid search campaigning and help advertisers unify it with segmented reach, frequency, and demographics.

Outcomes-based performance measurement unified with reach, frequency, demographics, and behavior is what is missing in audience measurement tools, not frequently reported externally by web analytics tools or ad serving tools, and not available in ad planning tools.  When advertisers can target display ads, or even video ads, to desired audience segments by reach, frequency, demographics, and behavior in the context of known performance, media planning will be more effective.

Why Don’t the Numbers Match?!?

A question any practitioner of Internet-based analytics will be asked by many different stakeholders is “why don’t the numbers match?”  Counts of the identically named metrics from ad servers don’t match the web analytics tool, which don’t match the for-pay third party audience measurement tools, which don’t match the free audience measurement tools, which never match any of the homegrown internal measurement tools.  And none of them ever match each other.

So it’s a good question, and certainly a valid one to ask.  The answers are even fairly easy to understand, but the root causes are often difficult to pinpoint and even harder, if possible at all, to remedy.  The fact of the matter is that data discrepancies in analytics result from a multitude of causes, such as:

  • Different data collection methods.  We have a bunch of tools and services that collect web data using various, non-standardized, proprietary data collection methods.  Ad servers use javascript page tags.  Many web analytics tools use page tags too, but it’s not uncommon in web analytics to use additional methods, such as log files or packet sniffers.  Or perhaps a combination of these methods, called hybrid data collection.  And all the tools have different algorithms for processing the data collected.

On the audience measurement side, data is collected from self-selecting panels who install proprietary software (e.g., toolbars and so on) on their computers, perhaps at work or at their university, but most likely at home.  Then, the collected data from different panels is rolled up and combined, and the limited subset of the Internet population that chooses to be monitored, in exchange for some incentive, is inflated and projected to the entire Internet audience using proprietary statistical methods.  We also have data collected from a limited set of geographically specific ISPs.  And regardless of whether we’re talking about audience measurement or web analytics, the different data collection methods often, but not always, involve cookies and all their inherent issues of cookie deletion.

  • Unique data models.  Ad servers aren’t focused on counting page views and the other dimensions of web analytics (visits, time, and so on).  Rather, ad servers focus on serving and counting impressions served (and loads of related derivative calculations, like CTR, CPC, and view–thru).  Metrics are based on an ad request and an ad code.  Ads may or may not be targeted to a page, and instead to various constructs, like a “zone” or “keyword.”  What that means is that the “page” dimension may not even exist in your ad server’s data model.  In other words, you aren’t looking at impressions measured on a page, but rather at the number of impressions served in a different conceptual construct.  That’s one of the reasons why people say metrics and ad-serving systems “don’t measure the same thing.”
  • Untagged pages.  Specific to technologies that collect data or serve ads using javascript page tags, there are challenges to ensuring and verifying complete coverage of page tags across every page on a site.  When the pages aren’t all tagged with the different tags for the assorted technologies, guess what?  The numbers won’t come close to falling within tolerable variances.  And questions and skepticism will ensue.
  • Non-JS executing clients and ad blocking software.  Let’s imagine for the moment that your site is perfectly tagged for all technologies, so the numbers from your ad server will be close to those from your web analytics system, right?  Nope.  Regardless of data model issues, not all browsers execute javascript, and many Firefox users have installed Adblock Plus.
  • Cookie issues.  When you’re counting based on cookies, third-party cookies get blocked (often by privacy software).  Many ad servers and web analytics tools still serve third party cookies, and many corporations have not tricked out their DNS to accommodate this issue.  And we all know how cookie deletion affects unique visitor counts, even if you use first-party cookies.
  • Many other issues.  Latency from visitors moving off the page prior to the tag executing to latency in the call to pick up an ad from a third party while your ad server counts the traffic (so your ad count differs from the agency’s count), to refresh rates making it hard to correlate page views and impressions, to no rich media installed and no fallback, to robotic traffic not being filtered from logs or tags, to certain types of user agents (such as mobile devices) not executing javascript… there’s a whole host of other factors that cause data discrepancies.
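When comparing two sources, it helps to put a number on the gap before chasing root causes.  The sketch below computes the relative discrepancy between two counts and checks it against a tolerance.  The counts and the 10% default tolerance are assumptions for illustration only; what counts as a “tolerable variance” is something each team has to agree on.

```python
def percent_discrepancy(count_a, count_b):
    """Relative difference between two counts, as a percentage of the larger."""
    larger = max(count_a, count_b)
    if larger == 0:
        return 0.0
    return abs(count_a - count_b) * 100 / larger


def within_tolerance(count_a, count_b, tolerance_pct=10.0):
    """True when the two counts differ by no more than tolerance_pct percent.

    The 10% default is an illustrative assumption, not an industry standard.
    """
    return percent_discrepancy(count_a, count_b) <= tolerance_pct


# Hypothetical daily totals from two tools measuring the same site.
ad_server_impressions = 9200
analytics_page_views = 10000

gap = percent_discrepancy(ad_server_impressions, analytics_page_views)
print(f"discrepancy: {gap:.1f}%")
print("tolerable" if within_tolerance(ad_server_impressions, analytics_page_views) else "investigate")
```

Tracking this figure over time is often more useful than any single reading: a stable gap points to structural causes like data models, while a sudden jump points to something like untagged pages.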

And of course, there’s always the nebulous issue around the complete lack of consensus-based, enforceable standards for online measurement.  No industry organization can say what vendors or companies “must” do, only what they “should” do… And no industry body is going to get successful companies to change their secret sauce just because they said so…

So what’s a practitioner to do?  Understand the potential sources of discrepancies.  Work with your team (from IT to vendors) to prevent and minimize the root causes when possible.  Educate your team when discrepancies are not remediable.  Ensure you use the different sources of metrics judiciously in the context of your business goals.  Finally, realize that none of the tools are more “correct” than any other.  All of our analytics tools serve different, and sometimes overlapping, business purposes - from counting ads, to influencing media buying, to sizing audiences, to measuring business performance, and to optimizing the site.

Five Rules for and some Thoughts on Deep Packet Inspection

One of the many things on my mind in the online world these days is “deep packet inspection.” 

First, let me digress, packet sniffing isn’t new to web analytics.  From Accrue to Omniture (Visual Discover Sensor?) to AuriQ to Metronome Labs.  Packet sniffers are used to “do web analytics.”  It’s an uncommon method when compared to javascript page tags.

Web analytics packet sniffers are used to write logs for sessionizing (and thus measuring) the traffic on behalf of site owners (who don’t want to use tags or logs).  Once you’ve logged and sessionized, you know what content people have looked at or downloaded on your site.
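For readers who haven’t seen sessionization up close, here is a minimal sketch of the idea: group logged hits by visitor and start a new session whenever the gap between hits exceeds an inactivity timeout.  The hit format, the use of an IP address as the visitor key, and the 30-minute timeout are assumptions for illustration, not how any particular vendor’s sniffer works.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity; an assumed convention


def sessionize(hits, timeout=SESSION_TIMEOUT):
    """Group (visitor_key, timestamp, url) hits into per-visitor sessions.

    Returns a list of (visitor_key, [urls in order]) tuples.
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, url in hits:
        by_visitor[visitor].append((timestamp, url))

    sessions = []
    for visitor, visitor_hits in by_visitor.items():
        visitor_hits.sort()  # order each visitor's hits by time
        current = []
        last_timestamp = None
        for timestamp, url in visitor_hits:
            # A long enough gap closes the current session.
            if last_timestamp is not None and timestamp - last_timestamp > timeout:
                sessions.append((visitor, current))
                current = []
            current.append(url)
            last_timestamp = timestamp
        if current:
            sessions.append((visitor, current))
    return sessions


# Hypothetical logged hits: (visitor key, seconds since start, URL).
hits = [
    ("10.0.0.1", 0, "/home"),
    ("10.0.0.1", 120, "/products"),
    ("10.0.0.2", 200, "/home"),
    ("10.0.0.1", 4000, "/home"),
]
sessions = sessionize(hits)
for visitor, urls in sessions:
    print(visitor, urls)
```

The first visitor ends up with two sessions because more than 30 minutes pass between the second and third hits.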

“Deep packet inspection,” like WA sniffers, looks at the entire payload of packets in real-time across a huge number of simultaneous sessions.  Deep packet inspection, like regular packet sniffing, examines the files downloaded and the content of the pages viewed - the whole ball of wax.

Deep packet inspection is being offered as a hardware/software technology by companies like FrontPorch and Sandvine (in the US) and Phorm (in the UK).  These companies are selling the technology to ISPs (like Charter, Comcast, and Virgin Media) so that they can monitor the sites visited and the keywords used by customers, and then use the data collected for behavioral targeting.  The ISPs want a slice of the juicy, lucrative online ad business.

What’s the difference?  Site owners collect data about what you do on ONE site (or a portfolio of their sites).  ISPs collect data about what you do on EVERY site you visit.  As I understand it, some of these companies create an anonymous profile of your surfing activity by assigning a unique key to your browser.  Then they monitor the sites visited by your browser, and use that data so that the ISP, or the companies to which they sell your data, can serve you what they conclude to be relevant, behaviorally targeted ads.

Get it?  Packet sniffing by site owners = knowing about one site you visit.  Deep packet inspection by ISPs = knowing about every site you visit.

Now to digress… In web analytics, we know that web analytics data is collected anonymously.  Unless there’s a login, you don’t know exactly who is coming from that IP address.  And in many cases, most companies’ data warehouses only contain purchase information, not the entire clickstream.  Once the data is collected, if you have the right architectures you can decode cookie values to people, and make that data non-anonymous (i.e., PII).  Not difficult to do with some smart BI folks on your side.

An ISP already knows who you are and can already identify the sites you visit.  Probably not that easily, though, on an individual level.  They can dig through the logs, etc…

So what’s the big deal and all the hoo-hah about  the “deep packet inspection” Phorm and FrontPorch are doing?   It’s the data they are collecting and the repository they are building containing data about every site you visit and all the content you view and download… Of course, these companies say that it’s all done anonymously and that your “privacy” is preserved “to the greatest extent possible.” 

Now let me quote Sir Tim Berners-Lee about the data collected from Phorm’s ISP tracking: “It’s mine - you can’t have it. If you want to use it for something, then you have to negotiate with me. I have to agree, I have to understand what I’m getting in return.”

And that’s the point of the blogviation: Tim is correct.  In web analytics, we do this - we try to operate within Tim’s constraints.  We enable opt-in with P3P statements and disclosures when you register/login.  Privacy policies disclose what we are doing with the data.  It’s just ethical and smart business practice to do so.

Thus, I think FrontPorch and Phorm and all the ISPs who want a piece of online advertising should adhere to the following five rules for their services.

  1. Move to an obvious “opt-in” model with full disclosure.  Tracking via “deep packet inspection” should be an all opt-in model.  If you want anonymous data from your browser collected so that you can be behaviorally targeted, then you should opt in.  Right now, it seems to be all opt-out.  You probably don’t know if it’s being done to you.  It’s buried in fine print you’ve probably never read.  Is it your fault you didn’t read the fine print?  Yeah, but the point is it shouldn’t be buried in the fine print…
  2. Provide me with access to the data collected.  If I opt-in, I should be able to see the data collected from my browser.  It’s very simple.  I demand to see what you are collecting about my browser.  If you are building a profile, then I demand to see the data collected in the profile.  If it’s all anonymous, then explain how it is in detail, and then follow rule #1.
  3. Enable me to edit or prevent the data from being collected.  If I opt-in, I want to be able to edit or prevent certain types of data from being collected.  If you’re tracking my browser, alert me before the data is transmitted, so I can decide if I want to share it.  If a profile is built, I want to be able to edit it!
  4. Let me opt-out at any time EASILY. If I’ve opted in, and I’m unhappy with the service, allow me to opt-out simply.  Having to set an opt-out cookie on my browser is absolutely and completely absurd.  I want to be able to fully opt-out at the ISP level, just once forever, not at the browser level every time cookies are deleted.  Make it easy and permanent, not easily deletable.
  5. Disclose who you sell my data to.  Like online list rentals, the next step in all this ISP profiling is selling the data to third parties.  Let me know what you’re doing with my data, before you do it, so I can opt out or prevent it from being sold to parties to which I don’t want it being sold.

Consumers must be given a choice for preserving their privacy.  Anonymity to the “greatest extent possible” is not enough and neither are short-sighted opt-out cookies.  Companies like Phorm and Front Porch would be wise to apply these rules to regulate themselves.  Otherwise freedom-loving governments will almost certainly regulate them.

And I haven’t even mentioned the issues with net neutrality and deep packet inspection (i.e., traffic shaping and access restrictions, called “throttling” as Clint points out in the comment), have I?

A Few Thoughts After Another Awesome eMetrics….

Back from another excellent eMetrics.  I’m a very big fan of the eMetrics Marketing Optimization Summit…  Props go to Jim Sterne for growing this event from a little seed into an incredible, blogworthy blossom.  How involved is Jim in eMetrics?  I’d say he’s completely immersed in every little piece - he even came up to me at the SF WAW (way to go June D!) to find out about the renegade AV work I did in one of the sessions, and to get my take on how it could have been avoided.  He’s that intimately connected to what’s going on.  Macro and micro, micro and macro.  And when you have one of the best Internet Marketers in the world keeping a tight rein on the Clydesdale of conferences, you know you’re in for one heck of a fun ride.

And so it was for 500+ of the top web analytics practitioners in the beautiful Palace Hotel.  Props to consummate conference organizer Matt Finlay and his crew at Rising Media for keeping the road smooth as we all trotted on it as well.  Fanny, you are one helpful polyglot of a marketing manager!  I never knew German keyboards were so wild… Thanks.

The eMetrics sessions were informative and actionable.  The lobby bar and after-hours parties were fun and enlightening.  You really can’t ask for more out of a conference.  As I flew home thinking back on it all, there was a lot to blog about, including:

  • It’s all about attitude, dude – as in attitudinal data.  Like my father says “it’s all about your attitude.”  And so it is on the Internet in 2008.  From ForeSeeResults, to iPerceptions, to OpinionLab, to CRMMetrix, the often missing link in customer analytics is attitudinal data.  I’m talking here about Voice of Customer (VOC) technology that allows you to ask a question set to site visitors and then apply some sort of algorithm or model to express the meaningfulness of the data in quantifiable terms.  From the American Customer Satisfaction Index to 4Q.  VOC technology enables you to participate in a continuous, automated dialog with your customers in order to identify problem points on your web site and enable you to measure purpose and success of your most valuable segments.  Expect to see some of the big players gobble up these smaller companies.  Omniture, Unica, WebTrends, and CoreMetrics should be thinking about acquisition in this space to round out their offerings.
  • Testing, 123… as in multivariate, MVT.  The rage is site optimization technologies beyond the simple A/B, champion challenger, test.  In this category you find folks like SiteSpect (the only non-intrusive multivariate testing solution!).  I’m a big fan of these guys (and was in 2006 long before they ever sponsored a WAW, thanks to a nice demo from Larry at my old job).  Eric Hansen and his crew have specialized software that you install in your data center.  No futzing with damned tags.  Swap out your variations, create different recipes, determine what’s statistically significant in giving you a lift to your macro or micro conversion goal, and you’re off to the races.  The good folks at Google are doing it and doing it well with Google Site Optimizer (thanks for the t-shirts!).  Interwoven is baking in Optimost to the CMS, and Omniture has their Test and Target integrated with the Business Optimization Suite.  Accenture has MemetricsKefta too. And what ever happened to Verster?

In a nutshell, these technologies enable you to test variations of content themes, colors, creative, calls to action, points of resolution, buttons, navigational elements, –whatever you want to call the stuff on the screen—to determine what combination performs best against your goals.  But of course, this is all just software, so don’t get too excited.  The tests are about as good as the people creating them…  And complex tests that take a long time to execute may not finish.  Imagine 1-800-Flowers starting a test in January and not finishing until March, missing Valentine’s Day.  Or Intuit running a test beyond April 15th for a tax product.  Go humbly and carefully into this space, my friends, or you may end up optimizing for everyone and appealing to none.

  • Tying it all back to the dollar for profit-generating sites and to the mission of non-profit sites…  It seems like a “no, duh” moment, but metrics for the sake of metrics can be a big waste of time.  If you can’t tie metrics or visitor actions back to value on a revenue-producing site or to the betterment of a non-profit site’s core mission, then what’s really the point of the measurement?  That’s why I’m a big fan of the stuff ZaaZ does.  They totally get how actionable metrics turn the wheel of Internet commerce and ad-based models, and they can model it all to prove out the ROI.  Folks like newly elected WAA Director Alex Langshur’s company Public InSite do similar stuff for content-driven sites.  That is, they know how to use metrics to optimize the channel to goals, not just to puke confusing data, like most web analytics tools do.  Again, it’s all about the people you hire, not the tools you use… My good friend Avinash, right again!
  • The emergence and rise of deeply psychological and neuro-behavioral methods for automating persuasion and conversion.   Anyone who knows my good friend Joseph Carrabis, over at NextStage Evolution, knows that besides being one heck of a giant-kite-flying music master, he’s also got the models and the patents to help target and respond to human behavior across programmable devices.  We’re already seeing some companies, like Seven Billion Joe’s, er People, taking what he’s been saying for years and going to market with it.  The idea here is that if you can identify the affective, behavioral, and motivational drivers of site visitors, you can maximize cognition in elements on the site (like pictures, text, informational flow) to appeal to target segments and persuade/provoke desired behavior.  It’s like a higher rung on the optimization ladder.  It’s not test what they see, it’s figure out how they think, then make the site better because of it.  Cool stuff.  Blows my mind.
  • Integrated, multichannel marketing.  Just ask my good friend Akin Arikan, author of the newly released Multichannel Marketing.  (Disclaimer: I was a technical editor on the book.  It’s easy to do when you edit brilliance).  Make sure to check it out!  Marketing in general will become more Internet-centric, but will continue to clutch the roots of broadcast and print.  You will have database marketers and statistical modelers working with a union of web channel and offline data.  What’s preventing it now?  The lack of a unified marketing database.  You see companies like Salford Systems circulating in this space.  And take a look at Unica’s blend of Enterprise Marketing Management…  I’d stay tuned to see what Unica has up their sleeve for bringing together online and offline.  Now that you can segment and target across online and offline campaigns, if I were a pure web channel player, like Omniture or CoreMetrics, I’d be a bit concerned that people are waking up to open systems, not closed black boxes.  WebTrends is already moving in this direction…  But they all remain far behind Unica when it comes to multichannel marketing.
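Going back to the multivariate testing point above, two bits of arithmetic explain both the appeal and the risk: how many recipes a test creates, and whether an observed lift is statistically significant.  The sketch below assumes a full-factorial design and uses a standard two-proportion z-test.  It is not how SiteSpect, Google, or any other vendor computes results, and the visit and conversion counts are made up.

```python
import math


def recipe_count(variations_per_element):
    """Number of recipes in a full-factorial test (product of variation counts)."""
    return math.prod(variations_per_element)


def two_proportion_z(conversions_a, visits_a, conversions_b, visits_b):
    """Two-sided z-test for a difference in conversion rate between two recipes.

    Returns (z, p_value).  Uses the pooled-proportion standard error.
    """
    rate_a = conversions_a / visits_a
    rate_b = conversions_b / visits_b
    pooled = (conversions_a + conversions_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so nothing to detect
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Three headlines x two images x four buttons (hypothetical test design).
print("recipes:", recipe_count([3, 2, 4]))

# Hypothetical results: control converts 200 of 10,000; challenger 260 of 10,000.
z, p_value = two_proportion_z(200, 10000, 260, 10000)
print(f"z={z:.2f}, p={p_value:.4f}")
```

The recipe count is why complex tests can run past Valentine’s Day: every added element multiplies the number of recipes, and each recipe needs enough traffic of its own before a lift like the one above becomes distinguishable from noise.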

And that’s just a few of the things the phenomenal eMetrics got me thinking about…  I hope to see you in Washington DC in October! 

What Questions would you ask “the experts” about Web Analytics and Audience Measurement?

Next Sunday afternoon I am moderating a panel at eMetrics San Fran.  The panel is called “Web Analytics -vs- Audience Measurement.”  The panel idea was the brainchild of Andrea Hadley at NetSetGo (and yes that is her picture on her site :).  In fact, I was a panelist on the same panel at eMetrics Toronto, filling in for my friend Marshall Sponder.  Since he’s going to be in San Fran, I yielded my seat on the panel and decided to stand up at the podium.   Other panelists include Jodi McDermott, Director of Product Management at ClearSpring, and some other surprise guests (from comScore and IAB maybe)… You’ll have to show up and find out… :)

The panel description is as follows:

Are you confused about the number of customers visiting your website? Are the metrics reported by your web analytics tool different from the metrics reported by your online media, or by audience measurement organizations? The WAA invites eMetrics Marketing Optimization Summit attendees and the local San Francisco business community of web marketers, publishers and agencies to attend this community meeting. A panel of experts will discuss the value of the metrics, methods and tools used by web analytics practitioners, online advertising media and audience measurement organizations. Find out how to use these metrics and tools to better understand your customers, your website’s competitive standing and overall website value.

The goals for this panel include:

  • Adding clarity around the tools and data associated with each set of technology and metrics - web analytics technologies and website data, ad servers and ad data, and audience measurement tools and data.
  • Learning how each data source can be used to expand our understanding of customers, how effective our website is as a business channel, the website’s competitive standing and value, and so on.
  • Providing insight into the role of the web analytics practitioner and how this role is growing in importance and influence over business, marketing, product, and strategic decisions.
  • Discussing the role of the Web Analytics Association (WAA) and how the WAA serves the practitioner.  The WAA is an unbiased organization that doesn’t serve advertisers, publishers, or technology vendors; rather, the WAA serves and exists for the benefit and betterment of the practitioner and the web marketer/strategist.
  • Articulating the announcement made at eMetrics Toronto on the important collaboration between the IAB and the WAA for standards review.

My goal as the moderator is not to critique, demean, or criticize audience measurement, Internet advertising technologies, or to embellish or hype up web analytics tools.  Rather I hope to clarify the differences between the technologies and speak about the value they hold together - like I did in my article for MediaPost called the Yin and Yang of Online Metrics.

So why am I telling you all of this on my blog???  Well it’s because I really want your help, whether you are going to eMetrics or not…  Since I’m the moderator, I get to ask the questions, and I don’t want to just ask “my” questions, I want to know what questions YOU would ask if you had the chance to ask.  Of course, those of you reading this and attending the panel will be given the microphone if you raise your hand.

Please help me crowdsource by telling me in comments or via email to judah (at) webanalyticsdemystified.com:

What questions would you ask to clarify the differences and value between web analytics and audience measurement tools?

Any questions you think worth asking, from the simple “why don’t the numbers match?” to the more complex “what are the differences between audience measurement and web analytics systems in terms of data collection?”, would be awesome and appreciated.  Thanks in advance for your help!  I’m eager to see if this social media experiment in blog-based crowdsourcing actually works! :)

Thinking about Measuring Internet Video?

Every month I write a column for MediaPost’s Metrics Insider.  This month I wanted to tackle my evolving take on Internet video measurement.  Very few companies offer solutions in this space.  Only a few are really differentiated.  Check out Visible Measures, NedStat, TubeMogul, Divinity Metrics, and the usual suspects: Omniture, Unica, WebTrends, comScore, and Nielsen NetRatings.

Here’s my column:

In late 2007, the Digital Video Barometer Executive Survey indicated that more than 80% of media and entertainment executives believe tracking, measuring, and monitoring Internet video content is critical to bottom-line profit.  That’s not surprising. Accurate measurement informs decision-making and improves business performance, and Internet video is more mainstream and popular than ever before.  What may be surprising to those executives is that technology for measuring Internet video generally focuses on video content served on-site, not off-site.  It’s fairly straightforward for a Web analytics tool to tell you how people are consuming and interacting with on-site video, but consumption and interaction of videos distributed across multiple sites, perhaps virally or via social media campaigning, aren’t directly measurable by Web analytics tools.  Panel-based technologies can approximate certain off-site measures of video consumption and distribution, but don’t provide very deep on-site metrics. Measurements of Internet video consumption, interaction, and distribution may be categorized as follows:

  • Instream measurement.  Refers to measuring the video itself and the various events and behaviors that occur during a video viewing experience, such as time-based duration metrics and interaction and behavioral metrics (for example, the number of stops, plays, pauses, rewinds, fast-forwards, sites that posted or syndicated the video, clicks on hotspots and social media features).
  • Outstream measurement.  Refers to measuring the content environment and user experience surrounding the video on the site or in the skin, such as the conversion metrics (percentage of visitors downloading or viewing a video), source metrics (refers to the video page, players used), and content metrics (percentage videos viewed by topic, percent videos viewed by file type). 

Those categories form a framework for Key Performance Indicators (KPIs) that help to identify how people interact with videos, how videos perform when compared to other videos, and against pre-defined business goals.  Analysis of KPIs enables video content to be tailored to maximize performance.  Example KPIs include:

Instream KPIs:

  • Percent high, medium, and low duration video views
  • Average viewing time per video
  • Percent visitors who complete the video
  • Percent visitors that stop the video within 10 seconds
  • Percent visits when this video was the last video viewed
  • Percent visits when this video was the first video viewed

Outstream KPIs:

  • Conversion rates by video, topic, channel, taxonomy node, referrer, geography, keyword, and so on
  • Average video views per visit
  • Percent visits/views from different channels (such as email/rss, organic search, paid search, direct)
  • Average time between visits that include a video view
  • Repeat visit rate for visits involving a video view or download
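As a rough illustration of how a few of the instream KPIs above could be calculated, here is a minimal sketch working from per-view records.  The record fields, the sample data, and the simplification of one view per visitor are assumptions for illustration; the 10-second threshold comes from the KPI list above.

```python
# Hypothetical view records for one video; one view per visitor is assumed.
views = [
    {"visitor": "v1", "seconds_watched": 120, "video_length": 120},
    {"visitor": "v2", "seconds_watched": 8, "video_length": 120},
    {"visitor": "v3", "seconds_watched": 60, "video_length": 120},
    {"visitor": "v4", "seconds_watched": 5, "video_length": 120},
]


def instream_kpis(view_records, early_stop_seconds=10):
    """Compute a few instream KPIs from a list of view records."""
    total = len(view_records)
    if total == 0:
        return {
            "average_viewing_time": 0.0,
            "percent_completed": 0.0,
            "percent_stopped_early": 0.0,
        }
    completed = sum(
        1 for v in view_records if v["seconds_watched"] >= v["video_length"]
    )
    stopped_early = sum(
        1 for v in view_records if v["seconds_watched"] < early_stop_seconds
    )
    total_seconds = sum(v["seconds_watched"] for v in view_records)
    return {
        "average_viewing_time": total_seconds / total,
        "percent_completed": completed * 100 / total,
        "percent_stopped_early": stopped_early * 100 / total,
    }


kpis = instream_kpis(views)
for name, value in kpis.items():
    print(f"{name}: {value}")
```

The hard part in practice is not this arithmetic but collecting the view records in the first place, especially once the video leaves your site.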

These KPIs are measurable using a Web analytics tool, and perhaps a few of them are possible using traditional panel-based measurement.  But if off-site video distribution creates a whole new set of challenges to using current analytics and audience measurement tools to track instream and outstream metrics and KPIs, what are publishers and advertisers to do?  It’s a business problem that demands a new technology solution for understanding audience behavior, consumption, and distribution patterns of off-site syndicated or viral video content.

So what would a new technology solution for measuring Internet video and audience behavior do?  First it would have to fill the gap between panel and census-based measurement systems in a way that helps both publishers and advertisers  – not just one or the other — understand audience reach, frequency, and behavior.  The technology must enable tracking and actionable reporting and dashboarding of key metrics and KPIs, distribution patterns, behaviors, and interactions regardless of where the video “goes” on the Internet.  Audience characteristics from external databases (like OpenID for example) and internal company databases (like subscription and registration dbs) should be able to be integrated with data collected about behavior, video metadata, and instream and outstream metrics. 

If measuring digital video is as important as eight out of 10 media and entertainment executives believe it to be, there are some huge money-making opportunities on the horizon — for companies that are already providing technology for tackling this emerging business need, for advertisers using Internet video to drive awareness and response, and for measurement professionals who can help make sense of the Internet video ecosystem, solve measurement challenges, identify significant business opportunities, and use video metrics to improve business performance.  We’re certainly at the beginning of the J-curve for Internet video measurement for both publishers and advertisers.  After all, Forrester predicts Internet video advertising spend to increase from $471 million last year to $7.1 billion in 2012. 

Thinking Back on Online Metrics in 2007 and Looking Forward to 2008

Every month I write a column for MediaPost.  This month I wrote a short summary piece I thought I’d share with you in case you missed it.  Here it is:

As 2007 ends, I thought it worth looking back, from the practitioner perspective, at just a few of the issues that have shaped Internet measurement and thus online metrics over the last year:

  • The Page View is Dead, Long Live the Page View.  During 2007, technologies like AJAX and Flash continued to erode the construct of the page view.  These technologies render content in a browser but do not always make requests to the server to do so.  If the technology you are using to measure behavior requires the page request and you do not have a page request, what do you measure?  The major vendors of online metrics tried to answer that question. 

Various audience measurement companies claimed “total minutes” and other time-based derivatives were better alternatives to measuring the page view.  Web Analytics companies rolled out features for measuring “events” subordinate or equal to the page view (and highlighted already existing time-based metrics).  Ad serving companies made inroads in reconciling view-through to assist advertisers in understanding the latent effect of ad exposure on the purchasing lifecycle.  Yet all these technologies still count and report page views.

  • Engagement, Engagement, Engagement.  One of the hot topics in 2007 was a carryover from 2006.  Definitions for “engagement” emerged from the worlds of advertising, social media, online metrics, and more.  Engagement has been described as everything from “turning on a prospect to a brand idea enhanced by the surrounding context” to “repeated, satisfied interactions that strengthen the emotional connection a customer has with the brand” to “apparent interest” to the more metrical “estimate of the degree and depth of visitor interaction against a clearly defined set of goals.”

“Engagement” is very specific to the site being measured and full of nuance.  This fact has led agencies, consultants, and various companies to create complex engagement indices consisting of measures of key behaviors.  Behaviors are tallied and segmented in order to calculate an engagement metric, which is then used as the basis for site optimization.  These indices go far beyond often-cited simple time-based measures of engagement.  For a well-thought-of example, see Eric Peterson’s Engagement Metric.
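
To make the idea of an index built from key behaviors concrete, here is a minimal Python sketch. It is not Eric Peterson's metric or any vendor's formula; the behavior names and weights are invented for illustration, and a real index would be tuned to the site being measured.

```python
# Hypothetical behaviors and weights; a real site would choose its own.
WEIGHTS = {
    "viewed_3_plus_pages": 1.0,
    "stayed_5_plus_minutes": 1.0,
    "subscribed_to_feed": 2.0,
    "returned_within_7_days": 2.0,
}


def engagement_score(behaviors, weights=WEIGHTS):
    """Share of the total possible weight earned by the behaviors seen in one visit (0 to 1)."""
    earned = sum(w for name, w in weights.items() if name in behaviors)
    return earned / sum(weights.values())


def engagement_by_segment(visits, weights=WEIGHTS):
    """visits: iterable of (segment, set_of_behaviors). Returns the mean score per segment."""
    totals = {}
    for segment, behaviors in visits:
        score_sum, count = totals.get(segment, (0.0, 0))
        totals[segment] = (score_sum + engagement_score(behaviors, weights), count + 1)
    return {seg: score_sum / count for seg, (score_sum, count) in totals.items()}
```

Tallying behaviors per visit and then averaging by segment mirrors the "tallied and segmented" description above; the choice of weights is where the site-specific nuance comes in.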

  • Cookie Deletion, Again!  Jupiter Research, in 2005, first uncovered and quantified how cookie deletion can affect unique visitor numbers in web analytics systems.  The effect of cookie deletion is not quantifiable in the basic way audience measurement companies tried to quantify it in 2007: by examining only cookie deletion rates from self-selecting panelists who visited one portal site and an ad server.

Cookie deletion behavior varies greatly by audience segment and by site.  It may be as much of an accuracy problem in web analytics as selection bias and coverage errors are in panel measurement.  It is worth noting that some audience measurement firms use cookies to collect panel data. 
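
The mechanics of the problem are simple to show. In the Python sketch below, each hit carries a `person_id` (ground truth that a cookie-based tool normally cannot see) and a `cookie_id` (what the tool actually counts). Both field names and the sample data are hypothetical.

```python
def unique_visitor_counts(hits):
    """hits: iterable of (person_id, cookie_id) pairs.

    Returns the true number of people, the number of distinct cookies a
    cookie-based tool would report as unique visitors, and the ratio between them.
    """
    hits = list(hits)
    people = {person for person, _ in hits}
    cookies = {cookie for _, cookie in hits}
    ratio = len(cookies) / len(people) if people else None
    return {"people": len(people), "cookies": len(cookies), "overstatement": ratio}


# Four people; one of them deleted cookies twice and so appears under three cookie IDs.
sample_hits = [
    ("p1", "c1"), ("p1", "c2"), ("p1", "c3"),
    ("p2", "c4"), ("p3", "c5"), ("p4", "c6"),
]
```

With the sample data, six cookies stand in for four people. Because deletion behavior varies by segment and by site, a single deletion rate borrowed from elsewhere will not correct this ratio reliably.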

  • Black Box Audience Measurement.  Many questions were asked about whether audience measurement companies adequately measure “unique visitors” or “unique users” or just the frame of self-selecting “unique panelists.”  In audience measurement, counts of “unique visitors” are generated using complex, black-box mathematics that project observed metrics to the entire online universe.  The projections never match the corresponding data provided by other audience measurement companies or by web analytics tools.  Panel inconsistencies (across at-home, at-work, at-university, or specific to the geography being measured) may cause some level of bias and error.

Accounting for the difference between actual, observed panel metrics and projected metrics is perhaps even more challenging to clarify and resolve than the measurable effect of cookie deletion. 
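
The actual projection methods are proprietary, so the following Python sketch only shows the textbook idea behind them: each panelist carries a weight equal to the number of people in the online universe he or she is assumed to represent, and the projected unique visitors for a site are the sum of the weights of the panelists observed there. The field names and weights are hypothetical.

```python
def project_unique_visitors(panel, site):
    """panel: list of dicts with 'weight' and 'sites_visited' (a set of site names)."""
    observed = [p for p in panel if site in p["sites_visited"]]
    return {
        "observed_panelists": len(observed),
        "projected_visitors": sum(p["weight"] for p in observed),
    }


# Three panelists, each standing in for a different number of people.
sample_panel = [
    {"weight": 1000, "sites_visited": {"example.com"}},
    {"weight": 2500, "sites_visited": {"example.com", "other.com"}},
    {"weight": 1500, "sites_visited": {"other.com"}},
]
```

Two observed panelists become 3,500 projected visitors here. Change the weights or recruit a different panel and the projection changes, which is one reason numbers from different firms do not agree.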

  • The Continuing Need for Standards Enforcement.  2007 was the year two significant industry bodies continued working on standards related to online metrics: the Internet Advertising Bureau and the Web Analytics Association.  While each organization serves the needs of different constituencies, they both share the inability to enforce standards.  Both bodies can say what you should do, but not what you “must” do. 

Throughout 2007, these issues (and others) brought increased attention and scrutiny to online metrics.  Corporations are inextricably linking online metrics to site and channel strategy and performance, and thus to overall corporate profitability.  The “numbers” are now more important than ever for managing an online business and maximizing online revenue.  Nevertheless, questions are still being asked about accuracy, precision, usage, and sources of online metrics.  We have a ton of collaborative work to do in 2008 to provide the best answers and numbers we can. 

Happy New Year!

The Yin and Yang of Online Metrics: Audience Measurement and Web Analytics

I write a monthly column for MediaPost’s Metrics Insider.  This month I wanted to talk about the different schools of thought in online metrics because at the end of the day we are all in Internet measurement together. Hope you enjoy the read:

Audience measurement and Web analytics systems are like the yin and yang of online metrics. Yin and yang are different, opposing forces, but they also complement each other. Think of Web analytics and audience measurement data in the same way: different, sometimes in opposition, but complementary.

The major difference between these systems is data collection:

  • Audience measurement companies don’t collect data directly from the sites being measured. They all rely on proprietary methods. Hitwise gets data from ISPs. Compete uses a toolbar that you can download as well as ISP and panel information. Nielsen and comScore use data collected from panels to create online metrics that they believe accurately represent overall Internet usage. Because of these different data collection methods and the lack of shared standards across companies, metrics from audience measurement firms are never identical to one another.
  • In Web analytics, data is collected directly from actual site activity. Methods include client-side data collection via JavaScript page tagging, server-side data collection via log file processing, or network data collection via packet sniffing. Sometimes methods such as page tagging and log file processing are combined in what’s called “hybrid data collection.” Vendors include Coremetrics, Webtrends, Unica, Visual Sciences, Omniture, Google, and others. The challenge with Web analytics tools is that each tool will calculate different numbers from the same source for identical metrics. In other words, Omniture numbers won’t match Google’s. That’s because each tool has its own “secret sauce” for “sessionization,” the fancy term for the way hits are grouped into visits and metrics are counted and measured by analytics technology. For example, certain tools may be configured to include or exclude certain filetypes or server responses. Robotic traffic may or may not be filtered.
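
To see why two tools can report different visit counts from the same hits, consider this minimal Python sketch of sessionization. It is not any vendor's algorithm; the 30-minute inactivity timeout is a common convention, and the hit fields (`visitor_id`, `timestamp`, `path`, `is_bot`) and filter options are assumptions for the example.

```python
def count_sessions(hits, timeout_seconds=1800, exclude_extensions=(), exclude_bots=False):
    """Count visits by grouping each visitor's hits into sessions.

    hits: list of dicts with 'visitor_id', 'timestamp' (seconds), 'path', 'is_bot'.
    A new session starts when a visitor has no earlier counted hit, or when the gap
    since that visitor's last counted hit exceeds timeout_seconds.
    """
    last_seen = {}
    sessions = 0
    for hit in sorted(hits, key=lambda h: h["timestamp"]):
        if exclude_bots and hit["is_bot"]:
            continue  # this configuration filters robotic traffic
        if hit["path"].endswith(tuple(exclude_extensions)):
            continue  # this configuration ignores certain file types
        previous = last_seen.get(hit["visitor_id"])
        if previous is None or hit["timestamp"] - previous > timeout_seconds:
            sessions += 1
        last_seen[hit["visitor_id"]] = hit["timestamp"]
    return sessions


sample_hits = [
    {"visitor_id": "a", "timestamp": 0, "path": "/home", "is_bot": False},
    {"visitor_id": "bot1", "timestamp": 100, "path": "/home", "is_bot": True},
    {"visitor_id": "a", "timestamp": 600, "path": "/logo.gif", "is_bot": False},
    {"visitor_id": "a", "timestamp": 2100, "path": "/article", "is_bot": False},
]
```

Running the same four hits through different configurations gives 2 visits with the defaults, 3 when `.gif` requests are excluded (the image hit no longer bridges the gap for visitor "a"), and 1 when bots are filtered. None of these is wrong; they simply encode different counting rules.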

It’s worth noting that a company named Quantcast uses panel data and also enables a site to add page tags to collect actual site data, which are then merged together in a completely different type of “hybrid” model.

All these different approaches to data collection lead to opposition when these systems are used for the same purpose. For example, conflict arises between the yin and yang when identifying reach using unique visitor metrics. Audience measurement firms may cry “cookie deletion” when analytics tools are used to count unique visitors, and Web analytics firms may shout back “coverage error” and “selection bias” at the unique visitor numbers from panel-based firms. Another area of opposition is demographics. I’ve been told that only audience measurement firms provide demographic data, and that you can’t get demographic data from Web analytics systems. That’s not true at all.

All enterprise-level Web analytics systems provide demographic location information at the country, state, city, and MSA levels. This information will be different from that provided by audience measurement companies.

Demographics that are harder to elicit from a Web analytics system, but are easily provided by audience measurement, include attributes like a visitor’s age, gender, occupation, income, and education.

But it is possible to integrate very detailed demographic attributes per visitor into a Web analytics system! Once demographic information is captured in a registration database, it can be joined with behavioral data in the Web analytics system and reported on. For a real-world example of analytics/demographic integration, take a look at what Microsoft is doing with Gatineau, the company’s free Web analytics offering currently in beta. Microsoft is joining Web site behavioral data with rich demographic data from MS Live profiles.
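
The join itself is conceptually straightforward. The Python sketch below assumes that logged-in visits carry a `user_id` that also keys the registration database; the field names and the `age_band` attribute are hypothetical, and this is not a description of how Gatineau or any other product does it.

```python
def visits_by_age_band(visits, registrations):
    """Join behavioral data to registration demographics and report by age band.

    visits: list of dicts with 'user_id' (None for anonymous visits) and 'page_views'.
    registrations: dict mapping user_id to a profile dict containing 'age_band'.
    """
    report = {}
    for visit in visits:
        profile = registrations.get(visit["user_id"])
        # Visits that cannot be matched to a registration fall into "unknown".
        segment = profile["age_band"] if profile else "unknown"
        row = report.setdefault(segment, {"visits": 0, "page_views": 0})
        row["visits"] += 1
        row["page_views"] += visit["page_views"]
    return report
```

In practice the size of the "unknown" bucket matters: demographics captured this way describe registered, logged-in visitors, not the whole audience.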

Even with differences and oppositions between these online metrics systems, companies find ways to use the data in complementary ways:

  • Audience measurement data is useful for competitive intelligence. All the paid and free services provide data for comparing the performance of a site to other sites, for understanding audience behavior across one or more sites by demographics, and for understanding generalized Internet traffic trends and search terms.
  • Web analytics data is useful for understanding site effectiveness, for defining key performance indicators, for determining conversion rates for marketing campaigns by channel (such as search, email, rss), for understanding what sites and keywords are driving traffic to your site, and for segmenting and reporting online metrics.

You can even use both data sources as part of the same site optimization activity. For example, you could use audience measurement data to determine that a competitor is gaining ground on a particular product or search term. Then you could look at your Web analytics tool to see how you’re doing for the same term and how visitors who searched for that keyword behave on your site. You may find a high bounce rate and low conversion rate for the keyword, so you segment that data, perhaps by demographics! Next you suggest a hypothesis to minimize bounce and maximize conversion for each segment. Then you test your hypothesis, and reexamine the data. Based on the results, you then continuously improve your online performance through controlled experimentation. At the end of the day, you will drive more online revenue by understanding how the yin of audience measurement and the yang of Web analytics complement each other than by worrying about how they differ and oppose.
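
The segmentation step in that workflow can be sketched in a few lines of Python. The visit fields (`keyword`, `segment`, `pages`, `converted`) are hypothetical, and a bounce is assumed to be a single-page visit, which is a common but not universal definition.

```python
def keyword_performance(visits, keyword):
    """Bounce rate and conversion rate per segment for visits arriving via one keyword.

    visits: list of dicts with 'keyword', 'segment', 'pages' (int), 'converted' (bool).
    """
    stats = {}
    for visit in visits:
        if visit["keyword"] != keyword:
            continue
        s = stats.setdefault(visit["segment"], {"visits": 0, "bounces": 0, "conversions": 0})
        s["visits"] += 1
        s["bounces"] += int(visit["pages"] == 1)  # assumption: bounce = one-page visit
        s["conversions"] += int(bool(visit["converted"]))
    return {
        segment: {
            "visits": s["visits"],
            "bounce_rate": s["bounces"] / s["visits"],
            "conversion_rate": s["conversions"] / s["visits"],
        }
        for segment, s in stats.items()
    }
```

Running the same function before and after a test gives the per-segment comparison needed to judge whether the hypothesis held.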
