Web Analytics Blogs

Judah Phillips is an experienced web analytics practitioner and Internet expert currently working as a Senior Director at a large, global Internet company. His blog is full of useful, unbiased, actionable insights learned from the real-world practice of a process-oriented, integrated approach to strategic Web Analytics for improving business performance.

Subscribe to Judah Phillips weblog

Part 2: Your Web Analytics Data Quality May Stink and Here’s Why!

In Part 1, I began a long list of reasons why your web analytics data quality may stink.  I’m continuing the list below (make sure you read Part 1 for context and to view the entire list)

  • Storing only visit level data.  May tools don’t have schemas that store raw data at the visitor level.  Instead they provide access to only visit level data.  For example, you may not be able to see all the page views during a single visit per ip address or cookied visitor.   Assess the impact of the vendor’s schema on your goals.  Companies that use analytics data to feed other systems or that want to use visitor attributes for content targeting, segmentation, optimization, or analysis may not be well-served by some vendor schemas.
  • Little to no decodes or lookups.  If you use numeric codes and non-human readable naming conventions in your data, they can pass through to your reporting and prevent your colleagues from understanding the reporting.  Strange codes look like hieroglyphics!  Decoding and looking up data can eliminate the problem of non-readability and strange numerical names in your reporting… While some would say this is a reporting issue, not a data issue,  I chose to include it because it’s at the surface… it’s the data your customers see.  Not all tools decode or lookup.  Some tools allow rewriting of data in the database.
  • Failure of key services supporting the application.  If you are dependent on page tags, synchronization software, web servers, databases, or any of the wondrous technology that makes it all work, failures are a real bummer.  Make sure you have monitoring and recovery processes in place so you don’t miss data!  When page tag collection fails (perhaps the page tag server went down ay?), the data is gone forever.  If the web server fails, then no logs are written, but no pages are served either - so is traffic missed?  But if the processes supporting log file analysis fail (i.e. data synch), watch out! 
  • Inadequate or incorrect implementation.  If you can’t cross dimensions (like finding out what keywords referred traffic to a page), filter all of your data (for example, filtering pages to see only those viewed by the iPhone), easily create new metrics, or if the numbers aren’t adding up, you may have not adequately or correctly implemented your software or communicated your requirements to your vendor’s professional services team. 
  • Limited, hard-to-extend data model. Powerful, actionable insights from web analytics are enabled by extending a data model to incorporate business specific dimensions.  For example, if every page has a category and an author, you may want to see a list of all the page views in that category or ranking of pages by most popular author.  To do that you may need to join data at the database level or take advantage of variables you pass in a page tag.  Various tools have different limits on if, how, and to what extent you can extend the data model.

So what do you do when you know your data quality is less than stellar?  Here’s some guidance:

  • Don’t worry, be happy. :-) Just by collecting the data you are collecting, you are doing better than a great majority of companies that do business on the Internet.  By asking questions about data and investigating the issues, you have a leg up on your competition.  Work on optimizing the data, expose flaws in site design or architecture that impede data collection, work with your vendor and seek help in the web analytics community if you run into real problems.  The Web Analytics Association’s Forum on Yahoo is a useful place for posting questions.  But whatever you do, stay positive and focused on solving your problems and making your web analytics practice more optimized.  Don’t get frustrated.
  • Recognize the limitations in the data and do not go gently into the night.  Ask the hard questions about sampling, schemas, data retention, processing, querying and reporting to understand where the holes and noise could be in your data.  Demand answers from your vendors and quick response times to your questions about data quality.  If you vendor is frustrating you by not being responsive, talk to the boss and the vendor’s bosses, escalate, escalate, escalate until you get resolution.
  • Understand the underlying elements of data collection and what can go wrong.  Learn about sessionization and why different tools and data collection methods have limitations.  Explore the more technical components of the backend, like the database and your web analytics schema - all your data is in one (or more)! Talk to your engineers.  Have them explain the technology in terms you understand.
  • Evaluate your tools.  Some tools are just better suited for particular business problems than other tools.  Log files tools enable you to constantly change assumptions and reprocess data.  Page tags provide a standard data collection and transport mechanism.

With hard work on your part, you can make you web analytics data smell like roses!  I know you can! :)

dataquality_renamed.jpg

Judah Phillips at Web Analytics Demystified » Blog Archive » Part 1: Your Web Analytics Data Quality Stinks and Here’s Why! added the following ...

[…] Let’s continue this long list in Part 2! […]

Daniel Shields added the following ...

Hey Judah,

Great list.

I was thrilled when I only came away with a few smaller reasons to crawl back through my warehouse and verify the information I’ve been using. This is also useful in pointing out a major reason to have more than one (or at least one hybrid log/tag tool) analytics tool.

Best Wishes…

Daniel W. Shields
Analyst
CableOrganizer.com

Judah added the following ...

Thanks for commenting Daniel.

Keep up the good work!

Best wishes to you too!
Judah


Add to the Conversation

Your email (required) will not be published.

Please note that contributions are moderated and may take a little while to appear.

The X Change 2009

The web analytics communities premier conference—the X Change—will be held September 9, 10, and 11 in San Francisco at the St. Regis hotel.

See what everyone is talking about and learn more about the X Change.

More »