Comscore and Quantcast – How they work and why they are the gold standard of guess-work.
Greetings sports fans, welcome to another post that I hope will either spark some discussion, open some eyes or perhaps help me understand this situation in a different light, granting it some legitimacy a bit heavier than the feather-weight I feel it has. It is often said that comScore is the gold standard of measured site metrics used by advertising agencies when they make their digital media buy decisions, closely followed by Quantcast. These companies have the power to influence multi-million dollar decisions that can be made based on their demographic reports.
I deal with comScore, and more recently Quantcast, on a regular basis, and quite honestly I’ve come to the conclusion that they are really nothing more than peddlers of educated guesses. I also feel that the possible error range in their evaluations has the potential to be massive, unfairly denying someone possible revenue dollars because their metrics net is too small. I often scratch my head wondering, how on God’s green earth did these guys get so influential in the world of advertising? Really good PR I guess. I also often wonder just how far the line is pushed when they gather demographic data in the quest for online accuracy supremacy.
I would say one of the biggest errors the digital ad industry makes is blurring the line between actual site metrics and site demographics that go beyond what is provided in the browser’s header information. Let me be very clear here: the ONLY way a site can truly know the age, race, income or favorite beer of any given user is if that user enters it in a form and clicks submit. Anything beyond that is quite honestly guesswork.
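To make that concrete, here is a minimal Python sketch of what a server can actually read from a request. The header names are standard HTTP request headers, but the sample values and the `describe_visitor` helper are made up purely for illustration.

```python
# Hypothetical sample of the request headers a browser typically sends.
SAMPLE_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.example.com/some-page",
    "Cookie": "visitor_id=abc123",
}

# Personal attributes an ad buyer cares about; none of these are HTTP headers.
DEMOGRAPHIC_FIELDS = ["age", "gender", "income", "favorite_beer"]


def describe_visitor(headers):
    """Return what the headers reveal and which demographic fields are missing."""
    known = {
        "browser_and_os": headers.get("User-Agent"),
        "language": headers.get("Accept-Language"),
        "came_from": headers.get("Referer"),
    }
    lowered = {name.lower() for name in headers}
    missing = [field for field in DEMOGRAPHIC_FIELDS if field not in lowered]
    return known, missing


known, missing = describe_visitor(SAMPLE_HEADERS)
print(known)
print("Not available from headers:", missing)
```

Everything in `known` describes the machine and the browser. Everything in `missing` has to come from a form or from an estimate.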
Site Metrics
Real site metrics, simply defined, are the collection and reporting of REAL visitor data for your website by a piece of analytics software, the most popular in the world being Google Analytics. GA takes the REAL data from your REAL visitors, records that anonymous data and presents the traffic information in easy-to-read formats across a vast palette of reporting options.
The site demographic metrics reports you get from comScore are really just educated guesses based on a pool of poll users that are tracked via plugins, web beacons and/or tracking cookies. They then average out that information with the number of uniques and page views you have and their advanced algorithm provides you with Gold Seal demographics… or what I call educated guesswork.
This is also why you can’t marry the data comScore spits out vs actual Google Analytics information. So if your comScore reporting is telling you the median age of your site is 28 and 35% of your audience is gamers, you can’t ACTUALLY track who those people are and what parts of your site they are visiting. You can’t track them because they aren’t real visitors, it’s just a guess.
What is comScore?
comScore is an Internet marketing research company providing marketing data and services to many of the Internet’s largest businesses. comScore tracks all internet data on its surveyed computers in order to study online behavior. You can visit their site at http://www.comscore.com.
How does comScore work?
One of the most common questions asked is how does comScore work? Where do they get the demographics information that almost seems to come out of thin air? Well, they in fact have a couple of primary methods.
comScore is a paid measuring tool, meaning you must subscribe to their services in order to track and report your comScore metrics. The website to be tracked must have the comScore tag propagated throughout the entire website, the same way you would a GA code, and this allows comScore to measure your traffic, pageviews and the other information that a standard analytics program would collect. So typically pageviews and unique visitors are quite accurate and should line up with GA numbers, although UVs will likely be a bit lower due to the way comScore measures uniques vs Google Analytics. GA uses actual uniques, while comScore magically tracks you, guesses when you are using your work PC, phone and home PC, and then counts you as a single unique.
When I asked a comScore rep how exactly age is calculated by comScore for the demographics reporting, they confirmed the above:
“Demographic information is gathered from our panel. When someone opts into the comScore panel, they are required to fill out a short questionnaire where we gather demographic information for themselves as well as other people in the HHLD who will be using the metered computer. We then use census populations estimates to project out to the total internet population.”
So how does this part of the magic happen? How does it know what color your poop is, how old your car is and how much you make a year? A couple of ways.
The Research Panel
comScore maintains a group of around 2 million monitored research panelists that run a background monitoring software package that tracks everything they do online. comScore partners up with several technology brands to create and maintain this software, such as Permission Research, Opinion Square and VoiceFive Networks. The users in this group are given benefits for being a panelist, including free software, online storage data, and chances to win cash and prizes from comScore. comScore then uses a series of weights to adjust the statistics so that they have a better reach that reflects US and global browsing than just the two million panel members bring to the table. This is where the guess-work comes in, as well as comScore panelists…
comScore Panelists
comScore regularly recruits panelists through random digit dialing, as well as additional online and offline methods such as polls, quick questionnaires and much more. comScore then uses that data to determine total people online, geographic location, income, age, and other factors, and then applies that information to “massage” their research panel stats and generate an accurate global or US demographic.
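comScore does not publish its exact weighting scheme, so the following is only a rough Python sketch of the general idea, assuming a simple "weight each demographic cell up or down until the panel matches the population" approach. The age buckets, panel counts and population shares are all invented for illustration.

```python
from collections import Counter


def cell_weights(panel_cells, population_share):
    """Weight each demographic cell so the panel mix matches the population mix."""
    counts = Counter(panel_cells)
    total = len(panel_cells)
    return {
        cell: population_share[cell] / (counts[cell] / total)
        for cell in counts
    }


def projected_audience_share(visitor_cells, weights):
    """Weighted share of a site's panel visitors that falls in each cell."""
    weighted = Counter()
    for cell in visitor_cells:
        weighted[cell] += weights[cell]
    total = sum(weighted.values())
    return {cell: value / total for cell, value in weighted.items()}


# Made-up panel of 10 people, skewed young compared with the population.
panel = ["18-34"] * 6 + ["35-54"] * 3 + ["55+"] * 1
population = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

weights = cell_weights(panel, population)

# Panelists who visited a hypothetical site.
site_visitors = ["18-34", "18-34", "35-54", "55+"]
print(projected_audience_share(site_visitors, weights))
```

In this toy example the single 55+ panelist ends up representing more than half of the projected audience, which shows how much a thinly populated cell can swing the final report.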
Additional information about comScore
I sent comScore a request with some of my questions and got back some great information directly from comScore:
comScore measures people and not cookies or IP addresses as your internal data does. The way we do this is as follows:
When a company wants to become unified they sign up and place a beacon (or a tag) on every page of their website. Whenever anyone goes to one of those pages (regardless of whether they’re in our comScore panel or not) our beacon call will place a comScore cookie on the machine. We count these cookies and then we use the people in our panel to understand the following:
- Users deleting or blocking cookies
- Users using multiple browsers per machine
- Users using multiple machines across home and work
- Users using multiple machines at home
- Multiple users on the same machine
- Computer Overlap: An adjustment is applied to the cookie counts to account for usage across multiple machines within the same household.
Based on our findings from the above criteria, we then assign an average Cookie Per Person ratio for that site. This CPP is updated every single month for every single entity we report on.
We then take the following steps to calculate the Unique Visitors:
- We sum up the number of cookies for the particular country we’re measuring and just for Home and Work (so we filter out International traffic, shared computers and mobile devices)
- We use the Cookie Per Person ratio based on the criteria above to calculate Census Only Unique Visitors from cookies
- We use the panel to understand the number of UV’s NOT seen from cookies. Since the site may not have tagged every single page, some UV’s will go unnoticed by cookies.
- We sum the Panel only Unique Visitors and Census Only Unique Visitors to reach Unified Home and Work Unique Visitors
- We use the panel to understand the overlap between people using BOTH home and work computers
- We report the final UNIFIED UNIQUE VISITORS
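The steps above can be sketched in a few lines of Python. This is only my reading of the description, not comScore's actual formula: the function name, the input numbers and especially the way the home/work overlap is applied (as a simple percentage reduction) are assumptions.

```python
def unified_unique_visitors(cookie_count, cookies_per_person,
                            panel_only_uv, home_work_overlap_rate):
    """Follow the quoted steps using made-up inputs.

    cookie_count: in-country Home and Work cookies seen by the tag
    cookies_per_person: the monthly Cookie Per Person ratio for the site
    panel_only_uv: visitors seen by the panel but not by cookies
    home_work_overlap_rate: assumed fraction of people counted at both
        home and work (the real adjustment is not public)
    """
    if cookies_per_person <= 0:
        raise ValueError("cookies_per_person must be positive")
    # Convert cookies into people (Census Only Unique Visitors).
    census_only_uv = cookie_count / cookies_per_person
    # Add the visitors that only the panel saw.
    home_and_work_uv = census_only_uv + panel_only_uv
    # Remove people who were counted once at home and once at work.
    return home_and_work_uv * (1 - home_work_overlap_rate)


uv = unified_unique_visitors(
    cookie_count=1_500_000,
    cookies_per_person=1.5,
    panel_only_uv=50_000,
    home_work_overlap_rate=0.10,
)
print(round(uv))
```

With these invented inputs, 1.5 million cookies shrink to roughly 945,000 reported people. Three of the four inputs come from the panel, not from the tag.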
Page View numbers are calculated by the following:
This measure comes 100% from the tags on your site. We take the raw tag number and filter it from:
- International traffic (if you are only purchasing US data)
- Shared environment/mobile traffic
- Auto refreshes and don’t comply
- Forced viewing (pop-ups….)
- Nedoms (non-essential domains) comScore maintains a list of pages.
- Non human traffic from bots and spiders are also removed
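As a rough illustration of that filtering, here is a minimal Python sketch. The `TagHit` record and its flags are hypothetical; how comScore actually detects bots, auto refreshes or Nedoms is not described in their reply.

```python
from dataclasses import dataclass


@dataclass
class TagHit:
    """One raw tag call. Every field here is a hypothetical flag."""
    country: str
    is_bot: bool = False
    is_shared_or_mobile: bool = False
    is_auto_refresh: bool = False
    is_forced_view: bool = False  # pop-ups and similar
    is_nedom: bool = False


def count_page_views(hits, country="US"):
    """Count only the hits that survive every filter in the list above."""
    return sum(
        1
        for hit in hits
        if hit.country == country
        and not (
            hit.is_bot
            or hit.is_shared_or_mobile
            or hit.is_auto_refresh
            or hit.is_forced_view
            or hit.is_nedom
        )
    )


raw_hits = [
    TagHit(country="US"),
    TagHit(country="US"),
    TagHit(country="CA"),                  # international, dropped
    TagHit(country="US", is_bot=True),     # spider, dropped
    TagHit(country="US", is_forced_view=True),  # pop-up, dropped
]
print(count_page_views(raw_hits))  # 2
```

Five raw tag calls become two reported page views here, which is one reason raw server logs and comScore page views can differ.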
Why the comScore numbers don’t match my Web Analytics data:
- comScore measures unique persons
- Web Analytics “unique” numbers are a measure of unique cookies
- Differences in the numbers come from:
  - Users deleting or blocking cookies
  - Users using multiple browsers per machine
  - Users using multiple machines for home/work
  - Users using multiple machines at home
  - Multiple users on the same machine
- Web Analytics data totals include approximations of visitors using a combination of IP addresses and user agent when a cookie cannot be dropped.
- comScore filters out the following:
  - International data (for US subscribers)
  - Non-human traffic (bots and spiders)
  - Nedoms (non-user initiated traffic) pop-ups, & partial page loads
  - Shared usage environments (internet cafes, libraries, airport kiosks…..)
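The cookie-versus-person difference is easy to see in a small Python sketch. This is not how Google Analytics or any specific product is implemented; the `visitor_key` function, the fallback hashing and the sample hits are assumptions that only mirror the description above.

```python
import hashlib


def visitor_key(cookie_id, ip_address, user_agent):
    """Use the cookie when present, else fall back to a hash of IP + user agent."""
    if cookie_id:
        return "cookie:" + cookie_id
    raw = f"{ip_address}|{user_agent}".encode("utf-8")
    return "fallback:" + hashlib.sha256(raw).hexdigest()[:16]


def count_uniques(hits):
    """Count distinct visitor keys, the way a cookie-based tool would."""
    return len({visitor_key(*hit) for hit in hits})


hits = [
    ("abc", "203.0.113.5", "BrowserA"),   # one person on the home browser
    ("xyz", "198.51.100.7", "BrowserB"),  # the same person at work, new cookie
    (None, "203.0.113.9", "BrowserC"),    # cookies blocked, fallback key used
    (None, "203.0.113.9", "BrowserC"),    # same fallback key, counted once
]
print(count_uniques(hits))  # 3
```

A cookie-based count reports three uniques here even though the first two hits are one human being. Closing that gap is exactly where comScore leans on its panel.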
What is Quantcast?
Quantcast is a media measurement and web analytics service that allows users to view audience statistics for millions of websites. Quantcast Corporation’s prime focus is to analyze the Internet’s web sites in order to obtain accurate usage statistics for surfers from the USA. It is primarily used by online advertisers looking to target specific demographics such as age, income or other traits. You can visit their site at http://www.quantcast.com.
How does Quantcast Work?
Quantcast has a large network with millions of sites running its data collection feeds, web beacons and anonymous cookies, so it can track a person as he/she visits any of the websites in its network, build a profile of that person’s browsing habits, and then extrapolate demographics. Quantcast likens its approach to the way search engines examine how webpages are interlinked, using those relationships to determine relevancy within its network and the demographics it collects. You can read more about what they do from a company perspective at http://www.quantcast.com/how-we-do-it.
Is it Legal?
Both comScore and Quantcast use proprietary algorithms to make educated guesses about the age of the user based on their internal measures, and then show that person the ad. This is again not real data; it’s educated guesswork based on their own measures and not on the real data of the actual visitor, which is what true analytics use. They can only track anonymous browsing habits… AGAIN, the only real way for a site to know 100% the age of the user is if the user enters their age on the site and submits that information… anything above and beyond that is quite honestly educated guesswork built on an internal formula of indicators and measures gathered by web beacons and cookies that track behavioral browsing patterns.
This is actually why Quantcast and other demographic tracking companies have faced numerous lawsuits and privacy violations… they all have to walk a very fine line of how intrusive they can be in their data gathering methods until they start to break the law. It is illegal to gather personal individual data without authorization via cookies, beacons and other tracking means unless the user agrees to it, which is why contest pages etc all have terms and conditions the user has to agree to (one of the reasons anyhow). All general traffic behavior research gathering has to be anonymous and is generally mentioned in website privacy policies, which you can see on just about every company website you can think of. This is why the demographics information from comScore can’t actually be applied to the real analytical data.
Both companies have an extensive history of lawsuits, accusations and general privacy violations as they try to push the boundaries of what they collect and the collection processes. A quick Google Search shows dozens and dozens of privacy violation lawsuits against comScore.
There is no question that online privacy is a farce, and hopefully as individuals continue to see how easily their privacy is exploited online, we learn to keep things closer to the vest. Check out this little blog post by Robert Dempsey about comScore, a bit of an eye opener in case your head is in the sand.
Why do ad agencies LOVE Comscore?
Is it a lack of options? Awesome marketing on comScore’s part? Dominance in the industry? Well, it’s a bit of everything to be honest. We know that it’s nothing but educated guesswork, yet multi-million dollar deals are lost and won because of what comScore has to say, more so than REAL numbers such as the metrics provided by Google Analytics. comScore has positioned itself to be the leader in demographic research, and through aggressive business propagation and smart marketing has made itself as common and standard to digital ad planning as TVs are to a consumer’s home. You may not realize it, but you are likely touched by comScore on a daily basis… when you surf, when you read magazines, when your boss is making decisions that affect you and your company and so much more. It’s used daily in media kits, presentations, infographics, marketing plans and many other mediums for everything from entertainment to purchase proposals.
comScore is everywhere and has become masters of educated guesswork, ninjas of not-so-naked truth and warriors of what’s up online. There is no question that if you plan to work with ad agencies on premium ad buys for your website, you have no choice but to tag with comScore if you plan on playing with the big boys. BUT if you’re concerned about your online privacy and how you are tracked, fact-finding on the practises of comScore, Quantcast and other demographic measuring companies is a scary eye-opener.
How do you feel about your online privacy or about comScore and their metrics? Are you an actual comScore research panel member? Please chime in with your comments by using the form at the bottom of the page and please share this article socially with your friends and connections.
Thanks for reading!
Dan
22 thoughts on “Comscore and Quantcast – How they work and why they are the gold standard of guess-work.”
Very informative article. I always thought Quantcast was relying on real analytics, now I know it’s not true!
This was very informative – is it possible to get further advice on comscore
terrific, intelligent piece, dan. i track with comScore now, and everything you’ve discussed answers a lot of my own questions. thx for sharing this.
A lot of what you are saying here is intellectual masturbation and unfortunately untrue (No, I do not work with either of the above mentioned companies).
Of what I know, the ‘educated guesswork’ you are referring to is actually based on a statistically significant sample of panel members who have actually filled up a form and clicked to register to become a panel member. This demographic information is carefully projected to the census (Site measurement). I cannot vouch for the methodology they operate with, however IMRB International in India does quite a good job with fusing of data.
The point being, there are several ways of deriving more insights into the existing data. The bucks lies in how you use the data to your benefit. We need to acknowledge that the trends are moving away from proprietary data to freely available big data. ComScore and Quantcast will follow that trend.
Hello Anon,
First off, thank you for taking the time to post your comment on my article, I don’t expect everyone to agree with me and I am always open to new perspectives. If you took the time to read my article and took the time to post a great comment, I will certainly return the favor by reading and commenting on your thoughts as well.
There is nothing wrong with deriving insights from data, this is actually the soul of good business, however you should always be questioning and searching your data to ensure it makes logical sense. I can’t count the times I’ve been handed information that was being followed blindly that made no sense if you took the time to step back and look at the big picture. The analyst would do a double take at the report, mutter a few choice words, and then proceed to make corrections.
The point here is you need to be logical about this topic and follow the numbers. So strip down everything Comscore marketing has told you and let’s get down to basics. Here’s a fact: Unless you are directly pulling demographic information from the HTTP headers of your users (which is impossible, since HTTP headers do not contain personal attributes of the user, only details of the machine being used) or you are directly having every user of your site fill out a personal questionnaire, you are GUESSING at who they are. Comscore or anyone else can use any terms they want (estimation is a popular one) but strip away the buzzwords and this is guesswork. Now don’t get me wrong, as you’ve mentioned Comscore does what they can to ensure their data is as accurate as possible, but it’s still educated guesswork at the end of the day and nothing you stated in your comment refutes that simple fact.
Second, you refer to the Comscore panelist population, which I also mentioned in my initial article. The issue here is that the number of panelists is a minute fraction of the actual web user population. MUCH less than 1%. Do you know what happens when your sample data is much smaller than your data set? You get a massive potential error margin. Any webmaster can see the effect of this phenomenon in Google Analytics… simply take a massive data set where GA will use a small fraction of sample data (OK not ALL webmasters have sites big enough to do this, but I do) and you’ll notice the data integrity starts to suffer compared to breaking the analysis down into smaller sets with less sampling.
Even if they had 1 panelist for every 2 users, the demographic data would still be guesswork, granted much more accurate.
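For anyone who wants to put a rough number on that error margin, here is a minimal Python sketch using the textbook margin-of-error formula for a proportion. It assumes a simple random sample, which a recruited panel is not, so treat the output as a best case. The function name, the 35% figure and the panelist counts are made up for illustration.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical: the panel says 35% of a site's audience is gamers.
for panelists_on_site in (100, 1_000, 10_000):
    moe = margin_of_error(0.35, panelists_on_site)
    print(f"{panelists_on_site:>6} panelists: 35% +/- {moe * 100:.1f} points")
```

Under this formula the error is driven by how many panelists actually hit a given site, not by the panel's share of the whole web population: roughly plus or minus 9 points with 100 panelists on the site, and under 1 point with 10,000. A small or niche site that only a handful of panelists ever visit is where the guess gets shaky.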
My third and final thought is on actual testing and data comparison. Comscore is part of my weekly routine and I practically live on the Comscore dashboard. I have compared many of the matrix reports against real data sources and I can tell you the inaccuracies are frightening on some of the reports. Unique US visits and page views are nowhere close to GA or server log data, often with a full 50% or more error margin despite both the GA and CS tags firing from the same include across the board.
Comscore also has a revenue report that will estimate (guess) gross advertising revenue and the names of each advertiser or ad network running on a site. Once again, this is guesswork as they do not have access to ANYONE’S actual revenue numbers, not even the panelists. When I compared Comscore’s numbers to actual revenue for properties I have access to, the numbers were off by more than 2000%. But guess what? The NAMES of the advertisers were extremely accurate. Do you know why? Because that’s actual data that was captured from the website itself. I would love to show you screenshots, but of course I love my job more than I love posting on my blog, so I’ll stick to my NDA.
So again, thank you for posting your comments and I am sure that many folks out there may have the same mind set, but I have to disagree with your assessment of my article (except for the intellectual masturbation part, I’m all about whacking off my brain) and stand by my words as true.
All the best!
Dan
I’m more concerned with how that sample is collected than the size. Unless it’s a total random sample of the entire population, it’s going to be very skewed.
I don’t see this panel as a true representation of the population
Well done, Dan — Thanks for this.
Grains of salt: I’ve got well-fostered reservations about comScore’s indirect, inferential data being posited and consumed whole as The Word. I’m also envious of their abilities to regularly extract cash from advertisers, agencies and publishers.
comScore’s transparency in answering your questions re their methodologies is impressive. Stating as fact “we measure people not cookies” is bold.
As I understand, comScore extrapolates the behavior of hundreds of millions of web “users” from their panel of 2 million with enough precision that people listen. I question the accuracy and our industry’s attachment to their indirect metrics when there are so many direct measurements available.
There are larger sample sizes readily available. Google and its DoubleClick ecosystem, e.g., has direct access to actual habits far more than 2mm people and/or cookies. They can make significantly more accurate inferences about web habits, demo- and psychographics.
For December, e.g., gAnalytics is showing me Age and Gender info on 39% of my 5mm directly-measured visits. It’s showing me affinity categories for 40% of those visits. We have member profiles with information provided directly and via permission granted to us for more than 1mm individuals. Our marketing team uses primary and secondary research about our readership. Even Quantcast gets more directly-measured information about our community of members. We’re able to share appropriate pieces of these data with our business partners. It doesn’t come up often enough.
It strikes me as willful ignorance of actual metrics to place so much stock in non-actual ones. Aren’t panel-based inferences what prematurely killed off “Arrested Development,” “My So-Called Life” and predicted “Star Wars” would bust? Hasn’t more direct measurement gotten us “House of Cards” and “Orange is the new Black”? Are television and the movies now advancing more quickly than the web?
Disclosure, I worked at one of the mentioned companies. It sounds like you’re a fan of internal web analytics and self-stated data. That’s fine. But it’s wrong just the same as the ‘guess work’ of market research companies. If you don’t like ‘guess work’ then don’t believe your doctor next time they take a blood sample and notify you of your ailment (or lack of ailment). You’re obscuring the truth when you call statistics guess work. Are there methodology flaws? Sure. But you have to first accept that there is literally no way to accurately measure site audience. Not one. Third-party measurement is what it is today not because of fabulous PR but because it provides a neutral third-party and standard form of measurement to compare across sites. That’s its value, relative comparison. Not trending with your GA data.
Well done, Dan. Loved this.
Dan, interesting thread and commentary. I am being scouted by one of the above mentioned companies so this thread has effectively done much of my interview prep work! Thanks ….
I am of the more $ related type than the tech related type so this article is solid research for me …
I have 2 questions and please excuse the naive nature of them.
1. If you say the only accurate way of determining demographics is that the user enters these details on online forms voluntarily, how then do you weed out the voluntarily inaccurate ones? (These were not your words but my take on them)
2. Do my potential employer know I posting this?
….
Intentional grammatical error for effect …
Very interesting read. We’re very new to comScore’s analytics. We’re looking to learn more so we can better regulate the traffic we provide to clients. Our adometry rating is around 300-400 on average and we use Integral Ad Science, however comScore is our next unlearned champion. Do you have any other suggestions for traffic providers who want to cut their traffic’s junk down?
I just get ‘Load Error’ when trying to see a Traffic Report on Comscore. Help! 🙁
These days, pretty much everything is better than Alexa. Its value is only that which is given by companies looking for websites to advertise on, as it helps set the price, as inaccurate as it is. The likes of Quantcast – and Cisco Umbrella – are much more useful, in my opinion.