Dmitri Eroshenko here, CEO of Clicklab (www.clicklab.com?source=wa). I want
to share some thoughts on a recent thread and introduce our philosophy and
approach to click fraud detection.
There are two sides to the click fraud detection equation. First is a
visitor's behavior on the publisher's side, i.e. pre-click behavior. Second
is post-click behavior on the advertiser's side. There is a clear disconnect
between the two, as both search networks and advertisers are dealing with
overlapping, but disparate, data sets.
There is a fundamental limitation in what PPC search networks can do to
detect click fraud. They are unable to monitor post-click visitor activity
on advertisers' websites because, after the click, a visitor leaves the
search network's site.
One exception to this rule would be conversion tracking tools, such as
Google SiteStats. However, these tools are focused only on the conversion
pages proper, thus providing only a limited data set. Plus, it's unclear how
widespread the adoption of these tools is among advertisers.
Search networks are understandably reluctant to share information about
their fraud detection algorithms and filter settings. The advertisers' only
recourse, therefore, is to monitor visitor activity and behavior on their
own websites.
When dealing with such a diverse, poorly defined, and often sophisticated
threat as click fraud, the only plausible approach is to look for
statistical patterns and signatures (not unlike spam filters). Web analytics
provide an enormous amount of data that is an excellent substrate for
detecting and documenting click fraud.
Clicklab fraud detection technology is based on a statistical scoring
algorithm that works with all available web analytics data collected for a
given website. It applies a series of tests to each visitor session, and
assigns a weighted penalty score to each failed test, based on its
significance. For example, if a visitor session originates from an anonymous
proxy server, it gets a higher penalty score than many other offenses.
There are currently several dozen such tests and statistical comparisons,
and we are constantly adding more as we go along. To give you a rough idea
of the things we look at:
* Session statistics - recency, frequency, depth, duration
* IP statistics - number of paid clicks, visits, conversions, etc. per given
IP compared to control groups and statistical averages for a given site
* Keyword bid price - a direct indication of a risk factor
* Conversion rate analysis - obviously, clickers in India or China aren't
going to buy your red widgets
If a session's cumulative penalty score exceeds the threshold, then with a
high degree of certainty we can flag a session as fraudulent.
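
To make the scoring idea concrete, here is a rough sketch in Python of per-session tests with weighted penalties and a threshold. It is not our production code: the test names, weights, session fields, and the threshold value are all made-up assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Session = Dict[str, Any]  # hypothetical session record built from web analytics data


@dataclass(frozen=True)
class Test:
    name: str
    weight: float                      # penalty added when the test fails
    failed: Callable[[Session], bool]  # returns True when the session fails the test


# Illustrative tests only; names, fields, and weights are assumptions.
TESTS: List[Test] = [
    Test("anonymous_proxy", 5.0, lambda s: bool(s.get("anonymous_proxy", False))),
    Test("single_page_visit", 1.0, lambda s: s.get("pages_viewed", 0) <= 1),
    Test("very_short_visit", 1.5, lambda s: s.get("duration_seconds", 0) < 5),
    Test("high_bid_keyword", 2.0, lambda s: s.get("bid_price", 0.0) >= 5.0),
]

THRESHOLD = 6.0  # assumed cutoff; in practice it would be tuned per site


def penalty_score(session: Session, tests: List[Test] = TESTS) -> float:
    """Sum the weights of every test the session fails."""
    return sum(t.weight for t in tests if t.failed(session))


def is_flagged(session: Session, threshold: float = THRESHOLD) -> bool:
    """Flag the session when its cumulative penalty exceeds the threshold."""
    return penalty_score(session) > threshold


suspicious = {"anonymous_proxy": True, "pages_viewed": 1,
              "duration_seconds": 2, "bid_price": 0.5}
normal = {"anonymous_proxy": False, "pages_viewed": 6,
          "duration_seconds": 240, "bid_price": 0.5}
```

In this sketch the anonymous proxy test carries the largest weight, so it alone gets a session most of the way to the cutoff, but no single failed test flags a session by itself.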
We then calculate a Click Inflation Index (CFI) for each individual traffic
source and keyword as the ratio of flagged sessions to total number of
visitor sessions. CFI serves as an indicator of click fraud that can be
monitored over time.
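
The index itself is a simple ratio. A minimal Python sketch, assuming each session has already been reduced to a (traffic source, keyword, flagged) tuple; the function name and the input layout are assumptions for illustration:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def inflation_index(
    sessions: Iterable[Tuple[str, str, bool]],
) -> Dict[Tuple[str, str], float]:
    """Return flagged / total sessions for each (source, keyword) pair."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    flagged: Dict[Tuple[str, str], int] = defaultdict(int)
    for source, keyword, is_flagged in sessions:
        key = (source, keyword)
        totals[key] += 1
        if is_flagged:
            flagged[key] += 1
    # Every key in totals has at least one session, so no division by zero.
    return {key: flagged[key] / totals[key] for key in totals}


sample = [
    ("network_a", "red widgets", True),
    ("network_a", "red widgets", False),
    ("network_a", "red widgets", False),
    ("network_a", "red widgets", False),
    ("network_b", "red widgets", False),
]
index = inflation_index(sample)
```

Computing the same ratio for each day or week gives a series per source and keyword that can be watched for sudden jumps.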
Clicklab Fraud Detection technology is scheduled for commercial release at
the end of February, and will be available both directly from Clicklab and
for licensing and private labeling. If you are interested in a partnership,
or would like more company background, please contact me directly.
P.S. Last, but not least: Clicklab wants your clicks! We always need more
data to work with, so if you have more than 50,000 paid clicks per month,
especially in competitive or B2B categories -- contact me directly and we'll
make it worth your while.