- -- In webanalytics@yahoogroups.com, MM <bristolnational@...> wrote:
>
> Effectively, you are getting involved in a very interesting project
> around Predictive Modeling and Predictive Analytics. What you need
> first and foremost is the predictors, or facts that are likely to
> influence your traffic. For example, depending on your site, the
> demographics and siteographics (like Gender, Age, Purchase History,
> Online Campaigns etc). Hopefully, you will be able to capture a
> majority or some of these predictors and come up with a linear
> equation around the behavior. We have independently worked on a
> similar modeling methodology for a B2C client and were able to come
> up with a model (using linear, logarithmic, quadratic, and
> exponential functions), of course factoring in seasonality (Christmas,
> Halloween), key media events (Angelina's baby, Cricket World Cup)
> etc. to stabilise the data points. The point is, it can be done:
> once you have your factors established it is a matter of having some
> data mining people rolling up their sleeves and getting it done.
>

Maybe I'm just ignorant about the power of these methods, but I have
to say, I would be very skeptical about using any such high-powered
mathematical models in this sort of problem. Using them for a
short-term estimate is one thing, but for a long-term estimate the
inputs to the equations are so much guesswork that the outputs are
going to be complete guesswork too. Especially if you have quadratic
and exponential functions in there, your guesses are going to be
multiplied upon guesses.
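To make the "guesses multiplied upon guesses" point concrete, here is a minimal sketch in Python. The starting traffic and the growth figures are made-up assumptions, not numbers from this thread, and the two helper functions are hypothetical. It compares how a small disagreement about the growth assumption plays out over five years in an exponential model versus a linear one.

```python
# Hypothetical illustration: how a small error in an assumed growth
# figure compounds over a five-year projection.

def project_exponential(start_visits, monthly_growth, months):
    """Compound growth: start_visits * (1 + g) ** months."""
    return start_visits * (1.0 + monthly_growth) ** months

def project_linear(start_visits, monthly_increase, months):
    """Linear growth: a fixed number of extra visits per month."""
    return start_visits + monthly_increase * months

START = 10_000   # assumed current monthly visits (made-up number)
MONTHS = 60      # five years

# Two guesses at the monthly growth rate, one percentage point apart.
exp_low = project_exponential(START, 0.02, MONTHS)
exp_high = project_exponential(START, 0.03, MONTHS)

# A comparable disagreement in a linear model: 200 vs 300 extra visits per month.
lin_low = project_linear(START, 200, MONTHS)
lin_high = project_linear(START, 300, MONTHS)

print(f"exponential: {exp_low:,.0f} vs {exp_high:,.0f} (ratio {exp_high / exp_low:.2f})")
print(f"linear:      {lin_low:,.0f} vs {lin_high:,.0f} (ratio {lin_high / lin_low:.2f})")
```

With these assumed inputs the two exponential guesses end up differing by a factor of about 1.8 after 60 months, while the two linear guesses differ by a factor of about 1.3.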

The problem is, the more sophisticated the mathematical model (1) the
more wrong it's likely to be given slightly wrong inputs; but (2) the
more people are likely to trust it because it looks clever. If you use
rough estimates of the sort described earlier in this thread, I think
you are at least as likely to get good answers, and as a bonus people
will have a good feeling for how wrong the answers might be.

--
Stephen Turner
CTO, ClickTracks http://www.clicktracks.com/

- Sounds like sensible advice. I suspect the processes that underlie many
of the long-tailed distributions we see scattered all over web access
data make the scale of an effect extremely difficult to predict with any
certainty. For example, what's the likelihood of the site getting
slashdotted, and how popular might that link be?

Damian


Damian Connell
Red Isle IT Consultancy
Consultancy and support services for small business
Braemoray
Forsyth Street
Hopeman
Elgin
IV30 5SY
01343 508 109
www.redisle.com

damianc@...

- While it is true that nonlinear models are more susceptible to
unstable behavior when they extrapolate, this doesn't need to be
the case if care is taken in building and deploying the models. Some
model types extrapolate more conservatively than others (linear models,
for example), but even with wildly nonlinear models you can prevent
them from behaving badly by restricting the extent of their
extrapolation. This is a topic that would take more time to develop,
however.
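One simple way to restrict the extent of extrapolation is to clamp the model's input to a bounded range around the observed data. The sketch below is only one possible reading of that idea, not a method described in this thread. The function names, the `max_extension` parameter, and the sample data are all hypothetical.

```python
# Hypothetical sketch: limit how far a nonlinear model may extrapolate
# by clamping its input to a bounded range around the observed data.
import math

def fit_exponential(xs, ys):
    """Least-squares fit of y = a * exp(b * x) via a straight line through log(y)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b

def make_clamped_predictor(xs, ys, max_extension=0.25):
    """Return predict(x), which never evaluates the model further than
    max_extension * (observed x range) beyond the observed x values."""
    a, b = fit_exponential(xs, ys)
    lo, hi = min(xs), max(xs)
    margin = max_extension * (hi - lo)

    def predict(x):
        x_used = min(max(x, lo - margin), hi + margin)
        # Return the estimate plus a flag saying whether the input was clamped.
        return a * math.exp(b * x_used), x_used != x

    return predict

months = list(range(12))                     # one year of observations
visits = [1000 * 1.05 ** m for m in months]  # made-up data
predict = make_clamped_predictor(months, visits)

print(predict(6))    # inside the observed range: not clamped
print(predict(60))   # five years out: evaluated at the clamp limit instead
```

Returning the "was clamped" flag alongside the estimate is a design choice made here so that a caller can tell a real prediction from a capped one.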

But even this, in my opinion, misses the real issue. Any quantitative
decision-making process (business rules, rules of thumb, predictive
models) takes observed data and turns them into predictions. That is
why the most important step in any quantitative analysis is the
validation of the model/rules/guesses. Validation can be done with
data set aside purely for validation, by simulation, or by
inspection by those who know the domain area. If the predictive model,
then, finds interesting predictor variables, is not overfit, and
validates well, there is no reason not to use it. (It may be that it
isn't worth the investment in time and resources to build the model to
begin with, but that is another question entirely.)

- Man, am I with you on that. It drives me crazy, and this will
definitely be part of the doc I send to them. Thanks for the backup!

--- In webanalytics@yahoogroups.com, "Debbie Pascoe" <dpascoe@...> wrote:
>
> Rachel,
> I'm fascinated with the phrasing - "5 year goals for website growth" -
> this assumes that growth, not stagnation or deterioration, will occur.
>
> Redesign and the hiring of a full-time editor are no guarantee of
> improvement. As an example, I reviewed a site just last week that has
> been recently redesigned and relaunched, using .NET technology. The
> way this site has been designed has made all but the first page
> invisible to Google. To get visibility for their content, they will
> have to pay for it, losing the cost advantage derived from good
> natural search placement.
>
> Any assumption of increased traffic cannot be made without taking
> into account site structural and quality aspects, privacy,
> accessibility (an increasingly important issue), and utilization of
> techniques and technologies that result in good natural search
> placement. All these issues can impact the ability of the site to
> help you meet the ultimate objective - raise more money. On- and
> off-line activities can drive more people to the site, but if the
> experience is bad, and donations do not increase, the objective will
> not have been achieved.
>
> The question seems to imply that simply having more visitors is the
> objective... it has been my experience that when somebody asks you a
> question like this one, and you answer it, you will get to answer FOR
> it later :-)

>

- Wow, I've finally been able to go over all of your comments and I have
to say, I'm overwhelmed and in a bit of a jam. Last June we started
using GA (previously we were using WebTrends, which overinflated our
numbers due to poor configuration). So, this means we really only have
less than a year's worth of web data to go on, and any sort of
predictive modeling will be tricky, no? I can do my best with what I
have, but as someone who has never undertaken a task like this, my gut
says to use the data from the past year and try to project over the
next five years assuming everything stays the same (which, as we all
said, it definitely will not) and make it clear that this data is bogus.

Stephen, I'm definitely with you on making this look less scientific,
for fear that they may otherwise take it too seriously.

I'll let you know how it goes!
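A rough projection of the kind described above could be presented as a small set of scenarios rather than one precise-looking number. The following Python sketch is purely illustrative. The baseline figure, the scenario names, and the annual change rates are assumptions invented for the example, not data from this thread.

```python
# Hypothetical sketch: a deliberately rough five-year projection with
# decline / flat / growth scenarios instead of a single "scientific" number.

def yearly_projection(baseline_visits, annual_change, years=5):
    """Apply the same assumed annual change each year; return one value per year."""
    values = []
    current = baseline_visits
    for _ in range(years):
        current = current * (1.0 + annual_change)
        values.append(round(current))
    return values

BASELINE = 120_000  # made-up annual visits, standing in for the one year of real data

SCENARIOS = {       # assumed annual changes; these are guesses, not measurements
    "decline": -0.10,
    "flat": 0.00,
    "growth": 0.10,
}

for name, change in SCENARIOS.items():
    print(f"{name:>8}: {yearly_projection(BASELINE, change)}")
```

Showing the decline and flat rows next to the growth row keeps the spread of outcomes visible, which makes it harder to mistake the numbers for a forecast.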
