
Message 1 of 12, Jun 11, 2012
By stable, I mean that as the epochs go by, the
RMSE continues to move down. If the learning rate's
too high, you'll see it bounce around.

Step 1: Find the highest learning rate under which
the RMSE stays stable (steadily decreasing).
Step 2: With this learning rate, play around with
annealing rates until you find one that gives you
the lowest RMSE.
In essence, you're now done for fitting one matrix, but
if you have to do it again, you'll want to take the annealing
rate that anneals the quickest while still giving you the
lowest RMSE.
The problem is that if you lower the learning rate too
quickly, the algorithm will converge to an answer, but it
won't be the right answer (the one with lowest RMSE).
You can see that by considering a very tiny learning rate,
which moves the parameters very little -- you can always
find a learning rate so that the overall RMSE moves very
little.
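[Editor's note: the two-step procedure above can be sketched as annealed SGD on a toy rank-1 factorization. This is only an illustration, not LingPipe's actual SvdMatrix implementation; the decay schedule rate = initialRate / (1 + epoch/annealingRate) and the class/method names are assumptions chosen to show what a stable rate and an annealing rate do.]

```java
// Illustrative sketch only -- NOT LingPipe's SvdMatrix implementation.
// Fits a rank-1 factorization u*v' to a toy matrix by SGD, annealing the
// learning rate each epoch (the schedule below is an assumed example).
public class AnnealedSgdSketch {

    // Runs the fit and returns the reconstructed entry u[0]*v[0].
    static double fit(double initialRate, double annealingRate, int epochs) {
        double[][] m = {{1.0, 2.0}, {2.0, 4.0}};  // rank-1 target: [1,2]'[1,2]
        double[] u = {0.1, 0.1};
        double[] v = {0.1, 0.1};
        for (int epoch = 0; epoch < epochs; epoch++) {
            // assumed annealing schedule: rate decays toward 0 over epochs
            double rate = initialRate / (1.0 + epoch / annealingRate);
            for (int i = 0; i < 2; i++) {
                for (int j = 0; j < 2; j++) {
                    double err = m[i][j] - u[i] * v[j];
                    double ui = u[i];
                    u[i] += rate * err * v[j];  // gradient step on u
                    v[j] += rate * err * ui;    // gradient step on v
                }
            }
        }
        return u[0] * v[0];  // approaches m[0][0] = 1.0 when the run is stable
    }

    public static void main(String[] args) {
        // A stable rate with slow annealing reconstructs the matrix;
        // the same rate annealed too quickly freezes short of the answer.
        System.out.println("slow anneal: " + fit(0.01, 10000, 5000));
        System.out.println("fast anneal: " + fit(0.01, 10, 5000));
    }
}
```

With the slow anneal the per-epoch rate barely decays over 5000 epochs, so the fit behaves like plain SGD and converges; with the fast anneal the rate collapses within tens of epochs, and the parameters stop moving before reaching the minimum, which is the failure mode described above.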

- Bob

On 6/11/12 9:27 AM, cucumberguaguais wrote:
> Hi Bob,
>
> What do you mean by "stabilizes early"? I understand that "converges"
> mean that the RMSE for this epoch is close to the RMSE in the last
> epoch. But what's the difference between "stabilize" and "converge"?
>
> Jia
>
> --- In LingPipe@yahoogroups.com, Bob
> Carpenter <carp@...> wrote:
> >
> > First set the learning rate so that it
> > stabilizes early. Then, set the annealing
> > rate so that it converges. Painful, I know.
> > At least once you get the right basic params, it'll
> > work for other similar matrices.
> >
> > - Bob
> >
> > On 6/10/12 4:19 PM, cucumberguaguais wrote:
> > > So do you mean that I should try a larger learning rate or a smaller
> > > annealing rate?
> > >
> > > --- In LingPipe@yahoogroups.com, Bob Carpenter <carp@> wrote:
> > > >
> > > > That's not that far off. I'd try playing
> > > > with the learning rate to get it as high as
> > > > you can without thrashing.
> > > >
> > > > That is the problem with these stochastic gradient
> > > > algorithms. They're fast, but tuning the parameters
> > > > is tricky.
> > > >
> > > > Usually, whenever you think you have the right rate,
> > > > you should make sure that rate/2 and rate*2 don't
> > > > work better.
> > > >
> > > > You have it configured to run 1M epochs. (because
> > > > minImprovement = 0). Presumably it's stopping
> > > > earlier than that, which may mean the learning
> > > > rate's too low or the annealing rate's too high.
> > > >
> > > > The singular values are close enough that the
> > > > matrix condition shouldn't be causing any numerical
> > > > issues.
> > > >
> > > > - Bob
> > > >
> > > > On 6/10/12 3:41 PM, cucumberguaguais wrote:
> > > > > Hi, Bob,
> > > > > Thanks for your reply! My parameters are as follows.
> > > > > assertConverge(userNum, itemNum, columnIds, partialValues, 2, 0.1); // k=2
> > > > > double featureInit = 0.1;
> > > > > double initialLearningRate = 0.001;
> > > > > double annealingRate = 10000;
> > > > > double regularization = 0.00;
> > > > > double minImprovement = 0;
> > > > > int minEpochs = 2000;
> > > > > int maxEpochs = 1000000;
> > > > >
> > > > > And here's my output. Thanks in advance!
> > > > > SINGULAR VALUES
> > > > > 0=4186.377232019536
> > > > > 1=595.4645694928746
> > > > > LEFT SINGULAR VECTORS
> > > > > RIGHT SINGULAR VECTORS
> > > > > ortho columns col=0 col2=1 expected=0.0 != actual=0.048715485730904846
> > > > > ortho columns col=0 col2=1 expected=0.0 != actual=-0.07270873113458708
> > > > >
> > > > > --- In LingPipe@yahoogroups.com, Bob Carpenter <carp@> wrote:
> > > > > >
> > > > > > Thanks for the feedback. You want to tune the
> > > > > > learning rate to be as high as you can so that
> > > > > > there is a steady decrease in the error per epoch.
> > > > > >
> > > > > > I'm curious as to when the right singular
> > > > > > vectors could be non-orthogonal. I'm wondering
> > > > > > if you're getting some kind of arithmetic overflow
> > > > > > or underflow.
> > > > > >
> > > > > > How far off being orthogonal are they? That is,
> > > > > > what's the cosine between them? You only get so close
> > > > > > with floating-point approximations of continuous
> > > > > > values.
> > > > > >
> > > > > > I'm really swamped for the next couple of weeks with
> > > > > > a grant proposal and non-work-related chores,
> > > > > > but could take a closer look at it after that.
> > > > > >
> > > > > > - Bob
> > >
> > >
> >
>
>