
Re: [LingPipe] Re: SvdMatrix

  • Bob Carpenter
    Message 1 of 12, Jun 11, 2012
      By stable, I mean that as the epochs go by, the
      RMSE continues to move down. If the learning rate's
      too high, you'll see it bounce around.

      Step 1: Find the highest learning rate under which
      the RMSE values decrease steadily.

      Step 2: With this learning rate, play around with
      annealing rates until you find one that gives you
      the best answer.

      In essence, you're now done for fitting one matrix, but
      if you have to do it again, you'll want to take the annealing
      rate that anneals the quickest while still giving you the
      same answer.
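
      A minimal sketch of that two-step search in Java, assuming two
      hypothetical helpers that are not part of LingPipe:
      rmseDecreasesSteadily(rate), which refits with everything else fixed
      and checks the per-epoch RMSE log for a steady decrease, and
      fitAndGetRmse(rate, annealingRate), which refits and returns the
      final RMSE. The candidate grids are illustrative.

          // Step 1: highest learning rate with a steady per-epoch RMSE decrease.
          double[] candidateRates = {1.0, 0.1, 0.01, 0.001, 0.0001};
          double bestRate = candidateRates[candidateRates.length - 1];
          for (double rate : candidateRates) {       // ordered high to low
              if (rmseDecreasesSteadily(rate)) {
                  bestRate = rate;
                  break;
              }
          }

          // Step 2: with that rate fixed, pick the annealing rate that
          // reaches the lowest final RMSE; if you'll refit similar matrices,
          // prefer the fastest annealing that still gives the same answer.
          double[] candidateAnnealing = {100, 1000, 10000, 100000};
          double bestAnnealing = candidateAnnealing[0];
          double bestRmse = Double.POSITIVE_INFINITY;
          for (double annealing : candidateAnnealing) {
              double rmse = fitAndGetRmse(bestRate, annealing);
              if (rmse < bestRmse) {
                  bestRmse = rmse;
                  bestAnnealing = annealing;
              }
          }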

      The problem is that if you lower the learning rate too
      quickly, the algorithm will converge to an answer, but it
      won't be the right answer (the one with lowest RMSE).
      You can see that by considering a very tiny learning rate,
      which moves the parameters very little -- you can always
      find a learning rate so that the overall RMSE moves very
      little.
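
      Concretely, if the per-epoch rate follows the usual inverse
      annealing schedule (an assumption for illustration, not a quote of
      the SvdMatrix javadoc), you can see how a small annealingRate
      shrinks the step size long before the fit is finished:

          // effective learning rate at a given epoch under inverse annealing
          static double annealedRate(double initialLearningRate,
                                     double annealingRate,
                                     int epoch) {
              return initialLearningRate * annealingRate / (annealingRate + epoch);
          }
          // e.g. annealedRate(0.001, 100, 10000) is roughly 1e-5: the
          // parameters barely move, so the RMSE "converges" without
          // reaching its minimum.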

      - Bob




      On 6/11/12 9:27 AM, cucumberguaguais wrote:
      > Hi Bob,
      >
      > What do you mean by "stabilizes early"? I understand that "converges"
      > means that the RMSE for this epoch is close to the RMSE in the last
      > epoch. But what's the difference between "stabilize" and "converge"?
      >
      > Jia
      >
      > --- In LingPipe@yahoogroups.com, Bob Carpenter <carp@...> wrote:
      > >
      > > First set the learning rate so that it
      > > stabilizes early. Then, set the annealing
      > > rate so that it converges. Painful, I know.
      > > At least once you get the right basic params, it'll
      > > work for other similar matrices.
      > >
      > > - Bob
      > >
      > > On 6/10/12 4:19 PM, cucumberguaguais wrote:
      > > > So do you mean that I should try a larger learning rate or a smaller annealing rate?
      > > >
      > > > --- In LingPipe@yahoogroups.com, Bob Carpenter <carp@> wrote:
      > > > >
      > > > > That's not that far off. I'd try playing
      > > > > with the learning rate to get it as high as
      > > > > you can without thrashing.
      > > > >
      > > > > That is the problem with these stochastic gradient
      > > > > algorithms. They're fast, but tuning the parameters
      > > > > is tricky.
      > > > >
      > > > > Usually, whenever you think you have the right rate,
      > > > > you should make sure that rate/2 and rate*2 don't
      > > > > work better.
      > > > >
      > > > > You have it configured to run 1M epochs (because
      > > > > minImprovement = 0). Presumably it's stopping
      > > > > earlier than that, which may mean the learning
      > > > > rate's too low or the annealing rate's too high.
      > > > >
      > > > > The singular values are close enough that the
      > > > > matrix condition shouldn't be causing any numerical
      > > > > issues.
      > > > >
      > > > > - Bob
      > > > >
      > > > > On 6/10/12 3:41 PM, cucumberguaguais wrote:
      > > > > > Hi, Bob,
      > > > > > Thanks for your reply! My parameters are as follows.
      > > > > > assertConverge(userNum, itemNum, columnIds, partialValues, 2, 0.1); // k=2
      > > > > > double featureInit = 0.1;
      > > > > > double initialLearningRate = 0.001;
      > > > > > double annealingRate = 10000;
      > > > > > double regularization = 0.00;
      > > > > > double minImprovement = 0;
      > > > > > int minEpochs = 2000;
      > > > > > int maxEpochs = 1000000;
      > > > > > And here's my output. Thanks in advance!
      > > > > > SINGULAR VALUES
      > > > > > 0=4186.377232019536
      > > > > > 1=595.4645694928746
      > > > > > LEFT SINGULAR VECTORS
      > > > > > RIGHT SINGULAR VECTORS
      > > > > > ortho columns col=0 col2=1 expected=0.0 != actual=0.048715485730904846
      > > > > > ortho columns col=0 col2=1 expected=0.0 != actual=-0.07270873113458708
      > > > > >
      > > > > > --- In LingPipe@yahoogroups.com, Bob Carpenter <carp@> wrote:
      > > > > > >
      > > > > > > Thanks for the feedback. You want to tune the
      > > > > > > learning rate to be as high as you can so that
      > > > > > > there is a steady decrease in the error per epoch.
      > > > > > >
      > > > > > > I'm curious as to when the right singular
      > > > > > > vectors could be non-orthogonal. I'm wondering
      > > > > > > if you're getting some kind of arithmetic overflow
      > > > > > > or underflow.
      > > > > > >
      > > > > > > How far off being orthogonal are they? That is,
      > > > > > > what's the cosine between them? You only get so close
      > > > > > > with floating-point approximations of continuous
      > > > > > > values.
      > > > > > >
      > > > > > > I'm really swamped for the next couple of weeks with
      > > > > > > a grant proposal and non-work-related chores,
      > > > > > > but could take a closer look at it after that.
      > > > > > >
      > > > > > > - Bob
      > > >
      > > >
      > >
      >
      >
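
      For the orthogonality question raised deeper in the thread, the
      cosine Bob asks about can be computed directly from the two right
      singular vectors; a minimal sketch in Java (the vector names are
      illustrative, not LingPipe API):

          // cosine between two column vectors u and v
          static double cosine(double[] u, double[] v) {
              double dot = 0.0, uu = 0.0, vv = 0.0;
              for (int i = 0; i < u.length; ++i) {
                  dot += u[i] * v[i];
                  uu += u[i] * u[i];
                  vv += v[i] * v[i];
              }
              return dot / (Math.sqrt(uu) * Math.sqrt(vv));
          }
          // values near 0.0 mean the columns are numerically orthogonal;
          // anything much larger suggests the fit has not converged or
          // there is an arithmetic overflow/underflow problem.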