
Re: [vowpal_wabbit] Re: vw --bfgs and gcc compiler version?

  • John Langford
    Message 1 of 9, Feb 21, 2012
      Try commenting out first_hessian_on = true in drive_bfgs() in bfgs.cc in
      the current master.

      -John
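
      As a rough illustration of the overstep discussed further down in this
      thread (the quantile loss has no curvature almost everywhere, so a first
      step scaled by the inverse second derivative is essentially unbounded),
      here is a small self-contained toy. This is an editorial sketch, not VW
      code; tau and the data are made up:

      // toy_overshoot.cpp -- build with: g++ -std=c++11 toy_overshoot.cpp
      #include <cstdio>
      #include <vector>

      int main() {
          const double tau = 0.75;                      // quantile level (made up)
          std::vector<double> y = {0.2, 0.5, 0.9, 1.0}; // toy targets
          double w = 0.0;                               // a single constant predictor

          double g = 0.0, h = 0.0;                      // gradient and curvature of the summed loss
          for (double yi : y) {
              g += (w > yi) ? (1.0 - tau) : -tau;       // d/dw of the pinball loss
              h += 0.0;                                 // d^2/dw^2 is zero almost everywhere
          }
          const double eps = 1e-12;                     // tiny guard so the ratio can be printed
          std::printf("plain gradient step   : %g\n", -g);             // prints 3
          std::printf("curvature-scaled step : %g\n", -g / (h + eps)); // prints 3e+12
          return 0;
      }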

      On 02/21/2012 05:59 PM, regularizer wrote:
      >
      > Thanks, John! In fact I had already turned off -ffast-math. However, it
      > turned out that in my last pasted example it was still on. Turning it
      > off in that environment produces results similar to the first two,
      > i.e., failure. So in this case -ffast-math is the only way to produce
      > good results with BFGS, and I'm just not able to get as good results
      > with on-line learning.
      >
      > What you say about the quantile loss second derivative definitely
      > makes sense. It would be good not to have to take the first huge
      > misstep with quantile loss.
      >
      > I guess that's why beginning with on-line learning with quantile loss
      > and continuing with BFGS did not work since the 1st BFGS step always
      > goes someplace far away from where on-line learning got me.
      >
      > - Kari
      >
      > --- In vowpal_wabbit@yahoogroups.com
      > <mailto:vowpal_wabbit%40yahoogroups.com>, John Langford <jl@...> wrote:
      > >
      > > Different results are due to -ffast-math, which you could turn off
      > > when compiling.
      > >
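
      (For reference: floating-point addition is not associative, and -ffast-math
      allows the compiler to reassociate reductions, so the same source can
      produce slightly different numbers under different compilers and flags.
      A minimal self-contained example of the effect, not VW code:

      // reassoc.cpp -- the two groupings of the same sum differ
      #include <cstdio>

      int main() {
          double big = 1e16, small = 1.0;
          double left  = (big + small) + small;   // each 1.0 is rounded away
          double right = big + (small + small);   // small terms combined first
          std::printf("left  = %.17g\n", left);   // prints 10000000000000000
          std::printf("right = %.17g\n", right);  // prints 10000000000000002
          return 0;
      }

      Tiny per-step differences like this compound over passes, which is
      consistent with the three runs differing only slightly at first and then
      diverging.)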
      > > The quantile loss is particularly brutal because it has no
      > > meaningful second derivative. That means that it oversteps by a lot in
      > > the first step, and then steps back for many steps. Maybe try
      > > increasing the regularizer to compensate. Alternatively, we could
      > > modify LBFGS to not use the second derivative in the step direction on
      > > the first pass.
      > >
      > > Zero curvature is pretty plausible, and more regularization may help
      > there.
      > >
      > > -John
      > >
      > > On 2/21/12, regularizer <regularizer@...> wrote:
      > > > vw --bfgs seems to be sensitive to the gcc version. I have three
      > > > different versions on various machines and all produce different
      > results
      > > > (consistently). Is there a preferred compiler version?
      > > >
      > > > Here's what I'm trying to do:
      > > >
      > > > vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes 60
      > > > --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
      > > >
      > > > I'll attach the outputs of vw compiled with three different compiler
      > > > versions (all 64-bit systems). The lines are long, sorry. Notice
      > how the
      > > > results diverge radically at pass 22 while already differing
      > slightly at
      > > > pass 3. The first two produce predictors that predict all zeros; only
      > > > the last one is good. Based on these three data points should I
      > conclude
      > > > that I need at least gcc 4.5.3 or that I need Cygwin instead of Linux?
      > > >
      > > > Also, what does BFGS do after "In wolfe_eval: Zero or negative
      > curvature
      > > > detected."? It spends a long time doing apparently nothing, and the
      > > > resulting weights are not good.
      > > >
      > > > Thanks,
      > > > - Kari
      > > >
      > > > -----------------------------------------------------------------
      > > > This is under Linux with gcc version 4.1.2
      > > >
      > > > enabling BFGS based optimization **without** curvature calculation
      > > > creating quadratic features for pairs: sd aa
      > > > final_regressor = 75.rgr
      > > > using cache_file = vw.cache
      > > > ignoring text input in favor of cache input
      > > > num sources = 1
      > > > Num weight bits = 18
      > > > learning rate = 10
      > > > initial_t = 1
      > > > power_t = 0.5
      > > > decay_learning_rate = 1
      > > > using l2 regularization
      > > > m = 25
      > > > Allocated 54M for weights and mem
      > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
      > > > mix fraction curvature dir. magnitude step size time
      > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
      > > > 1.500608e+08 4.614935e+15 9.999999e-01 99.191
      > > > 3 7.503037e+07 1.404921e+01 4.320662e+08 -0.500000
      > > > -1.695048 (revise x 0.4)
      > > > 4.434236e-01 132.949
      > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
      > > > -1.138472 (revise x 0.4)
      > > > 1.900951e-01 166.716
      > > > 5 2.711309e+06 3.852626e+00 1.184828e+08 -0.095048
      > > > -0.885143 (revise x 0.4)
      > > > 7.967209e-02 200.672
      > > > 6 4.762672e+05 2.958291e+00 9.097862e+07 -0.039836
      > > > -0.774720 (revise x 0.4)
      > > > 3.299097e-02 234.418
      > > > 7 8.166480e+04 2.615997e+00 8.045179e+07 -0.016496
      > > > -0.728039 (revise x 0.4)
      > > > 1.358448e-02 268.178
      > > > 8 1.384746e+04 2.479955e+00 7.626798e+07 -0.006793
      > > > -0.708633 (revise x 0.4)
      > > > 5.579975e-03 302.334
      > > > 9 2.337698e+03 2.424913e+00 7.457523e+07 -0.002791
      > > > -0.700628 (revise x 0.4)
      > > > 2.289690e-03 336.101
      > > > 10 3.949191e+02 2.402469e+00 7.388500e+07 -0.001148
      > > > -0.697338 (revise x 0.4)
      > > > 9.391521e-04 369.972
      > > > 11 6.774264e+01 2.393287e+00 7.360262e+07 -0.000478
      > > > -0.695987 (revise x 0.4)
      > > > 3.851381e-04 403.756
      > > > 12 1.269738e+01 2.389526e+00 7.348695e+07 -0.000212
      > > > -0.695433 (revise x 0.4)
      > > > 1.579278e-04 437.512
      > > > 13 3.440464e+00 2.387984e+00 7.343953e+07 -0.000127
      > > > -0.695206 (revise x 0.4)
      > > > 6.475448e-05 471.257
      > > > 14 1.884181e+00 2.387352e+00 7.342009e+07 -0.000151
      > > > -0.695113 (revise x 0.4)
      > > > 2.654803e-05 505.011
      > > > 15 1.622633e+00 2.387093e+00 7.341212e+07 -0.000302
      > > > -0.695075 (revise x 0.4)
      > > > 1.088143e-05 538.771
      > > > 16 1.578712e+00 2.386986e+00 7.340885e+07 -0.000710
      > > > -0.695059 (revise x 0.4)
      > > > 4.457373e-06 572.513
      > > > 17 1.571349e+00 2.386943e+00 7.340751e+07 -0.001723
      > > > -0.695052 (revise x 0.4)
      > > > 1.823206e-06 606.268
      > > > 18 1.570121e+00 2.386925e+00 7.340696e+07 -0.004207
      > > > -0.695050 (revise x 0.4)
      > > > 7.430751e-07 640.011
      > > > 19 1.569918e+00 2.386918e+00 7.340674e+07 -0.010320
      > > > -0.695049 (revise x 0.4)
      > > > 3.001712e-07 674.087
      > > > 20 1.569885e+00 2.386915e+00 7.340665e+07 -0.025547
      > > > -0.695048 (revise x 0.4)
      > > > 1.185601e-07 707.928
      > > > 21 1.569872e+00 2.386913e+00 7.340661e+07 -0.064679
      > > > -0.695048 (revise x 0.4)
      > > > 4.409118e-08 741.700
      > > > 22 1.043330e+00 2.244743e+00 6.903435e+07 -0.094338
      > > > -0.673731 (revise x 0.3)
      > > > 1.526299e-08 775.480
      > > > 23 4.191600e-01
      > > > In wolfe_eval: Zero or negative curvature detected.
      > > > To increase curvature you can increase regularization or rescale
      > > > features.
      > > > It is also very likely that you have reached numerical accuracy
      > > > and further decrease in the objective cannot be reliably detected.
      > > > (revise x 0.0) 0.000000e+00 809.273
      > > > 24 4.191601e-01
      > > > (revise x 0.0) 0.000000e+00 1539.289
      > > > Net time spent in communication = 0 seconds
      > > > Net time spent = 1539.3 seconds
      > > > finished run
      > > > number of examples = 184522680
      > > > weighted example sum = 1.845e+08
      > > > weighted label sum = 1.031e+08
      > > > average loss = 0.5425
      > > > best constant = 0.5589
      > > > total feature number = 49403301420
      > > >
      > > >
      > > > -----------------------------------------------------------------
      > > > This is under Linux with gcc version 4.3.5
      > > >
      > > > enabling BFGS based optimization **without** curvature calculation
      > > > creating quadratic features for pairs: sd aa
      > > > final_regressor = 75.rgr
      > > > using cache_file = vw.cache
      > > > ignoring text input in favor of cache input
      > > > num sources = 1
      > > > Num weight bits = 18
      > > > learning rate = 10
      > > > initial_t = 1
      > > > power_t = 0.5
      > > > decay_learning_rate = 1
      > > > using l2 regularization
      > > > m = 25
      > > > Allocated 54M for weights and mem
      > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
      > > > mix fraction curvature dir. magnitude step size time
      > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
      > > > 1.500608e+08 4.614935e+15 1.000000e+00 47.379
      > > > 3 7.503038e+07 1.404921e+01 4.320662e+08 -0.500000
      > > > -1.695048 (revise x 0.4)
      > > > 4.434237e-01 69.483
      > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
      > > > -1.138472 (revise x 0.4)
      > > > 1.900951e-01 91.311
      > > > 5 2.711310e+06 3.852627e+00 1.184828e+08 -0.095048
      > > > -0.885143 (revise x 0.4)
      > > > 7.967210e-02 113.372
      > > > 6 4.762674e+05 2.958291e+00 9.097863e+07 -0.039836
      > > > -0.774720 (revise x 0.4)
      > > > 3.299098e-02 136.162
      > > > 7 8.166488e+04 2.615997e+00 8.045179e+07 -0.016496
      > > > -0.728039 (revise x 0.4)
      > > > 1.358448e-02 158.192
      > > > 8 1.384749e+04 2.479955e+00 7.626799e+07 -0.006793
      > > > -0.708633 (revise x 0.4)
      > > > 5.579975e-03 179.810
      > > > 9 2.337710e+03 2.424913e+00 7.457524e+07 -0.002791
      > > > -0.700628 (revise x 0.4)
      > > > 2.289690e-03 201.564
      > > > 10 3.949243e+02 2.402469e+00 7.388500e+07 -0.001148
      > > > -0.697338 (revise x 0.4)
      > > > 9.391521e-04 223.651
      > > > 11 6.774474e+01 2.393287e+00 7.360263e+07 -0.000478
      > > > -0.695987 (revise x 0.4)
      > > > 3.851381e-04 245.367
      > > > 12 1.269823e+01 2.389526e+00 7.348695e+07 -0.000212
      > > > -0.695433 (revise x 0.4)
      > > > 1.579278e-04 267.813
      > > > 13 3.440815e+00 2.387984e+00 7.343954e+07 -0.000128
      > > > -0.695206 (revise x 0.4)
      > > > 6.475448e-05 289.566
      > > > 14 1.884324e+00 2.387352e+00 7.342010e+07 -0.000151
      > > > -0.695113 (revise x 0.4)
      > > > 2.654803e-05 311.194
      > > > 15 1.622692e+00 2.387093e+00 7.341213e+07 -0.000302
      > > > -0.695075 (revise x 0.4)
      > > > 1.088143e-05 332.782
      > > > 16 1.578736e+00 2.386986e+00 7.340886e+07 -0.000710
      > > > -0.695059 (revise x 0.4)
      > > > 4.457373e-06 354.700
      > > > 17 1.571359e+00 2.386943e+00 7.340752e+07 -0.001723
      > > > -0.695053 (revise x 0.4)
      > > > 1.823206e-06 376.323
      > > > 18 1.570125e+00 2.386925e+00 7.340697e+07 -0.004207
      > > > -0.695050 (revise x 0.4)
      > > > 7.430750e-07 398.265
      > > > 19 1.569920e+00 2.386918e+00 7.340674e+07 -0.010320
      > > > -0.695049 (revise x 0.4)
      > > > 3.001712e-07 419.779
      > > > 20 1.569886e+00 2.386915e+00 7.340665e+07 -0.025547
      > > > -0.695048 (revise x 0.4)
      > > > 1.185601e-07 441.545
      > > > 21 1.569880e+00 2.386913e+00 7.340661e+07 -0.064679
      > > > -0.695048 (revise x 0.4)
      > > > 4.409114e-08 463.328
      > > > 22 1.551396e+00 2.386894e+00 7.340600e+07 -0.171127
      > > > -0.695045 (revise x 0.3)
      > > > 1.362805e-08 484.935
      > > > 23 4.191623e-01
      > > > In wolfe_eval: Zero or negative curvature detected.
      > > > To increase curvature you can increase regularization or rescale
      > > > features.
      > > > It is also very likely that you have reached numerical accuracy
      > > > and further decrease in the objective cannot be reliably detected.
      > > > (revise x 0.0) 0.000000e+00 506.788
      > > > 24 4.191600e-01
      > > > (revise x 0.0) 0.000000e+00 987.156
      > > > Net time spent in communication = 0 seconds
      > > > Net time spent = 987.16 seconds
      > > > finished run
      > > > number of examples = 184522680
      > > > weighted example sum = 1.845e+08
      > > > weighted label sum = 1.031e+08
      > > > average loss = 0.5509
      > > > best constant = 0.5589
      > > > total feature number = 49403301420
      > > >
      > > > -----------------------------------------------------------------
      > > > This is under Cygwin with gcc version 4.5.3
      > > >
      > > > enabling BFGS based optimization **without** curvature calculation
      > > > creating quadratic features for pairs: sd aa
      > > > final_regressor = 75.rgr
      > > > using cache_file = vw.cache
      > > > ignoring text input in favor of cache input
      > > > num sources = 1
      > > > Num weight bits = 18
      > > > learning rate = 10
      > > > initial_t = 1
      > > > power_t = 0.5
      > > > decay_learning_rate = 1
      > > > using l2 regularization
      > > > m = 25
      > > > Allocated 54M for weights and mem
      > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
      > > > mix fraction curvature dir. magnitude step size time
      > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
      > > > 1.500608e+08 4.614935e+15 1.000000e+00 73.531
      > > > 3 7.503038e+07 1.404921e+01 4.320662e+08 -0.500000
      > > > -1.695048 (revise x 0.4)
      > > > 4.434237e-01 109.199
      > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
      > > > -1.138472 (revise x 0.4)
      > > > 1.900951e-01 144.549
      > > > 5 2.711310e+06 3.852627e+00 1.184828e+08 -0.095048
      > > > -0.885143 (revise x 0.4)
      > > > 7.967210e-02 180.274
      > > > 6 4.762675e+05 2.958291e+00 9.097863e+07 -0.039836
      > > > -0.774720 (revise x 0.4)
      > > > 3.299098e-02 215.368
      > > > 7 8.166493e+04 2.615997e+00 8.045180e+07 -0.016496
      > > > -0.728039 (revise x 0.4)
      > > > 1.358448e-02 251.118
      > > > 8 1.384751e+04 2.479955e+00 7.626799e+07 -0.006793
      > > > -0.708633 (revise x 0.4)
      > > > 5.579975e-03 286.362
      > > > 9 2.337717e+03 2.424913e+00 7.457524e+07 -0.002791
      > > > -0.700628 (revise x 0.4)
      > > > 2.289690e-03 321.647
      > > > 10 3.949270e+02 2.402469e+00 7.388500e+07 -0.001148
      > > > -0.697338 (revise x 0.4)
      > > > 9.391521e-04 359.588
      > > > 11 6.774589e+01 2.393287e+00 7.360263e+07 -0.000478
      > > > -0.695987 (revise x 0.4)
      > > > 3.851380e-04 398.898
      > > > 12 1.269871e+01 2.389526e+00 7.348695e+07 -0.000212
      > > > -0.695433 (revise x 0.4)
      > > > 1.579277e-04 436.665
      > > > 13 3.441010e+00 2.387984e+00 7.343954e+07 -0.000128
      > > > -0.695206 (revise x 0.4)
      > > > 6.475448e-05 472.627
      > > > 14 1.884404e+00 2.387352e+00 7.342010e+07 -0.000151
      > > > -0.695113 (revise x 0.4)
      > > > 2.654803e-05 507.751
      > > > 15 1.622725e+00 2.387093e+00 7.341213e+07 -0.000302
      > > > -0.695075 (revise x 0.4)
      > > > 1.088143e-05 543.170
      > > > 16 1.578749e+00 2.386987e+00 7.340886e+07 -0.000710
      > > > -0.695059 (revise x 0.4)
      > > > 4.457372e-06 578.753
      > > > 17 1.571365e+00 2.386943e+00 7.340752e+07 -0.001723
      > > > -0.695053 (revise x 0.4)
      > > > 1.823206e-06 613.860
      > > > 18 1.570127e+00 2.386925e+00 7.340697e+07 -0.004207
      > > > -0.695050 (revise x 0.4)
      > > > 7.430749e-07 650.090
      > > > 19 1.569920e+00 2.386918e+00 7.340674e+07 -0.010320
      > > > -0.695049 (revise x 0.4)
      > > > 3.001711e-07 686.592
      > > > 20 1.569886e+00 2.386915e+00 7.340665e+07 -0.025547
      > > > -0.695048 (revise x 0.4)
      > > > 1.185601e-07 729.522
      > > > 21 1.569881e+00 2.386914e+00 7.340661e+07 -0.064679
      > > > -0.695048 (revise x 0.4)
      > > > 4.409113e-08 765.928
      > > > 22 1.567969e+00 2.386913e+00 7.340660e+07 -0.173632
      > > > -0.695048 (revise x 0.3)
      > > > 1.356294e-08 806.912
      > > > 23 3.738755e-01 1.373830e+00 4.225047e+07 0.022250
      > > > -0.520586 1.129881e-01
      > > > 1.000000e+00 845.604
      > > > 24 2.744955e-01 2.885634e-01 8.874415e+06 0.772202 0.455316
      > > > 1.031045e-01 1.000000e+00 880.431
      > > > 25 2.707598e-01 8.013777e-01 2.464539e+07 0.060259
      > > > -1.159684 2.684013e-02
      > > > 1.000000e+00 916.249
      > > > 26 2.521993e-01 3.737808e-02 1.149517e+06 0.437760
      > > > -0.119104 9.070075e-04
      > > > 1.000000e+00 951.681
      > > > 27 2.487337e-01 2.980484e-02 9.166114e+05 0.890319 0.774747
      > > > 1.577486e-02 1.000000e+00 988.121
      > > > 28 2.388466e-01 6.851274e-02 2.107026e+06 0.590096 0.194143
      > > > 4.510566e-03 1.000000e+00 1024.557
      > > > 29 2.331922e-01 3.384458e-02 1.040849e+06 0.785676 0.551578
      > > > 1.230063e-02 1.000000e+00 1060.234
      > > > 30 2.257282e-01 1.035047e-02 3.183161e+05 0.726034 0.453913
      > > > 1.402782e-02 1.000000e+00 1096.908
      > > > 31 2.205253e-01 1.324547e-02 4.073484e+05 0.730802 0.456633
      > > > 2.145415e-02 1.000000e+00 1134.393
      > > > 32 2.157588e-01 9.859573e-03 3.032191e+05 0.759526 0.475091
      > > > 5.921376e-02 1.000000e+00 1169.727
      > > > 33 2.088887e-01 1.235760e-02 3.800429e+05 0.730657 0.419121
      > > > 1.841315e-01 1.000000e+00 1205.274
      > > > 34 2.101704e-01 9.911478e-02 3.048154e+06 -0.092354
      > > > -1.324207 (revise x 0.5)
      > > > 5.300100e-01 1240.358
      > > > 35 2.056780e-01 1.969835e-02 6.057987e+05 0.436495
      > > > -0.248880 9.528314e-03
      > > > 1.000000e+00 1276.848
      > > > 36 2.022406e-01 3.700955e-03 1.138184e+05 0.668004 0.316562
      > > > 5.401969e-03 1.000000e+00 1312.630
      > > > 37 2.004172e-01 2.586845e-03 7.955525e+04 0.755313 0.517794
      > > > 2.109435e-02 1.000000e+00 1348.003
      > > > 38 1.974068e-01 2.885261e-03 8.873268e+04 0.724907 0.456644
      > > > 4.156245e-02 1.000000e+00 1383.463
      > > > 39 1.940215e-01 2.243242e-03 6.898818e+04 0.704875 0.418325
      > > > 6.534269e-02 1.000000e+00 1418.902
      > > > 40 1.910583e-01 5.228435e-03 1.607941e+05 0.592461 0.078042
      > > > 2.155944e-02 1.000000e+00 1454.386
      > > > 41 1.891780e-01 1.062300e-03 3.266975e+04 0.603013 0.277422
      > > > 6.547502e-03 1.000000e+00 1489.848
      > > > 42 1.881406e-01 1.233737e-03 3.794209e+04 0.698576 0.496102
      > > > 2.127681e-02 1.000000e+00 1525.388
      > > > 43 1.864403e-01 1.268378e-03 3.900742e+04 0.714181 0.491254
      > > > 8.674439e-02 1.000000e+00 1560.996
      > > > 44 1.844092e-01 3.018823e-03 9.284021e+04 0.512288
      > > > -0.040180 4.881849e-02
      > > > 1.000000e+00 1596.506
      > > > 45 1.827905e-01 1.491786e-03 4.587806e+04 0.475903
      > > > -0.057879 3.470570e-03
      > > > 1.000000e+00 1632.151
      > > > 46 1.817528e-01 5.705357e-04 1.754613e+04 0.750989 0.470422
      > > > 1.187489e-02 1.000000e+00 1668.191
      > > > 47 1.806794e-01 6.571171e-04 2.020884e+04 0.702106 0.352037
      > > > 2.917945e-02 1.000000e+00 1703.878
      > > > 48 1.794564e-01 9.732167e-04 2.993009e+04 0.767566 0.477306
      > > > 1.323640e-01 1.000000e+00 1739.512
      > > > 49 1.780222e-01 1.737843e-03 5.344525e+04 0.451400
      > > > -0.129532 1.930031e-02
      > > > 1.000000e+00 1775.168
      > > > 50 1.768670e-01 5.968434e-04 1.835519e+04 0.610033 0.239879
      > > > 6.557360e-03 1.000000e+00 1810.845
      > > > 51 1.761865e-01 5.629359e-04 1.731241e+04 0.721832 0.542742
      > > > 3.182580e-02 1.000000e+00 1846.487
      > > > 52 1.751273e-01 5.249340e-04 1.614370e+04 0.667468 0.420878
      > > > 8.439630e-02 1.000000e+00 1883.680
      > > > 53 1.742363e-01 1.555000e-02 4.782213e+05 0.441844
      > > > -0.451383 3.352547e-03
      > > > 1.000000e+00 1919.294
      > > > 54 1.737074e-01 1.324455e-03 4.073200e+04 0.408970 0.251978
      > > > 1.574223e-03 1.000000e+00 1955.011
      > > > 55 1.734941e-01 4.903513e-04 1.508016e+04 0.549206 0.444421
      > > > 4.880569e-03 1.000000e+00 1990.919
      > > > 56 1.732157e-01 6.418537e-04 1.973943e+04 0.761774 0.487686
      > > > 1.338996e-02 1.000000e+00 2026.526
      > > > 57 1.726689e-01 3.739879e-04 1.150154e+04 0.793421 0.412380
      > > > 1.254878e-01 1.000000e+00 2062.208
      > > > 58 1.723390e-01 2.002074e-03 6.157133e+04 0.309866
      > > > -0.856343 1.059850e-02
      > > > 1.000000e+00 2098.129
      > > > 59 1.717457e-01 5.019926e-04 1.543817e+04 0.356830
      > > > -0.113790 5.999178e-03
      > > > 1.000000e+00 2133.891
      > > > 60 1.714882e-01 2.208578e-04 6.792212e+03 0.613648 0.443083
      > > > 4.132071e-03 1.000000e+00 2169.619
      > > > Net time spent in communication = 0 seconds
      > > > Net time spent = 2169.6 seconds
      > > > finished run
      > > > number of examples = 184522680
      > > > weighted example sum = 1.845e+08
      > > > weighted label sum = 1.031e+08
      > > > average loss = 0.666
      > > > best constant = 0.5589
      > > > total feature number = 49403301420
      > > >
      > > >
      > > >
      > > >
      > > >
      > > >
      > > >
      > >
      >
      >
  • regularizer
    Message 2 of 9, Feb 21, 2012
        Thanks John, that helps: I don't get stuck at "Zero or negative curvature detected" anymore, and the initial step is slightly smaller. Still, the first printed direction magnitude seems huge to me. The loss becomes large at the second pass and takes time to creep back down. Output attached.

        If I first do on-line learning and then switch to bfgs, the same thing happens: the first direction magnitude is too large and all the initial learning is undone. Is there something I can do to continue optimization (with quantile loss) from a previous weight vector without effectively throwing it away?

        Thanks,

        - Kari


        $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes 60 --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
        enabling BFGS based optimization **without** curvature calculation
        creating quadratic features for pairs: sd aa
        final_regressor = 75.rgr
        creating cache_file = vw.cache
        Reading from stdin
        num sources = 1
        Num weight bits = 18
        learning rate = 10
        initial_t = 1
        power_t = 0.5
        decay_learning_rate = 1
        using l2 regularization
        m = 25
        Allocated 54M for weights and mem
        ## avg. loss    der. mag.       d. m. cond.      wolfe1         wolfe2          mix fraction    curvature       dir. magnitude  step size       time     
         1 4.191599e-01 4.879425e+00    1.500608e+08                                                                    4.614935e+15    5.000000e-01    52.935   
         2 1.875760e+07 6.998204e+00    2.152212e+08     -0.250000      -1.195048                                       (revise x 0.5)  2.500000e-01    75.075
         3 4.689400e+06 4.387594e+00    1.349351e+08     -0.125000      -0.945048                                       (revise x 0.5)  1.250000e-01    98.801
         4 1.172351e+06 3.311012e+00    1.018261e+08     -0.062500      -0.820048                                       (revise x 0.5)  6.250000e-02    121.187
         5 2.930890e+05 2.829902e+00    8.703020e+07     -0.031250      -0.757548                                       (revise x 0.5)  3.125000e-02    144.725
         6 7.327342e+04 2.603643e+00    8.007185e+07     -0.015625      -0.726298                                       (revise x 0.5)  1.562500e-02    167.149
         7 1.831953e+04 2.494086e+00    7.670258e+07     -0.007813      -0.710673                                       (revise x 0.5)  7.812500e-03    190.380
         8 4.581061e+03 2.440202e+00    7.504543e+07     -0.003907      -0.702861                                       (revise x 0.5)  3.906250e-03    212.310
         9 1.146443e+03 2.413483e+00    7.422372e+07     -0.001955      -0.698954                                       (revise x 0.5)  1.953125e-03    239.450
        10 2.877881e+02 2.400179e+00    7.381458e+07     -0.000980      -0.697001                                       (revise x 0.5)  9.765625e-04    270.791
        11 7.312442e+01 2.393541e+00    7.361044e+07     -0.000496      -0.696025                                       (revise x 0.5)  4.882812e-04    305.179
        12 1.945852e+01 2.390226e+00    7.350848e+07     -0.000260      -0.695536                                       (revise x 0.5)  2.441406e-04    339.859
        13 6.042039e+00 2.388569e+00    7.345753e+07     -0.000153      -0.695292                                       (revise x 0.5)  1.220703e-04    374.339
        14 2.687920e+00 2.387741e+00    7.343206e+07     -0.000124      -0.695170                                       (revise x 0.5)  6.103516e-05    408.740
        15 1.849390e+00 2.387327e+00    7.341932e+07     -0.000156      -0.695109                                       (revise x 0.5)  3.051758e-05    443.824
        16 1.639757e+00 2.387120e+00    7.341296e+07     -0.000267      -0.695079                                       (revise x 0.5)  1.525879e-05    478.239
        17 1.587349e+00 2.387016e+00    7.340977e+07     -0.000510      -0.695063                                       (revise x 0.5)  7.629395e-06    512.576
        18 1.574247e+00 2.386965e+00    7.340818e+07     -0.001009      -0.695056                                       (revise x 0.5)  3.814697e-06    547.266
        19 1.570972e+00 2.386939e+00    7.340739e+07     -0.002012      -0.695052                                       (revise x 0.5)  1.907349e-06    581.753
        20 1.570153e+00 2.386926e+00    7.340699e+07     -0.004021      -0.695050                                       (revise x 0.5)  9.536743e-07    616.106
        21 1.569948e+00 2.386919e+00    7.340679e+07     -0.008041      -0.695049                                       (revise x 0.5)  4.768372e-07    650.558
        22 1.569897e+00 2.386916e+00    7.340669e+07     -0.016082      -0.695049                                       (revise x 0.5)  2.384186e-07    684.996
        23 1.569884e+00 2.386914e+00    7.340664e+07     -0.032164      -0.695048                                       (revise x 0.5)  1.192093e-07    719.363
        24 1.569881e+00 2.386914e+00    7.340662e+07     -0.064327      -0.695048                                       (revise x 0.5)  5.960464e-08    753.747
        25 1.569811e+00 2.386913e+00    7.340660e+07     -0.128646      -0.695048                                       (revise x 0.5)  2.980232e-08    788.155
        26 1.560698e+00 2.386911e+00    7.340653e+07     -0.255254      -0.695048                                       (revise x 0.5)  1.490116e-08    822.468
        27 1.335834e+00 2.381222e+00    7.323158e+07     -0.409947      -0.694208                                       (revise x 0.5)  7.450581e-09    856.963
        28 6.457887e-01 2.206131e+00    6.784688e+07     -0.202702      -0.667622                                       (revise x 0.5)  3.725290e-09    892.038
        29 3.134720e-01 9.375462e-01    2.883309e+07     0.189059       -0.424878                                       5.721519e-03    1.000000e+00    928.184  
        30 2.615405e-01 1.543026e-01    4.745389e+06     0.710839       0.356731                                        3.218869e-03    1.000000e+00    964.430  
        31 2.527036e-01 1.169463e-01    3.596541e+06     0.397394       -0.292694                                       3.594861e-04    1.000000e+00    1000.642 
        32 2.478497e-01 3.924282e-02    1.206865e+06     0.797068       0.607724                                        5.707927e-03    1.000000e+00    1036.349 
        33 2.382581e-01 2.043136e-02    6.283417e+05     0.687158       0.373366                                        5.886233e-03    1.000000e+00    1072.106 
        34 2.325280e-01 1.878525e-02    5.777174e+05     0.723725       0.451241                                        2.183486e-02    1.000000e+00    1107.914 
        35 2.234863e-01 2.136923e-02    6.571848e+05     0.683187       0.380104                                        3.404222e-02    1.000000e+00    1143.795 
        36 2.161262e-01 7.854667e-03    2.415607e+05     0.677020       0.306281                                        2.641602e-02    1.000000e+00    1180.005 
        37 2.111394e-01 6.731176e-03    2.070091e+05     0.734114       0.365996                                        3.066953e-02    1.000000e+00    1215.994 
        38 2.064287e-01 7.055607e-03    2.169866e+05     0.791928       0.477421                                        9.820154e-02    1.000000e+00    1252.024 
        39 2.005127e-01 1.397243e-02    4.297050e+05     0.532899       0.010674                                        2.559544e-02    1.000000e+00    1288.122 
        40 1.968502e-01 1.953271e-03    6.007048e+04     0.581431       0.154733                                        3.655928e-03    1.000000e+00    1324.317 
        41 1.956542e-01 2.292869e-03    7.051438e+04     0.661248       0.613157                                        2.971351e-02    1.000000e+00    1360.552 
        42 1.928280e-01 2.290130e-03    7.043016e+04     0.598646       0.344799                                        5.504233e-02    1.000000e+00    1396.824 
        43 1.902657e-01 3.252593e-03    1.000295e+05     0.562222       0.065822                                        2.691962e-02    1.000000e+00    1433.380 
        44 1.881628e-01 9.226223e-04    2.837412e+04     0.705538       0.362939                                        1.867108e-02    1.000000e+00    1469.847 
        45 1.866176e-01 8.557320e-04    2.631699e+04     0.759632       0.393094                                        2.948932e-02    1.000000e+00    1506.383 
        46 1.851779e-01 1.614627e-03    4.965587e+04     0.641649       0.222579                                        5.023984e-02    1.000000e+00    1542.968 
        47 1.836210e-01 1.426560e-03    4.387212e+04     0.550367       0.067523                                        9.671028e-03    1.000000e+00    1579.712 
        48 1.824689e-01 6.715775e-04    2.065355e+04     0.740073       0.578498                                        3.563188e-02    1.000000e+00    1616.411 
        49 1.810121e-01 8.212360e-04    2.525611e+04     0.560126       0.208578                                        2.365269e-02    1.000000e+00    1653.145 
        50 1.799282e-01 1.155586e-03    3.553864e+04     0.709983       0.329706                                        4.263906e-02    1.000000e+00    1689.971 
        51 1.785792e-01 6.220927e-04    1.913170e+04     0.761242       0.439758                                        7.058747e-02    1.000000e+00    1726.856 
        52 1.770287e-01 7.729798e-04    2.377205e+04     0.704055       0.307157                                        1.056268e-01    1.000000e+00    1764.094 
        53 1.759657e-01 1.297268e-03    3.989590e+04     0.405447       -0.213848                                       3.006896e-03    1.000000e+00    1800.991 
        54 1.751416e-01 4.022849e-04    1.237178e+04     0.673106       0.447034                                        7.249319e-03    1.000000e+00    1838.000 
        55 1.745050e-01 3.247104e-04    9.986073e+03     0.606828       0.369200                                        1.471770e-02    1.000000e+00    1874.988 
        56 1.738613e-01 3.934394e-04    1.209975e+04     0.728943       0.567636                                        1.063217e-01    1.000000e+00    1911.937 
        57 1.727164e-01 1.591558e-03    4.894642e+04     0.547620       -0.063455                                       5.282468e-02    1.000000e+00    1948.869 
        58 1.726473e-01 1.510033e-02    4.643923e+05     0.043418       -0.577282
        Termination condition reached in pass 58: decrease in loss less than 0.100%.
        If you want to optimize further, decrease termination threshold.
                                                                                                                        1.219617e-03    1.000000e+00    1985.922 
        59 1.718465e-01 5.661531e-04    1.741135e+04     0.782218       0.216402                                        5.475620e-04    1.000000e+00    2045.012 
        finished run
        number of examples = 184522680
        weighted example sum = 1.845e+08
        weighted label sum = 1.031e+08
        average loss = 0.8003
        best constant = 0.5589
        total feature number = 49403301420




        --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@...> wrote:
        >
        > Try commenting out first_hessian_on = true in drive_bfgs() in bfgs.cc in
        > the current master.
        >
        > -John
        >
      • John Langford
        Message 3 of 9, Feb 21, 2012
          That's not a happy optimization, but at least it's converging after
          wasting half of the time flailing. More regularization would help with
          the initial step, but that's not enough, and some serious thought is
          required here. Is the dataset available? Does multipass online learning
          do something sensible?

          -John
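
          (A sketch of why more regularization helps with that initial step,
          assuming the --l2 term enters the BFGS objective as an additive
          penalty $\frac{\lambda}{2}\lVert w\rVert_2^2$: the quantile loss
          contributes no curvature, so almost everywhere the Hessian of

          $$J(w) = \sum_i \ell_\tau(w^\top x_i,\, y_i) + \frac{\lambda}{2}\lVert w\rVert_2^2$$

          is just $\lambda I$, and a curvature-scaled step then has magnitude on
          the order of $\lVert \nabla J(w)\rVert / \lambda$ rather than blowing up.)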

          On 02/21/2012 09:17 PM, regularizer wrote:
          >
          > Thanks John, that helps in that I don't get stuck at "Zero or negative
          > curvature detected" anymore and the initial step is slightly smaller.
          > Still, the first printed direction magnitude seems huge to me. The
          > loss at second pass becomes large and it takes time to creep back.
          > Output attached.
          >
          > If I do first on-line learning and then switch to bfgs, the same
          > happens - the first direction magnitude is too large and all initial
          > learning is undone. Is there something I can do to continue
          > optimization (with quantile loss) starting from a previous weight
          > vector without actually throwing it away?
          >
          > Thanks,
          >
          > - Kari
          >
          >
          > $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes
          > 60 --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
          > enabling BFGS based optimization **without** curvature calculation
          > creating quadratic features for pairs: sd aa
           > final_regressor = 75.rgr
          > creating cache_file = vw.cache
          > Reading from stdin
          > num sources = 1
          > Num weight bits = 18
          > learning rate = 10
          > initial_t = 1
          > power_t = 0.5
          > decay_learning_rate = 1
          > using l2 regularization
          > m = 25
          > Allocated 54M for weights and mem
           > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2 mix fraction curvature dir. magnitude step size time
           > 1 4.191599e-01 4.879425e+00 1.500608e+08 4.614935e+15 5.000000e-01 52.935
           > 2 1.875760e+07 6.998204e+00 2.152212e+08 -0.250000 -1.195048 (revise x 0.5) 2.500000e-01 75.075
           > 3 4.689400e+06 4.387594e+00 1.349351e+08 -0.125000 -0.945048 (revise x 0.5) 1.250000e-01 98.801
           > 4 1.172351e+06 3.311012e+00 1.018261e+08 -0.062500 -0.820048 (revise x 0.5) 6.250000e-02 121.187
           > 5 2.930890e+05 2.829902e+00 8.703020e+07 -0.031250 -0.757548 (revise x 0.5) 3.125000e-02 144.725
           > 6 7.327342e+04 2.603643e+00 8.007185e+07 -0.015625 -0.726298 (revise x 0.5) 1.562500e-02 167.149
           > 7 1.831953e+04 2.494086e+00 7.670258e+07 -0.007813 -0.710673 (revise x 0.5) 7.812500e-03 190.380
           > 8 4.581061e+03 2.440202e+00 7.504543e+07 -0.003907 -0.702861 (revise x 0.5) 3.906250e-03 212.310
           > 9 1.146443e+03 2.413483e+00 7.422372e+07 -0.001955 -0.698954 (revise x 0.5) 1.953125e-03 239.450
           > 10 2.877881e+02 2.400179e+00 7.381458e+07 -0.000980 -0.697001 (revise x 0.5) 9.765625e-04 270.791
           > 11 7.312442e+01 2.393541e+00 7.361044e+07 -0.000496 -0.696025 (revise x 0.5) 4.882812e-04 305.179
           > 12 1.945852e+01 2.390226e+00 7.350848e+07 -0.000260 -0.695536 (revise x 0.5) 2.441406e-04 339.859
           > 13 6.042039e+00 2.388569e+00 7.345753e+07 -0.000153 -0.695292 (revise x 0.5) 1.220703e-04 374.339
           > 14 2.687920e+00 2.387741e+00 7.343206e+07 -0.000124 -0.695170 (revise x 0.5) 6.103516e-05 408.740
           > 15 1.849390e+00 2.387327e+00 7.341932e+07 -0.000156 -0.695109 (revise x 0.5) 3.051758e-05 443.824
           > 16 1.639757e+00 2.387120e+00 7.341296e+07 -0.000267 -0.695079 (revise x 0.5) 1.525879e-05 478.239
           > 17 1.587349e+00 2.387016e+00 7.340977e+07 -0.000510 -0.695063 (revise x 0.5) 7.629395e-06 512.576
           > 18 1.574247e+00 2.386965e+00 7.340818e+07 -0.001009 -0.695056 (revise x 0.5) 3.814697e-06 547.266
           > 19 1.570972e+00 2.386939e+00 7.340739e+07 -0.002012 -0.695052 (revise x 0.5) 1.907349e-06 581.753
           > 20 1.570153e+00 2.386926e+00 7.340699e+07 -0.004021 -0.695050 (revise x 0.5) 9.536743e-07 616.106
           > 21 1.569948e+00 2.386919e+00 7.340679e+07 -0.008041 -0.695049 (revise x 0.5) 4.768372e-07 650.558
           > 22 1.569897e+00 2.386916e+00 7.340669e+07 -0.016082 -0.695049 (revise x 0.5) 2.384186e-07 684.996
           > 23 1.569884e+00 2.386914e+00 7.340664e+07 -0.032164 -0.695048 (revise x 0.5) 1.192093e-07 719.363
           > 24 1.569881e+00 2.386914e+00 7.340662e+07 -0.064327 -0.695048 (revise x 0.5) 5.960464e-08 753.747
           > 25 1.569811e+00 2.386913e+00 7.340660e+07 -0.128646 -0.695048 (revise x 0.5) 2.980232e-08 788.155
           > 26 1.560698e+00 2.386911e+00 7.340653e+07 -0.255254 -0.695048 (revise x 0.5) 1.490116e-08 822.468
           > 27 1.335834e+00 2.381222e+00 7.323158e+07 -0.409947 -0.694208 (revise x 0.5) 7.450581e-09 856.963
           > 28 6.457887e-01 2.206131e+00 6.784688e+07 -0.202702 -0.667622 (revise x 0.5) 3.725290e-09 892.038
           > 29 3.134720e-01 9.375462e-01 2.883309e+07 0.189059 -0.424878 5.721519e-03 1.000000e+00 928.184
           > 30 2.615405e-01 1.543026e-01 4.745389e+06 0.710839 0.356731 3.218869e-03 1.000000e+00 964.430
           > 31 2.527036e-01 1.169463e-01 3.596541e+06 0.397394 -0.292694 3.594861e-04 1.000000e+00 1000.642
           > 32 2.478497e-01 3.924282e-02 1.206865e+06 0.797068 0.607724 5.707927e-03 1.000000e+00 1036.349
           > 33 2.382581e-01 2.043136e-02 6.283417e+05 0.687158 0.373366 5.886233e-03 1.000000e+00 1072.106
           > 34 2.325280e-01 1.878525e-02 5.777174e+05 0.723725 0.451241 2.183486e-02 1.000000e+00 1107.914
           > 35 2.234863e-01 2.136923e-02 6.571848e+05 0.683187 0.380104 3.404222e-02 1.000000e+00 1143.795
           > 36 2.161262e-01 7.854667e-03 2.415607e+05 0.677020 0.306281 2.641602e-02 1.000000e+00 1180.005
           > 37 2.111394e-01 6.731176e-03 2.070091e+05 0.734114 0.365996 3.066953e-02 1.000000e+00 1215.994
           > 38 2.064287e-01 7.055607e-03 2.169866e+05 0.791928 0.477421 9.820154e-02 1.000000e+00 1252.024
           > 39 2.005127e-01 1.397243e-02 4.297050e+05 0.532899 0.010674 2.559544e-02 1.000000e+00 1288.122
           > 40 1.968502e-01 1.953271e-03 6.007048e+04 0.581431 0.154733 3.655928e-03 1.000000e+00 1324.317
           > 41 1.956542e-01 2.292869e-03 7.051438e+04 0.661248 0.613157 2.971351e-02 1.000000e+00 1360.552
           > 42 1.928280e-01 2.290130e-03 7.043016e+04 0.598646 0.344799 5.504233e-02 1.000000e+00 1396.824
           > 43 1.902657e-01 3.252593e-03 1.000295e+05 0.562222 0.065822 2.691962e-02 1.000000e+00 1433.380
           > 44 1.881628e-01 9.226223e-04 2.837412e+04 0.705538 0.362939 1.867108e-02 1.000000e+00 1469.847
           > 45 1.866176e-01 8.557320e-04 2.631699e+04 0.759632 0.393094 2.948932e-02 1.000000e+00 1506.383
           > 46 1.851779e-01 1.614627e-03 4.965587e+04 0.641649 0.222579 5.023984e-02 1.000000e+00 1542.968
           > 47 1.836210e-01 1.426560e-03 4.387212e+04 0.550367 0.067523 9.671028e-03 1.000000e+00 1579.712
           > 48 1.824689e-01 6.715775e-04 2.065355e+04 0.740073 0.578498 3.563188e-02 1.000000e+00 1616.411
           > 49 1.810121e-01 8.212360e-04 2.525611e+04 0.560126 0.208578 2.365269e-02 1.000000e+00 1653.145
           > 50 1.799282e-01 1.155586e-03 3.553864e+04 0.709983 0.329706 4.263906e-02 1.000000e+00 1689.971
           > 51 1.785792e-01 6.220927e-04 1.913170e+04 0.761242 0.439758 7.058747e-02 1.000000e+00 1726.856
           > 52 1.770287e-01 7.729798e-04 2.377205e+04 0.704055 0.307157 1.056268e-01 1.000000e+00 1764.094
           > 53 1.759657e-01 1.297268e-03 3.989590e+04 0.405447 -0.213848 3.006896e-03 1.000000e+00 1800.991
           > 54 1.751416e-01 4.022849e-04 1.237178e+04 0.673106 0.447034 7.249319e-03 1.000000e+00 1838.000
           > 55 1.745050e-01 3.247104e-04 9.986073e+03 0.606828 0.369200 1.471770e-02 1.000000e+00 1874.988
           > 56 1.738613e-01 3.934394e-04 1.209975e+04 0.728943 0.567636 1.063217e-01 1.000000e+00 1911.937
           > 57 1.727164e-01 1.591558e-03 4.894642e+04 0.547620 -0.063455 5.282468e-02 1.000000e+00 1948.869
           > 58 1.726473e-01 1.510033e-02 4.643923e+05 0.043418 -0.577282
           > Termination condition reached in pass 58: decrease in loss less than 0.100%.
           > If you want to optimize further, decrease termination threshold.
           > 1.219617e-03 1.000000e+00 1985.922
           > 59 1.718465e-01 5.661531e-04 1.741135e+04 0.782218 0.216402 5.475620e-04 1.000000e+00 2045.012
          > finished run
          > number of examples = 184522680
          > weighted example sum = 1.845e+08
          > weighted label sum = 1.031e+08
          > average loss = 0.8003
          > best constant = 0.5589
          > total feature number = 49403301420
          >
          >
          >
          > --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@...> wrote:
          > >
           > > Try commenting out first_hessian_on = true in drive_bfgs() in bfgs.cc
           > > in the current master.
          > >
          > > -John
          > >
        • regularizer
          John, if you have a mechanism for me to drop a 120M compressed file, it is available. If multipass online learning means running online learning, saving the
           Message 4 of 9, Feb 22, 2012
            John, if you have a mechanism for me to drop a 120M compressed file, it is available.

            If "multipass online learning" means running online learning, saving the weights; loading the weights, running further online training saving the weights; and so on, then no, each time the initial training seems to destroy whatever the previous training achieved (with default parameters).

            - Kari
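
             For concreteness, the save/load loop described above can be driven with vw's regressor options alone; the following is only a sketch (it assumes -f writes out the final regressor and -i loads it back as the starting point for the next run, and it drops --bfgs so the passes stay online). The file names stage1.rgr and stage2.rgr are placeholders, not names from this thread:

             $ vw --loss_function quantile --quantile_tau 0.75 -q sd -q aa --passes 10 --cache_file vw.cache -f stage1.rgr
             $ vw --loss_function quantile --quantile_tau 0.75 -q sd -q aa --passes 10 --cache_file vw.cache -i stage1.rgr -f stage2.rgr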


            --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@...> wrote:
            >
            > That's not a happy optimization, but at least it's converging after
            > wasting half of the time flailing. More regularization would help with
            > the initial step, but that's not enough, and some serious thought is
            > required here. Is the dataset available? Does multipass online learning
            > do something sensible?
            >
            > -John
            >
            > On 02/21/2012 09:17 PM, regularizer wrote:
            > >
            > > Thanks John, that helps in that I don't get stuck at "Zero or negative
            > > curvature detected" anymore and the initial step is slightly smaller.
            > > Still, the first printed direction magnitude seems huge to me. The
            > > loss at second pass becomes large and it takes time to creep back.
            > > Output attached.
            > >
            > > If I do first on-line learning and then switch to bfgs, the same
            > > happens - the first direction magnitude is too large and all initial
            > > learning is undone. Is there something I can do to continue
            > > optimization (with quantile loss) starting from a previous weight
            > > vector without actually throwing it away?
            > >
            > > Thanks,
            > >
            > > - Kari
            > >
            > >
            > > $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes
            > > 60 --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
            > > enabling BFGS based optimization **without** curvature calculation
            > > creating quadratic features for pairs: sd aa
            > > final_regressor = 75! .rgr
            > > creating cache_file = vw.cache
            > > Reading from stdin
            > > num sources = 1
            > > Num weight bits = 18
            > > learning rate = 10
            > > initial_t = 1
            > > power_t = 0.5
            > > decay_learning_rate = 1
            > > using l2 regularization
            > > m = 25
            > > Allocated 54M for weights and mem
            > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2 mix fraction
            > > curvature dir. magnitude step size time
            > > 1 4.191599e-01 4.879425e+00 1.500608e+08 ! &nbs p; 4.614935e+15
            > > 5.000000e-01 52.935
            > > 2 1.875760e+07 6.998204e+00 2.152212e+08 -0.250000 -1.195048 (revise x
            > > 0.5) 2.500000e-01 75.075
            > > 3 4.689400e+06 4.387594e+00 1.349351e+08 -0.125000 -0.945048 ! (revise
            > > x 0.5) 1.250000e-01 98.801
            > > 4 1.172351e+06 3.311012e+00 1.018261e+08 -0.062500 -0.820048 (revise x
            > > 0.5) 6.250000e-02 121.187
            > > 5 2.930890e+05 2.829902e+00 8.703020e+07 -0.031250 -0.757548 (revise x
            > > 0.5) 3.125000e-02 144.725
            > > 6 7.327342e! +04 2.603643e+00 8.007185e+07 ; -0.015625 -0.726298
            > > (revise x 0.5) 1.562500e-02 167.149
            > > 7 1.831953e+04 2.494086e+00 7.670258e+07 -0.007813 -0.710673 (revise x
            > > 0.5) 7.812500e-03 190.380
            > > 8 4.581061e+03 2.440202e+00 7.504543e+07 -0.003907 -0.702861 ! (revise
            > > x 0.5) 3.906250e-03 212.310
            > > 9 1.146443e+03 2.413483e+00 7.422372e+07 -0.001955 -0.698954 (revise x
            > > 0.5) 1.953125e-03 239.450
            > > 10 2.877881e+02 2.400179e+00 7.381458e+07 -0.000980 -0.697001 &! nbsp;
            > > ; (revise x 0.5) 9.765625e-04 270.791
            > > 11 7.312442e+01 2.393541e+00 7.361044e+07 -0.000496 -0.696025 (revise
            > > x 0.5) 4.882812e-04 305.179
            > > 12 1.945852e+01 2.390226e+00 7.350848e+07 -0.000260 -0.695536 (revise
            > > x 0.5) 2.441406e-04 339.859
            > > 13 6.042039e+00 2.388569e+00 &n! bsp; 7.345753e+07 -0.000153 -0.695292
            > > (revise x 0.5) 1.220703e-04 374.339
            > > 14 2.687920e+00 2.387741e+00 7.343206e+07 -0.000124 -0.695170 (revise
            > > x 0.5) 6.103516e-05 408.740
            > > 15 1.849390e+00 2.387327e+00 7.341932e+07 -0.000156 -0.695109 &! nbsp;
            > > ; (revise x 0.5) 3.051758e-05 443.824
            > > 16 1.639757e+00 2.387120e+00 7.341296e+07 -0.000267 -0.695079 (revise
            > > x 0.5) 1.525879e-05 478.239
            > > 17 1.587349e+00 2.387016e+00 7.340977e+07 -0.000510 -0.695063 !
            > > (revise x 0.5) 7.629395e-06 512.576
            > > 18 1.574247e+00 2.386965e+00 7.340818e+07 -0.001009 -0.695056 (revise
            > > x 0.5) 3.814697e-06 547.266
            > > 19 1.570972e+00 2.386939e+00 7.340739e+07 -0.002012 -0.695052 (revise
            > > x 0.5) 1.907349e-06 581.753
            > > 20 1.570153e+00 ! 2.386926e+00 7.340699e+07 & nbsp; -0.004021
            > > -0.695050 (revise x 0.5) 9.536743e-07 616.106
            > > 21 1.569948e+00 2.386919e+00 7.340679e+07 -0.008041 -0.695049 (revise
            > > x 0.5) 4.768372e-07 650.558
            > > 22 1.569897e+00 2.386916e+00 7.340669e+07 -0.016082 -0.695049 !
            > > (revise x 0.5) 2.384186e-07 684.996
            > > 23 1.569884e+00 2.386914e+00 7.340664e+07 -0.032164 -0.695048 (revise
            > > x 0.5) 1.192093e-07 719.363
            > > 24 1.569881e+00 2.386914e+00 7.340662e+07 -0.064327 -0.695048 !
            > > (revise x 0.5)& nbsp; 5.960464e-08 753.747
            > > 25 1.569811e+00 2.386913e+00 7.340660e+07 -0.128646 -0.695048 (revise
            > > x 0.5) 2.980232e-08 788.155
            > > 26 1.560698e+00 2.386911e+00 7.340653e+07 -0.255254 -0.695048 (revise
            > > x 0.5) 1.490116e-08 822.468
            > > 27 1.335834e+00 2.381222e+00 7.323158e! 3;07 -0.409947 -0.694208
            > > (revise x 0.5) 7.450581e-09 856.963
            > > 28 6.457887e-01 2.206131e+00 6.784688e+07 -0.202702 -0.667622 (revise
            > > x 0.5) 3.725290e-09 892.038
            > > 29 3.134720e-01 9.375462e-01 2.883309e+07 0.189059 -0.424878 &! nbsp;
            > > ; 5.721519e-03 1.000000e+00 928.184
            > > 30 2.615405e-01 1.543026e-01 4.745389e+06 0.710839 0.356731
            > > 3.218869e-03 1.000000e+00 964.430
            > > 31 2.527036e-01 1.169463e-01 3.596541e+06 0.397394 -0.292694 &! nbsp;
            > > 3.594861e-04 1.000000e+00 1000.642
            > > 32 2.478497e-01 3.924282e-02 1.206865e+06 0.797068 0.607724
            > > 5.707927e-03 1.000000e+00 1036.349
            > > 33 2.382581e-01 2.043136e-02 6.283417e+05 0.687158 0.373366
            > > 5.886233e-03! 1.000000e+00 1072.106 < br>34 2.325280e-01 1.878525e-02
            > > 5.777174e+05 0.723725 0.451241 2.183486e-02 1.000000e+00 1107.914
            > > 35 2.234863e-01 2.136923e-02 6.571848e+05 0.683187 0.380104
            > > 3.404222e-02 1.000000e+00 1143.795
            > > 36 2.161262e-01 7.854667e-03 2.415607e+05&! nbsp; 0.677020 0.306281
            > > 2.641602e-02 1.000000e+00 1180.005
            > > 37 2.111394e-01 6.731176e-03 2.070091e+05 0.734114 0.365996
            > > 3.066953e-02 1.000000e+00 1215.994
            > > 38 2.064287e-01 7.055607e-03 2.169866e+05 0.791928 &nbs! p; 0.477421 ;
            > > 9.820154e-02 1.000000e+00 1252.024
            > > 39 2.005127e-01 1.397243e-02 4.297050e+05 0.532899 0.010674
            > > 2.559544e-02 1.000000e+00 1288.122
            > > 40 1.968502e-01 1.953271e-03 6.007048e+04 0.581431 0.154733 !
            > > 3.655928e-03 1.000000e+00 1324.317
            > > 41 1.956542e-01 2.292869e-03 7.051438e+04 0.661248 0.613157
            > > 2.971351e-02 1.000000e+00 1360.552
            > > 42 1.928280e-01 2.290130e-03 7.043016e+04 0.598646 0.344799 !
            > > 5.504233e-02 &nb sp; 1.000000e+00 1396.824
            > > 43 1.902657e-01 3.252593e-03 1.000295e+05 0.562222 0.065822
            > > 2.691962e-02 1.000000e+00 1433.380
            > > 44 1.881628e-01 9.226223e-04 2.837412e+04 0.705538 0.362939
            > > 1.867108e-02 1.000000e+00 1469.847
            > > 45 1.8661! 76e-01 8.557320e-04 2.631699e+04 0.759632 0.393094
            > > 2.948932e-02 1.000000e+00 1506.383
            > > 46 1.851779e-01 1.614627e-03 4.965587e+04 0.641649 0.222579
            > > 5.023984e-02 1.000000e+00 1542.968
            > > 47 1.836210e-01 1.426560e-03 4.387212e+04 ! 0.550367 0.067523&nbs p;
            > > 9.671028e-03 1.000000e+00 1579.712
            > > 48 1.824689e-01 6.715775e-04 2.065355e+04 0.740073 0.578498
            > > 3.563188e-02 1.000000e+00 1616.411
            > > 49 1.810121e-01 8.212360e-04 2.525611e+04 0.560126 0.208578 ! ;
            > > 2.365269e-02 1.000000e+00 1653.145
            > > 50 1.799282e-01 1.155586e-03 3.553864e+04 0.709983 0.329706
            > > 4.263906e-02 1.000000e+00 1689.971
            > > 51 1.785792e-01 6.220927e-04 1.913170e+04 0.761242 0.439758 ! ; &nb
            > > sp; 7.058747e-02 1.000000e+00 1726.856
            > > 52 1.770287e-01 7.729798e-04 2.377205e+04 0.704055 0.307157
            > > 1.056268e-01 1.000000e+00 1764.094
            > > 53 1.759657e-01 1.297268e-03 3.989590e+04 0.405447 -0.213848
            > > 3.006896e-03 1.000000e+! ;00 1800.991
            > > 54 1.751416e-01 4.022849e-04 1.237178e+04 0.673106 0.447034
            > > 7.249319e-03 1.000000e+00 1838.000
            > > 55 1.745050e-01 3.247104e-04 9.986073e+03 0.606828 0.369200
            > > 1.471770e-02 1.000000e+00 1874.988
            > > 56 1.738613e-01 3.934394e! -04 1.209975e+04 0.72894 3 0.567636
            > > 1.063217e-01 1.000000e+00 1911.937
            > > 57 1.727164e-01 1.591558e-03 4.894642e+04 0.547620 -0.063455
            > > 5.282468e-02 1.000000e+00 1948.869
            > > 58 1.726473e-01 1.510033e-02 4.643923e+05 0.043418 -0.577282
            > > Termination conditio! n reached in pass 58: decrease in loss less than
            > > 0.100%.
            > > If you want to optimize further, decrease termination threshold.
            > > 1.219617e-03 1.000000e+00 1985.922
            > > 59 1.718465e-01 5.661531e-04 1.741135e+04 0.782218 ! 0.216402 & nbsp;
            > > 5.475620e-04 1.000000e+00 2045.012
            > > finished run
            > > number of examples = 184522680
            > > weighted example sum = 1.845e+08
            > > weighted label sum = 1.031e+08
            > > average loss = 0.8003
            > > best constant = 0.5589
            > > total feature number = 49403301420
            > >
            > >
            > >
            > > --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@> wrote:
            > > >
            > > > Try commenting out first_hessian_on = true in drive_bfgs() in
            > > bfgs.cc in
            > > > the current master.
            > > >
            > > > -John
            > > >
            > > > On 02/21/2012 05:59 PM, regularizer wrote:
            > > > >
            > > > > Thanks, John! In fact I already turned off -ffast-math. Furthermore,
            > > > > it turned out that in my last pasted example it was still on. Turning
            > > > > it! off in that environment produces results similar to the first
            > > two,
            > > > > i.e., failure. So in this case -ffast-math is the only way to produce
            > > > > good results with BFGS. In this case I'm just not able to get as good
            > > > > results with on-line learning.
            > > > >
            > > > > What you say about the quantile loss second derivative definitely
            > > > > makes sense. It would be good not to do have to do the first huge
            > > > > misstep with quantile loss.
            > > > >
            > > > > I guess that's why beginning with on-line learning with quantile loss
            > > > > and continuing with BFGS did not work since the 1st BFGS step always
            > > > > goes someplace far away from where on-line learning got me.
            > > > >
            > > > > - Kari
            > > > >
            > > > > --- In vowpal_wabbit@yahoogroups.com
            > > > > <mailto:vowpal_wabbit%40yahoogroups.com>, John Langford jl@ wrote:
            > > > > >
            > > > > > Different results are! due to -ffast-math, which you could turn off
            > > > > > when c ompiling.
            > > > > >
            > > > > > The quantile loss is particularly brutal because quantile loss
            > > has no
            > > > > > meaningful second derivative. That means that it oversteps by
            > > alot in
            > > > > > the first step, and then steps back for many steps. Maybe try
            > > > > > increasing the regularizer to compenaste. Alternatively, we could
            > > > > > modify LBFGS to not use the second derivative in the step
            > > direction on
            > > > > > the first pass.
            > > > > >
            > > > > > Zero curvature is pretty plausible, and more regularization may
            > > help
            > > > > there.
            > > > > >
            > > > > > -John
            > > > > >
            > > > > > On 2/21/12, regularizer regularizer@ wrote:
            > > > > > > vw --bfgs seems to be sensitive to the gcc version. I have three
            > > > > > > different versions on various machines and all produce different
            > > > > results
            > > > > > > (consistently). Is there a pr! eferred compiler version?
            > > > > > >
            > > > > > > Here's what I'm trying to do:
            > > > > > >
            > > > > > > vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr
            > > --passes 60
            > > > > > > --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
            > > > > > >
            > > > > > > I'll attach the outputs of vw compiled with three different
            > > compiler
            > > > > > > versions (all 64-bit systems). The lines are long, sorry. Notice
            > > > > how the
            > > > > > > results diverge radically at pass 22 while already differing
            > > > > slightly at
            > > > > > > pass 3. The first two produce predictors that predict all
            > > zeros, only
            > > > > > > the last one is good. Based on these three data points should I
            > > > > conclude
            > > > > > > that I need at least gcc 4.5.3 or that I need Cygwin instead
            > > of Linux?
            > > > > > >
            > > > > > > Also, what! does BFGS do after "In wolfe_eval: Zero or negative
            > > > > cur vature
            > > > > > > detected."? It spends a long time doing apparently nothing,
            > > and the
            > > > > > > resulting weights are not good.
            > > > > > >
            > > > > > > Thanks,
            > > > > > > - Kari
            > > > > > >
            > > > > > > ----------------------------------------------------------\
            > > > > > > -------
            > > > > > > This is under Linux with gcc version 4.1.2
            > > > > > >
            > > > > > > enabling BFGS based optimization **without** curvature calculation
            > > > > > > creating quadratic features for pairs: sd aa
            > > > > > > final_regressor = 75.rgr
            > > > > > > using cache_file = vw.cache
            > > > > > > ignoring text input in favor of cache input
            > > > > > > num sources = 1
            > > > > > > Num weight bits = 18
            > > > > > > learning rate = 10
            > > > > > > initial_t = 1
            > > > > > > power_t = 0.5
            > > ! > > > > decay_learning_rate = 1
            > > > > > > using l2 regularization
            > > > > > > m = 25
            > > > > > > Allocated 54M for weights and mem
            > > > > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
            > > > > > > mix fraction curvature dir. magnitude step size time
            > > > > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
            > > > > > > 1.500608e+08 4.614935e+15 9.999999e-01 99.191
            > > > > > > 3 7.503037e+07 1.404921e+01 4.320662e+08 -0.500000
            > > > > > > -1.695048 (revise x 0.4)
            > > > > > > 4.434236e-01 132.949
            > > > > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
            > > > > > > -1.138472 (revise x 0.4)
            > > > > > > 1.900951e-01 166.716
            > > > > > > 5 2.711309e+06 3.852626e+00 1.184828e+08 -0.095048
            > > > > > > -0.885143 (revise x 0.4)
            > > > > > > ! 7.967209e-02 200.672
            > > > > > > 6 4.762672e+05 2.95829 1e+00 9.097862e+07 -0.039836
            > > > > > > -0.774720 (revise x 0.4)
            > > > > > > 3.299097e-02 234.418
            > > > > > > 7 8.166480e+04 2.615997e+00 8.045179e+07 -0.016496
            > > > > > > -0.728039 (revise x 0.4)
            > > > > > > 1.358448e-02 268.178
            > > > > > > 8 1.384746e+04 2.479955e+00 7.626798e+07 -0.006793
            > > > > > > -0.708633 (revise x 0.4)
            > > > > > > 5.579975e-03 302.334
            > > > > > > 9 2.337698e+03 2.424913e+00 7.457523e+07 -0.002791
            > > > > > > -0.700628 (revise x 0.4)
            > > > > > > 2.289690e-03 336.101
            > > > > > > 10 3.949191e+02 2.402469e+00 7.388500e+07 -0.001148
            > > > > > > -0.697338 (revise x 0.4)
            > > > > > > 9.391521e-04 369.972
            > > > > > > 11 6.774264e+01 2.393287e+00 7.360262e+07 -0.000478
            > > > > > > -0.695987 (revise x 0.4)
            > > > >! > > 3.851381e-04 403.756
            > > > > > > 12 1.269738e+01 2.389526e+00 7.348695e+07 -0.000212
            > > > > > > -0.695433 (revise x 0.4)
            > > > > > > 1.579278e-04 437.512
            > > > > > > 13 3.440464e+00 2.387984e+00 7.343953e+07 -0.000127
            > > > > > > -0.695206 (revise x 0.4)
            > > > > > > 6.475448e-05 471.257
            > > > > > > 14 1.884181e+00 2.387352e+00 7.342009e+07 -0.000151
            > > > > > > -0.695113 (revise x 0.4)
            > > > > > > 2.654803e-05 505.011
            > > > > > > 15 1.622633e+00 2.387093e+00 7.341212e+07 -0.000302
            > > > > > > -0.695075 (revise x 0.4)
            > > > > > > 1.088143e-05 538.771
            > > > > > > 16 1.578712e+00 2.386986e+00 7.340885e+07 -0.000710
            > > > > > > -0.695059 (revise x 0.4)
            > > > > > > 4.457373e-06 572.513
            > > > > > > 17 1.571349e+00 2.386943e+00 7! .340751e+07 -0.001723
            > > > > > > -0.695052 (revise x 0 .4)
            > > > > > > 1.823206e-06 606.268
            > > > > > > 18 1.570121e+00 2.386925e+00 7.340696e+07 -0.004207
            > > > > > > -0.695050 (revise x 0.4)
            > > > > > > 7.430751e-07 640.011
            > > > > > > 19 1.569918e+00 2.386918e+00 7.340674e+07 -0.010320
            > > > > > > -0.695049 (revise x 0.4)
            > > > > > > 3.001712e-07 674.087
            > > > > > > 20 1.569885e+00 2.386915e+00 7.340665e+07 -0.025547
            > > > > > > -0.695048 (revise x 0.4)
            > > > > > > 1.185601e-07 707.928
            > > > > > > 21 1.569872e+00 2.386913e+00 7.340661e+07 -0.064679
            > > > > > > -0.695048 (revise x 0.4)
            > > > > > > 4.409118e-08 741.700
            > > > > > > 22 1.043330e+00 2.244743e+00 6.903435e+07 -0.094338
            > > > > > > -0.673731 (revise x 0.3)
            > > > > > > 1.526299e-08 775.480
            > > > > > > 23 4.191600e-01
            > > >! > > > In wolfe_eval: Zero or negative curvature detected.
            > > > > > > To increase curvature you can increase regularization or rescale
            > > > > > > features.
            > > > > > > It is also very likely that you have reached numerical accuracy
            > > > > > > and further decrease in the objective cannot be reliably detected.
            > > > > > > (revise x 0.0) 0.000000e+00 809.273
            > > > > > > 24 4.191601e-01
            > > > > > > (revise x 0.0) 0.000000e+00 1539.289
            > > > > > > Net time spent in communication = 0 seconds
            > > > > > > Net time spent = 1539.3 seconds
            > > > > > > finished run
            > > > > > > number of examples = 184522680
            > > > > > > weighted example sum = 1.845e+08
            > > > > > > weighted label sum = 1.031e+08
            > > > > > > average loss = 0.5425
            > > > > > > best constant = 0.5589
            > > > > > > total featur! e number = 49403301420
            > > > > > >
            > > > > > >*> > > >
            > > ----------------------------------------------------------\
            > > > > > > -------
            > > > > > > This is under Linux with gcc version 4.3.5
            > > > > > >
            > > > > > > enabling BFGS based optimization **without** curvature calculation
            > > > > > > creating quadratic features for pairs: sd aa
            > > > > > > final_regressor = 75.rgr
            > > > > > > using cache_file = vw.cache
            > > > > > > ignoring text input in favor of cache input
            > > > > > > num sources = 1
            > > > > > > Num weight bits = 18
            > > > > > > learning rate = 10
            > > > > > > initial_t = 1
            > > > > > > power_t = 0.5
            > > > > > > decay_learning_rate = 1
            > > > > > > using l2 regularization
            > > > > > > m = 25
            > > > > > > Allocated 54M for weights and mem
            > > > > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
            > > > >! ; > > mix fraction curvature dir. magnitude step size time
            > > > > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
            > > > > > > 1.500608e+08 4.614935e+15 1.000000e+00 47.379
            > > > > > > 3 7.503038e+07 1.404921e+01 4.320662e+08 -0.500000
            > > > > > > -1.695048 (revise x 0.4)
            > > > > > > 4.434237e-01 69.483
            > > > > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
            > > > > > > -1.138472 (revise x 0.4)
            > > > > > > 1.900951e-01 91.311
            > > > > > > 5 2.711310e+06 3.852627e+00 1.184828e+08 -0.095048
            > > > > > > -0.885143 (revise x 0.4)
            > > > > > > 7.967210e-02 113.372
            > > > > > > 6 4.762674e+05 2.958291e+00 9.097863e+07 -0.039836
            > > > > > > -0.774720 (revise x 0.4)
            > > > > > > 3.299098e-02 136.162
            > > > > > > 7 8.166488e+04 2.615997e+00 8.045179e+07 -0.016496
            > > > > > > -0.728039 (revise x 0.4)
            > > > > > > 1.358448e-02 158.192
            > > > > > > 8 1.384749e+04 2.479955e+00 7.626799e+07 -0.006793
            > > > > > > -0.708633 (revise x 0.4)
            > > > > > > 5.579975e-03 179.810
            > > > > > > 9 2.337710e+03 2.424913e+00 7.457524e+07 -0.002791
            > > > > > > -0.700628 (revise x 0.4)
            > > > > > > 2.289690e-03 201.564
            > > > > > > 10 3.949243e+02 2.402469e+00 7.388500e+07 -0.001148
            > > > > > > -0.697338 (revise x 0.4)
            > > > > > > 9.391521e-04 223.651
            > > > > > > 11 6.774474e+01 2.393287e+00 7.360263e+07 -0.000478
            > > > > > > -0.695987 (revise x 0.4)
            > > > > > > 3.851381e-04 245.367
            > > > > > > 12 1.269823e+01 2.389526e+00 7.348695e+07 -0.000212
            > > > > > > -0.695433 (revise x 0.4)
            > > > > > > 1.579278e-04 267.813
            > > > > > > 13 3.440815e+00 2.387984e+00 7.343954e+07 -0.000128
            > > > > > > -0.695206 (revise x 0.4)
            > > > > > > 6.475448e-05 289.566
            > > > > > > 14 1.884324e+00 2.387352e+00 7.342010e+07 -0.000151
            > > > > > > -0.695113 (revise x 0.4)
            > > > > > > 2.654803e-05 311.194
            > > > > > > 15 1.622692e+00 2.387093e+00 7.341213e+07 -0.000302
            > > > > > > -0.695075 (revise x 0.4)
            > > > > > > 1.088143e-05 332.782
            > > > > > > 16 1.578736e+00 2.386986e+00 7.340886e+07 -0.000710
            > > > > > > -0.695059 (revise x 0.4)
            > > > > > > 4.457373e-06 354.700
            > > > > > > 17 1.571359e+00 2.386943e+00 7.340752e+07 -0.001723
            > > > > > > -0.695053 (revise x 0.4)
            > > > > > > 1.823206e-06 376.323
            > > > > > > 18 1.570125e+00 2.386925e+00 7.340697e+07 -0.004207
            > > > > > > -0.695050 (revise x 0.4)
            > > > > > > 7.430750e-07 398.265
            > > > > > > 19 1.569920e+00 2.386918e+00 7.340674e+07 -0.010320
            > > > > > > -0.695049 (revise x 0.4)
            > > > > > > 3.001712e-07 419.779
            > > > > > > 20 1.569886e+00 2.386915e+00 7.340665e+07 -0.025547
            > > > > > > -0.695048 (revise x 0.4)
            > > > > > > 1.185601e-07 441.545
            > > > > > > 21 1.569880e+00 2.386913e+00 7.340661e+07 -0.064679
            > > > > > > -0.695048 (revise x 0.4)
            > > > > > > 4.409114e-08 463.328
            > > > > > > 22 1.551396e+00 2.386894e+00 7.340600e+07 -0.171127
            > > > > > > -0.695045 (revise x 0.3)
            > > > > > > 1.362805e-08 484.935
            > > > > > > 23 4.191623e-01
            > > > > > > In wolfe_eval: Zero or negative curvature detected.
            > > > > > > To increase curvature you can increase regularization or rescale
            > > > > > > features.
            > > > > > > It is also very likely that you have reached numerical accuracy
            > > > > > > and further decrease in the objective cannot be reliably detected.
            > > > > > > (revise x 0.0) 0.000000e+00 506.788
            > > > > > > 24 4.191600e-01
            > > > > > > (revise x 0.0) 0.000000e+00 987.156
            > > > > > > Net time spent in communication = 0 seconds
            > > > > > > Net time spent = 987.16 seconds
            > > > > > > finished run
            > > > > > > number of examples = 184522680
            > > > > > > weighted example sum = 1.845e+08
            > > > > > > weighted label sum = 1.031e+08
            > > > > > > average loss = 0.5509
            > > > > > > best constant = 0.5589
            > > > > > > total feature number = 49403301420
            > > > > > >
            > > > > > > ----------------------------------------------------------\
            > > > > > > -------
            > > > > > > This is under Cygwin with gcc version 4.5.3
            > > > > > >
            > > > > > > enabling BFGS based optimization **without** curvature calculation
            > > > > > > creating quadratic features for pairs: sd aa
            > > > > > > final_regressor = 75.rgr
            > > > > > > using cache_file = vw.cache
            > > > > > > ignoring text input in favor of cache input
            > > > > > > num sources = 1
            > > > > > > Num weight bits = 18
            > > > > > > learning rate = 10
            > > > > > > initial_t = 1
            > > > > > > power_t = 0.5
            > > > > > > decay_learning_rate = 1
            > > > > > > using l2 regularization
            > > > > > > m = 25
            > > > > > > Allocated 54M for weights and mem
            > > > > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
            > > > > > > mix fraction curvature dir. magnitude step size time
            > > > > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
            > > > > > > 1.500608e+08 4.614935e+15 1.000000e+00 73.531
            > > > > > > 3 7.503038e+07 1.404921e+01 4.320662e+08 -0.500000
            > > > > > > -1.695048 (revise x 0.4)
            > > > > > > 4.434237e-01 109.199
            > > > > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
            > > > > > > -1.138472 (revise x 0.4)
            > > > > > > 1.900951e-01 144.549
            > > > > > > 5 2.711310e+06 3.852627e+00 1.184828e+08 -0.095048
            > > > > > > -0.885143 (revise x 0.4)
            > > > > > > 7.967210e-02 180.274
            > > > > > > 6 4.762675e+05 2.958291e+00 9.097863e+07 -0.039836
            > > > > > > -0.774720 (revise x 0.4)
            > > > > > > 3.299098e-02 215.368
            > > > > > > 7 8.166493e+04 2.615997e+00 8.045180e+07 -0.016496
            > > > > > > -0.728039 (revise x 0.4)
            > > > > > > 1.358448e-02 251.118
            > > > > > > 8 1.384751e+04 2.479955e+00 7.626799e+07 -0.006793
            > > > > > > -0.708633 (revise x 0.4)
            > > > > > > 5.579975e-03 286.362
            > > > > > > 9 2.337717e+03 2.424913e+00 7.457524e+07 -0.002791
            > > > > > > -0.700628 (revise x 0.4)
            > > > > > > 2.289690e-03 321.647
            > > > > > > 10 3.949270e+02 2.402469e+00 7.388500e+07 -0.001148
            > > > > > > -0.697338 (revise x 0.4)
            > > > > > > 9.391521e-04 359.588
            > > > > > > 11 6.774589e+01 2.393287e+00 7.360263e+07 -0.000478
            > > > > > > -0.695987 (revise x 0.4)
            > > > > > > 3.851380e-04 398.898
            > > > > > > 12 1.269871e+01 2.389526e+00 7.348695e+07 -0.000212
            > > > > > > -0.695433 (revise x 0.4)
            > > > > > > 1.579277e-04 436.665
            > > > > > > 13 3.441010e+00 2.387984e+00 7.343954e+07 -0.000128
            > > > > > > -0.695206 (revise x 0.4)
            > > > > > > 6.475448e-05 472.627
            > > > > > > 14 1.884404e+00 2.387352e+00 7.342010e+07 -0.000151
            > > > > > > -0.695113 (revise x 0.4)
            > > > > > > 2.654803e-05 507.751
            > > > > > > 15 1.622725e+00 2.387093e+00 7.341213e+07 -0.000302
            > > > > > > -0.695075 (revise x 0.4)
            > > > > > > 1.088143e-05 543.170
            > > > > > > 16 1.578749e+00 2.386987e+00 7.340886e+07 -0.000710
            > > > > > > -0.695059 (revise x 0.4)
            > > > > > > 4.457372e-06 578.753
            > > > > > > 17 1.571365e+00 2.386943e+00 7.340752e+07 -0.001723
            > > > > > > -0.695053 (revise x 0.4)
            > > > > > > 1.823206e-06 613.860
            > > > > > > 18 1.570127e+00 2.386925e+00 7.340697e+07 -0.004207
            > > > > > > -0.695050 (revise x 0.4)
            > > > > > > 7.430749e-07 650.090
            > > > > > > 19 1.569920e+00 2.386918e+00 7.340674e+07 -0.010320
            > > > > > > -0.695049 (revise x 0.4)
            > > > > > > 3.001711e-07 686.592
            > > > > > > 20 1.569886e+00 2.386915e+00 7.340665e+07 -0.025547
            > > > > > > -0.695048 (revise x 0.4)
            > > > > > > 1.185601e-07 729.522
            > > > > > > 21 1.569881e+00 2.386914e+00 7.340661e+07 -0.064679
            > > > > > > -0.695048 (revise x 0.4)
            > > > > > > 4.409113e-08 765.928
            > > > > > > 22 1.567969e+00 2.386913e+00 7.340660e+07 -0.173632
            > > > > > > -0.695048 (revise x 0.3)
            > > > > > > 1.356294e-08 806.912
            > > > > > > 23 3.738755e-01 1.373830e+00 4.225047e+07 0.022250
            > > > > > > -0.520586 1.129881e-01
            > > > > > > 1.000000e+00 845.604
            > > > > > > 24 2.744955e-01 2.885634e-01 8.874415e+06 0.772202 0.455316
            > > > > > > 1.031045e-01 1.000000e+00 880.431
            > > > > > > 25 2.707598e-01 8.013777e-01 2.464539e+07 0.060259
            > > > > > > -1.159684 2.684013e-02
            > > > > > > 1.000000e+00 916.249
            > > > > > > 26 2.521993e-01 3.737808e-02 1.149517e+06 0.437760
            > > > > > > -0.119104 9.070075e-04
            > > > > > > 1.000000e+00 951.681
            > > > > > > 27 2.487337e-01 2.980484e-02 9.166114e+05 0.890319 0.774747
            > > > > > > 1.577486e-02 1.000000e+00 988.121
            > > > > > > 28 2.388466e-01 6.851274e-02 2.107026e+06 0.590096 0.194143
            > > > > > > 4.510566e-03 1.000000e+00 1024.557
            > > > > > > 29 2.331922e-01 3.384458e-02 1.040849e+06 0.785676 0.551578
            > > > > > > 1.230063e-02 1.000000e+00 1060.234
            > > > > > > 30 2.257282e-01 1.035047e-02 3.183161e+05 0.726034 0.453913
            > > > > > > 1.402782e-02 1.000000e+00 1096.908
            > > > > > > 31 2.205253e-01 1.324547e-02 4.073484e+05 0.730802 0.456633
            > > > > > > 2.145415e-02 1.000000e+00 1134.393
            > > > > > > 32 2.157588e-01 9.859573e-03 3.032191e+05 0.759526 0.475091
            > > > > > > 5.921376e-02 1.000000e+00 1169.727
            > > > > > > 33 2.088887e-01 1.235760e-02 3.800429e+05 0.730657 0.419121
            > > > > > > 1.841315e-01 1.000000e+00 1205.274
            > > > > > > 34 2.101704e-01 9.911478e-02 3.048154e+06 -0.092354
            > > > > > > -1.324207 (revise x 0.5)
            > > > > > > 5.300100e-01 1240.358
            > > > > > > 35 2.056780e-01 1.969835e-02 6.057987e+05 0.436495
            > > > > > > -0.248880 9.528314e-03
            > > > > > > 1.000000e+00 1276.848
            > > > > > > 36 2.022406e-01 3.700955e-03 1.138184e+05 0.668004 0.316562
            > > > > > > 5.401969e-03 1.000000e+00 1312.630
            > > > > > > 37 2.004172e-01 2.586845e-03 7.955525e+04 0.755313 0.517794
            > > > > > > 2.109435e-02 1.000000e+00 1348.003
            > > > > > > 38 1.974068e-01 2.885261e-03 8.873268e+04 0.724907 0.456644
            > > > > > > 4.156245e-02 1.000000e+00 1383.463
            > > > > > > 39 1.940215e-01 2.243242e-03 6.898818e+04 0.704875 0.418325
            > > > > > > 6.534269e-02 1.000000e+00 1418.902
            > > > > > > 40 1.910583e-01 5.228435e-03 1.607941e+05 0.592461 0.078042
            > > > > > > 2.155944e-02 1.000000e+00 1454.386
            > > > > > > 41 1.891780e-01 1.062300e-03 3.266975e+04 0.603013 0.277422
            > > > > > > 6.547502e-03 1.000000e+00 1489.848
            > > > > > > 42 1.881406e-01 1.233737e-03 3.794209e+04 0.698576 0.496102
            > > > > > > 2.127681e-02 1.000000e+00 1525.388
            > > > > > > 43 1.864403e-01 1.268378e-03 3.900742e+04 0.714181 0.491254
            > > > > > > 8.674439e-02 1.000000e+00 1560.996
            > > > > > > 44 1.844092e-01 3.018823e-03 9.284021e+04 0.512288
            > > > > > > -0.040180 4.881849e-02
            > > > > > > 1.000000e+00 1596.506
            > > > > > > 45 1.827905e-01 1.491786e-03 4.587806e+04 0.475903
            > > > > > > -0.057879 3.470570e-03
            > > > > > > 1.000000e+00 1632.151
            > > > > > > 46 1.817528e-01 5.705357e-04 1.754613e+04 0.750989 0.470422
            > > > > > > 1.187489e-02 1.000000e+00 1668.191
            > > > > > > 47 1.806794e-01 6.571171e-04 2.020884e+04 0.702106 0.352037
            > > > > > > 2.917945e-02 1.000000e+00 1703.878
            > > > > > > 48 1.794564e-01 9.732167e-04 2.993009e+04 0.767566 0.477306
            > > > > > > 1.323640e-01 1.000000e+00 1739.512
            > > > > > > 49 1.780222e-01 1.737843e-03 5.344525e+04 0.451400
            > > > > > > -0.129532 1.930031e-02
            > > > > > > 1.000000e+00 1775.168
            > > > > > > 50 1.768670e-01 5.968434e-04 1.835519e+04 0.610033 0.239879
            > > > > > > 6.557360e-03 1.000000e+00 1810.845
            > > > > > > 51 1.761865e-01 5.629359e-04 1.731241e+04 0.721832 0.542742
            > > > > > > 3.182580e-02 1.000000e+00 1846.487
            > > > > > > 52 1.751273e-01 5.249340e-04 1.614370e+04 0.667468 0.420878
            > > > > > > 8.439630e-02 1.000000e+00 1883.680
            > > > > > > 53 1.742363e-01 1.555000e-02 4.782213e+05 0.441844
            > > > > > > -0.451383 3.352547e-03
            > > > > > > 1.000000e+00 1919.294
            > > > > > > 54 1.737074e-01 1.324455e-03 4.073200e+04 0.408970 0.251978
            > > > > > > 1.574223e-03 1.000000e+00 1955.011
            > > > > > > 55 1.734941e-01 4.903513e-04 1.508016e+04 0.549206 0.444421
            > > > > > > 4.880569e-03 1.000000e+00 1990.919
            > > > > > > 56 1.732157e-01 6.418537e-04 1.973943e+04 0.761774 0.487686
            > > > > > > 1.338996e-02 1.000000e+00 2026.526
            > > > > > > 57 1.726689e-01 3.739879e-04 1.150154e+04 0.793421 0.412380
            > > > > > > 1.254878e-01 1.000000e+00 2062.208
            > > > > > > 58 1.723390e-01 2.002074e-03 6.157133e+04 0.309866
            > > > > > > -0.856343 1.059850e-02
            > > > > > > 1.000000e+00 2098.129
            > > > > > > 59 1.717457e-01 5.019926e-04 1.543817e+04 0.356830
            > > > > > > -0.113790 5.999178e-03
            > > > > > > 1.000000e+00 2133.891
            > > > > > > 60 1.714882e-01 2.208578e-04 6.792212e+03 0.613648 0.443083
            > > > > > > 4.132071e-03 1.000000e+00 2169.619
            > > > > > > Net time spent in communication = 0 seconds
            > > > > > > Net time spent = 2169.6 seconds
            > > > > > > finished run
            > > > > > > number of examples = 184522680
            > > > > > > weighted example sum = 1.845e+08
            > > > > > > weighted label sum = 1.031e+08
            > > > > > > average loss = 0.666
            > > > > > > best constant = 0.5589
            > > > > > > total feature number = 49403301420
            > > > > > >
            > > > > > >
            > > > > > >
            > > > > > >
            > > > > > >
            > > > > > >
            > > > > > >
            > > > > >
            > > > >
            > > > >
            > > >
            >
          • John Langford
            Message 5 of 9 , Feb 22, 2012
              On 2/22/12, regularizer <regularizer@...> wrote:
              > John, if you have a mechanism for me to drop a 120M compressed file, it is
              > available.

              We'd need the source file.

              > If "multipass online learning" means running online learning, saving the
              > weights; loading the weights, running further online training saving the
              > weights; and so on, then no, each time the initial training seems to destroy
              > whatever the previous training achieved (with default parameters).

              I just mean using --passes <n> when doing online learning.
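
              (For concreteness, a multipass online-learning run of the same problem would look roughly like the sketch below: it simply drops --bfgs/--mem from the command used earlier in the thread and keeps --passes with the cache. The -l value shown is only an illustrative placeholder, not a recommendation.)

              $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes 60 \
                    --cache_file vw.cache -q sd -q aa -l 0.5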

              Miro is correct that LBFGS was not designed to handle this, but the
              fact that it is working suggests that it can be made to work well with
              a little bit of thought.
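
              (The warm-start workflow in question, sketched with hypothetical model file names online.model and warmstart.model: an online pass that saves a regressor with -f, followed by a BFGS run seeded from it with -i. This only illustrates the flags involved; it is not presented as a recipe that is known to behave well yet.)

              $ vw --loss_function quantile --quantile_tau 0.75 --cache_file vw.cache \
                    -q sd -q aa -f online.model
              $ vw --loss_function quantile --quantile_tau 0.75 --cache_file vw.cache \
                    -q sd -q aa -i online.model --bfgs --mem 25 --l2 0.1 --passes 60 \
                    -f warmstart.model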

              -John

              > - Kari
              >
              >
              > --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@...> wrote:
              >>
              >> That's not a happy optimization, but at least it's converging after
              >> wasting half of the time flailing. More regularization would help with
              >> the initial step, but that's not enough, and some serious thought is
              >> required here. Is the dataset available? Does multipass online learning
              >> do something sensible?
              >>
              >> -John
              >>
              >> On 02/21/2012 09:17 PM, regularizer wrote:
              >> >
              >> > Thanks John, that helps in that I don't get stuck at "Zero or negative
              >> > curvature detected" anymore and the initial step is slightly smaller.
              >> > Still, the first printed direction magnitude seems huge to me. The
              >> > loss at second pass becomes large and it takes time to creep back.
              >> > Output attached.
              >> >
              >> > If I do first on-line learning and then switch to bfgs, the same
              >> > happens - the first direction magnitude is too large and all initial
              >> > learning is undone. Is there something I can do to continue
              >> > optimization (with quantile loss) starting from a previous weight
              >> > vector without actually throwing it away?
              >> >
              >> > Thanks,
              >> >
              >> > - Kari
              >> >
              >> >
              >> > $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes
              >> > 60 --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
              >> > enabling BFGS based optimization **without** curvature calculation
              >> > creating quadratic features for pairs: sd aa
              >> > final_regressor = 75.rgr
              >> > creating cache_file = vw.cache
              >> > Reading from stdin
              >> > num sources = 1
              >> > Num weight bits = 18
              >> > learning rate = 10
              >> > initial_t = 1
              >> > power_t = 0.5
              >> > decay_learning_rate = 1
              >> > using l2 regularization
              >> > m = 25
              >> > Allocated 54M for weights and mem
              >> > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2 mix fraction
              >> > curvature dir. magnitude step size time
              >> > 1 4.191599e-01 4.879425e+00 1.500608e+08 4.614935e+15
              >> > 5.000000e-01 52.935
              >> > 2 1.875760e+07 6.998204e+00 2.152212e+08 -0.250000 -1.195048 (revise x
              >> > 0.5) 2.500000e-01 75.075
              >> > 3 4.689400e+06 4.387594e+00 1.349351e+08 -0.125000 -0.945048 (revise
              >> > x 0.5) 1.250000e-01 98.801
              >> > 4 1.172351e+06 3.311012e+00 1.018261e+08 -0.062500 -0.820048 (revise x
              >> > 0.5) 6.250000e-02 121.187
              >> > 5 2.930890e+05 2.829902e+00 8.703020e+07 -0.031250 -0.757548 (revise x
              >> > 0.5) 3.125000e-02 144.725
              >> > 6 7.327342e+04 2.603643e+00 8.007185e+07 -0.015625 -0.726298
              >> > (revise x 0.5) 1.562500e-02 167.149
              >> > 7 1.831953e+04 2.494086e+00 7.670258e+07 -0.007813 -0.710673 (revise x
              >> > 0.5) 7.812500e-03 190.380
              >> > 8 4.581061e+03 2.440202e+00 7.504543e+07 -0.003907 -0.702861 (revise
              >> > x 0.5) 3.906250e-03 212.310
              >> > 9 1.146443e+03 2.413483e+00 7.422372e+07 -0.001955 -0.698954 (revise x
              >> > 0.5) 1.953125e-03 239.450
              >> > 10 2.877881e+02 2.400179e+00 7.381458e+07 -0.000980 -0.697001
              >> > (revise x 0.5) 9.765625e-04 270.791
              >> > 11 7.312442e+01 2.393541e+00 7.361044e+07 -0.000496 -0.696025 (revise
              >> > x 0.5) 4.882812e-04 305.179
              >> > 12 1.945852e+01 2.390226e+00 7.350848e+07 -0.000260 -0.695536 (revise
              >> > x 0.5) 2.441406e-04 339.859
              >> > 13 6.042039e+00 2.388569e+00 7.345753e+07 -0.000153 -0.695292
              >> > (revise x 0.5) 1.220703e-04 374.339
              >> > 14 2.687920e+00 2.387741e+00 7.343206e+07 -0.000124 -0.695170 (revise
              >> > x 0.5) 6.103516e-05 408.740
              >> > 15 1.849390e+00 2.387327e+00 7.341932e+07 -0.000156 -0.695109
              >> > (revise x 0.5) 3.051758e-05 443.824
              >> > 16 1.639757e+00 2.387120e+00 7.341296e+07 -0.000267 -0.695079 (revise
              >> > x 0.5) 1.525879e-05 478.239
              >> > 17 1.587349e+00 2.387016e+00 7.340977e+07 -0.000510 -0.695063
              >> > (revise x 0.5) 7.629395e-06 512.576
              >> > 18 1.574247e+00 2.386965e+00 7.340818e+07 -0.001009 -0.695056 (revise
              >> > x 0.5) 3.814697e-06 547.266
              >> > 19 1.570972e+00 2.386939e+00 7.340739e+07 -0.002012 -0.695052 (revise
              >> > x 0.5) 1.907349e-06 581.753
              >> > 20 1.570153e+00 2.386926e+00 7.340699e+07 -0.004021
              >> > -0.695050 (revise x 0.5) 9.536743e-07 616.106
              >> > 21 1.569948e+00 2.386919e+00 7.340679e+07 -0.008041 -0.695049 (revise
              >> > x 0.5) 4.768372e-07 650.558
              >> > 22 1.569897e+00 2.386916e+00 7.340669e+07 -0.016082 -0.695049
              >> > (revise x 0.5) 2.384186e-07 684.996
              >> > 23 1.569884e+00 2.386914e+00 7.340664e+07 -0.032164 -0.695048 (revise
              >> > x 0.5) 1.192093e-07 719.363
              >> > 24 1.569881e+00 2.386914e+00 7.340662e+07 -0.064327 -0.695048
              >> > (revise x 0.5) 5.960464e-08 753.747
              >> > 25 1.569811e+00 2.386913e+00 7.340660e+07 -0.128646 -0.695048 (revise
              >> > x 0.5) 2.980232e-08 788.155
              >> > 26 1.560698e+00 2.386911e+00 7.340653e+07 -0.255254 -0.695048 (revise
              >> > x 0.5) 1.490116e-08 822.468
              >> > 27 1.335834e+00 2.381222e+00 7.323158e+07 -0.409947 -0.694208
              >> > (revise x 0.5) 7.450581e-09 856.963
              >> > 28 6.457887e-01 2.206131e+00 6.784688e+07 -0.202702 -0.667622 (revise
              >> > x 0.5) 3.725290e-09 892.038
              >> > 29 3.134720e-01 9.375462e-01 2.883309e+07 0.189059 -0.424878
              >> > 5.721519e-03 1.000000e+00 928.184
              >> > 30 2.615405e-01 1.543026e-01 4.745389e+06 0.710839 0.356731
              >> > 3.218869e-03 1.000000e+00 964.430
              >> > 31 2.527036e-01 1.169463e-01 3.596541e+06 0.397394 -0.292694
              >> > 3.594861e-04 1.000000e+00 1000.642
              >> > 32 2.478497e-01 3.924282e-02 1.206865e+06 0.797068 0.607724
              >> > 5.707927e-03 1.000000e+00 1036.349
              >> > 33 2.382581e-01 2.043136e-02 6.283417e+05 0.687158 0.373366
              >> > 5.886233e-03 1.000000e+00 1072.106
              >> > 34 2.325280e-01 1.878525e-02
              >> > 5.777174e+05 0.723725 0.451241 2.183486e-02 1.000000e+00 1107.914
              >> > 35 2.234863e-01 2.136923e-02 6.571848e+05 0.683187 0.380104
              >> > 3.404222e-02 1.000000e+00 1143.795
              >> > 36 2.161262e-01 7.854667e-03 2.415607e+05 0.677020 0.306281
              >> > 2.641602e-02 1.000000e+00 1180.005
              >> > 37 2.111394e-01 6.731176e-03 2.070091e+05 0.734114 0.365996
              >> > 3.066953e-02 1.000000e+00 1215.994
              >> > 38 2.064287e-01 7.055607e-03 2.169866e+05 0.791928 0.477421
              >> > 9.820154e-02 1.000000e+00 1252.024
              >> > 39 2.005127e-01 1.397243e-02 4.297050e+05 0.532899 0.010674
              >> > 2.559544e-02 1.000000e+00 1288.122
              >> > 40 1.968502e-01 1.953271e-03 6.007048e+04 0.581431 0.154733
              >> > 3.655928e-03 1.000000e+00 1324.317
              >> > 41 1.956542e-01 2.292869e-03 7.051438e+04 0.661248 0.613157
              >> > 2.971351e-02 1.000000e+00 1360.552
              >> > 42 1.928280e-01 2.290130e-03 7.043016e+04 0.598646 0.344799
              >> > 5.504233e-02 1.000000e+00 1396.824
              >> > 43 1.902657e-01 3.252593e-03 1.000295e+05 0.562222 0.065822
              >> > 2.691962e-02 1.000000e+00 1433.380
              >> > 44 1.881628e-01 9.226223e-04 2.837412e+04 0.705538 0.362939
              >> > 1.867108e-02 1.000000e+00 1469.847
              >> > 45 1.866176e-01 8.557320e-04 2.631699e+04 0.759632 0.393094
              >> > 2.948932e-02 1.000000e+00 1506.383
              >> > 46 1.851779e-01 1.614627e-03 4.965587e+04 0.641649 0.222579
              >> > 5.023984e-02 1.000000e+00 1542.968
              >> > 47 1.836210e-01 1.426560e-03 4.387212e+04 0.550367 0.067523
              >> > 9.671028e-03 1.000000e+00 1579.712
              >> > 48 1.824689e-01 6.715775e-04 2.065355e+04 0.740073 0.578498
              >> > 3.563188e-02 1.000000e+00 1616.411
              >> > 49 1.810121e-01 8.212360e-04 2.525611e+04 0.560126 0.208578
              >> > 2.365269e-02 1.000000e+00 1653.145
              >> > 50 1.799282e-01 1.155586e-03 3.553864e+04 0.709983 0.329706
              >> > 4.263906e-02 1.000000e+00 1689.971
              >> > 51 1.785792e-01 6.220927e-04 1.913170e+04 0.761242 0.439758
              >> > 7.058747e-02 1.000000e+00 1726.856
              >> > 52 1.770287e-01 7.729798e-04 2.377205e+04 0.704055 0.307157
              >> > 1.056268e-01 1.000000e+00 1764.094
              >> > 53 1.759657e-01 1.297268e-03 3.989590e+04 0.405447 -0.213848
              >> > 3.006896e-03 1.000000e+00 1800.991
              >> > 54 1.751416e-01 4.022849e-04 1.237178e+04 0.673106 0.447034
              >> > 7.249319e-03 1.000000e+00 1838.000
              >> > 55 1.745050e-01 3.247104e-04 9.986073e+03 0.606828 0.369200
              >> > 1.471770e-02 1.000000e+00 1874.988
              >> > 56 1.738613e-01 3.934394e-04 1.209975e+04 0.728943 0.567636
              >> > 1.063217e-01 1.000000e+00 1911.937
              >> > 57 1.727164e-01 1.591558e-03 4.894642e+04 0.547620 -0.063455
              >> > 5.282468e-02 1.000000e+00 1948.869
              >> > 58 1.726473e-01 1.510033e-02 4.643923e+05 0.043418 -0.577282
              >> > Termination condition reached in pass 58: decrease in loss less than
              >> > 0.100%.
              >> > If you want to optimize further, decrease termination threshold.
              >> > 1.219617e-03 1.000000e+00 1985.922
              >> > 59 1.718465e-01 5.661531e-04 1.741135e+04 0.782218 0.216402
              >> > 5.475620e-04 1.000000e+00 2045.012
              >> > finished run
              >> > number of examples = 184522680
              >> > weighted example sum = 1.845e+08
              >> > weighted label sum = 1.031e+08
              >> > average loss = 0.8003
              >> > best constant = 0.5589
              >> > total feature number = 49403301420
              >> >
              >> >
              >> >
            • regularizer
              Message 6 of 9 , Feb 22, 2012
                --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@...> wrote:
                >
                > On 2/22/12, regularizer <regularizer@...> wrote:
                > > John, if you have a mechanism for me to drop a 120M compressed file, it is
                > > available.
                >
                > We'd need the source file.

                That's what I mean. I'm just saying that it is 120M compressed.

                >
                > > If "multipass online learning" means running online learning, saving the
                > > weights; loading the weights, running further online training saving the
                > > weights; and so on, then no, each time the initial training seems to destroy
                > > whatever the previous training achieved (with default parameters).
                >
                > I just mean using --passes <n> when doing online learning.

                OK.
                That does not get me anywhere near the good solution I get with BFGS using approximately the same number of passes through the data (~60 in this case). I probably haven't experimented with all possible options here, though.
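
                For concreteness, this is the kind of invocation I mean -- just the earlier BFGS command with --bfgs, --mem 25 and --l2 dropped so it runs as plain multipass online learning; the output name 75-online.rgr is only a placeholder:

                $ vw --loss_function quantile --quantile_tau 0.75 -f 75-online.rgr --passes 60 --cache_file vw.cache -q sd -q aa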


                > Miro is correct that LBFGS was not designed to handle this, but the
                > fact that it is working suggests that it can be made to work well with
                > a little bit of thought.

                Right. It is true that with more regularization the first misstep becomes a bit smaller, but I'd still say that an optimization method whose very first step increases the loss (and basically ignores its initial state) is partly broken. However, I'm happy that it does eventually converge!
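
                For reference, here is the standard pinball form I assume vw's quantile loss to be (tau is what --quantile_tau sets); it makes the curvature problem concrete:

                \ell_\tau(y,\hat{y}) = \tau\,(y - \hat{y}) \text{ if } \hat{y} \le y, \quad (1-\tau)\,(\hat{y} - y) \text{ if } \hat{y} > y
                \partial \ell_\tau / \partial \hat{y} = -\tau \text{ for } \hat{y} < y, \quad (1-\tau) \text{ for } \hat{y} > y
                \partial^2 \ell_\tau / \partial \hat{y}^2 = 0 \text{ almost everywhere (undefined only at } \hat{y} = y\text{)}

                So the only positive curvature the optimizer ever sees comes from the --l2 term, which is presumably why increasing it shrinks, but does not eliminate, that first step.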

                BTW, I don't understand how "--initial_pass_length <arg>" is supposed to work. Given a small number, such as 5% of the data, as the <arg>, it actually converges very fast, much faster than the whole-batch approach. But once it has gone through its passes with this initial_pass_length, it starts spewing output lines that seem to correspond to each individual data sample, and that of course takes forever.
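
                (For reference, the sort of invocation meant here is the earlier command plus --initial_pass_length; the value below is just an illustrative ~5% of the 184522680 examples, not an exact figure:)

                $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes 60 --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25 --initial_pass_length 9000000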

                - Kari


                > -John
                >
                > > - Kari
                > >
                > >
                > > --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@> wrote:
                > >>
                > >> That's not a happy optimization, but at least it's converging after
                > >> wasting half of the time flailing. More regularization would help with
                > >> the initial step, but that's not enough, and some serious thought is
                > >> required here. Is the dataset available? Does multipass online learning
                > >> do something sensible?
                > >>
                > >> -John
                > >>
                > >> On 02/21/2012 09:17 PM, regularizer wrote:
                > >> >
                > >> > Thanks John, that helps in that I don't get stuck at "Zero or negative
                > >> > curvature detected" anymore and the initial step is slightly smaller.
                > >> > Still, the first printed direction magnitude seems huge to me. The
                > >> > loss at second pass becomes large and it takes time to creep back.
                > >> > Output attached.
                > >> >
                > >> > If I do first on-line learning and then switch to bfgs, the same
                > >> > happens - the first direction magnitude is too large and all initial
                > >> > learning is undone. Is there something I can do to continue
                > >> > optimization (with quantile loss) starting from a previous weight
                > >> > vector without actually throwing it away?
                > >> >
                > >> > Thanks,
                > >> >
                > >> > - Kari
                > >> >
                > >> >
                > >> > $ vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr --passes
                > >> > 60 --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
                > >> > enabling BFGS based optimization **without** curvature calculation
                > >> > creating quadratic features for pairs: sd aa
                > >> > final_regressor = 75.rgr
                > >> > creating cache_file = vw.cache
                > >> > Reading from stdin
                > >> > num sources = 1
                > >> > Num weight bits = 18
                > >> > learning rate = 10
                > >> > initial_t = 1
                > >> > power_t = 0.5
                > >> > decay_learning_rate = 1
                > >> > using l2 regularization
                > >> > m = 25
                > >> > Allocated 54M for weights and mem
                > >> > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2 mix fraction
                > >> > curvature dir. magnitude step size time
                > >> > 1 4.191599e-01 4.879425e+00 1.500608e+08 4.614935e+15
                > >> > 5.000000e-01 52.935
                > >> > 2 1.875760e+07 6.998204e+00 2.152212e+08 -0.250000 -1.195048 (revise x
                > >> > 0.5) 2.500000e-01 75.075
                > >> > 3 4.689400e+06 4.387594e+00 1.349351e+08 -0.125000 -0.945048 (revise
                > >> > x 0.5) 1.250000e-01 98.801
                > >> > 4 1.172351e+06 3.311012e+00 1.018261e+08 -0.062500 -0.820048 (revise x
                > >> > 0.5) 6.250000e-02 121.187
                > >> > 5 2.930890e+05 2.829902e+00 8.703020e+07 -0.031250 -0.757548 (revise x
                > >> > 0.5) 3.125000e-02 144.725
                > >> > 6 7.327342e+04 2.603643e+00 8.007185e+07 -0.015625 -0.726298
                > >> > (revise x 0.5) 1.562500e-02 167.149
                > >> > 7 1.831953e+04 2.494086e+00 7.670258e+07 -0.007813 -0.710673 (revise x
                > >> > 0.5) 7.812500e-03 190.380
                > >> > 8 4.581061e+03 2.440202e+00 7.504543e+07 -0.003907 -0.702861 (revise
                > >> > x 0.5) 3.906250e-03 212.310
                > >> > 9 1.146443e+03 2.413483e+00 7.422372e+07 -0.001955 -0.698954 (revise x
                > >> > 0.5) 1.953125e-03 239.450
                > >> > 10 2.877881e+02 2.400179e+00 7.381458e+07 -0.000980 -0.697001
                > >> > (revise x 0.5) 9.765625e-04 270.791
                > >> > 11 7.312442e+01 2.393541e+00 7.361044e+07 -0.000496 -0.696025 (revise
                > >> > x 0.5) 4.882812e-04 305.179
                > >> > 12 1.945852e+01 2.390226e+00 7.350848e+07 -0.000260 -0.695536 (revise
                > >> > x 0.5) 2.441406e-04 339.859
                > >> > 13 6.042039e+00 2.388569e+00 7.345753e+07 -0.000153 -0.695292
                > >> > (revise x 0.5) 1.220703e-04 374.339
                > >> > 14 2.687920e+00 2.387741e+00 7.343206e+07 -0.000124 -0.695170 (revise
                > >> > x 0.5) 6.103516e-05 408.740
                > >> > 15 1.849390e+00 2.387327e+00 7.341932e+07 -0.000156 -0.695109
                > >> > (revise x 0.5) 3.051758e-05 443.824
                > >> > 16 1.639757e+00 2.387120e+00 7.341296e+07 -0.000267 -0.695079 (revise
                > >> > x 0.5) 1.525879e-05 478.239
                > >> > 17 1.587349e+00 2.387016e+00 7.340977e+07 -0.000510 -0.695063
                > >> > (revise x 0.5) 7.629395e-06 512.576
                > >> > 18 1.574247e+00 2.386965e+00 7.340818e+07 -0.001009 -0.695056 (revise
                > >> > x 0.5) 3.814697e-06 547.266
                > >> > 19 1.570972e+00 2.386939e+00 7.340739e+07 -0.002012 -0.695052 (revise
                > >> > x 0.5) 1.907349e-06 581.753
                > >> > 20 1.570153e+00 2.386926e+00 7.340699e+07 -0.004021
                > >> > -0.695050 (revise x 0.5) 9.536743e-07 616.106
                > >> > 21 1.569948e+00 2.386919e+00 7.340679e+07 -0.008041 -0.695049 (revise
                > >> > x 0.5) 4.768372e-07 650.558
                > >> > 22 1.569897e+00 2.386916e+00 7.340669e+07 -0.016082 -0.695049
                > >> > (revise x 0.5) 2.384186e-07 684.996
                > >> > 23 1.569884e+00 2.386914e+00 7.340664e+07 -0.032164 -0.695048 (revise
                > >> > x 0.5) 1.192093e-07 719.363
                > >> > 24 1.569881e+00 2.386914e+00 7.340662e+07 -0.064327 -0.695048
                > >> > (revise x 0.5) 5.960464e-08 753.747
                > >> > 25 1.569811e+00 2.386913e+00 7.340660e+07 -0.128646 -0.695048 (revise
                > >> > x 0.5) 2.980232e-08 788.155
                > >> > 26 1.560698e+00 2.386911e+00 7.340653e+07 -0.255254 -0.695048 (revise
                > >> > x 0.5) 1.490116e-08 822.468
                > >> > 27 1.335834e+00 2.381222e+00 7.323158e+07 -0.409947 -0.694208
                > >> > (revise x 0.5) 7.450581e-09 856.963
                > >> > 28 6.457887e-01 2.206131e+00 6.784688e+07 -0.202702 -0.667622 (revise
                > >> > x 0.5) 3.725290e-09 892.038
                > >> > 29 3.134720e-01 9.375462e-01 2.883309e+07 0.189059 -0.424878
                > >> > 5.721519e-03 1.000000e+00 928.184
                > >> > 30 2.615405e-01 1.543026e-01 4.745389e+06 0.710839 0.356731
                > >> > 3.218869e-03 1.000000e+00 964.430
                > >> > 31 2.527036e-01 1.169463e-01 3.596541e+06 0.397394 -0.292694
                > >> > 3.594861e-04 1.000000e+00 1000.642
                > >> > 32 2.478497e-01 3.924282e-02 1.206865e+06 0.797068 0.607724
                > >> > 5.707927e-03 1.000000e+00 1036.349
                > >> > 33 2.382581e-01 2.043136e-02 6.283417e+05 0.687158 0.373366
                > >> > 5.886233e-03 1.000000e+00 1072.106
                > >> > 34 2.325280e-01 1.878525e-02
                > >> > 5.777174e+05 0.723725 0.451241 2.183486e-02 1.000000e+00 1107.914
                > >> > 35 2.234863e-01 2.136923e-02 6.571848e+05 0.683187 0.380104
                > >> > 3.404222e-02 1.000000e+00 1143.795
                > >> > 36 2.161262e-01 7.854667e-03 2.415607e+05 0.677020 0.306281
                > >> > 2.641602e-02 1.000000e+00 1180.005
                > >> > 37 2.111394e-01 6.731176e-03 2.070091e+05 0.734114 0.365996
                > >> > 3.066953e-02 1.000000e+00 1215.994
                > >> > 38 2.064287e-01 7.055607e-03 2.169866e+05 0.791928 0.477421
                > >> > 9.820154e-02 1.000000e+00 1252.024
                > >> > 39 2.005127e-01 1.397243e-02 4.297050e+05 0.532899 0.010674
                > >> > 2.559544e-02 1.000000e+00 1288.122
                > >> > 40 1.968502e-01 1.953271e-03 6.007048e+04 0.581431 0.154733
                > >> > 3.655928e-03 1.000000e+00 1324.317
                > >> > 41 1.956542e-01 2.292869e-03 7.051438e+04 0.661248 0.613157
                > >> > 2.971351e-02 1.000000e+00 1360.552
                > >> > 42 1.928280e-01 2.290130e-03 7.043016e+04 0.598646 0.344799
                > >> > 5.504233e-02 1.000000e+00 1396.824
                > >> > 43 1.902657e-01 3.252593e-03 1.000295e+05 0.562222 0.065822
                > >> > 2.691962e-02 1.000000e+00 1433.380
                > >> > 44 1.881628e-01 9.226223e-04 2.837412e+04 0.705538 0.362939
                > >> > 1.867108e-02 1.000000e+00 1469.847
                > >> > 45 1.866176e-01 8.557320e-04 2.631699e+04 0.759632 0.393094
                > >> > 2.948932e-02 1.000000e+00 1506.383
                > >> > 46 1.851779e-01 1.614627e-03 4.965587e+04 0.641649 0.222579
                > >> > 5.023984e-02 1.000000e+00 1542.968
                > >> > 47 1.836210e-01 1.426560e-03 4.387212e+04 0.550367 0.067523
                > >> > 9.671028e-03 1.000000e+00 1579.712
                > >> > 48 1.824689e-01 6.715775e-04 2.065355e+04 0.740073 0.578498
                > >> > 3.563188e-02 1.000000e+00 1616.411
                > >> > 49 1.810121e-01 8.212360e-04 2.525611e+04 0.560126 0.208578
                > >> > 2.365269e-02 1.000000e+00 1653.145
                > >> > 50 1.799282e-01 1.155586e-03 3.553864e+04 0.709983 0.329706
                > >> > 4.263906e-02 1.000000e+00 1689.971
                > >> > 51 1.785792e-01 6.220927e-04 1.913170e+04 0.761242 0.439758
                > >> > 7.058747e-02 1.000000e+00 1726.856
                > >> > 52 1.770287e-01 7.729798e-04 2.377205e+04 0.704055 0.307157
                > >> > 1.056268e-01 1.000000e+00 1764.094
                > >> > 53 1.759657e-01 1.297268e-03 3.989590e+04 0.405447 -0.213848
                > >> > 3.006896e-03 1.000000e+00 1800.991
                > >> > 54 1.751416e-01 4.022849e-04 1.237178e+04 0.673106 0.447034
                > >> > 7.249319e-03 1.000000e+00 1838.000
                > >> > 55 1.745050e-01 3.247104e-04 9.986073e+03 0.606828 0.369200
                > >> > 1.471770e-02 1.000000e+00 1874.988
                > >> > 56 1.738613e-01 3.934394e-04 1.209975e+04 0.728943 0.567636
                > >> > 1.063217e-01 1.000000e+00 1911.937
                > >> > 57 1.727164e-01 1.591558e-03 4.894642e+04 0.547620 -0.063455
                > >> > 5.282468e-02 1.000000e+00 1948.869
                > >> > 58 1.726473e-01 1.510033e-02 4.643923e+05 0.043418 -0.577282
                > >> > Termination condition reached in pass 58: decrease in loss less than
                > >> > 0.100%.
                > >> > If you want to optimize further, decrease termination threshold.
                > >> > 1.219617e-03 1.000000e+00 1985.922
                > >> > 59 1.718465e-01 5.661531e-04 1.741135e+04 0.782218 0.216402
                > >> > 5.475620e-04 1.000000e+00 2045.012
                > >> > finished run
                > >> > number of examples = 184522680
                > >> > weighted example sum = 1.845e+08
                > >> > weighted label sum = 1.031e+08
                > >> > average loss = 0.8003
                > >> > best constant = 0.5589
                > >> > total feature number = 49403301420
                > >> >
                > >> >
                > >> >
                > >> > --- In vowpal_wabbit@yahoogroups.com, John Langford <jl@> wrote:
                > >> > >
                > >> > > Try commenting out first_hessian_on = true in drive_bfgs() in
                > >> > bfgs.cc in
                > >> > > the current master.
                > >> > >
                > >> > > -John
                > >> > >
                > >> > > On 02/21/2012 05:59 PM, regularizer wrote:
                > >> > > >
                > >> > > > Thanks, John! In fact I already turned off -ffast-math. Furthermore,
                > >> > > > it turned out that in my last pasted example it was still on.
                > >> > > > Turning
                > >> > > > it off in that environment produces results similar to the first
                > >> > two,
                > >> > > > i.e., failure. So in this case -ffast-math is the only way to
                > >> > > > produce
                > >> > > > good results with BFGS. In this case I'm just not able to get as
                > >> > > > good
                > >> > > > results with on-line learning.
                > >> > > >
                > >> > > > What you say about the quantile loss second derivative definitely
                > >> > > > makes sense. It would be good not to have to do the first huge
                > >> > > > misstep with quantile loss.
                > >> > > >
                > >> > > > I guess that's why beginning with on-line learning with quantile
                > >> > > > loss
                > >> > > > and continuing with BFGS did not work since the 1st BFGS step always
                > >> > > > goes someplace far away from where on-line learning got me.
                > >> > > >
                > >> > > > - Kari
                > >> > > >
                > >> > > > --- In vowpal_wabbit@yahoogroups.com
                > >> > > > <mailto:vowpal_wabbit%40yahoogroups.com>, John Langford jl@ wrote:
                > >> > > > >
                > >> > > > > > Different results are due to -ffast-math, which you could turn
                > >> > > > > off
                > >> > > > > > when compiling.
                > >> > > > >
                > >> > > > > The quantile loss is particularly brutal because quantile loss
                > >> > has no
                > >> > > > > meaningful second derivative. That means that it oversteps by
                > >> > a lot in
                > >> > > > > the first step, and then steps back for many steps. Maybe try
                > >> > > > > > increasing the regularizer to compensate. Alternatively, we could
                > >> > > > > modify LBFGS to not use the second derivative in the step
                > >> > direction on
                > >> > > > > the first pass.
                > >> > > > >
                > >> > > > > Zero curvature is pretty plausible, and more regularization may
                > >> > help
                > >> > > > there.
                > >> > > > >
                > >> > > > > -John
                > >> > > > >
                > >> > > > > On 2/21/12, regularizer regularizer@ wrote:
                > >> > > > > > vw --bfgs seems to be sensitive to the gcc version. I have three
                > >> > > > > > different versions on various machines and all produce different
                > >> > > > results
                > >> > > > > > (consistently). Is there a preferred compiler version?
                > >> > > > > >
                > >> > > > > > Here's what I'm trying to do:
                > >> > > > > >
                > >> > > > > > vw --loss_function quantile --quantile_tau 0.75 -f 75.rgr
                > >> > --passes 60
                > >> > > > > > --bfgs --l2 0.1 --cache_file vw.cache -q sd -q aa --mem 25
                > >> > > > > >
                > >> > > > > > I'll attach the outputs of vw compiled with three different
                > >> > compiler
                > >> > > > > > versions (all 64-bit systems). The lines are long, sorry. Notice
                > >> > > > how the
                > >> > > > > > results diverge radically at pass 22 while already differing
                > >> > > > slightly at
                > >> > > > > > pass 3. The first two produce predictors that predict all
                > >> > zeros, only
                > >> > > > > > the last one is good. Based on these three data points should I
                > >> > > > conclude
                > >> > > > > > that I need at least gcc 4.5.3 or that I need Cygwin instead
                > >> > of Linux?
                > >> > > > > >
                > >> > > > > > Also, what does BFGS do after "In wolfe_eval: Zero or negative
                > >> > > > curvature
                > >> > > > > > detected."? It spends a long time doing apparently nothing,
                > >> > and the
                > >> > > > > > resulting weights are not good.
                > >> > > > > >
                > >> > > > > > Thanks,
                > >> > > > > > - Kari
                > >> > > > > >
                > >> > > > > > ----------------------------------------------------------\
                > >> > > > > > -------
                > >> > > > > > This is under Linux with gcc version 4.1.2
                > >> > > > > >
                > >> > > > > > enabling BFGS based optimization **without** curvature
                > >> > > > > > calculation
                > >> > > > > > creating quadratic features for pairs: sd aa
                > >> > > > > > final_regressor = 75.rgr
                > >> > > > > > using cache_file = vw.cache
                > >> > > > > > ignoring text input in favor of cache input
                > >> > > > > > num sources = 1
                > >> > > > > > Num weight bits = 18
                > >> > > > > > learning rate = 10
                > >> > > > > > initial_t = 1
                > >> > > > > > power_t = 0.5
                > >> > > > > > decay_learning_rate = 1
                > >> > > > > > using l2 regularization
                > >> > > > > > m = 25
                > >> > > > > > Allocated 54M for weights and mem
                > >> > > > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
                > >> > > > > > mix fraction curvature dir. magnitude step size time
                > >> > > > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
                > >> > > > > > 1.500608e+08 4.614935e+15 9.999999e-01 99.191
                > >> > > > > > 3 7.503037e+07 1.404921e+01 4.320662e+08 -0.500000
                > >> > > > > > -1.695048 (revise x 0.4)
                > >> > > > > > 4.434236e-01 132.949
                > >> > > > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
                > >> > > > > > -1.138472 (revise x 0.4)
                > >> > > > > > 1.900951e-01 166.716
                > >> > > > > > 5 2.711309e+06 3.852626e+00 1.184828e+08 -0.095048
                > >> > > > > > -0.885143 (revise x 0.4)
                > >> > > > > > 7.967209e-02 200.672
                > >> > > > > > 6 4.762672e+05 2.958291e+00 9.097862e+07 -0.039836
                > >> > > > > > -0.774720 (revise x 0.4)
                > >> > > > > > 3.299097e-02 234.418
                > >> > > > > > 7 8.166480e+04 2.615997e+00 8.045179e+07 -0.016496
                > >> > > > > > -0.728039 (revise x 0.4)
                > >> > > > > > 1.358448e-02 268.178
                > >> > > > > > 8 1.384746e+04 2.479955e+00 7.626798e+07 -0.006793
                > >> > > > > > -0.708633 (revise x 0.4)
                > >> > > > > > 5.579975e-03 302.334
                > >> > > > > > 9 2.337698e+03 2.424913e+00 7.457523e+07 -0.002791
                > >> > > > > > -0.700628 (revise x 0.4)
                > >> > > > > > 2.289690e-03 336.101
                > >> > > > > > 10 3.949191e+02 2.402469e+00 7.388500e+07 -0.001148
                > >> > > > > > -0.697338 (revise x 0.4)
                > >> > > > > > 9.391521e-04 369.972
                > >> > > > > > 11 6.774264e+01 2.393287e+00 7.360262e+07 -0.000478
                > >> > > > > > -0.695987 (revise x 0.4)
                > >> > > > > > 3.851381e-04 403.756
                > >> > > > > > 12 1.269738e+01 2.389526e+00 7.348695e+07 -0.000212
                > >> > > > > > -0.695433 (revise x 0.4)
                > >> > > > > > 1.579278e-04 437.512
                > >> > > > > > 13 3.440464e+00 2.387984e+00 7.343953e+07 -0.000127
                > >> > > > > > -0.695206 (revise x 0.4)
                > >> > > > > > 6.475448e-05 471.257
                > >> > > > > > 14 1.884181e+00 2.387352e+00 7.342009e+07 -0.000151
                > >> > > > > > -0.695113 (revise x 0.4)
                > >> > > > > > 2.654803e-05 505.011
                > >> > > > > > 15 1.622633e+00 2.387093e+00 7.341212e+07 -0.000302
                > >> > > > > > -0.695075 (revise x 0.4)
                > >> > > > > > 1.088143e-05 538.771
                > >> > > > > > 16 1.578712e+00 2.386986e+00 7.340885e+07 -0.000710
                > >> > > > > > -0.695059 (revise x 0.4)
                > >> > > > > > 4.457373e-06 572.513
                > >> > > > > > 17 1.571349e+00 2.386943e+00 7.340751e+07 -0.001723
                > >> > > > > > -0.695052 (revise x 0.4)
                > >> > > > > > 1.823206e-06 606.268
                > >> > > > > > 18 1.570121e+00 2.386925e+00 7.340696e+07 -0.004207
                > >> > > > > > -0.695050 (revise x 0.4)
                > >> > > > > > 7.430751e-07 640.011
                > >> > > > > > 19 1.569918e+00 2.386918e+00 7.340674e+07 -0.010320
                > >> > > > > > -0.695049 (revise x 0.4)
                > >> > > > > > 3.001712e-07 674.087
                > >> > > > > > 20 1.569885e+00 2.386915e+00 7.340665e+07 -0.025547
                > >> > > > > > -0.695048 (revise x 0.4)
                > >> > > > > > 1.185601e-07 707.928
                > >> > > > > > 21 1.569872e+00 2.386913e+00 7.340661e+07 -0.064679
                > >> > > > > > -0.695048 (revise x 0.4)
                > >> > > > > > 4.409118e-08 741.700
                > >> > > > > > 22 1.043330e+00 2.244743e+00 6.903435e+07 -0.094338
                > >> > > > > > -0.673731 (revise x 0.3)
                > >> > > > > > 1.526299e-08 775.480
                > >> > > > > > 23 4.191600e-01
                > >> > > > > > In wolfe_eval: Zero or negative curvature detected.
                > >> > > > > > To increase curvature you can increase regularization or rescale
                > >> > > > > > features.
                > >> > > > > > It is also very likely that you have reached numerical accuracy
                > >> > > > > > and further decrease in the objective cannot be reliably
                > >> > > > > > detected.
                > >> > > > > > (revise x 0.0) 0.000000e+00 809.273
                > >> > > > > > 24 4.191601e-01
                > >> > > > > > (revise x 0.0) 0.000000e+00 1539.289
                > >> > > > > > Net time spent in communication = 0 seconds
                > >> > > > > > Net time spent = 1539.3 seconds
                > >> > > > > > finished run
                > >> > > > > > number of examples = 184522680
                > >> > > > > > weighted example sum = 1.845e+08
                > >> > > > > > weighted label sum = 1.031e+08
                > >> > > > > > average loss = 0.5425
                > >> > > > > > best constant = 0.5589
                > >> > > > > > total feature number = 49403301420
                > >> > > > > >
                > >> > ----------------------------------------------------------\
                > >> > > > > > -------
                > >> > > > > > This is under Linux with gcc version 4.3.5
                > >> > > > > >
                > >> > > > > > enabling BFGS based optimization **without** curvature
                > >> > > > > > calculation
                > >> > > > > > creating quadratic features for pairs: sd aa
                > >> > > > > > final_regressor = 75.rgr
                > >> > > > > > using cache_file = vw.cache
                > >> > > > > > ignoring text input in favor of cache input
                > >> > > > > > num sources = 1
                > >> > > > > > Num weight bits = 18
                > >> > > > > > learning rate = 10
                > >> > > > > > initial_t = 1
                > >> > > > > > power_t = 0.5
                > >> > > > > > decay_learning_rate = 1
                > >> > > > > > using l2 regularization
                > >> > > > > > m = 25
                > >> > > > > > Allocated 54M for weights and mem
                > >> > > > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
                > >> > > > > > mix fraction curvature dir. magnitude step size time
                > >> > > > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
                > >> > > > > > 1.500608e+08 4.614935e+15 1.000000e+00 47.379
                > >> > > > > > 3 7.503038e+07 1.404921e+01 4.320662e+08 -0.500000
                > >> > > > > > -1.695048 (revise x 0.4)
                > >> > > > > > 4.434237e-01 69.483
                > >> > > > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
                > >> > > > > > -1.138472 (revise x 0.4)
                > >> > > > > > 1.900951e-01 91.311
                > >> > > > > > 5 2.711310e+06 3.852627e+00 1.184828e+08 -0.095048
                > >> > > > > > -0.885143 (revise x 0.4)
                > >> > > > > > 7.967210e-02 113.372
                > >> > > > > > 6 4.762674e+05 2.958291e+00 9.097863e+07 -0.039836
                > >> > > > > > -0.774720 (revise x 0.4)
                > >> > > > > > 3.299098e-02 136.162
                > >> > > > > > 7 8.166488e+04 2.615997e+00 8.045179e+07 -0.016496
                > >> > > > > > -0.728039 (revise x 0.4)
                > >> > > > > > 1.358448e-02 158.192
                > >> > > > > > 8 1.384749e+04 2.479955e+00 7.626799e+07 -0.006793
                > >> > > > > > -0.708633 (revise x 0.4)
                > >> > > > > > 5.579975e-03 179.810
                > >> > > > > > 9 2.337710e+03 2.424913e+00 7.457524e+07 -0.002791
                > >> > > > > > -0.700628 (revise x 0.4)
                > >> > > > > > 2.289690e-03 201.564
                > >> > > > > > 10 3.949243e+02 2.402469e+00 7.388500e+07 -0.001148
                > >> > > > > > -0.697338 (revise x 0.4)
                > >> > > > > > 9.391521e-04 223.651
                > >> > > > > > 11 6.774474e+01 2.393287e+00 7.360263e+07 -0.000478
                > >> > > > > > -0.695987 (revise x 0.4)
                > >> > > > > > 3.851381e-04 245.367
                > >> > > > > > 12 1.269823e+01 2.389526e+00 7.348695e+07 -0.000212
                > >> > > > > > -0.695433 (revise x 0.4)
                > >> > > > > > 1.579278e-04 267.813
                > >> > > > > > 13 3.440815e+00 2.387984e+00 7.343954e+07 -0.000128
                > >> > > > > > -0.695206 (revise x 0.4)
                > >> > > > > > 6.475448e-05 289.566
                > >> > > > > > 14 1.884324e+00 2.387352e+00 7.342010e+07 -0.000151
                > >> > > > > > -0.695113 (revise x 0.4)
                > >> > > > > > 2.654803e-05 311.194
                > >> > > > > > 15 1.622692e+00 2.387093e+00 7.341213e+07 -0.000302
                > >> > > > > > -0.695075 (revise x 0.4)
                > >> > > > > > 1.088143e-05 332.782
                > >> > > > > > 16 1.578736e+00 2.386986e+00 7.340886e+07 -0.000710
                > >> > > > > > -0.695059 (revise x 0.4)
                > >> > > > > > 4.457373e-06 354.700
                > >> > > > > > 17 1.571359e+00 2.386943e+00 7.340752e+07 -0.001723
                > >> > > > > > -0.695053 (revise x 0.4)
                > >> > > > > > 1.823206e-06 376.323
                > >> > > > > > 18 1.570125e+00 2.386925e+00 7.340697e+07 -0.004207
                > >> > > > > > -0.695050 (revise x 0.4)
                > >> > > > > > 7.430750e-07 398.265
                > >> > > > > > 19 1.569920e+00 2.386918e+00 7.340674e+07 -0.010320
                > >> > > > > > -0.695049 (revise x 0.4)
                > >> > > > > > 3.001712e-07 419.779
                > >> > > > > > 20 1.569886e+00 2.386915e+00 7.340665e+07 -0.025547
                > >> > > > > > -0.695048 (revise x 0.4)
                > >> > > > > > 1.185601e-07 441.545
                > >> > > > > > 21 1.569880e+00 2.386913e+00 7.340661e+07 -0.064679
                > >> > > > > > -0.695048 (revise x 0.4)
                > >> > > > > > 4.409114e-08 463.328
                > >> > > > > > 22 1.551396e+00 2.386894e+00 7.340600e+07 -0.171127
                > >> > > > > > -0.695045 (revise x 0.3)
                > >> > > > > > 1.362805e-08 484.935
                > >> > > > > > 23 4.191623e-01
                > >> > > > > > In wolfe_eval: Zero or negative curvature detected.
                > >> > > > > > To increase curvature you can increase regularization or rescale
                > >> > > > > > features.
                > >> > > > > > It is also very likely that you have reached numerical
                > >> > > > > > accuracy
                > >> > > > > > and further decrease in the objective cannot be reliably
                > >> > > > > > detected.
                > >> > > > > > (revise x 0.0) 0.000000e+00 506.788
                > >> > > > > > 24 4.191600e-01
                > >> > > > > > (revise x 0.0) 0.000000e+00 987.156
                > >> > > > > > Net time spent in communication = 0 seconds
                > >> > > > > > Net time spent = 987.16 seconds
                > >> > > > > > finished run
                > >> > > > > > number of examples = 184522680
                > >> > > > > > weighted example sum = 1.845e+08
                > >> > > > > > weighted label sum = 1.031e+08
                > >> > > > > > average loss = 0.5509
                > >> > > > > > best constant = 0.5589
                > >> > > > > > total feature number = 49403301420
                > >> > > > > >
                > >> > > > > > ----------------------------------------------------------\
                > >> > > > > > -------
                > >> > > > > > This is under Cygwin with gcc version 4.5.3
                > >> > > > > >
                > >> > > > > > enabling BFGS based optimization **without** curvature
                > >> > > > > > calculation
                > >> > > > > > creating quadratic features for pairs: sd aa
                > >> > > > > > final_regressor = 75.rgr
                > >> > > > > > using cache_file = vw.cache
                > >> > > > > > ignoring text input in favor of cache input
                > >> > > > > > num sources = 1
                > >> > > > > > Num weight bits = 18
                > >> > > > > > learning rate = 10
                > >> > > > > > initial_t = 1
                > >> > > > > > power_t = 0.5
                > >> > > > > > decay_learning_rate = 1
                > >> > > > > > using l2 regularization
                > >> > > > > > m = 25
                > >> > > > > > Allocated 54M for weights and mem
                > >> > > > > > ## avg. loss der. mag. d. m. cond. wolfe1 wolfe2
                > >> > > > > > mix fraction curvature dir. magnitude step size time
                > >> > > > > > 1 4.191599e-01 4.879425e+00 1.500608e+08
                > >> > > > > > 1.500608e+08 4.614935e+15 1.000000e+00 73.531
                > >> > > > > > 3 7.503038e+07 1.404921e+01 4.320662e+08 -0.500000
                > >> > > > > > -1.695048 (revise x 0.4)
                > >> > > > > > 4.434237e-01 109.199
                > >> > > > > > 4 1.475281e+07 6.354012e+00 1.954099e+08 -0.221712
                > >> > > > > > -1.138472 (revise x 0.4)
                > >> > > > > > 1.900951e-01 144.549
                > >> > > > > > 5 2.711310e+06 3.852627e+00 1.184828e+08 -0.095048
                > >> > > > > > -0.885143 (revise x 0.4)
                > >> > > > > > 7.967210e-02 180.274
                > >> > > > > > 6 4.762675e+05 2.958291e+00 9.097863e+07 -0.039836
                > >> > > > > > -0.774720 (revise x 0.4)
                > >> > > > > > 3.299098e-02 215.368
                > >> > > > > > 7 8.166493e+04 2.615997e+00 8.045180e+07 -0.016496
                > >> > > > > > -0.728039 (revise x 0.4)
                > >> > > > > > 1.358448e-02 251.118
                > >> > > > > > 8 1.384751e+04 2.479955e+00 7.626799e+07 -0.006793
                > >> > > > > > -0.708633 (revise x 0.4)
                > >> > > > > > 5.579975e-03 286.362
                > >> > > > > > 9 2.337717e+03 2.424913e+00 7.457524e+07 -0.002791
                > >> > > > > > -0.700628 (revise x 0.4)
                > >> > > > > > 2.289690e-03 321.647
                > >> > > > > > 10 3.949270e+02 2.402469e+00 7.388500e+07 -0.001148
                > >> > > > > > -0.697338 (revise x 0.4)
                > >> > > > > > 9.391521e-04 359.588
                > >> > > > > > 11 6.774589e+01 2.393287e+00 7.360263e+07 -0.000478
                > >> > > > > > -0.695987 (revise x 0.4)
                > >> > > > > > 3.851380e-04 398.898
                > >> > > > > > 12 1.269871e+01 2.389526e+00 7.348695e+07 -0.000212
                > >> > > > > > -0.695433 (revise x 0.4)
                > >> > > > > > 1.579277e-04 436.665
                > >> > > > > > 13 3.441010e+00 2.387984e+00 7.343954e+07 -0.000128
                > >> > > > > > -0.695206 (revise x 0.4)
                > >> > > > > > 6.475448e-05 472.627
                > >> > > > > > 14 1.884404e+00 2.387352e+00 7.342010e+07 -0.000151
                > >> > > > > > -0.695113 (revise x 0.4)
                > >> > > > > > 2.654803e-05 507.751
                > >> > > > > > 15 1.622725e+00 2.387093e+00 7.341213e+07 -0.000302
                > >> > > > > > -0.695075 (revise x 0.4)
                > >> > > > > > 1.088143e-05 543.170
                > >> > > > > > 16 1.578749e+00 2.386987e+00 7.340886e+07 -0.000710
                > >> > > > > > -0.695059 (revise x 0.4)
                > >> > > > > > 4.457372e-06 578.753
                > >> > > > > > 17 1.571365e+00 2.386943e+00 7.340752e+07 -0.001723
                > >> > > > > > -0.695053 (revise x 0.4)
                > >> > > > > > 1.823206e-06 613.860
                > >> > > > > > 18 1.570127e+00 2.386925e+00 7.340697e+07 -0.004207
                > >> > > > > > -0.695050 (revise x 0.4)
                > >> > > > > > 7.430749e-07 650.090
                > >> > > > > > 19 1.569920e+00 2.386918e+00 7.340674e+07 -0.010320
                > >> > > > > > -0.695049 (revise x 0.4)
                > >> > > > > > 3.001711e-07 686.592
                > >> > > > > > 20 1.569886e+00 2.386915e+00 7.340665e+07 -0.025547
                > >> > > > > > -0.695048 (revise x 0.4)
                > >> > > > > > 1.185601e-07 729.522
                > >> > > > > > 21 1.569881e+00 2.386914e+00 7.340661e+07 -0.064679
                > >> > > > > > -0.695048 (revise x 0.4)
                > >> > > > > > 4.409113e-08 765.928
                > >> > > > > > 22 1.567969e+00 2.386913e+00 7.340660e+07 -0.173632
                > >> > > > > > -0.695048 (revise x 0.3)
                > >> > > > > > 1.356294e-08 806.912
                > >> > > > > > 23 3.738755e-01 1.373830e+00 4.225047e+07 0.022250
                > >> > > > > > -0.520586 1.129881e-01
                > >> > > > > > 1.000000e+00 845.604
                > >> > > > > > 24 2.744955e-01 2.885634e-01 8.874415e+06 0.772202 0.455316
                > >> > > > > > 1.031045e-01 1.000000e+00 880.431
                > >> > > > > > 25 2.707598e-01 8.013777e-01 2.464539e+07 0.060259
                > >> > > > > > -1.159684 2.684013e-02
                > >> > > > > > 1.000000e+00 916.249
                > >> > > > > > 26 2.521993e-01 3.737808e-02 1.149517e+06 0.437760
                > >> > > > > > -0.119104 9.070075e-04
                > >> > > > > > 1.000000e+00 951.681
                > >> > > > > > 27 2.487337e-01 2.980484e-02 9.166114e+05 0.890319 0.774747
                > >> > > > > > 1.577486e-02 1.000000e+00 988.121
                > >> > > > > > 28 2.388466e-01 6.851274e-02 2.107026e+06 0.590096 0.194143
                > >> > > > > > 4.510566e-03 1.000000e+00 1024.557
                > >> > > > > > 29 2.331922e-01 3.384458e-02 1.040849e+06 0.785676 0.551578
                > >> > > > > > 1.230063e-02 1.000000e+00 1060.234
                > >> > > > > > 30 2.257282e-01 1.035047e-02 3.183161e+05 0.726034 0.453913
                > >> > > > > > 1.402782e-02 1.000000e+00 1096.908
                > >> > > > > > 31 2.205253e-01 1.324547e-02 4.073484e+05 0.730802 0.456633
                > >> > > > > > 2.145415e-02 1.000000e+00 1134.393
                > >> > > > > > 32 2.157588e-01 9.859573e-03 3.032191e+05 0.759526 0.475091
                > >> > > > > > 5.921376e-02 1.000000e+00 1169.727
                > >> > > > > > 33 2.088887e-01 1.235760e-02 3.800429e+05 0.730657 0.419121
                > >> > > > > > 1.841315e-01 1.000000e+00 1205.274
                > >> > > > > > 34 2.101704e-01 9.911478e-02 3.048154e+06 -0.092354
                > >> > > > > > -1.324207 (revise x 0.5)
                > >> > > > > > 5.300100e-01 1240.358
                > >> > > > > > 35 2.056780e-01 1.969835e-02 6.057987e+05 0.436495
                > >> > > > > > -0.248880 9.528314e-03
                > >> > > > > > 1.000000e+00 1276.848
                > >> > > > > > 36 2.022406e-01 3.700955e-03 1.138184e+05 0.668004 0.316562
                > >> > > > > > 5.401969e-03 1.000000e+00 1312.630
                > >> > > > > > 37 2.004172e-01 2.586845e-03 7.955525e+04 0.755313 0.517794
                > >> > > > > > 2.109435e-02 1.000000e+00 1348.003
                > >> > > > > > 38 1.974068e-01 2.885261e-03 8.873268e+04 0.724907 0.456644
                > >> > > > > > 4.156245e-02 1.000000e+00 1383.463
                > >> > > > > > 39 1.940215e-01 2.243242e-03 6.898818e+04 0.704875 0.418325
                > >> > > > > > 6.534269e-02 1.000000e+00 1418.902
                > >> > > > > > 40 1.910583e-01 5.228435e-03 1.607941e+05 0.592461 0.078042
                > >> > > > > > 2.155944e-02 1.000000e+00 1454.386
                > >> > > > > > 41 1.891780e-01 1.062300e-03 3.266975e+04 0.603013 0.277422
                > >> > > > > > 6.547502e-03 1.000000e+00 1489.848
                > >> > > > > > 42 1.881406e-01 1.233737e-03 3.794209e+04 0.698576 0.496102
                > >> > > > > > 2.127681e-02 1.000000e+00 1525.388
                > >> > > > > > 43 1.864403e-01 1.268378e-03 3.900742e+04 0.714181 0.491254
                > >> > > > > > 8.674439e-02 1.000000e+00 1560.996
                > >> > > > > > 44 1.844092e-01 3.018823e-03 9.284021e+04 0.512288
                > >> > > > > > -0.040180 4.881849e-02
                > >> > > > > > 1.000000e+00 1596.506
                > >> > > > > > 45 1.827905e-01 1.491786e-03 4.587806e+04 0.475903
                > >> > > > > > -0.057879 3.470570e-03
                > >> > > > > > 1.000000e+00 1632.151
                > >> > > > > > 46 1.817528e-01 5.705357e-04 1.754613e+04 0.750989 0.470422
                > >> > > > > > 1.187489e-02 1.000000e+00 1668.191
                > >> > > > > > 47 1.806794e-01 6.571171e-04 2.020884e+04 0.702106 0.352037
                > >> > > > > > 2.917945e-02 1.000000e+00 1703.878
                > >> > > > > > 48 1.794564e-01 9.732167e-04 2.993009e+04 0.767566 0.477306
                > >> > > > > > 1.323640e-01 1.000000e+00 1739.512
                > >> > > > > > 49 1.780222e-01 1.737843e-03 5.344525e+04 0.451400
                > >> > > > > > -0.129532 1.930031e-02
                > >> > > > > > 1.000000e+00 1775.168
                > >> > > > > > 50 1.768670e-01 5.968434e-04 1.835519e+04 0.610033 0.239879
                > >> > > > > > 6.557360e-03 1.000000e+00 1810.845
                > >> > > > > > 51 1.761865e-01 5.629359e-04 1.731241e+04 0.721832 0.542742
                > >> > > > > > 3.182580e-02 1.000000e+00 1846.487
                > >> > > > > > 52 1.751273e-01 5.249340e-04 1.614370e+04 0.667468 0.420878
                > >> > > > > > 8.439630e-02 1.000000e+00 1883.680
                > >> > > > > > 53 1.742363e-01 1.555000e-02 4.782213e+05 0.441844
                > >> > > > > > -0.451383 3.352547e-03
                > >> > > > > > 1.000000e+00 1919.294
                > >> > > > > > 54 1.737074e-01 1.324455e-03 4.073200e+04 0.408970 0.251978
                > >> > > > > > 1.574223e-03 1.000000e+00 1955.011
                > >> > > > > > 55 1.734941e-01 4.903513e-04 1.508016e+04 0.549206 0.444421
                > >> > > > > > 4.880569e-03 1.000000e+00 1990.919
                > >> > > > > > 56 1.732157e-01 6.418537e-04 1.973943e+04 0.761774 0.487686
                > >> > > > > > 1.338996e-02 1.000000e+00 2026.526
                > >> > > > > > 57 1.726689e-01 3.739879e-04 1.150154e+04 0.793421 0.412380
                > >> > > > > > 1.254878e-01 1.000000e+00 2062.208
                > >> > > > > > 58 1.723390e-01 2.002074e-03 6.157133e+04 0.309866
                > >> > > > > > -0.856343 1.059850e-02
                > >> > > > > > 1.000000e+00 2098.129
                > >> > > > > > 59 1.717457e-01 5.019926e-04 1.543817e+04 0.356830
                > >> > > > > > -0.113790 5.999178e-03
                > >> > > > > > 1.000000e+00 2133.891
                > >> > > > > > 60 1.714882e-01 2.208578e-04 6.792212e+03 0.613648 0.443083
                > >> > > > > > 4.132071e-03 1.000000e+00 2169.619
                > >> > > > > > Net time spent in communication = 0 seconds
                > >> > > > > > Net time spent = 2169.6 seconds
                > >> > > > > > finished run
                > >> > > > > > number of examples = 184522680
                > >> > > > > > weighted example sum = 1.845e+08
                > >> > > > > > weighted label sum = 1.031e+08
                > >> > > > > > average loss = 0.666
                > >> > > > > > best constant = 0.5589
                > >> > > > > > total feature number = 49403301420