
Re: Interesting idea - Musical Tuning and Human Biology - today's NewScientist -

  • francois_laferriere
    Message 1 of 87 , Sep 1, 2003

      Martin,

      I have exposed the flaw in the method three times, with different
      wording and/or more detail, asking you to pinpoint exactly where you
      either disagree or do not understand (because seemingly I am not that
      good at being clear).

      Seemingly, you are unwilling to enter a fair discussion, but just
      repeatedly state that I misinform the group (how?), that I state false
      facts (which ones, please?), and that my writing is unclear (in spite
      of the fact that some other group members, like Paul, tell me that I
      am not totally obscure).

      After that carload of pleasantries, you state that "I" should apologize!

      Ok, I take it as humour...

      > My question was, did they put it into the
      > paper by choosing a special trick to get it in.
      > Fig 3A is an empirical one.
      > It does reflect the nature of human speech,
      > not a methods decision by the authors!!!

      In this figure 3A, there is indeed a methodological decision, even
      though probably not a malicious one. It is commonly accepted (sorry,
      no bibliography) that there is no significant physical coupling
      between the vocal folds and the vocal tract, so there is no
      significant correlation between f0 (the pitch) and F1, F2, ... (the
      formant frequencies). So f0 and F1 can be considered continuous
      independent random variables (or slightly correlated ones; that is
      not important for what follows).

      The value of Fm does not correspond to a physical property of the
      vocal tract, unlike F1 (effective vocal tract). F1 can be computed
      from the signal by various interpolation methods. F1 is continuously
      distributed and is in no way forced to be an integer multiple of f0.

      On the other hand, Fm is, by definition, a multiple of f0; thus it is
      a round-off of the real physical value (F1). Fm has no physical
      meaning (numerology excluded). By this simple process of rounding Fm
      off to an integer multiple of f0, a continuous spectrum is
      artificially transformed into a discrete spectrum with a very limited
      number of significant values.

      The harmonic number of Fm is not a "natural" measure; it is the result
      of a computation that creates an artificial set of discrete random
      variables that have, conveniently, a very limited range, and that are
      afterwards "shaken and baked" to produce a very poor continuous
      "normalized" spectrum that is just a discrete spectrum in disguise.

      So no, figure 3A reflects nothing of the nature of human speech.
      Taking Fm as
      round(F1 / speaker height in centimeters)
      would have given very similar results.
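
      The rounding argument above can be tried out with a small simulation.
      This is only a sketch under stated assumptions: the f0 and F1 ranges
      and the uniform distributions are illustrative guesses, not values
      taken from the paper, and `harmonic_number` is a hypothetical helper
      that treats Fm as the multiple of f0 nearest to F1.

```python
import random
from collections import Counter

def harmonic_number(f1_hz, f0_hz):
    """Nearest integer to F1/f0 (at least 1): the harmonic number of Fm."""
    return max(1, round(f1_hz / f0_hz))

def simulate(n_samples=10_000, seed=0):
    """Draw f0 and F1 independently and count the resulting harmonic numbers."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        f0 = rng.uniform(80.0, 250.0)    # assumed pitch range, Hz
        f1 = rng.uniform(250.0, 850.0)   # assumed first-formant range, Hz
        counts[harmonic_number(f1, f0)] += 1
    return counts

counts = simulate()
for n in sorted(counts):
    print(n, counts[n])
```

      Although F1/f0 is continuous, the rounded value can only land on a
      handful of small integers (at most 1 to 11 with these assumed
      ranges), which is the discretisation described above.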

      It's another way to explain the error in the method.

      Again, as soon as you pinpoint where I am wrong or unclear, I am
      absolutely willing to admit my error (it would not be the first time
      I make a fool of myself on this list) or to explain until
      crystal-like clarity is reached (I am not that good at pedagogy, but
      I may try again). Be my guest, Martin.

      > Martin
      > What then was this error? I can see no error in the FFT analysis.

      No error in the FFT presented; as I explained, things go bad just

      With even more goodwill, I recap below, in three sentences as you
      asked (even though it is just a summary, not a clarification):

      1- The theoretical "normalized" spectrum is a spectrum of discrete
      values at some integer ratios. No surprise that the actual normalized
      spectrum has peaks.
      2- The physical relationship between the f0 range and the F1 range
      limits the number of significant peaks in the interval [1,2].
      3- This leads to the trivial result (peaks that are physically
      unlikely do not appear; those that are more likely, given the
      distribution of the harmonic number of Fm, such as N/2, N/3, N/4,
      N/5, are prominent).
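
      The three points above can be illustrated by listing the ratios that
      can appear at all. This is a sketch, not the paper's procedure: it
      assumes that Fm sits on harmonic number n, that n only takes the
      values 2 to 5 (an illustrative range), and `normalized_peaks` is a
      hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction

def normalized_peaks(harmonic_numbers):
    """For each ratio k/n in [1, 2], count how many harmonic numbers n yield it."""
    hits = Counter()
    for n in harmonic_numbers:
        for k in range(n, 2 * n + 1):   # harmonics k*f0 lying between Fm and 2*Fm
            hits[Fraction(k, n)] += 1
    return hits

peaks = normalized_peaks(range(2, 6))   # assumed harmonic numbers n = 2..5
for ratio in sorted(peaks):
    print(ratio, peaks[ratio])
```

      Only eleven distinct ratios come out, and those reached by several
      values of n (such as 3/2, reached for n = 2 and n = 4) are counted
      more than once.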

      All this is just due to acoustics, elementary algebra and fairly
      simple statistical concepts; it is totally neuroscience-free,
      otherwise I would not have permitted myself to challenge you,
      Martin :).

      All the details are in my previous posts.

      yours truly

      François Laferrière
    • francois_laferriere
      Message 87 of 87 , Sep 15, 2003
        Hello Martin,

        A final post (as far as I am concerned) on this discussion, which
        is becoming very technical, more and more distant from the list's
        concerns and somewhat tedious.

        > François:
        > So, on the window average, f5/f4 seems to be inharmonic
        > (1.18 instead of 1.25).

        > To get rid of this annoying effect, you have to use shorter window,
        > but for short windows, the uncertainty principle kicks in. So
        > compromise must be made on window size to get best spectral estimate.

        > This computational inharmonicity do exist (and in fact occur all the
        > time for .1 sec window)

        > Martin:
        > I am glad that we agree on the description of the phenomenon of
        > disharmonization in pitch shifts. It's also fine that you saw in
        > data what I expected by only thinking through the physics.

        > François:
        > But again:
        > - This inharmonicity is not physical

        > Martin:
        > Here we still disagree. You are right that the ratio 5/4 (of your
        > does not disappear. But what happens, if your time windows are so
        > short that the "errors" through averaging disappear? You'll see this:
        > in some
        > there is no power at the 4th partial and in other windows there is
        > no power at the 5th partials.

        Not at all, what appears is:
        - As the window size gets smaller, harmonic peaks get broader, until they
        (and no peak extraction algorithm is of any help).
        - As they get broader, the assessment of harmonicity is less and less

        > So you have a 5/4 ratio all the time, but one that is
        > not real. What you have in reality is a ratio between real peaks
        > (those that have power) which deviates from 5/4.

        Even if that occurs (a harmonic is so small that a given peak
        vanishes in the noise), that does not mean that it ceases to exist,
        any more than vanishing stars cease to exist in daylight. In other
        words, that some peaks cease to be measurable is more a
        signal-to-noise problem than anything related to a modification of
        the underlying physical process.
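
        The window-size trade-off discussed above can be put in numbers
        with the usual rule of thumb that a window of duration T resolves
        frequencies about 1/T apart. This is a rough sketch: the exact
        factor depends on the window shape, the 100 Hz pitch is an assumed
        example value, and both function names are hypothetical.

```python
def resolution_hz(window_s):
    """Rule-of-thumb frequency resolution (about 1/T) for a window of T seconds."""
    return 1.0 / window_s

def harmonics_resolved(f0_hz, window_s):
    """True when adjacent harmonics (f0 apart) are wider apart than the resolution."""
    return f0_hz > resolution_hz(window_s)

# Assumed example: a voice at f0 = 100 Hz analysed with shrinking windows.
for window_s in (0.1, 0.02, 0.005):
    print(window_s, resolution_hz(window_s), harmonics_resolved(100.0, window_s))
```

        With a 0.1 s window the resolution is about 10 Hz and the harmonics
        stay separate; with a 5 ms window it is about 200 Hz, wider than
        the harmonic spacing, so neighbouring peaks can no longer be told
        apart.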

        > François:
        > - It shall spread the peaks but not shift them on the average as they
        > are as likely to contribute at right or at left of each peak.

        > again we seem to agree

        > Martin:
        > exactly

        We never agreed on so much before, great!

        > Martin
        > The peaks are not shifted, of course, but the majority of ratio
        > probability is FLAT, that is BETWEEN the low-order-ratio peaks.
        > This is what the
        > figures of the study show.

        > François:
        > The paper rightfully focus on the peaks, not on the background.

        > Martin:
        > The interpretation of the authors focuses on the peaks. But they
        > the complete spectra for anybody to see.

        > François:
        > Secondly, I see no flat floor but gentle slope on each side of each
        > peak that eventually merge with the neighbouring peaks.

        > Martin:
        > François, below the slopes - in fact: below the dips - there is a HIGH
        > plateau of noise !!!
        > For example in Fig.2C the noise floor is 20 times (!) as high as the
        > difference between the peak at 5/4 and the valley between 5/4 and 6/5.

        20 times? What do you mean? 13 dB? Where do you see a "floor" (at
        which ratio value)?
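
        For reference, the conversion behind the "13 dB" figure: a factor
        of 20 is about 13 dB only if it is read as a power ratio; read as
        an amplitude ratio it would be about 26 dB. Which reading Martin
        intended is not stated, so both are shown in this small sketch.

```python
import math

def power_ratio_to_db(ratio):
    """Decibels for a ratio of powers."""
    return 10.0 * math.log10(ratio)

def amplitude_ratio_to_db(ratio):
    """Decibels for a ratio of amplitudes."""
    return 20.0 * math.log10(ratio)

print(round(power_ratio_to_db(20), 1))      # 13.0
print(round(amplitude_ratio_to_db(20), 1))  # 26.0
```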

        > François:
        > I do not understand why you focus on what occur BETWEEN the peaks;
        > this is very secondary.

        There is no floor, but a steep slope of around -20 dB per octave
        between 1 and 2.
        To have a clearer picture, we should remove or "normalise" this slope.

        But to do so, we ought to know where it comes from.

        Where does it come from? In normal speech, there is around -6 dB
        per octave from the glottal source (it may vary a lot) and -6 dB per
        octave from the lip radiation characteristic, so roughly -12 dB per
        octave. The missing -8 dB comes from the fact that most data gathered
        for ratios between 1 and 2 are taken on the right-hand side of a
        formant. But it is certainly not a straight, simple -8 dB per octave
        because, near 1, the more complex (N+1)/N ratios such as 7:6
        (compared to the less complex 3:2) contribute more due to their
        average proximity to the formant top. I see no way to "predict" this
        value of -20 dB because it mixes up the contributions of formant
        bandwidth, made even more complex by the normalisation process, and
        of glottis and lip radiation. Instead of isolating the variables
        contributing to the background, the normalisation makes them
        impossible to disentangle.
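
        The slope budget above can be written down as plain arithmetic,
        along with what a first-order "normalisation" of the slope would
        look like. This is a sketch only: it treats the -20 dB per octave
        as a straight line, which the paragraph above says it is not, and
        `flatten` is a hypothetical helper, not something from the paper.

```python
import math

GLOTTAL_DB_PER_OCT = -6.0     # glottal source, figure quoted in the post
LIP_DB_PER_OCT = -6.0         # lip radiation, figure quoted in the post
OBSERVED_DB_PER_OCT = -20.0   # slope read off the normalised spectrum

def unexplained_slope():
    """Part of the observed slope not covered by glottal source and lip radiation."""
    return OBSERVED_DB_PER_OCT - (GLOTTAL_DB_PER_OCT + LIP_DB_PER_OCT)

def flatten(level_db, ratio, slope_db_per_oct=OBSERVED_DB_PER_OCT):
    """Remove a straight-line tilt from a level measured at a normalised ratio."""
    return level_db - slope_db_per_oct * math.log2(ratio)

print(unexplained_slope())    # -8.0
print(flatten(-20.0, 2.0))    # 0.0
```

        A level that is 20 dB down at ratio 2 comes back to 0 dB after
        flattening; any residual curvature would be the part that the
        straight-line assumption does not capture.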

        Not being able to model this high-frequency decay properly hampers
        any attempt to interpret the background. But that is not the topic
        of the paper anyway.

        > Martin:
        > Well, in the example above there is much more between the peaks than
        > at the peaks. This is important to note, because it shows the big difference
        > between "clean" theoretically derived textbook spectra and real speech
        > spectra. The value of the study is not to have replicated simple general
        > wisdom on the general harmonicity of speech. It's value is to have shown
        > what the harmonicity looks like in REAL data.

        And this has nothing to do with harmonicity/inharmonicity anyway,
        and inharmonicity has nothing to do with the topic of the paper.

        > So Martin, when you write:

        >> François, none of your suggestions would be of help in finding
        >> answers to
        >> the questions of the research project. It seems to me you misunderstood
        >> what the authors tried to investigate.

        > Francois:
        > you are absolutely correct all the way:

        > - none of my suggestion would produce valuable results
        > - I do not understand what they try to investigate
        > - I see, but hardly, what are the questions of the research project

        > Martin
        > You could have read that what you did not understand or see in the first
        > section of the paper, which is called "introduction".

        I was only half kidding. The paper goes from a rather broad
        hypothesis to a no less broad conclusion through the very narrow
        bottleneck of a debatable (to say the least) process.

        > Martin:
        > If the idea is worth discussing, and you even like it, then the same
        > also apply to the details of the results. That is, which ratios
        > stick out of the noise and which don't.

        I explained clearly and quantitatively enough which ratios stick
        out, and I maintain what I said:

        > I just say that their protocol is wrong from the beginning, and
        > produces trivial, predictable and otherwise uninteresting results,
        > that's all.

        > Martin:
        > Had the authors asked you, before starting this study, you had grossly
        > mispredicted the amount of noise between the peaks. And you also would
        > not have been able to predict the limit beyond which simple ratios
        > disappear in the noise.

        Well, in experimental science, who cares about "predicting" the
        noise? Understanding the source of the noise in order to reduce it
        in the data gathering process is the only useful issue.

        > Martin:
        > You might have predicted sex differences, but not their exact
        > values.

        I have been able to "predict" :-) peak locations and rough relative
        amplitudes from the harmonic number distribution alone, explain the
        sex difference, and describe roughly the different contributions to
        background noise and spectral decay.

        I think that it is not bad for a dilettante.

        I must go back to the work I am paid for

        yours truly

        François Laferrière