Re: [LingPipe] Beginner question about using Classifiers

  • Bob Carpenter
    Message 1 of 4 , Apr 13 10:46 AM
      You might want to check out the word sense
      tutorial. It contains examples using a broad
      range of LingPipe classifiers. You might also
      look at the logistic regression classifier, which
      is harder and slower to train, but much more
      flexible in terms of inputs and much more accurate
      for most problems.

      The way to swap things out is to assign to
      a variable of the superclass type. You can assign
      an instance of NaiveBayesClassifier to a variable
      of type DynamicLMClassifier.
      I'm not sure where you're having problems. If
      you could send me the compiler error, I might
      be able to help.

      You should set the generic type appropriately,
      too. You'll see that NaiveBayesClassifier
      actually extends DynamicLMClassifier<TokenizedLM>.

      createNGramBoundary(...) creates a
      DynamicLMClassifier<NGramBoundaryLM>

      and

      createNGramProcess(...) creates a
      DynamicLMClassifier<NGramProcessLM>.
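
To illustrate the point about generic types, here is a minimal sketch using hypothetical stand-in classes (GenericClassifier, TokenClassifier, TokenModel, and CharModel are invented for this example and are not LingPipe classes). It assumes only that the subclass fixes the type parameter of its superclass, as NaiveBayesClassifier is described as doing above.

```java
// Hypothetical stand-ins; none of these are LingPipe classes.
interface Model {
    String describe();
}

class TokenModel implements Model {
    public String describe() { return "token model"; }
}

class CharModel implements Model {
    public String describe() { return "character model"; }
}

// Plays the role of a classifier that is generic in its language model.
class GenericClassifier<L extends Model> {
    private final L model;

    GenericClassifier(L model) { this.model = model; }

    L model() { return model; }
}

// Plays the role of the subclass: it fixes L to TokenModel.
class TokenClassifier extends GenericClassifier<TokenModel> {
    TokenClassifier() { super(new TokenModel()); }
}

public class GenericTypeDemo {
    public static void main(String[] args) {
        // The declared type parameter must match the one the subclass fixes.
        GenericClassifier<TokenModel> classifier = new TokenClassifier();
        TokenModel model = classifier.model(); // no cast needed
        System.out.println(model.describe());

        // The following line would not compile, because TokenClassifier
        // is not a GenericClassifier<CharModel>:
        // GenericClassifier<CharModel> wrong = new TokenClassifier();
    }
}
```

Declaring the variable with the matching type parameter, rather than a raw type, lets the compiler check which kind of model the classifier wraps.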

      - Bob Carpenter
      Alias-i

      On April 12, 2010, nitishranjan <nitish.ranjan@...> wrote:

      > Hi,
      >
      > I am new to lingpipe and followed the classification tutorial. I
      > notice that in classify tutorial and some other ones lingpipe demo
      > uses DynamicLMClassifier with this statement
      >
      > DynamicLMClassifier classifier =
      > DynamicLMClassifier.createNGramProcess(CATEGORIES,NGRAM_SIZE);
      >
      > I wanted to use some other classifiers like NaiveBayesClassifier (and
      > may be maximum entropy). A simple change to NaiveBayesClassifier from
      > DynamicLMClassifier does not work, even though NaiveBayesClassifier
      > is derived from DynamicLMClassifier. What is the easiest way to get
      > around?
      >
      > Regards
      > Nitish Sinha
    • Nitish Ranjan
      Message 2 of 4 , Apr 13 12:04 PM
        I was doing

        DynamicLMClassifier classifier
        = NaiveBayesClassifier.createNGramProcess(CATEGORIES,NGRAM_SIZE);

        This compiles but gives me back the same classifier as earlier. I will dig
        into the word sense tutorial to find out how to change the classifier at
        work.


        Thanks
        Nitish

        --
        Yesterday is history, tomorrow is a mystery but today is a gift. That is why
        we call it the present.


        Carpe Diem
        Nitish


      • Bob Carpenter
        Message 3 of 4 , Apr 13 2:38 PM
          Aha. You're running into a confusing Java issue.
          Because NaiveBayesClassifier is a subclass of
          DynamicLMClassifier, it inherits the static methods,
          such as createNGramProcess. It's calling exactly
          the same method, because static methods can't be
          overridden.
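
The behavior described above can be reproduced with a small self-contained sketch. The classes BaseClassifier and NaiveBayesLike are hypothetical stand-ins for the two LingPipe classes, not the real API; the only assumption is that the static factory is declared on the base class alone.

```java
// Hypothetical stand-ins for the base classifier and its subclass;
// these are not the LingPipe classes.
class BaseClassifier {
    private final String kind;

    BaseClassifier(String kind) { this.kind = kind; }

    String kind() { return kind; }

    // Static factory declared only on the base class.
    static BaseClassifier createNGramProcess() {
        return new BaseClassifier("character n-gram");
    }
}

class NaiveBayesLike extends BaseClassifier {
    NaiveBayesLike() { super("naive Bayes"); }
}

public class StaticFactoryDemo {
    public static void main(String[] args) {
        // Compiles, but resolves to BaseClassifier.createNGramProcess(),
        // so the result is not a NaiveBayesLike instance.
        BaseClassifier viaSubclassName = NaiveBayesLike.createNGramProcess();

        // The constructor is the only way to get the subclass here.
        BaseClassifier viaConstructor = new NaiveBayesLike();

        System.out.println(viaSubclassName.kind());                     // character n-gram
        System.out.println(viaSubclassName instanceof NaiveBayesLike);  // false
        System.out.println(viaConstructor.kind());                      // naive Bayes
    }
}
```

Writing the call through the subclass name changes nothing about which method runs, which is why the code compiles yet returns the same classifier as before.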

          You need to use the constructor for NaiveBayesClassifier,
          as in:

          String[] categories = ...
          TokenizerFactory tokenizerFactory = ...
          DynamicLMClassifier<TokenizedLM> classifier
          = new NaiveBayesClassifier(categories, tokenizerFactory);

          The generic argument says what the underlying language
          model is for the Dynamic LM classifier. For Naive Bayes,
          it's a tokenized LM.

          You need to choose a tokenizer in order to do Naive
          Bayes. One reasonable choice is:

          TokenizerFactory tokenizerFactory = IndoEuropeanTokenizerFactory.INSTANCE;

          (Confusingly, there's also a second, more conventional
          Naive Bayes implementation in the class
          TradNaiveBayesClassifier. It does *not* inherit from
          DynamicLMClassifier.)

          - Bob Carpenter
          Alias-i
