Could swish++ replace glimpse in LXR?

  • Neil Salter
    Message 1 of 5 , Dec 16, 2001
      Hello Paul & swish++ users,

      I've recently started to experiment with swish++, mainly because I'd like to
      see a replacement for Glimpse in LXR. I'm hoping to find some more
      information as to the suitability of swish++ for this task.

      Background... the LXR project (NB I'm not a developer on this, just a user)
      is a cross-referencing tool which makes available a web-based view of a
      source tree, with various source code symbols hyperlinked. It was originally
      written to cross-reference the Linux source, but can be used on any project,
      and now supports several languages.

      SourceForge page: http://www.sourceforge.net/projects/lxr

      Example in use: http://lxr.mozilla.org/
      (NB this is now an older version of LXR)

      There seems to be general interest in replacing Glimpse, possibly with
      swish++:

      http://sourceforge.net/tracker/index.php?func=detail&aid=481573&group_id=27350&atid=390117

      I note that "Using SWISH++ as-is to index documents in a language other than
      English is naive and will yield poor results." I also note that it's possible
      to apply filters to files, as well as write add-on modules to index other
      types of file. And I've read the FAQ. "Can I use SWISH++ to index source
      code? This has the same answer as FAQ #4 except replace ``documents in a
      language other than English'' with ``source code.''"

      Despite these comments, given that LXR cross-references source code in
      various programming languages (including C++, Perl, Java), could swish++
      be a viable candidate to replace glimpse as LXR's __freetext
      search__ facility? (LXR has its own language-symbol lookup.)

      If so, can anyone provide any advice on the best approach for this - it would
      be a pain to have to write an indexer for every language supported by LXR,
      but maybe this is the best way? Is this likely to be a difficult or lengthy
      task?

      Also, the Glimpse part of LXR outputs the file, line number, and some text
      from the line. Could this be done with swish++?

      If swish++ does not look like the appropriate tool for this purpose, does
      anyone have any ideas of what might be tried instead?

      Many thanks in advance for any replies,

      Neil.
    • Paul J. Lucas
      Message 2 of 5 , Dec 17, 2001
        On Sun, 16 Dec 2001, Neil Salter wrote:

        > ... the LXR project ... is a cross-referencing tool which makes available a
        > web-based view of a source tree, with various source code symbols
        > hyperlinked.

        IMHO, they're doing a poor job of parsing.

        > There seems to be general interest in replacing Glimpse, possibly with
        > swish++:

        Using *any* text indexer is a mistake. The only proper way to
        index source code is to use a compiler front-end.

        > Despite these comments, given that LXR cross-references source code in
        > various programming languages (including C++, Perl, Java), could swish++
        > be a viable candidate to replace glimpse as LXR's __freetext
        > search__ facility?

        If you're merely replacing one indexer with another, then of
        course. But, again, using any free-text indexer is a mistake.

        > If so, can anyone provide any advice on the best approach for this - it would
        > be a pain to have to write an indexer for every language supported by LXR,

        If you used a compiler front-end like you should be doing, the
        work is already done. It's called GCC.

        > Also, the Glimpse part of LXR outputs the file, line number, and some text
        > from the line. Could this be done with swish++?

        Did you see "line number" mentioned anywhere in the
        documentation?

        - Paul
      • Ian Soboroff
        Message 3 of 5 , Dec 18, 2001
          "Paul J. Lucas" <pauljlucas@...> writes:

          > > If so, can anyone provide any advice on the best approach for this
          > > - it would be a pain to have to write an indexer for every
          > > language supported by LXR,
          >
          > If you used a compiler front-end like you should be doing, the
          > work is already done. It's called GCC.

          you might also look into ctags and etags, which provide hyperlinking
          in source trees. i agree with Paul that you _don't_ want to re-solve
          the code parsing problems, because it's either boring (you just
          download the LALR grammar) or fraught with peril (ever written a
          compiler?) (peril == fun!). ctags/etags has solved it quite nicely,
          and you could probably just use their index to build your view.

          ian
        • Neil Salter
          Message 4 of 5 , Dec 18, 2001
            On Tuesday 18 December 2001 14:24 pm, Ian Soboroff wrote:
            > "Paul J. Lucas" <pauljlucas@...> writes:
            > > > If so, can anyone provide any advice on the best approach for this
            > > > - it would be a pain to have to write an indexer for every
            > > > language supported by LXR,
            > >
            > > If you used a compiler front-end like you should be doing, the
            > > work is already done. It's called GCC.
            >
            > you might also look into ctags and etags, which provide hyperlinking
            > in source trees. i agree with Paul that you _don't_ want to re-solve
            > the code parsing problems, because it's either boring (you just
            > download the LALR grammar) or fraught with peril (ever written a
            > compiler?) (peril == fun!). ctags/etags has solved it quite nicely,
            > and you could probably just use their index to build your view.

            Thanks for the replies. LXR does in fact use e/ctags to extract symbols from
            the source and index them.

            I'd rather not go to the effort of trying to make gcc produce the info I
            need, which I think would require the traversal of a learning curve for which
            I lack the time (and quite possibly the ability), and would anyway support
            only those languages supported by gcc.

            I'm interested in using swish++ to replace the freetext search afforded by
            glimpse, which picks out things in comments, strings, and so forth. I guess I
            need to experiment with swish++ to see how effective this is when C++ (or
            whatever) is used as input.

            Thanks again,

            Neil.
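
If the free-text search only needs to cover comments and strings, one option is to strip everything else from a source file before it reaches the indexer. The following is a rough sketch of such a pre-processing step for C-like languages; `extract_comments_and_strings` is a hypothetical name, nothing here is a swish++ interface, and raw string literals, trigraphs, and preprocessor subtleties are deliberately ignored.

```cpp
#include <cstddef>
#include <string>

// Keep only the text found in comments and string literals of C-like source.
// Newlines are preserved so line numbers in the output match the input.
std::string extract_comments_and_strings(const std::string& src) {
    enum State { Code, LineComment, BlockComment, String, Char };
    State st = Code;
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (st) {
        case Code:
            if (c == '/' && next == '/') { st = LineComment; ++i; }
            else if (c == '/' && next == '*') { st = BlockComment; ++i; }
            else if (c == '"') { st = String; }
            else if (c == '\'') { st = Char; }
            else if (c == '\n') { out += '\n'; }  // keep line structure
            break;
        case LineComment:
            if (c == '\n') { st = Code; out += '\n'; }
            else { out += c; }
            break;
        case BlockComment:
            if (c == '*' && next == '/') { st = Code; ++i; out += ' '; }
            else { out += c; }
            break;
        case String:
            if (c == '\\' && next != '\0') { out += ' '; ++i; }  // drop escapes
            else if (c == '"') { st = Code; out += ' '; }
            else { out += c; }
            break;
        case Char:
            // Character literals are skipped entirely.
            if (c == '\\' && next != '\0') { ++i; }
            else if (c == '\'') { st = Code; }
            break;
        }
    }
    return out;
}
```

Feeding the result to the indexer instead of the raw file keeps keywords and identifiers out of the index, at the price of making identifiers unsearchable through free text; whether that trade is acceptable depends on how much LXR's own symbol lookup already covers.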
          • Paul J. Lucas
            Message 5 of 5 , Dec 18, 2001
              On Tue, 18 Dec 2001, Neil Salter wrote:

              > I'm interested in using swish++ to replace the freetext search afforded by
              > glimpse, which picks out things in comments, strings, and so forth. I guess I
              > need to experiment with swish++ to see how effective this is when C++ (or
              > whatever) is used as input.

              You *will* have to modify the source code to change the
              is_ok_word() rules.

              You should have all language X's keywords as stop-words.

              - Paul
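
To make the two suggestions above concrete, here is a small sketch of what a source-code-friendly word test with keyword stop-words might look like. It is not the actual swish++ `is_ok_word()`, whose signature and rules are not shown in this thread; `is_indexable_word`, `kStopWords`, and the minimum length of 3 are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stop-word list holding a few C++ keywords. A real list would
// contain every keyword of each language being indexed.
static const std::unordered_set<std::string> kStopWords = {
    "class", "const", "else", "for", "if", "int", "return", "void", "while"
};

// Hypothetical acceptance rule for source code: letters, digits, and
// underscores only, at least one letter, a minimum length, and not a keyword.
bool is_indexable_word(const std::string& word, std::size_t min_len = 3) {
    if (word.size() < min_len) {
        return false;
    }
    bool has_alpha = false;
    for (unsigned char c : word) {
        if (std::isalpha(c)) {
            has_alpha = true;
        } else if (!std::isdigit(c) && c != '_') {
            return false;  // punctuation or other characters: reject
        }
    }
    if (!has_alpha) {
        return false;  // pure numbers are not worth indexing
    }
    // Compare case-insensitively, assuming the search itself folds case.
    std::string lower(word);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return kStopWords.count(lower) == 0;
}
```

With these rules, identifiers such as `parse_tree` or `x86_64` are accepted, while `return`, `12345`, and `foo-bar` are rejected. Allowing underscores and digits is the main departure from rules tuned for English prose.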