
Re: [Clip] Re: Replacing comma with a tab

  • Don - HtmlFixIt.com
    Message 1 of 34 , Apr 1 7:41 AM
      m_frascinella wrote:
      > Hi,
      >
      > I have been reading this string of notes and trying to understand how
      > to use Regex but the method by which you construct the pattern still
      > eludes me. For example, Flo gave these examples for replacing the
      > first and second comma on a line:
      >
      >> ^!Replace "^(.+?)," >> "$1\t" AWRS
      >> ^!Replace "^(.+\t.+?)," >> "$1\t" AWRS
      >
      > I see these statements as great models of syntax for replacing the
      > first and second "anything" on a line but have trouble understanding
      > how the expression works and why it needs to contain what it contains
      > and why the sequence has to be in the order presented.
      >
      > For the first example, here is my limited understanding of the code:
      >
      > ^ match at the beginning of the line.
      > Does this have to be there to force the match to start at the
      > beginning of the current line? If you put the "^" inside the
      > parentheses, it works the same as if it was outside the parentheses.
      No, you don't want it inside the parentheses, because whatever is in
      the parentheses becomes the $1 used in the replacement. The ^ anchors
      the entire pattern that follows it, not just the parenthesized part.
      The W option says to search the whole document and the A option says
      to replace all matches, so the clip works on EVERY line in the
      document regardless of where the cursor is when it starts. So no, it
      is not limited to the current line.
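
To make the "every line" behaviour concrete, here is a minimal sketch in Python rather than NoteTab clip code. It assumes Python's `re` module behaves like NoteTab's regex engine for this pattern, uses `re.MULTILINE` as a rough stand-in for matching `^` at the start of each line, and writes the group reference as `\1` where the clip uses `$1`. The function name and sample text are made up for illustration.

```python
import re

# Stand-in for: ^!Replace "^(.+?)," >> "$1\t" AWRS
# re.MULTILINE lets ^ match at the start of every line, not only the first.
FIRST_COMMA = re.compile(r"^(.+?),", re.MULTILINE)

def replace_first_comma(text: str) -> str:
    """Replace the first comma on every line of text with a tab."""
    # \1 re-inserts whatever the parentheses captured; \t adds the tab.
    return FIRST_COMMA.sub(r"\1\t", text)

sample = "a,b,c\nd,e,f"
print(repr(replace_first_comma(sample)))
```

Only the first comma on each line changes, because the pattern can only match where a line begins.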

      >
      > () groups the syntax together.
      > Does this mean it has to match everything inside the parentheses?
      It does group the syntax, and it makes the group a "referable" item
      that can be reinserted on the replacement side with $1. If there is
      more than one set of parentheses, the sets are numbered ($1, $2, ...).
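
As a small illustration of numbered groups, the sketch below uses Python's `re` module as a stand-in for NoteTab (an assumption; Python writes `\1` and `\2` where the clip would use `$1` and `$2`). The two-group pattern is not from the thread; it is a hypothetical variation that handles the first two commas in one pass.

```python
import re

# Hypothetical two-group variant: group 1 is the text before the first
# comma, group 2 is the text between the first and second comma.
TWO_COMMAS = re.compile(r"^(.+?),(.+?),")

def replace_first_two_commas(line: str) -> str:
    """Replace the first two commas on a single line with tabs."""
    return TWO_COMMAS.sub(r"\1\t\2\t", line)

match = TWO_COMMAS.match("a,b,c,d")
print(match.group(1), match.group(2))       # the two captured pieces
print(repr(replace_first_two_commas("a,b,c,d")))
```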
      >
      > . matches any single character.
      > Why is this needed since the comma is specified at the end?
      Because it says to match everything from the start of the line up to
      the first comma.

      So if you have:
      xxxxxxxxx,
      the .+ is what matches the x's.
      >
      > + matches one or more matches of the expression (everything inside
      > the parentheses or more?). This also is mentioned in the help as a
      > quantifier {1,}
      > But why is this necessary after the period?
      Because we don't know how many x's there are. When I say x's, I mean
      any characters at all. So it matches a variable length of whatever
      comes before the first comma. That is the hard part to grasp (at
      least it was for me), but it is where the power lies once it finally
      starts to make sense.
      >
      > ? matches zero or more matches of the regular expression but it also
      > seem so have something to do with limiting the "+" character. Why is
      > this used here?
      In that context the ? makes the .+ non-greedy, meaning it takes the
      fewest characters possible before finding a comma.
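
The difference is easiest to see side by side. This is a sketch in Python, assuming its `re` module treats `+` and `+?` the same way NoteTab's engine does; the helper names are invented for the example, and `\1` plays the role of `$1`.

```python
import re

def replace_greedy(line: str) -> str:
    """.+ grabs as much as it can, so the LAST comma is the one replaced."""
    return re.sub(r"^(.+),", r"\1\t", line)

def replace_non_greedy(line: str) -> str:
    """.+? grabs as little as it can, so the FIRST comma is replaced."""
    return re.sub(r"^(.+?),", r"\1\t", line)

line = "a,b,c"
print(repr(replace_greedy(line)))       # 'a,b\tc'
print(repr(replace_non_greedy(line)))   # 'a\tb,c'
```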

      >
      > , the comma to be found on the line (the easiest part of the
      > expression).
      Indeed it is. Because you are only replacing the text up to and
      including the comma with $1 followed by a tab, the remainder of the
      line stays the same.
      >
      > Yours,
      >
      > Michael F.
    • notetab_is_great
      Message 34 of 34 , Apr 5 8:54 AM
        --- In ntb-clips@yahoogroups.com, "janderri" <jan_derrick@...> wrote:
        >
        > --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@> wrote:
        > >
        > > Flo wrote:
        > > > "^(.+\t.+?),"
        > >
        > > A question, as I have never yet used such a complicated term
        > > inside one pair of parentheses:
        > > What does the non greedy specifier apply to there exactly and would it
        > > make any sense to write (.+?\t.+?) instead? We were never told if
        > > there could be preexisting tabs anywhere in the source lines.
        > >
        > > Thanks
        > > Axel
        > >
        >
        >
        > I think we can conclude this with :
        >
        > Regular expressions in NoteTab are a mess + we can't even trust them
        > to work correctly too.


        That's a strange conclusion to reach as a result of this thread...

        It is true that the regular expression given here was written making
        an assumption that the only tab at the beginning of the data is the
        one that was just inserted by replacing the previous ",". However,
        that doesn't mean that regular expressions are a mess, or don't
        work... it just means there was a poor assumption made in constructing
        this one.
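
To show what that assumption means in practice, here is a rough sketch in Python (standing in for NoteTab's engine, which is an assumption; `\1` replaces `$1`). It runs Flo's second pattern and the variant Axel asked about on a line whose first comma has already become a tab, once without and once with a preexisting tab later in the data. The sample lines and function names are invented.

```python
import re

def second_comma_flo(line: str) -> str:
    """Flo's pattern: the greedy .+ runs up to the LAST usable tab."""
    return re.sub(r"^(.+\t.+?),", r"\1\t", line)

def second_comma_axel(line: str) -> str:
    """Axel's variant: the non-greedy .+? stops at the FIRST tab."""
    return re.sub(r"^(.+?\t.+?),", r"\1\t", line)

clean = "a\tb,c,d"          # only tab is the one inserted by step one
preexisting = "a\tb,c\td,e"  # a second tab was already in the data

print(repr(second_comma_flo(clean)))         # 'a\tb\tc,d'
print(repr(second_comma_flo(preexisting)))   # 'a\tb,c\td\te' (skips "b,")
print(repr(second_comma_axel(preexisting)))  # 'a\tb\tc\td,e'
```

When the assumption holds, both patterns give the same result; they only diverge once a later tab gives the greedy `.+` somewhere further along to stop.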

        Thing is, regular expressions are used to tackle problems that are
        hard to solve in other ways, and that doesn't mean those problems are
        easily solved using regular expressions, either... but often they are
        solvable.

        While there are many smart people contributing to this forum, many of
        the problems they are attempting to help with are poorly specified.

        Even this regular expression is correct, if the assumption made
        actually holds; it is unfortunate that assumptions like that sneak in
        without being stated... that's a problem with assumptions in general,
        though, not just ones regarding regular expressions.