
regular expression replacements

  • Jason Grossman
    Message 1 of 1 , Feb 2, 2005
      A while ago I was planning to make a binding for the PCRS regular
      expression library (see below), but I haven't gotten around to working
      out how to do bindings. Instead, I've written the following snippet of
      code:

      // Regular expression substitutions, with wild cards.

      // usage example:
      //   "wombats are cuddly" rSubstitute ("([A-Za-z][ao][A-Za-z])([A-Za-z][ao][A-Za-z])", "\\2-\\1")
      //   ==> "bat-woms are cuddly"

      regexCache := Map clone
      Buffer rSubstitute := method (findRegex, substituteRegex,
          // cache compiled regular expressions, for speed
          r := regexCache atIfAbsentPut (findRegex, Regex clone setPattern (findRegex))
          r setString (self asString)
          regexMap := Map clone
          // a safe upper bound on the number of substitution terms, each of
          // which takes 3 chars to specify ("\\1" etc.); the parentheses make
          // roundUp apply to the whole sum rather than just to the 1
          maxSubstitutionTerms := (substituteRegex length / 3 + 1) roundUp
          r allMatches foreach (i, v,
              // a match without groups is not a list, so wrap it in one
              if (v type != "List") then (v := list (v))
              // map "\\n" to the text of group n (empty if there is no such group)
              for (n, maxSubstitutionTerms, 0,
                  groupN := v at (n)
                  regexMap atPut ("\\" .. n asString, groupN or "")
              )
              // v at (0) is the whole match; replace it with the expanded substitute
              self := self replace (v at (0), substituteRegex replaceMap (regexMap))
          )
      )
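
For comparison, here is a minimal sketch of the same behaviour using Python's standard re module. It is only an illustration of the backreference substitution the Io snippet is aiming for, not a translation of it, and the name r_substitute is hypothetical.

```python
import re

def r_substitute(text, find_regex, substitute):
    """Replace every match of find_regex in text.

    Backreferences such as \\1 and \\2 in substitute are expanded to the
    corresponding captured groups (hypothetical helper, for illustration).
    """
    pattern = re.compile(find_regex)  # the re module also caches compiled patterns
    return pattern.sub(substitute, text)

result = r_substitute(
    "wombats are cuddly",
    r"([A-Za-z][ao][A-Za-z])([A-Za-z][ao][A-Za-z])",
    r"\2-\1",
)
print(result)  # bat-woms are cuddly
```

Here "wom" and "bat" are the two captured groups, so "\2-\1" swaps them around a hyphen.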

      Worth putting in the distribution?


      From: Jason Grossman <Jason.Grossman@...>
      Date: 16 September 2004 8:19:15 EDT
      To: iolanguage@yahoogroups.com
      Subject: [Io] yet more on regexp libraries
      Reply-To: iolanguage@yahoogroups.com

      Aha. This looks about perfect: http://www.oesterhelt.org/pcrs . It's
      modestly numbered version 0.03, but the web site says it's stable ...
      and since it's based on PCRE, that claim doesn't seem too unreasonable.

      Another thought I had along the way was that it might not be too
      ridiculous to bind in the whole of sed! There's a stable version of
      sed in less than 100kb at http://www.catb.org/~esr/sed . But PCRS
      would probably make for a much neater API.

      This might really help to capture part of the perl market - especially
      people who are currently tossing up between perl, ruby and python.

      From the PCRS docs:

      PCRS is a small library, written as a supplement to the PCRE library,
      that implements regex based substitution with the syntax and semantics
      of Perl's s/// operator. ...


      The function pcrs_compile() is called to compile a
      pcrs_job from a pattern, substitute and options string.
      The resulting pcrs_job structure is dynamically allocated
      and it is the caller's responsibility to call
      pcrs_free_job() when it's no longer needed.

      pcrs_compile_command() is a convenience wrapper function
      that parses a Perl command of the form
      s/pattern/substitute/[options] into its components and then
      calls pcrs_compile(). As in Perl, you are not bound to the '/'
      character: Whatever follows the 's' will be used as the
      delimiter. Patterns or substitutes that contain the delimiter
      need to quote it: s/th\/is/th\/at/ will replace th/is
      by th/at and can be written more simply as s|th/is|th/at|.
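
The delimiter rule described above can be sketched in Python as follows. This is not the PCRS implementation: parse_command and apply_command are hypothetical names, the handling of the "g" and "i" options is an assumption based on Perl's usual flags, and backreferences in the substitute are left in Python's \1 syntax.

```python
import re

def parse_command(command):
    """Split 's<d>pattern<d>substitute<d>options' into its three parts.

    The character after 's' is the delimiter. A backslash-quoted delimiter
    inside a part is unquoted; other backslash escapes are kept verbatim.
    """
    if len(command) < 2 or command[0] != "s":
        raise ValueError("command must start with 's' followed by a delimiter")
    delim = command[1]
    parts, current = [], []
    i = 2
    while i < len(command):
        ch = command[i]
        if ch == "\\" and i + 1 < len(command):
            nxt = command[i + 1]
            # Drop the backslash only when it quotes the delimiter.
            current.append(nxt if nxt == delim else ch + nxt)
            i += 2
        elif ch == delim and len(parts) < 2:
            # An unquoted delimiter ends the pattern or the substitute.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))  # whatever is left is the options string
    if len(parts) != 3:
        raise ValueError("expected s/pattern/substitute/[options]")
    return tuple(parts)

def apply_command(command, text):
    """Apply a parsed s/// command to text (assumed options: 'g' and 'i')."""
    pattern, substitute, options = parse_command(command)
    flags = re.IGNORECASE if "i" in options else 0
    count = 0 if "g" in options else 1  # for re.sub, count=0 means replace all
    return re.sub(pattern, substitute, text, count=count, flags=flags)

print(parse_command(r"s/th\/is/th\/at/"))  # ('th/is', 'th/at', '')
print(parse_command("s|th/is|th/at|"))     # ('th/is', 'th/at', '')
```

Both spellings of the command parse to the same pattern and substitute, which is the point of allowing a delimiter other than '/'.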

      pattern, substitute, options and command must be
      zero-terminated C strings. substitute and options may be NULL,
      in which case they are treated like the empty string.

      Return value and diagnostics
      On success, both functions return a pointer to the
      compiled job. On failure, NULL is returned. In that case,
      the pcrs error code is written to *err.