
Introducing the "MetaChecker"

  • Flo
    Message 1 of 1, Mar 2, 2008
      Hi all,

      Wouldn't it be helpful to have a clip that checks all metacharacters:
      sequences, POSIX character classes, and even some Unicode property
      codes (as far as they prove useful in NT5)?

      With the following clip you can enter a metacharacter and get a list
      of characters from decimal 32 to 255 that are matched or not matched
      by that metacharacter.
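
The same idea can be sketched outside NoteTab. The Python below is only an illustration, not the clip itself: `split_by_pattern` and `format_rows` are hypothetical helper names, and Python's `re` module stands in for NoteTab's regex engine, so results may differ from what NoteTab reports. In particular, `re` supports neither POSIX classes such as `[[:cntrl:]]` nor `\p{...}` property codes. The sketch also assumes that a token should match the whole one-character string.

```python
import re

def split_by_pattern(pattern, first=32, last=255):
    """Return (matched, unmatched) lists of (code, char) for codes first..last."""
    regex = re.compile(pattern)  # case-sensitive by default, like (?-i) in the clip
    matched, unmatched = [], []
    for code in range(first, last + 1):
        char = chr(code)
        # Assumption: the token must match the entire one-character string.
        target = matched if regex.fullmatch(char) else unmatched
        target.append((code, char))
    return matched, unmatched

def format_rows(rows):
    """Format rows as '[048] 0', zero-padding each code to three digits."""
    return "\n".join(f"[{code:03d}] {char}" for code, char in rows)

matched, unmatched = split_by_pattern(r"\d")
print("Characters matched with \\d\n")
print(format_rows(matched))
```

Collecting both lists in one pass mirrors the clip's two result variables, and the three-digit padding keeps the codes aligned in the output.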

      First, a wizard prompts you to choose a metacharacter or to enter any
      token (here you can also test normal character classes such as [...]).
      Choose "Overview" for more detailed information on the
      metacharacters that you can select.

      For more information on Unicode Property Codes, see message
      #16561 "Using \p{Prop} Sequences".

      While testing the clip, some results surprised me. For example,
      POSIX [[:cntrl:]] matches far more than I expected.
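
As a point of comparison for the `[[:cntrl:]]` result, here is a small Python sketch of two common definitions of "control character". It does not reproduce NoteTab's behavior: which codes NoteTab reports depends on its regex engine and character tables, which are not examined here. The function names are hypothetical.

```python
import unicodedata

def ascii_cntrl(code):
    """ASCII ("C" locale) definition: codes 0-31 and 127."""
    return code < 32 or code == 127

def unicode_cntrl(code):
    """Unicode definition: general category Cc."""
    return unicodedata.category(chr(code)) == "Cc"

codes = range(32, 256)
ascii_hits = [c for c in codes if ascii_cntrl(c)]
unicode_hits = [c for c in codes if unicode_cntrl(c)]

print(ascii_hits)                                            # [127]
print(unicode_hits[0], unicode_hits[-1], len(unicode_hits))  # 127 159 33
```

An ASCII-only reading would match just code 127 in the tested range, while a Unicode reading adds codes 128 to 159.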

      I hope it works for you. Please report any errors or anything that is
      missing. (Watch for wrong line breaks in the clip!)


      ; Show matching characters from dec 032 to 255
      ; Flo 3/1/2008
      ^!Set %Dec%=32
      ^!Goto ^?[(H=5)Choose type:==Sequences|POSIX|Unicode Props|Enter

      :Enter any
      ^!Set %MetaChar%=^?{Enter any metacharacter:=}
      ^!Goto Start

      :Sequences
      ^!Set %MetaChar%=^?{(H=10)Choose
      ^!Goto Start

      :POSIX
      ^!Set %MetaChar%=^?{(H=14)Choose
      ^!Set %MetaChar%=[[:^%MetaChar%:]]
      ^!Goto Start

      :Unicode Props
      ^!Set %MetaChar%=^?{(H=14)Choose metacharacter:==\pL|\pN|\p{Nd}|\p{No}
      ^!Goto Start

      :Start
      ^!Set %Char%=^$DecToChar(^%Dec%)$
      ^!IfMatch "(?-i)^%MetaChar%" "^%Char%" MatchTrue Else MatchFalse

      :MatchTrue
      ^!If ^%Dec%<100 ^!Set %Dec%=0^%Dec%
      ^!Append %MatchesTrue%=[^%Dec%]^%Space%^%Char%^%NL%
      ^!Goto Loop

      :MatchFalse
      ^!If ^%Dec%<100 ^!Set %Dec%=0^%Dec%
      ^!Append %MatchesFalse%=[^%Dec%]^%Space%^%Char%^%NL%

      :Loop
      ^!Inc %Dec%
      ^!If ^%Dec%=256 Output
      ^!Goto Start

      :Output
      ^!InsertText Characters matched with ^%MetaChar%^%NL%^%NL%^%
      ^!InsertText Characters not matched with ^%MetaChar%^%NL%^%NL%^%
      ^!Goto End

      ^!Toolbar New Document

      Selected metacharacters matching dec 32 to 255 in NoteTab 5.5


      \d  any decimal digit
      \D  any character that is not a decimal digit
      \h  any horizontal whitespace
      \H  any character that is not a horizontal whitespace
      \s  any whitespace character
      \S  any character that is not a whitespace character
      \v  any vertical whitespace
      \V  any character that is not a vertical whitespace
      \w  any "word" character
      \W  any "non-word" character

      POSIX Character Classes

      alnum   letters and digits
      alpha   letters
      ascii   character codes 0 - 127
      blank   space or tab only
      cntrl   control characters
      digit   decimal digits
      graph   printing characters, excluding space
      lower   lower case letters
      print   printing characters, including space
      punct   printing characters, excluding letters and digits
      space   white space
      upper   upper case letters
      word    "word" characters
      xdigit  hexadecimal digits

      (Selected) Unicode Property Codes

      \pL     any kind of letter from any language
      \pN     any kind of numeric character in any script
      \p{Nd}  a digit zero through nine in any script except ideographic
      \p{No}  a superscript or subscript digit, or a number that is not a digit 0..9 (excluding numbers from ideographic scripts)
      \pP     any kind of punctuation character
      \p{Pe}  any kind of closing bracket
      \p{Po}  any kind of punctuation character that is not a dash, bracket, or connector
      \p{Ps}  any kind of opening bracket
      \pS     math symbols, currency signs, dingbats, box-drawing characters, etc.
      \p{Sc}  any currency sign
      \p{Sk}  a combining character (mark) as a full character on its own
      \p{Sm}  any mathematical symbol
      \p{So}  various symbols that are not math symbols, currency signs, or combining characters
      \pZ     any kind of whitespace or invisible separator (dec 32 + 160)

      ^!Jump 1
      ; end of clip
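
The overview's note that \pZ covers dec 32 and 160 can be cross-checked against the Unicode character database. The Python sketch below uses `unicodedata.category` as a stand-in for `\p{...}`, since Python's built-in `re` has no `\p` support. `chars_with_category` is a hypothetical helper, and the sketch assumes that codes 32 to 255 are read as Unicode code points, which may not be how NoteTab interprets them.

```python
import unicodedata

def chars_with_category(prefix, first=32, last=255):
    """Codes in first..last whose Unicode general category starts with prefix."""
    return [c for c in range(first, last + 1)
            if unicodedata.category(chr(c)).startswith(prefix)]

print(chars_with_category("Z"))   # separators, like \pZ: [32, 160]
print(chars_with_category("Nd"))  # decimal digits, like \p{Nd}
print(chars_with_category("Sc"))  # currency signs, like \p{Sc}
```

A one-letter prefix such as "Z" selects the whole group, while a two-letter code such as "Nd" selects a single subcategory, which parallels the difference between \pN and \p{Nd} in the overview.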