
Re: Remove content, leave markup

  • diodeom
    Message 1 of 3, Mar 3, 2010
      "twinlor" <tmp5@...> wrote:
      >
      > Hi all,
      >
      > I want to remove all content leaving only html markup in an htm file, leaving hard returns 'as is'. Basically the mirror image of Modify>Strip HTML Tags>Remove All Tags.
      >
      > I've started with a positive lookbehind and lookahead assertion
      > ^!Find "(?<=\>)[simple char class]+(?=\<)" TRS
      > This works fine for <tag>content</tag> when everything is on one line.
      >
      > Once I start trying to incorporate \R I run into trouble.
      > For instance..
      > <tag>
      > content</tag>
      > ..or..
      > <tag>content
      > </tag>
      > ...or..
      > <tag>
      > content
      > </tag>
      >
      > I want the hard returns to remain where they are.
      >
      > Any help?
      >
      > Thanks a lot.
      >
      > - David T
      >

      This could work:

      ^!Jump 1
      ^!Find "(?s)>\K[^<]++(?=<)" RS
      ^!IfError End
      ^!Replace ".++" >> "" HARS
      ^!Goto Skip_-3

      If, as in your example, some lines contain no tags at all, they will be left empty. To collapse each run of consecutive line breaks into a single one, follow the loop with:
      ^!Replace "\R{2,}" >> "\r\n" WARS
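For readers who want to test the logic outside NoteTab, here is a minimal Python sketch of the same two-step idea: find each run of text between ">" and "<", blank everything in it except line breaks, then optionally collapse runs of line breaks. The function names are hypothetical, and the sketch assumes the follow-up pattern is meant to match runs of line breaks. It uses a lookbehind where the clip uses \K.

```python
import re

def blank_text_between_tags(html: str) -> str:
    """Delete text between '>' and '<', keeping any line breaks inside it."""
    def keep_breaks(match: re.Match) -> str:
        # Drop every character of the run that is not a CR or LF.
        return re.sub(r"[^\r\n]+", "", match.group(0))
    # (?<=>) stands in for the clip's ">\K"; (?=<) is the same lookahead.
    return re.sub(r"(?<=>)[^<]+(?=<)", keep_breaks, html)

def collapse_breaks(text: str) -> str:
    """Optional follow-up: turn two or more consecutive line breaks into one CRLF."""
    return re.sub(r"(?:\r\n|\r|\n){2,}", "\r\n", text)
```

Calling blank_text_between_tags alone keeps every hard return in place. collapse_breaks is only needed if the emptied lines should disappear as well.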
    • diodeom
      Message 2 of 3, Mar 3, 2010
        I wrote:
        >
        > This could work:
        >
        > ^!Jump 1
        [snip]

        And so should this:

        ^!Replace "(>|\n)\K[^<\r]++" >> "" WARS
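The single-pass version can be sketched in Python too, as a rough equivalent rather than an exact port. Python's re module has no \K, so the sketch captures the leading ">" or line feed and puts it back. The function name is hypothetical. The character class also excludes "\n" (the clip only excludes "\r") so that LF-only files keep their line breaks.

```python
import re

def strip_content_one_pass(html: str) -> str:
    """After each '>' or line feed, delete text up to the next '<' or line break."""
    # Group 1 is restored in the replacement, playing the role of \K in the clip.
    return re.sub(r"(>|\n)[^<\r\n]+", r"\1", html)
```

Like the clip, this also removes text on lines that contain no tags at all. It leaves any text that precedes the first tag on the first line, because nothing there follows a ">" or a line feed.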