Remove content, leave markup

  • twinlor
    Message 1 of 3 , Mar 3, 2010
      Hi all,

      I want to remove all content from an .htm file, leaving only the HTML markup and keeping hard returns 'as is'. Basically the mirror image of Modify>Strip HTML Tags>Remove All Tags.

      I've started with positive lookbehind and lookahead assertions:
      ^!Find "(?<=\>)[simple char class]+(?=\<)" TRS
      This works fine for <tag>content</tag> when everything is on one line.

      Once I start trying to incorporate \R I run into trouble.
      For instance..
      <tag>
      content</tag>
      ..or..
      <tag>content
      </tag>
      ...or..
      <tag>
      content
      </tag>

      I want the hard returns to remain where they are.

      Any help?

      Thanks a lot.

      - David T
    • diodeom
      Message 2 of 3 , Mar 3, 2010
        This could work:

        ^!Jump 1
        ^!Find "(?s)>\K[^<]++(?=<)" RS
        ^!IfError End
        ^!Replace ".++" >> "" HARS
        ^!Goto Skip_-3

        And should some lines contain no tags at all, just like in the example, it could be followed by this to collapse the blank lines left behind:
        ^!Replace "\R{2,}" >> "\r\n" WARS
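
For anyone wanting to try the idea outside NoteTab, here is a rough Python analogue of the clip above. It is only a sketch, not the clip itself: the function names are made up for illustration, Python's `re` module has no `\K` or `\R`, so a lookbehind and an explicit newline alternation stand in for them, and CR and LF are both assumed to count as hard returns.

```python
import re

def strip_content_between_tags(text):
    """Blank out text between '>' and '<' while keeping line breaks."""
    def keep_newlines(match):
        # Like the inner replace on the selection: drop everything
        # except the CR/LF characters inside the matched span.
        return re.sub(r"[^\r\n]+", "", match.group(0))

    # (?<=>) stands in for PCRE's >\K; [^<]+ also spans line breaks.
    return re.sub(r"(?<=>)[^<]+(?=<)", keep_newlines, text)

def collapse_blank_lines(text):
    """Replace runs of two or more line breaks with a single CRLF."""
    return re.sub(r"(?:\r\n|\r|\n){2,}", "\r\n", text)

if __name__ == "__main__":
    sample = "<tag>\r\ncontent\r\n</tag>\r\n"
    stripped = strip_content_between_tags(sample)
    print(repr(stripped))
    print(repr(collapse_blank_lines(stripped)))
```

The first function leaves an empty line wherever a line held only content; the second is the optional clean-up step, which does remove those hard returns.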
      • diodeom
        Message 3 of 3 , Mar 3, 2010
          I wrote:
          >
          > This could work:
          >
          > ^!Jump 1
          [snip]

          And so should this:

          ^!Replace "(>|\n)\K[^<\r]++" >> "" WARS
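
The single-pass version can be approximated in Python as well. Again this is a hedged sketch with a hypothetical function name: a one-character lookbehind replaces `(>|\n)\K`, a plain greedy `+` replaces the possessive `++`, and the input is assumed to use CRLF line endings (with bare LF endings the class `[^<\r]` would swallow the line breaks too).

```python
import re

def strip_content_single_pass(text):
    """Delete runs of content that start right after '>' or a line feed.

    Each run stops at the next '<' or carriage return, so CRLF line
    breaks and the tags themselves are left where they are.
    """
    return re.sub(r"(?<=[>\n])[^<\r]+", "", text)

if __name__ == "__main__":
    sample = "<tag>content\r\n</tag>\r\n"
    print(repr(strip_content_single_pass(sample)))
```

Two limits of this sketch: content on the very first line before any tag is not touched, and a tag whose attributes wrap onto a new line would lose the wrapped part.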