Re: [Clip] Strip XML tags

  • loro
    Message 1 of 4 , Nov 6, 2005
      Martin ONeill wrote:
      >I think an OPML file is in XML. How can I strip XML tags? If I use
      >strip HTML tags then all the data is stripped out as well.

      Is it the Radio UserLand outline stuff you mean? Like this one?
      <http://static.userland.com/gems/radiodiscuss/specification.opml>

      Almost all tags are empty and the real content is in the attribute values,
      so Strip HTML won't work. You could use ^$GetHtmlTagAttr()$ instead.

      As an example, to get the value of "text" and strip the rest of the
      <outline> tags in the document above, you could do something like this:

      ---------------------
      H="OPML"
      ^!Jump text_start
      :loop
      ^!Find <outline IS
      ^!IfError end
      ^!Replace "^$GetHTMLTag$" >> "^$GetHtmlTagAttr("^$GetHTMLTag$";text)$" S
      ^!Goto loop
      ---------------------

      I know there can be several attributes and that there are more things both
      to strip and to preserve. This is just a push (hopefully) in the right
      direction. ;-)
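
      For comparison outside NoteTab, here is a minimal Python sketch of the
      same idea: walk every <outline> element and keep only its "text"
      attribute. The OPML fragment and the function name outline_texts are
      made up for illustration, not taken from the sample file, and this
      assumes the file is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Made-up OPML fragment for illustration; not taken from the sample file.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Example</title></head>
  <body>
    <outline text="First item">
      <outline text="Nested item" type="link" url="http://example.com/"/>
    </outline>
    <outline type="separator"/>
  </body>
</opml>"""


def outline_texts(opml_source):
    """Return the 'text' attribute of every <outline> element, in document order."""
    root = ET.fromstring(opml_source)
    # iter() also visits nested outlines; elements without a 'text' attribute are skipped.
    return [el.get("text") for el in root.iter("outline") if el.get("text") is not None]


print("\n".join(outline_texts(SAMPLE_OPML)))
```

      A real XML parser also copes with several attributes per tag and with
      tags that span more than one line, which simple search-and-replace
      may not.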

      Lotta
    • Martin ONeill
      Message 2 of 4 , Nov 6, 2005
        --- In ntb-clips@yahoogroups.com, loro <loro-spam01-@t...> wrote:
        > Is it the radio UserLand outline stuff you mean? Like this one?
        > <http://static.userland.com/gems/radiodiscuss/specification.opml>

        I'm not sure, as that page would not load for me. The OPML I'm
        referring to is at http://www.opml.org/spec

        I have tried your clip, but on a small file (circa 50k) the clip goes
        into some sort of loop and does not finish.

        I don't know anything about XML and probably need to learn a bit. Is
        there any simple online source that I could refer to? - particularly
        regarding the XML tags.

        Many thanks,
        Martin
      • loro
        Message 3 of 4 , Nov 6, 2005
          Martin ONeill wrote:
          >I'm not sure as that page would not load for me. The opml I'm
          >referring to is on http://www.opml.org/spec

          Yeah, they link to the same sample file I posted a link to. We're talking
          about the same thing. That's a starting point. ;-)

          >I have tried your clip, but on a small circa 50k file, the clip goes
          >into some sort of loop and does not finish.

          I don't know why it loops, but it only looks for the 'text' attributes
          found in that sample file.

          >I don't know anything about XML and probably need to learn a bit. Is
          >there any simple online source that I could refer to?

          The samples at opml.org, I guess. But you don't need to know XML beyond
          understanding the concept of tags. If you have this:
          <tag>Content</tag>
          "Strip HTML" gets rid of '<tag>' and '</tag>' and leaves 'Content'.

          But this OPML format seems to be built mostly on empty elements (they have
          no content), with the bits you want to keep in attributes. Like so:
          <tag attr="stuff we want here" />
          "Strip HTML" kills the whole tag and leaves you with nothing. That's why
          you must find the attribute values and save them somehow.
          ^$GetHtmlTagAttr()$ does that.
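
          To make the difference concrete, here is a small Python sketch (not
          NoteTab code). The helpers strip_tags and attr_value are hypothetical
          stand-ins that only roughly imitate what "Strip HTML" and
          ^$GetHtmlTagAttr()$ do; they use crude regular expressions and assume
          double-quoted attribute values.

```python
import re


def strip_tags(markup):
    """Crude tag stripper: delete everything from '<' to the next '>'."""
    return re.sub(r"<[^>]*>", "", markup)


def attr_value(tag, name):
    """Return the value of a double-quoted attribute in one tag, or None if absent."""
    match = re.search(r"\b" + re.escape(name) + r'\s*=\s*"([^"]*)"', tag)
    return match.group(1) if match else None


element_with_content = "<tag>Content</tag>"
empty_element = '<tag attr="stuff we want here" />'

print(repr(strip_tags(element_with_content)))  # the content survives
print(repr(strip_tags(empty_element)))         # nothing is left
print(repr(attr_value(empty_element, "attr")))  # the attribute value is recovered
```

          Stripping works for the first form because the data sits between the
          tags; for the second form the data is inside the tag itself, so it
          has to be pulled out before the tag is removed.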

          If you post a link to a file of the type you are actually working with,
          or a long enough sample of it, someone can probably come up with something.

          Lotta