Re: [Clip] modify strip html preserve urls

  • loro
    Message 1 of 15 , Nov 2 9:26 PM
      Don wrote:
      >when I do this I get the <a href="whatever.html"> but it deletes the
      >corresponding </a>
      >I want the </a> left alone as well .... am I missing something?

      Are you sure? What version? Because what NoteTab has always done
      before is turn this
      <a href="http://...">Link text</a>
      into this:
      <http://...>Link text

      No A HREF, just whatever-they-are-called brackets around the URL.

      Lotta
    • Don - HtmlFixIt.com
      Message 2 of 15 , Nov 2 9:45 PM
        So that is expected behavior, then ... I thought it would actually leave
        the tags. I need to clean the tags another way, then.

      • Axel Berger
        Message 3 of 15 , Nov 3 6:26 AM
          "Don - HtmlFixIt.com" wrote:
          > I thought it would actually leave the tags

          Well, it's called "strip HTML", innit?

          > need to clean tags another way then

          The "<http://(.*?)>" sequence should be easy to find. Just wrap whatever
          you want around it.
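A minimal sketch of this find-and-wrap idea, written in Python rather than as a NoteTab clip. It assumes the stripped text has the `<http://...>Link text` shape described earlier in the thread. Because the end of the link text is no longer marked, the sketch reuses the URL itself as the link text; the function name `rewrap` and the sample URL are made up for illustration.

```python
import re

# Axel's pattern, with the scheme moved inside the capture group so the
# replacement can reuse the complete URL.
URL_IN_BRACKETS = re.compile(r"<(http://.*?)>")


def rewrap(text: str) -> str:
    """Wrap each <http://...> sequence in an anchor tag.

    Assumption: the URL doubles as the link text, because the position
    of the original closing tag is unknown after stripping.
    """
    return URL_IN_BRACKETS.sub(r'<a href="\1">\1</a>', text)


if __name__ == "__main__":
    print(rewrap("See <http://example.com/page.html>the docs."))
```

Relative references such as `<page.html>` do not start with `http://`, so this pattern leaves them untouched.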

          Axel
        • Don - HtmlFixIt.com
          Message 4 of 15 , Nov 3 7:12 AM
            I guess in my mind I took "preserve URLs" to mean "preserve hyperlinks".

            Here is the rub: I want to keep hyperlinks and delete everything else in
            the HTML world. So can I simply look for something like:
            <not \a or \a href> and replace with nothing?
            My "not" regex is pretty (can't say sloppy because it doesn't exist).

          • John Shotsky
            Message 5 of 15 , Nov 3 7:26 AM
              It might be easier to strip the HTML, then rebuild the links. What does the
              source look like?

              John



            • Axel Berger
              Message 6 of 15 , Nov 3 7:31 AM
                "Don - HtmlFixIt.com" wrote:
                > So can I simply look for something like:
                > <not \a or \a href> and replace with nothing?

                Yes, I suppose you can. But I suggest my way is simpler: erase all HTML
                through the menu function, then find the URLs and restore the tag
                around them. Caution: My suggested find will not work for local relative
                references without the "http:".

                Axel
              • Don - HtmlFixIt.com
                Message 7 of 15 , Nov 3 7:58 AM
                  Won't work Axel, the </a> has been removed. I suppose I could put a
                  marker in there ...

                • Axel Berger
                  Message 8 of 15 , Nov 3 8:15 AM
                    "Don - HtmlFixIt.com" wrote:
                    > Won't work Axel, the </a> has been removed.

                    Yes, but isn't some clickable text unexpected and rather misleading in a
                    non-HTML environment? Wouldn't it be better to show the URL verbatim and
                    make that clickable?

                    It depends on what kind of end result you have in mind I suppose, and
                    I'm not as yet very sure what it is you want to achieve.

                    Axel
                  • Don - HtmlFixIt.com
                    Message 9 of 15 , Nov 3 11:10 AM
                      No, I am taking information and moving it from one HTML environment to
                      another where markup is not needed to get my result.
                      Formatting is handled in the upload method I am using by line breaks vs
                      <p> tags, for example, and that is converted via PHP to render the p tag
                      -- so I no longer need the p tag and so forth. The reality is that I am
                      going back into an HTML environment, but losing markup styling
                      information while preserving links and maybe one or two other tags.

                      So is there an easy regex I can use to find all tags that aren't a href
                      or /a and delete them? Ideally I may add another "or" to it as well.
                      So maybe this: find <^a href.*?|\a|ul|li> and delete or replace with
                      nothing.

                      So that would be ^ = not a href or \a or ul or li ... or something like
                      that. Syntax help on alternative nots?

                      Nots are new to my regex brain.

                      Don
                    • ebbtidalflats
                      Message 10 of 15 , Nov 4 9:16 AM
                        Don,

                        Constructing a NOT pattern is fairly difficult because [^...] negates a
                        set of characters, not a pattern, and you might need several passes to
                        deal with the variety of tags.

                        Instead, in only three replace actions, you could:
                        1. tokenize the links (for example {a href ...}button text{/a})
                        2. strip all html (no need to preserve URLs, since they don't exist anymore)
                        3. then restore the tokenized links to html links.
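The three steps above can be sketched as follows, in Python rather than as NoteTab clip code. This is only an illustration under stated assumptions: the braces are the token style suggested in the list, the function name `strip_keep_links` is hypothetical, and the text is assumed to contain no literal `{a ...}` or `{/a}` sequences and no `}` inside link attributes.

```python
import re


def strip_keep_links(html: str) -> str:
    """Remove all tags except <a ...> and </a> in three passes."""
    # 1. Tokenize the links: <a ...> becomes {a ...}, </a> becomes {/a}.
    #    The \b keeps tags such as <abbr> from being mistaken for links.
    tokenized = re.sub(r"<(/?a\b[^>]*)>", r"{\1}", html, flags=re.IGNORECASE)
    # 2. Strip every remaining tag.
    stripped = re.sub(r"<[^>]*>", "", tokenized)
    # 3. Restore the tokenized links to real tags.
    return re.sub(r"\{(/?a\b[^}]*)\}", r"<\1>", stripped, flags=re.IGNORECASE)


if __name__ == "__main__":
    source = '<p>Read the <b>manual</b> at <a href="http://example.com/">this page</a>.</p>'
    print(strip_keep_links(source))
```

The same shape should carry over to three consecutive replace actions in a clip, provided the chosen token characters never occur in the source text.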

                        Cheers,


                        Eb

                      • Sheri
                        Message 11 of 15 , Nov 5 8:34 AM
                          --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...> wrote:
                          >
                          > Here is the rub, I want to keep hyperlinks and delete everything
                          > else in the html world. So can I simply look for something like:
                          > <not \a or \a href> and replace with nothing?
                          > My not regex is pretty (can't say sloppy because it doesn't exist).

                          Try:
                          ^!Replace "<(?(?=/?a)(*FAIL)|[^>]*>)" >> "" RAWS0

                          Regards,
                          Sheri
                        • Sheri
                          Message 12 of 15 , Nov 5 12:35 PM
                            --- In ntb-clips@yahoogroups.com, "Sheri" <silvermoonwoman@...> wrote:
                            >
                            > --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@> wrote:
                            > >
                            > > Here is the rub, I want to keep hyperlinks and delete everything
                            > > else in the html world. So can I simply look for something like:
                            > > <not \a or \a href> and replace with nothing?
                            > > My not regex is pretty (can't say sloppy because it doesn't exist).
                            >
                            > Try:
                            > ^!Replace "<(?(?=/?a)(*FAIL)|[^>]*>)" >> "" RAWS0

                            also this:

                            ^!Replace "<(?!/?+a)[^>]*>" >> "" RAWS0
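For readers who want to try the negative lookahead outside NoteTab, here is a rough Python port. It is a sketch, not the clip itself: the possessive `?+` is written as a plain `?`, which matches the same tags in this pattern and also works on Python versions older than 3.11, where possessive quantifiers are not accepted. Note that the lookahead only checks that the tag name begins with "a", so tags such as `<abbr>` are preserved too. The second pattern, with `\b`, is a hypothetical tightening that is not from this thread.

```python
import re

# Port of the lookahead pattern: delete any tag whose name does not
# start with "a" (optionally preceded by "/").
KEEP_A_PREFIX = re.compile(r"<(?!/?a)[^>]*>")

# Hypothetical tightening (an assumption, not from the thread): require
# a word boundary after "a" so only <a ...> and </a> survive.
KEEP_A_ONLY = re.compile(r"<(?!/?a\b)[^>]*>")

sample = '<p><abbr title="x">NT</abbr> <a href="page.html">home</a></p>'

prefix_result = KEEP_A_PREFIX.sub("", sample)
strict_result = KEEP_A_ONLY.sub("", sample)

if __name__ == "__main__":
    print(prefix_result)
    print(strict_result)
```

Both patterns are case-sensitive as written here; pass `re.IGNORECASE` if the source uses uppercase tags such as `<A HREF=...>`.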
                          • Don - HtmlFixIt.com
                            Message 13 of 15 , Nov 5 12:59 PM
                              Dang, I was still working on the first puzzle!
                              Does that cover either an a tag or an /a tag then, Sheri?

                            • Sheri
                              Message 14 of 15 , Nov 5 2:24 PM
                                --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...> wrote:
                                >
                                > dang I was still working on the first puzzle!
                                > does that cover either an a tag or an /a tag then Sheri?

                                That should preserve both opening and closing "a" tags while removing
                                all other tags. As written, tags are not confined to single lines. To
                                confine matches to a single line, the character class [^>]* would need
                                to be [^>\r\n]*.
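The difference between the two character classes can be seen in a small Python experiment. This is a sketch under assumptions: the patterns are ports of the clip's regex with a plain `?` in place of the possessive `?+`, and the sample text is made up. It shows the trade-off in both directions: the multi-line form removes tags that wrap across lines but lets a stray `<` swallow text up to the next `>`, while the single-line form leaves both alone.

```python
import re

# Tag removal that may span line breaks (the form used in the clip).
ACROSS_LINES = re.compile(r"<(?!/?a)[^>]*>")
# Variant confined to a single line, as described above.
SINGLE_LINE = re.compile(r"<(?!/?a)[^>\r\n]*>")

# Made-up sample: a stray "<" on line 1 and an <img> tag that wraps.
sample = 'x < y\nand <b>bold</b> text, <img\n src="p.png"> here'

across_result = ACROSS_LINES.sub("", sample)
single_result = SINGLE_LINE.sub("", sample)

if __name__ == "__main__":
    print(repr(across_result))
    print(repr(single_result))
```

Which behavior is preferable depends on the source: wrapped tags favor the multi-line form, while text containing literal `<` characters favors the single-line one.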
