
Re: [Clip] HTML to wikiHow conversion clip

  • Mr. Phillip Sand Hansel II
    Message 1 of 8 , Dec 1, 2009
      Sorry for not providing a better example. My goal is to convert bona-fide
      HTML into the simpler wiki format. My example clip works fairly well, but
      leaves some hand clean-up. I run a Toolbar Modify Change HTML tags to
      UpperCase first, run my script, then run a Toolbar Modify Strip HTML tags to
      get rid of "non-wiki supported" HTML tags that I may have missed.

      Since wiki auto-creates thumbnails, I chose to simply delete the WIDTH=123,
      HEIGHT=456, BORDER=3 image modifiers.

      Convert this...
      <H1>Trebuchet Trials</H1>
      <p>
      <ol>
      <li>The base can not exceed one meter in length and five decimeters in
      width.
      <li>The throwing arm can not exceed 1.5 meters in length.
      <li>The catapult must have a locking device.
      </ol>
      <p>
      <ul>
      <li>Estimate and obtain adequate amount of materials.
      <li>The design called for four eight foot long two-by-twos, so purchase
      five.
      <li>All 5 were used, with about two feet left over.
      </ul>
      <p>
      The <A HREF="http://members.iinet.net.au/~rmine/gctrebs.html">Grey Company
      Trebuchet</A> site...
      <p>
      <IMG SRC="buildbase.jpg" WIDTH="300" HEIGHT="225" BORDER="3" ALT="Build A
      Base">


      To this...
      == Trebuchet Trials ==

      #The base can not exceed one meter in length and 0.5 meter in width.
      #The throwing arm can not exceed 1.5 meters in length.
      #The catapult must have a locking device.

      *Estimate and obtain adequate amount of materials.
      *The design called for four eight foot long two-by-twos, so purchase five.
      *All 5 were used, with about two feet left over.

      The [[http://members.iinet.net.au/~rmine/gctrebs.html | Grey Company
      Trebuchet]] site...

      [[Image:buildbase.jpg |thumb| Build a base]]
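
For readers without NoteTab, the heading and image parts of the conversion above can be sketched in Python. This is only an illustration under stated assumptions: the function names `convert_heading` and `convert_image` are hypothetical, the patterns assume double-quoted attributes with SRC appearing before ALT, and the ALT text is kept as written rather than re-cased as in the hand-made target.

```python
import re

def convert_heading(html: str) -> str:
    """Turn any <H1>..<H6> element into a wikiHow second-level header."""
    return re.sub(r"(?is)<h[1-6]>(.+?)</h[1-6]>", r"== \1 ==", html)

def convert_image(html: str) -> str:
    """Keep only SRC and ALT from an <IMG> tag; WIDTH, HEIGHT and BORDER are dropped."""
    pattern = r'(?is)<img\s+src="([^"]+)"[^>]*?alt="([^"]*)"[^>]*>'

    def repl(match: re.Match) -> str:
        # ALT text may be wrapped across lines in the source, so rejoin it.
        alt = " ".join(match.group(2).split())
        return f"[[Image:{match.group(1)} |thumb| {alt}]]"

    return re.sub(pattern, repl, html)

if __name__ == "__main__":
    print(convert_heading("<H1>Trebuchet Trials</H1>"))
    print(convert_image(
        '<IMG SRC="buildbase.jpg" WIDTH="300" HEIGHT="225" BORDER="3" ALT="Build A\nBase">'
    ))
```

Because the size and border attributes are never captured, there is no separate clean-up step for them.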




      Mr. Phillip Sand Hansel II


      ----- Original Message -----
      From: John Shotsky
      To: ntb-clips@yahoogroups.com
      Sent: Tuesday, December 01, 2009 6:33 PM
      Subject: RE: [Clip] HTML to wikiHow conversion clip



      Phillip, it would be a little easier if you include one actual example of
      each line you want to convert, and then how
      they should appear after the conversion.

      Regards,

      John

      From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf
      Of Mr. Phillip Sand Hansel II
      Sent: Tuesday, December 01, 2009 2:21 PM
      To: ntb-clips@yahoogroups.com
      Subject: Re: [Clip] HTML to wikiHow conversion clip

      Greetings;

      I saw there was some discussion in the past on using NoteTab to make wiki
      documents, but could not find a clipbook on
      converting HTML to wikiHow syntax.

      I am trying to convert web pages to something wikiHow understands, and have
      come up with a substitution clip that works
      fairly well, but is probably not as elegant as possible. My humble approach
      attacks the start and end tags individually.

      WikiHow has its own simple syntax. There is a second level header denoted
      by == Header 2 ==.
      There are URL links denoted by [[link description]].
      There are image links denoted by [[Image: imagename.jpg|thumb|
      description]].
      There are table markup tags, which are too complicated for me to understand
      how to translate (out of my scope, presently).

      My clip so far looks like this... any suggestions appreciated.

      ; convert HTML to WikiHow markup
      ^!Jump DOC_START
      ; Convert various Headers to 2nd level hdr used by wikiHow.
      ^!Replace <H1> == SICHA
      ^!Replace <H2> == SICHA
      ^!Replace <H3> == SICHA
      ^!Replace <H4> == SICHA
      ^!Replace <H5> == SICHA
      ^!Replace </H1> == SICHA
      ^!Replace </H2> == SICHA
      ^!Replace </H3> == SICHA
      ^!Replace </H4> == SICHA
      ^!Replace </H5> == SICHA
      ;Convert List items to wikiHow ordered list flags (#)
      ^!Replace <LI> # SICHA
      ^!Replace </LI> SICHA
      ;Convert paragraphs to wikiHow ordered list flags? (#)
      ^!Replace <P> # SICHA
      ^!Replace </P> SICHA
      ;Convert links to wikiHow syntax
      ^!REPLACE <A HREF=" [[ SICHA
      ^!REPLACE </A> ]] SICHA

      ;Convert image links to wikiHow syntax
      ^!REPLACE <IMG SRC=" [Image: SICHA
      ; replace image attributes with nothing
      ^!Replace "WIDTH=\"...\"" >> "" WRS
      ^!Replace "HEIGHT=\"...\"" >> "" WRS
      ^!Replace "BORDER=\".*\"" >> "" WRS
      ^!Replace "ALT=\".+\"" >> "" WRS
      ;Fix straggler closing angle brackets?
      ^!Replace " > "" WRS
      ^!Replace " > "" WRS
      ^!Replace " > "" WRS
      ;And then add "strip HTML markup" menu item? Currently doing manually after above substitutions have been made.

      Mr. Phillip Sand Hansel II

    • Sheri
      Message 2 of 8 , Dec 2, 2009
        Hi Phillip,

        I gave it a quick stab, no guarantees. I'm sure documents will still need some cleanup after using it. There's at least one long line that will need to be joined after copying this clip from email or archives (the IMG line).

        Regards,
        Sheri

        ^!Replace "(?is)<H\d>(.+?)</H\d>" >> "== $1 ==" RAWS
        ^!Jump Doc_Start
        :olloop
        ^!Find "(?si)<ol>.+?</ol>" RS
        ^!Iferror olloopend
        ^!Replace "<li>" >> "#" RAHS
        ^!Jump Select_End
        ^!Goto olloop
        :olloopend
        ^!Jump Doc_start
        :ulloop
        ^!Find "(?si)<ul>.+?</ul>" RS
        ^!Iferror links
        ^!Replace "<li>" >> "*" RAHS
        ^!Jump Select_End
        ^!Goto ulloop
        :links
        ^!Replace "(?is)<A HREF=\x22(.+?)\x22>(.*?)</A>" >> "[[$1|$2]]" RAWS
        :images
        ^!Replace "(?is)<IMG SRC=\x22.*?ALT=\x22(.+?)\x22>" >> "[[Image:$1|thumb|$2]]" RAWS
        ^!Replace "(?is)<p>\R?" >> "*" RAWS
        :tags
        ^!Replace "(?i)<[^>]+>" >> "" RAWS
        ;end of clip
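
The two list loops in the clip above (find each `<ol>` or `<ul>` block, then replace `<li>` only inside that selection) can be sketched in Python as a replacement callback. This is an approximation, not a translation of the clip: `convert_lists` is a hypothetical name, matching is made case-insensitive, whitespace after `<li>` is trimmed, and nested lists are not handled.

```python
import re

def convert_lists(html: str) -> str:
    """Replace <li> with '#' inside <ol> blocks and '*' inside <ul> blocks."""

    def mark_items(marker: str):
        # Build a callback that rewrites <li> only within the matched block,
        # which mirrors replacing inside the highlighted selection in the clip.
        def repl(match: re.Match) -> str:
            return re.sub(r"(?i)<li>\s*", marker, match.group(0))
        return repl

    html = re.sub(r"(?is)<ol>.+?</ol>", mark_items("#"), html)
    html = re.sub(r"(?is)<ul>.+?</ul>", mark_items("*"), html)
    return html

if __name__ == "__main__":
    sample = "<ol>\n<li>One\n<li>Two\n</ol>\n<ul>\n<li>A\n</ul>"
    print(convert_lists(sample))
```

The `<ol>` and `<ul>` tags themselves are left in place here, as in the clip, where the final tag-stripping step removes them.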
      • Mr. Phillip Sand Hansel II
        Message 3 of 8 , Dec 2, 2009
          Sheri:

          I knew that what I had mostly worked, but I also see that your approach is
          much cleaner and more direct.

          Thank you for leaving me a "character building challenge"; it helped me to
          more fully understand what the code was doing. :-)

          The Image replacement step did not work as expected; that caused me to read
          some help, but when I quickly got lost on the PCRE patterns section, I
          simply sat and stared at the code until I figured out some () were missing.
          I also added an extra set of double quotes (\x22 to close the imagename.jpg
          variable) and a middle variable, $2, which is the WIDTH & HEIGHT stuff I
          throw away.

          I changed...
          ^!Replace "(?is)<IMG SRC=\x22.*?ALT=\x22(.+?)\x22>" >> "[[Image:$1|thumb|$2]]" RAWS

          To...
          ^!Replace "(?is)<IMG SRC=\x22(.+?)\x22(.+?)ALT=\x22(.+?)\x22>" >> "[[Image:$1|thumb|$3]]" RAWS

          And then it did work as expected. Thank you for the improved method, and
          for making me grow.
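
The missing parentheses can be checked by counting capture groups. Below is a small Python sketch of the two patterns, with `\x22` written as a plain double quote; `count_groups` and `convert` are hypothetical helper names. One difference from NoteTab is worth noting: Python raises an error when a replacement refers to a group that does not exist, rather than producing output.

```python
import re

tag = '<IMG SRC="buildbase.jpg" WIDTH="300" HEIGHT="225" BORDER="3" ALT="Build A Base">'

# First version: only the ALT text is captured, so a second group never exists.
original = r'(?is)<IMG SRC=".*?ALT="(.+?)">'
# Revised version: group 1 is the file name, group 2 the discarded
# WIDTH/HEIGHT/BORDER text, group 3 the ALT text.
corrected = r'(?is)<IMG SRC="(.+?)"(.+?)ALT="(.+?)">'

def count_groups(pattern: str) -> int:
    """Return how many capture groups a pattern defines."""
    return re.compile(pattern).groups

def convert(text: str) -> str:
    """Apply the revised pattern, keeping groups 1 and 3 only."""
    return re.sub(corrected, r"[[Image:\1|thumb|\3]]", text)

if __name__ == "__main__":
    print(count_groups(original), count_groups(corrected))
    print(convert(tag))
```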

          I will test some more real life conversions of HTML files, and then perhaps
          create a wikiHow page on the topic. It is a useful tool if you've got HTML
          and want to share what it says with wikiFolk.


          Mr. Phillip Sand Hansel II


        • flo.gehrke
          Message 4 of 8 , Dec 3, 2009
            --- In ntb-clips@yahoogroups.com, "Sheri" <silvermoonwoman@...> wrote:
            >
            > I gave it a quick stab, no guarantees. I'm sure documents will still need some cleanup after using it. There's at least one long line that will need to be joined after copying this clip from email or archives (the IMG line).
            >
            > Regards,
            > Sheri
            >
            > ^!Replace "(?is)<H\d>(.+?)</H\d>" >> "== $1 ==" RAWS
            > ^!Jump Doc_Start...

            Hi Sheri,

            I would like to point out just some small details regarding the last 8 lines (I've added some line numbers for description)...

            1. :links
            2. ^!Replace "(?is)<A HREF=\x22(.+?)\x22>(.*?)</A>" >> "[[$1|$2]]" RAWS
            3. :images
            4. ^!Replace "(?is)<IMG SRC=\x22.*?ALT=\x22(.+?)\x22>" >> "[[Image:$1|thumb|$2]]" RAWS
            5. ^!Replace "(?is)<p>\R?" >> "*" RAWS
            6. :tags
            7. ^!Replace "(?i)<[^>]+>" >> "" RAWS
            8. ;end of clip


            Line #2: I think Phillip wants to see some more space in the replacement.

            Line #4: For me, it doesn't capture a second substring, so '$2' remains empty and is literally output with the replacement.

            Line #5: Produces some asterisks which I can't see in Phillip's result (omitted in my proposal).

            What would you think of replacing the last lines with...

            :links
            ^!Replace "(?is)<A HREF=\x22(.+?)\x22>(.*?)</A>" >> "[[$1\x20|\x20$2]]" RAWS
            :images
            ; Next line extended
            ^!Replace "(?isx) <IMG\x20SRC=\x22 ([^\x22]+) \x22.+ALT=\x22 (.+) \x22>" >> "[[Image:$1\x20|thumb|\x20$2]]" AWRS
            :tags
            ^!Replace "(<[^>]+>\R)+" >> "\r\n" AWRS
            ; Join the image-line
            ^!Find "^\[\[Image:\C+\]" WRS
            ^!Menu Modify/Lines/Join Lines
            ^!Jump 1
            ; end of clip
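
As a rough Python sketch of the link and tag steps proposed above: `convert_links` adds the spaces around the pipe, and `strip_tag_lines` collapses each run of tag-only lines into one line break. Both function names are hypothetical, `\R` is approximated as `\r?\n`, and tags that are not followed by a line break are left untouched, as with the pattern in the clip.

```python
import re

def convert_links(text: str) -> str:
    """Rewrite <A HREF="url">label</A> as [[url | label]]."""
    return re.sub(r'(?is)<a href="(.+?)">(.*?)</a>', r"[[\1 | \2]]", text)

def strip_tag_lines(text: str) -> str:
    """Collapse each run of lines that end in a tag into a single line break."""
    return re.sub(r"(?:<[^>]+>\r?\n)+", "\n", text)

if __name__ == "__main__":
    link = ('The <A HREF="http://members.iinet.net.au/~rmine/gctrebs.html">'
            'Grey Company\nTrebuchet</A> site...')
    print(convert_links(link))
    print(strip_tag_lines("#The catapult must have a locking device.\n</ol>\n<p>\n<ul>\n*Estimate"))
```

Collapsing a whole run at once is what leaves a single blank line between the two lists instead of one per removed tag.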


            Probably, some more improvements are needed. For example, joining more lines?

            Regards,
            Flo
          • Sheri
            Message 5 of 8 , Dec 3, 2009
              --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
              >
              > I would like to point out just some small details regarding the
              > last 8 lines (I've added some line numbers for description)...
              >

              LOL, no wonder Phillip says I gave him cause to stare at the documentation. Sorry Phillip! ;)

              I saw that he'd been waiting a while for help, felt bad, and did what I could on the fly. I don't have time, but I'm glad if you and the rest of the group can help him further refine and improve it.

              Regards,
              Sheri
            • Mr. Phillip Sand Hansel II
              Message 6 of 8 , Dec 3, 2009
                The first "off the cuff" effort was great and caused me to grow. It also caused me to upgrade to the latest Light version (I was proud that I had actually paid for 4.5 several years back... pride is funny.)

                You've both been more than helpful, and I feel that the response was very prompt. I've incorporated Flo's suggestions and they work great.

                Thank you (and the rest of the group) for being there. You've turned a repetitive editing process into a one-click fix. I think I understand enough to extend what I've got to other cases, should they arise.

                Mr. Phillip Sand Hansel II

