
RE: [cightml] Munged e-mail

  • Franki
    Message 1 of 4 , Sep 1, 2003
      Well...

      Technically, a spider downloads a page to a local machine, where it can
      run any test it wants. So in theory it could send your page through a
      browser engine that supports JS and Unicode encoding, and if the browser
      can make the link work, the spider can grab your email address. No amount
      of encoding and/or JavaScript would stop it.

      The simple fact is that for the mailto: to work, the address has to be a
      link. A spider can just look for links, check what the browser engine
      returns as the link address, and parse that (like what you see in the
      status bar when you mouse over a mailto link).

      You have to weigh the benefit against the time spent. If spiders get
      sophisticated enough to handle the JS and Unicode encoding methods, there
      is nothing you can do short of breaking the href link and making the user
      do something to get the email address.

      Otherwise, anything that works in the browser display is fair game to a
      spider that has a browser engine running at its home base.
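To make that concrete, here is a minimal sketch of what a harvester can do once it has the rendered markup instead of the raw source. It does not drive a real browser engine: `decodeNumericEntities` and `harvestMailto` are hypothetical helpers that only stand in for the decoding a browser performs anyway, and the sample link is an assumption built from the address used in this thread.

```javascript
// Hypothetical stand-in for the entity decoding a browser engine performs.
// Handles only decimal (&#109;) and hexadecimal (&#x6d;) numeric character references.
function decodeNumericEntities(text) {
  return text
    .replace(/&#x([0-9a-fA-F]+);/g, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#([0-9]+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
}

// Collect the address part of every mailto: link in already-rendered markup.
function harvestMailto(renderedHtml) {
  const decoded = decodeNumericEntities(renderedHtml);
  const pattern = /href\s*=\s*["']mailto:([^"'?]+)/gi;
  const found = [];
  let match;
  while ((match = pattern.exec(decoded)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Assumed example of what document.write would emit from the encoded pieces.
const rendered =
  '<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;greeder&#64;xprt.net">mail me</a>';
console.log(harvestMailto(rendered));
```

A spider with a real engine would not even need the decoding step; it would simply read the resolved link address from each anchor.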

      Since the full source code of about half a dozen suitable browser engines
      is freely available on the net, if that hasn't happened already it won't
      be long.
      (Actually, the source code is not even necessary: you can pass code to the
      IE DLLs and they will return results as well.)


      rgds

      Franki





      >-----Original Message-----
      >From: Gordon Reeder [mailto:greeder@...]
      >Sent: Monday, 1 September 2003 12:48 PM
      >To: cightml@yahoogroups.com
      >Subject: Re: [cightml] Munged e-mail
      >
      >
      >Good point, re encoding the mailto:.
      >
      >However a spam crawler could also just look for unicode
      >characters and assume it is a munged e-mail address. That
      >is why I swapped parts 2 and 4. But a good spambot would
      >look for the dot in the domain name and could re-assemble
      >the address. So it's not foolproof yet.
      >
      >At the risk of getting too complicated, I'm thinking of encrypting
      >the strings as arrays of numbers. This would make the
      >document.write procedure a bit more complicated, since it would
      >have to decrypt the strings too.
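One way the arrays-of-numbers idea could look is sketched below. This is an assumption about the approach, not Gordon's actual code: `KEY`, `encodeToNumbers`, and `decodeFromNumbers` are hypothetical names, and the "encryption" is only a fixed shift of each character code.

```javascript
// Hypothetical shift value; any small integer works as long as both sides agree.
const KEY = 7;

// Turn a string into an array of shifted character codes (done once, offline).
function encodeToNumbers(text) {
  return Array.from(text, (ch) => ch.charCodeAt(0) + KEY);
}

// Reverse the shift and rebuild the string (this part would run in the page).
function decodeFromNumbers(codes) {
  return codes.map((n) => String.fromCharCode(n - KEY)).join('');
}

// Same part numbering as the script in this thread: 2 = domain, 3 = "@", 4 = user.
const addr2 = encodeToNumbers('xprt.net');
const addr3 = encodeToNumbers('@');
const addr4 = encodeToNumbers('greeder');

const address =
  decodeFromNumbers(addr4) + decodeFromNumbers(addr3) + decodeFromNumbers(addr2);
console.log(address);
// In a page, the decoded string would then be handed to document.write as before.
```

In practice the page would ship only the literal number arrays plus the decoder. That hides the text from a harvester that reads raw source, but not from one that actually runs the script.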
      >
      >Franki wrote:
      >>
      >> Hi Gordon (and everyone)..
      >>
      >> Quite clever to use both the common methods together to protect against
      >> spiders that can do only one or the other..
      >>
      >> There is a tool to generate the unicode at:
      >> http://htmlfixit.com/tools.php
      >>
      >> One thing I would also do, is encode the mailto: as well..
      >> like so:
      >>
      >> <script type='text/javascript'>
      >> <!--
      >> var addr1 = 'mailto:';
      >> var addr2 = 'xprt.net';
      >> var addr3 = '@';
      >> var addr4 = 'greeder';
      >> document.write( '<a href="' + addr1 + addr4 + addr3 + addr2 +'">');
      >> document.write( addr4 + addr3 + addr2 + ' </a>');
      >> //-->
      >> </script>
      >>
      >> The reason is pretty simple,
      >>
      >> If I were to write a script to dig email addresses out of web pages, I'd
      >> start by searching the HTML for the string "mailto:" and then pulling all
      >> text between that and the following closing inverted comma.
      >> (I'll bet that's how most spambots do it too.)
      >>
      >> If you encode the "mailto:" as well, the script would fail to find the
      >> email at all, let alone decode it.
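The search described above can be sketched roughly as follows. This is an illustration under assumptions, not any real spambot's code: `naiveHarvest` and `toNumericEntities` are hypothetical helpers, and the two sample links are made up from the address in this thread.

```javascript
// Hypothetical naive harvester: find "mailto:" in the raw source and take
// everything up to the next single or double quote.
function naiveHarvest(source) {
  const found = [];
  let start = source.indexOf('mailto:');
  while (start !== -1) {
    const from = start + 'mailto:'.length;
    const ends = [source.indexOf('"', from), source.indexOf("'", from)]
      .filter((i) => i !== -1);
    const end = ends.length ? Math.min(...ends) : source.length;
    found.push(source.slice(from, end));
    start = source.indexOf('mailto:', end);
  }
  return found;
}

// Encode every character as a decimal numeric character reference.
function toNumericEntities(text) {
  return Array.from(text, (ch) => '&#' + ch.codePointAt(0) + ';').join('');
}

const plain = '<a href="mailto:greeder@xprt.net">mail</a>';
const encoded =
  '<a href="' + toNumericEntities('mailto:greeder@xprt.net') + '">mail</a>';

console.log(naiveHarvest(plain));   // finds the address in the plain link
console.log(naiveHarvest(encoded)); // finds nothing: no literal "mailto:" in the source
```

The encoded link still works in a browser, because the browser decodes the character references before it resolves the href.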
      >>
      >> regards
      >>
      >> Franki
      >>
      >> >-----Original Message-----
      >> >From: Gordon Reeder [mailto:greeder@...]
      >> >Sent: Monday, 1 September 2003 6:48 AM
      >> >To: CIGHTML Mailing List
      >> >Subject: [cightml] Munged e-mail
      >> >
      >> >
      >> >There are a few tricks to munging an e-mail address. But
      >> >the spam crawlers are getting smarter, so I decided to change
      >> >the way I munge my e-mail address. I thought you would find
      >> >this useful.
      >> >
      >> >The following script will write my e-mail address onto my web page.
      >> >It is a modification of Paul's method shown in CIGHTML. There
      >> >are three things that this code does.
      >> >1) It uses Unicode for the text strings instead of regular text.
      >> >2) It scrambles the order of the substrings. Note that addr4 is
      >> >written before addr3.
      >> >3) It uses JavaScript to actually write the hyperlink text to the page.
      >> >
      >> >[!-- The following JavaScript will write my e-mail address to the web
      >> >page --]
      >> >[SCRIPT LANGUAGE="JavaScript"]
      >> >[!--
      >> >var addr1 = "mailto:"
      >> >var addr2 = "xprt.net"
      >> >var addr3 = "@"
      >> >var addr4 = "greeder"
      >> >document.write( '<A HREF="' + addr1 + addr4 + addr3 + addr2 +
      >> >'" CLASS="HLINK">')
      >> >document.write( addr4 + addr3 + addr2 + ' </A>')
      >> >//--]
      >> >[/SCRIPT]
      >> >
      >> >--
      >> >Gordon & Juanita Reeder
      >> >
      >> >greeder@... (Gordon)
      >> >jmr2@... (Juanita)
      >> >greeder@... (Both)
      >>
      >>
      >> To unsubscribe from the Yahoo Group, send an email to:
      >> cightml-unsubscribe@yahoogroups.com
      >>
      >> This list is for readers of Paul McFedries' CIG to Creating a Web Page.
      >>
      >> Messages prior to 2002 are also archived at http://www.mcfedries.com
      >>
      >>
      >>
      >> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/

      --
      Gordon & Juanita Reeder

      greeder@... (Gordon)
      jmr2@... (Juanita)
      greeder@... (Both)

      Check out my web site:
      http://www.xprt.net/~greeder

