
Re: [xenu-usergroup] PHP and session IDs

  • Tilman Hausherr
    Message 1 of 5, Nov 6, 2005
      what if you enable cookies?

      Tilman

      On Sun, 06 Nov 2005 14:03:37 -0000, frank visser wrote:

      >hi all,
      >
      >when scanning this site:
      >http://stuartdavis.com/
      >
      >i get a huge multiplication of links due to the PHP system of
      >session IDs:
      >
      >for example:
      >http://stuartdavis.com/albums/nomenestnumen/lyrics/disciple
      >http://stuartdavis.com/albums/nomenestnumen/lyrics/disciple?PHPSESSID=6446040253bfd5caadea71f2973082b7
      >http://stuartdavis.com/albums/nomenestnumen/lyrics/disciple?PHPSESSID=f676e48f73751ccef6c553cd33379eed
      >etc., etc.
      >
      >they all refer to the same page/URL.
      >
      >is there a way around this? i have stopped scanning after 40,000
      >links were scanned, because this gets a bit impractical.
      >
      >i could of course exclude URLs that refer to only one page (a Search
      >page or a Contact page), but I am not sure if all of these are of
      >this type.
      >
      >any input is welcome,
      >
      >frank
    • frank visser
      Message 2 of 5, Nov 6, 2005
        hi tilman,

        i have enabled all cookies now: direct, indirect and session (under
        Privacy > Advanced > Disable automatic cookie handling).

        seems i get fewer session ID URLs, but i still find a lot of them, in
        some cases 33 for one URL:

        http://stuartdavis.com/images/flyers/stu_11x17poster_color.jpg?PHPSESSID=1a5f4aa3adcc1bd55627a196bf16a7bb
        http://stuartdavis.com/images/flyers/stu_11x17poster_color.jpg?PHPSESSID=26bd7acfe18ba32f9e2c23c5f19c61d5
        etc.

        is the logic that if cookies are disabled, a session ID is added to
        the URL?

        are these session URLs generated by Xenu while spidering?

        when i look up the properties of these URLs, they all have different
        URLs under "pages linking to this one", for example:

        http://stuartdavis.com/book/print/429?sort=asc&order=Venue&from=50
        http://stuartdavis.com/book/print/429?sort=asc&order=City&from=0
        http://stuartdavis.com/book/print/429?from=50&sort=asc&order=State

        is this all generated by Xenu?

        any input is helpful,

        frank
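
Editor's sketch: the URLs listed above differ only in their PHPSESSID query parameter, so one way to see how many distinct pages sit behind a list of scanned links is to strip that parameter and de-duplicate. The Python below is a minimal illustration, not a Xenu feature. The helper name `strip_session_id` is hypothetical, and it assumes PHPSESSID is the only session parameter the site uses.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: PHPSESSID is the only session parameter on this site.
SESSION_PARAMS = {"PHPSESSID"}


def strip_session_id(url):
    """Return the URL with any session-ID query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    # Rebuild the URL; an empty query drops the trailing "?" as well.
    return urlunsplit(parts._replace(query=urlencode(kept)))


# The two example links from the message above.
scanned = [
    "http://stuartdavis.com/images/flyers/stu_11x17poster_color.jpg"
    "?PHPSESSID=1a5f4aa3adcc1bd55627a196bf16a7bb",
    "http://stuartdavis.com/images/flyers/stu_11x17poster_color.jpg"
    "?PHPSESSID=26bd7acfe18ba32f9e2c23c5f19c61d5",
]

unique_pages = {strip_session_id(url) for url in scanned}
print(len(scanned), "scanned links,", len(unique_pages), "distinct page(s)")
```

Other query parameters (such as sort, order, and from in the print URLs above) are kept, so only links that differ purely by session ID collapse into one.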

      • Tilman Hausherr
        Message 3 of 5, Nov 6, 2005
          No, I'm talking about *enabling* cookies in Xenu.
          http://home.snafu.de/tilman/xenulink.html#cookies

          (I believe this was originally made for you!!!!)

          I have no idea how Xenu handles cookies. I don't do anything myself;
          I let WININET.DLL handle it (when enabled).

          Tilman
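
Editor's sketch: the point about enabling cookies in the crawler can be illustrated with a toy experiment. PHP can be configured (for example via session.use_trans_sid) to append the session ID to links when the client does not return a session cookie. Whether stuartdavis.com was set up that way is an assumption. The Python below imitates that fallback with a small local server and compares a client that ignores cookies with one that keeps a cookie jar. It is not PHP, Xenu, or WinINet; the handler and helper names are hypothetical.

```python
import http.server
import threading
import urllib.request
import uuid
from http.cookiejar import CookieJar


class Handler(http.server.BaseHTTPRequestHandler):
    """Toy server imitating a URL-based session fallback (not PHP itself)."""

    def do_GET(self):
        cookie = self.headers.get("Cookie", "")
        self.send_response(200)
        if "PHPSESSID=" in cookie:
            # The client returned the cookie, so links stay clean.
            body = '<a href="/page">page</a>'
        else:
            # No cookie seen: start a session and also put the ID in the link.
            sid = uuid.uuid4().hex
            self.send_header("Set-Cookie", f"PHPSESSID={sid}; Path=/")
            body = f'<a href="/page?PHPSESSID={sid}">page</a>'
        data = body.encode()
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice(opener, url):
    """Fetch the same URL twice with one opener and return both bodies."""
    bodies = []
    for _ in range(2):
        with opener.open(url) as response:
            bodies.append(response.read().decode())
    return bodies


server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# ProxyHandler({}) bypasses any system proxy for this local test.
no_cookies = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with_cookies = urllib.request.build_opener(
    urllib.request.ProxyHandler({}),
    urllib.request.HTTPCookieProcessor(CookieJar()),
)

without = fetch_twice(no_cookies, url)   # cookies ignored
with_jar = fetch_twice(with_cookies, url)  # cookies stored and returned

server.shutdown()
server.server_close()

print("cookies ignored:", without)
print("cookies kept:   ", with_jar)
```

In this toy setup the client without a cookie jar is handed a fresh session ID in the link on every request, while the client with a cookie jar gets a clean link from its second request on, which is the kind of difference a crawler sees once cookie handling is enabled.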


        • Frank Visser
          Message 4 of 5, Nov 6, 2005

            Hi Tilman,

            Shame on me, how could I forget!

            I set AllowCookies=1 and the issue has disappeared.

            I guess I hadn’t spidered a site that was as heavily cookie-based as this one ;-)

            Thanks,

            frank

             

            ========================================

            Frank Visser, Waterpoortweg 279, 1051 pv, Amsterdam

            Author of: Ken Wilber: Thought as Passion SUNY 2003

            Read all about Ken Wilber : http://www.integralworld.net

            ========================================

