Re: not all addresses are indexed

  • Chris
    Message 1 of 7 , Jun 15, 2009
      --- In xenu-usergroup@yahoogroups.com, "Lars Ekdahl" <baloo5419@...> wrote:
      >
      > Maybe off topic
      >
      > When I submit the sitemap, Google accepts it and finds all 5676 addresses but indexes only 3395. Do you know why?
      >

      Google doesn't always give you the "full" picture.
      (Otherwise we'd soon be able to work out how their algorithms work, and it would put gazillions of SEO people out of work.)

      But the main reason is that supplying a sitemap to Google is NOT a guarantee that they will add ALL the URLs within the sitemap to their index.
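
To put numbers on the gap between submitted and indexed URLs, here is a minimal Python sketch. It assumes the sitemap follows the standard sitemaps.org XML format; the function names (`sitemap_urls`, `coverage`) and the `SAMPLE` document are made up for illustration, and the indexed count still has to be read off Google's own report by hand.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]

def coverage(submitted, indexed):
    """Fraction of submitted URLs that are reported as indexed."""
    return indexed / submitted if submitted else 0.0

# Hypothetical two-URL sitemap, only here to make the sketch runnable.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(len(sitemap_urls(SAMPLE)))            # URLs found in the sample
    # Figures from the question: 5676 submitted, 3395 indexed.
    print(5676 - 3395)                          # URLs not indexed
    print(round(coverage(5676, 3395) * 100, 1)) # percentage indexed
```

With the figures from the question, that is 2281 URLs left out, or roughly 60% coverage.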

      There may be pages which Google will add to the 'secondary' index (better known as the 'supplemental' index).

      The "supplemental index" contains files which Google doesn't know what to do with, OR content it doesn't want in its main index.

      One reason a page can end up in the "supplemental index" is 'duplicate content' (e.g. you have a blog, you use tags, and your tag page is exactly the same as the main landing page).
      Or your page is a duplicate of a KNOWN PRIOR existing page on ANOTHER site (whether it was directly scraped or is just a coincidental copy).
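
If you want to look for exact duplicates on your own site first, something along these lines might help. It is only a rough sketch: the names `fingerprint` and `find_duplicates` are invented here, the tag stripping is deliberately crude, and it only catches pages whose visible text is identical. How Google itself decides that two pages are duplicates is not public, so treat this as a sanity check rather than a reproduction of their method.

```python
import hashlib
import re
from collections import defaultdict

def fingerprint(html):
    """Hash a page's text, ignoring tags, letter case and whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)              # crude tag stripping
    text = re.sub(r"\s+", " ", text).strip().lower()  # normalise whitespace and case
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_duplicates(pages):
    """Group URLs with identical fingerprints. `pages` maps URL -> HTML."""
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[fingerprint(html)].append(url)
    # Keep only groups that contain more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]

if __name__ == "__main__":
    # Hypothetical pages: the tag page repeats the landing page's text.
    pages = {
        "/": "<h1>Hello</h1><p>World</p>",
        "/tag/x": "<h1>hello</h1>\n<p>World</p>",
        "/about": "<p>About</p>",
    }
    print(find_duplicates(pages))
```

Feeding it the HTML of the pages listed in your sitemap would show which tag or archive pages simply repeat another page.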

      Or your page is so badly structured (in HTML terms, for example) that Google couldn't find the content in it, so it threw the page into the "supplemental index" rather than the main index. (Broken JS or too many HTML validation errors sometimes cause Googlebot to just 'give up'.)

      There are many other reasons why content might not appear in the main index, but the two reasons above (duplicate content and badly structured pages) hopefully give you some sort of explanation.

      Regards

      Chris
    • Lars Ekdahl
      Message 2 of 7 , Jun 15, 2009
        Thanks!
        /Lars