
Re: [zms-developers] Google Sitemap generation - double links

  • Sebastian Tänzer
    Message 1 of 11 , Jan 27, 2012
      Hi Thorsten,

      great to hear. I thought about the dynamic generation which actually isn't too hard to do:

      request = container.REQUEST
      response =  request.response

      nodes = context.filteredChildNodes()

      folder_types = []

      meta = context.content.getMetaobjIds()
      for m in meta:
          obj = context.getMetaobj(m)
          if obj['type'] == 'ZMSDocument':
              folder_types.append( obj['id'] )

      content = []

      for n in nodes:
          if n.meta_id not in folder_types:
              content.append(n.meta_id)

      if content:
          return True
      else:
          return False
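
A minimal, self-contained sketch of the same check that runs outside Zope. The stub class, the helper names and the sample data are hypothetical stand-ins; only the dictionary keys 'id' and 'type' and the meta_id attribute mirror the script above.

```python
class StubNode:
    """Hypothetical stand-in for a ZMS node; only meta_id is modelled."""
    def __init__(self, meta_id):
        self.meta_id = meta_id


def folder_type_ids(metaobjs):
    """Collect the ids of all meta objects whose type is 'ZMSDocument'."""
    return [obj['id'] for obj in metaobjs if obj['type'] == 'ZMSDocument']


def has_content(child_nodes, folder_types):
    """Return True if at least one child is not a folder-like (page) type."""
    return any(n.meta_id not in folder_types for n in child_nodes)


# Illustrative meta object definitions (assumed data, not read from a real site).
metaobjs = [
    {'id': 'ZMSFolder', 'type': 'ZMSDocument'},
    {'id': 'ZMSDocument', 'type': 'ZMSDocument'},
    {'id': 'ZMSTextarea', 'type': 'ZMSObject'},
]
folder_types = folder_type_ids(metaobjs)

only_folders = [StubNode('ZMSFolder'), StubNode('ZMSFolder')]
with_text = [StubNode('ZMSFolder'), StubNode('ZMSTextarea')]

print(has_content(only_folders, folder_types))  # False
print(has_content(with_text, folder_types))     # True
```

In the real script the list of child nodes would come from context.filteredChildNodes() and the definitions from getMetaobj(), as shown above.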


      On 27.01.2012 at 12:36, Thorsten Weber wrote:

       

      Hi Sebastian,


      that's what I thought ...

      thanks for sharing the script - it works perfectly for me!

      One way to optimize it might be to build the list of folder_types dynamically from metaobj_manager.


      kind regards,  
      Thorsten Weber


      On 27.01.2012 at 12:12, Sebastian Tänzer wrote:

       

      I found the problem:

      getHref2IndexHtml() generates the URL of the next child if the parent doesn't have any content of its own (i.e. there are only folders in there), so the same URL shows up for both the parent and the child.
      I solved this for now by using a Python script that checks for content like this:

      request = container.REQUEST
      response = request.response

      nodes = context.filteredChildNodes()
      folder_types = ['ZMSFolder', 'ZMSDocument']
      content = []

      for n in nodes:
          if n.meta_id not in folder_types:
              content.append(n.meta_id)

      if content:
          return True
      else:
          return False

      and including it in the sitemap generation:

      ...
      <dtml-in "content.filteredTreeNodes(REQUEST=REQUEST, meta_types=['ZMSFolder'])">
      <dtml-if "checkContent() and not isResource(REQUEST) and isActive(REQUEST)">
      <url>
      <loc><dtml-var domain><dtml-var "getHref2IndexHtml(REQUEST)" html_quote></loc>
      <lastmod><dtml-var "getLangFmtDate(ZopeTime(),'eng','%Y-%m-%d')"></lastmod>
      <dtml-comment><lastmod><dtml-var "getLangFmtDate(getObjProperty('change_dt',REQUEST),'eng','%Y-%m-%d')"></lastmod></dtml-comment>
      <dtml-if "getObjProperty('attr_zmsgoogle_bot_priority',REQUEST)"><priority><dtml-var "getObjProperty('attr_zmsgoogle_bot_priority',REQUEST)" fmt="%.5f"></priority><dtml-else></dtml-if>
      <dtml-if "getObjProperty('attr_zmsgoogle_bot_changefreq',REQUEST)"><changefreq><dtml-var "getObjProperty('attr_zmsgoogle_bot_changefreq',REQUEST)"></changefreq><dtml-else></dtml-if>
      </url>
      </dtml-if>
      </dtml-in>
      ...

      This works as expected for now.

      Optimizations highly welcome!

      Cheers, Sebastian

      On 27.01.2012 at 11:32, Niels Dettenbach wrote:

      > On Friday, 27 January 2012, at 11:12:28, you wrote:
      >> any idea about the double output of some folders?
      > ...not yet, as I can't reproduce this (at least with my published version).
      >
      > I will check this in detail (i.e. whether it depends on hidden folders or something
      > like that) on Wednesday next week and come back then.
      >
      >
      > best regards,
      >
      >
      > Niels.
      > --
      > ---
      > Niels Dettenbach
      > Syndicat IT&Internet
      > http://www.syndicat.com/




    • Sascha Gottfried
      Message 2 of 11 , Feb 1, 2012
        Hi ZMS developers,

        I read this thread and thought generating this sort of information should be possible mostly using ZMS API methods.

        Multiple links to the same URL can be prevented by calling getHref2IndexHtml(REQUEST, deep=0). The parameter 'deep' is true by default, and in that case the method traverses to the first object that contains objects of type PAGEELEMENTS. Folders that contain only other folders do not contain PAGEELEMENTS, so their link resolves to a page further down the tree.

        To filter by type and still account for custom-developed ZMS types, consider these API methods:
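
To illustrate why the default produces double links, here is a toy model of the traversal described above. It is a sketch under assumptions, not the ZMS implementation: the Page class, the href() function and the example URLs are hypothetical, and only the idea of a 'deep' switch comes from the real method.

```python
class Page:
    """Hypothetical tree node: a URL, child pages, and a page-element flag."""
    def __init__(self, url, children=(), has_elements=False):
        self.url = url
        self.children = list(children)
        self.has_elements = has_elements


def href(page, deep=True):
    """Toy model of link resolution; it only mimics the described behaviour."""
    current = page
    # With deep enabled, walk down through the first children until a page
    # that holds page elements is reached (or there is nothing below).
    while deep and not current.has_elements and current.children:
        current = current.children[0]
    return current.url


child = Page('/products/overview/', has_elements=True)
parent = Page('/products/', children=[child])  # contains only a sub-page

deep_urls = [href(p) for p in (parent, child)]
flat_urls = [href(p, deep=False) for p in (parent, child)]

print(deep_urls)  # ['/products/overview/', '/products/overview/']
print(flat_urls)  # ['/products/', '/products/overview/']
```

With deep traversal the parent and the child resolve to the same URL, which is the duplicate seen in the sitemap; with deep switched off each page keeps its own URL.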

        getType()
        isPageContainer()
        isPage()
        isPageElement()
        isMetaType()


        If I run these methods on a 'ZMSFolder' I get these results:

        type: ZMSDocument
        meta_id: ZMSFolder
        isMetaType(PAGES): True
        isPage: True
        isPageElement: False
        isPageContainer: True

        If I run these methods on a 'ZMSDocument' I get these results:

        type: ZMSDocument
        meta_id: ZMSDocument
        isMetaType(PAGES): True
        isPage: True
        isPageElement: False
        isPageContainer: True

        Both 'ZMSFolder' and 'ZMSDocument' return 'ZMSDocument' as their ZMS type - they just differ in attribute 'meta_id'.

        Furthermore, the code does not need to call isActive() on every item in the innermost loop, because filteredTreeNodes() already calls isVisible() internally. That method checks for active items and handles multi-language aspects as well (read the source), which matters because the outermost loop is for multi-language ZMS sites. Niels' (syndicat) code checks isActive() as well.

        The latest code for skipping certain folders can be replaced with a call to:

        filteredChildNodes(meta_types=PAGES) - to scan for any item with ZMS Type PAGE

        filteredChildNodes(meta_types=['ZMSDocument']) - to make sure any PAGE item contains at least one ZMSDocument / or any other list of types

        filteredChildNodes(meta_types=PAGEELEMENTS) - this expression is true as soon as a PAGE item contains at least one PAGEELEMENT (I think this is Sebastian's requirement, but the filtering approaches of Sebastian and Thorsten differed)
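
The three variants can be tried out with a small stub that runs outside Zope. This is only a sketch: StubNode, its 'kind' attribute and the string constants are hypothetical placeholders for the real ZMS objects and their PAGES / PAGEELEMENTS constants, and the stub's filtering is an assumption about how meta_types is interpreted.

```python
PAGES, PAGEELEMENTS = 'PAGES', 'PAGEELEMENTS'  # placeholder constants


class StubNode:
    """Hypothetical stand-in for a ZMS node with a filterable child list."""
    def __init__(self, meta_id, kind, children=()):
        self.meta_id = meta_id
        self.kind = kind  # PAGES or PAGEELEMENTS
        self.children = list(children)

    def filteredChildNodes(self, meta_types=None):
        """Return children matching a kind constant or a list of meta ids."""
        if meta_types is None:
            return list(self.children)
        if meta_types in (PAGES, PAGEELEMENTS):
            return [c for c in self.children if c.kind == meta_types]
        return [c for c in self.children if c.meta_id in meta_types]


def has_page_elements(node):
    """True if the node holds at least one page element of its own."""
    return len(node.filteredChildNodes(meta_types=PAGEELEMENTS)) > 0


# A folder holding only another folder, and a folder holding a text block.
empty_folder = StubNode('ZMSFolder', PAGES, [StubNode('ZMSFolder', PAGES)])
text_folder = StubNode('ZMSFolder', PAGES,
                       [StubNode('ZMSTextarea', PAGEELEMENTS)])

print(has_page_elements(empty_folder))  # False
print(has_page_elements(text_folder))   # True
```

Only folders for which has_page_elements() is true would then be written to the sitemap.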

        I put a code sample on Gist that does the required filtering. It is a developer version that also lists meta_id and URL for every item; remove that output for production use.
        https://gist.github.com/1718080

        If nothing else, the example shows that the available API lacks expressiveness for this particular use case.

        I found a function isPageWithElements() in module zmscontainerobject.py. It should be available as an instance method that could do the filtering in the innermost loop.
        Furthermore, filteredTreeNodes() cannot skip 'resource' objects, so this has to be done in the innermost loop as well.
        For another use case, querying all tree objects, it offers no way to return inactive/invisible objects, because it calls isVisible() unconditionally.

        Good luck with your sitemaps!
