
Re: [jug-detroit] Globus Toolkit

  • Ilya Sterin
    Message 1 of 13 , Jan 31, 2009
      Well, since it was just Srini and me for a while, we used to alternate.  He'd come out here to Grosse Pointe one day and I would meet him in Birmingham the next day.  Woodward sounds fine for the next meeting.  We met a few times at the Borders cafe in Birmingham, but seating availability there varies.

      So this Wednesday at 7pm is fine with me if Srini and Glenn agree.  Pick any place around RO.

      Thanks.

      Ilya


      From: David Mckinnon <mckinnon.david@...>
      To: jug-detroit@yahoogroups.com
      Sent: Saturday, January 31, 2009 9:20:46 PM
      Subject: Re: [jug-detroit] Globus Toolkit

      Ilya,

      We never did pick a place.  Where do you and Srini meet usually? 

      There is a Caribou Coffee on Woodward that might be convenient for Glenn.   The one on Main in RO is probably pretty busy in the evenings.

      Let me know where you think we should meet and then we can post it to the mail list.

      I'd like to arrange meetings like this for the first Wednesday of every month if possible.  No presentations, just meeting and talking.  I think Srini said 7PM ish.

      Let me know what you think. 

      David



      From: Ilya Sterin <isterin@yahoo.com>
      To: jug-detroit@yahoogroups.com
      Sent: Saturday, January 31, 2009 4:10:54 PM
      Subject: Re: [jug-detroit] Globus Toolkit

      Glenn, so it all depends on the grid framework you're using.  There are some generic distributed computing concepts that should be, and are, a part of most grid frameworks, but things like disk storage are not necessarily something they all must support.

      There are two types of grids, compute grids and data grids.  Some frameworks provide both.  I think the biggest failure-recovery issues they must support, and many do, are the same ones that pertain to any dynamic infrastructure and distributed application.  Also, today with the availability of dynamic provisioning that clouds provide, there are even more things to support.  The major thing with distributed computing failures has always been network partitions and crash recovery.  How do you handle a completely decentralized infrastructure when a network partition occurs?  There are many different approaches, many of them worth studying in depth, as these topics are highly academic and theoretical.  Group communication concepts are probably some of the more prevalent ones that deal with these concerns.  Take a look at the Spread toolkit and the papers on extended virtual synchrony if you're interested in getting an in-depth tutorial.
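
      To give the partition question some shape, here is a rough sketch of one simple rule, a strict-majority quorum.  This is my own illustration and is not extended virtual synchrony or anything taken from Spread; the class and method names (PartitionQuorumSketch, hasQuorum) are made up.  The idea is that after a split, only the side that can still see more than half of the configured nodes keeps accepting work, so two sides can never both proceed.

```java
// Hypothetical illustration of a majority-quorum rule for network partitions.
// It is not how Spread or any particular grid toolkit implements membership.
class PartitionQuorumSketch {

    // A partition may keep accepting work only if it can reach a strict
    // majority of the configured cluster. With an even split (2 of 4),
    // neither side has a majority, so both sides stop.
    static boolean hasQuorum(int reachableNodes, int clusterSize) {
        if (clusterSize <= 0 || reachableNodes < 0 || reachableNodes > clusterSize) {
            throw new IllegalArgumentException("invalid node counts");
        }
        return reachableNodes > clusterSize / 2;
    }

    public static void main(String[] args) {
        // A 5-node cluster splits into a group of 3 and a group of 2.
        System.out.println("side with 3 nodes may continue: " + hasQuorum(3, 5));
        System.out.println("side with 2 nodes may continue: " + hasQuorum(2, 5));
    }
}
```

      The trade-off is availability: the minority side refuses work until the partition heals, which may or may not be acceptable for a given application.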

      There is really not a particular size that you start with to explore the capabilities.  Most grid toolkits take advantage of multiple CPUs just as they take advantage of multiple physical nodes.  The good grid toolkits make this completely transparent.  When I was developing our grid architecture I was doing it completely locally on my MacBook Pro and then migrated to EC2.  The process was about 95% transparent; the only things that had to change were configuration settings like the communication protocol, discovery protocol, etc., but that's all declarative, at least in GigaSpaces and GridGain.

      Here are some of the things to consider.

      The simplest approach to gridifying tasks is through Map/Reduce.  It's very straightforward and is supported by many grid toolkits (Hadoop, GridGain, GigaSpaces).  If all you need is what Map/Reduce provides, then Hadoop fits that like a glove.  GridGain also has a great implementation of Map/Reduce.  Hadoop uses its own distributed file system, so most of the tasks have to be represented in a format writable to disk.  They have plugins for database persistence, but last time I looked at them they were of pretty poor quality.  GridGain supports adapters for most grid concerns, and you can declaratively plug in a JDBC, file system, or other adapter to be used in place of a data grid.  (GridGain only provides a compute grid, but it's the best I've seen so far.  It's free, unless you want to buy support.)
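
      To make the split/aggregate idea concrete, here is a rough sketch in plain Java.  It does not use the Hadoop, GridGain, or GigaSpaces APIs; a local thread pool stands in for the grid nodes, and the names (WordCountMapReduce, countWords, totalWords) are made up for illustration.  Each chunk of input becomes one "map" task, and the "reduce" step sums the partial results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Local stand-in for a compute grid: worker threads play the role of nodes.
class WordCountMapReduce {

    // "Map" step: count the words in one chunk of the input.
    static int countWords(String chunk) {
        String trimmed = chunk.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Submit one task per chunk, then "reduce" the partial counts to a total.
    static int totalWords(List<String> chunks, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> partials = new ArrayList<>();
            for (String chunk : chunks) {
                Callable<Integer> task = () -> countWords(chunk);
                partials.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> partial : partials) {
                total += partial.get(); // reduce: sum the partial results
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a map task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList(
                "compute grids and data grids", "map then reduce", "");
        System.out.println("total words: " + totalWords(chunks, 4));
    }
}
```

      A real toolkit adds what this sketch leaves out: shipping the task to remote nodes, load balancing, and rerunning a task when a node dies.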

      Now, you also want to think about whether you want to distribute your data.  If your data is currently persisted in a store that will not scale as you scale the compute nodes that read/persist that data, a data grid or data partitioning/sharding is a must.  Unfortunately there is not a good open source implementation of a data grid out there; some commercial options are GigaSpaces and Oracle Coherence.  Most of them provide an in-memory data grid and have read-through/write-through support to a persistent data store.  This is great, because it's basically in-memory caching for compute operations with transparent persistence.
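
      Here is a minimal single-JVM sketch of what read-through and write-through mean.  It is not the GigaSpaces or Coherence API; BackingStore, InMemoryStore, and Cache are hypothetical names, and the "database" is just a map that counts how often it is read.  A real data grid would also partition and replicate the in-memory part across nodes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WriteThroughCacheSketch {

    // Hypothetical persistent store; a real one might sit on top of JDBC.
    interface BackingStore {
        String load(String key);
        void save(String key, String value);
    }

    // Map-backed stand-in for a database that counts how often it is read.
    static class InMemoryStore implements BackingStore {
        final Map<String, String> rows = new HashMap<>();
        int loads = 0;

        public String load(String key) {
            loads++;
            return rows.get(key);
        }

        public void save(String key, String value) {
            rows.put(key, value);
        }
    }

    static class Cache {
        private final Map<String, String> memory = new ConcurrentHashMap<>();
        private final BackingStore store;

        Cache(BackingStore store) {
            this.store = store;
        }

        // Read-through: on a miss, load from the store and keep the value in memory.
        String get(String key) {
            String cached = memory.get(key);
            if (cached != null) {
                return cached;
            }
            String loaded = store.load(key);
            if (loaded != null) {
                memory.put(key, loaded);
            }
            return loaded;
        }

        // Write-through: persist first, then update the in-memory copy.
        void put(String key, String value) {
            store.save(key, value);
            memory.put(key, value);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.rows.put("job-1", "pending");
        Cache cache = new Cache(store);
        cache.get("job-1");          // miss: goes to the store
        cache.get("job-1");          // hit: served from memory
        cache.put("job-2", "done");  // written to the store and to memory
        System.out.println("store reads: " + store.loads);
    }
}
```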

      Another thing to discuss is grid management/monitoring.  There are usually APIs for expanding and shrinking a grid, monitoring the grid and applying various declarative policies for failures and other management tasks.  This is all toolkit dependent of course.
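
      As a hedged example of what a failure policy boils down to, here is a sketch of a simple failover rule: try the job on each node in turn, up to a maximum number of attempts.  The names (FailoverSketch, runWithFailover) and the node labels are made up; in a real toolkit you would normally declare such a policy in configuration rather than write it yourself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical failover policy, not taken from any toolkit's API.
class FailoverSketch {

    // Run the job on the first node; if it throws, fail over to the next
    // node, giving up after maxAttempts nodes have been tried.
    static <R> R runWithFailover(List<String> nodes, int maxAttempts,
                                 Function<String, R> job) {
        RuntimeException last = null;
        int attempts = Math.min(maxAttempts, nodes.size());
        for (int i = 0; i < attempts; i++) {
            try {
                return job.apply(nodes.get(i));
            } catch (RuntimeException e) {
                last = e; // this node failed; try the next one
            }
        }
        throw new IllegalStateException("all " + attempts + " attempts failed", last);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-a", "node-b", "node-c");
        String result = runWithFailover(nodes, 3, node -> {
            if (node.equals("node-a")) {
                throw new IllegalStateException(node + " is down");
            }
            return "ran on " + node;
        });
        System.out.println(result);
    }
}
```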

      In my experience, any system that needs to scale can benefit from a grid architecture.  Because it's an architecture, you can build most applications on top of it.  The benefits of course have to be greater than the drawbacks, as in most cases.  Grid apps are complex in nature.  Good toolkits abstract you from distributed computing concerns and probably make them just as easy to build as multithreaded applications, but anyone who's developed a large multithreaded application in an imperative language knows that it's ridiculously complex to get it right.  I won't even get into the world of testing multithreaded and distributed apps, which is basically horrendous at this time.

      I could write about it for the next few days, but why don't we just talk at the next JUG meeting?  Also, there is a monthly meeting this Wednesday at a coffee shop that David announced.  Srini and I have been meeting for months now and discussing different topics.  It's informal and a great way to discuss these topics.  Can you join us this Wednesday?

      Ilya


      From: Glenn Everitt <glenn.everitt@gmail.com>
      To: jug-detroit@yahoogroups.com
      Sent: Saturday, January 31, 2009 3:15:48 PM
      Subject: RE: [jug-detroit] Globus Toolkit

      Ilya:

      I don’t have a specific project in mind… yet ;-)  but I am interested in how grids are managed, how system resources are allocated, and how grids work with network attached storage (NAS) and network infrastructure.  Do you have any experience with problem determination in grid-based applications?  What types of failure modes are encountered in grid-based systems?  How large a configuration do you need to start exploring grid-based architectures (are two systems enough or do you need a dozen)?  Does grid computing work well with all problem domains, or only with specific problem sets that can be easily partitioned and have their results aggregated?  Do the grid toolkits you’ve used provide a way to know how many systems were allocated to “work” on a particular application?

       

      If you are still looking for a topic for your Lightning Talk I’d be very interested in grid based computing.

       

      Thanks for the response.  I’ll corner you at the next JUG so I can pester you with even more questions!

       

      Glenn Everitt

       


      From: jug-detroit@yahoogroups.com [mailto:jug-detroit@yahoogroups.com] On Behalf Of Ilya Sterin
      Sent: Wednesday, January 28, 2009 10:20 PM
      To: jug-detroit@yahoogroups.com
      Subject: Re: [jug-detroit] Globus Toolkit

       

      I don't have experience with Globus, but I have worked with various other grid toolkits.  Two of the recent ones have been GridGain and GigaSpaces.  GridGain is a compute grid infrastructure; GigaSpaces is a compute and data grid based on space architecture theory.  Either way, both are nice, and GridGain is open source and free.

       

      I know the GridGain founder, and he was one of the original folks who worked on Globus, so I think some concepts might be similar.

       

      Are you doing some grid projects there, Glenn?

       

      Ilya

       


      From: Glenn Everitt <glenn.everitt@gmail.com>
      To: jug-detroit@yahoogroups.com
      Sent: Wednesday, January 28, 2009 9:24:10 PM
      Subject: [jug-detroit] Globus Toolkit

       

      Does anyone have any experience with the Globus® Toolkit? It’s an open source software toolkit used for building grids.

       

      http://www.globus.org/toolkit/

       

       

      Glenn Everitt

       




    • Glenn Everitt
      Message 2 of 13 , Feb 1, 2009

        How about the Caribou listed below?  It is a little south of, and on the other side of Woodward from, the Borders in Birmingham.  The Bombay store is in the center of the plaza and is very noticeable from the street.  So, if that works for everyone, I’ll plan on being there around 7:00PM.

        Glenn Everitt

        Caribou Coffee

        31901 Woodward Ave., Meeting Room, Royal Oak, MI, 48073

        (248) 549-4591

        Located in Normandy Plaza on the West side of Woodward, just South of Normandy Rd. Caribou Coffee is at one end of the plaza. The Bombay store is also located in this shopping plaza.  Menu http://www.cariboucoffee.com/page/1/menu-nutrition.jsp

         

         

         



      • Ilya Sterin
        Message 3 of 13 , Feb 1, 2009
          Works for me.  



        • Glenn Everitt
          Message 4 of 13 , Feb 3, 2009

            Ilya:

            Thanks for the great overview of grid platforms, but I still have questions.  Looking forward to talking with you Wed at 7:00 at Caribou Coffee.  I guess I got lucky that our JUG has a grid expert!

             

            Thanks again

            Glenn Everitt

             


            From: jug-detroit@yahoogroups.com [mailto:jug-detroit@yahoogroups.com] On Behalf Of Ilya Sterin
            Sent: Monday, February 02, 2009 12:27 AM
            To: jug-detroit@yahoogroups.com
            Subject: Re: [jug-detroit] Globus Toolkit

             

            Works for me.  

             


            From: Glenn Everitt <glenn.everitt@ gmail.com>
            To: jug-detroit@ yahoogroups. com
            Sent: Sunday, February 1, 2009 5:37:02 PM
            Subject: RE: [jug-detroit] Globus Toolkit

            How about Caribou listed below it is a little south and on the other side of the Woodward from the Borders in Birmingham.  The Bombay store is in the center of the plaza and is very noticeable from the street.  So, if that’s works for everyone I’ll plan on being there around 7:00PM

            Glenn Everitt

            Caribou Coffee

            31901 Woodward Ave., Meeting Room, Royal Oak, MI, 48073

            (248) 549-4591

            Located in Normandy Plaza on the West side of Woodward, just South of Normandy Rd. Caribou Coffee is at one end of the plaza. The Bombay store is also located in this shopping plaza.  Menu http://www.caribouc offee.com/ page/1/menu- nutrition. jsp

             

             

             


            From: jug-detroit@ yahoogroups. com [mailto:jug- detroit@yahoogro ups.com] On Behalf Of Ilya Sterin
            Sent: Saturday, January 31, 2009 11:26 PM
            To: jug-detroit@ yahoogroups. com
            Subject: Re: [jug-detroit] Globus Toolkit

             

            Well, since it was Srini and I for a while, we used to alternate.  He'd come out here to Grosse Pointe one day and I would meet him in Birmingham the next day.  Woodward sounds fine for the next meeting.  We met a few times at the Borders cafe in Birmingham, but seating availability there varies.

             

            So this Wednesday at 7pm is fine with me if Srini and Glenn agree.  Pick any place around RO.

             

            Thanks.

             

            Ilya

             


            From: David Mckinnon <mckinnon.david@ymail.com>
            To: jug-detroit@yahoogroups.com
            Sent: Saturday, January 31, 2009 9:20:46 PM
            Subject: Re: [jug-detroit] Globus Toolkit

            Ilya,

            We never did pick a place.  Where do you and Srini meet usually? 

            There is a Caribou Coffee on Woodward that might be convenient for Glenn.   The one on Main in RO is probably pretty busy in the evenings.

            Let me know where you think we should meet and then we can post it to the mail list.

            I'd like to arrange meetings like this for the first Wednesday of every month if possible.  No presentations, just meeting and talking.  I think Srini said 7 PM-ish.

            Let me know what you think. 

            David

             


            From: Ilya Sterin <isterin@yahoo.com>
            To: jug-detroit@yahoogroups.com
            Sent: Saturday, January 31, 2009 4:10:54 PM
            Subject: Re: [jug-detroit] Globus Toolkit

            Glenn, it all depends on the grid framework you're using.  There are some generic distributed computing concepts that should be, and are, part of most grid frameworks, but things like disk storage are not necessarily something they all must support.

             

            There are two types of grids: compute grids and data grids.  Some frameworks provide both.  I think the main failure-recovery issues they must support, and many do, are the same ones that pertain to any dynamic infrastructure and distributed application.  Also, today, with the dynamic provisioning that clouds provide, there are even more things to support.  The major concerns with distributed computing failures have always been network partitions and crash recovery.  How do you handle a completely decentralized infrastructure when a network partition occurs?  There are many different approaches, and many of them are worth studying in depth, as these topics are highly academic and theoretical.  Group communication concepts are probably some of the more prevalent ones that deal with these concerns.  Take a look at the Spread Toolkit and the papers on extended virtual synchrony if you're interested in an in-depth tutorial.

             

            There is really no particular size you need to start with to explore the capabilities.  Most grid toolkits take advantage of multiple CPUs just as they take advantage of multiple physical nodes.  The good grid toolkits make this completely transparent.  When I was developing our grid architecture, I did it completely locally on my MacBook Pro and then migrated to EC2.  The process was about 95% transparent; the only things that had to change were configuration settings such as the communication protocol and the discovery protocol, and that's all declarative, at least in GigaSpaces and GridGain.

             

            Here are some of the things to consider.

             

            The simplest approach to gridifying tasks is through Map/Reduce.  It's very straightforward and is supported by many grid toolkits (Hadoop, GridGain, GigaSpaces).  If all you need is what Map/Reduce provides, then Hadoop fits that like a glove.  GridGain also has a great implementation of Map/Reduce.  Hadoop uses its own distributed file system, so most of the tasks have to be represented in a format writable to disk.  They have plugins for database persistence, but last time I looked at them they were of pretty poor quality.  GridGain supports adapters for most grid concerns, and you can declaratively plug in a JDBC, file system, or other adapter to be used in place of a data grid.  (GridGain only provides a compute grid, but it's the best I've seen so far.  It's free, unless you want to buy support.)
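To make the Map/Reduce idea concrete, here is a minimal sketch in plain Java.  It does not use the Hadoop, GridGain, or GigaSpaces APIs; the class and method names are made up for illustration, and a local thread pool stands in for grid nodes.  Each chunk of input is counted independently (map), and the partial counts are then merged (reduce).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a local thread pool stands in for grid nodes.
final class MapReduceSketch {

    // Map step: count the words in one chunk, independently of every other chunk.
    static Map<String, Integer> mapChunk(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : chunk.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reduce step: merge the partial counts from all chunks into one result.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }

    // Fan the chunks out to workers, wait for every partial result, then reduce.
    static Map<String, Integer> wordCount(List<String> chunks, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                futures.add(pool.submit(() -> mapChunk(chunk)));
            }
            List<Map<String, Integer>> partials = new ArrayList<>();
            for (Future<Map<String, Integer>> future : futures) {
                partials.add(future.get());
            }
            return reduce(partials);
        } catch (Exception e) {
            // A real grid toolkit would apply a failover policy here; this sketch just gives up.
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("grid compute grid", "data grid"), 2));
    }
}
```

On a real grid the map tasks would be shipped to remote nodes and the toolkit would handle retries and failover; the split/merge structure stays the same.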

             

            Now, you also want to think about whether you want to distribute your data.  If your data is currently persisted in a store that will not scale as you add compute nodes that read and persist that data, a data grid or data partitioning/sharding is a must.  Unfortunately, there is not a good open source implementation of a data grid out there; some commercial options are GigaSpaces and Oracle Coherence.  Most of them provide an in-memory data grid and have read-through/write-through support to a persistent data store.  This is great, because it's basically in-memory caching for compute operations with transparent persistence.
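The read-through/write-through behavior can be sketched on a single node in plain Java.  This is not the GigaSpaces or Coherence API; `BackingStore`, `InMemoryStore`, and `ReadWriteThroughCache` are hypothetical names used only for this example, and a real data grid would also partition and replicate the in-memory map across nodes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a database or other persistent store.
interface BackingStore<K, V> {
    V load(K key);
    void save(K key, V value);
}

// Toy store that records how often it is read, so cache hits are observable.
final class InMemoryStore<K, V> implements BackingStore<K, V> {
    final Map<K, V> rows = new HashMap<>();
    int loads = 0;

    public V load(K key) {
        loads++;
        return rows.get(key);
    }

    public void save(K key, V value) {
        rows.put(key, value);
    }
}

final class ReadWriteThroughCache<K, V> {
    private final Map<K, V> memory = new ConcurrentHashMap<>();
    private final BackingStore<K, V> store;

    ReadWriteThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    // Read-through: serve from memory, loading from the store only on a miss.
    V get(K key) {
        return memory.computeIfAbsent(key, store::load);
    }

    // Write-through: persist first, then update the in-memory copy.
    void put(K key, V value) {
        store.save(key, value);
        memory.put(key, value);
    }
}
```

Callers only ever talk to the cache, which is what makes the persistence "transparent": the second read of a key never reaches the store, and every write lands in both places.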

             

            Another thing to discuss is grid management/monitoring.  There are usually APIs for expanding and shrinking a grid, monitoring the grid, and applying various declarative policies for failures and other management tasks.  This is all toolkit dependent, of course.

             

            In my experience, any system that needs to scale can benefit from a grid architecture.  Because it's an architecture, you can build most applications on top of it.  The benefits, of course, have to outweigh the drawbacks, as with most things.  Grid apps are complex in nature.  Good toolkits abstract you from distributed computing concerns and probably make them about as easy to build as multithreaded applications, but anyone who's developed a large multithreaded application in an imperative language knows that it's ridiculously complex to get right.  I won't even get into the world of testing multithreaded and distributed apps, which is basically horrendous at this time.

             

            I could write about this for the next few days, but why don't we just talk at the next JUG meeting?  Also, there is a monthly meeting this Wednesday at a coffee shop that David announced.  Srini and I have been meeting for months now and discussing different topics.  It's informal and a great way to discuss these topics.  Can you join us this Wednesday?

             

            Ilya

             


            From: Glenn Everitt <glenn.everitt@gmail.com>
            To: jug-detroit@yahoogroups.com
            Sent: Saturday, January 31, 2009 3:15:48 PM
            Subject: RE: [jug-detroit] Globus Toolkit

            Ilya:

            I don’t have a specific project in mind… yet ;-)  but I am interested in how grids are managed, how system resources are allocated, and how grids work with network attached storage (NAS) and network infrastructure.  Do you have any experience with problem determination in grid-based applications?  What types of failure modes are encountered in grid-based systems?  How large a configuration do you need to start exploring grid-based architectures (are two systems enough, or do you need a dozen)?  Does grid computing work well with all problem domains, or only with specific problem sets that can be easily partitioned and their results aggregated?  Do the grid toolkits you’ve used provide a way to know how many systems were allocated to “work” on a particular application?

             

            If you are still looking for a topic for your Lightning Talk I’d be very interested in grid based computing.

             

            Thanks for the response.  I’ll corner you at the next JUG so I can pester you with even more questions!

             

            Glenn Everitt

             


            From: jug-detroit@yahoogroups.com [mailto:jug-detroit@yahoogroups.com] On Behalf Of Ilya Sterin
            Sent: Wednesday, January 28, 2009 10:20 PM
            To: jug-detroit@yahoogroups.com
            Subject: Re: [jug-detroit] Globus Toolkit

             

            I don't have experience with Globus, but I have worked with various other grid toolkits.  Two of the recent ones have been GridGain and GigaSpaces.  GridGain is a compute grid infrastructure; GigaSpaces is a compute and data grid based on space architecture theory.  Both are nice, and GridGain is open source and free.

             

            I know the GridGain founder, and he was one of the original folks who worked on Globus, so I think some concepts might be similar.

             

            Are you doing some grid projects there, Glenn?

             

            Ilya

             


            From: Glenn Everitt <glenn.everitt@gmail.com>
            To: jug-detroit@yahoogroups.com
            Sent: Wednesday, January 28, 2009 9:24:10 PM
            Subject: [jug-detroit] Globus Toolkit

             

            Does anyone have any experience with the Globus® Toolkit? It’s an open source software toolkit used for building grids.

             

            http://www.globus.org/toolkit/

             

             

            Glenn Everitt

             

             

             

             

             
