Re: Which programming language should I try to learn?

  • Tom
    Message 1 of 7 , Jun 24, 2013
      Steve,

      Just picking up on this thread a bit. Here's an interesting web page that rates
      popularity of programming languages:

      http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html

      Note C is at the top followed by Java, Objective-C, and C++. Objective-C and C++
      have recently swapped places. This is explained by the many developers who are moving
      to the more lucrative Apple platforms (iPhone, iPad, Mac) where Objective-C is effectively
      the native programming language. Number 4 is an interpreted language, and number 5,
      C#, was Microsoft's attempt to usurp Java.

      A major characteristic of genealogical software is the need to have persistence of objects
      in the programs. That is, you need to be able to create a software object to represent
      some genealogical concept (e.g., person, family) and you need that object to continue
      being alive within your program. It has to have a lifetime beyond the individual software
      modules that create it, display it or compute with it. Objects like these
      are called heap objects, meaning they exist in a largish, originally unstructured area of
      computer memory. The heap is carved up into smaller units of memory, to hold your
      objects, as the program proceeds. Because of this you need to use a programming
      language that helps you manage the heap. The popular "object-oriented" languages
      provide you with the support you need. These languages are Java, Objective-C, C++ and
      C#. All four of them actually manage to hide almost all issues of heap use and heap
      management from you. Note that C is not on the list. C is not object-oriented, and C
      has only a very simple interface to the heap. C can be used in an object-oriented manner
      by how you structure your code, and the heap interface of C (the malloc and free library
      functions) can be encapsulated in a higher layer abstraction in C, but using C in this
      manner is not for the beginner.
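
      As a rough illustration of that lifetime requirement, here is a minimal sketch in
      Python (chosen only for brevity; it is not one of the languages compared here). The
      Person and Family classes, the function names and the sample people are all made up
      for the example. The objects are created in one function, stay alive because the
      returned family refers to them, and are used later by a different function.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    name: str
    birth_year: Optional[int] = None


@dataclass
class Family:
    husband: Person
    wife: Person
    children: List[Person] = field(default_factory=list)


def create_family() -> Family:
    """Create the objects. They outlive this function because the
    returned Family still refers to them."""
    father = Person("John Smith", 1820)
    mother = Person("Mary Jones", 1824)
    family = Family(father, mother)
    family.children.append(Person("Ann Smith", 1846))
    return family


def describe(family: Family) -> str:
    """A separate 'module' that only displays objects created elsewhere."""
    kids = ", ".join(child.name for child in family.children)
    return f"{family.husband.name} and {family.wife.name}; children: {kids}"


family = create_family()
summary = describe(family)
print(summary)
```

      In C the same program would have to call malloc for each person and remember to call
      free for each one exactly once, which is the bookkeeping the other languages take over.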

      Your realistic options are Java, Objective-C and C++. A major advantage that Java
      and Objective-C have over C++ is complete, built-in support for the Unicode character
      set, especially important for genealogical applications. C++ provides Unicode support
      in a fairly ugly, add-on manner, so you can get Unicode in C++ but it takes a little work.
      I don't know enough about C# to know if Unicode is the native character format, but I
      would expect that it is.

      Though Java is theoretically an interpreted language, practically speaking this is not
      important. Java source code is translated into a virtual assembly language (bytecode),
      which is then interpreted when the program runs. However, all modern Java runtimes will
      translate the bytecode into the native machine code of the host computer, typically for
      the parts of the program that run most often, rather than interpreting everything. This
      practice gave rise to one of the more colorful terms in computing, the JIT compiler, JIT
      meaning "just in time".

      The issues of which programming language to use versus which database to use are
      orthogonal issues. Every major database system will provide software libraries that
      allow access to their databases from all major languages. Because Java and C are so
      popular, basically all databases have Java and C interface libraries. And here's a little
      secret. Objective-C is a superset of C, and C++ very nearly is -- they both started their
      lives as front-ends that first translated the program into pure C, which was then
      compiled by the C compiler. Therefore it is easy to use C libraries that access databases
      from Objective-C and C++ programs. The point is, you don't have to make your
      programming language decision based on your database decision or vice versa.

      Your database decision comes down to, more or less, whether you are going to use one
      of the relational databases (free ones being MySQL and SQLite), or one of the more
      modern "no-SQL" databases like MongoDB.

      Relational databases are best when the data is very, very regular. Personally I do not think
      of genealogical data as being regular enough. However all major genealogical programs
      force their data into relational form. You have seen some of the effects of this when
      trying to express names, dates and places in non-conventional formats. It is hard if not
      impossible to do this reasonably. Because of this I have never used a relational database
      in the systems I write. My LifeLines program, written almost 25 years ago, used a custom
      database that I wrote in which each record was, wait for it, wait for it, a GEDCOM syntax
      record. My database was an early example of the types of databases that are today
      called no-SQL databases. I think MongoDB is the best example of one of those today,
      and my future genealogical software will use MongoDB. The MongoDB people support
      a C library, which I use from Objective-C.
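
      To make the idea of "a record is a GEDCOM syntax record" concrete, here is a minimal
      sketch in Python of reading one such record into a nested structure. It is only an
      illustration and not the LifeLines code; the function names and the sample record are
      invented. It assumes the basic GEDCOM line form of a level number, an optional
      @xref@ identifier, a tag and an optional value.

```python
SAMPLE_RECORD = """
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE ABT 1820
2 PLAC Leeds, Yorkshire, England
1 FAMS @F1@
"""


def parse_line(line):
    """Split one GEDCOM line into (level, node)."""
    level, rest = line.strip().split(" ", 1)
    xref = None
    if rest.startswith("@"):
        # A leading @...@ token is the record's own identifier.
        xref, rest = rest.split(" ", 1)
    tag, _, value = rest.partition(" ")
    return int(level), {"xref": xref, "tag": tag, "value": value, "children": []}


def parse_record(text):
    """Build a tree from the lines of a single record."""
    stack = []  # stack[i] is the most recent node seen at level i
    for line in text.strip().splitlines():
        level, node = parse_line(line)
        del stack[level:]          # drop nodes at this level or deeper
        if stack:
            stack[-1]["children"].append(node)
        stack.append(node)
    return stack[0]


record = parse_record(SAMPLE_RECORD)
birth = next(c for c in record["children"] if c["tag"] == "BIRT")
print(record["xref"], [c["tag"] for c in record["children"]])
print(birth["children"][0]["value"])
```

      Note that the approximate date "ABT 1820" is kept as plain text inside the record, with
      no need for a date column that can hold it.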

      The arguments for a relational database boil down to arguments about speed of querying
      and power of the querying language. Speed involves indexing, and powerful querying
      involves richness of the querying language. SQL is the query language invented for
      relational databases. One of the arguments for relational databases is that they allow for
      very powerful indexing and use of SQL. However these are not really an issue anymore. All
      of the no-SQL databases provide indexing just as powerful as that in relational databases,
      and they all provide querying that is as effective as, or even better than, SQL style queries.

      Tom
    • Steve Hayes
      Message 2 of 7 , Jun 25, 2013
        On 24 Jun 2013 at 12:00, Tom wrote:

        > Just picking up on this thread a bit. Here's an interesting web page that
        > rates popularity of programming languages:
        >
        > http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
        >
        > Note C is at the top followed by Java, Objective-C, and C++. Objective-C and
        > C++ have recently swapped places. This is explained by the many developers who
        > are moving to the more lucrative Apple platforms (iPhone, iPad, Mac) where
        > Objective-C is effectively the native programming language. Number 4 is an
        > interpreted language, and number 5, C#, was Microsoft's attempt to usurp Java.
        >
        > A major characteristic of genealogical software is the need to have
        > persistence of objects in the programs. That is, you need to be able to create
        > a software object to represent some genealogical concept (e.g., person,
        > family) and you need that object to continue being alive within your program.
        > It has to have a lifetime beyond the individual software modules that create
        > them, or display them or compute with them. Objects like these are called heap
        > objects, meaning they exist in a largish, originally unstructured area of
        > computer memory. The heap is carved up into smaller units of memory, to hold
        > your objects, as the program proceeds. Because of this you need to use a
        > programming language that helps you manage the heap. The popular
        > "object-oriented" languages provide you with the support you need. These
        > languages are Java, Objective-C, C++ and C#. All four of them actually manage
        > to hide almost all issues of heap use and heap management from you. Note that
        > C is not on the list. C is not object-oriented, and C has only a very
        > simple interface to the heap. C can be used in an object-oriented manner by
        > how you structure your code, and the heap interface of C (the malloc and free
        > system calls) can be encapsulated in a higher layer abstraction in C, but
        > using C in this manner is not for the beginner.

        The way I understood it is that C and C++ allow (require) the programmer to
        manage memory, whereas in most of the others this is handled automatically by
        the compiler.

        And isn't the kind of information you refer to as objects ultimately stored on
        disk in a database, called from there and stored there again once it has been
        interrogated or manipulated?

        > Your database decision comes down to, more or less, whether you are going to
        > use one of the relational databases (free ones being mySQL and sqlite), or one
        > of the more modern "no-SQL" databases like MongoDB.

        I've been using non-relational databases up till now for this purpose.

        > Relational databases are best when the data is very, very regular. Personally
        > I do not think of genealogical data as being regular enough. However all major
        > genealogical programs force their data into relational form. You have seen
        > some of the effects of this when trying to express names, dates and places in
        > non-conventional formats. It is hard if not impossible to do this reasonably.
        > Because of this I have never used a relational database in the systems I
        > write. My LifeLines program, written almost 25 years ago used a custom
        > database that I wrote in which each record was, wait for it, wait for it, a
        > GEDCOM syntax record. My database was an early example of the types of
        > databases that are today called no-SQL databases. I think MongoDB is the best
        > example of one of those today, and my future genealogical software will use
        > MongoDB. The MongoDB people support a C library, which I use from Objective-C.
        >
        > The arguments for a relational database boil down to arguments about speed of
        > querying and power of the querying language. Speed involves indexing, and
        > powerful querying involves richness of the querying language. SQL is the query
        > language invented for relational databases. One of the arguments for
        > relational databases is that they allow for very powerful indexing and use of
        > SQL. However these are not really an issue anymore. All of the no-SQL
        > languages provide indexing just as powerful as that in relational databases,
        > and they all provide querying that is as effective or even better than SQL
        > style queries.

        I find that the non-relational database I use now (InMagic) works fine for
        speed. The disadvantage, vis-a-vis relational databases, is redundant data.


        --
        Steve Hayes
        E-mail: shayes@...
        Blog: http://khanya.wordpress.com
        Phone: 083-342-3563 or 012-333-6727
        Fax: 086-548-2525
      • Tom
        Message 3 of 7 , Jun 25, 2013
          --- In gensoft@yahoogroups.com, "Steve Hayes" <hayesstw@...> wrote:
          >
          > On 24 Jun 2013 at 12:00, Tom wrote:
          > >
          > > A major characteristic of genealogical software is the need to have
          > > persistence of objects in the programs. That is, you need to be able to create
          > > a software object to represent some genealogical concept (e.g., person,
          > > family) and you need that object to continue being alive within your program.
          > > It has to have a lifetime beyond the individual software modules that create
          > > them, or display them or compute with them. Objects like these are called heap
          > > objects, meaning they exist in a largish, originally unstructured area of
          > > computer memory. The heap is carved up into smaller units of memory, to hold
          > > your objects, as the program proceeds. Because of this you need to use a
          > > programming language that helps you manage the heap. The popular
          > > "object-oriented" languages provide you with the support you need. These
          > > languages are Java, Objective-C, C++ and C#. All four of them actually manage
          > > to hide almost all issues of heap use and heap management from you. Note that
          > > C is not on the list. C is not object-oriented, and C has only a very
          > > simple interface to the heap. C can be used in an object-oriented manner by
          > > how you structure your code, and the heap interface of C (the malloc and free
          > > system calls) can be encapsulated in a higher layer abstraction in C, but
          > > using C in this manner is not for the beginner.
          >
          > The way I understood it is that C and C++ allow (require) the programmer to
          > manage memory, whereas in most of the others this is handled automatically by
          > the compiler.
          >
          > And isn't the kind of information you refer to as objects ultimately stored on
          > disk in a database, called from there and stored there again once it has been
          > interrogated or manipulated?

          You are right about C and C++. Java has always been a language where the heap is
          managed for you (see note below), though strictly speaking the work is done by the
          runtime's garbage collector rather than by the compiler. Objective-C has evolved. In
          Objective-C you used to manage the heap using reference counting that you did by hand,
          making calls to retain, release and autorelease. The latest version of Objective-C has
          a feature called ARC (automatic reference counting) which essentially makes Objective-C
          equivalent to Java in terms of heap management. Basically the compiler inserts all
          the right heap management calls at the right places. I have shifted over to using ARC
          in all my development and it is wonderful.

          The proviso (the note below) is that programmers can still be stupid and the
          compilers can't help that. Heap management means creating new objects when
          needed and returning objects when no longer needed. Automatic schemes work by
          figuring out when an object will never be referred to again and then getting rid of it
          automatically. With reference counting, as in ARC, the compiler does this by inserting
          code that keeps track of how many references there are to a heap object. When this
          drops to zero the object is removed.
          But, programmers can (and often do) write code in which objects are still referred to but
          will never be used. A classic situation occurs when one object refers to another, and
          that other object refers back to the first object, but no other references to either
          object exist. Reference counting cannot recognize this case and the
          two objects persist for the life of the program, essentially as leaked memory. Tracing
          "garbage collection" algorithms, like the ones Java runtimes use, can find these
          cycles, but they are more expensive to run and can cause your programs to pause
          suddenly. Good monitoring tools used during development can find these problems for you.
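
          The cycle problem described above can be sketched in a few lines. This example
          uses Python rather than Objective-C or Java, and it assumes the standard CPython
          interpreter, which combines reference counting with a separate cycle-finding garbage
          collector that can be switched off. The Person class and the names are hypothetical.
          Weak references are used only to observe whether the objects still exist.

```python
import gc
import weakref


class Person:
    def __init__(self, name):
        self.name = name
        self.spouse = None


def make_couple():
    """Create two objects that refer to each other, then drop every
    outside reference to them when the function returns."""
    a = Person("John")
    b = Person("Mary")
    a.spouse = b
    b.spouse = a
    # Weak references let us check on the objects without keeping them alive.
    return weakref.ref(a), weakref.ref(b)


gc.disable()                     # leave reference counting on its own
john, mary = make_couple()

# Each object is still referred to by the other, so neither count
# reached zero and both are still in memory, although unreachable.
alive_after_refcounting = john() is not None and mary() is not None

gc.collect()                     # run the cycle-finding collector once
alive_after_gc = john() is not None or mary() is not None
gc.enable()

print(alive_after_refcounting, alive_after_gc)
```

          Under ARC the usual cure is to declare one of the two references weak, so that it
          does not count toward keeping the other object alive.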

          Tom
        • Tom
          Message 4 of 7 , Jun 25, 2013
            > And isn't the kind of information you refer to as objects ultimately stored on
            > disk in a database, called from there and stored there again once it has been
            > interrogated or manipulated?

            Forgot to respond to the above. Yes, most objects of interest in a genealogical program
            will be stored in a database. Issues boil down to the format of the data in a persistent
            database versus the format of an object in a running computer program. When going
            in either direction a transformation must occur. In a relational database, the on-disk
            format is all "tables" and there may be many tables involved in storing the information
            about a single person (the names will be in one table, the birth info in another, the
            source references in another, and so on and so on and so on). To bring a person into a
            program's memory so it can be computed with, database access code must be written to
            read all those tables and to construct an internal object out of the information found
            in all those tables.

            In most no-SQL databases, the objects are not stored in tables but in structures that
            can exactly mimic the structure that the objects should have when being processed
            in a program. Moving objects to and from one of these databases is trivial as no
            transformations are required. This is definitely the trend into the future.
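
            Here is a small sketch of the two round trips, in Python with its built-in sqlite3
            and json modules. The table layout, the function names and the sample person are
            invented for illustration, and a plain dictionary of JSON strings stands in for a
            document database; no real MongoDB calls are shown.

```python
import json
import sqlite3

person = {
    "id": "I1",
    "names": ["John Smith", "Jon Smyth"],
    "birth": {"date": "ABT 1820", "place": "Leeds, Yorkshire, England"},
}

# Relational form: the one person is spread over several tables.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE person (id TEXT PRIMARY KEY);
    CREATE TABLE name  (person_id TEXT, name TEXT);
    CREATE TABLE birth (person_id TEXT, date TEXT, place TEXT);
""")
db.execute("INSERT INTO person VALUES (?)", (person["id"],))
db.executemany("INSERT INTO name VALUES (?, ?)",
               [(person["id"], n) for n in person["names"]])
db.execute("INSERT INTO birth VALUES (?, ?, ?)",
           (person["id"], person["birth"]["date"], person["birth"]["place"]))


def load_person_relational(conn, person_id):
    """Read each table and reassemble the in-memory object by hand."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM name WHERE person_id = ? ORDER BY rowid", (person_id,))]
    date, place = conn.execute(
        "SELECT date, place FROM birth WHERE person_id = ?", (person_id,)).fetchone()
    return {"id": person_id, "names": names,
            "birth": {"date": date, "place": place}}


# Document form: the stored record has the same shape as the object.
documents = {person["id"]: json.dumps(person)}


def load_person_document(store, person_id):
    """One mechanical decoding step, with no per-field code."""
    return json.loads(store[person_id])


from_tables = load_person_relational(db, "I1")
from_document = load_person_document(documents, "I1")
print(from_tables == person, from_document == person)
```

            Every new kind of fact about a person means another table and more loading code on
            the relational side, while the document side stays the same.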

            Tom