bar or scan-code systems to aid insect curation

  • Julianna Tuell
    Message 1 of 9 , Aug 5, 2011

      Hello All,

      I am looking for reviews and/or suggestions for the most user-friendly bar or scan-code system for helping to inventory an insect collection. If you use a system like this and you really like it, where might one order it?

      Thanks!
      Julianna

      --
      Julianna Tuell, Ph.D.
      Department of Entomology
      Michigan State University
      202 CIPS Bldg.
      East Lansing, MI 48824

      Email: tuelljul@... or tuelljul@...

    • Doug Yanega
      Message 2 of 9 , Aug 5, 2011
        Julianna Tuell wrote:

        >I am looking for reviews and/or suggestions for the most
        >user-friendly bar or scan-code system for helping to inventory an
        >insect collection. If you use a system like this and you really like
        >it, where might one order it?

        At the risk of being exposed to public ridicule, I generally suggest,
        to anyone who asks me, that bar code systems should be avoided unless
        some VERY specific conditions are met:

        (1) you are and will be working exclusively with brand-new specimens
        that do not already have any labels on them whatsoever.

        (2) you are prepared to buy and store 1.5 times as many unit trays
        and museum drawers for a given number of specimens as you would if
        you were using non-barcoded database labels (*unless* all of your
        insects are over 15 mm long to begin with), as well as paying a lot
        more for a given number of labels.

        (3) your intent, when scanning, is to never have the person doing the
        scan spend extra time to check what is printed on the label to make
        sure it is scanning exactly as it appears.

        If you can answer "Yes, absolutely" to all of those criteria, then
        barcoded labels should work fine for you. Otherwise, I recommend that
        people avoid them. We have well over 300,000 specimens in our
        collection with unique specimen numbers, linked to a database, and
        they are not barcoded. It works just fine; it's at least as practical
        for specimens that already have labels (since barcodes need to be
        exposed in order for a scanner to read them, but normally have
        human-readable text, the simplest option is to *not bother* exposing
        them so they can be scanned, and just use the human-readable text -
        otherwise, one has to increase the storage space a minimum of 1.5
        times), and one can buy multiple reams of archival paper (which we
        know will last at least 200 years) for the cost of one roll of
        thermal-transfer plastic (whose archival properties are unknown).

        If you're wondering where the "1.5 times" number comes from: with
        very few exceptions, standard insect specimen labels do not exceed 15
        mm long. The SHORTEST full-data labels (that have barcodes alongside
        the date/locality text) that I have ever seen are 22 mm long after
        trimming, and can be as long as 25 mm - and the same figure applies
        to adding a barcode-only label underneath an existing 15-mm label in
        such a way that the scannable portion is sticking out - meaning that
        yes, it takes at least 1.5 times as much space to store your
        specimens if you use barcode labels, if not *more*. We already spend
        at least $2K a year on drawers for new material, and nearly another
        $1K on unit trays. If we used barcode labels, we'd have to spend an
        additional $1,500 or more - at a time when we're being asked to cut
        20% from our budget. That is what is called a "no-brainer". (And, of
        course, if we wanted to retroactively put barcodes on all our
        existing holdings, we would expand beyond our present storage
        capacity and have to ask for a new building, since there is literally
        nowhere to add more compactors). On top of that, we are spending no
        money on scanner hardware or software, and we haven't needed to buy
        new archival paper for our labels in over 15 years (you can print 624
        unique labels on a single sheet of paper, so we've gone through only
        about 500 sheets so far).
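The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above and assumes, as the post does, that storage space scales linearly with label length; the constant and function names are illustrative, not part of any system described here.

```python
# Figures quoted in the post. Linear scaling of space with label length
# is an assumption carried over from the argument above.
STANDARD_LABEL_MM = 15
BARCODE_LABEL_MM = (22, 25)        # shortest and longest barcoded labels mentioned
ANNUAL_DRAWERS_USD = 2000          # "at least $2K a year on drawers"
ANNUAL_UNIT_TRAYS_USD = 1000       # "nearly another $1K on unit trays"
LABELS_PER_SHEET = 624
SHEETS_USED = 500


def space_factor(barcode_mm: float, standard_mm: float = STANDARD_LABEL_MM) -> float:
    """Storage multiplier if space scales linearly with label length."""
    return barcode_mm / standard_mm


def extra_annual_cost(factor: float) -> float:
    """Additional yearly spend on drawers and unit trays at a given multiplier."""
    return (ANNUAL_DRAWERS_USD + ANNUAL_UNIT_TRAYS_USD) * (factor - 1)


if __name__ == "__main__":
    for mm in BARCODE_LABEL_MM:
        print(f"{mm} mm label: {space_factor(mm):.2f}x the space")
    print(f"Extra cost at 1.5x: ${extra_annual_cost(1.5):,.0f} per year")
    print(f"Labels printed so far: {LABELS_PER_SHEET * SHEETS_USED:,}")
```

Under these assumptions the multiplier falls between about 1.47 (22 mm) and 1.67 (25 mm), and 500 sheets at 624 labels each gives 312,000 labels, in line with the "well over 300,000" specimens mentioned earlier.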

        Barcode labels cost a LOT of extra money, and they almost never save
        time (the detailed analyses are very context-dependent and grueling
        to describe, and I will spare that for now); the only noteworthy
        thing they do is reduce the error rate when specimen numbers are
        being typed into a computer. The human error rate is, based on
        empirical data from using student workers, roughly one incorrect
        number per 1000 entries - and with nearly any sensibly-designed
        system and protocol, such errors are obvious and easily-corrected.
        Eliminating those errors by using a barcode scanner is nice, but
        extremely costly relative to the *minuscule* benefit. Note also that
        the thermal printers used to make barcode labels are such that they
        do have a *non-zero* error rate, because sometimes dots in the dot
        matrix fail to print, resulting in incorrect scans (you can't change
        a 1 to a 0 in a binary matrix without changing the data content).

        I don't expect anyone to agree with me - I'm used to getting all
        sorts of arguments that amount to little more than polite ways to
        inform me that I'm crazy - but there's a difference between how
        something looks as a *concept* and how it works in reality, and
        barcoded labels are precisely such a thing.

        Peace,
        --

        Doug Yanega Dept. of Entomology Entomology Research Museum
        Univ. of California, Riverside, CA 92521-0314 skype: dyanega
        phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
        http://cache.ucr.edu/~heraty/yanega.html
        "There are some enterprises in which a careful disorderliness
        is the true method" - Herman Melville, Moby Dick, Chap. 82
      • Griswold, Terry
        Message 3 of 9 , Aug 5, 2011

          Just to add to the conversation, there are definite issues in space required to house coded specimens. We are generating labels that are approx. 9x17mm with a matrix code on one end, printed on archival paper (but these are still bigger than our older labels). For new material this label includes the standard data, including floral records, and is generated at the same time as the specimen data is input into the database, so there are no inconsistencies between the physical label and the database. For retroactive data capture we generate a determination label with similar dimensions that includes the matrix code. These coded labels come with a cost: we have to use a 3.5-point font, which can make reading labels with the naked eye difficult. We find that for small to medium-sized bees we are usually able to scan the codes on these det labels without removing them from the unit trays.

           

          terry

           

          Terry Griswold

          USDA ARS Bee Biology & Systematics Laboratory
          Utah State University
          Logan, UT 84322-5310
          USA

          435.797.2526

          435.797.0461 Fax

           

          From: beemonitoring@yahoogroups.com [mailto:beemonitoring@yahoogroups.com] On Behalf Of Doug Yanega
          Sent: Friday, August 05, 2011 12:40 PM
          To: beemonitoring@yahoogroups.com
          Subject: Re: [beemonitoring] bar or scan-code systems to aid insect curation

           

           


        • Doug Yanega
          Message 4 of 9 , Aug 5, 2011
            Elizabeth Elle wrote:

            >I agree that #1 is a good criterion, but disagree a bit about #2 and
            >3. I print labels from Bartender (linked to our database in Access)
            >directly onto archival paper, and these labels are basically the
            >same size as 'normal' labels; they have all the location and
            >collection information plus a small (5mm x 5mm) barcode.

            Hmm. There are two things that your comments raise in my mind: (A) we
            also use Bartender, but someone, somewhere, told us originally (I was
            not directly involved at the time) that if we wanted to print out
            matrix barcode labels, it had to be on a $5K thermal printer, rather
            than on regular paper and a regular printer. I think maybe we were
            scammed, and are now stuck with a $5K white elephant. It prints 7
            labels a minute, and we've been told it's performing according to
            specs. Two faculty members use this system, but the Museum does not.
            (B) adding a 5 x 5 mm barcode to a label that is already 15 mm long
            does indeed lead to a 22-mm-long label. A few other folks have
            suggested shortening the labels by using tinier text, but then it
            becomes very very hard to read, and that's not user-friendly, since
            humans DO use those labels, especially humans borrowing those
            specimens who do not happen to have barcode scanners.

            >We mostly use the barcode and our handheld scanner to enter specimen
            >identifications into our database, linking the species name to the
            >unique numerical identifier on each label. Our system is heavily
            >derivative of the BBSL system, and for the work we do--mostly
            >ecological/fragmentation studies where we collect many thousands of
            >specimens in a given year--I find this reduces entry errors greatly.
            >I imagine that whether starting with a system like this makes sense
            >will depend greatly on how people collect things and how they want
            >to archive them.

            The list of complicating factors is pretty extensive, and even this
            brief description of your system points out two such factors: the
            first is that when one has specimens from thousands of different
            localities, the odds that a person mistyping a specimen number will
            turn up a record that happens to be from the same date and locality
            is very small, so they can easily catch their own error - but in your
            case, it sounds like you might have thousands of specimens with
            identical date/locality data, reducing the odds that pulling up the
            wrong record would make it obvious that it *was* the wrong record.
            That increases the utility of the barcodes in your context.

            The second factor is that you don't say whether you are labeling
            your specimens before or after some sort of rough morphospecies
            sorting; if they are largely sorted before labeling (this is
            something I do here), then the odds are greatly increased that any
            set of specimens belonging to the same taxon will have numbers in
            sequence, and this is one area where a human being who is handed a
            bunch of labels and asked to type in (or write down) the specimen
            numbers can actually go much faster than a person who is simply
            scanning one specimen after another rather than reading the labels
            themselves.

            Consider this experiment: I assign a student to scan a set of 50
            specimens that all belong to one species; they dutifully scan them,
            creating 50 find commands executed as part of one search, and it
            takes them about 5 minutes, depending in part on how nicely
            positioned the labels are in the unit tray (scan one, wait a second
            for the beep, perform a few keystrokes, scan the next, beep, a few
            more keystrokes, and so on). I hand the same unit to another student
            and then ask them to read the labels themselves and type them in, and
            they notice within a matter of seconds that the labels - while
            scrambled slightly as part of the experiment - form a sequence
            starting with 177521 and ending with 177570. They sit down, enter ONE
            find command (177521-177570), summon up all 50 records
            simultaneously, and they're finished in roughly one minute (most of
            that time being spent mentally going over the numbers, and only a few
            seconds actually typing). Which system is more efficient? Even if
            the student with the scanner had noticed the sequentiality, they
            still would have had to spend the same amount of time mentally going
            over the numbers to be sure they were all there, and the time
            difference between scanning in two 6-digit numbers and *typing* in
            two 6-digit numbers is negligible. There is no way that the person
            with the scanner, *in this experiment*, could get the task done
            significantly faster than the person typing. The maximum time
            advantage is when all 50 numbers are completely randomized, and even
            then, a student who can read and type 6-digit numbers can work almost
            as fast as one using a scanner (unless the *only* task is the entry
            of numbers, and few or no other keystrokes are performed) - which is
            why I maintain that, in practice, the only functional difference is
            the error rate.
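The shortcut in this experiment, spotting that scrambled numbers form one unbroken run and entering a single range, can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the `start-end` find syntax is taken from the example above rather than from any particular database.

```python
from typing import Iterable, List, Tuple


def collapse_to_ranges(numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse specimen numbers into inclusive (start, end) runs of consecutive values."""
    ranges: List[Tuple[int, int]] = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            # Extends the current run by one.
            ranges[-1] = (ranges[-1][0], n)
        else:
            # Gap found (or first number): start a new run.
            ranges.append((n, n))
    return ranges


def format_find(ranges: List[Tuple[int, int]]) -> str:
    """Render runs as a find expression such as '177521-177570'."""
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    scrambled = list(range(177570, 177520, -1))   # the 50 numbers, out of order
    print(format_find(collapse_to_ranges(scrambled)))
```

A run with a missing number comes out as two ranges, which is the check ("to be sure they were all there") that takes most of the minute in the experiment.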

            Peace,
          • Derek Sikes
            Message 5 of 9 , Aug 12, 2011
              A few comments from our system in the University of Alaska Museum
              Insect Collection:

              1. space: we use California Academy unit trays and in a square tray
              can fit a maximum of 3 rows of specimens in a tray (this is about 30
              specimens +/- depending on width). Our barcodes are visible from above
              so take up space. The CASC also fits 3 rows per unit tray but without
              barcodes their rows have empty space between them for safety of the
              specimens. Thus, our codes don't lead to an increase in storage space
              but do lead to a reduction in safety, at least relative to the CASC.
              These square trays can fit 5 rows if packed 100% with 15mm label-no
              barcode specimens (and some collections use this approach) so the
              space in question is 2/5ths, not 1/2 of that available.

              2. When we scan barcodes it's fast - there is only one search
              performed and the computer doesn't need to be manipulated between
              scans. We can scan any number of codes with no keystrokes in between.
              One of our largest scanning days involved 4,206 specimens scanned in 5
              hours. I would feel awful asking a student or team of students to type
              in that many numbers! (even if they could find some series that could
              speed entry - and I expect the error rate would be unacceptably high
              as the students' minds began to melt from the seemingly endless typing
              of myriad 12-digit numbers).
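The two points above reduce to simple arithmetic, sketched here with the quoted numbers only. The function names are illustrative, and the sketch assumes the 5 hours were spent entirely on scanning.

```python
def seconds_per_specimen(specimens: int, hours: float) -> float:
    """Average handling time per specimen, in seconds."""
    return hours * 3600 / specimens


def tray_multiplier(rows_without_barcodes: int, rows_with_barcodes: int) -> float:
    """How many times more unit trays are needed when fewer rows fit per tray."""
    return rows_without_barcodes / rows_with_barcodes


if __name__ == "__main__":
    print(f"Scanning: {seconds_per_specimen(4206, 5):.1f} s per specimen")
    print(f"Rows given up to barcodes: {(5 - 3) / 5:.0%}")
    print(f"Trays needed vs. a fully packed collection: {tray_multiplier(5, 3):.2f}x")
```

For comparison, the 50 specimens in about 5 minutes described earlier in the thread work out to 6 s each, though that workflow included keystrokes between scans.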

              I agree there are serious concerns and Doug has correctly pointed most
              of these out, but our (5 year) experience has so far been positive
              enough to recommend barcodes. Hopefully, we'll still recommend them 50
              years from now!

              -Derek Sikes





              --

              +++++++++++++++++++++++++++++++++++
              Derek S. Sikes, Curator of Insects
              Assistant Professor of Entomology
              University of Alaska Museum
              907 Yukon Drive
              Fairbanks, AK 99775-6960

              dssikes@...
              http://users.iab.uaf.edu/~derek_sikes/sikes_lab.htm

              phone: 907-474-6278
              FAX: 907-474-5469

              University of Alaska Museum -
              http://www.uaf.edu/museum/collections/ento/
              +++++++++++++++++++++++++++++++++++

              Interested in Alaskan Entomology? Join the Alaska Entomological Society and
              / or sign up for the email listserv "Alaska Entomological Network" at
              http://www.akentsoc.org/contact.php
            • H
              Message 6 of 9 , Aug 22, 2011

                All - sorry for the late addition to this discussion, but I've been avoiding it as much as I've been thinking about it.

                Here at the Bee Lab [Logan] we have used a progression of scanners and systems. Currently our database system involves using a SQL server [users work in MS ACCESS] dumping into a program called Bartender [made by Seagull: http://www.seagullscientific.com/aspx/welcome.aspx]. We no longer use "scanners" but instead have started using "Area Imagers" [http://www.honeywellaidc.com/en-US/Pages/Product.aspx?category=Area%20Imager&cat=HSM&pid=4820_ ]. Please send me an email directly [hikerd@...] if you would like a PDF example of the insect labels. We print 2D Data Matrix codes, not the typical barcode that most people are familiar with.

                Advantages? Yes... we like our current system and the scanner saves time in our situation. The current version of Bartender is very flexible. The database/Bartender is printing to a relatively inexpensive Brother HL-5370Dw series printer on 80 lb. archival paper. While we have one main label format that includes your basic who, on what, when, and where, we also have about a dozen other label formats for special projects. For example, one format prints two labels: our normal version plus a second, smaller number-only label to go with the separate DNA vial. Other versions include identification labels, holotype labels, pollen slide labels and even temporary plant voucher labels. One goal in this is to have everything from entry and analysis to final reports flow through the database. This leads to my next point: quality control.

                All of our project bees go through a Data Quality Check before the labels are put on the specimens. The process usually follows this progression:

                1. Project location is entered into an authority file [location table]
                2. Technician counts the specimens for each collection event
                3. Each collection event is entered into the database with number of specimens
                4. In the case of pan traps 10 entries can quickly be turned into 1500 labels.
                5. Data Quality Check - Summation sheet is printed out with Labels - so the Technician can easily proof the labels [4pt font] with the head labels and/or field notes.
                6. Entry errors are fixed and labels are reprinted if need be.
                7. Specimens are labeled.
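
The expansion in steps 3 to 5 (a handful of collection events becoming hundreds of uniquely numbered labels, plus a summation sheet for proofing) can be sketched roughly as below. This is only an illustration under stated assumptions, not the Bee Lab's actual database logic: the `CollectionEvent` fields, the `XMPL` prefix, the seven-digit numbering, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CollectionEvent:
    event_id: str        # hypothetical key into the location/event tables
    specimen_count: int  # number counted by the technician (step 2)


def expand_to_labels(events, prefix, next_number):
    """Return one (unique_id, event_id) pair per counted specimen."""
    labels = []
    for ev in events:
        for _ in range(ev.specimen_count):
            labels.append((f"{prefix}{next_number:07d}", ev.event_id))
            next_number += 1
    return labels


def summation_sheet(labels):
    """Count labels per event so they can be proofed against field notes."""
    counts = {}
    for _, event_id in labels:
        counts[event_id] = counts.get(event_id, 0) + 1
    return counts


# Ten pan-trap entries of 150 specimens each (made-up numbers).
events = [CollectionEvent(f"PAN-{i:02d}", 150) for i in range(1, 11)]
labels = expand_to_labels(events, "XMPL", 1)
print(len(labels))  # 1500
```

Comparing `summation_sheet(labels)` with the technician's counts before printing is one cheap way to catch a mistyped specimen count.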

                Normal batches like this are usually done for about 50 to 1500 specimens at a time. I cannot stress the data checks enough. If you have more than 2 people working within a data set, I will stress QC even more! Here is an example from the medical field: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2656002/
                In summary: "13.5% error rate in each of the databases." Maybe not the best example, but it paints the picture, and I'm assuming everyone here has had to clean a data set after finding what started out as a very small glitch.

                Labeling represents the first level of QC, followed by a 1% spot check of the specimens before sorting to the generic level and removal of the head label/field notes. When you invest time in building a database, using scanners, and/or handling large data sets, the only way to truly save time is to catch mistakes and clean the data set before analysis highlights the entry mistakes.
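
A 1% spot check like the one mentioned above could be drawn along these lines. This is a minimal sketch, assuming specimen numbers are held in a plain list; the function name, the `one_in` parameter, and the rule of always checking at least one specimen are assumptions, not a description of the lab's procedure.

```python
import random


def spot_check_sample(specimen_ids, one_in=100, seed=None):
    """Pick roughly one specimen in `one_in` (1% by default) for re-checking.

    The sample size is rounded up, so any non-empty batch yields at least
    one specimen. Passing a seed makes the draw repeatable.
    """
    ids = list(specimen_ids)
    if not ids:
        return []
    k = -(-len(ids) // one_in)  # integer ceiling division
    rng = random.Random(seed)
    return sorted(rng.sample(ids, k))


batch = [f"XMPL{n:07d}" for n in range(1, 1501)]  # hypothetical numbers
to_check = spot_check_sample(batch, seed=2011)
print(len(to_check))  # 15
```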
                 
                Other areas that have proven efficiency gains by using the Database/Scanner:

                • Specimen Identification input and thus Determination tracking
                • Loan processing
                • Flagging of questionable specimens or data entry

                Areas that will prove to be time sinks:

                • setup time
                • learning curve of a new system
                • Building authority tables [collector, locality, species... etc.]

                 Concerns/Priorities:

                • Unique number on every specimen.
                • Quality Control and Change Tracking
                • Flexibility of your system. It works great for the current project, but what about the next?
                • Data Legacy?

                Hope this helps,
                H



                HW Ikerd
                Hikerd@...
                435-227-5711 (Google Voice)
                435-797-2526(work)
                http://biocol.org/urn:lsid:biocol.org:col:33039


              • Doug Yanega
                Message 7 of 9 , Aug 22, 2011
                  Harold Ikerd wrote:

                  >Advantages? Yes... we like our current system and the scanner saves time in our situation. The current version of Bartender is very flexible. The database/bartender is printing to a relatively inexpensive Brother HL-5370Dw series printer on 80lb. archival paper.

                  Has anyone tested how well 2D matrices hold up over time, or what happens if the printer starts to get low on toner? We've noticed that even a high-end thermal printer will occasionally "drop out" black bits in the matrix, effectively turning binary "1"s into "0"s, and corrupting the resulting scan. This would be an important baseline to establish, as it's certain to be non-zero; but non-zero by how much?
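
One way to frame the "non-zero by how much?" question is a back-of-envelope model. Data Matrix symbols in the ECC 200 form include Reed-Solomon error-correction codewords, so a limited number of damaged codewords can be tolerated. The sketch below assumes that each module drops out independently with probability `p_module`, that a codeword covers 8 modules, and that a symbol stays readable while no more than `correctable` codewords are damaged. The symbol parameters in the example (8 codewords, 2 correctable) and the dropout rates are illustrative assumptions, not measurements from any printer, and real dropouts are probably clustered rather than independent.

```python
from math import comb


def p_codeword_damaged(p_module, modules_per_codeword=8):
    """Chance that at least one module of a codeword has dropped out."""
    return 1.0 - (1.0 - p_module) ** modules_per_codeword


def p_symbol_unreadable(p_module, total_codewords, correctable):
    """Chance that more codewords are damaged than can be corrected."""
    q = p_codeword_damaged(p_module)
    p_ok = sum(
        comb(total_codewords, k) * q**k * (1.0 - q) ** (total_codewords - k)
        for k in range(correctable + 1)
    )
    return 1.0 - p_ok


# Compare a few assumed per-module dropout rates for a small symbol.
for p in (0.001, 0.01, 0.05):
    print(p, p_symbol_unreadable(p, total_codewords=8, correctable=2))
```

An empirical baseline would still require printing and rescanning real labels over time; the model only shows how quickly the risk grows as print quality degrades.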

                  >This alludes to my next point, Quality Control.

                  All of these steps you outline are sensible and practical, though they are (or should be) part and parcel of using a database as an integral part of specimen processing, and can be done with *any* uniquely-numbered labels, not just *scannable* uniquely-numbered labels.

                  >Other areas that have proven efficiency gains by using the Database/Scanner:
                  >• Specimen Identification input and thus Determination tracking
                  >• Loan processing
                  >• Flagging of questionable specimens or data entry

                  In side-by-side comparisons of labels with human-readable numbers versus machine-readable numbers, the scanners don't always win; various factors can enable a person to work just as fast, or faster, by NOT relying upon a scanner (especially when dealing with legacy material, as I've also mentioned before - admittedly, having the scannable label on the top can significantly improve efficiency). As I've noted before, the only concrete and indisputable advantage is in error rate. The exact magnitude of that advantage depends (as noted above) on the data matrix being printed properly in the first place and not deteriorating over time. Perhaps more significantly, all errors, be they human or machine, SHOULD be caught by QC protocols in either case, for comparable amounts of effort. In other words, I'm not so certain that the efficiency gains of scanning *have* been proven. To *prove* it requires side-by-side comparisons using systems where the ONLY difference is whether a scannable label format is involved or not (same database, same types of specimens, organized the same way, processed by the same people); even those comparisons I've been able to do often required using two different students as test subjects, which introduces unwanted variation.

                  Peace,
                  -- 
                  

                  Doug Yanega        Dept. of Entomology         Entomology Research Museum
                  Univ. of California, Riverside, CA 92521-0314        skype: dyanega
                  phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
                               http://cache.ucr.edu/~heraty/yanega.html
                    "There are some enterprises in which a careful disorderliness
                          is the true method" - Herman Melville, Moby Dick, Chap. 82
                • Daniel Kjar
                  Message 8 of 9 , Aug 22, 2011
                    carve the label in spanglish, in marble. you have no other option. and make sure 8 people watch you do it. i just can't trust anything else....


                    -- 
                    Dr. Daniel Kjar
                    Assistant Professor of Biology
                    Division of Mathematics and Natural Sciences
                    Elmira College
                    1 Park Place
                    Elmira, NY 14901
                    607-735-1826
                    http://faculty.elmira.edu/dkjar
                    
                    "...humans send their young men to war; ants send their old ladies"
                    	-E. O. Wilson
                    
                  • John S. Ascher
                    Message 9 of 9 , Aug 22, 2011
                      I agree with Doug that retroactive data entry can be done efficiently and
                      accurately without the use of a scanner. We have matrix codes encoding our
                      specimen USIs, but our technicians find it easy to read the alphanumeric
                      codes on these USIs and to enter the USIs manually. They don't seem to
                      make frequent or serious errors in doing so (in part because our database
                      has built-in safeguards to prevent overwriting of data, inadvertent entry
                      of too many records at once, etc.) and all of them regard recording of the
                      USI as one of the quickest and easiest of their assigned tasks.

                      However, when dealing with proactive data entry systems or whenever one
                      must update species identifications or other information for a large
                      number of records, especially if these involve diverse taxa or collecting
                      events, it is extremely useful to have machine-readable matrix codes on
                      the USI labels. We find the scanner most useful when processing large
                      loans, especially complex ones such as those involving synoptic
                      collections.

                      We have the ability to make bulk updates to selected records using an Edit
                      Mode in our system, so in all but the most complex cases we can make
                      necessary name changes and make other updates quickly, easily, and
                      reliably, without having to read USIs from labels (whether manually or by
                      a scanner). Without such a "bulk update" functionality we would have far
                      more occasion to machine-read our USI labels.
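
As a rough illustration of the kind of "bulk update" with safeguards described above (no silent overwriting, no accidental edits to too many records at once), here is a minimal sketch. It is not the AMNH system: the in-memory dictionary standing in for the database, the field names, the 500-record limit, and the sample USIs are all hypothetical.

```python
def bulk_update(records, selected_usis, field, new_value,
                max_records=500, allow_overwrite=False):
    """Set `field` to `new_value` on every selected record, or change nothing.

    All checks run before any record is touched, so a refused update
    leaves the data exactly as it was.
    """
    selected = list(selected_usis)
    missing = [u for u in selected if u not in records]
    if missing:
        raise KeyError(f"unknown USIs: {missing}")
    if len(selected) > max_records:
        raise ValueError(f"{len(selected)} records selected; limit is {max_records}")
    if not allow_overwrite:
        conflicts = [u for u in selected
                     if records[u].get(field) not in (None, "", new_value)]
        if conflicts:
            raise ValueError(f"would overwrite existing data on: {conflicts}")
    for u in selected:
        records[u][field] = new_value
    return len(selected)


# Hypothetical records keyed by USI.
db = {
    "XMPL_BEE 00000001": {"species": ""},
    "XMPL_BEE 00000002": {"species": ""},
    "XMPL_BEE 00000003": {"species": "Apis mellifera"},
}
changed = bulk_update(db, ["XMPL_BEE 00000001", "XMPL_BEE 00000002"],
                      "species", "Bombus impatiens")
print(changed)  # 2
```

Requiring an explicit `allow_overwrite=True` for name changes keeps routine entry safe while still permitting deliberate redeterminations.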

                      Our labels have the USI alphanumeric written out as well as encoded in the
                      matrix, and this redundancy increases my confidence that most labels will
                      be readable by either a human or a machine in the distant future. The best
                      long-term insurance is to properly maintain and curate the specimens so
                      that these can be reexamined if associated data are lost or corrupted. It
                      should be obvious that archiving of specimens is even more important than
                      archiving associated data, but many data capture initiatives seem to
                      concern themselves more or only with the latter (the idea being, I suppose,
                      that once the data are recorded "permanently" we need not worry too much
                      about the physical specimen). It seems to me that essential curation both
                      before and after data entry is an under-appreciated aspect of existing and
                      planned databasing programs! I suppose that this is expected to be done
                      for free without dedicated external funding?

                      "Flagging of questionable specimens or data entry"

                      This is clearly a benefit of using a database and associated informatics
                      tools, but is this really a benefit derived from use of the scanner?

                      Having just visited the BBSL and glimpsed the size and rapid growth of
                      their collection, I can see that they really need to make optimal use of
                      available tools, including machine-readable labels, as H is doing, whereas
                      a smaller collection could muddle through with a less sophisticated
                      system.

                      Rather than entering or reading USIs, the more difficult challenges our
                      data entry personnel face that routinely cause serious errors include 1)
                      the difficulty of deciphering illegible labels and 2) the difficulty of
                      sorting out near-duplicate localities. Many resulting outright errors or
                      suboptimal decisions can be sorted out after the fact, without use of a
                      scanner at any point, once these are detected by robots (e.g., Discover
                      Life error-checking functions) and human experts. Use of a shared,
                      web-based system (one with many users, some of whom are active and expert
                      in taxonomy and/or geography) increases the likelihood that such
                      retroactive corrections will be made in a timely fashion.

                      As far as efficiency of data entry, a big time sink is the need to cut and
                      affix species determination and USI labels with due care to delicate
                      specimens.

                      Two of the best ways to improve both reliability and efficiency of data
                      capture, both of which we use in our AMNH system, are:

                      1) Bulk entry of as much data as possible from reliable,
                      centrally-maintained authority files so that data entry personnel select
                      from these rather than typing data (no data entry technician using our
                      system is allowed to type a bee name or a country name, ever!).

                      2) Full use of autocomplete and tab-through functionality (as available in
                      Mozilla Firefox) so that data bulk entered from the authority files
                      referred to above can be located and selected very quickly with few
                      keystrokes and with minimal or no use of a mouse.
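
The two points above (select from an authority file, never type a name; reach the entry with a few keystrokes) can be sketched as a simple prefix lookup. This is only an illustration, not the AMNH software or the browser's autocomplete: the `AuthorityFile` class, its method names, and the sample names are assumptions.

```python
from bisect import bisect_left


class AuthorityFile:
    """A sorted list of approved names that users select from, never type."""

    def __init__(self, names):
        self._names = sorted(set(names), key=str.casefold)
        self._keys = [n.casefold() for n in self._names]

    def complete(self, prefix, limit=10):
        """Return up to `limit` approved names starting with `prefix`."""
        key = prefix.casefold()
        matches = []
        for i in range(bisect_left(self._keys, key), len(self._keys)):
            if not self._keys[i].startswith(key):
                break
            matches.append(self._names[i])
            if len(matches) == limit:
                break
        return matches

    def select(self, prefix):
        """Accept a prefix only when it identifies exactly one approved name."""
        matches = self.complete(prefix, limit=2)
        if len(matches) != 1:
            raise ValueError(f"{prefix!r} matches {len(matches)} or more names")
        return matches[0]


bees = AuthorityFile(["Bombus impatiens", "Apis mellifera",
                      "Bombus griseocollis", "Andrena"])
print(bees.complete("bom"))  # ['Bombus griseocollis', 'Bombus impatiens']
print(bees.select("api"))    # Apis mellifera
```

Because `select` refuses ambiguous or unknown prefixes, a misspelled name cannot enter the record; it can only fail to match.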

                      These things are described in the following paper:

                      Schuh, R. T., Hewson-Smith, S., and J. S. Ascher. 2010. Specimen
                      databases: a case study in entomology using web-based software. American
                      Entomologist 56(4): 206-216.

                      I'm happy to send a pdf to anyone who's interested.

                      John





                      --
                      John S. Ascher, Ph.D.
                      Bee Database Project Manager
                      Division of Invertebrate Zoology
                      American Museum of Natural History
                      Central Park West @ 79th St.
                      New York, NY 10024-5192
                      work phone: 212-496-3447
                      mobile phone: 917-407-0378