
Re: [beemonitoring] bar or scan-code systems to aid insect curation

  • Doug Yanega
    Message 1 of 9 , Aug 22, 2011
      Harold Ikerd wrote:

      > Advantages? Yes... we like our current system and the scanner saves time in our situation. The current version of Bartender is very flexible. The database/bartender is printing to a relatively inexpensive Brother HL-5370Dw series printer on 80lb. archival paper.

      Has anyone tested how well 2D matrices hold up over time, or what happens if the printer starts to get low on toner? We've noticed that even a high-end thermal printer will occasionally "drop out" black bits in the matrix, effectively turning binary "1"s into "0"s, and corrupting the resulting scan. This would be an important baseline to establish, as it's certain to be non-zero; but non-zero by how much?
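To put a rough number on "non-zero by how much", here is a minimal sketch of the arithmetic, not a measurement. It assumes each dark module drops out independently with the same probability; both `p_dropout` and `dark_modules` are hypothetical inputs that would have to be measured for a given printer and symbol size. It also ignores the Reed-Solomon error correction built into Data Matrix ECC 200 symbols, so the result is better read as how often a label is damaged at all, not how often a scan actually fails.

```python
def prob_any_dropout(dark_modules: int, p_dropout: float) -> float:
    """Probability that at least one dark module fails to print.

    Assumes drop-outs are independent and equally likely for every dark
    module, which is a simplification (a failing print head or low toner
    may well produce clustered errors instead).
    """
    if dark_modules < 0:
        raise ValueError("dark_modules must be non-negative")
    if not 0.0 <= p_dropout <= 1.0:
        raise ValueError("p_dropout must be between 0 and 1")
    # Complement of "every dark module printed correctly".
    return 1.0 - (1.0 - p_dropout) ** dark_modules


def expected_damaged_labels(n_labels: int, dark_modules: int, p_dropout: float) -> float:
    """Expected number of labels in a print run with at least one drop-out."""
    return n_labels * prob_any_dropout(dark_modules, p_dropout)
```

Calling `expected_damaged_labels` with a measured drop-out rate and the number of labels in a print run would give a first estimate of the baseline asked about above.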

      > This alludes to my next point, Quality Control.

      All of these steps you outline are sensible and practical, though they are (or should be) part and parcel of using a database as an integral part of specimen processing, and can be done with *any* uniquely-numbered labels, not just *scannable* uniquely-numbered labels.

      > Other areas that have proven efficiency gains by using the Database/Scanner:
      > • Specimen Identification input and thus Determination tracking
      > • Loan processing
      > • Flagging of questionable specimens or data entry

      In side-by-side comparison of labels with human-readable numbers versus machine-readable numbers, the scanners don't always win; various factors can enable a person to work just as fast, or faster, by NOT relying upon a scanner (especially when dealing with legacy material, as I've also mentioned before - admittedly, having the scannable label on the top can significantly improve efficiency). As I've noted before, the only concrete and indisputable advantage is in error rate. The exact magnitude of that advantage depends (as noted above) on the data matrix being printed properly in the first place and not deteriorating over time. Perhaps more significantly, all errors, be they human or machine, SHOULD be caught by QC protocols in either case, for comparable amounts of effort. In other words, I'm not so certain that the efficiency gains of scanning *have* been proven. To *prove* it requires side-by-side comparisons using systems where the ONLY difference is whether a scannable label format is involved or not (same database, same types of specimens, organized the same way, processed by the same people); even those comparisons I've been able to do often required using two different students as test subjects, which introduces unwanted variation.

      Peace,
      -- 
      

      Doug Yanega        Dept. of Entomology         Entomology Research Museum
      Univ. of California, Riverside, CA 92521-0314        skype: dyanega
      phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
                   http://cache.ucr.edu/~heraty/yanega.html
        "There are some enterprises in which a careful disorderliness
              is the true method" - Herman Melville, Moby Dick, Chap. 82
    • Daniel Kjar
      Message 2 of 9 , Aug 22, 2011
        carve the label in spanglish, in marble. you have no other option. and make sure 8 people watch you do it. i just can't trust anything else....

        -- 
        Dr. Daniel Kjar
        Assistant Professor of Biology
        Division of Mathematics and Natural Sciences
        Elmira College
        1 Park Place
        Elmira, NY 14901
        607-735-1826
        http://faculty.elmira.edu/dkjar
        
        "...humans send their young men to war; ants send their old ladies"
        	-E. O. Wilson
        
      • John S. Ascher
        Message 3 of 9 , Aug 22, 2011
          I agree with Doug that retroactive data entry can be done efficiently and
          accurately without use of a scanner. We have matrix codes encoding our
          specimen USIs, but our technicians find it easy to read the alphanumeric
          codes on these USIs and to enter the USIs manually. They don't seem to
          make frequent or serious errors in doing so (in part because our database
          has built-in safeguards to prevent overwriting of data, inadvertent entry
          of too many records at once, etc.) and all of them regard recording of the
          USI as one of the quickest and easiest of their assigned tasks.
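As an illustration of the kind of safeguards mentioned above, here is a minimal sketch using Python's standard `sqlite3` module. It is not how the AMNH system works; the table layout, the `MAX_BATCH` cap, the `EntryRejected` exception and the USI strings used with it are all assumptions made for the example. The idea is simply that a uniqueness constraint refuses to overwrite an existing USI, and a batch-size check refuses an implausibly large entry.

```python
import sqlite3

MAX_BATCH = 50  # hypothetical cap on records accepted in one entry action


class EntryRejected(Exception):
    """Raised when a safeguard refuses a data-entry request."""


def open_db():
    """Create an in-memory database with one row per specimen USI."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE specimen (usi TEXT PRIMARY KEY, taxon TEXT NOT NULL)"
    )
    return conn


def add_records(conn, records, max_batch=MAX_BATCH):
    """Insert (usi, taxon) pairs, all or nothing.

    Rejects batches larger than max_batch, and rejects any batch that
    contains a USI already present instead of overwriting it.
    """
    records = list(records)
    if len(records) > max_batch:
        raise EntryRejected(
            f"{len(records)} records exceeds the limit of {max_batch}"
        )
    try:
        # "with conn" commits on success and rolls back on any exception,
        # so a rejected batch leaves the table unchanged.
        with conn:
            conn.executemany(
                "INSERT INTO specimen (usi, taxon) VALUES (?, ?)", records
            )
    except sqlite3.IntegrityError as exc:
        raise EntryRejected(f"refusing to overwrite existing USI: {exc}") from exc
    return len(records)
```

With checks like these, a mistyped USI that collides with an existing record is stopped at entry time, whether it came from a keyboard or a scanner.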

          However, when dealing with proactive data entry systems or whenever one
          must update species identifications or other information for a large
          number of records, especially if these involve diverse taxa or collecting
          events, it is extremely useful to have machine-readable matrix codes on
          the USI labels. We find the scanner most useful when processing large
          loans, especially complex ones such as those involving synoptic
          collections.

          We have the ability to make bulk updates to selected records using an Edit
          Mode in our system, so in all but the most complex cases we can make
          necessary name changes and make other updates quickly, easily, and
          reliably, without having to read USIs from labels (whether manually or by
          a scanner). Without such a "bulk update" functionality we would have far
          more occasion to machine-read our USI labels.
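A "bulk update" of this sort can be sketched as a single query-driven change, so that no label has to be read at all. The example below is only an illustration with `sqlite3`; the table, the `event` column, the placeholder taxon names and the function name `bulk_redetermine` are assumptions for the sketch, not a description of the Edit Mode in our system.

```python
import sqlite3


def demo_db():
    """Build a small in-memory table with placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE specimen (usi TEXT PRIMARY KEY, taxon TEXT, event TEXT)"
    )
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?, ?)",
        [
            ("USI-0001", "Genus alpha", "EV1"),
            ("USI-0002", "Genus alpha", "EV1"),
            ("USI-0003", "Genus alpha", "EV2"),
            ("USI-0004", "Genus beta", "EV1"),
        ],
    )
    conn.commit()
    return conn


def bulk_redetermine(conn, old_taxon, new_taxon, event=None):
    """Change the determination of every matching record in one step.

    Records are selected by their current name (and optionally by
    collecting event), not by reading USIs from labels.
    Returns the number of records changed.
    """
    sql = "UPDATE specimen SET taxon = ? WHERE taxon = ?"
    params = [new_taxon, old_taxon]
    if event is not None:
        sql += " AND event = ?"
        params.append(event)
    with conn:  # commit on success, roll back on error
        cur = conn.execute(sql, params)
    return cur.rowcount
```

The scanner becomes necessary mainly when the records to change cannot be described by a query like this, for example a mixed loan pulled specimen by specimen.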

          Our labels have the USI alphanumeric written out as well as encoded in the
          matrix, and this redundancy increases my confidence that most labels will
          be readable by either a human or a machine in the distant future. The best
          long-term insurance is to properly maintain and curate the specimens so
          that these can be reexamined if associated data are lost or corrupted. It
          should be obvious that archiving of specimens is even more important than
          archiving associated data, but many data capture initiatives seem to
          concern themselves more or only with the latter (the idea being I suppose
          that once the data are recorded "permanently" we need not worry too much
          about the physical specimen). It seems to me that essential curation both
          before and after data entry is an under-appreciated aspect of existing and
          planned databasing programs! I suppose that this is expected to be done
          for free without dedicated external funding?

          "Flagging of questionable specimens or data entry"

          This is clearly a benefit of using a database and associated informatics
          tools, but is this really a benefit derived from use of the scanner?

          Having just visited the BBSL and glimpsed the size and rapid growth of
          their collection, I can see that they really need to make optimal use of
          available tools, including machine-readable labels, as H is doing, whereas
          a smaller collection could muddle through with a less sophisticated
          system.

          Rather than entering or reading USIs, the more difficult challenges our
          data entry personnel face that routinely cause serious errors include 1)
          the difficulty of deciphering illegible labels and 2) the difficulty of
          sorting out near-duplicate localities. Many resulting outright errors or
          suboptimal decisions can be sorted out after the fact, without use of a
          scanner at any point, once these are detected by robots (e.g., Discover
          Life error-checking functions) and human experts. Use of a shared,
          web-based system (one with many users some of which are active and expert
          in taxonomy and/or geography) increases the likelihood that such
          retroactive corrections will be made in a timely fashion.

          As far as efficiency of data entry, a big time sink is the need to cut and
          affix species determination and USI labels with due care to delicate
          specimens.

          Two of the best ways to improve both reliability and efficiency of data
          capture, both of which we use in our AMNH system, are:

          1) Bulk entry of as much data as possible from reliable,
          centrally-maintained authority files so that data entry personnel select
          from these rather than typing data (no data entry technician using our
          system is allowed to type a bee name or a country name, ever!).

          2) Full use of autocomplete and tab-through functionality (as available in
          Mozilla Firefox) so that data bulk entered from the authority files
          referred to above can be located and selected very quickly with few
          keystrokes and with minimal or no use of a mouse.
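The two points above can be sketched together: names come only from an authority list, and a few typed characters narrow the list to the entry to select. This is a minimal illustration, not the AMNH software or the browser's autocomplete; the class name `AuthorityList` and its methods are hypothetical.

```python
import bisect


class AuthorityList:
    """A sorted list of approved names with prefix lookup.

    Data entry selects from this list; free-typed values are only
    accepted if they match an approved name exactly (ignoring case).
    """

    def __init__(self, names):
        cleaned = {n.strip() for n in names if n.strip()}
        # Sort by case-folded form so that prefix matches are contiguous.
        self._entries = sorted(cleaned, key=str.casefold)
        self._keys = [n.casefold() for n in self._entries]

    def complete(self, prefix, limit=10):
        """Return up to `limit` approved names starting with `prefix`."""
        key = prefix.casefold()
        start = bisect.bisect_left(self._keys, key)
        matches = []
        for i in range(start, len(self._keys)):
            if len(matches) >= limit or not self._keys[i].startswith(key):
                break
            matches.append(self._entries[i])
        return matches

    def validate(self, value):
        """Return the approved spelling of `value`, or raise ValueError."""
        key = value.strip().casefold()
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._entries[i]
        raise ValueError(f"{value!r} is not in the authority list")
```

For example, with an authority list holding "Apis mellifera", "Bombus griseocollis" and "Bombus impatiens", `complete("bomb")` offers the two Bombus names, while `validate("Apis melifera")` (misspelled) is refused and never reaches the database.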

          These things are described in the following paper:

          Schuh, R. T., Hewson-Smith, S., and J. S. Ascher. 2010. Specimen
          databases: a case study in entomology using web-based software. American
          Entomologist 56(4): 206-216.

          I'm happy to send a pdf to anyone who's interested.

          John




          --
          John S. Ascher, Ph.D.
          Bee Database Project Manager
          Division of Invertebrate Zoology
          American Museum of Natural History
          Central Park West @ 79th St.
          New York, NY 10024-5192
          work phone: 212-496-3447
          mobile phone: 917-407-0378