Re: DoseUtility

  • dclunie99
    Message 1 of 9 , Nov 7, 2010
      Hi Ed

      I have been doing a lot of tuning of the regular expression patterns for the Siemens OCR recently based on a wider variety of samples, in preparation for an RSNA demonstration, so you may find the 20101116 build more reliable in this respect.

      The DoseUtility itself is intended to be a test tool, and the toolkit contains specific classes that abstract the reported information (whether it comes from the OCR process, reading an RDSR or some other source).

      My intention is that developers and integrators can then subclass com.pixelmed.dose.CTDose and its various supporting classes to add their own methods that convert the attributes of the classes into the form necessary to insert into a database or reporting system. The toString() and getStructuredReport() methods should make it clear what to do.

      I should probably add a toCSV() method as well, even if it is only as an example.
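      A minimal sketch of what such a CSV export could look like, written in Python rather than against the PixelMed classes. The `Acquisition` and `DoseSummary` types and all of their field names are hypothetical stand-ins for illustration; they are not the attributes of com.pixelmed.dose.CTDose.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class Acquisition:
    # Hypothetical per-series record; field names are illustrative only.
    series_description: str
    ctdi_vol_mgy: float
    dlp_mgy_cm: float


@dataclass
class DoseSummary:
    # Hypothetical study-level record: the total DLP plus its acquisitions.
    total_dlp_mgy_cm: float
    acquisitions: List[Acquisition] = field(default_factory=list)

    def to_csv(self) -> str:
        """Return one header row plus one CSV row per acquisition."""
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(
            ["series", "ctdi_vol_mgy", "dlp_mgy_cm", "total_dlp_mgy_cm"]
        )
        for acq in self.acquisitions:
            writer.writerow(
                [acq.series_description, acq.ctdi_vol_mgy,
                 acq.dlp_mgy_cm, self.total_dlp_mgy_cm]
            )
        return buffer.getvalue()
```

      Using the csv module rather than joining strings by hand means that a series description containing a comma is quoted correctly.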

      David

      PS. When you say it doesn't "label" correctly, can you elaborate ? Send along an example of bad behavior if you come across it.

      --- In pixelmed_dicom@yahoogroups.com, "edmundmcdonagh" <ed.mcdonagh@...> wrote:
      >
      > Hi
      >
      > I've been experimenting with DoseUtility, and it looks to be very useful, and quite impressive! However, I have one problem, and one question...
      >
      > Problem: With GE CT Dose Images, it does very well. With Siemens CT images, it picks up the total dose accurately, but struggles with the various series. Some it picks up, others it misses or doesn't label correctly.
      >
      > Is this possibly due to a limited range of expected words being looked for? Do I need to edit the OCR_Glyphs_DoseScreen.xml file or something, and if so how?
      >
      > Question: If I can get the accuracy and completeness up, I'd very much like to be able to install this on a local server and throw images at it as part of a wider dose extraction task. This much I think I can do. However, I'd then need to be able to get the extracted dose information out of DoseUtility in a structured fashion (even just a csv string) to feed into another file or database or somesuch.
    • Ed McDonagh
      Message 2 of 9 , Nov 9, 2010
        Thank you David.

        I will look forward to the 20101116 version, and in the mean time work
        out how I might use the java classes you describe for my particular
        application.

        To elaborate on the labelling and missing series thing - the utility
        seems to only pick up one series generally, when there might be several.
        I haven't worked out why any particular series are picked up and others
        ignored.

        By label, I am referring to the series description. In the report, there
        is the word 'Unknown' where I might expect there to be the series
        description.

        I was therefore wondering whether my images contain exam descriptions
        that the OCR utility was not expecting, and so fails to find. However,
        it might be that this is what you have improved with the regex tuning?

        I have emailed you privately with some examples.

        Good luck for RSNA - is this part of a presented paper, a poster, or
        something else?

        Kind regards, and thank you for creating this utility and making it
        Free.

        Ed

        #########################################################################
        Attention:
        This e-mail and any attachment is for authorised use by the intended
        recipient(s) only. It may contain proprietary, confidential and/or
        privileged information and should not be copied, disclosed, distributed,
        retained or used by any other party. If you are not an intended recipient
        please notify the sender immediately and delete this e-mail (including
        attachments and copies).

        The statements and opinions expressed in this e-mail are those of the
        author and do not necessarily reflect those of the Royal Marsden NHS
        Foundation Trust. The Trust does not take any responsibility for the
        statements and opinions of the author.

        Website: http://www.royalmarsden.nhs.uk
        #########################################################################
      • Ed McDonagh
        Message 3 of 9 , Nov 10, 2010
          Hi David

          I have tested out the new webstart version of the tool, and it now finds
          all the appropriate series in the dose image without getting tripped up
          by the D suffix on Dual Source series. Thank you for making the changes!

          I would now like to return to the other issue. Is it intentional that
          where I might expect to see the series description (Control Scan,
          TestBolus, DS_Coronary), the dose report reports 'Unknown'? Also,
          whereas the scan of the images extracts kV, mA, rotation time and
          slice thickness, this information is not extracted from the dose
          image. Again, is this intentional?

          Kind regards,

          Ed
        • David Clunie
          Message 4 of 9 , Nov 10, 2010
            Hi Ed

            Since I began this working with GE screens only, which don't include
            the technique in the screen, I designed it to extract the technique
            from the acquired images.

            Siemens does include some technique information in their screens, but
            not everything I wanted to try to build a (mostly) valid SR, so I
            still use their acquired images, and have not yet attempted to "merge"
            the OCR'd technique information with what is gleaned from the acquired
            images, though one day plan to.
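            One way such a merge might be sketched, in Python and independent of the toolkit: take the OCR'd values as a baseline and let any value found in the acquired image headers override them. The dictionary keys are hypothetical, and preferring the header values is an assumption made for this example, not a description of how DoseUtility behaves.

```python
def merge_technique(from_images, from_ocr):
    """Prefer values from acquired image headers; fall back to OCR'd values.

    Both arguments are plain dicts with hypothetical keys such as "kvp".
    A header value of None counts as missing.
    """
    merged = dict(from_ocr)
    merged.update({k: v for k, v in from_images.items() if v is not None})
    return merged
```

            When no acquired images are available, passing an empty dict as `from_images` simply returns the OCR'd values unchanged.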

            Likewise, for the protocol summary string at the start of the line,
            since I can get the full Series Description from the acquired images, I
            haven't bothered with it.

            But I can see how both of these could be useful when one doesn't have
            the acquired images, or there isn't value in the extra time taken to
            retrieve and process them.

            So both are on my list of things to do.

            David

          • Jurgen Jacobs
            Message 5 of 9 , Nov 10, 2010
              Hi David, Ed,

              I just started to follow your interesting discussion. I wasn't aware of your DoseUtility class and have just checked it out. Very nice work.

              At the start of this year we built a system to auto-extract dose information from CT series, based on SR and dose reports. For the dose reports I used an external, commercial OCR package. The biggest problems we faced were with Philips and, mainly, Toshiba, as the Aquilion has two dose reports containing different information. Also, I'm not sure you really need all the information in the report. In our implementation, we blanked out large parts of the dose report, also to increase the OCR accuracy.

              If you are interested, we'll have a presentation at RSNA on exactly this topic (implementation, usage and pediatric dosimetry results; one of the physics CT sessions). When will your talk be? If I can help you with additional test data, please let me know.

              Thanks for the great library.

              Jurgen

            • David Clunie
              Message 6 of 9 , Nov 11, 2010
                Hi Jurgen

                On 11/10/10 3:20 PM, Jurgen Jacobs wrote:

                > I just started to follow your interesting discussion. I wasn't aware of your
                > DoseUtility class and I just checked it out. Very nice work. At the start of
                > this year we've built a system to auto-extract dose information from CT
                > series, based on SR and dose reports.

                It seems that quite a number of folks have implemented this OCR approach. You
                may have heard of the work by Tessa Cook and Bill Boonn:

                http://www.jacr.org/article/S1546-1440%2810%2900377-7/abstract

                and also George Shih at Weill Cornell (Valkyrie):

                http://www.arrs.org/Pressroom/info.cfm?prID=467

                > For the dose reports I used an
                > external, commercial OCR package.

                Since I didn't want to have any platform or commercial dependencies, and
                I couldn't make any of the public domain packages work immediately, and
                since the text is already binary, not rotated and consists of a few
                regular sized fonts, it was little effort to do this from scratch, but
                this does mean that it won't handle lossy compressed dose screens.
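                To illustrate why binary, unrotated, fixed-font text makes a from-scratch approach feasible, here is a toy Python sketch of exact-bitmap glyph matching. It is not the toolkit's implementation: the three-row glyph table, the "#" and "." pixel notation and the function names are all invented for the example.

```python
# Toy glyph table: each key is a tuple of pixel rows ("#" = set pixel).
# A real table would hold every character of every font that is used.
GLYPHS = {
    ("#", "#", "#"): "1",
    ("###", "#.#", "###"): "0",
    ("##", ".#", ".#"): "7",
}


def split_glyphs(rows):
    """Split one binary text line into glyph bitmaps at blank pixel columns."""
    width = len(rows[0])
    glyphs, start = [], None
    for x in range(width + 1):
        blank = x == width or all(row[x] == "." for row in rows)
        if not blank and start is None:
            start = x  # first column of a new glyph
        elif blank and start is not None:
            glyphs.append(tuple(row[start:x] for row in rows))
            start = None
    return glyphs


def recognise(rows):
    """Look each glyph up by exact bitmap; unknown glyphs become '?'."""
    return "".join(GLYPHS.get(glyph, "?") for glyph in split_glyphs(rows))
```

                Because the lookup is exact, a single altered pixel (as lossy compression can produce) makes a glyph unrecognisable, which is the limitation mentioned above.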

                > The biggest problem we faced were Philips
                > and mainly Toshiba as the Aquilion has two dose reports containing different
                > information.

                I haven't done much work on the Toshiba screens yet, since I haven't
                dealt with the multiple pages. Graham Warden has contributed some code
                for this but I haven't had a chance to incorporate it yet, but I plan
                to revisit this after RSNA.

                As for Philips, I found that I didn't need to do OCR at all; everything
                I needed was found in the Exposure Dose Sequence attribute within the
                header of the dose screen, or in some cases localizer when the dose
                screen was absent. See com.pixelmed.doseocr.ExposureDoseSequence.

                > Also I'm not sure if you really need all information in the
                > report.

                The total DLP, and the DLP and CTDIvol per acquisition, would seem
                to be the key pieces of information, but my goal was to construct as
                nearly complete a DICOM Radiation Dose SR object as possible, so I
                try to extract more than this, as well as processing the acquired
                image headers to get technique and positioning information where
                possible.
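                Having both the total and the per-acquisition DLP also allows a simple cross-check for missed series. The Python sketch below assumes that the reported total is the plain sum of the per-acquisition values, which may not hold for every scanner or when different phantoms are involved; the function name and the tolerance are arbitrary choices for illustration.

```python
def dlp_total_is_consistent(total_dlp, acquisition_dlps, tolerance=1.0):
    """Return True when the per-acquisition DLPs add up to the reported total.

    tolerance is in the same unit as the DLP values and absorbs rounding on
    the dose screen; the default of 1.0 is an arbitrary choice.
    """
    return abs(total_dlp - sum(acquisition_dlps)) <= tolerance
```

                A False result would suggest that at least one series line was not picked up or was misread.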

                > In our implementation, we blanked out large parts of the dose
                > report, also to increase the OCR accuracy.

                I just ignore the OCR output that doesn't match the lines that I am
                looking for (i.e., it is a two step process of 1) OCR everything, then
                2) apply a regular expression to match the lines that contain the
                information needed).
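                The second step can be sketched in Python as follows. The column layout encoded in `SERIES_LINE` (description, scan number with an optional "D" suffix for dual source, kV, CTDIvol, DLP) is a made-up example, not the pattern the toolkit uses, and real dose screens differ by vendor and software version.

```python
import re

# Hypothetical series line: description, scan number (optional "D" suffix
# for dual source), kV, CTDIvol, DLP. Invented for illustration only.
SERIES_LINE = re.compile(
    r"^(?P<description>.+?)\s+"
    r"(?P<scan>\d+)(?P<dual>D?)\s+"
    r"(?P<kv>\d+)\s+"
    r"(?P<ctdivol>\d+\.\d+)\s+"
    r"(?P<dlp>\d+(?:\.\d+)?)$"
)


def extract_series(ocr_lines):
    """Keep only the OCR'd lines that match the expected series pattern."""
    series = []
    for line in ocr_lines:
        match = SERIES_LINE.match(line.strip())
        if match is None:
            continue  # unrelated OCR output is simply ignored
        series.append({
            "description": match.group("description"),
            "scan": int(match.group("scan")),
            "dual_source": match.group("dual") == "D",
            "kv": int(match.group("kv")),
            "ctdivol": float(match.group("ctdivol")),
            "dlp": float(match.group("dlp")),
        })
    return series
```

                With this design nothing needs to be blanked out beforehand; lines that do not fit the pattern are discarded after the OCR step.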

                > If you are interested, we'll have
                > a presentation at RSNA on exactly this topic (implementation, usage and
                > pediatric dosimetry results; one of the physics CT sessions). When will your
                > talk be ? If I can help you with additional test data, please let me know.

                Looking forward to hearing your talk; Wed at 3pm, right ?

                I am not talking, just participating in a dose informatics demonstration
                that is part of the RSNA's Image Sharing demonstration, which runs all
                week, and should be easy to find.

                If you have a web page or place to download your code or documents, let
                me know, and I will add a link at:

                https://sites.google.com/site/medimgraddoseinformatics/

                which is the site for this group, which you might be interested in
                joining:

                http://groups.google.com/group/medical-imaging-radiation-dose-informatics

                David
              • Jurgen Jacobs
                Message 7 of 9 , Nov 24, 2010
                  Hi David,

                  On Thu, Nov 11, 2010 at 11:15 AM, David Clunie <dclunie@...> wrote:
                  > Hi Jurgen
                  >
                  > On 11/10/10 3:20 PM, Jurgen Jacobs wrote:
                  >
                  >> I just started to follow your interesting discussion. I wasn't aware of your
                  >> DoseUtility class and I just checked it out. Very nice work. At the start of
                  >> this year we've built a system to auto-extract dose information from CT
                  >> series, based on SR and dose reports.
                  >
                  > It seems that quite a number of folks have implemented this OCR approach. You
                  > may have heard of the work by Tessa Cook and Bill Boonn:
                  >
                  >   http://www.jacr.org/article/S1546-1440%2810%2900377-7/abstract
                  >
                  > and also George Shih at Weill Cornell (Valkyrie):
                  >
                  >   http://www.arrs.org/Pressroom/info.cfm?prID=467
                  >

                  yes we are aware of these other approaches

                  >> For the dose reports I used an
                  >> external, commercial OCR package.
                  >
                  > Since I didn't want to have any platform or commercial dependencies, and
                  > I couldn't make any of the public domain packages work immediately, and
                  > since the text is already binary, not rotated and consists of a few
                  > regular sized fonts, it was little effort to do this from scratch, but
                  > this does mean that it won't handle lossy compressed dose screens.
                  >
                  >> The biggest problem we faced were Philips
                  >> and mainly Toshiba as the Aquilion has two dose reports containing different
                  >> information.
                  >
                  > I haven't done much work on the Toshiba screens yet, since I haven't
                  > dealt with the multiple pages. Graham Warden has contributed some code
                  > for this but I haven't had a chance to incorporate it yet, but I plan
                  > to revisit this after RSNA.
                  >
                  > As for Philips, I found that I didn't need to do OCR at all; everything
                  > I needed was found in the Exposure Dose Sequence attribute within the
                  > header of the dose screen, or in some cases localizer when the dose
                  > screen was absent. See com.pixelmed.doseocr.ExposureDoseSequence.
                  >

                  this is interesting, I'll have a look at it after RSNA

                  >> Also I'm not sure if you really need all information in the
                  >> report.
                  >
                  > The total DLP, and DLP and CTDIvol per acquisition would seem to be the
                  > key pieces of information, but my goal was to construct as near a complete
                  > DICOM Radiation Dose SR object, so I try to extract more than this, as
                  > well as processing the acquired image headers to get technique and
                  > positioning information where possible.

                  Our initial focus was to figure out an automated approach to fulfil
                  our legal dosimetry requirements. Afterwards, because we had the
                  stream of images anyway, we put effort into investigating whether
                  the system was being used correctly (with respect to patient
                  positioning, topogram selection, protocol selection, etc.).

                  >
                  >> In our implementation, we blanked out large parts of the dose
                  >> report, also to increase the OCR accuracy.
                  >
                  > I just ignore the OCR output that doesn't match the lines that I am
                  > looking for (i.e., it is a two step process of 1) OCR everything, then
                  > 2) apply a regular expression to match the lines that contain the
                  > information needed).
                  >
                  >> If you are interested, we'll have
                  >> a presentation at RSNA on exactly this topic (implementation, usage and
                  >> pediatric dosimetry results; one of the physics CT sessions). When will your
                  >> talk be ? If I can help you with additional test data, please let me know.
                  >
                  > Looking forward to hearing your talk; Wed at 3pm, right ?
                  >

                  yes indeed.

                  > I am not talking, just participating in a dose informatics demonstration
                  > that is part of the RSNA's Image Sharing demonstration, which runs all
                  > week, and should be easy to find.
                  >
                  > If you have a web page or place to download your code or documents, let
                  > me know, and I will add a link at:
                  >
                  >   https://sites.google.com/site/medimgraddoseinformatics/

                  currently it is a closed system, developed for our university, but
                  I'll keep this link in mind when things change.

                  >
                  > which is the site for this group, which you might be interested in
                  > joining:
                  >
                  >   http://groups.google.com/group/medical-imaging-radiation-dose-informatics
                  >
                  > David
                  >

                  regards, Jurgen

                • 6da5135f230bc945017c2edf62f0526d
                  Message 8 of 9 , Nov 3, 2013
                    Hi

                    The OS X binaries in 20130426_old and 20131018b_current appear to be damaged (the OS fails to start them and complains that they are damaged). The one in the macexe download appears OK.

                    I am running Java 7 on OS X 10.7.5.

                    Neil