 

Wonder what they're choking on now?

  • Jim Miller
    Message 1 of 7 , Oct 6, 2013
      The queue status shows quite a bit of bogging down. I wonder if something is wrong in the server or some hams have dumped their whole logs on the queue again.
       
      jim ab3cv
    • Jim Miller
      Message 2 of 7 , Oct 6, 2013
        Actually looks like the queue status reporting is broken. Just got a QSL from a contact a couple of hours ago.
         
        jim ab3cv


        On Sun, Oct 6, 2013 at 12:26 PM, Jim Miller <jim@...> wrote:
        The queue status shows quite a bit of bogging down. I wonder if something is wrong in the server or some hams have dumped their whole logs on the queue again.
         
        jim ab3cv

      • Dave AA6YQ
        Message 3 of 7 , Oct 6, 2013

          As I understand it, a user submitted a digitally signed log that was “corrupted” in a way that causes the LotW Server to die – but without removing the corrupted log from the processing queue.  The result was a continuous “process and die” loop.

           

          This corrupted log has been manually removed; the LotW Server defect that prevented it from discarding the offending log will be corrected.

           

                73,

           

                      Dave, AA6YQ

           

          From: ARRL-LOTW@yahoogroups.com [mailto:ARRL-LOTW@yahoogroups.com] On Behalf Of Jim Miller
          Sent: Sunday, October 06, 2013 12:26 PM
          To: arrl-lotw
          Subject: [ARRL-LOTW] Wonder what they're choking on now?

           

           

          The queue status shows quite a bit of bogging down. I wonder if something is wrong in the server or some hams have dumped their whole logs on the queue again.

           

          jim ab3cv



        • Rick Murphy
          Message 4 of 7 , Oct 7, 2013
            To further expand on Dave's comment: the log in question was quite large and quite scrambled. This caused the LoTW log import processing to abort, since it couldn't understand the log. The LoTW server itself didn't crash, but the log reader process did, because it couldn't decode the scrambled log. Other logs could still be processed while this was going on.

            Since the log in question wasn't properly processed, it was re-entered at the end of the queue, which allowed other logs to be processed until it got back to the front of the line and caused the same crash.
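A minimal sketch of that requeue behavior, and of the "discard instead of requeue" correction Dave mentioned. This is only a simulation under assumptions: `parse_log`, `drain`, the `scrambled` flag and the `max_steps` bound are all hypothetical names made up for illustration, not anything from the actual LoTW server code.

```python
from collections import deque


def parse_log(log):
    """Hypothetical stand-in for the log reader: raises on a scrambled log."""
    if log["scrambled"]:
        raise ValueError(f"cannot decode {log['name']}")
    return log["name"]


def drain(logs, requeue_on_failure, max_steps=20):
    """Process a queue of logs and report what happened.

    max_steps bounds the simulation so the requeue policy cannot loop forever.
    Returns (processed names, rejected names, crash count, logs still queued).
    """
    queue = deque(logs)
    processed, rejected, crashes = [], [], 0
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        log = queue.popleft()
        try:
            processed.append(parse_log(log))
        except ValueError:
            crashes += 1
            if requeue_on_failure:
                queue.append(log)  # back of the line; it will fail again later
            else:
                rejected.append(log["name"])  # discarded for good
    return processed, rejected, crashes, len(queue)


logs = [
    {"name": "good-1", "scrambled": False},
    {"name": "bad", "scrambled": True},
    {"name": "good-2", "scrambled": False},
]

print(drain(logs, requeue_on_failure=True))
print(drain(logs, requeue_on_failure=False))
```

With requeueing, the good logs still get through, but the bad one stays in the queue and crashes the reader every time it reaches the front. With rejection, it fails once and is gone.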

            Unfortunately, the person who submitted this log re-submitted it when it apparently had not been processed, adding to the apparent backlog. It's good that the faulty files didn't block everyone, but they do seem to get stuck on the queue status page, since that page records when each log was originally submitted. As ARRL staff see these logs, they're now manually deleting them while they try to contact the source.

            Hopefully the station in question will stop using the computer that submitted these logs once ARRL contacts them; it's very likely not working properly, judging by how messed up the log appeared.

            For the tech geeks who care (most of you can stop reading now), here are the details on the symptom and its probable cause.

            LOTW reads a log and writes signed log records into a memory buffer. This will periodically need to get resized for large logs, meaning that it gets copied around to new heap areas, potentially multiple times. Once the entire log is signed, that buffer is read, compressed using zlib to save space, then written to disk (this is a 1.13 user, so it isn't uploaded but that's not important here). The compressed file has a hash appended to verify that it has not been changed since the .TQ8 was written.
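To make the compress-then-hash step concrete, here is a rough sketch using Python's standard `zlib` and `hashlib`. The hash algorithm (SHA-256), the exact file layout, and the names `pack`, `unpack` and `is_intact` are assumptions for illustration only; the real .TQ8 format may differ. What it shows is a general property: a hash appended at write time only catches changes made after the hash was computed.

```python
import hashlib
import zlib

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes; SHA-256 is an assumption


def pack(signed_log: bytes) -> bytes:
    """Compress the signed log and append a hash of the compressed bytes."""
    compressed = zlib.compress(signed_log)
    return compressed + hashlib.sha256(compressed).digest()


def unpack(blob: bytes) -> bytes:
    """Check the trailing hash, then decompress."""
    compressed, digest = blob[:-HASH_LEN], blob[-HASH_LEN:]
    if hashlib.sha256(compressed).digest() != digest:
        raise ValueError("integrity check failed")
    return zlib.decompress(compressed)


def is_intact(blob: bytes) -> bool:
    """True if the blob passes the hash check and decompresses."""
    try:
        unpack(blob)
        return True
    except (ValueError, zlib.error):
        return False


good = b"<CALL:4>K1MU<BAND:3>20M<eor>\n"
scrambled = b"<CXLL:4>K1MU<BAND:3>20M<eor>\n"  # damaged BEFORE packing

# Damage that happens before the hash is computed sails through the check.
print(is_intact(pack(scrambled)))

# Damage that happens after packing is caught.
tampered = bytearray(pack(good))
tampered[5] ^= 0xFF
print(is_intact(bytes(tampered)))
```

So a file can pass the integrity check and still contain garbage, as long as the garbage was already there when the file was written.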

            The log as received passes the integrity check and is decompressed. The resulting signed log has characters missing. It's an ADIF-like format made up of <TAG:size> fields. In some cases, the tags are incomplete. In some cases, characters are switched: "1" becomes "E", etc. Software defects don't usually cause "<CALL:4>K1MU" to sometimes appear as "<CXLL:4>K1MU" and sometimes as "CALK1MU". My opinion is that the repeated copying of the signed buffer around in memory is causing part of it to get lost and altered. This is most likely caused by bad RAM or a bad processor (e.g. bad capacitors on the motherboard).
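For illustration, a small sketch of how a reader could walk `<TAG:size>value` fields and reject the kinds of damage described above instead of crashing. The tag list and the names `parse_fields` and `check` are hypothetical, and this is not the LoTW parser; it assumes fields simply follow one another, optionally separated by whitespace.

```python
import re

FIELD = re.compile(r"<([A-Za-z_]+):(\d+)>")
KNOWN_TAGS = {"CALL", "BAND", "MODE", "QSO_DATE"}  # illustrative subset only


def parse_fields(text: str) -> dict:
    """Parse consecutive <TAG:size>value fields; raise ValueError on garbage."""
    fields, pos = {}, 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = FIELD.match(text, pos)
        if match is None:
            raise ValueError(f"malformed tag at offset {pos}")
        tag, size = match.group(1).upper(), int(match.group(2))
        if tag not in KNOWN_TAGS:
            raise ValueError(f"unknown tag {tag!r} at offset {pos}")
        value = text[match.end():match.end() + size]
        if len(value) != size:
            raise ValueError(f"field {tag!r} is truncated")
        fields[tag] = value
        pos = match.end() + size
    return fields


def check(text: str) -> str:
    """Return 'ok' or the reason the record was rejected, without crashing."""
    try:
        parse_fields(text)
        return "ok"
    except ValueError as exc:
        return f"rejected: {exc}"


print(check("<CALL:4>K1MU <BAND:3>20M"))
print(check("<CXLL:4>K1MU"))
print(check("CALK1MU"))
```

The first record parses; the other two are rejected with a reason, which is the behavior you want from an ingest step facing a scrambled file.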
            73,
                -Rick


            On Sun, Oct 6, 2013 at 2:00 PM, Dave AA6YQ <aa6yq@...> wrote:
             

            As I understand it, a user submitted a digitally signed log that was “corrupted” in a way that causes the LotW Server to die – but without removing the corrupted log from the processing queue.  The result was a continuous “process and die” loop.

             

            This corrupted log has been manually removed; the LotW Server defect that prevented it from discarding the offending log will be corrected.

             

                  73,

             

                        Dave, AA6YQ

             

            From: ARRL-LOTW@yahoogroups.com [mailto:ARRL-LOTW@yahoogroups.com] On Behalf Of Jim Miller
            Sent: Sunday, October 06, 2013 12:26 PM
            To: arrl-lotw
            Subject: [ARRL-LOTW] Wonder what they're choking on now?

             

             

            The queue status shows quite a bit of bogging down. I wonder if something is wrong in the server or some hams have dumped their whole logs on the queue again.

             

            jim ab3cv






            --
            Rick Murphy, CISSP-ISSAP, K1MU/4, Annandale VA USA
          • Peter Laws
            Message 5 of 7 , Oct 7, 2013
              On Mon, Oct 7, 2013 at 5:12 AM, Rick Murphy <k1mu@...> wrote:

              >
              > LOTW reads a log and writes signed log records into a memory buffer. This will periodically need to get resized for large logs, meaning that it gets copied around to new heap areas, potentially multiple times. Once the entire log is signed, that buffer is read, compressed using zlib to save space, then written to disk (this is a 1.13 user, so it isn't uploaded but that's not important here). The compressed file has a hash appended to verify that it has not been changed since the .TQ8 was written.
              >
              > The log as received passes the integrity check and is decompressed. The resulting signed log has characters missing. It's an ADIF-like format made up of <TAG:size> fields. In some cases, the tags are incomplete. In some cases, characters are switched: "1" becomes "E", etc. Software defects don't usually cause "<CALL:4>K1MU" to sometimes appear as "<CXLL:4>K1MU" and sometimes as "CALK1MU". My opinion is that the repeated copying of the signed buffer around in memory is causing part of it to get lost and altered. This is most likely caused by bad RAM or a bad processor (e.g. bad capacitors on the motherboard).
              >

              ** This is way down in the weeds, so most of you will probably want to
              tune out ... **

              Thanks, Rick. Do you mean LOTW, the server, in the first paragraph I
              quoted above or TQSL?

              We make hashes of the content of the message to ensure that the
              message has not changed since it was signed and, from the signature,
              who created the file, right? Please correct me if I'm confused.
              Wait, I'm often confused so better correct me if I've got things
              wrong. :-)

              If I am correct, at what point does the file become corrupt? Before
              the hash is generated, I'm assuming, because if it was corrupted after
              the hash was made it wouldn't pass the authentication check, would it?

              If the bad ADIF was put out by the logging program, why does TQSL
              accept it in the first place?






              --
              Peter Laws | N5UWY | plaws plaws net | Travel by Train!
            • Rick Murphy
              Message 6 of 7 , Oct 7, 2013
                On Mon, Oct 7, 2013 at 1:13 PM, Peter Laws <plaws0@...> wrote:
                 

                On Mon, Oct 7, 2013 at 5:12 AM, Rick Murphy <k1mu@...> wrote:

                >
                > LOTW reads a log and writes signed log records into a memory buffer. This will periodically need to get resized for large logs, meaning that it gets copied around to new heap areas, potentially multiple times. Once the entire log is signed, that buffer is read, compressed using zlib to save space, then written to disk (this is a 1.13 user, so it isn't uploaded but that's not important here). The compressed file has a hash appended to verify that it has not been changed since the .TQ8 was written.
                >
                > The log as received passes the integrity check and is decompressed. The resulting signed log has characters missing. It's an ADIF-like format made up of <TAG:size> fields. In some cases, the tags are incomplete. In some cases, characters are switched: "1" becomes "E", etc. Software defects don't usually cause "<CALL:4>K1MU" to sometimes appear as "<CXLL:4>K1MU" and sometimes as "CALK1MU". My opinion is that the repeated copying of the signed buffer around in memory is causing part of it to get lost and altered. This is most likely caused by bad RAM or a bad processor (e.g. bad capacitors on the motherboard).
                >

                ** This is way down in the weeds, so most of you will probably want to
                tune out ... **

                Thanks, Rick. Do you mean LOTW, the server, in the first paragraph I
                quoted above or TQSL?

                The LoTW server.
                 
                We make hashes of the content of the message to ensure that the
                message has not changed since it was signed and, from the signature,
                who created the file, right? Please correct me if I'm confused.
                Wait, I'm often confused so better correct me if I've got things
                wrong. :-)

                No, you're right. However, the signed content became scrambled POST SIGNING. It's also a bit more subtle than that:
                each QSO in a signed log has its own canonical-form data, which is individually signed.
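A rough sketch of the per-QSO idea, assuming a made-up canonical form and using a keyed hash (HMAC) from Python's standard library purely as a stand-in for the real digital signature. The key, the `canonical` rule and the function names are all hypothetical. The only point is that when each record carries its own signature, damage after signing shows up record by record.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical; stands in for the signer's real private key


def canonical(qso: dict) -> bytes:
    """Hypothetical canonical form: upper-cased values in sorted field order."""
    return "".join(str(qso[k]).upper() for k in sorted(qso)).encode("ascii")


def sign(qso: dict) -> str:
    """Sign one QSO's canonical form (HMAC here, as a stand-in)."""
    return hmac.new(KEY, canonical(qso), hashlib.sha256).hexdigest()


def verify(qso: dict, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(qso), signature)


log = [
    {"CALL": "K1MU", "BAND": "20M", "MODE": "CW"},
    {"CALL": "AA6YQ", "BAND": "40M", "MODE": "SSB"},
]
signed = [(qso, sign(qso)) for qso in log]

# Scramble one record after signing, as a faulty machine might.
signed[0][0]["CALL"] = "KEMU"

results = [verify(qso, sig) for qso, sig in signed]
print(results)  # [False, True]
```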
                 
                If I am correct, at what point does the file become corrupt? Before
                the hash is generated, I'm assuming, because if it was corrupted after
                the hash was made it wouldn't pass the authentication check, would it?

                It wasn't possible to verify anything about the file since there wasn't any valid content. Or, at least not much valid content.

                If the bad ADIF was put out by the logging program, why does TQSL
                accept it in the first place?

                There wasn't anything wrong with the input file. It's just that after TQSL signed the log, it was passed through a meat grinder before being stored for transmittal to LoTW. 

                Oh, and HQ have fixed the error that caused this log to abort. It'll now have everything rejected and not loop back to the end of the queue.
                73,
                    -Rick
                -- 
                Rick Murphy, CISSP-ISSAP, K1MU/4, Annandale VA USA
              • Peter Laws
                Message 7 of 7 , Oct 7, 2013
                  On Mon, Oct 7, 2013 at 2:20 PM, Rick Murphy <k1mu@...> wrote:


                  >
                  > Oh, and HQ have fixed the error that caused this log to abort. It'll now have everything rejected and not loop back to the end of the queue.

                  OK, this is exactly where I was going. It just seemed to me that it
                  would be pretty simple for the ingest algorithm to recognize a bogus
                  file without getting all worked up about it. I'll be glad once TQSL
                  2.x is well on the way and more attention can be turned to the LOTW
                  processes themselves.

                  Love the 2.x - got notified that there was a new config, clicked
                  "here" to update, not another thought about it. Good stuff.

                  --
                  Peter Laws | N5UWY | plaws plaws net | Travel by Train!