Re: [unison-users] Unison found changes where rsync did not?

  • Alan Schmitt
    Message 1 of 9 , Mar 22, 2012
      On 22 March 2012, at 10:32, Ivan Shmakov wrote:

      > This way, if anything goes wrong, I could always examine the
      > things post factum. (Obviously, one is supposed to periodically
      > purge the contents of the backup directory.)
      >
      > Unfortunately, I don't know if this behavior is easily
      > achievable with Unison.

      http://www.cis.upenn.edu/%7Ebcpierce/unison/download/releases/stable/unison-manual.html#backups

      Alan
    • Mark Casey
      Message 2 of 9 , Mar 22, 2012
        On 3/22/2012 2:34 AM, Alan Schmitt wrote:
        > Excel files are particularly tricky: they may sometimes change even
        > without the user explicitly changing them, and without their modtime
        > changing.
        >
        >
        > Alan
        >

        Definitely, but both replicas were identical according to rsync right
        before I ran Unison. So if their prior states were also identical they
        should have been "changed only in identical ways" and not conflicted.
        That was what happened to the vast majority of the files that had
        changed since the last time I'd allowed Unison to sync Excel files; most
        of them didn't complain at all. I think I changed whether or not user,
        group, and times were supposed to be synced since I first excluded
        *.xls*, though, and I'm pretty sure that was the cause.

        Mark
      • Dave Warren
        Message 3 of 9 , Mar 25, 2012
          On 3/22/2012 12:08 PM, Mark Casey wrote:
          > On 3/22/2012 2:34 AM, Alan Schmitt wrote:
          > > Excel files are particularly tricky: they may sometimes change even
          > > without the user explicitly changing them, and without their modtime
          > > changing.
          >
          > Definitely, but both replicas were identical according to rsync right
          > before I ran Unison. So if their prior states were also identical they
          > should have been "changed only in identical ways" and not conflicted.
          > That was what happened to the vast majority of the files that had
          > changed since the last time I'd allowed Unison to sync Excel files; most
          > of them didn't complain at all. I think I changed whether or not user,
          > group, and times were supposed to be synced since I first excluded
          > *.xls*, though, and I'm pretty sure that was the cause.


          How does rsync determine whether files are identical? Does it look at
          the contents, or does it just compare metadata and assume that if the
          dates and sizes are the same, the files are too?
          -- 
          Dave Warren
          http://www.hireahit.com/
          http://ca.linkedin.com/in/davejwarren
          
          
        • Thomas Boehm
          Message 4 of 9 , Mar 26, 2012
            Mark Casey wrote:
            > Definitely, but both replicas were identical according to rsync right
            > before I ran Unison.

            Did you run rsync with the -c option to determine whether they are
            identical?
          • Mark Casey
            Message 5 of 9 , Mar 26, 2012
              On 3/26/2012 2:56 AM, Thomas Boehm wrote:
              >
              > Mark Casey wrote:
              > > Definitely, but both replicas were identical according to rsync right
              > > before I ran Unison.
              >
              > Did you run rsync with the -c option to determine whether they are
              > identical?
              >
              >

              I believe you and Dave were moving down the same path and I think you've
              found the rest of the issue. I'd suspected in the past that rsync used
              such shortcuts, based on how much more quickly it can return
              "identical" on a 2nd or 3rd run over a medium-sized data set,
              especially when one side is remote. It had never mattered before, which
              is of course why that is the default, so I'd never confirmed the details
              in the man page. Anyway, the answer is no, I wasn't using '-c', and that
              was the issue.

              Having now done more research on the Excel side of things, I know more
              specifically what it does that is, as Alan mentioned, tricky (aka stupid,
              IMHO). This may still not be 100% accurate, but it is pretty close: if
              you open a spreadsheet (via Samba in my case) that was last saved by
              someone else, Excel will update an 'owner' sort of field within the
              file and save it, I think along with an internal timestamp. For us
              this usually does not change the size of the file, and although the
              file's modtime (according to the filesystem) is updated while the
              file is open, Excel reverts it to its prior value in many (all?)
              circumstances if you save no real changes to the file. So the file
              has changed, but by default rsync won't know it (whereas Unison will,
              with fastcheck off or in later versions).
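
To illustrate the failure mode described above, here is a minimal Python sketch, not rsync's or Unison's actual code. It compares two files by size and modtime only (roughly what rsync's default quick check relies on) and by a content hash (the spirit of `rsync -c`). The file names, the fake "owner" field, and the use of SHA-256 are assumptions made purely for the demonstration.

```python
import hashlib
import os
import tempfile


def quick_check_equal(a, b):
    """Metadata-only comparison: same size and same modtime."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and sa.st_mtime_ns == sb.st_mtime_ns


def checksum_equal(a, b):
    """Content comparison via a hash (SHA-256 chosen here for illustration)."""
    def digest(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()
    return digest(a) == digest(b)


def simulate_silent_edit(path, new_bytes):
    """Rewrite a file with same-size contents, then restore its old times.

    This mimics the behaviour attributed to Excel in the thread: the
    bytes change, but size and modtime end up as they were before.
    """
    st = os.stat(path)
    assert len(new_bytes) == st.st_size, "demo requires same-size contents"
    with open(path, "wb") as f:
        f.write(new_bytes)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))


def demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "replica_a.xls")  # hypothetical file names
        b = os.path.join(d, "replica_b.xls")
        for p in (a, b):
            with open(p, "wb") as f:
                f.write(b"owner=alice;data=42")
        # Give both replicas identical timestamps, as after a sync.
        st = os.stat(a)
        os.utime(b, ns=(st.st_atime_ns, st.st_mtime_ns))
        # One side is "opened" by another user: same size, same modtime.
        simulate_silent_edit(b, b"owner=bobby;data=42")
        return quick_check_equal(a, b), checksum_equal(a, b)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the metadata comparison reports the two replicas as identical while the content comparison does not, which matches the mismatch between the default rsync run and Unison seen in this thread.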

              I have found no way to get around this behavior in Excel, and since
              we're using Unison on two user-facing file servers I think I'll have to
              simply exclude just those spreadsheets that are active enough to be
              opened on either end at the same time (and warn users that changes to
              those files will be synced one-way each night by another job).
              Fortunately, all the files in that category serve the same function
              (Receivables reports) so they all have very similar names and we won't
              require a dozen different ignore rules.
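
As a rough sketch of what such a rule could look like in a Unison profile: the `ignore` and `fastcheck` preferences are described in the Unison manual, but the file name pattern below is hypothetical, since the thread does not give the real names.

```
# Hypothetical profile fragment; the pattern is a placeholder.
# Skip the busy Receivables spreadsheets (synced one-way by another job).
ignore = Name Receivables*.xls*
# Optionally compare file contents rather than trusting size and modtime.
fastcheck = false
```

Turning `fastcheck` off makes update detection slower on large replicas, so it is a trade-off rather than a default recommendation.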

              So the specific cause was Excel being goofy, combined with a hole in my
              understanding of rsync's default methods. It has been pretty clear for
              some time now that this was roughly where we'd end up :D, so I
              appreciate everyone continuing to give input.

              Thanks again,
              Mark