OT - mail archive

  • John Allen
    Message 1 of 17, Apr 25, 2013
      I realize that this is off topic, but there are more email experts assembled here than anywhere else I know of.

      I have a couple of users who are using their maildir as online storage for emails (current and archival).
      They have done this on their own and are prepared to live with some of the limitations.
      However, I think there must be a better way of doing this, preferably one that is WAN accessible, provides search, and can also store non-mail documents.

      I have looked at a few things, but all of them seem to have problems.

      Suggestions backed by experience would be welcome.

      If you don't want to reply on the mail list please use my gmail account.

      Thanks
      John A

    • Jeroen Geilman
      Message 2 of 17, Apr 25, 2013
        On 04/25/2013 08:56 PM, John Allen wrote:
        > I realize that this is off topic, but as there are more email experts assembled here than anywhere else I know of ....
        >
        > I have a couple of users who are using their maildir as online storage for emails (current and archival).
        > They have done this on their own and are prepared to live with some of the limitations.

        What perceived limitations?

        IMAP stores in maildir format scale quite well; pretty much the only limitation is storage I/O.

        If you mean you want a more efficient mailstore, you can look into dbmail or dbox storage (the former is a MySQL mailstore and IMAP server; the latter is a newer mailstore format supported by Dovecot, among others).

        > However, I think there must be a better way of doing this,

        A better way of doing *what*? What "problem" do you want to solve?

        > preferably one that could also be used to store non-mail documents, provides search etc, WAN accessible.

        How is that related to users who use their IMAP mailstore as a... mail store?
        IMAP tends to be accessible from the outside in any case, and any MUA worth its salt can search.

        > I have looked at a few things, but all of them seem to have problems.

        What things would those be? You're not giving us much to go on here.


        -- 
        J.
        
      • grarpamp
        Message 3 of 17, Apr 25, 2013
          > maildir format scale[s] quite well; pretty much the only
          > limitation is storage I/O.

          Depending on your FS and horsepower, anything over
          1000 x (n * 10) files in a directory can start to sink you
          pretty quick. I've always wondered if there's a maildir split
          specified out there that applications could utilize...
          where n is your split width... tmp/n, new/n, cur/n.
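The split idea above could be sketched roughly as follows. This is only an illustration: no such maildir split is specified anywhere that this thread knows of, and the function name `split_path`, the SHA-1 based bucketing, and the default width of 16 are all assumptions made for the example. The one detail taken from the maildir convention itself is that a file in cur/ is named `unique:info`, so only the part before the colon is hashed and a flag change does not move a message to another bucket.

```python
import hashlib
from pathlib import PurePosixPath


def split_path(maildir: str, subdir: str, filename: str, width: int = 16) -> PurePosixPath:
    """Map a message filename to maildir/subdir/<bucket>/filename.

    Hypothetical layout (tmp/n, new/n, cur/n); not part of any maildir spec.
    """
    if subdir not in ("tmp", "new", "cur"):
        raise ValueError(f"not a maildir subdirectory: {subdir!r}")
    if width < 1:
        raise ValueError("width must be at least 1")
    # Hash only the unique part, so renaming "x:2,S" to "x:2,RS" keeps the bucket.
    unique = filename.split(":", 1)[0]
    digest = hashlib.sha1(unique.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % width
    return PurePosixPath(maildir) / subdir / str(bucket) / filename
```

With a width of n, each of tmp, new and cur holds about 1/n of the files it would otherwise hold, at the cost of every reader and writer having to agree on the same bucketing rule.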
        • postfix@netorbit.it
          Message 4 of 17, Apr 26, 2013
            On 26/04/2013 00:15, grarpamp wrote:
            >> maildir format scale[s] quite well; pretty much the only
            >> limitation is storage I/O.
            > Depending on your FS and horsepower, anything over
            > 1000 x (n * 10) files in a directory can start to sink you
            > pretty quick. I've always wondered if there's a maildir split
            > specified out there that applications could utilize...
            > where n is your split width... tmp/n, new/n, cur/n.
            What about shifting this problem to the storage layer?
            Apart from using SSDs, what about a striped array in a RAID 1+0
            layout, in order to have some redundancy as well?
          • grarpamp
            Message 5 of 17, Apr 26, 2013
              >> I've always wondered if there's a maildir split
              >> specified out there that applications could utilize...
              >> where n is your split width... tmp/n, new/n, cur/n.
              >
              > what about shifting this problem to the storage layer?
              > Apart from using SSDs, what about a striped array in a RAID 1+0
              > layout, in order to have some redundancy as well?

              Adding IOPS, tuning dirhash and whatnot definitely work up to a
              certain point. Sometimes you'll blow out your backup programs and
              other tools at various points along the way till you're again bound
              by too many files in one dir or have hit the cpu/ram wall for it all.
            • Wietse Venema
              Message 6 of 17, Apr 26, 2013
                postfix@...:
                > On 26/04/2013 00:15, grarpamp wrote:
                > >> maildir format scale[s] quite well; pretty much the only
                > >> limitation is storage I/O.
                > > Depending on your FS and horsepower, anything over
                > > 1000 x (n * 10) files in a directory can start to sink you
                > > pretty quick. I've always wondered if there's a maildir split
                > > specified out there that applications could utilize...
                > > where n is your split width... tmp/n, new/n, cur/n.
                > what about shifting this problem to the storage layer?

                Faster disks don't solve algorithmic problems (problems related
                to the number of files per directory).

                Wietse
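To see whether a given mail store is actually in the territory being discussed, one can simply count files per directory. The sketch below is an assumption-laden illustration, not something posted in the thread: the function name `crowded_dirs` and the default threshold of 10000 files are made up for the example, and the right threshold depends on the filesystem.

```python
from __future__ import annotations

import os


def crowded_dirs(root: str, limit: int = 10000) -> list[tuple[str, int]]:
    """Return (directory, file_count) pairs under root with more than limit files.

    The limit is an illustrative default, not a recommendation.
    """
    crowded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > limit:
            crowded.append((dirpath, len(filenames)))
    # Largest directories first.
    return sorted(crowded, key=lambda item: item[1], reverse=True)
```

Calling `crowded_dirs("/var/vmail", limit=10000)` (the path is a placeholder) would list the directories worth looking at first.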
              • Robert Schetterer
                Message 7 of 17, Apr 26, 2013
                  On 26.04.2013 13:20, Wietse Venema wrote:
                  > postfix@...:
                  >> On 26/04/2013 00:15, grarpamp wrote:
                  >>>> maildir format scale[s] quite well; pretty much the only
                  >>>> limitation is storage I/O.
                  >>> Depending on your FS and horsepower, anything over
                  >>> 1000 x (n * 10) files in a directory can start to sink you
                  >>> pretty quick. I've always wondered if there's a maildir split
                  >>> specified out there that applications could utilize...
                  >>> where n is your split width... tmp/n, new/n, cur/n.
                  >> what about shifting this problem to the storage layer?
                  >
                  > Faster disks don't solve algorithmic problems (problems related
                  > to the number of files per directory).

                  Alternatively, you may use mdbox:

                  http://wiki2.dovecot.org/MailboxFormat/dbox

                  >
                  > Wietse
                  >



                  Best regards,
                  Robert Schetterer

                  --
                  [*] sys4 AG

                  http://sys4.de, +49 (89) 30 90 46 64
                  Franziskanerstraße 15, 81669 München

                  Registered office: München, District Court of München: HRB 199263
                  Executive Board: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
                  Chairman of the Supervisory Board: Florian Kirstein
                • grarpamp
                  Message 8 of 17, Apr 26, 2013
                    >>>> specified out there that applications could utilize...
                    >>>> where n is your split width... tmp/n, new/n, cur/n.

                    > alternate you may use mdbox
                    > http://wiki2.dovecot.org/MailboxFormat/dbox

                    Both of these hold all messages in a single directory,
                    so sdbox would be no advantage there.
                    And mdbox does not store one message per file, nor does it
                    store messages without added metadata, so those who need raw
                    per-message files for other purposes would gain nothing.
                    [There is a per file limit specified in bytes, not count. It's
                    not clear what the behavior would be if a proposed new msg in
                    a new file would exceed a lesser byte limit. Perhaps a safe
                    bounce or queue.]

                    I do like that they are reasonably well specified, publicly
                    on a wiki as opposed to only in source, and have a comparison
                    table of support on the parent page. All of which lead to easier
                    review and adoption by interested parties.
                  • Reindl Harald
                    Message 9 of 17, Apr 26, 2013
                      On 26.04.2013 21:24, grarpamp wrote:
                      >>>>> specified out there that applications could utilize...
                      >>>>> where n is your split width... tmp/n, new/n, cur/n.
                      >
                      >> alternate you may use mdbox
                      >> http://wiki2.dovecot.org/MailboxFormat/dbox
                      >
                      > Both of these hold all messages in a single directory.
                      > So sdbox would be no advantage there.
                      > And mdbox does not support one message per file

                      pff, and have you realized that "not a file per message" is
                      exactly the solution for problems with tens of thousands of
                      files in a folder?
                    • Patrick Domack
                      Message 10 of 17, Apr 26, 2013
                        Quoting grarpamp <grarpamp@...>:

                        >>>>> specified out there that applications could utilize...
                        >>>>> where n is your split width... tmp/n, new/n, cur/n.
                        >
                        >> alternate you may use mdbox
                        >> http://wiki2.dovecot.org/MailboxFormat/dbox
                        >
                        > Both of these hold all messages in a single directory.
                        > So sdbox would be no advantage there.
                        > And mdbox does not support one message per file, nor
                        > without metadata added to it, so those needing that for
                        > other purposes would have no advantage.
                        > [There is a per file limit specified in bytes, not count. It's
                        > not clear what the behavior would be if a proposed new msg in
                        > a new file would exceed a lesser byte limit. Perhaps a safe
                        > bounce or queue.]
                        >
                        > I do like that they are reasonably well specified, publicly
                        > on a wiki as opposed to only in source, and have a comparison
                        > table of support on the parent page. All of which lead to easier
                        > review and adoption by interested parties.

                        It used to take me well over 2 days to back up my email.

                        I switched to mdbox 2.2 years ago, and I have been extremely
                        happy with it.

                        Current stats on my personal mailbox: 4.8 gigs compressed, holding
                        8.3 gigs of email.
                        470k emails in 300 mdbox files (exactly 300 currently); the current
                        mdbox file number is 850.

                        I purge emails out of my mdbox files weekly.

                        So with my average mail folder holding 10k+ emails with maildir, or 300
                        files in a folder for mdbox, mdbox wins for me.

                        I never switched to dbox, but the gains from using it are also
                        available with mdbox: you can use single instance storage and
                        archival storage. I think the limit is 3 different paths to hold
                        your dbox/mdbox files.
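The figures quoted above can be turned into a few derived numbers. This is plain arithmetic on the stated stats, with one assumption flagged in the code: "gigs" is read as 10^9 bytes, which the post does not specify.

```python
# Figures as stated in the post above.
stored_gigs = 4.8       # compressed size on disk
raw_gigs = 8.3          # uncompressed size of the mail
emails = 470_000
mdbox_files = 300

# Assumption: one "gig" is 10**9 bytes (the post does not say GB or GiB).
BYTES_PER_GIG = 10**9

compression_ratio = stored_gigs / raw_gigs
emails_per_file = emails / mdbox_files
avg_email_kb = raw_gigs * BYTES_PER_GIG / emails / 1000

print(f"on-disk size is {compression_ratio:.0%} of raw size")
print(f"about {emails_per_file:.0f} emails per mdbox file")
print(f"average message is about {avg_email_kb:.1f} kB uncompressed")
```

So the directory holds 300 entries instead of 470k, each file carrying on the order of 1,500 messages, which is the point being made about file counts.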
                      • grarpamp
                        Message 11 of 17, Apr 26, 2013
                          >>>>>> specified out there that applications could utilize...
                          >>>>>> where n is your split width... tmp/n, new/n, cur/n.

                          > pff and you realized that the "not a file per message" is
                          > exactly the solution for problems with tens thousands of

                          It is *a* solution, not *the* solution, and obviously not one
                          of the type I described. And a fine pff to you my friend.
                        • Robert Schetterer
                          Message 12 of 17, Apr 26, 2013
                            On 26.04.2013 21:24, grarpamp wrote:
                            >>>>> specified out there that applications could utilize...
                            >>>>> where n is your split width... tmp/n, new/n, cur/n.
                            >
                            >> alternate you may use mdbox
                            >> http://wiki2.dovecot.org/MailboxFormat/dbox
                            >
                            > Both of these hold all messages in a single directory.
                            > So sdbox would be no advantage there.
                            > And mdbox does not support one message per file, nor
                            > without metadata added to it, so those needing that for
                            > other purposes would have no advantage.

                            I've done a mailbox archive test with Postfix bcc, filtering
                            with Sieve to sort by domain, user, date, mail-in and mail-out into subfolders.

                            I don't see an urgent need for having each mail as a file, as you can connect
                            to the archive via an IMAP client to restore a single mail.
                            To my knowledge, mdbox is the best open compromise among filesystem
                            mailbox formats; however, maildir should work too if used in
                            a well-designed setup. You might need scripting, tarring and partitioning too,
                            in the long run.
                            Other solutions store the archive in databases etc.
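The sorting scheme mentioned above (domain, user, date, mail-in or mail-out) amounts to deriving a folder name from a few message attributes. The post does not give the actual Sieve rules or folder layout, so the following is only a guess at what such a mapping might look like; the function name `archive_folder` and the `domain/user/YYYY/MM/direction` layout are assumptions for illustration.

```python
from datetime import date


def archive_folder(address: str, when: date, direction: str) -> str:
    """Build a hypothetical archive folder path: domain/user/YYYY/MM/direction."""
    if direction not in ("mail-in", "mail-out"):
        raise ValueError(f"unknown direction: {direction!r}")
    user, sep, domain = address.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"not a user@domain address: {address!r}")
    return "/".join(
        [domain.lower(), user.lower(), f"{when:%Y}", f"{when:%m}", direction]
    )
```

Splitting by year and month keeps any single folder bounded in size regardless of the mailbox format underneath, which is one way to read the "partitioning" remark.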

                            > [There is a per file limit specified in bytes, not count. It's
                            > not clear what the behavior would be if a proposed new msg in
                            > a new file would exceed a lesser byte limit. Perhaps a safe
                            > bounce or queue.]
                            >
                            > I do like that they are reasonably well specified, publicly
                            > on a wiki as opposed to only in source, and have a comparison
                            > table of support on the parent page. All of which lead to easier
                            > review and adoption by interested parties.
                            >

                            In Germany, a 10-year archive of finance-related mail is required of companies by
                            law, so there are several certified professional solutions you may
                            want to look at.

                            Dovecot also offers an archive solution and an object store
                            you might want to look at.

                            In the end this isn't really a Postfix topic; it's more about storage etc.


                            Best regards,
                            Robert Schetterer

                          • Stan Hoeppner
                            Message 13 of 17, Apr 26, 2013
                              On 4/26/2013 9:32 PM, grarpamp wrote:
                              >>>>>>> specified out there that applications could utilize...
                              >>>>>>> where n is your split width... tmp/n, new/n, cur/n.
                              >
                              >> pff and you realized that the "not a file per message" is
                              >> exactly the solution for problems with tens thousands of
                              >
                              > It is *a* solution, not *the* solution,

                              True. mbox solved this problem before it really began, back before
                              people started mass archiving thousands or tens of thousands of emails.
                              When used with modern MUAs and multiple name spaces you can mitigate or
                              eliminate the locking problems, especially if using sieve to sort to
                              these folders/files during delivery.

                              I've been using such a setup with Dovecot, Thunderbird, and Roundcube
                              for many years. The largest of my mbox files, XFS list mail, is only
                              19K+ emails. Full text searching it is relatively quick even if the FTS
                              index isn't primed, especially given the age of the hardware. If I
                              were using maildir storage I can only assume FTS would take quite a while
                              longer, as would backup.
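For anyone wanting to poke at an mbox file outside an IMAP server, Python's standard `mailbox` module can read it directly. The sketch below is a plain linear scan over Subject headers, not the FTS index described above, and the function name `count_matching` is made up for the example.

```python
import mailbox


def count_matching(mbox_path: str, needle: str) -> tuple:
    """Return (total_messages, messages whose Subject contains needle).

    Case-insensitive substring match on the Subject header only.
    """
    # create=False: fail instead of silently creating an empty mbox file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        total = 0
        hits = 0
        for message in box:
            total += 1
            subject = str(message.get("Subject", "") or "")
            if needle.lower() in subject.lower():
                hits += 1
        return total, hits
    finally:
        box.close()
```

Because the whole folder is one file, a scan like this is a single sequential read, which fits the observation about search and backup speed.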

                              --
                              Stan
                            • grarpamp
                              Message 14 of 17, Apr 26, 2013
                                > re: the last two posts

                                I must admit giving yourself the local equivalent
                                of your own lifetime email account is an interesting
                                approach if you don't really need access to the raw
                                message files on disk.
                              • Reindl Harald
                                Message 15 of 17, Apr 27, 2013
                                  On 27.04.2013 04:32, grarpamp wrote:
                                  >>>>>>> specified out there that applications could utilize...
                                  >>>>>>> where n is your split width... tmp/n, new/n, cur/n.
                                  >
                                  >> pff and you realized that the "not a file per message" is
                                  >> exactly the solution for problems with tens thousands of
                                  >
                                  > It is *a* solution, not *the* solution, and obviously not one
                                  > of the type I describes. And a fine pff to you my friend.

                                  boy, you replied to "Faster disks don't solve algorithmic problems
                                  (problems related to the number of files per directory)" with
                                  "And mdbox does not support one message per file"

                                  no, it is not *the* solution, but "does not support one message
                                  per file" is pure bullshit in this context because it is what
                                  you want
                                • grarpamp
                                  Message 16 of 17, Apr 27, 2013
                                    >> specified out there that applications could utilize...
                                    >> where n is your split width... tmp/n, new/n, cur/n.

                                    > it is what you want

                                    No, actually right up there is what I was surveying.
                                    But you failed to grok that in your search for more pfft.
                                    I'm sure it's a nice day, go outside :)
                                  • Reindl Harald
                                    Message 17 of 17, Apr 27, 2013
                                      On 27.04.2013 23:03, grarpamp wrote:
                                      >>> specified out there that applications could utilize...
                                      >>> where n is your split width... tmp/n, new/n, cur/n.
                                      >
                                      >> it is what you want
                                      >
                                      > No, actually right up there is what I was surveying.
                                      > But you failed to grok that in your search for more pfft.
                                      > I'm sure it's a nice day, go outside :)

                                      maybe you should learn how to use a mail-client and quote
                                      before you post to a mail-server list - your answer above
                                      makes no sense at all in context of the thread