New Greylisting daemon

  • Nicolas HAHN
    Message 1 of 19 , Apr 17, 2014
      Hi,

      For a GNU GPLv3 open source project I'm working on - the ELSE - and
      about which I posted here some time ago, I've studied greylisting and
      various open source tools like PostGrey, GLD (which no longer seems to be
      maintained), and policyd. I've also read
      http://www.postfix.org/SMTPD_POLICY_README.html

      Then, re-using some code of the author of GLD (<salim@...>
      <http://www.gasmi.net>), I tried to make, let's say, an experimental
      version of a "Postfix SMTP access policy delegation" implementation in a
      daemon. I've called this module, the "GreyLSE". Yeah... Probably there
      are better names...

      In short, the GreyLSE:
      - is a daemon written in C/C++
      - needs the PostgreSQL database of the ELSE, because it works only with that
      - should be able to handle a lot of Postfix policy delegation requests
      per second, because it creates a child (up to a maximum limit) for
      each Postfix request but, and this is maybe where I see a
      difference with GLD for instance, uses only a single database
      connection (the GreyLSE spawns a database child only for this purpose).
      In GLD, if I'm not mistaken, a connection to the
      database was created for each Postfix request, then closed. So the
      GreyLSE can, in theory, process many more Postfix requests per second
      (according to tests for a big worldwide organization, it was rated at 20
      requests/second when opening and closing a new DB connection each time, and
      it is now rated as high as 900 requests/second, but that depends of course
      on your server).
      - in conjunction with the Web interface used to define rules
      and control it (in the ELSE), can also manage: auto-cleaning,
      whitelists, blacklists, holdlists, and in the future what I call the
      RTAAM. Probably other things in the future (SPF, ...)
      - can also work with the ELSEMC (User's personal messaging
      center), another web interface that allows users to control their
      personal black/white/grey lists
      - can run as several instances installed on various servers in an
      ISP-type infrastructure, working together
      - leaves all the logic to the database. In fact, the GreyLSE is
      just an interface that waits for Postfix connections, spawns processes,
      calls the SQL method in the database that is in charge of all the
      decision logic, then passes the answer of the SQL method back to Postfix.

      Now, I don't have any basis for comparison with other existing software to
      be able to see whether 900 requests/second under our test conditions is
      good or not. And it's probably difficult to ask here what you may think
      about that... I should set up a lab to test GLD, postgrey, and the GreyLSE
      to get relevant data...

      I wanted to ask this Postfix community whether you think it would be
      better to provide the GreyLSE as a tiny standalone program with its own DB
      schema, doing only greylisting, or whether keeping it as an add-on like today,
      usable with the ELSE and its big database, integrated in the ELSE Web
      UI, and offering more features, would be what
      the community potentially using this kind of
      software prefers... Maybe not a question for this mailing list... I don't know.

      A question for the community here:
      What would be your expectations or interest in this kind of software?

      A question on Postfix (and sorry if it is a stupid one):
      For now, the GreyLSE waits for a Postfix connection, reads the data related to
      "a unique recipient", provides the answer to Postfix for this
      recipient, then closes the TCP connection. I've seen in
      SMTPD_POLICY_README.html that Postfix can continue to send data
      (keeping the same instance name) over the same TCP connection if the
      policy server doesn't close it.
      May I ask this: if the policy server keeps the connection
      open and doesn't close it by itself, will Postfix use the connection to
      send the policy requests for all recipients related
      to the same email (same instance name) and THEN close the connection to
      the policy server, or will it continue to use the same connection until
      it is eventually closed by the policy server, whatever email is being
      processed (so the same TCP connection is used for multiple unrelated
      emails)?

      I don't know if I understood the SMTPD_POLICY_README correctly, but my
      own answer to my question would be that Postfix continues to use a TCP
      connection that is kept open by the policy server (and it is up to
      the policy server to check whether the instance variable changes at some
      point)... I could test all of that in a lab, but if it's faster to get an
      answer from here...
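
For readers following the question above, here is a minimal Python sketch (not GreyLSE code, which is written in C/C++) of a policy server loop that works whichever way Postfix behaves: it keeps answering requests on one connection until the peer closes it. The names `read_request`, `serve_connection` and the `decide` callback are hypothetical. The wire format assumed here is the one described in SMTPD_POLICY_README: `name=value` lines terminated by an empty line, answered with `action=...` followed by an empty line.

```python
import io


def read_request(rfile):
    """Read one policy request: name=value lines ended by an empty line.

    Returns a dict of attributes, or None when the stream is exhausted.
    """
    attrs = {}
    for raw in rfile:
        line = raw.rstrip("\n")
        if line == "":
            return attrs
        name, _, value = line.partition("=")
        attrs[name] = value
    return None  # peer closed the connection; a partial request is dropped


def serve_connection(rfile, wfile, decide):
    """Answer every request on one connection until the client closes it.

    `decide` is a hypothetical callback mapping the attribute dict to an
    action string such as "DUNNO" or "defer_if_permit <text>".
    """
    handled = 0
    while True:
        attrs = read_request(rfile)
        if attrs is None:
            break
        wfile.write("action=%s\n\n" % decide(attrs))
        wfile.flush()
        handled += 1
    return handled


if __name__ == "__main__":
    # Two requests with different instance values on the same connection.
    stream = io.StringIO(
        "request=smtpd_access_policy\ninstance=a1\nrecipient=x@example.com\n\n"
        "request=smtpd_access_policy\ninstance=b2\nrecipient=y@example.com\n\n"
    )
    out = io.StringIO()
    print(serve_connection(stream, out, lambda attrs: "DUNNO"))
```

With a real socket, the two file objects could come from `socket.makefile()`; the loop itself never needs to compare `instance` values just to decide when to close.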

      Best regards,
      --
      Nicolas HAHN
    • lists@rhsoft.net
      Message 2 of 19 , Apr 17, 2014
        On 17.04.2014 21:26, Nicolas HAHN wrote:
        > For a GNU GPLv3 open source project I'm working on - the ELSE - and about which I posted some time ago there, I've
        > studied greylisting and various open source tools like PostGrey, or GLD (that seems to not be maintained any more),
        > or policyd. I've also read http://www.postfix.org/SMTPD_POLICY_README.html
        >
        > Then, re-using some code of the author of GLD (<salim@...> <http://www.gasmi.net>), I tried to make, let's
        > say, an experimental version of a "Postfix SMTP access policy delegation" implementation in a daemon. I've called
        > this module, the "GreyLSE". Yeah... Probably there are better names...
        >
        > In short, the GreyLSE is:
        > - a daemon made with C/C++
        > - needs the PostgreSQL database of the ELSE because works only with that

        Forget it: starting a project in 2014 and limiting it to a single DB backend is crazy.
      • Nicolas HAHN
        Message 3 of 19 , Apr 17, 2014
          In short, the GreyLSE is:
          - a daemon made with C/C++
          - needs the PostgreSQL database of the ELSE because works only with that

          > forget it - starting 2014 and limit to a single DB backend is crazy

          Hmm... It's a new tool... The possibility of using other backends in the
          future is not ruled out for this tool in particular... We all have to start
          somewhere when we try to create software, and I cannot be everywhere...
          Even if I wrote that the ELSE will never use MySQL... but ELSE is not GreyLSE :)

          And I suppose you know the kind of answer the community gives
          to this kind of comment: You want this tool to use another
          backend? Welcome to the project: your first step could be to develop all
          the code needed to use MySQL... :)
        • lists@rhsoft.net
          Message 4 of 19 , Apr 17, 2014
            On 17.04.2014 21:44, Nicolas HAHN wrote:
            > In short, the GreyLSE is:
            > - a daemon made with C/C++
            > - needs the PostgreSQL database of the ELSE because works only with that
            >
            >> forget it - starting 2014 and limit to a single DB backend is crazy
            >
            > Hummm... It's a new tool... The possibility to use other backends in the futur is not closed specifically for this
            > tool... We all have to start somewhere when we try to create a soft, and I cannot be everywhere... Even if i wrote
            > the ELSE will never use MySQL... but ELSE is not GreyLSE :)
            >
            > And I suppose you know what is the kind of answer the community provides for this kind of comment: You want this
            > tool to make use of another backend? Welcome in the project: your first step could be to develop all code to use
            > MySQL... :)

            Don't get me wrong, but abstraction layers exist:
            http://www.tildeslash.com/libzdb/

            Nobody needs to write backends for every database.

            Frankly, for a greylisting daemon there is no need for a full-featured database server
            like MySQL or PostgreSQL; in the context of Postfix it should at least support BDB, as
            Postfix does.
          • Nicolas HAHN
            Message 5 of 19 , Apr 17, 2014
              MySQL... :)

              > don't get me wrong but abstraction layers exists
              > http://www.tildeslash.com/libzdb/
              >
              > nobody needs to write backends for every database
              >
              > frankly for a greylisting daemon there is no need for a full-featured database server
              > like MySQl or PostgrSQL, in context of postfix it should at least support BDB as
              > postfix does
              Well, your point is valid and you're right, abstraction layers exist.
              Except that the GreyLSE is built for ISP-type loads (well, this is what
              I would like to focus on), and my wish was to optimize things
              everywhere possible. Adding abstraction layers adds milliseconds to
              the processing... I would prefer, for example, to offer the possibility
              of compiling it against various DB libs... Why not...

              And ISPs need database clusters, reporting,
              data warehouses, web GUIs for control and so on... ISPs handle several
              million emails a day... The ELSE (which until now was the analytics),
              the GreyLSE and the RTAAM are all built for that, with a PostgreSQL database
              size of at least 1 terabyte (those are my current test conditions...)

              So again, your point is valid, but it depends on which side of the
              bridge we are on...
            • Patrick Laimbock
              Message 6 of 19 , Apr 17, 2014
                On 17-04-14 21:56, lists@... wrote:
                [snip]
                > frankly for a greylisting daemon there is no need for a full-featured database server
                > like MySQl or PostgrSQL, in context of postfix it should at least support BDB as
                > postfix does

                Why add BDB when there's LMDB? Postfix also supports LMDB and besides
                being faster and a lot of other goodness, it does not have all the
                problems that BDB has.

                http://symas.com/mdb/

                HTH,
                Patrick
              • lists@rhsoft.net
                Message 7 of 19 , Apr 17, 2014
                  On 17.04.2014 22:48, Patrick Laimbock wrote:
                  > On 17-04-14 21:56, lists@... wrote:
                  > [snip]
                  >> frankly for a greylisting daemon there is no need for a full-featured database server
                  >> like MySQl or PostgrSQL, in context of postfix it should at least support BDB as
                  >> postfix does
                  >
                  > Why add BDB when there's LMDB? Postfix also supports LMDB and besides being faster and a lot of other goodness, it
                  > does not have all the problems that BDB has.
                  >
                  > http://symas.com/mdb/

                  Whatever the backends, there needs to be at least one without an
                  explicit daemon and without maintenance of the backend itself,
                  or else a free choice.

                  You can hardly demand that someone with a perfectly working MySQL
                  infrastructure, and admins who know what they are doing, set up
                  PostgreSQL for greylisting only, which means *two* full-featured
                  database servers with all their problems combined.

                  Well, you can, but the result would be low acceptance.

                  And no, a database abstraction layer is *really not* enough of a
                  performance problem to excuse "vendor lock-in". Or, to
                  put it in other words: if you start a project these days
                  and the first decision you make is which RDBMS you will
                  use, your whole software design is broken from that moment.
                • Nicolas HAHN
                  Message 8 of 19 , Apr 17, 2014
                    <snip>
                    an no database abstraction alyer is *really not* the performance problem
                    to excuse a "vendor-lockin" or to say it in other words: if you start
                    these days a proect and the frist decision you make is what RDBMS you
                    will use your whole software design is broken from that moment

                    Again, we'll see where we go with the GreyLSE, while also strongly
                    considering community requests (if there are any); that's the minimum
                    for an open source tool.

                    But consider that for now, this is developed according to the needs of
                    ISPs, with the ISP's environment, type of load and constraints, with all
                    the intelligence moved into the database using SQL stored procedures and so
                    on. And the best DB candidate for that was PostgreSQL, an enterprise-class
                    DB engine, classified in the challengers box of the Gartner
                    Quadrant, where we don't even see MySQL :) My 2 cents in the coffee
                    machine...

                    For now, the GreyLSE is highly scalable in its experimental version; at
                    first sight it can process considerably more requests than GLD or
                    Postgrey can, and with additional features and good
                    integration and user interaction if the Web UIs are used.

                    What I also repeat is that all options remain open. But you cannot have
                    stored procedures in BDB or other "lightweight" DB backends. So a
                    direction has been chosen (and of course other future directions can
                    lead to other versions of the GreyLSE), and again, one simply has to
                    start somewhere and work on one direction among others... I don't want
                    to start a holy war on this mailing list, and that's not the goal, I
                    presume, of this mailing list.

                    I've understood your point.
                  • Viktor Dukhovni
                    Message 9 of 19 , Apr 17, 2014
                      On Thu, Apr 17, 2014 at 10:55:32PM +0200, lists@... wrote:

                      > whatever backends, there needs to be at least one without an
                      > explicit daemon and no maintainance of the backend itself
                      > or a free choice

                      Give the OP a break, it seems he is trying to put together an
                      integrated tool set, rather than general-purpose components. In
                      an integrated tool set one sometimes sacrifices flexibility in the
                      name of inter-component integration.

                      The database abstraction may come along later, or the tool set may
                      essentially be an application-specific extension of the required
                      embedded database.

                      Vertical products (say like Zimbra) built around Postfix are a good
                      thing IMHO, we need both general purpose components for DIY
                      administrators and integrated products for users who want a complete
                      system.

                      --
                      Viktor.
                    • Henrik K
                      Message 10 of 19 , Apr 17, 2014
                        On Thu, Apr 17, 2014 at 10:05:05PM +0200, Nicolas HAHN wrote:
                        >
                        > exist. Except that the GreyLSE is built for ISP type loads (well, this is
                        > what I wouldl ike to focus on), and my wish was to optimize the thing
                        > everywhere possible. Adding abstraction layers is adding milliseconds to
                        > the processing... I would prefer, for example, to offer the possibility
                        > to compile it with various DB libs... Why not...

                        An SQL backend for greylisting and most other stuff is pretty pointless and
                        awkward to set up. My own Perl greylister simply stores everything in
                        memory and easily performs 5000+ requests per second. If you need more
                        redundancy, you could simply use, say, Redis, which performs just as well.
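
To make the in-memory approach concrete, here is a minimal Python sketch of a greylist kept in a plain dictionary. It is not the Perl greylister mentioned above, whose code is not shown in this thread; the class name `MemoryGreylist`, the 300-second delay and the triplet key (client address, sender, recipient) are assumptions chosen for illustration.

```python
import time


class MemoryGreylist:
    """Greylist triplets in a dict; hypothetical illustration only."""

    def __init__(self, delay=300, clock=time.time):
        self.delay = delay        # seconds a new triplet must wait (assumed value)
        self.clock = clock        # injectable clock, handy for testing
        self.first_seen = {}      # triplet -> time of first attempt

    def check(self, client_address, sender, recipient):
        """Return a Postfix policy action for this delivery attempt."""
        key = (client_address, sender.lower(), recipient.lower())
        now = self.clock()
        # Record the first attempt; later attempts keep the original timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.delay:
            return "defer_if_permit Greylisted, please retry later"
        return "dunno"


if __name__ == "__main__":
    greylist = MemoryGreylist()
    print(greylist.check("192.0.2.1", "a@example.org", "b@example.com"))
```

Every lookup is a single dictionary operation, which is why this design is fast; the trade-off is that state is lost on restart and is not shared between servers unless something like Redis is put behind it.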
                      • Nicolas HAHN
                        Message 11 of 19 , Apr 17, 2014
                          to compile it with various DB libs... Why not...

                          > SQL backend for greylisting and most other stuff is pretty pointless and
                          > awkward to set up. My own perl greylister simply stores everything in
                          > memory and easily performs 5000+ requests per second. If you need more
                          > redundancy, you could simply use say Redis which performs as well.
                          >

                          5000+ it's nice :)

                          Is it able to do global site whitelisting?
                          Is it able to do global site blacklisting?
                          Is it able to provide real time reports?
                          Is it able to detect attacks on your messaging system and stop them in
                          less than 5 minutes?
                          Is it able to interact with users to manage every user's sender white
                          list, sender black list, recipient white list, and even interact with
                          grey list?
                          Is it able to automatically clean bad traffic (bounces and long term
                          defers) from mailing lists?
                          Is it implementing extremely complex rule system with management of
                          priorities?

                          Tell me your features and we'll see whether, with the same features as
                          the GreyLSE, the 5000+ will still be there :)

                          What's the name of your product? I would be very interested in taking a
                          look anyway :)
                        • Jan P. Kessler
                          Message 12 of 19 , Apr 18, 2014
                            Hi,

                            Maybe you should set up a separate mailing list for GreyLSE. There are a lot of coders on this list. If all of them used this list to discuss their own topics, it might become somewhat confusing here.

                            - should be able to handle a lot of Postfix policy delegation requests 
                            per second, due to the fact it creates a child (with a max limit) for 
                            each Postfix request, but, and this is maybe where I could see a 

                            Does that mean, that it forks when a new request arrives? Keep in mind that process forking is very expensive. Take a look at preforking (simple, lots of examples) or multi-threaded models (more complex but even more efficient).

                            I wanted to ask to this Postfix community if you think it would be 
                            better to provide the GreyLSE as a standalone tiny software with its DB 
                            schema doing only greylisting, or if having it as an add-on like today, 
                            useable with the ELSE and its big database, integrated in the ELSE Web 
                            UI, and integrating more features, would be something that could have 
                            the preference of the community potentially using this kind of 
                            software... Maybe not a question for this mailing-list... I don't know.

                            Well, there are lots of existing and working "standalone" applications for greylisting (in fact I don't miss any features with the ones I use). So maybe it might be more promising to concentrate on the ELSE plugin approach - imho of course.

                            A question on Postfix (and sorry if it is an idiot one):
                            For now, the GreyLSE wait a Postfix connection, read the data related to 
                            "a unique recipient", and provides the answer to postfix for this 
                            recipient then close the TCP connection. I've seen in 
                            SMTPD_POLICY_README.html, that Postfix can continue to send data 
                            (keeping the same instance name) to the same TCP connection if the 
                            policy server don't close it.
                            May I ask this: if we consider the policy server keep the connection 
                            opened and don't close it by itself, will Postfix use the connection to 
                            send any policy requests to the policy server for all recipients related 
                            to the same email (same instance name) and THEN close the connection to 
                            the policy server, or will it continue to use the same connection until 
                            eventually it is closed by the policy server, whatever is the email in 
                            processing (so the same TCP connection is used for multiple unrelated 
                            emails)?

                            Yes, the last option. It will reuse the connection:

                            "On active systems a policy daemon process is used multiple times, for up to $max_use incoming SMTP connections."
                            [http://www.postfix.org/SMTPD_POLICY_README.html].

                            So, where is your code? Did I miss a link?

                              Jan

                          • Nicolas HAHN
                            Message 13 of 19 , Apr 18, 2014
                              On 18/04/2014 10:17, Jan P. Kessler wrote:
                              Hi,

                              maybe you should set up an own mailing list for GreyLSE. The are a lot of coders at this list. If any of them would use this list to discuss their own topics it might become somewhat confusing here.
                              You're right; old, historical mailing lists exist on the X-Itools SourceForge project page.

                              Does that mean, that it forks when a new request arrives? Keep in mind that process forking is very expensive. Take a look at preforking (simple, lots of examples) or multi-threaded models (more complex but even more efficient).

                              Yes. I'm working on preforking (in fact, I started analyzing prefork.c from the Apache web server some days ago...). Threads are an option, but we chose forking for better isolation. Some people say forking and threading are basically the same in terms of performance, and that's even written in some books dedicated to Linux programming; others have a different opinion. The best would be to get our own experience: just for fun, it could be interesting to make a multi-threaded version and see how it performs. At this time, under my test conditions, the overall decision process for one recipient takes 0.7 milliseconds minimum and 1.5 milliseconds maximum, on a huge database that is under heavy load and also performing all ELSE operations (so not only GreyLSE). That means between 667 and 1428 requests/s, so the average is 1047 requests/s. Conservative value: 900 requests/s, with only one database management child. A GreyLSE detached from the ELSE, with its own small dedicated DB, would be able to perform more requests/s. I'll probably also offer a way to configure the number of database children...
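
The figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming requests are handled strictly one after another by a single database child; the helper name `requests_per_second` is hypothetical.

```python
def requests_per_second(latency_ms):
    """Throughput of a strictly serial handler with the given per-request latency."""
    return 1000.0 / latency_ms


best = requests_per_second(0.7)    # fastest observed decision time
worst = requests_per_second(1.5)   # slowest observed decision time

# Midpoint of the two rates, as computed in the post.
midpoint_of_rates = (best + worst) / 2

# Rate at the mean latency of 1.1 ms, a more cautious estimate.
rate_at_mean_latency = requests_per_second((0.7 + 1.5) / 2)

if __name__ == "__main__":
    print(round(best, 1), round(worst, 1))
    print(round(midpoint_of_rates, 1), round(rate_at_mean_latency, 1))
```

Note that averaging the two rates gives about 1048 requests/s, while the rate at the average latency is only about 909 requests/s, which is close to the conservative figure of 900 quoted above.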

                              Yes, the last option. It will reuse the connection:

                              Yes, a lab test this morning confirmed that. Thanks anyway for the confirmation.


                              So, where is your code? Did I miss a link?

                              Well, I've decided not to post links any more on a mailing list dedicated to Postfix (this relates to what you wrote in your first sentence...), and my posts here are very rare. If people want to take a look at this very young project module, they can find it by themselves on SourceForge (or by checking my very first posts to this Postfix mailing list).
                            • Jan P. Kessler
                              Message 14 of 19 , Apr 18, 2014
                                > Yes. I'm working on preforking (in fact, I've started to analyze
                                > prefork.c from Apache web server some days ago...). Threads are an
                                > option, but we choose forking for better isolation. Some people say
                                > forking and threading is basically the same in term of perfs, that's
                                > even written in some books dedicated to linux programming, some others
                                > have another opinion.

                                Good. I'd agree that the difference in performance between a preforking
                                and a multithreaded approach will not be that big. A multithreaded model
                                requires less memory. But with the (hopefully) small footprint of
                                greylisting code this should be negligible.

                                >>
                                >> So, where is your code? Did I miss a link?
                                >>
                                > Well, I've decided to not put links any more on a mailing list
                                > dedicated to postfix (this is in link with what you wrote in your
                                > first sentence...), and my posts here are very rare. If people want to
                                > take a look at this very young project module, they can find it by
                                > themselves on sourceforge (or by checking my very first posts to this
                                > postfix mailing list)

                                I should have added that all I said was only IMO, of course. It is not
                                up to me to decide what does belong here and what does not. I am happy
                                about anyone offering time and efforts to develop free software or to
                                share something valuable with the community.
                              • Wietse Venema
                                Message 15 of 19 , Apr 18, 2014
                                  Jan P. Kessler:
                                  > > May I ask this: if we consider the policy server keep the connection
                                  > > opened and don't close it by itself, will Postfix use the connection to

                                  This is preferred usage. Closing the socket after each reply is wasteful.

                                  Wietse
                                • Nicolas HAHN
                                  Message 16 of 19 , Apr 18, 2014
                                    > This is preferred usage. Closing the socket after each reply is wasteful.
                                    >
                                    > Wietse

                                    Thanks for the answer. The comments from Jan P. Kessler also helped.

                                    I've updated my code to keep connections open until a configurable
                                    timeout expires. According to my tests, I have new data:

                                    - GreyLSE is now rated at 150000 SQL requests/second, to speak only of
                                    the SQL part.
                                    - The overall process, from accepting a Postfix request to
                                    delivering the policy decision, is now rated at around 15000 req/s.

                                    I see my processing children dying after having processed 100 Postfix
                                    requests, for the ones not dying because of the timeout.

                                    So the database is not the bottleneck. I now think the bottleneck is
                                    simply the latency of network communication and of handling the Postfix
                                    delegation protocol.

                                    The average number of bytes transferred for every Postfix request is 600
                                    bytes.
                                    600 bytes x 15000 requests/s = 9,000,000 bytes/s, i.e. about 8.58 MiB/s or
                                    72 Mbit/s. That's in the range of a 100 Mbit/s line (more or less).
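
The bandwidth estimate above can be checked with a short Python sketch. The function name `link_load` is hypothetical, and the calculation ignores TCP/IP framing overhead and the (small) replies, so it is a lower bound on what actually crosses the wire.

```python
def link_load(bytes_per_request, requests_per_second):
    """Return (bytes/s, MiB/s, Mbit/s) for the given request size and rate."""
    bytes_per_s = bytes_per_request * requests_per_second
    mib_per_s = bytes_per_s / 2**20        # binary megabytes per second
    mbit_per_s = bytes_per_s * 8 / 1e6     # decimal megabits per second
    return bytes_per_s, mib_per_s, mbit_per_s


if __name__ == "__main__":
    total, mib, mbit = link_load(600, 15000)
    print(total, round(mib, 2), mbit)
```

For comparison, a 100 Mbit/s link carries at most 12.5 million bytes per second before protocol overhead, so payload traffic of 9 million bytes per second is already a large share of it.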

                                    I think I'm on the right track!

                                    :)
                                  • Benny Pedersen
                                    Message 17 of 19 , Apr 19, 2014
                                      Nicolas HAHN wrote on 2014-04-18 20:22:
                                      >> This is preferred usage. Closing the socket after each reply is
                                      >> wasteful.
                                      > Thanks for the answer. Comments from Jan P. Kessler helped also.

                                      Is it time to test it?

                                      Is there a mailing list for this project?

                                      Or even a code download link?

                                      A wiki?
                                    • Wietse Venema
                                      Message 18 of 19 , Apr 19, 2014
                                        Nicolas HAHN:
                                        > > This is preferred usage. Closing the socket after each reply is wasteful.
                                        >
                                        > Thanks for the answer. Comments from Jan P. Kessler helped also.
                                        [...]
                                        > I see my processing childs dying after having processed 100 postfix
                                        > requests, for the ones not dying because of the timeout.

                                        By design, most Postfix daemon processes terminate voluntarily after
                                        $max_use (default: 100) client connections, or after $max_idle
                                        (default: 100) seconds of inactivity.

                                        On a busy system this gives most of the benefits of preforking,
                                        without the dangers of memory leaks in libraries. On an idle
                                        system no-one cares about performance.

                                        Wietse
                                      • Nicolas HAHN
                                        Message 19 of 19 , Apr 19, 2014
                                          I'll answer you in private soon, Benny, so as not to pollute the Postfix mailing list.


                                          On 19/04/2014 14:44, Benny Pedersen wrote:
                                          > if its time to test it ?
                                          >
                                          > is there a maillist for this project ?
                                          >
                                          > or even code download link ?
                                          >
                                          > wiki ?