Loading ...
Sorry, an error occurred while loading the content.

24036Re: [Clip] VanWeerthuizenian Expressions

Expand Messages
  • Art Kocsis
    Sep 19 7:33 PM
      Sorry, another long post. But there's a lot to cover.

      On Wed 11 Sep 2013 08:10:08 -0700 Flo said:
      "...what matters here is this specific approach which - as Wayne explained -
      could apply to much more complex queries (with AND, OR, XOR, NOT etc). In
      cases like that, it would be difficult to use RegEx, and compound loops
      would end in extremely long "IF, ELSE IF, ELSE IF Chains" (Wayne)."

      On Thu, 19 Sep 2013 03:52:04 -0000 Flo said:
      "Nevertheless, for me the question remains: Is there any way to enlarge Wayne's
      concept to conditions like 'A AND B AND NOT (E OR D)', for example? Or is there
      no way to get beyond the borders of those five conditions?"

      BTW Five???  I only see four: NOT, OR, AND, XOR. And those are operators,
      not conditions. What am I missing>

      Flo, I think what you are asking is: given the absence of logical operators
      in Notetab Clip Code, how can I generalize the use of arithmetic operators
      to apply them to complex logical expressions without obtuse and lengthy
      If - Then - Else trees.

      The answer has two parts:
        1: The mapping of the logical operations to arithmetical operations
        2: The transformation of complex logical expressions into canonical form

      Regarding #1, there are a number of different mapping of logical to arithmetical
      operations as Wayne and I have discussed and with no major standouts among them.
      They all suffer from the inherent limitation of being strictly binary operations
      in Notetab, (i.e., MAX (A,B,C,D...) is not allowed). To recap two mappings:

      Operation             Addition only            Allowing Subtraction
      ============          ======================   =====================
        NOT A        <==>   (A + 1) MOD 2             (1 - A)
      A AND B        <==>   A * B                     MIN (A, B)
      A OR  B        <==>   MAX (A, B)                MAX (A, B)
      A XOR B        <==>   (A + B) MOD 2             ABS (A-B)
      NOT A AND B    <==>   (MIN (A, B) + 1) MOD 2    1 - MIN (A,B)
      NOT A OR  B    <==>   (MAX (A, B) + 1) MOD 2    1 - MAX (A,B)
      NOT A XOR B    <==>   (A + B + 1) MOD 2         1 - ABS (A-B)

      Regarding #2 as Joy and I discussed, any logical expression can be transformed
      into an equivalent DNF (an OR of ANDs), or CNF (an AND of ORs), expression. These
      canonical expressions use only the three logical operators: NOT, AND and OR.

      You can do that transform either by a series of transformations using
      DeMorgan's laws or - easier for very complicated expressions - by using truth tables. Each row in a truth table is a conjunction (AND) of the state of the columns (conditions). The disjunction (OR) of all the rows resulting in a true (=1) state is the DNF (Disjunctive Normal Form) of the system. For example:

          ##  A B C D E   X     Y1  Y2  Y3    Y     DNF Term
          ==  = = = = =   =     ==  ==  ==    =     ======================
          01  0 0 x 0 0   0     0   0   0     0    
          02  0 0 x 0 1   0     0   0   0     0    
          03  0 0 x 1 0   0     0   0   1     1     ¬A AND ¬B AND D AND ¬E
          04  0 0 x 1 1   0     0   0   0     0
          05  0 1 x 0 0   0     0   0   1     1     ¬A AND B AND ¬D AND ¬E
          06  0 1 x 0 1   0     0   1   0     1     ¬A AND B AND ¬D AND E
          07  0 1 x 1 0   0     0   1   1     1     ¬A AND B AND D AND ¬E
          08  0 1 x 1 1   0     0   0   0     0    
          09  1 0 x 0 0   0     0   0   0     0    
          10  1 0 x 0 1   0     0   0   0     0    
          11  1 0 x 1 0   0     0   0   0     0    
          12  1 0 x 1 1   0     0   0   0     0    
          13  1 1 x 0 0   1     1   0   0     1     A AND B AND ¬D AND ¬E
          14  1 1 x 0 1   0     0   0   0     0    
          15  1 1 x 1 0   0     0   0   0     0    
          16  1 1 x 1 1   0     0   0   0     0    

      For instance, taking your example, A AND B AND NOT (E OR D), (Col X),
      the DNF form for X is simply row 13: X = A AND B AND ¬D AND ¬E
      (where ¬ is the NOT operator). That seems like a lot of work compared
      to the trivial DeMorgan transformation but this is meant to illustrate
      a process so let's add some complication:

      Let Y = [A AND B AND NOT (E OR D)]
                 OR {¬A AND [B AND (¬D OR E) OR B AND (D OR ¬E)]}
                 OR [(¬A AND ¬E AND (B OR D)]
            = Y1 OR Y2 OR Y3

      Where Y1 = A AND B AND NOT (E OR D)
            Y2 = ¬A AND [B AND (¬D OR E) OR B AND (D OR ¬E)]
            Y3 = (¬A AND ¬E AND (B OR D)

      The Y column is the conjunction of the Y1, Y2 and Y3 columns and the final DNF
      expression is just the OR (Sum), of the Y rows with true values:

      Y = R3 OR R5 OR R6 OR R7 OR R13
        = (¬A AND ¬B AND D AND ¬E) OR (¬A AND B AND ¬D AND ¬E) OR (¬A AND B AND ¬D AND E)
               OR (¬A AND B AND D AND ¬E) OR (A AND B AND ¬D AND ¬E)

      This logical expression can now be evaluated in six clear and easily followed steps,
      none of which involve an If test:
         Evaluate each of the five row expressions using the canonical AND operations
         Evaluate the final result using the canonical OR operation on those five results.

      Alternatively, the expression could be coded directly using my functions. I did find
      it necessary to first transform the expression from embedded binary notation to prefix
      notation. Even then I had to be very careful not to introduce an error. I could then
      use NTB's replace dialog to convert to Clip syntax. Without a parser to analyze and
      convert an arbitrary expression with nested parentheses to Notetab format it still is
      a bit tricky. The functions work well for short or medium complexity expressions but
      a truth table is easier and more reliable for anything more complex.

      Y1 = and(A,B,¬or(E,D))
      Y2 = and(¬A,¬E,or(B,D))
      Y3 = and(¬A,or(and(B,or(¬D,E)),and(B,or(D,¬E))))

      ^!Set %Y1%^$AND(^%A%,^%B%,^$NOR(^%E%,^%D%)$)$
      ^!Set %Y2%^$AND(^$NOT(^%A%)$$,^$NOT(^%E%)$$,^$OR(^%B%,^%D%)$)$
      ^!Set %Y3%^$AND(^$NOT(^%A%)$$,^$OR(^$AND(^%B%,^$OR(^$NOT(^%D%)$$,^%E%)$)$,^$AND(^%B%,^$OR(^%D%,^$NOT(^%E%)$$)$)$
      ^!Set %Y%=^$OR(^%Y1%;^%Y2%;^%Y3%)$

      Some things to notice/consider:
      • The astute reader will notice that Y2 is equivalent to ¬A AND B AND (D XOR E)
           However, remember we are limiting ourselves to NOT, AND and OR operators.
      • Separating the terms into partial result columns makes it easier and less
           error prone to generate the truth table.
      • There can be some overlap in the partial results (e.g., row 7, col Y2 and Y3)
      • The growth of terms (and complexity) is linear, not exponential
      • Changes to the system criteria (truth table), are easily and fairly reliably
           propagated to the DNF expression terms (this is much, much less error prone
           than If - Then - Else logic trees.
      • Additional columns can easily be added to record the results of alternate
           logical expressions using the same worksheet and input parameters states.
      • The truth table is independent and external to any program language or code.
           It is simply prep work prior to program implementation.
      • A truth table is easy to generate, easy to understand and concise in execution
      • Most important, using a truth table to analyze and code logic expressions is
           the most reliable and least error prone method available.

      OTOH, Trying to implement complex expressions by means of If - Then - Else
      trees is a recipe for frustration and errors. Clear computation and  program
      flow is a basic requirement for generating error free code as well for
      downstream maintainability. The VanWeerthuizenian Expressions get very messy
      very, very quickly once you go beyond the trivial two parameter, binary
      condition unless you restrict yourself to canonical forms. This is primarily
      due to the restriction of the MAX and MIN functions to two operands. That is
      why I removed that restriction from my AND and OR functions. However, in
      canonical form the benefit of my functions is much less dramatic than I expected.

      For example for R3 = ¬A AND ¬B AND D AND ¬E

      ^!Set %R3%=^$Calc(min((1-^%A%),min((1-^%B%),min(^%D%,(1-^%E%))));0)$
      ^!Set %R3%=^$AND(^$NOT(^%A%)$;^$NOT(^%B%)$;^%D%;^$NOT(^%E%)$)$

      I think the functions are clearer and less error prone than all the nested
      parentheses and remembering the correct mappings but not earth shakingly so.

      However, the major point and lesson here is the transformation to DNF
      form. Whether you use the VanWeerthuizenian constructs or my functions,
      the best path to error free coding and ease of maintenance is to transform
      your test criteria into canonical form using a single variable for each
      criteria, generate the DNF expression (by either Boolean transformations
      or via a truth table), and then evaluate the resulting products and sums.
      Do not use an If test until the final result.

      Applying this paradigm to your example:


      Let Y1 = A AND B; Y2 = A OR B; Y3 = A NOT B; Y4 = A XOR B; Y5 = NOT A

      The truth table is:

      #  A  B  Y1  Y2  Y3  Y4  Y5
      =  =  =  ==  ==  ==  ==  ==
      1  0  0  0   0   0   0   1
      2  0  1  0   1   0   1   1
      3  1  0  0   1   1   1   0
      4  1  1  1   1   0   0   0

      And the DNFs for the five cases are:

      Y1 = A AND B
      Y2 = A OR B
      Y3 = A AND ¬B
      Y4 = ¬A AND B OR A AND ¬B
      Y5 = ¬A

      It doesn't seem like we gained much but yours is a fairly trivial test and
      remember, I am illustrating a generalized process, not content.
      Implementing this in a clip:
      ;######################### Start of Clip Code ###############################
      ;^!SetDebug On
      ^!Set %Lines%=^$GetTextLineCount$
      ^!Set %Row%=1; %Hits%=^%Empty%
      ^!Set %Case%=^?{(H=5)Find lines matching...==_A AND B|A OR B|A NOT B|A XOR B|NOT A}

      ^!Jump ^%Row%
      ^!Set %A%=0; %B%=0; %Y%=0
      ^!IfMatch "^.*A.*$" "^$GetLine$" ^!Set %A%=1
      ^!IfMatch ".*B.*$" "^$GetLine$" ^!Set %B%=1

      ^!Set %Y1%=^$Calc(MIN(^%A%;^%B%))$
      ^!Set %Y2%=^$Calc(MAX(^%A%;^%B%))$
      ^!Set %Y3%=^$Calc(MIN(^%A%;1-^%B%))$
      ^!Set %Y41%=^$Calc(MIN(1-^%A%;^%B%))$
      ^!Set %Y42%=^$Calc(MIN(^%A%;1-^%B%))$
      ^!Set %Y4%=^$Calc(MAX(^%Y41%;^%Y42%))$
      ^!Set %Y5%=^$Calc(1-^%A%)$

      ^!IfMatch "^%Case%" "A AND B" ^!Set %Y%=^%Y1%
      ^!IfMatch "^%Case%" "A OR B" ^!Set %Y%=^%Y2%
      ^!IfMatch "^%Case%" "A NOT B" ^!Set %Y%=^%Y3%
      ^!IfMatch "^%Case%" "A XOR B" ^!Set %Y%=^%Y4%
      ^!IfMatch "^%Case%" "NOT A" ^!Set %Y%=^%Y5%

      ^!IfTrue ^%Y% ^!Set %Hits%=^%Hits%^$GetLine$^P
      ^!Inc %Row%
      ^!If ^$GetRow$ < ^%Lines% Loop
      ^!Info [L]Expression: ^%Case%^P^PMatches:^P^%Hits%

      ^!GoTo End

      ;Alternate set of logic expression calculations
      ^!Set %Y1%=^$AND(^%A%;^%B%)$
      ^!Set %Y2%=^$OR(%A%;^%B%)$
      ^!Set %Y3%=^$AND(^%A%;^$NOT(^%B%)$)$
      ^!Set %Y41%=^$AND(^$NOT(^%A%);^%B%)$;)$
      ^!Set %Y42%=^$AND(^%A%;^$NOT(^%B%)$)$
      ^!Set %Y4%=^$OR(^%Y41%;^%Y42%)$
      ^!Set %Y5%=^$NOT(^%A%)$
      ;######################### End of Clip Code ###############################

      This, of course, could be rearranged to use your computed GoTos, the DNF
      computations could be moved to the %case% tests with only %Y41% and %Y42%
      pre-computed or other optimizations which might or might not be significant
      depending on the application but would probably be at the cost of clarity
      and generalization. Remember, the emphasis here is on clarity of program
      flow (primarily linear), to minimize coding errors and to ease maintenance.
      The various sections are segregated, similar operations are consolidated, and
      program branches are minimized. A flow chart of this clip would not have any
      crossed lines.

      I hope this answers your questions.  Art
    • Show all 12 messages in this topic