RE: [CANLIST] - J1939 Multi Frame

  • Allen Pothoof
    Message 1 of 14 , Oct 6, 2011
      Bram,
       
      You shouldn't be having a problem if you and the other nodes have properly implemented Transport Protocol (I know, easy for me to say...).
       
      If you are receiving connections from multiple originators (doesn't sound like it) you should be handling them based on the source address.
       
      If you have a single originator sending multiple Request To Send messages for the same PGN, only the last one is active and I would suspect the originating node has an issue.  You can only have one connection per source node per PGN at a time.
       
      If you have a single originator sending multiple RTSs for multiple PGNs, the simplest way to handle things is to accept the first one and reject the others until the first set of data messages is fully processed.  That's the way I've handled it in the past (the few times it's been an issue), if for no other reason than that I've never had enough RAM to afford multiple buffers.  All you're really doing here is making the "one connection per source node per PGN" rule a little stricter by turning it into "one connection per source node."
       
      Regards,
      Al Pothoof
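
A minimal sketch of the "one connection per source node" policy described above, in Python. The class and method names are hypothetical, and the string results only stand in for whatever the node actually does on the bus (send CTS, send an abort, restart its buffer). It assumes a repeated RTS for the same PGN replaces the session already in progress, following the "only the last one is active" remark.

```python
from dataclasses import dataclass


@dataclass
class Session:
    pgn: int
    total_packets: int
    total_bytes: int


class TpConnectionGate:
    """Allow at most one open transport session per source address (SA)."""

    def __init__(self):
        self._open = {}  # source address -> Session

    def on_rts(self, sa, pgn, total_packets, total_bytes):
        """Decide what to do with a Request To Send from source address `sa`."""
        current = self._open.get(sa)
        if current is None:
            # No session open for this originator: accept it.
            self._open[sa] = Session(pgn, total_packets, total_bytes)
            return "accept"
        if current.pgn == pgn:
            # Repeated RTS for the same PGN: only the last one is active,
            # so drop what was collected and start over.
            self._open[sa] = Session(pgn, total_packets, total_bytes)
            return "restart"
        # A different PGN while one session is still open: reject it.
        return "reject"

    def on_complete(self, sa):
        """Call when the session for `sa` has finished, timed out or aborted."""
        self._open.pop(sa, None)
```

Keying the table on the source address alone is what makes this stricter than "one connection per source node per PGN"; keying it on the pair (SA, PGN) would give the looser rule at the cost of one buffer per pair.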
       

       
      > From: canbus@...
      > To: canlist@...
      > Subject: RE: [CANLIST] - J1939 Multi Frame
      > Date: Thu, 6 Oct 2011 23:27:47 +0200
      >
      >
      > On a slightly related note: what is the proper way of dealing with such
      > duplicates in J1939 transport control frames for multi-frame PGNs?
      >
      > This week I encountered a truck where one of the nodes on the J1939 network
      > was spewing out TPCM and TPDT frames in no predictable order due to an
      > unknown condition (only observed it briefly and advised the owner to have it
      > looked at). Got called in because our application was behaving "funny",
      > which ultimately boiled down to the fact that from time to time, the TPDT
      > frames with payload data from random PGNs came in just the right order to
      > match a preceding TPCM, which then got reassembled into a chimera of PGNs
      > that - will you believe it - made the application act funny...
      >
      > So I went through our code to see how we could best deal with these kinds of
      > situations, but other than setting time-outs and aborting reassembly after
      > receiving an out-of-order TPDT frame there doesn't seem to be a lot one can do.
      > The standard isn't very helpful either.
      >
      > What is the recommended way of dealing with a duplicate TPDT frame in a
      > J1939 multi-frame-transfer? My gut feeling would be to abort reassembly, but
      > if it's a common occurrence it could prevent 'proper' operation.
      >
      > This would be a transfer as expected:
      > TPCM: fragments->3, size->17
      > TPDT: seq->1, data->1-7
      > TPDT: seq->2, data->8-14
      > TPDT: seq->3, data->15-17 (+padding)
      >
      > How to deal with this:
      > TPCM: fragments->3, size->17
      > TPDT: seq->1, data->1-7
      > TPDT: seq->2, data->8-14
      > TPDT: seq->2, data->8-14 (duplicate)
      > TPDT: seq->3, data->15-17 (+padding)
      >
      > cheers,
      > Bram
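
The two traces above can be replayed against a small reassembler to compare the options. This is only a sketch under stated assumptions: the `Reassembler` class and its `tolerate_duplicates` flag are hypothetical, time-outs are left out, and the thread itself does not settle which policy is the recommended one. In strict mode any frame that is not the next expected sequence number aborts reassembly; in tolerant mode an exact repeat of the frame just accepted is silently dropped.

```python
class Reassembler:
    """Reassemble one multi-frame transfer announced by a TPCM frame."""

    def __init__(self, fragments, size, tolerate_duplicates=False):
        self.fragments = fragments          # number of TPDT frames announced
        self.size = size                    # total payload size in bytes
        self.tolerate_duplicates = tolerate_duplicates
        self.next_seq = 1                   # sequence numbers start at 1
        self.last_payload = None
        self.buffer = bytearray()
        self.state = "open"                 # "open", "done" or "aborted"

    def on_tpdt(self, seq, payload):
        """Feed one TPDT frame; return the full message once complete."""
        if self.state != "open":
            return None
        if seq == self.next_seq:
            self.buffer += payload
            self.last_payload = bytes(payload)
            self.next_seq += 1
            if seq == self.fragments:
                self.state = "done"
                # Strip the padding of the last frame.
                return bytes(self.buffer[:self.size])
            return None
        # Not the expected frame: is it an exact repeat of the previous one?
        repeat = (seq == self.next_seq - 1
                  and bytes(payload) == self.last_payload)
        if repeat and self.tolerate_duplicates:
            return None
        self.state = "aborted"
        return None
```

Requiring the repeated frame to carry the same payload as the one already accepted keeps the tolerant mode from gluing together data from unrelated transfers that merely reuse a sequence number.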
      >
      > -----Original Message-----
      > From: canlist-owner@...-informatik.de
      > [mailto:canlist-owner@...-informatik.de] On Behalf Of John
      > Dammeyer
      > Sent: Wednesday, 5 October 2011 18:08
      > To: CANLIST@...
      > Subject: RE: [CANLIST] socketcan and codesys (rollover?)
      >
      > That's the problem.
      >
      > Now remember, this is extremely rare if the bus isn't noisy and is properly
      > terminated without reflections. But imagine such a system where two years
      > later during an upgrade one extra user information display is added to the
      > bus for the operator to work with at another point on the machine. The
      > stub wire tapped onto the network is just long enough to create a
      > reflection on the bus during the ACK pulse.
      >
      > The reflection itself travels down the bus and happens to fall on that bit
      > position for that one node. Now depending on what happens and when it
      > happens a toggle (or delta) is resent. But because of the bus timing it's
      > not repeatable. And since it happens a few days after the new user
      > display which is a passive device there's no direct cause/relationship
      > with this.
      >
      > The first finger pointing will go to the software guys because they
      > performed an upgrade on some of the nodes' firmware or maybe even upgraded
      > their Windows user interface to WIN-7 from WIN-XP. And so on. It's a
      > problem that may never be solved but will require hundreds of man hours
      > etc. The engineers should be so lucky that the problem would be
      > repeatable. When it's not you need to do a work-around.
      >
      > John Dammeyer
      >
      > Automation Artisans Inc.
      > http://www.autoartisans.com/ELS/
      > Ph. 1 250 544 4950
      >
      >
      >
      > --
      > Archives and useful links: http://groups.yahoo.com/group/CANbus
      > Subscribe and unsubscribe at www.vector-informatik.com/canlist/
      > Report any problems to <canlist-owner@...>
    • Bram Kerkhof
      Message 2 of 14 , Oct 11, 2011

        I know my node implements TP properly, not too sure about the others though ;-)

         

        In any case: I was looking for best practices on how to safeguard your own implementation against other nodes misbehaving during TP transfers. I just witnessed one in “the field” that went berserk with BAM announcements and transfers in no particular order, which happened to form correct sequences from time to time, as the infinite monkey theorem would predict. If you depend on the data in such a BAM transfer (e.g. EC1), it’s nice to have a level of confidence that the data you’re processing is actually sane.

         

        In my case, I added an extra safeguard that monitors the number of valid and invalid TP transfers for every SA I’m interested in, and if the ratio between these two is over a particular threshold, I’m no longer listening ;-)

         

        cheers,

        Bram
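
A rough sketch of the per-SA safeguard described above, in Python. The names (`TransferMonitor`, `record`, `is_trusted`) and both default thresholds are made up for illustration. The message does not say exactly how the ratio is computed, so this version assumes it is the share of invalid transfers among all transfers seen from that source address.

```python
class TransferMonitor:
    """Track valid and invalid TP transfers per source address (SA)."""

    def __init__(self, max_invalid_ratio=0.5, min_samples=10):
        self.max_invalid_ratio = max_invalid_ratio  # illustrative default
        self.min_samples = min_samples              # illustrative default
        self._counts = {}  # sa -> [valid, invalid]

    def record(self, sa, valid):
        """Count one finished transfer from `sa` as valid or invalid."""
        counts = self._counts.setdefault(sa, [0, 0])
        counts[0 if valid else 1] += 1

    def is_trusted(self, sa):
        """Return False once too many transfers from `sa` were invalid."""
        valid, invalid = self._counts.get(sa, (0, 0))
        total = valid + invalid
        if total < self.min_samples:
            # Too little history to judge this node yet.
            return True
        return invalid / total <= self.max_invalid_ratio
```

The `min_samples` floor is a design choice, not something from the message: without it a single bad transfer right after start-up would be enough to silence a node.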

         

        PS: Apologies if you happen to receive this message (partially) twice; blame it on big-finger-syndrome.

         

