Interrupt Stacks [was Contradiction in the Docs...]

  • Gregory N
    Message 1 of 11 , Dec 1, 2011
      David,

      I will babble a little more about stack usage. I have no idea if this is useful to you or not, but it is general usage info that needs to be known (so I changed the Subject).

      The default behavior is for an interrupt to use the stack of the thread that it interrupted. This increases the stack requirement of EVERY thread. Suppose, for example, that your interrupt handling needs 1KB and your thread logic needs 1KB for its own processing. Then you would have to set aside 2KB of stack for the task to cover both the thread's stack needs and the interrupt handling needs.

      That default behavior, like any default behavior, may or may not be what you want. If you have many threads, then the overhead of adding the interrupt level stack requirement to each thread might be excessive. Or, if your interrupt level logic requires a lot of stack space, then adding that large number to each stack might be a problem.

      The alternative behavior is controlled by the CONFIG_ARCH_INTERRUPTSTACK setting in most architectures. This setting will use a separate stack for interrupt handling: On each interrupt, the user stack pointer will be saved and a different interrupt stack will be used just for interrupt level processing.

      This can help because now the stack size of each thread does not need to include the interrupt level stack allocation as well. But this also requires one additional interrupt stack. The decision to use CONFIG_ARCH_INTERRUPTSTACK all comes down to "Which stack usage model saves you the most memory?"
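To make that trade-off concrete, here is a minimal sketch in C that compares the two stack models. The function names are hypothetical (they are not NuttX APIs), and the sketch assumes every thread has the same stack need, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Total stack memory when interrupts borrow the interrupted thread's stack:
 * every thread reserves room for its own needs plus the handlers' needs.
 */
static size_t shared_stack_total(size_t nthreads, size_t thread_need,
                                 size_t irq_need)
{
  assert(nthreads > 0);
  return nthreads * (thread_need + irq_need);
}

/* Total stack memory with one dedicated interrupt stack: threads reserve
 * only their own needs, and one extra stack is added for interrupts.
 */
static size_t separate_stack_total(size_t nthreads, size_t thread_need,
                                   size_t irq_need)
{
  assert(nthreads > 0);
  return nthreads * thread_need + irq_need;
}
```

With five threads that each need 1KB and interrupt handling that needs 1KB, the shared model costs 10240 bytes and the separate interrupt stack costs 6144 bytes. With a single thread the two models cost the same, so the separate stack only pays off as the thread count grows.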

      I mention this in your case because from your previous email, it looked like your IDLE stack was overflowing into the .bss area. This might occur because your IDLE stack size might not be big enough to include all of the interrupt level processing that occurs on the same stack.

      Greg
    • David Sidrane
      Message 2 of 11 , Dec 1, 2011

        Hi Greg,

         

        Thank you for the information. I have done the ARM stack swap code on other projects and do understand and appreciate the beauty of what you have done with CONFIG_ARCH_INTERRUPTSTACK. We have 5 threads total and all of them have headroom. I did add this rather inelegant code

         

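        bzero((void*)&g_idletcb, sizeof(_TCB));
          g_idletcb.task_state = TSTATE_TASK_RUNNING;
          g_idletcb.entry.main = (main_t)os_start;

        #if defined(CONFIG_DEBUG) && defined(CONFIG_DEBUG_STACK)
          /* Describe the IDLE stack (it starts at the end of .bss) to the stack checker */
          extern uint32_t _ebss;
          g_idletcb.adj_stack_size = CONFIG_IDLETHREAD_STACKSIZE;
          /* Cast to a byte pointer first: the stack size is in bytes, but _ebss is a uint32_t */
          g_idletcb.adj_stack_ptr = (uint8_t *)&_ebss + CONFIG_IDLETHREAD_STACKSIZE;
          g_idletcb.stack_alloc_ptr = &_ebss;

          /* Paint everything below the current stack pointer with a known pattern */
          uint32_t *sp = (uint32_t *)&sp;
          while (--sp >= &_ebss) *sp = 0xDEADBEEF;
        #endif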

         

         

        to os_start.c to check the idle task's stack penetration, and added a foreach over the stack check to a utility thread. All was fine.
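For readers unfamiliar with the technique, the following is a minimal, host-testable sketch of stack painting and penetration checking. It is not the NuttX implementation in arch/arm/src/common; the function names are hypothetical, and it assumes a stack that grows downward (as on ARM), so the word at the lowest address is the last one to be overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_COLOR 0xDEADBEEFu

/* Fill an unused stack region with the marker pattern. */
static void stack_paint(uint32_t *base, size_t nwords)
{
  assert(base != NULL);
  for (size_t i = 0; i < nwords; i++)
    {
      base[i] = STACK_COLOR;
    }
}

/* Count how many words have been used. base[0] is the deepest word of a
 * downward-growing stack, so scan upward until the pattern is broken.
 * Everything from the first overwritten word to the top counts as used.
 */
static size_t stack_used_words(const uint32_t *base, size_t nwords)
{
  size_t untouched = 0;

  assert(base != NULL);
  while (untouched < nwords && base[untouched] == STACK_COLOR)
    {
      untouched++;
    }

  return nwords - untouched;
}
```

If the returned count equals the full size of the region, the stack has been used all the way down to its base and has probably overflowed into whatever lies below it.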

         

        When I looked at all the pointers and the .bss, all the relationships were fine.

         

        The more I look at it, the more I think it is a HW issue. It looks like the DMA reads the value for the next LLI from address 0, then from IDLE_STACK, and then chews its way through memory. I did try to figure out what was different by running JLink.exe's h and g commands. I also added code in my ISR to dump all the system registers and then diffed the results of a good run (with ICE) and a bad run. What I can see is below.

         

        The one I wonder about is TRCENA. The comment "This enables control of power usage unless tracing is required" makes me wonder whether I am failing to initialize something.

         

         

         

         

        1: lpc17_dmainterrupt: source 0x0000000040034018

         

        This is the DMA register at 0x50004100; it should be 40034018.

         

        844:   lpc17_dmainterrupt: 0xe000ed30 0x000000001

        A BKPT may be executed in debug monitor mode which will cause the debug monitor handler to be run but the Debug Fault Status Register (DFSR) at address 0xE000ED30 will not have bit 1 set to indicate the cause was a BKPT instruction. This will only occur if an interrupt other than the Debug Monitor is already being processed just before the BKPT is executed.

         

        892:  lpc17_dmainterrupt: 0xe000edf0 0x030010000 01010001

        10.2.1. Debug Halting Control and Status Register

        The purpose of the Debug Halting Control and Status Register (DHCSR) is to:

        • provide status information about the state of the processor
        • enable core debug
        • halt and step the processor.

        The DHCSR:

        • is a 32-bit read/write register
        • address is 0xE000EDF0.

        Note

        The DHCSR is only reset from a system reset, including power on. Bit 16 of DHCSR is Unpredictable on reset.

        Figure 10.1 shows the arrangement of bits in the register.

        Figure 10.1. Debug Halting Control and Status Register format

        Table 10.2 shows the bit functions of the Debug ID Register.

        Table 10.2. Debug Halting Control and Status Register

        Bits      Type        Field         Function
        [31:16]   Write       DBGKEY        Debug Key. 0xA05F must be written whenever this register is written. Reads back as status bits [25:16]. If not written as Key, the write operation is ignored and no bits are written into the register.
        [31:26]   -           -             Reserved, RAZ.
        [25]      Read        S_RESET_ST    Indicates that the core has been reset, or is now being reset, since the last time this bit was read. This is a sticky bit that clears on read. So, reading twice and getting 1 then 0 means it was reset in the past. Reading twice and getting 1 both times means that it is being reset now (held in reset still).
        [24]      Read        S_RETIRE_ST   Indicates that an instruction has completed since last read. This is a sticky bit that clears on read. This determines if the core is stalled on a load/store or fetch.
        [23:20]   -           -             Reserved, RAZ.
        [19]      Read        S_LOCKUP      Reads as one if the core is running (not halted) and a lockup condition is present.
        [18]      Read        S_SLEEP       Indicates that the core is sleeping (WFI, WFE or SLEEP-ON-EXIT). Must use C_HALT to gain control or wait for interrupt to wake-up. For more information on SLEEP-ON-EXIT see Table 7.1.
        [17]      Read        S_HALT        The core is in debug state when S_HALT is set.
        [16]      Read        S_REGRDY      Register Read/Write on the Debug Core Register Selector register is available. Last transfer is complete.
        [15:6]    -           -             Reserved.
        [5]       Read/write  C_SNAPSTALL   If the core is stalled on a load/store operation the stall ceases and the instruction is forced to complete. This enables Halting debug to gain control of the core. It can only be set if C_DEBUGEN = 1, C_HALT = 1, and the core reads S_RETIRE_ST as 0. This indicates that no instruction has advanced. This prevents misuse. The bus state is Unpredictable when this is used. S_RETIRE can detect core stalls on load/store operations.
        [4]       -           -             Reserved.
        [3]       Read/write  C_MASKINTS    Mask interrupts when stepping or running in halted debug. Does not affect NMI, which is not maskable. Must only be modified when the processor is halted (S_HALT == 1).
        [2]       Read/write  C_STEP        Steps the core in halted debug. When C_DEBUGEN = 0, this bit has no effect. Must only be modified when the processor is halted (S_HALT == 1).
        [1]       Read/write  C_HALT        Halts the core. This bit is set automatically when the core Halts. For example Breakpoint. This bit clears on core reset. This bit can only be written if C_DEBUGEN is 1, otherwise it is ignored. When setting this bit to 1, C_DEBUGEN must also be written to 1 in the same value (value[1:0] is 2’b11). The core can halt itself, but only if C_DEBUGEN is already 1 and only if it writes with b11.
        [0]       Read/write  C_DEBUGEN     Enables debug. This can only be written by AHB-AP and not by the core. It is ignored when written by the core, which cannot set or clear it. The core must write a 1 to it when writing C_HALT to halt itself.

        If not enabled for Halting mode, C_DEBUGEN = 1, all other fields are disabled.

        This register is not reset on a system reset. It is reset by a power-on reset. However, the C_HALT bit always clears on a system reset.

        To halt on a reset, the following bits must be enabled:

        • bit [0], VC_CORERESET, of the Debug Exception and Monitor Control Register
        • bit [0],C_DEBUGEN, of the Debug Halting Control and Status Register.
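When diffing register dumps from a run with the debugger attached against a run without it, it can help to decode a DHCSR value into named flags rather than comparing raw hex. The following is a minimal sketch in C. The bit positions are taken from the table above; the structure, the function name and the sample value in the usage note are hypothetical and are not taken from the dumps in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One readable flag of the DHCSR (address 0xE000EDF0). */
struct dhcsr_flag_s
{
  uint32_t mask;
  const char *name;
};

static const struct dhcsr_flag_s g_dhcsr_flags[] =
{
  { 1u << 25, "S_RESET_ST"  },
  { 1u << 24, "S_RETIRE_ST" },
  { 1u << 19, "S_LOCKUP"    },
  { 1u << 18, "S_SLEEP"     },
  { 1u << 17, "S_HALT"      },
  { 1u << 16, "S_REGRDY"    },
  { 1u << 5,  "C_SNAPSTALL" },
  { 1u << 3,  "C_MASKINTS"  },
  { 1u << 2,  "C_STEP"      },
  { 1u << 1,  "C_HALT"      },
  { 1u << 0,  "C_DEBUGEN"   },
};

#define DHCSR_NFLAGS (sizeof(g_dhcsr_flags) / sizeof(g_dhcsr_flags[0]))

/* Store the names of the flags set in 'value' into names[], highest bit
 * first, and return how many were stored (at most maxnames).
 */
static size_t dhcsr_decode(uint32_t value, const char *names[],
                           size_t maxnames)
{
  size_t n = 0;

  assert(names != NULL);
  for (size_t i = 0; i < DHCSR_NFLAGS && n < maxnames; i++)
    {
      if ((value & g_dhcsr_flags[i].mask) != 0)
        {
          names[n++] = g_dhcsr_flags[i].name;
        }
    }

  return n;
}
```

For example, a made-up value of 0x01040001 decodes to S_RETIRE_ST, S_SLEEP and C_DEBUGEN. In a real comparison, C_DEBUGEN shows whether halting debug is enabled and S_SLEEP shows whether the core was sleeping when the register was sampled.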

         

         

        894:   lpc17_dmainterrupt: 0xe000edf8 0xa048e4a00000000

        10.2.3. Debug Core Register Data Register

        The purpose of the Debug Core Register Data Register (DCRDR) is to hold data for reading and writing registers to and from the processor.

        The DCRDR:

        • is a 32-bit read/write register
        • address 0xE000EDF8.

        This is the data value written to the register selected by the Debug Register Selector Register.

        When the processor receives a request from the Debug Core Register Selector, this register is read or written by the processor using a normal load-store unit operation.

        If core register transfers are not being performed, software-based debug monitors can use this register for communication in non-halting debug. For example, OS RSD and Real View Monitor. This enables flags and bits to acknowledge state and indicate if commands have been accepted to, replied to, or accepted and replied to.

         

         

        895: lpc17_dmainterrupt: 0xe000edfc 0x00100 0000

         

        10.2.4. Debug Exception and Monitor Control Register

        The purpose of the Debug Exception and Monitor Control Register (DEMCR) is:

        • Vector catching. That is, to cause debug entry when a specified vector is committed for execution.
        • Debug monitor control.

        The DEMCR:

        • is a 32-bit read/write register
        • address 0xE000EDFC

        Figure 10.2 shows the arrangement of bits in the register.

        Figure 10.3. Debug Exception and Monitor Control Register format

        Table 10.4 shows the bit functions of the Debug Exception and Monitor Control Register.

        Table 10.4. Debug Exception and Monitor Control Register

        Bits      Type        Field     Function
        [31:25]   -           -         Reserved, SBZP
        [24]      Read/write  TRCENA

      • david_s5y
        Message 3 of 11 , Dec 29, 2011
          Greg,

          I figured out the cause of the GPDMA corruption. Unfortunately NXP did not put the following in the GPDMA section of the manual, just the power mode section:

          "The GPDMA may operate in Sleep mode to access AHB SRAMs and peripherals with GPDMA support, but the GPDMA cannot access the flash memory or the main SRAM, which are disabled in order to save power."

          My DMA structs and target memory were in main SRAM and I need to move them to Bank 0 or 1.
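A cheap guard against this class of bug is to check, before handing a buffer or descriptor to the GPDMA, that it lies entirely in AHB SRAM. The following is a minimal sketch in C. The address range is an assumption based on the LPC176x memory map (two contiguous 16KB AHB SRAM banks starting at 0x2007C000) and should be checked against the user manual for the actual part; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed LPC176x memory map; verify against the manual for your part. */
#define AHB_SRAM_BASE 0x2007C000u /* Start of AHB SRAM bank 0 */
#define AHB_SRAM_END  0x20084000u /* One past the end of AHB SRAM bank 1 */

/* Return true if [addr, addr + len) lies entirely inside AHB SRAM, which
 * the GPDMA can still reach while the CPU is in Sleep mode.
 */
static bool dma_sleep_safe(uint32_t addr, size_t len)
{
  if (addr < AHB_SRAM_BASE || addr >= AHB_SRAM_END)
    {
      return false;
    }

  return len <= (size_t)(AHB_SRAM_END - addr);
}

/* Debug-build guard to call on the target before starting a transfer. */
static void dma_assert_sleep_safe(const void *buf, size_t len)
{
  assert(dma_sleep_safe((uint32_t)(uintptr_t)buf, len));
}
```

Under these assumptions, an address such as 0x10004290 in main SRAM fails the check, while a buffer that starts in bank 0 and ends in bank 1 passes because the two banks are contiguous.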

          What is the best way to request allocation in a particular area of memory?

          If that cannot be done, then how do I preallocate some of Bank 0 and tell NuttX to exclude just a chunk of Bank 0?

          David
          --- In nuttx@yahoogroups.com, "Gregory N" <spudarnia@...> wrote:
          >
          > Hi, David,
          >
          > I'm not sure how I can help you with this problem. I don't follow everything that you say here (you are much closer to the problem than I can be by reading your email).
          >
          > > The values that get overwritten into the GPDMA registers come from 10 words (32-bit) below the IDLE_STACK value.
          >
          > The IDLE_STACK sits right above the end of .bss. So these values are in .bss?
          >
          > That sounds like some stack overflow (since the ARM has a push down stack). You mention "I have checked the stack penetrations and it looks ok", so I presume that you have already ruled out a stack overflow? There is stack overflow monitoring logic in arch/arm/src/common that can be enabled with CONFIG_DEBUG_STACK. You can increase the IDLE thread stack size with CONFIG_IDLETHREAD_STACKSIZE.
          >
          > > Somehow the 4 words at 0x10004290 are written to 0x50004100. Could there be an issue with context switching? (See below)
          >
          > I'm not aware of any issues with context switching. The context switching has been around for a long time so I tend to trust it (of course, code that has been around for a long time sometimes has errors too).
          >
          > I'm not sure how a context switch could write into 0x50004100 unless it corrupted the stack or registers. And the system wouldn't run very long at all if the context switching were routinely trashing values. Context switching bugs don't tend to be subtle.
          >
          > I even see, incidentally, places in the code where the stack pointers (SP=R13 and MSP) are sitting in the top of the .bss region:
          >
          > > lpc17_dmadump: DESTADDR[50004104]: 10004298
          > > ...
          > > lpc17_dmadump: CONTROL[5000410c]: 100042b8
          > > ...
          > > R8 = 10004288, ..., R12= 100042B8, R13= 10004288, MSP= 10004288, ...
          >
          > But perhaps those are just coincidences? If your .bss ends at 0x10004290 and your IDLE stack size is 2KB or so, then I would expect the SP/MSP to be a little below 0x10006290.
          >
          > > the DMA regs do not get corrupted. Yet changing sched_yield(); to usleep(1000); they do get corrupted.
          >
          > NOTE that if each task has a unique priority (which ideally should always be true), then sched_yield() does nothing since you cannot yield to threads of lower (or higher) priority.
          >
          > Greg
          >
        • Gregory N
          Message 4 of 11 , Dec 29, 2011
            > My DMA structs and target memory were in main SRAM and I need to move
            > them to Bank 0 or 1.
            >
            > What is the best way to request allocation in an area of memory?

            The NuttX memory manager treats all memory given to it as the same. There is no interface to select any characteristics of the memory and so no mechanism to pick Bank 0 or 1 memory.

            > If that can not be done, then how do I preallocate some of Bank 0 and
            > tell nuttx to exclude just a chunk of bank 0?

            Yes, I think you will have to pre-allocate memory.

            I recall that you are using an LPC17xx of some kind. If you are using the NuttX linker script, then all .data and .bss will go into main memory. The heap region for the LPC17xx is created in arch/arm/src/lpc17xx/lpc17_allocateheap.c. The function up_allocate_heap() adds the remaining main SRAM to the heap; the function up_addregion() adds the available portions of AHB bank1 and bank2 to the heap.

            At the top of arch/arm/src/lpc17xx/lpc17_allocateheap.c, you will see how I pre-allocated the Ethernet packet buffers. Like your GPDMA, these have to reside in bank1 or bank2. That is pretty complicated and could be simplified. It uses the EMAC RAM definitions in lpc17_emacram.h, along with definitions at the beginning of lpc17_allocateheap.c, to decide how much of the AHB SRAM is set aside for Ethernet packet buffers and how much is available for heap.

            You could do something like I did for the Ethernet packet buffers. That would be complicated but might be a good idea because then you can ensure that your buffer allocation is compatible with the Ethernet buffer allocation.

            Or you could modify the linker script to position data in bank1 or bank 2. You would have to:

            1) Declare data to reside in specially named sections using the GCC section attribute:

            http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html#Variable-Attributes
            http://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html

            This data should be uninitialized. Nothing will set any initial values for you.

            2) Modify the linker script to position these specially named variables in bank 1 or bank 2 wherever you want them. The linker script should export the variables to mark the beginning and end of the special memory.

            3) Modify up_addregion() so that the memory bounded by these exported values is not included in the heap.

            Or maybe you could just not add bank 1 to the heap and just manage the memory yourself.

            There are lots of options, but none are elegant.
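The section-attribute approach in steps 1 and 2 can be sketched roughly as follows. This is only an illustration: the section name .dma_sram, the macro DMA_SRAM, the symbols _sdma and _edma, the memory region name ahbsram0 and the four-word descriptor layout are all assumptions, not names from the NuttX tree. On a host build the section has no special placement; the linker script fragment in the comment is what would position it on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Place data in a named section when building for an ELF target. */
#if defined(__GNUC__) && defined(__ELF__)
#  define DMA_SRAM __attribute__((section(".dma_sram")))
#else
#  define DMA_SRAM
#endif

/* Assumed GPDMA linked-list item layout: four 32-bit words. */
struct dma_lli_s
{
  uint32_t srcaddr;
  uint32_t destaddr;
  uint32_t nextlli;
  uint32_t control;
};

static_assert(sizeof(struct dma_lli_s) == 16, "LLI must be four words");

#define DMA_NLLI    4
#define DMA_BUFSIZE 256

/* No initializers: nothing sets initial values for this section. */
static struct dma_lli_s g_dma_lli[DMA_NLLI] DMA_SRAM;
static uint8_t g_dma_buffer[DMA_BUFSIZE] DMA_SRAM;

/* Matching GNU ld fragment (a sketch; 'ahbsram0' is a hypothetical
 * MEMORY region covering the AHB SRAM you want to use):
 *
 *   .dma_sram (NOLOAD) : {
 *     _sdma = ABSOLUTE(.);
 *     *(.dma_sram)
 *     _edma = ABSOLUTE(.);
 *   } > ahbsram0
 */

/* Because the startup code does not clear this section, clear it
 * explicitly before first use.
 */
static void dma_mem_initialize(void)
{
  memset(g_dma_lli, 0, sizeof(g_dma_lli));
  memset(g_dma_buffer, 0, sizeof(g_dma_buffer));
}
```

The exported _sdma and _edma symbols are what step 3 would use to keep this range out of the heap.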

            Greg
          • Mike Smith
            Message 5 of 11 , Dec 29, 2011

              On Dec 29, 2011, at 8:47 AM, Gregory N wrote:

               


              > Or maybe you could just not add bank 1 to the heap and just managed the memory yourself.
              >
              > There are lots of options, but none are elegant.

              FWIW, I think this will be an issue with the STM32F4 as well, as the CCM SRAM appears not to be visible to the DMA controllers by my reading of the datasheet.

              If you were to imagine a new interface for allocating memory with specific characteristics, would you extend the malloc namespace (malloc_<something>) or overload mmap() in some fashion?

               = Mike

            • Gregory N
              Message 6 of 11 , Dec 29, 2011
                > > There are lots of options, but none are elegant.
                > >
                > >
                > FWIW, I think this will be an issue with the STM32F4 as well, as the CCM SRAM appears not to be visible to the DMA controllers by my reading of the datasheet.
                >
                > If you were to imagine a new interface for allocating memory with specific characteristics, would you extend the malloc namespace (malloc_<something>) or overload mmap() in some fashion?

                There are five global variables used to manage the heap (see mm/mm_internal.h). If those were repackaged in a structure, and references to that structure were used as "handles," then I could imagine two versions of each memory management function: a function that accepts a handle and operates on a particular memory space, and a standard function that uses a standard heap. Something like:

                struct mm_heap_s g_stdheap;
                #define malloc(s) heap_malloc(&g_stdheap, s)

                void *heap_create(void);
                void *heap_addregion(void *handle, void *start, size_t size);
                void *heap_malloc(void *handle, size_t size);
                etc.

                Then you could create a new heap, receive a heap handle, then call counterparts to all of the standard allocation functions.
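To show what the handle idea might look like in practice, here is a minimal, self-contained sketch in C. It is not the NuttX memory manager: each heap is a trivial bump allocator over a single region with no free(), the signatures differ from the prototypes above (they take a typed pointer instead of a void * handle), and the region sizes and the name g_dmaheap are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MM_ALIGN 8u

/* Per-heap state that would replace the global variables. */
struct mm_heap_s
{
  uint8_t *next; /* Next free byte */
  uint8_t *end;  /* One past the end of the region */
};

/* Give the heap its (single) memory region. */
static void heap_addregion(struct mm_heap_s *heap, void *start, size_t size)
{
  assert(heap != NULL && start != NULL);
  heap->next = (uint8_t *)start;
  heap->end  = heap->next + size;
}

/* Allocate from one specific heap; returns NULL when that heap is full. */
static void *heap_malloc(struct mm_heap_s *heap, size_t size)
{
  void *ret;

  assert(heap != NULL);
  if (size == 0 || size > SIZE_MAX - (MM_ALIGN - 1))
    {
      return NULL;
    }

  /* Round the request up to the allocation alignment. */
  size = (size + (MM_ALIGN - 1)) & ~(size_t)(MM_ALIGN - 1);
  if (size > (size_t)(heap->end - heap->next))
    {
      return NULL;
    }

  ret = heap->next;
  heap->next += size;
  return ret;
}

/* Two independent heaps over two stand-in memory regions. */
static _Alignas(8) uint8_t g_stdmem[256];
static _Alignas(8) uint8_t g_dmamem[64];
static struct mm_heap_s g_stdheap;
static struct mm_heap_s g_dmaheap;
```

Even this toy version shows the partitioning problem: once g_dmaheap is exhausted, heap_malloc(&g_dmaheap, ...) fails although g_stdheap still has plenty of free memory.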

                But I can think of lots of reasons why you would not want to do that:

                1. Of course, it is non-standard and I constantly have to struggle to avoid incorporating non-standard interfaces.

                2. Something like the above would partition the memory. That would prevent SRAM from being used efficiently. You could have plenty of available memory in heap1 while heap2 is used up or fragmented.

                A better solution would be more complex than would be worthwhile. You would really want to filter possible solutions for the allocation to meet certain minimal criteria. That introduces a whole new set of memory management problems. I don't really want to go there.

                3. Do you really, really need an allocator for DMA memory? DMA memory is not usually something that changes dynamically. Usually, you establish DMA buffers initially and they are never freed; they are usually statically allocated. Creating non-standard interfaces to cosmetically allocate static memory seems like a pointless exercise.

                4. DMA memory can have lots of other requirements, such as alignment or cacheability. I have worked with a lot of memory managers that try to do everything for everyone and they get very ugly very quickly.

                5. If DMA memory is allocated then never freed, it contributes to fragmentation of memory.

                So I am not convinced that having such special allocators makes sense, especially since it is so easy to statically allocate DMA memory with just the properties that you need using section attributes and the linker script.

                Greg
              • david_s5y
                Message 7 of 11 , Dec 29, 2011
                  It was simple enough to pull it into the linker script and offset the constants in up_allocate_heap().
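The offsetting step amounts to starting the heap region just past the statically placed DMA data. A minimal sketch of that arithmetic in C follows; the structure and function names are hypothetical, and the bank base and size in the usage note are assumed values for one 16KB AHB SRAM bank rather than constants taken from the NuttX source.

```c
#include <assert.h>
#include <stdint.h>

/* A region of memory that may be handed to the heap. */
struct heap_region_s
{
  uint32_t start;
  uint32_t size;
};

/* Given a RAM bank and the end address of the data the linker placed at
 * the start of that bank, return what is left over for the heap.
 */
static struct heap_region_s heap_region_after(uint32_t bank_base,
                                              uint32_t bank_size,
                                              uint32_t reserved_end)
{
  struct heap_region_s region;

  assert(reserved_end >= bank_base);
  assert(reserved_end - bank_base <= bank_size);

  region.start = reserved_end;
  region.size  = bank_size - (reserved_end - bank_base);
  return region;
}
```

For a bank at 0x2007C000 of size 0x4000 with 1KB of DMA data at its start, the heap region would begin at 0x2007C400 and span 0x3C00 bytes.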
