
Re: [blug-prog] Linux kernel stack limit

  • Om
    Message 1 of 10 , Jul 13, 2004
      int3 wrote:
      >>Why is the stack size of the Linux kernel 8k on
      >>32-bit architectures? Why is this limitation there, and is there any
      >
      >
      > Because the kernel cannot afford to give more stack to each thread.
      >
      A kernel thread does not even get the full 8kB. The task_struct is kept in
      the kernel stack area of that process (at least in 2.4 kernels). It is quite
      a big structure, taking around 1k (not very sure; have a look at
      include/linux/sched.h), so in effect the process gets only around 7kB of stack.

      >
      >
      > There is no way to increase or decrease the kernel stack; that number was
      > arrived at through testing and is not an arbitrary one. Even NT 3.51 gives 8k.
      > Win2k onwards gives 12k.

      If you change the task_union union, I think it is possible to change the
      stack size of the kernel. But there may not be any system call for this
      purpose, since it is a compile-time value.

      I read quite some time back that using some system calls we can
      specify an alternate stack on which a function should run. This has to
      be supported by the hardware (processor). ia32 supports this, and it might
      be what Norton AV uses.

      HTH.
      Om.
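One real system call that loosely matches the "alternate stack" description above is sigaltstack(2). Whether it is the call Om had in mind is an assumption. It is a user-space mechanism for signal handlers only: it does not change the kernel stack and needs no special hardware support. A minimal sketch, in which the function name and the 64 KB size are illustrative choices:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)    /* arbitrary size chosen for this sketch */

static unsigned char *alt_stack;      /* base of the alternate stack */
static volatile sig_atomic_t on_alt;  /* set by the handler */

/* The handler checks whether one of its own locals lies inside the
 * alternate stack, i.e. whether it is really running there. */
static void on_sigusr1(int signo)
{
    unsigned char probe = 0;
    uintptr_t p = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)alt_stack;

    (void)signo;
    on_alt = (p >= lo && p < lo + ALT_STACK_SIZE);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it did not,
 * and -1 if setting things up failed. */
int handler_runs_on_alt_stack(void)
{
    stack_t ss;
    struct sigaction sa;
    int result = -1;

    alt_stack = malloc(ALT_STACK_SIZE);
    if (alt_stack == NULL)
        return -1;

    /* Register the buffer as the alternate signal stack. */
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = ALT_STACK_SIZE;
    if (sigaltstack(&ss, NULL) != 0) {
        free(alt_stack);
        return -1;
    }

    /* SA_ONSTACK asks for this handler to run on the alternate stack. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == 0) {
        on_alt = 0;
        raise(SIGUSR1);           /* the handler has returned when raise() returns */
        result = on_alt;
    }

    /* Disable the alternate stack before releasing its memory. */
    ss.ss_flags = SS_DISABLE;
    sigaltstack(&ss, NULL);
    free(alt_stack);
    return result;
}
```

Calling handler_runs_on_alt_stack() from main() should return 1 on a system where sigaltstack is available.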
    • suresh kumar
      Message 2 of 10 , Jul 13, 2004
        > in user mode, it will be a security flaw. This kernel stack is limited to 8k
        > on the kernel side.
        >
        > If we are running too many threads in the system, then it is difficult to
        > give more memory on the kernel side. User applications can always be pageable and
        You mean to say that for every task running in user space there is
        a corresponding stack in kernel space, which is used whenever a
        system call is executed?
        > it doesn't have any problem, but on the kernel side not all the code can be stored
        > in non-paged memory.
        And since kernel memory can't be swapped, we can't have a bigger stack
        in the kernel.
        >
        > IIRC, if everyone can afford more RAM, then surely they can increase it. I am
        > sure in Itanium 64-bit systems it is increased by tons :-)

        I read that for a 64-bit processor the stack limit is only 16k.

        >
        > Btw, Norton AntiVirus on WinNT maintains its own stack, since it is not
        What does maintaining its own stack mean?

        thanks
        Suresh

      • suresh kumar
        Message 3 of 10 , Jul 13, 2004
          > One kernel thread does not even get full 8kB. the task_struct is kept in
          > the kernel stack of that process. This is quite big structure taking
          > around 1k (Not very sure, have a look at include/linux/sched.h) so in
          > effect, process gets around 7kB stack only
          We don't even get 7k: about 1k is needed whenever an interrupt
          comes in, since it uses the stack of the currently executing thread.
          If we are programming for packet processing, some amount of
          stack is already consumed by the Linux functions themselves.
          >
          >
          > I have read quite sometime back that using some system calls we can
          > specify the alternate stack on which the function should run. This is to
          What is this alternate stack?
          > be supported by hardware (processor). ia32 supports this and this might
          > be used in norton av.
          I should have mentioned: it is not system calls that I have a
          problem with, it is tasklet processing.


        • int3
          Message 4 of 10 , Jul 13, 2004
            > > in user mode, it will be a security flaw. This kernel stack is
            > > limited to 8k on the kernel side.
            > >
            > > If we are running too many threads in the system, then it is
            > > difficult to give more memory on the kernel side. User applications
            > > can always be pageable and
            > You mean to say that for every task running in user space there is
            > a corresponding stack in kernel space, which is used whenever a
            > system call is executed?

            Yep. I don't see what other method an OS could follow to handle a system call
            without an individual stack of its own on the kernel side.

            > > it doesn't have any problem, but on the kernel side not all the code
            > > can be stored in non-paged memory.
            > And since kernel memory can't be swapped, we can't have a bigger
            > stack in the kernel.

            It can have both paged and non-paged memory; not all of it can be swapped out.

            > >
            > > IIRC, if everyone can afford more RAM, then surely they can
            > > increase it. I am
            > > sure in Itanium 64-bit systems it is increased by tons :-)
            >
            > I read that for a 64-bit processor the stack limit is only 16k.

            Hmm... that is too little.

            > >
            > > Btw, Norton AntiVirus on WinNT maintains its own stack, since
            > > it is not
            > What does maintaining its own stack mean?

            It switches to its own stack before doing any operation in the kernel.

            As Om explained, it does this with hardware support.

            Regards,
            Satish K.S
          • int3
            Message 5 of 10 , Jul 13, 2004
              > the size of the stack below 8k? Is there any tool
              > by which I can know to what max extent the stack has reached,
              > so that I can be careful.

              Try this:
              http://www.uwsg.iu.edu/hypermail/linux/kernel/9702.3/0364.html

              Regards,
              Satish K.S
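The linked post is not reproduced here. Independently of it, one common way to answer the "how deep has the stack gone" question is stack painting: fill the stack area with a known byte before use, then later scan for the first byte that no longer holds the pattern. The sketch below applies the idea to a plain user-space buffer standing in for a stack. The names paint_region, high_water_mark and the pattern value are hypothetical, and the downward growth direction is the x86 convention.

```c
#include <stddef.h>
#include <string.h>

#define PAINT 0xA5   /* arbitrary fill pattern chosen for this sketch */

/* Fill a stack-like region with the pattern before it is used. */
void paint_region(unsigned char *base, size_t size)
{
    memset(base, PAINT, size);
}

/* The stack is assumed to grow downward: usage starts at base + size and
 * moves toward base. Scan upward from base for the first byte that has
 * been overwritten; everything from there to the top counts as used. */
size_t high_water_mark(const unsigned char *base, size_t size)
{
    size_t i = 0;

    while (i < size && base[i] == PAINT)
        i++;
    return size - i;
}
```

The result is a lower bound on the deepest usage: if the deepest byte written happens to equal the pattern, the scan goes past it and reports slightly less.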
            • Karthick Ramnarayanan
              Message 6 of 10 , Jul 14, 2004
                -----Original Message-----
                >>From: suresh kumar [mailto:suresh_vin@...]
                >>Sent: Monday, July 12, 2004 8:09 PM
                >>To: linux-bangalore-programming@yahoogroups.com
                >>Subject: [blug-prog] Linux kernel stack limit

                >>Hi,
                >>Why is the stack size of the Linux kernel 8k on
                >>32-bit architectures? Why is this limitation there, and is there any
                >>way that I can increase the size of the stack? What other
                >>alternatives are there in C programming so that I keep
                >>the size of the stack below 8k? Is there any tool
                >>by which I can know to what max extent the stack has reached,
                >>so that I can be careful.
                >>Thanks in advance.
                >>regards
                >>Suresh

                >>V.V Suresh Kumar

                Okay!! (For INTEL arch.)
                Let me clear up this confusion. What is this 8K that people are
                referring to?
                I hope the author has gone through the kernel source, or has some idea
                about the kernel internals.
                From USER_SPACE, or the user-space view of a process:
                Stack limit: by default, it's set to RLIM_INFINITY, or 2**32 - 1.
                You can get it with getrlimit(RLIMIT_STACK, &rlimit);
                But that's still restricted by the virtual address space, or TASK_SIZE:
                3GB for a 3:1 split.
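The getrlimit() call mentioned above can be tried from user space with a few lines of C. This is a minimal sketch; the helper names query_stack_limit and print_limit are made up for the example, and what it prints depends on the system and on shell settings such as ulimit -s.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the soft and hard stack limits of the calling process.
 * Returns 0 on success and -1 on failure, like getrlimit() itself. */
int query_stack_limit(struct rlimit *out)
{
    return getrlimit(RLIMIT_STACK, out);
}

/* Print one limit, spelling out RLIM_INFINITY instead of a huge number. */
void print_limit(const char *label, rlim_t value)
{
    if (value == RLIM_INFINITY)
        printf("%s: unlimited\n", label);
    else
        printf("%s: %llu bytes\n", label, (unsigned long long)value);
}
```

A caller would declare a struct rlimit r, call query_stack_limit(&r), and then pass r.rlim_cur and r.rlim_max to print_limit() to see the soft and hard limits.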
                I think INT3, or Satish, has confused you a bit, even though he is a
                top-class hacker.
                First explanation:
                How does the stack limit get set to RLIM_INFINITY by default?
                That's an "execve" trick, done while setting up your argument pages, or
                the argument stack. You start at STACK_TOP, subtract 32 pages for
                the arguments, and start your stack from there. So that's the program
                stack, and the program stack (minus the args) gets expanded through
                standard page faults. The arg pages are still COW pages.

                FROM KERNEL SPACE, or the kernel-space view of a process:
                Let's take an example to actually explain the 8K idea, i.e. the
                THREAD_SIZE idea: the CURRENT pointer is aligned to the _KERNEL_
                _STACK_ (ptregs->thread.esp and _not_ ptregs->esp) on a THREAD_SIZE, or
                8K, boundary.
                It will be clear if you look at "copy_thread" (i386/kernel/process.c),
                or if you are familiar with context switching.
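The alignment described above is what lets a 2.4-style i386 kernel find the current task from the stack pointer alone: masking off the low 13 bits of ESP gives the base of the 8K block. A user-space sketch of that arithmetic, with stack_base_of as a hypothetical name and the address in the note below as a made-up example value:

```c
#include <stdint.h>

#define THREAD_SIZE 8192u   /* 8K, as discussed in the post */

/* Round a stack pointer down to the start of its THREAD_SIZE-aligned
 * block. This only works because the block is aligned on a THREAD_SIZE
 * boundary and THREAD_SIZE is a power of two. */
uintptr_t stack_base_of(uintptr_t sp)
{
    return sp & ~((uintptr_t)THREAD_SIZE - 1);
}
```

For example, stack_base_of(0xC0123456) yields 0xC0122000, since the low 13 bits (0x1456) are cleared.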
                Note this carefully:
                I am assuming (as INT3's explanation also does) that you are aware of
                syscall or exception handling in monolithic kernels.
                From user space:
                Move the syscall number to eax, and the args to ebx, ecx, edx, esi, edi
                and (ebp) offsetted,

                int 0x80
                Enter kernel space, or RING0:
                IDT handler invoked:
                Actions:

                Save all the registers
                [ The stack that is used is not the process stack per se (the one you
                get through execve, or in RING3). You cannot find this out by looking at
                the code, as it's not an explicit "mov" to ESP but an implicit move by
                the CPU to set up the kernel stack, because the kernel manipulates the
                TASK STATE SEGMENT on context switches so that "esp" and "esp0" are those
                of current->thread.esp and current->thread.esp0 (to which the CURRENT
                pointer is aligned on a THREAD_SIZE, or 8K, boundary). The TSS is set up
                during init fork via its GDT descriptor, loaded with an LTR (load task
                register) instruction. ]
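The user-space half of the sequence above (syscall number in a register, arguments in further registers, then the trap) is normally hidden inside libc. The glibc syscall() function exposes it without hand-written assembly: it takes the number and arguments, places them according to the architecture's convention, and traps into the kernel, classically via int 0x80 on i386. A small sketch, with raw_getpid as an illustrative name:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid by system call number instead of through its usual
 * libc wrapper. SYS_getpid is the number that ends up in eax on i386. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The value returned should match what getpid() reports for the same process.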

                The order of saving, before the kernel stacks the rest manually (as
                defined by struct pt_regs in ptrace.h), on a context switch from RING3
                to RING0 is as follows.
                CPU stacks:
                Old SS
                Old ESP -> this is the process stack
                EFLAGS
                CS -> for the process, set up in the GDT (0x8 or multiples of 8, if you
                understand the GDT)
                Old EIP
                Then comes the stack dump as defined by "struct pt_regs", done by the
                kernel.
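Because the x86 stack grows downward, the value pushed first (old SS) ends up at the highest address and the value pushed last (old EIP) at the lowest. Viewed as a C struct in increasing address order, the five CPU-pushed words from the list above look like the sketch below. The struct name is hypothetical; in the i386 struct pt_regs the same five slots appear as its last members.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: the words the CPU pushes on a RING3 -> RING0
 * transition, listed from lowest to highest address. */
struct ring3_trap_frame {
    uint32_t eip;      /* pushed last: where user code resumes       */
    uint32_t cs;       /* user code segment, in a 32-bit slot        */
    uint32_t eflags;   /* flags at the time of the trap              */
    uint32_t esp;      /* old ESP: the process (user) stack          */
    uint32_t ss;       /* pushed first: user stack segment           */
};
```

An iret pops these in address order, which is why it restores EIP first and SS last.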

                Now in RING0, you use the 8K kernel stack per process in kernel space,
                as that's the THREAD_SIZE, or the memory allocated to the task_struct,
                on process creation. This is a safe figure, and should be okay till the
                ret_from_syscall or ret_from_{intr,exception}.
                The value of OLD_ESP gets passed on to the child thread on a fork.
                All the registers on the parent's stack (pt_regs) are copied
                to the child. The EIP is set to "ret_from_fork", and on a "switch_to"
                (include/asm-i386/system.h), the EIP of the process being switched out
                is saved to task_struct->thread.eip, and its ESP to
                task_struct->thread.esp. Then the process selected by the scheduler has
                its stack set up from thread.esp and its EIP from thread.eip.
                The EIP is moved by just pushing it onto the stack before jumping to
                the FASTCALL routine (args in ax and dx) "__switch_to" in
                arch/i386/kernel/process.c, whose return ends up at the EIP that was
                pushed.

                So on the return path, all the values from "thread->esp" are popped
                into their respective registers (same as pt_regs), and an "iret"
                has the CPU popping off EIP, CS, EFLAGS, ESP and SS. Note that
                ESP and SS aren't stacked by the CPU when the privilege level (DPL,
                descriptor privilege level) does not change (RING0 to RING0 switches,
                for example).
                Hope the confusion is cleared up now.
                Regards,

                A.R.Karthick
                Software Engineer,
                Infosys Technologies LTD,
                Bangalore
                Tel:
                Office Direct-+91-80-25010915
                Office -+91-80 - 28520261 - Extn(2915)
                Res: +91-80-26784135