Re: [nslu2-linux] SlugOS/BE -- Boot Sequence

  • Mike (mwester)
    Message 1 of 3 , Mar 10, 2008
      Scott Ruckh wrote:
      > I am trying to find out more about the boot sequence of the NSLU2
      > running SlugOS/BE.
      >
      > The situation I am having is that the disks that are connected do not
      > appear to be recognized the same between boots. For example, the sda1
      > drive might become the sdc1 drive after a reboot.

      Yes, there is no mechanism in Linux that can ensure that a USB mass
      storage device retains its device name. Any perceived consistency is
      entirely accidental! In the worst cases, it can happen without any
      human interaction at all due to race conditions; the one that ends up as
      /dev/sda is the first one that finished initialization at power-up.

      > When SlugOS/BE was first installed, a single disk was plugged into Port
      > 2. Two partitions were created, and they were known as /dev/sda1 and
      > /dev/sda2 (where sda2 was a swap partition).
      >
      > While all the apps were being configured, this configuration stayed the
      > same. At this point there were no troubles at all, and everything
      > worked great!
      >
      > Then a hub was introduced into Port 1 of the NSLU2 and a NTFS formatted
      > drive and a thumb drive were plugged into the hub. Even at this point
      > everything worked fine while the system was still active. I created the
      > thumb drive (sdc1) for swap, and mounted the NTFS partition (sdb5).
      > Everything was working just fine... until the NSLU2 was rebooted.

      Yep, you changed the USB world, and your device names moved about.

      > After the reboot the drives got mapped differently. Quite often the
      > SlugOS will not even pivot to the drive which was originally initialized
      > with the 'turnup' command. The only recovery method is to boot the
      > NSLU2 with no drives attached. Even when the SlugOS does not pivot to
      > the hard drive, the internal SlugOS firmware does not even get the
      > proper IP address. It is only when no drives are connected that the
      > NSLU2 can once again be logged into using the IP address assigned when
      > 'turnup init' was run.

      The pivot should be done on the partition's UUID, which is
      unique to each partition on the system, regardless of the device
      identifier. There is a fallback if the UUID doesn't exist (SlugOS will
      attempt to use the original /dev/sdxx that was used at the time you did
      the turnup operation). Also, there is a time delay -- unless you
      specified otherwise, it will wait 10 seconds for all the devices to
      become ready before it attempts to pivot.
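
The order described above (wait, try the UUID, fall back to the recorded device name) can be sketched roughly as follows. This is only an illustration of that decision, not the actual /boot/disk script; `choose_root_device`, `find_by_uuid`, and the sample UUID and device names are all hypothetical.

```python
import time
from typing import Callable, Optional


def choose_root_device(
    uuid: Optional[str],
    fallback_device: str,
    find_by_uuid: Callable[[str], Optional[str]],
    delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Pick the device to pivot to.

    Waits delay_s seconds so slow USB devices can finish initializing,
    then prefers the device that carries `uuid`. If no UUID was recorded,
    or no attached device has it, returns the device name recorded at
    turnup time (which may no longer point at the same disk).
    """
    sleep(delay_s)
    if uuid:
        device = find_by_uuid(uuid)
        if device:
            return device
    return fallback_device


if __name__ == "__main__":
    # Hypothetical state after a reboot: the root disk moved from sda to sdc.
    attached = {"aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee": "/dev/sdc1"}
    print(choose_root_device(
        "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "/dev/sda1",
        attached.get, delay_s=0,
    ))
```

If the UUID lookup fails, the fallback name is only as good as the device ordering on that particular boot, which would be consistent with the hit-or-miss behaviour described in the original question.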

      > Now, there is no rhyme or reason to when the SlugOS will actually boot
      > from the Hard Drive. I could plug the original drive all by itself back
      > into PORT 2 of the NSLU2 and SlugOS still may not boot to the Hard
      > Drive. Although this configuration worked time and time again when the
      > system was first configured, now there are no guarantees if it will
      > work. If it does not work after the first boot, it will not ever work
      > unless something else is done (like move the disk to PORT 1 on the
      > NSLU2; again I am not sure why this little trick even works).
      >
      > There is no pattern to when the slug is going to boot and when it will
      > not; it really appears to be blind luck. For example plugging the disk,
      > that was always plugged into PORT 2, and plugging it into PORT 1 might
      > make it boot fine.

      It's impossible to know since we don't know the path through the startup
      process -- but it sounds as if the UUID may have been lost or it's a
      timeout problem.

      > I tried to use the UUID directive in the /etc/fstab on the drive which
      > was originally initialized when SlugOS was first installed, but that does
      > not seem to improve anything.

      That fstab is not useful until *after* the pivot to the new filesystem
      occurs.

      > I could not locate the files that pivot the root file system and that
      > mounts the root file system and the swap drives.

      They are all on the internal flash. See /initrd/linuxrc for the file
      that runs right after the kernel boots. It will invoke /boot/disk (a
      script that you'll find as /initrd/boot/disk) with the timeout value as
      well as the UUID and the /dev/sdxx values to be used for the pivot.

      You need to confirm that the UUID mentioned in that linuxrc file still
      matches the one on your root filesystem (google for the command to run;
      I seem to recall "blkid" but I could be wrong). Also, you might
      consider changing the timeout value to give the hub and all the other
      stuff more time to sort themselves out. (To edit the linuxrc file, you
      need to boot up to flash because /initrd is mounted read-only.)
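
One way to do that check is to capture the output of blkid and look for the UUID string found in /initrd/linuxrc. The sketch below assumes blkid prints one line per device in the common `device: KEY="value" ...` form; the sample output, the UUIDs, and the function names are made up for illustration.

```python
import re
from typing import Dict, Optional

_LINE = re.compile(r'^(?P<dev>[^:]+):\s*(?P<tags>.*)$')
_TAG = re.compile(r'(\w+)="([^"]*)"')


def parse_blkid(output: str) -> Dict[str, Dict[str, str]]:
    """Map each device name to its KEY="value" tags from blkid-style text."""
    devices: Dict[str, Dict[str, str]] = {}
    for line in output.splitlines():
        match = _LINE.match(line.strip())
        if match:
            devices[match.group("dev")] = dict(_TAG.findall(match.group("tags")))
    return devices


def device_for_uuid(output: str, uuid: str) -> Optional[str]:
    """Return the device carrying `uuid` (case-insensitive), or None."""
    wanted = uuid.lower()
    for dev, tags in parse_blkid(output).items():
        if tags.get("UUID", "").lower() == wanted:
            return dev
    return None


# Made-up sample resembling blkid output after the device names shifted.
SAMPLE = '''\
/dev/sda1: UUID="0123456789ABCDEF" TYPE="ntfs"
/dev/sdb1: TYPE="swap"
/dev/sdc1: UUID="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" TYPE="ext3"
'''

if __name__ == "__main__":
    print(device_for_uuid(SAMPLE, "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"))
```

If the lookup returns nothing for the UUID stored in linuxrc, the pivot would have to rely on the fallback device name.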

      > I would like to mount the Thumb Drive as a higher priority for swap
      > space, but I could not find where that is configured. It must happen
      > well before the /etc/fstab table is used from the drive where SlugOS is
      > supposed to boot from because that file does not seem to play a part in
      > mounting the root "/" file system or the swap partitions.

      Swap happens post-pivot. Search the startup scripts for "swapon". I
      think it's rather indiscriminate; I seem to recall that it just does a
      "swapon -a".
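
If the startup scripts do run "swapon -a", the swap areas come from /etc/fstab on the pivoted root, and with the util-linux swapon a `pri=N` mount option sets the priority (higher values are used first). The sketch below only formats such fstab lines; whether the swapon shipped with SlugOS honours `pri=` is an assumption to verify, and the device names and helper name are hypothetical.

```python
def swap_fstab_line(device: str, priority: int) -> str:
    """Format an fstab swap entry with an explicit priority.

    Assumes a swapon that understands the pri= option (util-linux does);
    the kernel prefers swap areas with a higher priority value.
    """
    if not 0 <= priority <= 32767:
        raise ValueError("swap priority must be between 0 and 32767")
    return f"{device}\tnone\tswap\tsw,pri={priority}\t0\t0"


if __name__ == "__main__":
    # Hypothetical layout: thumb drive preferred over the hard-disk partition.
    print(swap_fstab_line("/dev/sdc1", 10))
    print(swap_fstab_line("/dev/sda2", 1))
```

Because the device names can move between boots, fixed /dev/sdxx entries like these carry the same risk as the root device does.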

      > I am trying to better understand the boot process so that I may be able
      > to persistently mount drives to their proper location and also
      > understand why the SlugOS does not always boot from the Hard Drive where
      > it was originally configured. I am also trying to understand why when
      > the SlugOS does not boot successfully from the HDD, the firmware OS is
      > not even accessible on the network. Once all drives have been
      > disconnected the Firmware SlugOS boots with the IP values assigned
      > during the 'turnup init' process. Why isn't this true when a HDD is
      > plugged in, but the root "/" file system is never pivoted to the HDD?
      >
      > Now that multiple drives have been introduced it just appears to be luck
      > when the SlugOS will pivot correctly to the originally initialized
      > drive. I have run e2fsck on the drive and that still does not appear to
      > be beneficial. Unfortunately I have not spotted a pattern to when it
      > will boot properly and when it won't. The only pattern that appears to
      > be consistent is that if you get it to boot once and do not touch a
      > single drive, subsequent boots appear to work fine. If anything
      > changes, all bets are off, and getting SlugOS booted again is just an
      > act of black magic.
      >
      > The original goal is to keep the originally initialized drive plugged
      > into Port 2 and have a hub in Port 1. The thumb drive would always stay
      > plugged into the hub as a swap partition, and the NTFS would be able to
      > be mobile, in that it could be moved from NSLU2 to a Windows based PC.
      >
      > Even though I have gotten this configuration working while the slug is
      > on, I can not seem to ensure that it will work between boots.
      >
      > Is there a WIKI on the boot process, and what files to look at so I can
      > mount the drives to the proper file system mount points? If I had any
      > idea of how the boot sequence worked I might be able to better
      > understand why the SlugOS pivots correctly sometimes, and why it doesn't
      > other times.

      It's all there in /initrd on the booted device. Just follow the logic
      in the scripts; it's pretty simple for a basic disk boot.

      I might suggest that you consider a serial port for your NSLU2 as well;
      that's the ultimate tool to debug these sorts of problems.

      BTW, I'm assuming you are using a recent SlugOS version; you don't
      mention which one.

      > Sorry for the long post.
      >
      > Thanks.
      > Scott

      Mike (mwester)
    • Scott Ruckh
      Message 2 of 3 , Mar 10, 2008
        This is what you said Mike (mwester)
        > Scott Ruckh wrote:
        >> I am trying to find out more about the boot sequence of the NSLU2
        >> running SlugOS/BE.
        >>
        >> The situation I am having is that the disks that are connected do not
        >> appear to be recognized the same between boots. For example, the sda1
        >> drive might become the sdc1 drive after a reboot.
        >
        > Yes, there is no mechanism in Linux that can ensure that a USB mass
        > storage device retains its device name. Any perceived consistency is
        > entirely accidental! In the worst cases, it can happen without any
        > human interaction at all due to race conditions; the one that ends up as
        > /dev/sda is the first one that finished initialization at power-up.
        >
        >> When SlugOS/BE was first installed, a single disk was plugged into Port
        >> 2. Two partitions were created, and they were known as /dev/sda1 and
        >> /dev/sda2 (where sda2 was a swap partition).
        >>
        >> While all the apps were being configured, this configuration stayed the
        >> same. At this point there were no troubles at all, and everything
        >> worked great!
        >>
        >> Then a hub was introduced into Port 1 of the NSLU2 and a NTFS formatted
        >> drive and a thumb drive were plugged into the hub. Even at this point
        >> everything worked fine while the system was still active. I created the
        >> thumb drive (sdc1) for swap, and mounted the NTFS partition (sdb5).
        >> Everything was working just fine... until the NSLU2 was rebooted.
        >
        > Yep, you changed the USB world, and your device names moved about.
        >
        >> After the reboot the drives got mapped differently. Quite often the
        >> SlugOS will not even pivot to the drive which was originally initialized
        >> with the 'turnup' command. The only recovery method is to boot the
        >> NSLU2 with no drives attached. Even when the SlugOS does not pivot to
        >> the hard drive, the internal SlugOS firmware does not even get the
        >> proper IP address. It is only when no drives are connected that the
        >> NSLU2 can once again be logged into using the IP address assigned when
        >> 'turnup init' was run.
        >
        > The pivot should be done on the partition's UUID, which is
        > unique to each partition on the system, regardless of the device
        > identifier. There is a fallback if the UUID doesn't exist (SlugOS will
        > attempt to use the original /dev/sdxx that was used at the time you did
        > the turnup operation). Also, there is a time delay -- unless you
        > specified otherwise, it will wait 10 seconds for all the devices to
        > become ready before it attempts to pivot.
        >
        >> Now, there is no rhyme or reason to when the SlugOS will actually boot
        >> from the Hard Drive. I could plug the original drive all by itself back
        >> into PORT 2 of the NSLU2 and SlugOS still may not boot to the Hard
        >> Drive. Although this configuration worked time and time again when the
        >> system was first configured, now there are no guarantees if it will
        >> work. If it does not work after the first boot, it will not ever work
        >> unless something else is done (like move the disk to PORT 1 on the
        >> NSLU2; again I am not sure why this little trick even works).
        >>
        >> There is no pattern to when the slug is going to boot and when it will
        >> not; it really appears to be blind luck. For example plugging the disk,
        >> that was always plugged into PORT 2, and plugging it into PORT 1 might
        >> make it boot fine.
        >
        > It's impossible to know since we don't know the path through the startup
        > process -- but it sounds as if the UUID may have been lost or it's a
        > timeout problem.
        >
        >> I tried to use the UUID directive in the /etc/fstab on the drive which
        >> was originally initialized when SlugOS was first installed, but that does
        >> not seem to improve anything.
        >
        > That fstab is not useful until *after* the pivot to the new filesystem
        > occurs.
        >
        >> I could not locate the files that pivot the root file system and that
        >> mounts the root file system and the swap drives.
        >
        > They are all on the internal flash. See /initrd/linuxrc for the file
        > that runs right after the kernel boots. It will invoke /boot/disk (a
        > script that you'll find as /initrd/boot/disk) with the timeout value as
        > well as the UUID and the /dev/sdxx values to be used for the pivot.
        >
        > You need to confirm that the UUID mentioned in that linuxrc file still
        > matches the one on your root filesystem (google for the command to run;
        > I seem to recall "blkid" but I could be wrong). Also, you might
        > consider changing the timeout value to give the hub and all the other
        > stuff more time to sort themselves out. (To edit the linuxrc file, you
        > need to boot up to flash because /initrd is mounted read-only.)

        Thanks, I will take a look. It was blkid that I used to get the UUID for putting in the /etc/fstab file.
        Although, as already discussed, it was not helpful, as this needs to be done pre-pivot.

        >
        >> I would like to mount the Thumb Drive as a higher priority for swap
        >> space, but I could not find where that is configured. It must happen
        >> well before the /etc/fstab table is used from the drive where SlugOS is
        >> supposed to boot from because that file does not seem to play a part in
        >> mounting the root "/" file system or the swap partitions.
        >
        > Swap happens post-pivot. Search the startup scripts for "swapon". I
        > think it's rather indiscriminate; I seem to recall that it just does a
        > "swapon -a".

        Thanks. I will search and try to make the swapon command more specific.

        >
        >> I am trying to better understand the boot process so that I may be able
        >> to persistently mount drives to their proper location and also
        >> understand why the SlugOS does not always boot from the Hard Drive where
        >> it was originally configured. I am also trying to understand why when
        >> the SlugOS does not boot successfully from the HDD, the firmware OS is
        >> not even accessible on the network. Once all drives have been
        >> disconnected the Firmware SlugOS boots with the IP values assigned
        >> during the 'turnup init' process. Why isn't this true when a HDD is
        >> plugged in, but the root "/" file system is never pivoted to the HDD?
        >>
        >> Now that multiple drives have been introduced it just appears to be luck
        >> when the SlugOS will pivot correctly to the originally initialized
        >> drive. I have run e2fsck on the drive and that still does not appear to
        >> be beneficial. Unfortunately I have not spotted a pattern to when it
        >> will boot properly and when it won't. The only pattern that appears to
        >> be consistent is that if you get it to boot once and do not touch a
        >> single drive, subsequent boots appear to work fine. If anything
        >> changes, all bets are off, and getting SlugOS booted again is just an
        >> act of black magic.
        >>
        >> The original goal is to keep the originally initialized drive plugged
        >> into Port 2 and have a hub in Port 1. The thumb drive would always stay
        >> plugged into the hub as a swap partition, and the NTFS would be able to
        >> be mobile, in that it could be moved from NSLU2 to a Windows based PC.
        >>
        >> Even though I have gotten this configuration working while the slug is
        >> on, I can not seem to ensure that it will work between boots.
        >>
        >> Is there a WIKI on the boot process, and what files to look at so I can
        >> mount the drives to the proper file system mount points? If I had any
        >> idea of how the boot sequence worked I might be able to better
        >> understand why the SlugOS pivots correctly sometimes, and why it doesn't
        >> other times.
        >
        > It's all there in /initrd on the booted device. Just follow the logic
        > in the scripts; it's pretty simple for a basic disk boot.

        Thanks for the tips.

        >
        > I might suggest that you consider a serial port for your NSLU2 as well;
        > that's the ultimate tool to debug these sorts of problems.
        >
        > BTW, I'm assuming you are using a recent SlugOS version; you don't
        > mention which one.

        SlugOS/BE 4.8 Beta

        >
        >> Sorry for the long post.
        >>
        >> Thanks.
        >> Scott
        >
        > Mike (mwester)
        >