
Re: [nslu2-linux] Re: Big endian vs Little endian and Network performance

  Yann E. MORIN
  Jul 2, 2006
      Brian,
      All,

      On Sunday 02 July 2006, Brian Wood wrote:
      > Why "Not the best option", do you mean you should be using a faster
      > drive for root and swap ? What is the "best" option?

      Swap is so _slowww_, when compared to memory. So the faster the hdd, the
      better for swap usage.
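
      To put rough figures on that gap (just a sketch; assumes hdparm is installed
      and the disk shows up as /dev/sda, adjust for your setup):

      root@landeda:~# hdparm -t /dev/sda    # uncached reads: roughly what a swap-in costs
      root@landeda:~# hdparm -T /dev/sda    # cached reads: roughly memory/page-cache speed

      The -t figure normally sits far below the -T one, which is the whole point
      of keeping swap off the critical path whenever you can.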

      > Something like this:
      > http://www.geeks.com/details.asp?invtid=77P1636&cat=HDD

      Hmm, donuts!... :-) Anyway I would not trust a flash-based hard drive, whether
      NAND or NOR, as a swap device. Flash wears out sooner or later, and using it
      as swap is, well, don't take it as an offense, Just Plain Stupid (TM).

      > Oh Boy! Real World Numbers, this is what we need.

      My job. Part of it, at least.

      > But you're indicating it's the network that's limiting speed:
      > So perhaps the question is what is limiting the network speed? If the
      > limitation is on the slug end is it the slug itself ? (hardware) or
      > could the NIC drivers possibly be made better ? TCP/IP stack?

      I still have to transfer a far bigger file using nc (no fs overhead) and see
      the load on the slug. And do the reverse as well: transfer this file from the
      slug to the PC, and look at the load as well.

      Well, after trying, I could not get nc on the slug to connect to my PC...
      "nc kemper 10000" returned exit status 1, even though my PC was listening... :-(

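      If anyone wants to chase that failure, a rough checklist (just a sketch; the
      nc on the slug may not support every flag):

      root@landeda:~# ping -c 1 kemper             # name resolution / basic reachability
      root@landeda:~# nc -v kemper 10000           # -v (if supported) shows the actual error
      root@kemper:# netstat -tln | grep 10000      # check the PC really is listening on that port
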
      > How do we go about determining which end is limiting the speed? I
      > guess by connecting to something else (PC-to-PC ?)

      I have a second PC at hand that can be connected to the first through a switch.
      I'll try that. Results below.

      > It's hard for me to see how any switch, "real" or otherwise, could be
      > faster than a piece of copper, even though I will grant that a lot of
      > consumer switches are crap - they have NICs that will ACK a "100
      > speed full-duplex" connection even though they have processing
      > engines that can't manage even half that.

      Not to mention the backplane, which might not even exceed 100Mbps. Now just
      imagine that you've got 4 machines, talking in pairs at 100Mbps, through a
      four-port switch with a 100Mbps backplane; then the backplane will be the
      limiting factor:

      ---------------------------  backplane @100Mbps
        |       |       |       |
       PC1 <-> PC2     PC3 <-> PC4    all @100Mbps
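
      An easy way to see that in practice, assuming iperf is installed on all four
      boxes (hostnames here are just examples):

      root@pc2:# iperf -s &                  # receivers on PC2 and PC4
      root@pc4:# iperf -s &
      root@pc1:# iperf -c pc2 -t 30 &        # start both flows at the same time
      root@pc3:# iperf -c pc4 -t 30 &

      On a switch whose backplane really is limited to 100Mbps, each pair should
      report roughly half the ~94Mbps a single flow gets on an idle 100Mbps link.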

      > But I don't think our friend reporting 80Mbs. is lying to us either.
      > I think he is really seeing those numbers on his screen, we just have
      > to figure out if they mean what they seem to mean or not :-)

      Local cache on the PC side. I can't yet understand how he managed it, but
      that sounds like some kind of cache issue.

      Anyway, results of more tests follow:

      ===============================================================================

      Two PCs exchanging a 1GiB file, cat-5 cables and a no-name 100Mbps switch
      in between.

      On PC1 (not loaded):
      root@lesneven:# nc -l -p 10000 >/dev/null

      On PC2 (not loaded, 1024MiB RAM):
      root@kemper:# dd if=/dev/urandom of=urandom.dat bs=1048576 count=1024 # 1GiB
      root@kemper:# cat urandom.dat |time nc -q 0 lesneven 10000
      0.12user 1.43system 1:31.22elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+341minor)pagefaults 0swaps

      That is an average transfer rate of 89.804Mibps, that is 11.225MiBps.
      Whaoo! Impressive! I never tried that and I'm really surprised!
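
      For the record, those figures come straight from the elapsed time (1GiB in
      91.22s); something like this reproduces them, assuming bc is installed:

      root@kemper:# echo "scale=3; 1024 / 91.22" | bc         # MiBps
      11.225
      root@kemper:# echo "scale=3; 1024 * 8 / 91.22" | bc     # Mibps
      89.804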

      ===============================================================================

      Back to the slug, transferring a 1GiB file from PC to slug over a netcat
      session.

      On slug:
      root@landeda:~# nc -l -p 10000 >/dev/null &
      root@landeda:~# while true; do
      > LOAD=`cat /proc/loadavg`
      > echo ${LOAD}
      > sleep 5
      > done >load.dat

      On PC (not loaded, 1024MiB RAM):
      root@kemper:# dd if=/dev/urandom of=urandom.dat bs=1048576 count=1024 # 1GiB
      root@kemper:# cat urandom.dat |time nc -q 0 landeda 10000
      0.02user 1.62system 2:51.59elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+386minor)pagefaults 0swaps

      That is an average bitrate of 47.741Mibps, that is 5.967MiBps. We are back to
      the numbers we already had. Not terrible numbers, but rather low compared to
      what my network cables and NIC can sustain. Now on to the load:

      root@landeda:~# cat load.dat
      1.01 1.09 1.13 1/61 18434 <- The load of a slug at rest is at least 1.00,
                                   because the ixp400_eth.ko module is badly
                                   written and always causes a 1.00 load, although
                                   the CPU usage is nil. Zero, nothing, not a
                                   single cycle used, but a 1.00 load
                                   nonetheless... :-( Go into the CSR code to
                                   understand that if you can; I gave up. So
                                   subtract this 1.00 in the calculations below.
      1.01 1.09 1.13 2/61 18437
      1.01 1.09 1.13 1/61 18440
      1.01 1.09 1.13 2/61 18443
      1.01 1.08 1.12 1/61 18446
      1.01 1.08 1.12 2/61 18449 <- after ~30s, the load has not moved: my
                                   little script has almost no impact. Good.
      1.09 1.10 1.13 3/61 18452 <- Aha! Transmission in action!
      1.16 1.11 1.13 2/61 18455
      1.23 1.13 1.14 3/61 18458
      1.29 1.14 1.14 3/61 18461
      1.34 1.16 1.15 3/61 18464
      1.40 1.17 1.15 3/61 18467
      1.44 1.18 1.15 3/61 18470 <- load rising :-|
      1.49 1.20 1.16 3/61 18473
      1.53 1.21 1.16 3/61 18476
      1.57 1.22 1.17 4/61 18479
      1.60 1.24 1.17 3/61 18482
      1.63 1.25 1.18 3/61 18485
      1.66 1.26 1.18 3/61 18488
      1.69 1.27 1.18 4/61 18491
      1.71 1.28 1.19 3/61 18494
      1.74 1.30 1.19 3/61 18497
      1.76 1.31 1.20 3/61 18500 <- still rising... :-/
      1.78 1.32 1.20 3/61 18503
      1.79 1.33 1.20 2/61 18506
      1.81 1.34 1.21 3/61 18509
      1.83 1.35 1.21 3/61 18512
      1.84 1.36 1.22 3/61 18515
      1.85 1.37 1.22 3/61 18518
      1.86 1.38 1.22 3/61 18521
      1.87 1.39 1.23 3/61 18524
      1.88 1.40 1.23 3/61 18527 <- load rather high... :-(
      1.89 1.41 1.24 3/61 18530
      1.90 1.42 1.24 3/61 18533
      1.91 1.43 1.24 3/61 18536
      1.92 1.44 1.25 3/61 18539
      1.92 1.45 1.25 3/61 18542
      1.93 1.46 1.26 3/61 18545
      1.93 1.47 1.26 3/61 18548
      1.94 1.48 1.26 3/61 18551 <- Hmm, this is the last sample while transmission
                                   was still on-going...
      1.94 1.48 1.27 2/60 18554 <- transmission is finished...
      1.87 1.48 1.27 2/60 18557
      1.80 1.47 1.26 3/60 18560
      1.73 1.46 1.26 2/60 18563

      Maximum load when receiving: 0.94. That's really, really _big_.
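
      (That is the 1.94 peak above minus the constant 1.00 baseline.) To strip the
      baseline out of load.dat before eyeballing or plotting it, something like
      this would do, assuming awk is available on the slug:

      root@landeda:~# awk '{ printf "%.2f\n", $1 - 1.00 }' load.dat   # 1-min load, baseline removed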

      So what does that _mean_? It looks like we _are_ CPU-bound on the slug, and
      thus it cannot even sustain receiving more than around 48Mibps. And that is
      just _receiving_.
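
      To confirm it really is the CPU and not something hiding behind the load
      average (which also counts tasks stuck in I/O wait), a follow-up test could
      sample vmstat while the transfer runs, assuming vmstat is installed on the
      slug:

      root@landeda:~# vmstat 5 >vmstat.dat &       # one sample every 5 seconds
      ... run the nc transfer as above ...
      root@landeda:~# cat vmstat.dat               # look at the us/sy/id/wa columns

      If idle stays near 0 and sy (system time) is high for the whole transfer,
      that would pretty much nail it as CPU-bound.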

      Looking at the wiki, there is a section on using the DMA acceleration from
      the NPEs. That could help, but still... :-/

      ===============================================================================

      Regards,
      Yann E. MORIN.

      --
      .-----------------.--------------------.------------------.--------------------.
      | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
      | +0/33 662376056 | Software Designer | \ / CAMPAIGN | ^ |
      | --==< °_° >==-- °---.----------------: X AGAINST | /e\ There is no |
      | web: ymorin.free.fr | SETI@home 3808 | / \ HTML MAIL | """ conspiracy. |
      °---------------------°----------------°------------------°--------------------°