lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 15 Dec 2008 17:44:55 -0700
From:	"Dan Williams" <dan.j.williams@...il.com>
To:	"Justin Piszcz" <jpiszcz@...idpixels.com>
Cc:	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
	"Alan Piszcz" <ap@...arrain.com>
Subject: Re: 2.6.27.8: OOM killer: [swapper: page allocation failure]

On Fri, Dec 12, 2008 at 5:14 AM, Justin Piszcz <jpiszcz@...idpixels.com> wrote:
> Kernel: 2.6.27.8
>
> I have a RAID-6 volume with 6 disks:
> mdadm --create -v /dev/md0 --level=6 -n 6 -e 0.90 /dev/sd[e-j]1
>
> While it is building...
>
> # cat /proc/mdstat
> Personalities : [raid1] [raid6] [raid5] [raid4]
> md0 : active raid6 sdj1[5] sdi1[4] sdh1[3] sdg1[2] sdf1[1] sde1[0]
>      1562834944 blocks level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
>      [===>.................]  resync = 19.0% (74237440/390708736)
> finish=346.7min speed=15208K/sec
>
> unused devices: <none>
>
> During this time, I am copying files to it at ~60-100MiB/s using 2 rsyncs
> over NFS from 2 separate machines => this one [over the network] (on a pci-e
> x1 / built-in to the motherboard).
>
> top output:
> top - 07:10:06 up 12:32, 30 users,  load average: 3.46, 4.49, 4.39
> Tasks: 316 total,   2 running, 313 sleeping,   1 stopped,   0 zombie
> Cpu(s): 12.1%us,  3.7%sy,  0.0%ni, 32.5%id, 49.2%wa,  1.1%hi,  1.2%si,
>  0.0%st
> Mem:   8100836k total,  8057480k used,    43356k free,        0k buffers
> Swap: 16787884k total,      144k used, 16787740k free,  6609412k cached
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5349 root      15  -5     0    0    0 R   10  0.0  11:59.94 md0_raid5
>    1 root      20   0 10312  748  620 S    0  0.0   0:02.14 init
>
> I see this over and over in the kernel log (maybe it will stop when the
> RAID6
> rebuild is completed)?  So far, my rsyncs are continuing just fine and
> nothing except klogd/swapper processes have been killed.
>
> [44893.846532] swapper: page allocation failure. order:0, mode:0x20
[..]
>
> Raid optimizations used:
>
> # cat oraid.sh |grep -v ^#|grep .
> . /etc/profile
> echo "Optimizing RAID Arrays..."
> cd /sys/block
> DISKS=$(/bin/ls -1d sd[e-j])
> echo "Setting read-ahead to 32 MiB for /dev/md0"
> blockdev --setra 65536 /dev/md0
> echo "Setting stripe_cache_size to 16 MiB for /dev/md0"
> echo 16384 > /sys/block/md0/md/stripe_cache_size
>

If you set the stripe_cache_size high enough you can strangle the
system.  Do you really need 384MiB of raid cache?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists