Message-ID: <20111107045928.GK8927@hexapodia.org>
Date: Sun, 6 Nov 2011 20:59:28 -0800
From: Andy Isaacson <adi@...apodia.org>
To: linux-kernel@...r.kernel.org, linux-mm@...r.kernel.org
Subject: long sleep_on_page delays writing to slow storage
I am running 1a67a573b (3.1.0-09125 plus a small local patch) on a Core
i7 with 8 GB RAM, writing a few GB of data to a slow SD card attached via
usb-storage and formatted as vfat. I mounted it without specifying any
options; the resulting mount entry is
/dev/sdb1 /mnt/usb vfat rw,nosuid,nodev,noexec,relatime,uid=22448,gid=22448,fmask=0022,dmask=0022,codepage=cp437,iocharset=utf8,shortname=mixed,errors=remount-ro 0 0
I'm using rsync to write the data.
The system settles into a fairly steady state with roughly 600 MB dirty:
Dirty: 612280 kB
The dirty count stays high despite running sync(1) in another xterm.
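For anyone trying to reproduce this, a minimal sketch of how the dirty count could be sampled while the copy runs. It assumes the "Dirty:" line above comes from /proc/meminfo; the helper names `meminfo_kb` and `watch` are made up for illustration.

```python
import time


def meminfo_kb(text, field):
    """Return the kB value of `field` in /proc/meminfo-style text, or None."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            # Lines look like "Dirty:            612280 kB".
            return int(rest.split()[0])
    return None


def watch(samples=5, interval=1.0, path="/proc/meminfo"):
    """Print Dirty and Writeback a few times (hypothetical helper)."""
    for _ in range(samples):
        with open(path) as f:
            text = f.read()
        print("Dirty: %s kB  Writeback: %s kB"
              % (meminfo_kb(text, "Dirty"), meminfo_kb(text, "Writeback")))
        time.sleep(interval)
```

Calling `watch()` in one terminal while running sync(1) in another should show whether the dirty count is actually draining.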
The bug: Firefox (iceweasel 7.0.1-4) hangs at random intervals, with one
thread stuck in sleep_on_page:
[<ffffffff810c50da>] sleep_on_page+0xe/0x12
[<ffffffff810c525b>] wait_on_page_bit+0x72/0x74
[<ffffffff811030f9>] migrate_pages+0x17c/0x36f
[<ffffffff810fa24a>] compact_zone+0x467/0x68b
[<ffffffff810fa6a7>] try_to_compact_pages+0x14c/0x1b3
[<ffffffff810cbda1>] __alloc_pages_direct_compact+0xa7/0x15a
[<ffffffff810cc4ec>] __alloc_pages_nodemask+0x698/0x71d
[<ffffffff810f89c2>] alloc_pages_vma+0xf5/0xfa
[<ffffffff8110683f>] do_huge_pmd_anonymous_page+0xbe/0x227
[<ffffffff810e2bf4>] handle_mm_fault+0x113/0x1ce
[<ffffffff8102fe3d>] do_page_fault+0x2d7/0x31e
[<ffffffff812fe535>] page_fault+0x25/0x30
[<ffffffffffffffff>] 0xffffffffffffffff
It stays stuck there long enough for me to find the thread and attach
strace. Apparently it was stuck in
1320640739.201474 munmap(0x7f5c06b00000, 2097152) = 0
for somewhere between 20 and 60 seconds.
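A rough sketch of one way to find the stuck thread. It assumes traces in the format shown above can be read from /proc/<pid>/task/<tid>/stack, which typically needs root and a kernel built with stack trace support; the function names are hypothetical.

```python
import glob


def stuck_in(stack_text, symbol):
    """True if any frame in a /proc-style stack dump is in `symbol`."""
    for line in stack_text.splitlines():
        parts = line.split()
        # Frames look like "[<ffffffff810c50da>] sleep_on_page+0xe/0x12".
        if len(parts) >= 2 and parts[1].split("+")[0] == symbol:
            return True
    return False


def threads_stuck_in(pid, symbol):
    """Return the thread IDs of `pid` whose kernel stack contains `symbol`."""
    tids = []
    for path in glob.glob("/proc/%d/task/*/stack" % pid):
        try:
            with open(path) as f:
                if stuck_in(f.read(), symbol):
                    # Path is /proc/<pid>/task/<tid>/stack.
                    tids.append(int(path.split("/")[4]))
        except OSError:
            # Thread exited, or we lack permission to read its stack.
            continue
    return tids
```

The thread IDs returned by `threads_stuck_in(pid, "sleep_on_page")` can then be passed to `strace -p`.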
There's no reason to let a 6 MB/s, high-latency device lock up 600 MB of
dirty pages. At that rate I have to wait about a hundred seconds after
my app exits before the system becomes usable again.
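The hundred-second figure follows from dividing the dirty count by the device's write rate. A small sketch of that arithmetic, assuming a steady 6 MB/s and treating 1 MB as 1024 kB (the function name is illustrative):

```python
def drain_seconds(dirty_kb, write_mb_per_s):
    """Estimate seconds needed to write back `dirty_kb` at a steady rate."""
    return dirty_kb / (write_mb_per_s * 1024.0)


# With the Dirty value reported above and a 6 MB/s device this comes out
# at roughly 100 seconds.
estimate = drain_seconds(612280, 6)
```

This ignores any pages dirtied while writeback is in progress, so it is a lower bound on the real wait.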
As far as I can see, there's no way for me to work around this behavior
in userland, and I don't understand how compact_zone is intended to work
in this situation.
An edited but nearly full dmesg is at
http://web.hexapodia.org/~adi/snow/dmesg-3.1.0-09126-g4730284.txt
Thoughts?
Thanks,
-andy