linux-kernel - Re: [PATCH] remove throttle_vm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <E1IdZLg-0002Wr-00@dorka.pomaz.szeredi.hu>
Date:	Fri, 05 Oct 2007 00:39:16 +0200
From:	Miklos Szeredi <miklos@...redi.hu>
To:	akpm@...ux-foundation.org
CC:	miklos@...redi.hu, wfg@...l.ustc.edu.cn, a.p.zijlstra@...llo.nl,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] remove throttle_vm_writeout()

> None of the above.
> 
>     [PATCH] vm: pageout throttling
>     
>     With silly pageout testcases it is possible to place huge amounts of memory
>     under I/O.  With a large request queue (CFQ uses 8192 requests) it is
>     possible to place _all_ memory under I/O at the same time.
>     
>     This means that all memory is pinned and unreclaimable and the VM gets
>     upset and goes oom.
>     
>     The patch limits the amount of memory which is under pageout writeout to be
>     a little more than the amount of memory at which balance_dirty_pages()
>     callers will synchronously throttle.
>     
>     This means that heavy pageout activity can starve heavy writeback activity
>     completely, but heavy writeback activity will not cause starvation of
>     pageout.  Because we don't want a simple `dd' to be causing excessive
>     latencies in page reclaim.
> 
> afaict that problem is still there.  It is possible to get all of
> ZONE_NORMAL dirty on a highmem machine.  With a large queue (or lots of
> queues), vmscan can them place all of ZONE_NORMAL under IO.
> 
> It could be that we've fixed this problem via other means in the interrim,
> but from a quick peek to seems to me that the scanner will still do a 100%
> CPU burn when all of a zone's pages are under writeback.

Ah, OK.

I did read the changelog, but you added quite a bit of translation ;)

> throttle_vm_writeout() should be a per-zone thing, I guess.  Perhaps fixing
> that would fix your deadlock.  That's doubtful, but I don't know anything
> about your deadlock so I cannot say.

No, doing the throttling per-zone won't in itself fix the deadlock.

Here's a deadlock example:

Total memory = 32M
/proc/sys/vm/dirty_ratio = 10
dirty_threshold = 3M
ratelimit_pages = 1M

Some program dirties 4M (dirty_threshold + ratelimit_pages) of mmap on
a fuse fs.  Page balancing is called which turns all these into
writeback pages.

Then userspace filesystem gets a write request, and tries to allocate
memory needed to complete the writeout.

That will possibly trigger direct reclaim, and throttle_vm_writeout()
will be called.  That will block until nr_writeback goes below 3.3M
(dirty_threshold + 10%).  But since all 4M of writeback is from the
fuse fs, that will never happen.

Does that explain it better?

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/