Date:	Tue, 10 Feb 2009 01:12:27 -0500
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Jens Axboe <jens.axboe@...cle.com>, akpm@...ux-foundation.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>, thomas.pi@...or.dea,
	Yuriy Lalym <ylalym@...il.com>, ltt-dev@...ts.casi.polymtl.ca,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm fix page writeback accounting to fix oom condition
	under heavy I/O

* Linus Torvalds (torvalds@...ux-foundation.org) wrote:
> 
> 
> On Mon, 9 Feb 2009, Mathieu Desnoyers wrote:
> > 
> > So this patch fixes this behavior by only decrementing the page accounting
> > _after_ the block I/O writepage has been done.
> 
> This makes no sense, really.
> 
> Or rather, I don't mind the notion of updating the counters only after IO 
> per se, and _that_ part of it probably makes sense. But why is it that you 
> only then fix up two of the call-sites. There's a lot more call-sites than 
> that for this function. 
> 
> So if this really makes a big difference, that's an interesting starting 
> point for discussion, but I don't see how this particular patch could 
> possibly be the right thing to do.
> 

Yes, you are right. Looking in more detail at /proc/meminfo under the
workload, I notice this:

MemTotal:       16028812 kB
MemFree:        13651440 kB
Buffers:            8944 kB
Cached:          2209456 kB   <--- increments up to ~16GB

        cached = global_page_state(NR_FILE_PAGES) -
                        total_swapcache_pages - i.bufferram;

SwapCached:            0 kB
Active:            34668 kB
Inactive:        2200668 kB   <--- also

                K(pages[LRU_INACTIVE_ANON] + pages[LRU_INACTIVE_FILE]),

Active(anon):      17136 kB
Inactive(anon):        0 kB
Active(file):      17532 kB
Inactive(file):  2200668 kB   <--- also

                K(pages[LRU_INACTIVE_FILE]),

Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      19535024 kB
SwapFree:       19535024 kB
Dirty:           1159036 kB
Writeback:             0 kB  <--- stays close to 0
AnonPages:         17060 kB
Mapped:             9476 kB
Slab:              96188 kB
SReclaimable:      79776 kB
SUnreclaim:        16412 kB
PageTables:         3364 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    27549428 kB
Committed_AS:      54292 kB
VmallocTotal:   34359738367 kB
VmallocUsed:        9960 kB
VmallocChunk:   34359727667 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        7552 kB
DirectMap2M:    16769024 kB

So I think simply subtracting K(pages[LRU_INACTIVE_FILE]) from
avail_dirty in clip_bdi_dirty_limit(), and taking it into account in
balance_dirty_pages() and throttle_vm_writeout(), would probably make my
problem go away. However, I would like to understand exactly why this is
needed, and whether there are other types of page counts that have been
forgotten and that I would also need to consider.
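
To make the arithmetic concrete, here is a minimal userspace sketch of the
idea, not the kernel code. The helper sketch_avail_dirty() and the
SKETCH_PAGE_SHIFT constant are hypothetical names introduced only for
illustration, and 4 kB pages are assumed. K() has the same shape as the
macro used to print /proc/meminfo (pages to kB). The sketch keeps the
subtraction in pages and clamps at zero, on the assumption that a negative
"available dirty" value should be treated as no headroom at all.

```c
#include <stdio.h>

/* Assumption: 4 kB pages. The kernel uses PAGE_SHIFT here. */
#define SKETCH_PAGE_SHIFT 12

/* Pages -> kB, same shape as the K() macro behind /proc/meminfo. */
#define K(x) ((x) << (SKETCH_PAGE_SHIFT - 10))

/*
 * Hypothetical helper, not a kernel function: remove the inactive
 * file pages from the dirty headroom (both in pages), clamping at 0
 * so the result never wraps around.
 */
static unsigned long sketch_avail_dirty(unsigned long avail_dirty,
                                        unsigned long inactive_file)
{
	if (inactive_file >= avail_dirty)
		return 0;
	return avail_dirty - inactive_file;
}

/* Print the headroom before and after the subtraction, in kB. */
static void sketch_report(unsigned long avail_dirty,
                          unsigned long inactive_file)
{
	printf("avail_dirty:    %lu kB\n", K(avail_dirty));
	printf("inactive(file): %lu kB\n", K(inactive_file));
	printf("remaining:      %lu kB\n",
	       K(sketch_avail_dirty(avail_dirty, inactive_file)));
}
```

With the dump above, Inactive(file) is 2200668 kB, i.e. 550167 pages of
4 kB, so calling sketch_report() with that value as the second argument
shows how much headroom would be left for a given avail_dirty.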

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68