Message-ID: <4F7FCC8A.6050707@openvz.org>
Date: Sat, 07 Apr 2012 09:11:38 +0400
From: Konstantin Khlebnikov <khlebnikov@...nvz.org>
To: Hugh Dickins <hughd@...gle.com>
CC: Oleg Nesterov <oleg@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Roland Dreier <roland@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH RFC] mm: account VMA before forced-COW via /proc/pid/mem
Hugh Dickins wrote:
> On Thu, 5 Apr 2012, Konstantin Khlebnikov wrote:
>> Oleg Nesterov wrote:
>>> On 04/04, Konstantin Khlebnikov wrote:
>>>> Oleg Nesterov wrote:
>>>>> On 04/02, Konstantin Khlebnikov wrote:
>>>>>>
>>>>>> Currently kernel does not account read-only private mappings into
>>>>>> memory commitment.
>>>>>> But these mappings can be force-COW-ed in get_user_pages().
>>>>>
>>>>> Heh. tail -n3 Documentation/vm/overcommit-accounting
>>>>> may be you should update it then.
>>>>
>>>> I just wonder how fragile this accounting...
>>>
>>> I meant, this patch could also remove this "TODO" from the docs.
>>
>> Actually I dug into this code for killing VM_ACCOUNT vma flag.
>> Currently we cannot do this only because asymmetry in mprotect_fixup():
>> it account vma on read-only -> writable conversion, but keep on backward
>> operation.
>> Probably we can kill this asymmetry, and after that we can recognize
>> accountable vma by its others flags state, so we don't need special
>> VM_ACCOUNT for this.
>
> (I believe the VM_ACCOUNT flag will need to stay.)
>
> But this is just a quick note to say that I'm not ignoring you: I have
> a strong interest in this, but only now found time to look through the
> thread and ponder, and I'm not yet ready to decide.
>
> I've long detested that behaviour of GUP write,force, and my strong
> preference would be not to layer more strangeness upon strangeness,
> but limit the damage by making GUP write,force fail in that case,
> instead of inserting a PageAnon page into a VM_SHARED mapping.
>
> I think it's unlikely that it will cause a regression in real life
> (it already fails if you did not open the mmap'ed file for writing),
> but it would be a user-visible change in behaviour, and I've research
> to do before arriving at a conclusion.
Agreed, but this stuff is very weak, even with sysctl vm.overcommit_memory=2.
We should probably fix up the accounting for /proc/pid/mem only for this case,
because vm.overcommit_memory=2 is supposed to protect against overcommit, but
currently it does not.
>
> Hugh
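
For reference, here is a minimal userspace sketch of the forced-COW case
discussed in this thread. It is an illustration only, not part of the patch:
the function name forced_cow_demo() and the scratch file path are made up for
this example, and it assumes a Linux system with /proc mounted and a kernel
that still lets /proc/pid/mem writes force COW on a read-only private mapping.
It shows only that the mapping gets a private copy while the file stays
intact; it does not measure whether that page is charged to the commit limit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns  1 if a write through /proc/self/mem changed a read-only
 *            private file mapping while the file itself stayed intact
 *            (a forced COW happened),
 *          0 if /proc/self/mem is unavailable or the write was refused,
 *         -1 on setup failure.
 */
int forced_cow_demo(void)
{
    char path[] = "/tmp/forced-cow-XXXXXX";  /* hypothetical scratch file */
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;
    int fd, memfd;
    char *buf;
    char *map;
    char on_disk = 0;
    ssize_t n;

    assert(page > 0);

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                            /* file lives until close() */

    /* Fill one page with 'A' so the mapping is fully file-backed. */
    buf = malloc((size_t)page);
    if (!buf)
        goto out_fd;
    memset(buf, 'A', (size_t)page);
    n = write(fd, buf, (size_t)page);
    free(buf);
    if (n != page)
        goto out_fd;

    /* Read-only private mapping: not accounted as committed memory. */
    map = mmap(NULL, (size_t)page, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
        goto out_fd;

    result = 0;
    memfd = open("/proc/self/mem", O_RDWR);
    if (memfd >= 0) {
        /* The file offset in /proc/self/mem is the virtual address. */
        n = pwrite(memfd, "B", 1, (off_t)(uintptr_t)map);
        close(memfd);

        /* Mapping must show 'B', file on disk must still hold 'A'. */
        if (n == 1 &&
            pread(fd, &on_disk, 1, 0) == 1 &&
            *(volatile char *)map == 'B' && on_disk == 'A')
            result = 1;
    }

    munmap(map, (size_t)page);
out_fd:
    close(fd);
    return result;
}
```

Call forced_cow_demo() from a small main() and print the result. A return
value of 1 means the process now owns an anonymous page behind a mapping that
was never writable, which is the page the accounting discussed above does not
cover.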