linux-kernel - Re: [rfc] lru_add_drain

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Thu, 10 Sep 2009 01:46:01 +0900
From:	Minchan Kim <minchan.kim@...il.com>
To:	Lee Schermerhorn <Lee.Schermerhorn@...com>
Cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mike Galbraith <efault@....de>, Ingo Molnar <mingo@...e.hu>,
	linux-mm <linux-mm@...ck.org>,
	Oleg Nesterov <onestero@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [rfc] lru_add_drain_all() vs isolation

Hi, Lee.
Long time no see. :)

On Thu, Sep 10, 2009 at 1:18 AM, Lee Schermerhorn
<Lee.Schermerhorn@...com> wrote:
> On Thu, 2009-09-10 at 00:39 +0900, Minchan Kim wrote:
>> On Wed, Sep 9, 2009 at 1:27 PM, KOSAKI Motohiro
>> <kosaki.motohiro@...fujitsu.com> wrote:
>> >> The usefulness of a scheme like this requires:
>> >>
>> >> 1. There are cpus that continually execute user space code
>> >>    without system interaction.
>> >>
>> >> 2. There are repeated VM activities that require page isolation /
>> >>    migration.
>> >>
>> >> The first page isolation activity will then clear the lru caches of the
>> >> processes doing number crunching in user space (and therefore the first
>> >> isolation will still interrupt). The second and following isolation will
>> >> then no longer interrupt the processes.
>> >>
>> >> 2. is rare. So the question is if the additional code in the LRU handling
>> >> can be justified. If lru handling is not time sensitive then yes.
>> >
>> > Christoph, I'd like to discuss a bit related (and almost unrelated) thing.
>> > I think page migration don't need lru_add_drain_all() as synchronous, because
>> > page migration have 10 times retry.
>> >
>> > Then asynchronous lru_add_drain_all() cause
>> >
>> >  - if system isn't under heavy pressure, retry succussfull.
>> >  - if system is under heavy pressure or RT-thread work busy busy loop, retry failure.
>> >
>> > I don't think this is problematic bahavior. Also, mlock can use asynchrounous lru drain.
>>
>> I think, more exactly, we don't have to drain lru pages for mlocking.
>> Mlocked pages will go into unevictable lru due to
>> try_to_unmap when shrink of lru happens.
>> How about removing draining in case of mlock?
>>
>> >
>> > What do you think?
>
>
> Remember how the code works:  __mlock_vma_pages_range() loops calliing
> get_user_pages() to fault in batches of 16 pages and returns the page
> pointers for mlocking.  Mlocking now requires isolation from the lru.
> If you don't drain after each call to get_user_pages(), up to a
> pagevec's worth of pages [~14] will likely still be in the pagevec and
> won't be isolatable/mlockable().  We can end up with most of the pages

Sorry for confusing.
I said not lru_add_drain but lru_add_drain_all.
Now problem is schedule_on_each_cpu.

Anyway, that case pagevec's worth of pages will be much
increased by the number of CPU as you pointed out.

> still on the normal lru lists.  If we want to move to an almost
> exclusively lazy culling of mlocked pages to the unevictable then we can
> remove the drain.  If we want to be more proactive in culling the
> unevictable pages as we populate the vma, we'll want to keep the drain.
>

It's not good that lazy culling of many pages causes high reclaim overhead.
But now lazy culling of reclaim is doing just only shrink_page_list.
we can do it shrink_active_list's page_referenced so that we can sparse
cost of lazy culling.

> Lee
>
>



-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/