linux-kernel - Re: [PATCH] cpuset: fix allocating page cache/slab object on the unallowed node when memory spread is set


Message-Id: <200902121255.43047.nickpiggin@yahoo.com.au>
Date:	Thu, 12 Feb 2009 12:55:42 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Paul Menage <menage@...gle.com>
Cc:	Paul Jackson <pj@....com>, miaox@...fujitsu.com,
	Andrew Morton <akpm@...ux-foundation.org>, mingo@...e.hu,
	linux-kernel@...r.kernel.org, cl@...ux-foundation.org
Subject: Re: [PATCH] cpuset: fix allocating page cache/slab object on the unallowed node when memory spread is set

On Thursday 12 February 2009 12:19:11 Paul Menage wrote:
> On Wed, Feb 11, 2009 at 4:54 PM, Nick Piggin <nickpiggin@...oo.com.au> wrote:
> > It would be possible, depending on timing, for the allocating thread to
> > see either pre or post mems_allowed even if access was fully locked.
>
> Right - seeing either the pre set or the post set is fine.
>
> > The only difference is that a partially changed mems_allowed could be
> > seen. But what does this really mean? Some combination of the new and
> > the old nodes. I don't think this is too much of a problem.
>
> But if the old and new nodes are disjoint, that could lead to seeing no
> nodes.

Well we could structure updates as setting all new allowed nodes,
then clearing newly disallowed ones.
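
As a rough illustration of that ordering (not kernel code), here is a
minimal userspace sketch. The names struct node_mask, MASK_WORDS,
mask_set_new(), mask_clear_old() and mask_update() are hypothetical, and
the plain word operations stand in for whatever atomic bit operations and
barriers a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_WORDS 4  /* hypothetical size: room for 256 nodes */

/* Hypothetical stand-in for a multi-word mems_allowed bitmap. */
struct node_mask {
    uint64_t bits[MASK_WORDS];
};

/* Returns 1 if no node is set in the mask, 0 otherwise. */
static int mask_empty(const struct node_mask *m)
{
    for (size_t i = 0; i < MASK_WORDS; i++)
        if (m->bits[i])
            return 0;
    return 1;
}

/* Phase 1: set all newly allowed nodes; dst becomes old | new. */
static void mask_set_new(struct node_mask *dst, const struct node_mask *newm)
{
    for (size_t i = 0; i < MASK_WORDS; i++)
        dst->bits[i] |= newm->bits[i];
}

/* Phase 2: clear newly disallowed nodes; dst becomes exactly new. */
static void mask_clear_old(struct node_mask *dst, const struct node_mask *newm)
{
    for (size_t i = 0; i < MASK_WORDS; i++)
        dst->bits[i] &= newm->bits[i];
}

/* Full update: add first, remove second. */
static void mask_update(struct node_mask *dst, const struct node_mask *newm)
{
    mask_set_new(dst, newm);
    /* The stored mask is non-empty here whenever the new mask is. */
    assert(mask_empty(newm) || !mask_empty(dst));
    mask_clear_old(dst, newm);
}
```

With this order the stored mask is never empty at any instant, even when
the old and new node sets are disjoint. A reader that scans the words one
at a time while both phases run could still assemble a mix of old and new
words; the sketch does not address that.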


> Also, having the results of cpuset_zone_allowed() and
> cpuset_current_mems_allowed change at random times over the course of
> a call to alloc_pages() might cause interesting effects (e.g. we make
> progress freeing pages from one set of nodes, and then call
> get_page_from_freelist() on a different set of nodes).

But again, is this really a problem? We're talking about a tiny
possibility in a very uncommon case anyway when the cpuset is
changing.

If it can cause an outright error like OOM, of course that's no
good, but if it just requires us to go around the reclaim loop
or allocate from another zone... I don't think that's so bad.


> > This could work if we *really* need an atomic snapshot of mems_allowed.
> > seqcount synchronisation would be an alternative too that could allow
> > sleeping more easily than SRCU (OTOH if you don't need sleeping, then
> > RCU should be faster than seqcount).
> >
> > But I'm not convinced we do need this to be atomic.
>
> It's possible that I'm being overly-paranoid here. The decision to
> make mems_allowed updates be purely pulled by the task itself predates
> my involvement with cpusets code by a long time.

It's not such a bad model, but the problem with it is that it needs
to be carefully spread over the VM, and in fastpaths too. Now if it
were something really critical, fine, but I'm hoping we can do
without.
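
For reference, this is roughly what the seqcount alternative quoted above
would look like if an atomic snapshot did turn out to be needed. It is a
userspace sketch using C11 atomics, not the kernel's seqcount API, and it
assumes a single writer (or writers serialized by some outer lock). The
names struct seq_mask, seq_mask_write() and seq_mask_read() are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_WORDS 4  /* hypothetical size: room for 256 nodes */

struct seq_mask {
    atomic_uint seq;                    /* odd while a write is in progress */
    _Atomic uint64_t bits[MASK_WORDS];  /* the protected node mask */
};

/* Writer side: assumes writers are already serialized against each other. */
static void seq_mask_write(struct seq_mask *m, const uint64_t *newbits)
{
    unsigned s = atomic_load_explicit(&m->seq, memory_order_relaxed);

    assert((s & 1u) == 0);  /* no other write may be in flight */
    atomic_store_explicit(&m->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (size_t i = 0; i < MASK_WORDS; i++)
        atomic_store_explicit(&m->bits[i], newbits[i], memory_order_relaxed);
    atomic_store_explicit(&m->seq, s + 2, memory_order_release);
}

/* Reader side: copy the mask, retry if a write overlapped the copy. */
static void seq_mask_read(struct seq_mask *m, uint64_t *out)
{
    unsigned s0, s1;

    do {
        s0 = atomic_load_explicit(&m->seq, memory_order_acquire);
        for (size_t i = 0; i < MASK_WORDS; i++)
            out[i] = atomic_load_explicit(&m->bits[i], memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&m->seq, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);
}
```

The reader takes no lock and never blocks the writer. The cost is the
extra sequence loads and a possible retry on every read, which is the
kind of fastpath overhead in question here.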

