Message-ID: <alpine.DEB.2.00.1002241253560.30870@chino.kir.corp.google.com>
Date: Wed, 24 Feb 2010 13:06:44 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Miao Xie <miaox@...fujitsu.com>
cc: Nick Piggin <npiggin@...e.de>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Lee Schermerhorn <lee.schermerhorn@...com>
Subject: Re: [regression] cpuset,mm: update tasks' mems_allowed in time (58568d2)

On Wed, 24 Feb 2010, Miao Xie wrote:
> >> Sorry, Could you explain what you advised?
> >> I think it is hard to fix this problem by adding a variant, because it is
> >> hard to avoid loading a word of the mask before
> >>
> >> nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
> >>
> >> and then loading another word of the mask after
> >>
> >> tsk->mems_allowed = *newmems;
> >>
> >> unless we use lock.
> >>
> >> Maybe we need a rw-lock to protect task->mems_allowed.
> >>
> >
> > I meant that we need to define synchronization only for configurations
> > that do not do atomic nodemask_t stores, it's otherwise unnecessary.
> > We'll need to load and store tsk->mems_allowed via a helper function that
> > is defined to take the rwlock for such configs and only read/write the
> > nodemask for others.
> >
>
> After investigating, we found that it is hard to guarantee consistency between
> mempolicy and mems_allowed because mempolicy was designed to be self-updating:
> it can only be changed by the task itself. Maybe we must change the
> implementation of mempolicy.
>
Before your change, cpuset nodemask changes were serialized on
manage_mutex which would, in turn, serialize the rebinding of each
attached task's mempolicy. update_nodemask() is now serialized on
cgroup_lock(), which also protects scan_for_empty_cpusets(), so the cpuset
code protects it adequately. If a user's set_mempolicy() changes the
mempolicy concurrently, however, it could introduce an inconsistency
between the task's mempolicy and its mems_allowed.

If we protect current->mems_allowed with a rwlock or seqlock for configs
where MAX_NUMNODES > BITS_PER_LONG, then we can always guarantee that we
get the entire nodemask. The same problem is present for
current->cpus_allowed, however, with NR_CPUS > BITS_PER_LONG. We must be
able to safely dereference both masks without the chance of returning
nodes_empty() or cpus_empty().
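
To make the multi-word problem concrete, here is a minimal userspace sketch
in C11. It is not kernel code and not taken from any patch in this thread.
All names (struct mask, struct guarded_mask, unguarded_torn_read(),
guarded_mask_read(), and so on) are hypothetical, MASK_WORDS = 2 stands in
for MAX_NUMNODES > BITS_PER_LONG, and the hand-rolled sequence counter only
approximates what the kernel's seqlock primitives (read_seqbegin() /
read_seqretry()) would provide. The first function replays the interleaving
described earlier in the thread and ends up with an empty snapshot; the
guarded reader retries until it sees a whole mask.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MASK_WORDS 2  /* assumption: mask is wider than one machine word */

struct mask { unsigned long bits[MASK_WORDS]; };

static bool mask_empty(const struct mask *m)
{
    for (size_t i = 0; i < MASK_WORDS; i++)
        if (m->bits[i])
            return false;
    return true;
}

/*
 * Replays the problematic interleaving with no guard at all: the reader
 * loads word 0, the writer then ORs in and assigns the new mask, and only
 * afterwards does the reader load word 1. Returns the reader's snapshot.
 */
static struct mask unguarded_torn_read(struct mask *shared,
                                       const struct mask *newmask)
{
    struct mask snap;

    snap.bits[0] = shared->bits[0];              /* reader: first word  */
    for (size_t i = 0; i < MASK_WORDS; i++)      /* writer: "or" step   */
        shared->bits[i] |= newmask->bits[i];
    *shared = *newmask;                          /* writer: assignment  */
    snap.bits[1] = shared->bits[1];              /* reader: second word */
    return snap;
}

/* Mask guarded by a sequence counter that is odd while a write is active. */
struct guarded_mask {
    atomic_uint seq;
    _Atomic unsigned long bits[MASK_WORDS];
};

static void guarded_mask_init(struct guarded_mask *g, const struct mask *m)
{
    atomic_init(&g->seq, 0);
    for (size_t i = 0; i < MASK_WORDS; i++)
        atomic_init(&g->bits[i], m->bits[i]);
}

/* Writers are assumed to be serialized by the caller (e.g. a mutex). */
static void guarded_mask_write(struct guarded_mask *g,
                               const struct mask *newmask)
{
    unsigned int prev = atomic_fetch_add(&g->seq, 1);   /* now odd */

    assert((prev & 1u) == 0);   /* catches unserialized writers */
    (void)prev;
    for (size_t i = 0; i < MASK_WORDS; i++)
        atomic_store(&g->bits[i], newmask->bits[i]);
    atomic_fetch_add(&g->seq, 1);                        /* even again */
}

/* Reader: retry until no write overlapped the word-by-word copy. */
static struct mask guarded_mask_read(struct guarded_mask *g)
{
    struct mask snap;
    unsigned int start;

    do {
        while ((start = atomic_load(&g->seq)) & 1u)
            ;  /* a write is in progress, wait for it to finish */
        for (size_t i = 0; i < MASK_WORDS; i++)
            snap.bits[i] = atomic_load(&g->bits[i]);
    } while (atomic_load(&g->seq) != start);
    return snap;
}
```

With an old mask that has a bit set only in word 1 and a new mask that has a
bit set only in word 0, unguarded_torn_read() returns an all-zero snapshot
even though neither the old, the intermediate, nor the new mask is empty.
guarded_mask_read() can only return a mask that some writer stored in full.
For configs where the mask fits in a single word, a helper like this could
collapse to a plain load and store.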