Message-ID: <20090317131251.GU16897@balbir.in.ibm.com>
Date: Tue, 17 Mar 2009 18:42:51 +0530
From: Balbir Singh <balbir@...ux.vnet.ibm.com>
To: Bharata B Rao <bharata@...ux.vnet.ibm.com>
Cc: Li Zefan <lizf@...fujitsu.com>, linux-kernel@...r.kernel.org,
Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
Paul Menage <menage@...gle.com>, Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH -tip] cpuacct: Make cpuacct hierarchy walk in
cpuacct_charge() safe when rcupreempt is used.
* Bharata B Rao <bharata@...ux.vnet.ibm.com> [2009-03-17 13:06:49]:
> On Tue, Mar 17, 2009 at 02:28:11PM +0800, Li Zefan wrote:
> > Bharata B Rao wrote:
> > > cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> > > rcupreempt is used.
> > >
> > > cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> > > This can race with the task's movement between cgroups. This race
> > > can cause an access to freed ca pointer in cpuacct_charge(). This will not
> >
> > Actually it can also end up accessing an invalid tsk->cgroups. ;)
> >
> > get tsk->cgroups (cg)
> > (move tsk to another cgroup) or (tsk exiting)
> > -> kfree(tsk->cgroups)
> > get cg->subsys[..]
>
> Ok :) Here is the patch again with updated description.
>
> cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> rcupreempt is used.
>
> cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> This can race with the task's movement between cgroups. This race
> can cause an access to freed ca pointer in cpuacct_charge() or access
> to invalid cgroups pointer of the task. This will not happen with rcu or
> tree rcu as cpuacct_charge() is called with preemption disabled. However if
> rcupreempt is used, the race is seen. Thanks to Li Zefan for explaining this.
>
> Fix this race by explicitly protecting ca and the hierarchy walk with
> rcu_read_lock().
>
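For readers following along, the fix described in the changelog above can
be sketched roughly as below. This is a userspace illustration only, not
the patch itself: rcu_read_lock()/rcu_read_unlock() are stubbed so the
control flow compiles outside the kernel, rcu_depth is a hypothetical
helper, struct task is a simplified stand-in for the task structure, and a
single usage field replaces the real per-CPU counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stubs: in the kernel these are the real RCU primitives. */
static int rcu_depth;                       /* nesting depth of the stubs */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Simplified accounting group: one counter plus a link to the parent. */
struct cpuacct {
	uint64_t usage;
	struct cpuacct *parent;
};

/* Simplified task: ca stands in for the lookup via tsk->cgroups. */
struct task {
	struct cpuacct *ca;
};

/*
 * Charge cputime to the task's group and to every ancestor.
 * Both the fetch of the task's ca and the upward walk sit inside the
 * read-side critical section, so neither can outlive the objects
 * they dereference.
 */
static void cpuacct_charge(struct task *tsk, uint64_t cputime)
{
	struct cpuacct *ca;

	rcu_read_lock();
	for (ca = tsk->ca; ca != NULL; ca = ca->parent) {
		assert(rcu_depth > 0);      /* walk happens under the lock */
		ca->usage += cputime;
	}
	rcu_read_unlock();
}
```

The point of the sketch is only the placement of the lock: the pointer is
read after rcu_read_lock() and is not used after rcu_read_unlock().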
Looks good and works very well, except for the batch issue that you
pointed out: updates are not seen until up to a batch's worth of
changes has accumulated.

I'd like to get the patches into -tip and see the results. As an
enhancement to this patch, I would recommend using percpu_counter_sum()
when reading the data. As long as user space does not overwhelm us with
a lot of reads, sum would work out better.
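To illustrate the batch effect and why a summing read avoids it, here is
a minimal userspace sketch of a batched per-CPU counter. It is a model
under stated assumptions, not the kernel's percpu_counter code: the names
pc_counter, pc_add(), pc_read() and pc_sum(), and the NR_CPUS and BATCH
values, are all hypothetical, and there is no locking.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4      /* assumed CPU count for the sketch */
#define BATCH   32     /* assumed batch size for the sketch */

struct pc_counter {
	int64_t count;            /* global value, may lag behind */
	int32_t local[NR_CPUS];   /* per-CPU deltas not yet folded in */
};

/* Add to the per-CPU delta; fold into the global count at the batch. */
static void pc_add(struct pc_counter *c, int cpu, int32_t amount)
{
	int32_t v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	v = c->local[cpu] + amount;
	if (v >= BATCH || v <= -BATCH) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: global count only, so unfolded deltas are invisible. */
static int64_t pc_read(const struct pc_counter *c)
{
	return c->count;
}

/* Accurate read: global count plus every per-CPU delta. */
static int64_t pc_sum(const struct pc_counter *c)
{
	int64_t sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model the cheap read can be stale by just under BATCH per CPU,
while the summing read visits every CPU on each call, which is why it
only makes sense if reads are infrequent.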
Tested-by: Balbir Singh <balbir@...ux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@...ux.vnet.ibm.com>
--
Balbir