[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1104271641350.25369@chino.kir.corp.google.com>
Date: Wed, 27 Apr 2011 16:51:07 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: john stultz <johnstul@...ibm.com>
cc: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Dave Hansen <dave@...ux.vnet.ibm.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>,
Michal Nazarewicz <mina86@...a86.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/2] break out page allocation warning code
On Tue, 26 Apr 2011, john stultz wrote:
> Sorry if this somehow got off on the wrong foot. Its just surprising to
> see such passion bubble up after almost two years of quiet since the
> proc patch went in.
>
It hasn't been two years, it hasn't even been 18 months.
$ git diff 4614a696bd1c.. | grep "^+.*current\->comm" | wc -l
42
Apparently those dozens of new references directly to current->comm since
the change also were unaware of the need to use get_task_comm() to avoid a
racy writer. I don't think there's any code in the kernel that is ok with
corrupted task names being printed: those messages are usually important.
> So I'm not proposing comm be totally lock free (Dave Hansen might do
> that for me, we'll see :) but when the original patch was proposed, the
> idea that transient empty or incomplete comms would be possible was
> brought up and didn't seem to be a big enough issue at the time to block
> it from being merged.
>
I'm not really interested in the discussion that happened at the time, I'm
concerned about racy readers of any thread's comm that result in corrupted
strings being printed or used in the kernel.
> Its just having a more specific case where these transient
> null/incomplete comms causes an issue would help prioritize the need for
> correctness.
>
It doesn't seem like there was any due diligence to ensure other code
wasn't broken. When comm could only be changed by prctl(), we needed no
protection for current->comm and so code naturally will reference it
directly. Since that's now changed, no audit was done to ensure the 300+
references throughout the tree doesn't require non-racy reads.
> In the meantime, I'll put some effort into trying to protect unlocked
> current->comm acccess using get_task_comm() where possible. Won't happen
> in a day, and help would be appreciated.
>
We need to stop protecting ->comm with ->alloc_lock since it is used for
other members of task_struct that may or may not be held in a function
that wants to read ->comm. We should probably introduce a seqlock.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists