Message-ID: <55427794.30808@hp.com>
Date:	Thu, 30 Apr 2015 14:42:28 -0400
From:	Waiman Long <waiman.long@...com>
To:	Jason Low <jason.low2@...com>
CC:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Aswin Chandramouleeswaran <aswin@...com>,
	Scott J Norton <scott.norton@...com>
Subject: Re: [PATCH v2 2/5] sched, numa: Document usages of mm->numa_scan_seq

On 04/29/2015 02:45 PM, Jason Low wrote:
> On Wed, 2015-04-29 at 14:14 -0400, Waiman Long wrote:
>> On 04/28/2015 04:00 PM, Jason Low wrote:
>>> The p->mm->numa_scan_seq is accessed using READ_ONCE/WRITE_ONCE
>>> and modified without exclusive access. It is not clear why it is
>>> accessed this way. This patch provides some documentation on that.
>>>
>>> Signed-off-by: Jason Low<jason.low2@...com>
>>> ---
>>>    kernel/sched/fair.c |   12 ++++++++++++
>>>    1 files changed, 12 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 5a44371..794f7d7 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -1794,6 +1794,11 @@ static void task_numa_placement(struct task_struct *p)
>>>    	u64 runtime, period;
>>>    	spinlock_t *group_lock = NULL;
>>>
>>> +	/*
>>> +	 * The p->mm->numa_scan_seq gets updated without
>>> +	 * exclusive access. Use READ_ONCE() here to ensure
>>> +	 * that the field is read in a single access.
>>> +	 */
>>>    	seq = READ_ONCE(p->mm->numa_scan_seq);
>>>    	if (p->numa_scan_seq == seq)
>>>    		return;
>>> @@ -2107,6 +2112,13 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
>>>
>>>    static void reset_ptenuma_scan(struct task_struct *p)
>>>    {
>>> +	/*
>>> +	 * We only did a read acquisition of the mmap sem, so
>>> +	 * p->mm->numa_scan_seq is written to without exclusive access.
>>> +	 * That's not much of an issue though, since this is just used
>>> +	 * for statistical sampling. Use WRITE_ONCE and READ_ONCE, which
>>> +	 * are not expensive, to avoid load/store tearing.
>>> +	 */
>>>    	WRITE_ONCE(p->mm->numa_scan_seq, READ_ONCE(p->mm->numa_scan_seq) + 1);
>>>    	p->mm->numa_scan_offset = 0;
>>>    }
>> READ_ONCE followed by a WRITE_ONCE won't stop load/store tearing from
>> happening unless you use an atomic instruction to do the increment. So I
>> think your comment may be a bit misleading.
> Right, the READ and WRITE operations will still be done separately and
> won't be atomic. Here, we're saying that this prevents load/store
> tearing on each of those individual write/read operations. Please let me
> know if you prefer this to be worded differently.

I do have a question about what kind of tearing you are talking about. Do
you mean tearing due to mm being changed in the middle of the access? The
reason I don't like this kind of construct is that I am not sure whether
the address of p->mm->numa_scan_seq is computed once or twice. I looked at
the compiled code, and it is computed only once.
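
To illustrate the concern, here is a minimal userspace sketch (not the
actual kernel code) that loads p->mm once into a local, so the source no
longer depends on how the compiler handles the repeated p->mm expression.
The struct names, the function name, and the simplified scalar-only
READ_ONCE/WRITE_ONCE definitions are assumptions made for illustration.

```c
/* Simplified, scalar-only stand-ins for the kernel macros (assumption:
 * the real definitions are more elaborate than these volatile casts). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-ins for struct mm_struct and struct task_struct. */
struct mm_demo {
	int numa_scan_seq;
	unsigned long numa_scan_offset;
};

struct task_demo {
	struct mm_demo *mm;
	int numa_scan_seq;
};

/* Same shape as reset_ptenuma_scan(), with p->mm read exactly once. */
static void reset_scan_demo(struct task_demo *p)
{
	struct mm_demo *mm = READ_ONCE(p->mm);

	/* Still a separate load and store; this is not an atomic increment. */
	WRITE_ONCE(mm->numa_scan_seq, READ_ONCE(mm->numa_scan_seq) + 1);
	mm->numa_scan_offset = 0;
}
```

With this form, both accesses to numa_scan_seq go through the same mm
pointer regardless of what the optimizer does with p->mm.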

Anyway, the purpose of READ_ONCE and WRITE_ONCE is not to eliminate data
tearing. They make sure that the compiler won't optimize away the data
accesses and that it keeps them in the order they appear in the program. I
don't think it is a good idea to associate tearing elimination with those
macros, so I would suggest removing the last sentence of your comment.
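
As a rough userspace sketch of the difference between the two kinds of
increment (the macro definitions are simplified assumptions, and
struct mm_sketch and both function names are hypothetical):

```c
#include <stdatomic.h>

/* Simplified, scalar-only approximations of the kernel macros
 * (assumption: not the real kernel definitions). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for the relevant part of struct mm_struct. */
struct mm_sketch {
	int numa_scan_seq;
};

/*
 * One load, one add, one store. Two concurrent callers can both load
 * the same old value, so an update may be lost.
 */
static void bump_seq_once(struct mm_sketch *mm)
{
	WRITE_ONCE(mm->numa_scan_seq, READ_ONCE(mm->numa_scan_seq) + 1);
}

/*
 * A single atomic read-modify-write. Concurrent callers cannot lose
 * an update. Returns the new value.
 */
static int bump_seq_atomic(atomic_int *seq)
{
	return atomic_fetch_add(seq, 1) + 1;
}
```

Single-threaded, both produce the same result. They differ only when
callers race, and for a counter used just for statistical sampling the
first form may well be good enough.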

Cheers,
Longman