linux-kernel - Re: [RFC][PATCH 0/5] Introduce checks for preemptable code for this_cpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1316536260.29966.93.camel@gandalf.stny.rr.com>
Date:	Tue, 20 Sep 2011 12:31:00 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Christoph Lameter <cl@...two.org>
Cc:	Valdis.Kletnieks@...edu, linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [RFC][PATCH 0/5] Introduce checks for preemptable code for
 this_cpu_read/write()

On Tue, 2011-09-20 at 11:08 -0500, Christoph Lameter wrote:
> On Tue, 20 Sep 2011, Steven Rostedt wrote:
> 
> > On Tue, 2011-09-20 at 09:57 -0500, Christoph Lameter wrote:
> >
> > > this_cpu_xx functions are made for those locations that have
> > > preemption enabled. If you can use those function (classic case is a
> > > per cpu counter increment in the network subsystem) then you can avoid
> > > preempt disable/enable or get_cpu/put_cpu.
> >
> > If the variables are used for a very short time, then the latencies
> > introduced by a simple:
> >
> > 	var = get_cpu_var(my_var);
> > 	if (var)
> > 		do_something_quick();
> > 	put_cpu_var(my_var);
> >
> > Otherwise if that do_something_quick(); migrates, it may be doing
> > something it shouldn't be doing!
> 
> That is obviously a use case in which this_cpu_xx ops could not be used
> since we must stay on the same cpu. I think there is still a lot of
> confusion on your part. The example here shows it.

No, the problem is that you do not understand what I'm saying.

You are saying that you have other tricks to use per cpu data even
though that task that uses it can migrate to other cpus.

But the issue I have with this, is that this_cpu() is all about tricks.
The normal use case of per_cpu data is to disable preemption, use the
local variable and then enable preemption. That's the normal case and
should be what this_cpu_*() should be the default for.

But instead, you say this_cpu() is for data that doesn't care what CPU
it is at the time. It's really just as snapshot of the data. Which to me
seems to be a special use case that should not be the default. It is
confusing and hard to understand. Basically, you want __this_cpu_*() as
the normal case, which is ass backwards.

If your page allocator has special per_cpu tricks, then change the API
of it to something completely different to what the rest of the kernel
uses. Basic English shows to me that when I use "this_cpu_*()" I'm
getting a variable that is for the current CPU that the task is on, and
I should stay on that CPU. If you want something that is tricky, add
that to the name. Don't make everyone use this confusing "__this_cpu*()"
code which seems to be inconsistent in your own code too.

Have the "snapshot" version. This documents very nicely that the
variable that you read or modify is simply a one shot deal and not to
use it later. Don't use a common term like "this_cpu" and expect
everyone to understand that it should only be used for per cpu tricks.

I personally believe that we shouldn't even have a modification version
of the snapshot code. But perhaps it is fine for inc and dec and
cmpxchg. Put "snapshot" in the name to document that this is a hack to
make things work, and let all other users in the kernel use a name that
is not confusing.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/