lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 20 Sep 2011 11:16:09 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Christoph Lameter <cl@...two.org>
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Christoph Lameter <cl@...ux.com>
Subject: Re: [RFC][PATCH 2/5] mm: Switch mod_state() to __this_cpu_read()

On Tue, 2011-09-20 at 09:46 -0500, Christoph Lameter wrote:

> The effects on just reading are not that severe. We may calculate the
> address of the per cpu variable on one cpu and then be migrated to another
> processor. Then there is an access to the per cpu area of another
> processor which in many cases is harmless and will just cause the
> cacheline to go from exclusive mode into shared mode on the other cpu. The
> next write by the other cpu then brings it back to exclusive mode evicting
> the cacheline from the cpu we migrated to.

Then what protects another process from touching that same CPU variable?
Looks very error prone to me.

> 
> The same conventions are also used for RMV instructions like this_cpu_inc.
> In that case the situation is much more sever. Now we are locklessly
> incrementing a counter in the per cpu area of another cpu. That is a race
> condition that can corrupt the counter data.

If you read a per cpu variable with simple this_cpu_read() and then make
a decision based on that variable, it may be useless in the process if
you migrate.

> 
> > The point is, most places use this_cpu_read(). Adding a "__" or "raw_"
> > prefix usually means that less checks are done. Think of the
> > copy_from_user(). You have __copy_from_user() that does less checks.
> 
> Why do you need any checks there? this_cpu_xxx operations should take care
> of preemption.

NO IT DOES NOT!!!! Look at the damn memcg bug and figure it out!

> 
> > Most of the per_cpu() and get_cpu_var() code was blindly replaced with
> > the this_cpu_*() variants. The original code had preemption disabled
> > checks. Now they don't. Which means if you use the this_cpu_read() and
> > then make some decision on it, that read may be useless and buggy.
> >
> > We have smp_processor_id() that does the check (and the old per_cpu()
> > and get_cpu_var()). Now we have this_cpu_read() which replaced most of
> > the per_cpu() code in the kernel, making them vulnerable to bugs when
> > preemption is enabled.
> 
> this_cpu_xx can only replace a get_cpu_var/per_cpu section where a single
> word is read or updated. If multiple per cpu variables are accessed and
> modified in one go then it is better to keep disabling preemption for
> the whole section and use the __this_cpu_xx operations.

And that is what __get_cpu_var() is for.

> 
> If there are cases where this was not done correctly then lets fix those.

Lets not let this happen in the first place! There's absolutely no
advantage of not disabling preemption for a period of time while per cpu
variables are being used. If we annotate this properly, then the -rt
kernel could handle this in other ways too, to not keep the latency of
preemption off too long.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ