lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 12 Mar 2008 18:17:03 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Christoph Lameter <clameter@....com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] Implement slub fastpath with sequence number

* Peter Zijlstra (peterz@...radead.org) wrote:
> On Tue, 2008-03-11 at 05:31 -0400, Mathieu Desnoyers wrote:
> > Here is a new version that works. tested on x86. tweaked the bitmasks
> > into unions to remove operations from the critical path, but I tried to
> > keep that clean. It applies on vm.git HEAD.
> > 
> > It allows the cmpxchg_local to detect object re-use by keeping a counter in the
> > freeoffset MSBs.
> > 
> > Whenever an object is freed in the cpu slab cache, the counter is incremented.
> > Whenever the alloc/free slow paths are modifying the offset or freebase, the
> > sequence counter is also incremented. It is used to make sure we know if
> > freebase has been modified in an interrupt nested over the fast path.
> 
> Is it (remotely) possible that the version will wrap giving the false
> impression that nothing has changed and thus falsely proceed with a
> wrong object?
> 

If we have exactly, on a 32 bits arch, 65536 memory alloc or free in the
slab we are dealing with done in interrupt and softirqs nested over the
cmpxchg_local loop, without ever giving the control back to check the
value, and that after we get the control back the offset within the slab
is exactly the same, then, yes, it's possible. In that case, we would
think it's ok to proceed with the object and we would have memory
corruption (object used twice or free object list corruption).

Given that the alloc fast path takes about 115 cycles on a 3GHz Pentium
4 for an alloc/free pair (65536 * 38.33ns = 2.5ms total), having this
scenario would mean that every other interrupt would not have been
serviced for 2.5ms. In that case, we would probably have other problems
to deal with. On 64 bits architectures, with 2^32 bits, we would have to
wait for about 160 seconds (approx.).

This is why I added a check to verify if the sequence number delta is
bigger than half of the number of bits we have to count it :

+#ifdef CONFIG_DEBUG_VM
+       /*
+        * Just to be paranoid : warn if we detect that enough free or
+        * slow paths nested on top of us to get the counter to go
+        * half-way to overflow. That would be insane to do that much
+        * allocations/free in interrupt handers, but check it anyway.
+        */
+       WARN_ON(result - old > -1UL >> 1);
+#endif

If we ever have a kernel which starts to behave weirdly and *could* be
unlucky and get nearer to overflow, this check would likely detect it.
Worse case I have seen so far on my stressed machine was a delta of 3.

> I would really prefer if we defer all this fast path fiddling until we
> have the cpu_ops in place, this all just makes the code utterly
> unreadable.
> 
> 

Even with cpu_ops in place, I think it would be safer to still disable
preemption in the fastpath. It would make sure a thread is not stopped
in the middle of the cmpxchg loop, a lot of activity happens, and later
on the thread is woken up. In this scenario, the 16 bits might not be
enough to keep track of allocations/frees in the slab.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ