lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090615085209.GB30964@wotan.suse.de>
Date:	Mon, 15 Jun 2009 10:52:09 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Pekka Enberg <penberg@...helsinki.fi>
Cc:	Heiko Carstens <heiko.carstens@...ibm.com>,
	torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, cl@...ux-foundation.org,
	kamezawa.hiroyu@...fujitsu.com, lizf@...fujitsu.com, mingo@...e.hu,
	yinghai@...nel.org, benh@...nel.crashing.org
Subject: Re: [GIT PULL v2] Early SLAB fixes for 2.6.31

Hi Pekka,

On Mon, Jun 15, 2009 at 11:32:59AM +0300, Pekka Enberg wrote:
> Hi Nick,
> 
> On Mon, 2009-06-15 at 10:26 +0200, Nick Piggin wrote:
> > > 
> > > Might be that I missed something. Maybe some special flag?
> > 
> > No, just a bug in the conversion.
> > 
> > If you predicate the schedule_work call on slab_state == SYSFS, then
> > it should work (when sysfs comes up later in init, previously added
> > slabs will be registered with sysfs).
> > 
> > Oh, and you'd need to also not pass __SYSFS_ADD_DEFERRED into
> > kmem_cache_create in that case too.
> 
> I am not sure I follow you here. We are setting up slab so early that we
> absolutely _must_ defer sysfs setup. But we're also setting up slab much
> earlier than workqueues, so we shouldn't really do schedule_work() at
> that point.

There's 2 types of deferring going on here. In early boot we
obviously don't touch sysfs at all. All other times, we still
need to defer because we don't know the context it is being
called from.


> Furthermore, early boot cache sysfs setup is explicitly
> handled in slab_sysfs_init() so I think we need something like the patch
> below?

I don't think so because keventd probably comes up before sysfs
still is ready (in this respect the existing pre-early-slab code
is probably still broken but just never done such early DMA
allocations). So it just makes sense to do it all via the slab_state
flag.

Also, if you keep the deferred flag on there for early created
slabs, then the next sysfs_add_work will try to add it again
(after slab_sysfs_init already added it).

Here

---
 mm/slub.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c
+++ linux-2.6/mm/slub.c
@@ -2596,6 +2596,7 @@ static noinline struct kmem_cache *dma_k
 	struct kmem_cache *s;
 	char *text;
 	size_t realsize;
+	unsigned long slabflags;
 
 	s = kmalloc_caches_dma[index];
 	if (s)
@@ -2617,9 +2618,18 @@ static noinline struct kmem_cache *dma_k
 			 (unsigned int)realsize);
 	s = kmalloc(kmem_size, flags & ~SLUB_DMA);
 
+	/*
+	 * Must defer sysfs creation to a workqueue because we don't know
+	 * what context we are called from. Before sysfs comes up, we don't
+	 * need to do anything because slab_sysfs_init will start by
+	 * adding all existing slabs to sysfs.
+	 */
+	slabflags = SLAB_CACHE_DMA;
+	if (slab_state >= SYSFS)
+		slabflags |= __SYSFS_ADD_DEFERRED;
+
 	if (!s || !text || !kmem_cache_open(s, flags, text,
-			realsize, ARCH_KMALLOC_MINALIGN,
-			SLAB_CACHE_DMA|__SYSFS_ADD_DEFERRED, NULL)) {
+			realsize, ARCH_KMALLOC_MINALIGN, slabflags, NULL)) {
 		kfree(s);
 		kfree(text);
 		goto unlock_out;
@@ -2628,7 +2638,8 @@ static noinline struct kmem_cache *dma_k
 	list_add(&s->list, &slab_caches);
 	kmalloc_caches_dma[index] = s;
 
-	schedule_work(&sysfs_add_work);
+	if (slab_state >= SYSFS)
+		schedule_work(&sysfs_add_work);
 
 unlock_out:
 	up_write(&slub_lock);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ