Date:	Mon, 25 Jun 2007 01:25:26 +0100 (BST)
From:	Hugh Dickins <hugh@...itas.com>
To:	Russell King <rmk+lkml@....linux.org.uk>
cc:	Christoph Lameter <clameter@....com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Nicolas Ferre <nicolas.ferre@....atmel.com>,
	ARM Linux Mailing List 
	<linux-arm-kernel@...ts.arm.linux.org.uk>,
	Linux Kernel list <linux-kernel@...r.kernel.org>,
	Marc Pignat <marc.pignat@...s.ch>,
	Andrew Victor <andrew@...people.com>,
	Pierre Ossman <drzeus@...eus.cx>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Oops in a driver while using SLUB as a SLAB allocator

On Sun, 24 Jun 2007, Russell King wrote:
> On Sun, Jun 24, 2007 at 11:24:16AM +0100, Hugh Dickins wrote:
> > On Sun, 24 Jun 2007, Russell King wrote:
> > > On Fri, Jun 22, 2007 at 07:39:33PM +0100, Hugh Dickins wrote:
> > > 
> > > Please forward the original problem report.
> > 
> > Done.
> 
> Okay, that seems to back up my suspicions - it's definitely AT91-based.
> Since AT91-based machines do not have a DMA coherent cache,
> arch_is_coherent() must be defined to '0'.  The only way that kmalloc
> could be reached is if that were defined to something other than '0',
> and if that's done on a machine with DMA incoherent caches, that will
> lead to data corruption.

Yes, having looked through that now, I agree with you 100%.

> 
> I think we need to wait for Nicolas to respond on this issue before
> running headlong into applying a sticky plaster for something which is
> actually a deeper issue.

No need for Nicolas to respond, I think I've found what's "wrong":
see below.

> 
> However, the arch_is_coherent() path _is_ buggy as it stands, but in
> more than the way identified thus far.  Eg, it doesn't set __GFP_DMA
> appropriately for various DMA masks, so it might return DMA inaccessible
> memory.

I expect you're right, but that's a separate issue.  I had thought
you were approving Christoph's ARM patch because both you and he seemed
to agree that kmalloc was inappropriate for use in dma_alloc_coherent,
whatever additional issues you saw with it.

I still don't see why kmalloc is wrong there myself: for a while
I bought Christoph's alignment argument, but now I don't see why
(more than long) alignment is important to it.  But I'm easily
wrong when it comes to DMA mapping issues.

> 
> If we're after a simple fix for 2.6.22, the _easiest_ solution would be
> to delete the entire arch_is_coherent() branches in arch/arm/mm/consistent.c;
> that will result in a working solution for everyone, albeit at a slightly
> lower performance for the DMA-coherent CPUs.

The fix for 2.6.22 is my PageSlab test in page_mapping which Linus
already put into -git.

And I now rather think that needs to stay, not be replaced by the
VM_BUG_ON Christoph was proposing for 2.6.23 (which earlier I acked).

Christoph responded to my page_mapping patch by looking at arch/arm,
and there finding a kmalloc in dma_alloc_coherent which he didn't
like; but you're right, it's entirely irrelevant to Nicolas' oops.

The slub allocation which gives rise to Nicolas' oops is not in
ARM, but (I'm guessing) in drivers/mmc/core/sd.c: one of those
	status = kmalloc(64, GFP_KERNEL);
where status is passed down for the response from mmc_sd_switch.

And what is wrong with using kmalloc there?
Why should that be changed to allocate a whole page?
How many other such cases might there be?

And the flush_dcache_page in at91mci_post_dma_read looks correct
to me too: it has just filled and perhaps also swabbed a buffer,
that buffer might in some cases be mapped into userspace, so it
calls flush_dcache_page.

In the kmalloc case it's not mapped into userspace: flush_dcache_page
should detect that and do nothing, as it does with slab; but slub was
reusing page->mapping for something else, so we oopsed.

Hugh
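
Editorial sketch (not part of the original message): the PageSlab test
in page_mapping described above can be illustrated with a small
userspace model. Everything here is an assumption made for
illustration: the struct layouts, the PG_SLAB_FLAG bit, and the
*_sketch function names are hypothetical stand-ins, not the kernel's
real definitions. The idea is only that a slab page reports no mapping,
so a caller such as flush_dcache_page never dereferences a field that
slub has reused for its own data.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's address_space. */
struct address_space {
	int user_mappings;	/* illustrative: number of userspace mappings */
};

#define PG_SLAB_FLAG 0x1UL	/* illustrative flag bit, not the real PG_slab */

/* Hypothetical, simplified stand-in for struct page. */
struct page {
	unsigned long flags;
	void *mapping;	/* address_space for page cache pages; in this
			 * model a slab allocator may store its own data
			 * here instead */
};

int page_is_slab(const struct page *page)
{
	return (page->flags & PG_SLAB_FLAG) != 0;
}

/* Report no mapping for slab pages, whatever page->mapping contains. */
struct address_space *page_mapping_sketch(const struct page *page)
{
	if (page_is_slab(page))
		return NULL;
	return (struct address_space *)page->mapping;
}

/*
 * Model of a flush_dcache_page-style caller: returns 1 if it found
 * userspace mappings to care about, 0 if there was nothing to do.
 * It only dereferences the mapping when page_mapping_sketch()
 * returned a non-NULL pointer.
 */
int flush_dcache_page_sketch(const struct page *page)
{
	struct address_space *mapping = page_mapping_sketch(page);

	if (mapping == NULL)
		return 0;
	return mapping->user_mappings > 0 ? 1 : 0;
}
```

In this model a kmalloc'ed buffer (a slab page) passes through the
flush path as a no-op, while a page cache page with userspace mappings
is still handled; the real ARM flush_dcache_page does more than this.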