Message-Id: <20090605074246R.fujita.tomonori@lab.ntt.co.jp>
Date:	Fri, 5 Jun 2009 07:43:01 +0900
From:	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>
To:	just.for.lkml@...glemail.com
Cc:	jens.axboe@...cle.com, fujita.tomonori@....ntt.co.jp,
	bharrosh@...asas.com, hancockrwd@...il.com,
	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org
Subject: Re: sata_sil24 0000:04:00.0: DMA-API: device driver frees DMA sg list with different entry count [map count=13] [unmap count=10]

On Thu, 4 Jun 2009 20:07:36 +0200
Torsten Kaiser <just.for.lkml@...glemail.com> wrote:

> On Thu, Jun 4, 2009 at 9:53 AM, Jens Axboe <jens.axboe@...cle.com> wrote:
> > On Thu, Jun 04 2009, FUJITA Tomonori wrote:
> >> On Thu, 04 Jun 2009 10:15:14 +0300
> >> Boaz Harrosh <bharrosh@...asas.com> wrote:
> >>
> >> > On 06/04/2009 09:33 AM, FUJITA Tomonori wrote:
> >> > > On Thu, 4 Jun 2009 08:12:34 +0200
> >> > > Torsten Kaiser <just.for.lkml@...glemail.com> wrote:
> >> > >
> >> > >> On Thu, Jun 4, 2009 at 2:02 AM, FUJITA Tomonori
> >> > >> <fujita.tomonori@....ntt.co.jp> wrote:
> >> > >>> On Wed, 3 Jun 2009 21:30:32 +0200
> >> > >>> Torsten Kaiser <just.for.lkml@...glemail.com> wrote:
> >> > >>>> Still happens with 2.6.30-rc8 (see trace at the end of the email)
> >> > >>>>
> >> > >>>> As orig_n_elem is only used two times in libata-core.c I suspected a
> >> > >>>> corruption of the qc->sg, but adding checks for this did not trigger.
> >> > >>>> So I looked into lib/dma-debug.c.
> >> > >>>> It seems add_dma_entry() does not protect against adding the same
> >> > >>>> entry twice.
> >> > >>> Do you mean that add_dma_entry() doesn't protect against adding a new
> >> > >>> entry identical to the existing entry, right?
> >> > >> Yes, as I read the hash bucket code in lib/dma-debug.c a second entry
> >> > >> from the same device and the same address will just be added to the
> >> > >> list and on unmap it will always return the first entry.
> >> > >
> >> > > It means that two different DMA operations will be performed against
> >> > > the same dma address on the same device at the same time. It doesn't
> >> > > happen unless there is a bug in a driver, an IOMMU or somewhere, as I
> >> > > wrote in the previous mail.
> >> > >
> >> >
> >> > What about the draining buffers used by libata. Are they not the same buffer
> >> > for all devices for all requests?
> >>
> >> I'm not sure if the drain buffer is used like that. But are there
> >> easier ways to see the same buffer; e.g. sending the same buffer twice
> >> with DIO?
> >
> > I'm pretty sure we discussed this some months ago, the intel iommu
> > driver had a similar bug iirc. Let's say you want to write the same 4kb
> > block to two spots on the disk. You prepare and submit that with
> > O_DIRECT and using aio. On a device with NCQ, that could easily map the
> > same page twice. Or, perhaps more likely, doing 512b writes and not
> > getting all of them merged.
> 
> I have an even better theory: RAID1
> There are two disks on this sil24 controller that are used as a RAID1
> to form my root partition.
> 
> That also fits the pattern of the very large number of duplicate dma
> mappings (as each data block needs to be written twice), but that the
> DMA-API debug check only triggers during heavier load: Most of the
> time both drives are in sync and so the write requests should be
> identical, so it does not matter which entry gets returned from the
> hash bucket.
> But when I run 'updatedb' to trigger this error the read requests
> disturb the pattern and the write requests also become asymmetric.
> 
> >> As I wrote, I assume that he uses GART IOMMU;
> 
> [    0.010000] Checking aperture...
> [    0.010000] No AGP bridge found
> [    0.010000] Node 0: aperture @ a7f0000000 size 32 MB
> [    0.010000] Aperture beyond 4GB. Ignoring.
> [    0.010000] Your BIOS doesn't leave a aperture memory hole
> [    0.010000] Please enable the IOMMU option in the BIOS setup
> (sadly my BIOS does not have such an option...)
> [    0.010000] This costs you 64 MB of RAM
> [    0.010000] Mapping aperture over 65536 KB of RAM @ 20000000
> [    0.010000] Memory: 4057512k/4718592k available (4674k kernel code, 524868k absent, 136212k reserved, 2520k data, 1172k init)
> [snip]
> [    1.304386] DMA-API: preallocated 32768 debug entries
> [    1.309439] DMA-API: debugging enabled by kernel config
> [    1.310123] PCI-DMA: Disabling AGP.
> [    1.313711] PCI-DMA: aperture base @ 20000000 size 65536 KB
> [    1.320002] PCI-DMA: using GART IOMMU.
> [    1.323763] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
> [    1.330640] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31
> [    1.340007] hpet0: 3 comparators, 32-bit 25.000000 MHz counter

You use GART IOMMU. So I thought that you shouldn't hit this problem
because an IOMMU gives a unique dma address per dma mapping... but I
forgot one really important thing about GART: it's not real IOMMU
hardware. It does address remapping only when necessary (when an
address cannot be accessed by a device). So it's possible that you see
multiple DMA transfers performed against the same dma address on one
device at the same time.


> >> it allocates a unique
> >> dma address per dma mapping operation.
> >>
> >> However, dma-debug is broken wrt this, I guess.
> >
> > Seems so.
> 
> Yes, as the md code for RAID1 has a very good cause to send the same
> memory page twice to this device.

Yeah, now it's clear to me why you hit this bug.

I'm not sure there is any simple way to fix dma-debug wrt this. I
think that it's better to just disable it since 2.6.30 will be
released shortly.
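
To make the failure mode discussed in this thread concrete, here is a
minimal userspace sketch. It is not the code in lib/dma-debug.c; the
struct, the fixed-size array standing in for a hash bucket, and all
sketch_* names are hypothetical. The only assumption taken from the
thread is that entries are looked up by (device, dma address) alone and
that the first match is returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one dma-debug entry. */
struct sketch_entry {
	int      dev_id;    /* stands in for the device pointer */
	uint64_t dma_addr;  /* address handed to the device */
	int      sg_ents;   /* sg entry count recorded at map time */
	int      in_use;
};

#define SKETCH_MAX 16
static struct sketch_entry sketch_bucket[SKETCH_MAX];

/* Record a mapping. Nothing prevents a second entry with the same
 * (dev_id, dma_addr) from being added. */
static void sketch_add_entry(int dev_id, uint64_t dma_addr, int sg_ents)
{
	for (size_t i = 0; i < SKETCH_MAX; i++) {
		if (!sketch_bucket[i].in_use) {
			sketch_bucket[i] = (struct sketch_entry){
				dev_id, dma_addr, sg_ents, 1
			};
			return;
		}
	}
	assert(!"sketch bucket full");
}

/* Find and remove the FIRST entry matching (dev_id, dma_addr).
 * Returns the sg count recorded at map time, or -1 if none. */
static int sketch_take_first_match(int dev_id, uint64_t dma_addr)
{
	for (size_t i = 0; i < SKETCH_MAX; i++) {
		struct sketch_entry *e = &sketch_bucket[i];

		if (e->in_use && e->dev_id == dev_id &&
		    e->dma_addr == dma_addr) {
			e->in_use = 0;
			return e->sg_ents;
		}
	}
	return -1;
}

/* Returns 1 if the check would complain about a count mismatch. */
static int sketch_check_unmap(int dev_id, uint64_t dma_addr, int unmap_ents)
{
	int map_ents = sketch_take_first_match(dev_id, dma_addr);

	return map_ents >= 0 && map_ents != unmap_ents;
}
```

If the same address is mapped twice on one device, once with 13 entries
and once with 10, and the request with 10 entries completes first, the
lookup returns the entry with 13 and the check complains even though
the driver did nothing wrong. When both requests carry the same count,
it does not matter which entry is returned, which would fit the
observation that the warning only shows up under heavier load.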
