linux-kernel - Re: [PATCH] dma-debug: New interfaces to debug dma mapping errors

Message-ID: <1348082193.2707.64.camel@lorien2>
Date:	Wed, 19 Sep 2012 13:16:33 -0600
From:	Shuah Khan <shuah.khan@...com>
To:	Joerg Roedel <joerg.roedel@....com>
Cc:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Greg KH <greg@...ah.com>, tglx@...utronix.de, mingo@...hat.com,
	hpa@...or.com, rob@...dley.net, akpm@...ux-foundation.org,
	bhelgaas@...gle.com, stern@...land.harvard.edu,
	LKML <linux-kernel@...r.kernel.org>, linux-doc@...r.kernel.org,
	devel@...uxdriverproject.org, x86@...nel.org, shuahkhan@...il.com
Subject: Re: [PATCH] dma-debug: New interfaces to debug dma mapping errors

On Wed, 2012-09-19 at 15:08 +0200, Joerg Roedel wrote:
> On Tue, Sep 18, 2012 at 01:42:49PM -0600, Shuah Khan wrote:
> > Are you ok with the system wide and per device error counts I added? Any
> > comments on the overall approach?
> 
> The general approach of having error counters is fine. But the addresses
> allocated/addresses checked thing should be done per allocation and not
> with counter comparison for several reasons:
> 
> 	1. When doing it per-allocation we know exactly which allocation
> 	   was not checked and can tell the driver developer. The code
> 	   saves stack-traces for that. This is much more useful than
> 	   telling the developer 'somewhere you do not check your
> 	   dma-handles'
Right. It would point directly to the actual mapping instead of a blind
count.

> 
> 	2. Checking this per-allocation gives you the per-device and
> 	   also the per-driver checking you want.

Yes it would.
> 
> 	3. You don't need to change 'struct device' for that.

Right - heard from others as well on this one :)

> 
> There are more reasons, like that this approach fits a lot better to the
> general idea of the DMA-API debugging code.
> 
> > The approach you suggested will cover the cases where drivers fail to
> > check good map cases. We won't be able to catch failed maps that get used
> > without checks. Are you not concerned about these cases? These could
> > cause a silent error with wild writes or could bring the system down. Or
> > are you recommending changing the infrastructure to track failed maps as
> > well?
> 
> It is fine to only check the good-map cases. Think about what
> DMA-debugging is good for: It is a tool for driver developers to find
> bugs in their code they wouldn't notice otherwise. An unchecked bad-map
> case is a bug they would notice otherwise. So if we check only the
> good-map cases and warn the driver developers about non-checked
> addresses they fix it and make the drivers more robust against failed
> allocations, fixing also the bad-map cases.

OK, that makes sense now that I understand the scope of the dma-debug API.
Here is what I will do then: check only the good maps. With that scope,
there is no need for another table.
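
To make the per-allocation idea concrete, here is a rough userspace sketch of
what "check only the good maps" could look like. It is a model for discussion,
not the dma-debug code: every name in it (map_entry, track_map,
note_error_check, untrack_map, MAX_MAPPINGS) is hypothetical, the fixed-size
array stands in for whatever table the debug code already keeps, and the
caller string stands in for the saved stack trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MAX_MAPPINGS 16

struct map_entry {
	uint64_t dma_addr;	/* handle returned by a successful map */
	const char *caller;	/* stands in for the saved stack trace */
	bool in_use;
	bool checked;		/* set once the driver tests the handle */
};

static struct map_entry table[MAX_MAPPINGS];

/* Look up the live entry for a handle, or NULL if it is not tracked. */
static struct map_entry *find_entry(uint64_t dma_addr)
{
	for (size_t i = 0; i < MAX_MAPPINGS; i++) {
		if (table[i].in_use && table[i].dma_addr == dma_addr)
			return &table[i];
	}
	return NULL;
}

/* Record a successful ("good") mapping. Failed maps are never tracked. */
static struct map_entry *track_map(uint64_t dma_addr, const char *caller)
{
	for (size_t i = 0; i < MAX_MAPPINGS; i++) {
		if (!table[i].in_use) {
			table[i] = (struct map_entry){
				.dma_addr = dma_addr,
				.caller = caller,
				.in_use = true,
				.checked = false,
			};
			return &table[i];
		}
	}
	assert(!"mapping table full");
	return NULL;
}

/* Model of the driver running its mapping-error check on a handle. */
static void note_error_check(uint64_t dma_addr)
{
	struct map_entry *e = find_entry(dma_addr);

	if (e)
		e->checked = true;
}

/*
 * Drop a mapping. Returns the recorded caller if the handle was never
 * checked (the case a warning would be issued for), otherwise NULL.
 */
static const char *untrack_map(uint64_t dma_addr)
{
	struct map_entry *e = find_entry(dma_addr);
	const char *offender;

	if (!e)
		return NULL;
	offender = e->checked ? NULL : e->caller;
	e->in_use = false;
	return offender;
}
```

Since the flag lives in the same entry as the mapping, the report can name the
exact mapping that was never checked, and nothing has to be added to
'struct device'.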

> 
> > I am still pursuing a way to track failed map cases. I combined the flag
> > idea with one of the ideas I am looking into. Details below: (if this
> > sounds like a reasonable approach, I can do v2 patch and we can discuss
> > the code)
> 
> Why do you want to track the bad-map cases?

I am still concerned about data-corruption issues that will be hard to
debug, and I was hoping an error count might serve as an indicator. However,
I agree with what you said: without the actual mapping association, a count
is not very useful.

-- Shuah
