Message-ID: <20120919130859.GR2505@amd.com>
Date: Wed, 19 Sep 2012 15:08:59 +0200
From: Joerg Roedel <joerg.roedel@....com>
To: Shuah Khan <shuah.khan@...com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Greg KH <greg@...ah.com>, <tglx@...utronix.de>,
<mingo@...hat.com>, <hpa@...or.com>, <rob@...dley.net>,
<akpm@...ux-foundation.org>, <bhelgaas@...gle.com>,
<stern@...land.harvard.edu>, LKML <linux-kernel@...r.kernel.org>,
<linux-doc@...r.kernel.org>, <devel@...uxdriverproject.org>,
<x86@...nel.org>, <shuahkhan@...il.com>
Subject: Re: [PATCH] dma-debug: New interfaces to debug dma mapping errors
On Tue, Sep 18, 2012 at 01:42:49PM -0600, Shuah Khan wrote:
> Are you ok with the system wide and per device error counts I added? Any
> comments on the overall approach?
The general approach of having error counters is fine. But tracking
which allocated addresses were checked should be done per allocation,
not by comparing counters, for several reasons:
	1. When doing it per-allocation we know exactly which allocation
	   was not checked and can tell the driver developer. The code
	   saves stack-traces for that. This is much more useful than
	   telling the developer 'somewhere you do not check your
	   dma-handles'.
	2. Checking this per-allocation gives you the per-device and
	   also the per-driver checking you want.
	3. You don't need to change 'struct device' for that.
There are more reasons; for one, this approach fits the general idea of
the DMA-API debugging code a lot better.
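As a rough illustration of the per-allocation idea, here is a minimal
userspace sketch. It is not the kernel's dma-debug code: the struct, the
table and the dbg_record_* helpers are hypothetical names made up for this
example, and the 'origin' string only stands in for the stack trace the
real code saves. Each successful mapping gets its own record with a
'checked' flag, and the warning fires at unmap time for exactly the
mapping whose handle was never tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES 64

/* Hypothetical per-mapping debug record (not the kernel's real struct). */
struct map_entry {
	bool in_use;
	bool checked;		/* set once the driver tested this handle */
	uint64_t handle;	/* the dma handle returned by the mapping */
	const char *dev_name;	/* gives per-device reporting for free */
	const char *origin;	/* stands in for a saved stack trace */
};

static struct map_entry entries[MAX_ENTRIES];
static unsigned int unchecked_warnings;

static struct map_entry *find_entry(uint64_t handle)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++)
		if (entries[i].in_use && entries[i].handle == handle)
			return &entries[i];
	return NULL;
}

/* Record a successful mapping. Returns false if the table is full. */
static bool dbg_record_map(const char *dev_name, uint64_t handle,
			   const char *origin)
{
	assert(dev_name != NULL && origin != NULL);

	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		if (!entries[i].in_use) {
			entries[i] = (struct map_entry){
				.in_use = true,
				.checked = false,
				.handle = handle,
				.dev_name = dev_name,
				.origin = origin,
			};
			return true;
		}
	}
	return false;
}

/* Called when the driver checks the handle for a mapping error. */
static void dbg_record_check(uint64_t handle)
{
	struct map_entry *e = find_entry(handle);

	if (e)
		e->checked = true;
}

/*
 * Called at unmap time. Warns if the handle was never checked and
 * returns true in that case; the record is released either way.
 */
static bool dbg_record_unmap(uint64_t handle)
{
	struct map_entry *e = find_entry(handle);
	bool warn;

	if (!e)
		return false;

	warn = !e->checked;
	if (warn) {
		unchecked_warnings++;
		fprintf(stderr,
			"%s: handle 0x%llx mapped at %s was never checked\n",
			e->dev_name, (unsigned long long)e->handle,
			e->origin);
	}
	e->in_use = false;
	return warn;
}
```

Because the flag lives in the mapping record, the report can name the
device and the place where the mapping was created, and nothing has to
be added to 'struct device'.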
> The approach you suggested will cover the cases where drivers fail to
> check good map cases. We won't be able to catch failed maps that get used
> without checks. Are you not concerned about these cases? These could
> cause a silent error with wild writes or could bring the system down. Or
> are you recommending changing the infrastructure to track failed maps as
> well?
It is fine to only check the good-map cases. Think about what
DMA-debugging is good for: It is a tool for driver developers to find
bugs in their code they wouldn't notice otherwise. An unchecked bad-map
case is a bug they would notice otherwise. So if we check only the
good-map cases and warn driver developers about unchecked addresses,
they will fix those and make the drivers more robust against failed
allocations, which fixes the bad-map cases as well.
> I am still pursuing a way to track failed map cases. I combined the flag
> idea with one of the ideas I am looking into. Details below: (if this
> sounds like a reasonable approach, I can do v2 patch and we can discuss
> the code)
Why do you want to track the bad-map cases?
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632