linux-kernel - Re: [PATCH 2/2] iommu/amd: use handle_mm

Message-ID: <54C4ECBC.5070301@amd.com>
Date:	Sun, 25 Jan 2015 15:16:44 +0200
From:	Oded Gabbay <oded.gabbay@....com>
To:	Jesse Barnes <jbarnes@...tuousgeek.org>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"jroedel@...e.de" <jroedel@...e.de>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	<iommu@...ts.linux-foundation.org>,
	"Bridgman, John" <John.Bridgman@....com>,
	"Elifaz, Dana" <Dana.Elifaz@....com>
Subject: Re: [PATCH 2/2] iommu/amd: use handle_mm_fault directly v2

On 11/13/2014 12:10 AM, Jesse Barnes wrote:
> This could be useful for debug in the future if we want to track
> major/minor faults more closely, and also avoids the put_page trick we
> used with gup.
>
> In order to do this, we also track the task struct in the PASID state
> structure.  This lets us update the appropriate task stats after the
> fault has been handled, and may aid with debug in the future as well.
>
> v2: drop task accounting; GPU activity may have been submitted by a
>      different thread than the one binding the PASID (Joerg)
>
> Tested-by: Oded Gabbay<oded.gabbay@....com>
> Signed-off-by: Jesse Barnes<jbarnes@...tuousgeek.org>

Hi Jesse,

I know I tested your patch a few months ago, but we have a new feature (still
internal) in the driver that conflicts with this patch.

Our feature basically does "exception handling" by registering a callback
function with the iommu driver in inv_ppr_cb.

With the old code (we used 3.17.2 until a few days ago), this callback
function was called in at least three use-cases, all of which we test:

(1) Writing to a "bad" system memory address, which is *not* in the process's 
memory address space.

(2) Writing to a read-only page, which is inside the process's memory address
space.

(3) Reading from a page without permissions, which is inside the process's
memory address space.

With the new code (3.19-rc5), this callback is only called in the first
use-case, while (2) and (3) are handled in handle_mm_fault(), which is now
called from do_fault(). The return value of handle_mm_fault() is 0, so
handle_fault_error() is not called and amdkfd doesn't get a notification.
Hence, our test fails.
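
To make the flow concrete, here is a simplified user-space model of what we
observe. It is only a sketch, not the driver code: every name ending in
_model is made up for illustration, and the assumption that use-case (1) is
caught by a failed VMA lookup before handle_mm_fault() is reached is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's VM_FAULT_ERROR mask. */
#define VM_FAULT_ERROR_MODEL 0x1

/* The three use-cases we test. */
enum test_case {
	CASE_BAD_ADDRESS = 1,	/* (1) address outside the address space */
	CASE_WRITE_READONLY,	/* (2) write to a read-only page */
	CASE_READ_NO_PERM,	/* (3) read from a page without permissions */
};

struct fault_model {
	enum test_case tc;
	bool callback_called;	/* set when the inv_ppr_cb stand-in runs */
};

/* Assumption: only use-case (1) has no mapping at the faulting address. */
static bool vma_found_model(const struct fault_model *f)
{
	return f->tc != CASE_BAD_ADDRESS;
}

/* Models the observed behaviour: 0 is returned in use-cases (2) and (3). */
static int handle_mm_fault_model(const struct fault_model *f)
{
	(void)f;
	return 0;
}

/* Stand-in for handle_fault_error(), the path that notifies amdkfd. */
static void handle_fault_error_model(struct fault_model *f)
{
	f->callback_called = true;
}

void do_fault_model(struct fault_model *f)
{
	assert(f != NULL);
	f->callback_called = false;

	if (!vma_found_model(f)) {
		handle_fault_error_model(f);
		return;
	}

	if (handle_mm_fault_model(f) & VM_FAULT_ERROR_MODEL) {
		handle_fault_error_model(f);
		return;
	}

	/* Return value was 0: treated as success, no callback. */
}
```

In this model, callback_called ends up true only for CASE_BAD_ADDRESS, which
matches what our test sees on 3.19-rc5.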

This is a problem for us, as we want to propagate these exceptions to the
user-space HSA runtime so that it can handle them.

I have two questions:

1. Why don't we call inv_ppr_cb() in every case?
2. How come handle_mm_fault() returns 0 in cases (2) and (3)? In other words,
what is considered a success in handle_mm_fault(), and is it visible to the
user-space process?

Thanks,

	Oded