Message-ID: <7d79c089-7b21-cf7f-66ea-078d44c5e007@nvidia.com>
Date: Thu, 21 May 2020 13:40:39 -0700
From: John Hubbard <jhubbard@...dia.com>
To: Chris Wilson <chris@...is-wilson.co.uk>,
Andrew Morton <akpm@...ux-foundation.org>
CC: Souptick Joarder <jrdr.linux@...il.com>,
Matthew Wilcox <willy@...radead.org>,
Jani Nikula <jani.nikula@...ux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@...el.com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Tvrtko Ursulin <tvrtko.ursulin@...el.com>,
Matthew Auld <matthew.auld@...el.com>,
<intel-gfx@...ts.freedesktop.org>,
<dri-devel@...ts.freedesktop.org>,
LKML <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>
Subject: Solved: [PATCH 0/4] mm/gup, drm/i915: refactor gup_fast, convert to
pin_user_pages()
On 2020-05-21 12:11, John Hubbard wrote:
> On 2020-05-21 11:57, Chris Wilson wrote:
>> Quoting John Hubbard (2020-05-19 01:21:20)
>>> This needs to go through Andrew's -mm tree, due to adding a new gup.c
>>> routine. However, I would really love to have some testing from the
>>> drm/i915 folks, because I haven't been able to run-time test that part
>>> of it.
>>
>> CI hit
>>
>> <4> [185.667750] WARNING: CPU: 0 PID: 1387 at mm/gup.c:2699
>> internal_get_user_pages_fast+0x63a/0xac0
OK, what happened here is that internal_get_user_pages_fast() is WARN()'ing
because the caller passes in the new FOLL_FAST_ONLY flag, which was not added
to the whitelist of allowed gup_flags.
So the fix is easy, and should be applied to the refactoring patch. I'll
send out a v2 of the series, which will effectively have this applied:
diff --git a/mm/gup.c b/mm/gup.c
index 6cbe98c93466..4f0ca3f849d1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2696,7 +2696,8 @@ static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
 	int nr_pinned = 0, ret = 0;
 
 	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
-				       FOLL_FORCE | FOLL_PIN | FOLL_GET)))
+				       FOLL_FORCE | FOLL_PIN | FOLL_GET |
+				       FOLL_FAST_ONLY)))
 		return -EINVAL;
 
 	start = untagged_addr(start) & PAGE_MASK;
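
To illustrate why the check fires, here is a minimal userspace sketch of the
same mask test. It is not kernel code: the bit values are placeholders chosen
for the example (not the real FOLL_* definitions), and check_gup_flags(),
ALLOWED_BEFORE and ALLOWED_AFTER are hypothetical names. Any flag bit outside
the allowed mask makes the check fail with -EINVAL.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder bit values for illustration only; NOT the kernel's FOLL_* values. */
#define FOLL_WRITE      (1u << 0)
#define FOLL_LONGTERM   (1u << 1)
#define FOLL_FORCE      (1u << 2)
#define FOLL_PIN        (1u << 3)
#define FOLL_GET        (1u << 4)
#define FOLL_FAST_ONLY  (1u << 5)

/* Allowed-flags mask as it was before the fix, and with FOLL_FAST_ONLY added. */
#define ALLOWED_BEFORE  (FOLL_WRITE | FOLL_LONGTERM | FOLL_FORCE | \
			 FOLL_PIN | FOLL_GET)
#define ALLOWED_AFTER   (ALLOWED_BEFORE | FOLL_FAST_ONLY)

/* Hypothetical helper: reject any flag bit that is not in the allowed mask. */
static int check_gup_flags(unsigned int gup_flags, unsigned int allowed)
{
	if (gup_flags & ~allowed)
		return -EINVAL;
	return 0;
}

/* Usage example: the same flags fail against the old mask, pass with the new. */
static void example(void)
{
	unsigned int flags = FOLL_PIN | FOLL_WRITE | FOLL_FAST_ONLY;

	assert(check_gup_flags(flags, ALLOWED_BEFORE) == -EINVAL);
	assert(check_gup_flags(flags, ALLOWED_AFTER) == 0);
}
```

In the kernel the same test is wrapped in WARN_ON_ONCE(), which is why the
rejected flag shows up as a warning in the log and not only as an error return.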
>> <4> [185.667752] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek
>> snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel
>> snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core
>> ghash_clmulni_intel cdc_ether usbnet mii snd_pcm e1000e mei_me ptp pps_core mei
>> intel_lpss_pci prime_numbers
>> <4> [185.667774] CPU: 0 PID: 1387 Comm: gem_userptr_bli Tainted: G U
>> 5.7.0-rc5-CI-Patchwork_17704+ #1
>> <4> [185.667777] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake
>> U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
>> <4> [185.667782] RIP: 0010:internal_get_user_pages_fast+0x63a/0xac0
>> <4> [185.667785] Code: 24 40 08 48 39 5c 24 38 49 89 df 0f 85 74 fc ff ff 48 83 44
>> 24 50 08 48 39 5c 24 58 49 89 dc 0f 85 e0 fb ff ff e9 14 fe ff ff <0f> 0b b8 ea ff
>> ff ff e9 36 fb ff ff 4c 89 e8 48 21 e8 48 39 e8 0f
>> <4> [185.667789] RSP: 0018:ffffc90001133c38 EFLAGS: 00010206
>> <4> [185.667792] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8884999ee800
>> <4> [185.667795] RDX: 00000000000c0001 RSI: 0000000000000100 RDI: 00007f419e774000
>> <4> [185.667798] RBP: ffff888453dbf040 R08: 0000000000000000 R09: 0000000000000001
>> <4> [185.667800] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888453dbf380
>> <4> [185.667803] R13: ffff8884999ee800 R14: ffff888453dbf3e8 R15: 0000000000000040
>> <4> [185.667806] FS: 00007f419e875e40(0000) GS:ffff88849fe00000(0000)
>> knlGS:0000000000000000
>> <4> [185.667808] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> <4> [185.667811] CR2: 00007f419e873000 CR3: 0000000458bd2004 CR4: 0000000000760ef0
>> <4> [185.667814] PKRU: 55555554
>> <4> [185.667816] Call Trace:
>> <4> [185.667912] ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.667918] ? mark_held_locks+0x49/0x70
>> <4> [185.667998] ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.668073] ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>>
>> and then panicked, across a range of systems.
>> -Chris
>>
btw, the panic seems to indicate an additional, pre-existing problem:
i915_gem_userptr_get_pages(), in this case at least, is not able to
recover from a get_user_pages/pin_user_pages failure.
thanks,
--
John Hubbard
NVIDIA