lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 27 May 2022 11:09:29 +0200
From:   Mauro Carvalho Chehab <mchehab@...nel.org>
To:     unlisted-recipients:; (no To-header on input)
Cc:     Mauro Carvalho Chehab <mchehab@...nel.org>,
        Andi Shyti <andi.shyti@...ux.intel.com>,
        Daniel Vetter <daniel@...ll.ch>,
        Daniele Ceraolo Spurio <daniele.ceraolospurio@...el.com>,
        David Airlie <airlied@...ux.ie>,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        John Harrison <John.C.Harrison@...el.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Lucas De Marchi <lucas.demarchi@...el.com>,
        Matt Roper <matthew.d.roper@...el.com>,
        Matthew Auld <matthew.auld@...el.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        dri-devel@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org, mauro.chehab@...ux.intel.com,
        Tvrtko Ursulin <tvrtko.ursulin@...el.com>,
        Sushma Venkatesh Reddy <sushma.venkatesh.reddy@...el.com>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Dave Airlie <airlied@...hat.com>,
        Jon Bloomfield <jon.bloomfield@...el.com>,
        Jani Nikula <jani.nikula@...el.com>, stable@...r.kernel.org
Subject: [PATCH] drm/i915: don't flush TLB on GEN8

i915 selftest hangcheck is causing the i915 driver timeouts, as
reported by Intel CI:

	http://gfx-ci.fi.intel.com/cibuglog-ng/issuefilterassoc/24297?query_key=42a999f48fa6ecce068bc8126c069be7c31153b4

When such test runs, the only output is:

	[   68.811639] i915: Performing live selftests with st_random_seed=0xe138eac7 st_timeout=500
	[   68.811792] i915: Running hangcheck
	[   68.811859] i915: Running intel_hangcheck_live_selftests/igt_hang_sanitycheck
	[   68.816910] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
	[   68.841597] i915: Running intel_hangcheck_live_selftests/igt_reset_nop
	[   69.346347] igt_reset_nop: 80 resets
	[   69.362695] i915: Running intel_hangcheck_live_selftests/igt_reset_nop_engine
	[   69.863559] igt_reset_nop_engine(rcs0): 709 resets
	[   70.364924] igt_reset_nop_engine(bcs0): 903 resets
	[   70.866005] igt_reset_nop_engine(vcs0): 659 resets
	[   71.367934] igt_reset_nop_engine(vcs1): 549 resets
	[   71.869259] igt_reset_nop_engine(vecs0): 553 resets
	[   71.882592] i915: Running intel_hangcheck_live_selftests/igt_reset_idle_engine
	[   72.383554] rcs0: Completed 16605 idle resets
	[   72.884599] bcs0: Completed 18641 idle resets
	[   73.385592] vcs0: Completed 17517 idle resets
	[   73.886658] vcs1: Completed 15474 idle resets
	[   74.387600] vecs0: Completed 17983 idle resets
	[   74.387667] i915: Running intel_hangcheck_live_selftests/igt_reset_active_engine
	[   74.889017] rcs0: Completed 747 active resets
	[   75.174240] intel_engine_reset(bcs0) failed, err:-110
	[   75.174301] bcs0: Completed 525 active resets

After that, the machine just silently hangs.

The root cause is that the flush TLB logic is not working as
expected on GEN8.

Tested on an Intel NUC5i7RYB with an i7-5557U Broadwell CPU.

This patch partially reverts the logic by skipping GEN8 from
the TLB cache flush.

Cc: Tvrtko Ursulin <tvrtko.ursulin@...el.com>
Cc: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@...el.com>
Cc: Daniel Vetter <daniel.vetter@...ll.ch>
Cc: Dave Airlie <airlied@...hat.com>
Cc: Jon Bloomfield <jon.bloomfield@...el.com>
Cc: Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>
Cc: Jani Nikula <jani.nikula@...el.com>
Cc: stable@...r.kernel.org # Kernel 5.17 and upper

Fixes: 494c2c9b630e ("drm/i915: Flush TLBs before releasing backing store")
Signed-off-by: Mauro Carvalho Chehab <mchehab@...nel.org>
---

Patch resent, as the first version was using an old email. That's what happens
when writing patches on old test machines ;-)

 drivers/gpu/drm/i915/gt/intel_gt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 034182f85501..7965a77e5046 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1191,10 +1191,10 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
 	if (GRAPHICS_VER(i915) == 12) {
 		regs = gen12_regs;
 		num = ARRAY_SIZE(gen12_regs);
-	} else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) {
+	} else if (GRAPHICS_VER(i915) > 8 && GRAPHICS_VER(i915) <= 11) {
 		regs = gen8_regs;
 		num = ARRAY_SIZE(gen8_regs);
-	} else if (GRAPHICS_VER(i915) < 8) {
+	} else if (GRAPHICS_VER(i915) <= 8) {
 		return;
 	}
 
-- 
2.36.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ