lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 27 Jun 2022 11:00:56 +0200
From:   Mauro Carvalho Chehab <mauro.chehab@...ux.intel.com> (by way of
        Mauro Carvalho Chehab <mauro.chehab@...ux.intel.com>)
To:     Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
Cc:     Andi Shyti <andi.shyti@...ux.intel.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Chris Wilson <chris.p.wilson@...el.com>,
        Fei Yang <fei.yang@...el.com>,
        Thomas Hellstrom <thomas.hellstrom@...el.com>,
        Bruce Chang <yu.bruce.chang@...el.com>,
        Daniel Vetter <daniel@...ll.ch>,
        Dave Airlie <airlied@...hat.com>,
        David Airlie <airlied@...ux.ie>,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        John Harrison <John.C.Harrison@...el.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Matt Roper <matthew.d.roper@...el.com>,
        Matthew Brost <matthew.brost@...el.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@...el.com>,
        Umesh Nerlige Ramappa <umesh.nerlige.ramappa@...el.com>,
        dri-devel@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org,
        Mika Kuoppala <mika.kuoppala@...ux.intel.com>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        stable@...r.kernel.org,
        Thomas Hellström 
        <thomas.hellstrom@...ux.intel.com>
Subject: Re: [PATCH 5/6] drm/i915/gt: Serialize GRDOM access between
 multiple engine resets

Hi Tvrtko,

On Fri, 24 Jun 2022 09:34:21 +0100
Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com> wrote:

> On 23/06/2022 12:17, Andi Shyti wrote:
> > Hi Mauro,
> > 
> > On Wed, Jun 15, 2022 at 04:27:39PM +0100, Mauro Carvalho Chehab wrote:  
> >> From: Chris Wilson <chris.p.wilson@...el.com>
> >>
> >> Don't allow two engines to be reset in parallel, as they would both
> >> try to select a reset bit (and send requests to common registers)
> >> and wait on that register, at the same time. Serialize control of
> >> the reset requests/acks using the uncore->lock, which will also ensure
> >> that no other GT state changes at the same time as the actual reset.
> >>
> >> Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
> >>
> >> Reported-by: Mika Kuoppala <mika.kuoppala@...ux.intel.com>
> >> Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
> >> Cc: Mika Kuoppala <mika.kuoppala@...ux.intel.com>
> >> Cc: Andi Shyti <andi.shyti@...el.com>
> >> Cc: stable@...r.kernel.org
> >> Acked-by: Thomas Hellström <thomas.hellstrom@...ux.intel.com>
> >> Signed-off-by: Mauro Carvalho Chehab <mchehab@...nel.org>  
> > 
> > Reviewed-by: Andi Shyti <andi.shyti@...ux.intel.com>  
> 
> Notice I had a bunch of questions and asks in this series so please do 
> not merge until those are addressed.
> 
> In this particular patch (and some others) for instance Fixes: tag, at 
> least against that sha, shouldn't be there.

Hmm... I sent an answer to your points, but I can't see it at:

	https://lore.kernel.org/all/160e613f-a0a8-18ff-5d4b-249d4280caa8@linux.intel.com/

Maybe it got lost somewhere, I dunno.

Yeah, indeed the fixes tag on patch 5/6 should be removed as this is not
directly related to changeset 7938d61591d3. Yet, this one is required for
patch 6 to work.

The other patches on this series, though, are modifying the code 
introduced by changeset 7938d61591d3.

Patch 2 is clearly a workaround needed for TLB cache invalidation to
work on some GPUs. So, while not related to Broadwell, they're also
fixing some TLB cache issues. So, IMO, it should keep the fixes.

I tried to port just the two serialize patches to drm-tip, in order
to solve the issues on Broadwell, but it didn't work, as the logic 
inside the spinlock could be calling schedule() with a spinlock hold:
 
	Jun 14 17:38:48 silver kernel: [   23.227813] BUG: sleeping function called from invalid context at drivers/gpu/drm/i915/intel_uncore.c:2496
	Jun 14 17:38:48 silver kernel: [   23.227816] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 37, name: kworker/u8:1
	Jun 14 17:38:48 silver kernel: [   23.227818] preempt_count: 1, expected: 0
	Jun 14 17:38:48 silver kernel: [   23.227819] RCU nest depth: 0, expected: 0
	Jun 14 17:38:48 silver kernel: [   23.227820] 5 locks held by kworker/u8:1/37:
	Jun 14 17:38:48 silver kernel: [   23.227822]  #0: ffff88811159b538 ((wq_completion)i915){+.+.}-{0:0}, at: process_one_work+0x1e0/0x580
	Jun 14 17:38:48 silver kernel: [   23.227831]  #1: ffffc90000183e60 ((work_completion)(&(&i915->mm.free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1e0/0x580
	Jun 14 17:38:48 silver kernel: [   23.227837]  #2: ffff88811b34c5e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: __i915_gem_free_objects+0xba/0x210 [i915]
	Jun 14 17:38:48 silver kernel: [   23.228283]  #3: ffff88810a66c2d8 (&gt->tlb_invalidate_lock){+.+.}-{3:3}, at: intel_gt_invalidate_tlbs+0xe7/0x4d0 [i915]
	Jun 14 17:38:48 silver kernel: [   23.228663]  #4: ffff88810a668f28 (&uncore->lock){-.-.}-{2:2}, at: intel_gt_invalidate_tlbs+0x115/0x4d0 [i915]

I didn't investigate the root cause, but it seems related to PM, so 
patches 1 and 3 seem to be required for the serialization logic
to actually work.

So, I would keep the Fixes: tag mentioning changeset 7938d61591d3
on patches: 1, 2, 3 and 6.

Yet, IMO the entire series should be merged on -stable.

If that's OK for you and there's no additional issues to be
addressed, I'll submit a v2 of this series removing the Fixes tag
from patches 4 and 5.

Regards,
Mauro

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ