Date:   Tue, 18 Oct 2022 13:32:35 +0300
From:   Ville Syrjälä <ville.syrjala@...ux.intel.com>
To:     Hans de Goede <hdegoede@...hat.com>
Cc:     Jani Nikula <jani.nikula@...ux.intel.com>,
        Thorsten Leemhuis <regressions@...mhuis.info>,
        intel-gfx <intel-gfx@...ts.freedesktop.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-gfx] alderlake crashes (random memory corruption?) with
 6.0 i915 / ucode related

On Mon, Oct 17, 2022 at 04:32:28PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 10/17/22 15:35, Jani Nikula wrote:
> > On Mon, 17 Oct 2022, Hans de Goede <hdegoede@...hat.com> wrote:
> >> Hi,
> >>
> >> On 10/17/22 13:19, Thorsten Leemhuis wrote:
> >>> CCing the regression mailing list, as it should be in the loop for all
> >>> regressions, as explained here:
> >>> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
> >>
> >> Yes sorry about that I meant to Cc the regressions list, not you personally,
> >> but the auto-completion picked the wrong address-book entry
> >> (and I did not notice this).
> >>
> >>> On 17.10.22 12:48, Hans de Goede wrote:
> >>>> On 10/17/22 10:39, Jani Nikula wrote:
> >>>>> On Mon, 17 Oct 2022, Jani Nikula <jani.nikula@...ux.intel.com> wrote:
> >>>>>> On Thu, 13 Oct 2022, Hans de Goede <hdegoede@...hat.com> wrote:
> >>>>>>> With 6.0 the following WARN triggers:
> >>>>>>> drivers/gpu/drm/i915/display/intel_bios.c:477:
> >>>>>>>
> >>>>>>>         drm_WARN(&i915->drm, min_size == 0,
> >>>>>>>                  "Block %d min_size is zero\n", section_id);
> >>>>>>
> >>>>>> What's the value of section_id that gets printed?
> >>>>>
> >>>>> I'm guessing this is [1] fixed by commit d3a7051841f0 ("drm/i915/bios:
> >>>>> Use hardcoded fp_timing size for generating LFP data pointers") in
> >>>>> v6.1-rc1.
> >>>>>
> >>>>> I don't think this is the root cause for your issues, but I wonder if
> >>>>> you could try v6.1-rc1 or drm-tip and see if we've fixed the other stuff
> >>>>> already too?
> >>>>
> >>>> 6.1-rc1 indeed does not trigger the drm_WARN and for now (couple of
> >>>> reboots, running for 5 minutes now) it seems stable. 6.0.0 usually
> >>>> crashed during boot (but not always).
> >>>>
> >>>> Do you think it would be worthwhile to try 6.0.0 with d3a7051841f0 ?
> >>
> >> So I have been trying 6.0.0 with d3a7051841f0 doing a whole bunch of
> >> reboots + general use and that seems stable, then I reverted it and
> >> the very first boot of the kernel with that broke again, so I'm
> >> pretty sure that d3a7051841f0 fixes things.
> >>
> >> So d3a7051841f0 seems to do more than just fix the WARN().
> > 
> > Wow, so I guess we do screw up the parsing royally then. :o
> 
> I'm running the kernel with lockdep + list-debugging enabled and
> I could not reproduce this (not easily at least) on a standard
> Fedora 6.0.0 build without that. So maybe the parsing just manages
> to write out of bounds a tiny bit which just happens to hit a list_head
> somewhere ... ?

We don't parse any of the LFP data stuff if we didn't manage
to generate the data ptrs. So can't really see how that would
happen. Another theory might be that something else gets
screwed up if we fail to parse anything, but can't really
think how that would lead to list corruption either.

> 
> Either way things look stable with d3a7051841f0 and it turns out
> that Fedora already had that cherry-picked downstream in the
> 5.19.13 kernel which was stable for me too.
> 
> >> So let's try to get d3a7051841f0 added to the official stable series
> >> ASAP (I just noticed that Mark Pearson from Lenovo has already added it
> >> to Fedora's 6.0.2 build).
> > 
> > I think I'd also pick d3a7051841f0^ i.e. both commits:
> > 
> > d3a7051841f0 ("drm/i915/bios: Use hardcoded fp_timing size for generating LFP data pointers")
> > 4e78d6023c15 ("drm/i915/bios: Validate fp_timing terminator presence")
> > 
> > for stable.

Ack from me.

> 
> That sounds good, can you take care of submitting these to gkh ?
> 
> Regards,
> 
> Hans

-- 
Ville Syrjälä
Intel
