Message-ID: <3bff3562-bb1e-04e6-6eca-8d9bc355f2eb@suse.com>
Date: Fri, 20 May 2022 11:41:01 +0200
From: Jan Beulich <jbeulich@...e.com>
To: Chuck Zmudzinski <brchuckz@...scape.net>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Jani Nikula <jani.nikula@...ux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@...el.com>,
Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
xen-devel@...ts.xenproject.org, x86@...nel.org,
linux-kernel@...r.kernel.org, intel-gfx@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org, Juergen Gross <jgross@...e.com>
Subject: Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode
availability
On 20.05.2022 10:30, Chuck Zmudzinski wrote:
> On 5/20/2022 2:59 AM, Chuck Zmudzinski wrote:
>> On 5/20/2022 2:05 AM, Jan Beulich wrote:
>>> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>>>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>>>
>>>>>> ... these uses there are several more. You say nothing on why those want
>>>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>>>> and came to the conclusion that these all would also better observe the
>>>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>>>> only predicate). In fact, as said in the description of my earlier patch, in
>>>>>> my debugging I did find the use in i915_gem_object_pin_map() to be the
>>>>>> problematic one, which you leave alone.
>>>>> Oh, I missed that one, sorry.
>>>> That is why your patch would not fix my Haswell unless
>>>> it also touches i915_gem_object_pin_map() in
>>>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>>>
>>>>> I wanted to be rather defensive in my changes, but I agree at least the
>>>>> case in arch_phys_wc_add() might want to be changed, too.
>>>> I think your approach needs to be more aggressive so it will fix
>>>> all the known false negatives introduced by bdd8b6c98239
>>>> such as the one in i915_gem_object_pin_map().
>>>>
>>>> I looked at Jan's approach and I think it would fix the issue
>>>> with my Haswell as long as I don't use the nopat option. I
>>>> really don't have a strong opinion on that question, but I
>>>> think the nopat option as a Linux kernel option, as opposed
>>>> to a hypervisor option, should only affect the kernel, and
>>>> if the hypervisor provides the pat feature, then the kernel
>>>> should not override that,
>>> Hmm, why would the kernel not be allowed to override that? Such
>>> an override would affect only the single domain where the
>>> kernel runs; other domains could take their own decisions.
>>>
>>> Also, for the sake of completeness: "nopat" used when running on
>>> bare metal has the same bad effect on system boot, so there
>>> pretty clearly is an error cleanup issue in the i915 driver. But
>>> that's orthogonal, and I expect the maintainers may not even care
>>> (but tell us "don't do that then").
>
> Actually I just did a test with the last official Debian kernel
> build of Linux 5.16, that is, a kernel before bdd8b6c98239 was
> applied. In fact, the nopat option does *not* break the i915 driver
> in 5.16. That is, with the nopat option, the i915 driver loads
> normally on both the bare metal and on the Xen hypervisor.
> That means your presumption (and the presumption of
> the author of bdd8b6c98239) that the "nopat" option was
> being observed by the i915 driver is incorrect. Setting "nopat"
> had no effect on my system with Linux 5.16. So after doing these
> tests, I am against the aggressive approach of breaking the i915
> driver with the "nopat" option because prior to bdd8b6c98239,
> nopat did not break the i915 driver. Why break it now?
Because that, in my understanding, is the purpose of "nopat"
(not breaking the driver of course - that's a driver bug - but
having an effect on the driver).
Jan