lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c93c7153-f062-df46-5ff4-416b14ff1e92@netscape.net>
Date:   Tue, 19 Jul 2022 09:16:32 -0400
From:   Chuck Zmudzinski <brchuckz@...scape.net>
To:     Thorsten Leemhuis <regressions@...mhuis.info>,
        Juergen Gross <jgross@...e.com>,
        xen-devel@...ts.xenproject.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        Borislav Petkov <bp@...en8.de>
Cc:     jbeulich@...e.com, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        "# 5 . 17" <stable@...r.kernel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Pavel Machek <pavel@....cz>, Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>
Subject: Re: [PATCH 0/3] x86: make pat and mtrr independent from each other

On 7/18/2022 7:32 AM, Chuck Zmudzinski wrote:
> On 7/17/2022 3:55 AM, Thorsten Leemhuis wrote:
> > Hi Juergen!
> >
> > On 15.07.22 16:25, Juergen Gross wrote:
> > > Today PAT can't be used without MTRR being available, unless MTRR is at
> > > least configured via CONFIG_MTRR and the system is running as Xen PV
> > > guest. In this case PAT is automatically available via the hypervisor,
> > > but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> > > 
> > > As an additional complexity the availability of PAT can't be queried
> > > via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> > > to be disabled. This leads to some drivers believing that not all cache
> > > modes are available, resulting in failures or degraded functionality.
> > > 
> > > The same applies to a kernel built with no MTRR support: it won't
> > > allow to use the PAT MSR, even if there is no technical reason for
> > > that, other than setting up PAT on all cpus the same way (which is a
> > > requirement of the processor's cache management) is relying on some
> > > MTRR specific code.
> > > 
> > > Fix all of that by:
> > > 
> > > - moving the function needed by PAT from MTRR specific code one level
> > >   up
> > > - adding a PAT indirection layer supporting the 3 cases "no or disabled
> > >   PAT", "PAT under kernel control", and "PAT under Xen control"
> > > - removing the dependency of PAT on MTRR
> >
> > Thx for working on this. If you need to respin these patches for one
> > reason or another, could you do me a favor and add proper 'Link:' tags
> > pointing to all reports about this issue? e.g. like this:
> >
> >  Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
> >
> > These tags are considered important by Linus[1] and others, as they
> > allow anyone to look into the backstory weeks or years from now. That is
> > why they should be placed in cases like this, as
> > Documentation/process/submitting-patches.rst and
> > Documentation/process/5.Posting.rst explain in more detail. I care
> > personally, because these tags make my regression tracking efforts a
> > whole lot easier, as they allow my tracking bot 'regzbot' to
> > automatically connect reports with patches posted or committed to fix
> > tracked regressions.
> >
> > [1] see for example:
> > https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/
> > https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/
> > https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com/
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >
>
> I echo Thorsten's thx for starting on this now instead of waiting until
> September which I think is when Juergen said he could start working
> on this last week. I agree with Thorsten that Link tags are needed.
> Since multiple patches have been proposed to fix this regression,
> perhaps a Link to each proposed patch, and a note that
> the original report identified a specific commit which when reverted
> also fixes it. IMO, this is all part of the backstory Thorsten refers to.
>
> It looks like with this approach, a fix will not be coming real soon,
> and Borislav Petkov also discouraged me from testing this
> patch set until I receive a ping telling me it is ready for testing,
> which seems to confirm that this regression will not be fixed
> very soon. Please correct me if I am wrong about how long
> it will take to fix it with this approach.
>
> Also, is there any guarantee this approach is endorsed by
> all the maintainers who will need to sign-off, especially
> Linus? I say this because some of the discussion on the
> earlier proposed patches makes me doubt this. I am especially
> referring to this discussion:
>
> https://lore.kernel.org/lkml/4c8c9d4c-1c6b-8e9f-fa47-918a64898a28@leemhuis.info/
>
> and also, here:
>
> https://lore.kernel.org/lkml/YsRjX%2FU1XN8rq+8u@zn.tnic/
>
> where Borislav Petkov argues that Linux should not be
> patched at all to fix this regression but instead the fix
> should come by patching the Xen hypervisor.
>
> So I have several questions, presuming at least the fix is going
> to be delayed for some time, and also presuming this approach
> is not yet an approach that has the blessing of the maintainers
> who will need to sign-off:
>
> 1. Can you estimate when the patch series will be ready for
> testing and suitable for a prepatch or RC release?
>
> 2. Can you estimate when the patch series will be ready to be
> merged into the mainline release? Is there any hope it will be
> fixed before the next longterm release hosted on kernel.org?
>
> 3. Since a fix is likely not coming soon, can you explain
> why the commit that was mentioned in the original
> report cannot be reverted as a temporary solution while
> we wait for the full fix to come later? I can say that
> reverting that commit (It was a commit affecting
> drm/i915) does fix the issue on my system with no
> negative side effects at all. In such a case, it seems
> contrary to Linus' regression rule to not revert the
> offending commit, even if reverting the offending
> commit is not going to be the final solution. IOW,
> I am trying to argue that an important corollary to
> the Linus regression rule is that we revert commits
> that introduce regressions, especially when there
> are no negative effects when reverting the offending
> commit. Why are we not doing that in this case?
>
> 4. Can you explain why this patch series is superior
> to the other proposed patches that are much more
> simple and have been reported to fix the regression?
>
> 5. This approach seems way too aggressive for backporting
> to the stable releases. Is that correct? Or, will the patches
> be backported to the stable releases? I was told that
> backports to the stable releases are needed to keep things
> consistent across all the supported versions when I submitted
> a patch to fix this regression that identified a specific five year
> old commit that my proposed patch would fix.
>
> Remember, this is a regression that is really bothering
> people now. For example, I am now in a position where
> I cannot install the updates of the Linux kernel that Debian
> pushes out to me without patching the kernel with my
> own private build that has one of the known fixes that
> have already been identified as ways to workaround this
> regression while we wait for the full solution that will
> hopefully come later.
>
> Chuck
>
> > P.S.: As the Linux kernel's regression tracker I deal with a lot of
> > reports and sometimes miss something important when writing mails like
> > this. If that's the case here, don't hesitate to tell me in a public
> > reply, it's in everyone's interest to set the public record straight.
> >
> > BTW, let me tell regzbot to monitor this thread:
> >
> > #regzbot ^backmonitor:
> > https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
>

OK, the comments Boris made on the individual patches of
this patch set answers most of my questions. Thx, Boris.

Chuck

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ