Message-ID: <20200714170234.GD2080@chrisdown.name>
Date: Tue, 14 Jul 2020 18:02:34 +0100
From: Chris Down <chris@...isdown.name>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Borislav Petkov <bp@...en8.de>, linux-kernel@...r.kernel.org,
sean.j.christopherson@...el.com, torvalds@...ux-foundation.org,
x86@...nel.org, kernel-team@...com,
Matthew Garrett <matthewgarrett@...gle.com>
Subject: Re: [PATCH -v2.1] x86/msr: Filter MSR writes
Luck, Tony writes:
>On Tue, Jul 14, 2020 at 05:04:48PM +0100, Chris Down wrote:
>> Borislav Petkov writes:
>> > On Tue, Jul 14, 2020 at 01:19:55PM +0100, Chris Down wrote:
>> > > That is, even with pr_err_ratelimited, we still end up logging on basically
>> > > every single write, even though it's from the same TGID writing to the same
>> > > MSRs, and end up becoming >80% of kmsg.
>> > >
>> > > Of course, one can boot with `allow_writes=1` to avoid these messages at
>> >
>> > Yes, use that.
>> >
>> > From a quick scan over that "tool" you pointed me at, it pokes at some
>> > MSRs from userspace which the kernel *also* writes to and this is
>> > exactly what should not be allowed.
>>
>> I don't think we're in disagreement about that. My concern is strictly about
>> the amount of spam caused for some of those existing use cases during the
>> transition phase. People should know that their tools would break, but there
>> shouldn't be so many messages generated that it inevitably pushes other
>> useful information out of the kmsg buffer.
>
>Maybe we just need smarter filtering of warnings. It doesn't
>seem at all useful to warn for the same MSR 1000's of times.
>Maybe keep a count of warnings for each MSR and just stop
>all reports when reach a threshold?

That's also a fine solution, albeit more complex than just using the
existing custom ratelimit_state infrastructure. Doing so probably also means
we'd miss out on some of the other stuff that comes for free with it.

My only other concern with ratelimiting per-TGID or per-MSR was that the
ratelimit cache table could become unwieldy, but if we keep it simple by
limiting the size and not printing after we reach that, that sounds fine too.

Any solution which means that we avoid saturating kmsg for workloads which
currently twiddle MSRs sounds fine to me. People should know that we don't
support or encourage this, but it shouldn't be at the cost of potentially
pushing everything else out of the kmsg buffer.
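
To make the "keep a count per MSR, cap the table size, and go quiet after
that" idea concrete, here is a rough userspace C sketch. It is not taken from
the patch: the names msr_warn_slot, msr_warn_table, msr_should_warn,
MSR_WARN_SLOTS and MSR_WARN_LIMIT are all hypothetical, the limits are
arbitrary, and real kernel code would also need locking, which is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, chosen only for illustration. */
#define MSR_WARN_SLOTS 8  /* maximum number of distinct MSRs tracked */
#define MSR_WARN_LIMIT 3  /* warnings allowed per tracked MSR */

struct msr_warn_slot {
	uint32_t msr;        /* MSR index this slot tracks */
	unsigned int count;  /* warnings already emitted for it */
	bool used;           /* whether the slot is occupied */
};

/* Fixed-size table, so memory use stays bounded. */
static struct msr_warn_slot msr_warn_table[MSR_WARN_SLOTS];

/*
 * Return true if a write to `msr` should still be logged.
 * Once an MSR reaches MSR_WARN_LIMIT, or the table is full and the
 * MSR is not in it, the function returns false and nothing is printed.
 */
static bool msr_should_warn(uint32_t msr)
{
	struct msr_warn_slot *free_slot = NULL;

	for (size_t i = 0; i < MSR_WARN_SLOTS; i++) {
		struct msr_warn_slot *s = &msr_warn_table[i];

		if (s->used && s->msr == msr) {
			if (s->count >= MSR_WARN_LIMIT)
				return false;
			s->count++;
			return true;
		}
		if (!s->used && !free_slot)
			free_slot = s;  /* remember the first empty slot */
	}

	if (!free_slot)
		return false;  /* table full: stay quiet for new MSRs */

	free_slot->used = true;
	free_slot->msr = msr;
	free_slot->count = 1;
	return true;
}
```

A caller would check msr_should_warn(reg) before printing, so the worst case
is MSR_WARN_SLOTS * MSR_WARN_LIMIT messages in total, no matter how often
userspace writes.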