Message-ID: <20180321063256.bdqcpvgb3auxzwzk@gmail.com>
Date: Wed, 21 Mar 2018 07:32:56 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
David Laight <David.Laight@...lab.com>,
Rahul Lakkireddy <rahul.lakkireddy@...lsio.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"hpa@...or.com" <hpa@...or.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"ganeshgr@...lsio.com" <ganeshgr@...lsio.com>,
"nirranjan@...lsio.com" <nirranjan@...lsio.com>,
"indranil@...lsio.com" <indranil@...lsio.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Fenghua Yu <fenghua.yu@...el.com>,
Eric Biggers <ebiggers3@...il.com>
Subject: Re: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> And even if you ignore that "maintenance problems down the line" issue
> ("we can fix them when they happen") I don't want to see games like
> this, because I'm pretty sure it breaks the optimized xsave by tagging
> the state as being dirty.
That's true - and it would penalize the context switch cost of the affected task
for the rest of its lifetime, as I don't think there's much that clears XINUSE
other than a FINIT, which is rarely done by user-space.
> So no. Don't use vector stuff in the kernel. It's not worth the pain.
I agree, but:
> The *only* valid use is pretty much crypto, and even there it has had issues.
> Benchmarks use big arrays and/or dense working sets etc to "prove" how good the
> vector version is, and then you end up in situations where it's used once per
> fairly small packet for an interrupt, and it's actually much worse than doing it
> by hand.
That's mainly because the XSAVE/XRSTOR done by kernel_fpu_begin()/end() is so
expensive, so this argument is somewhat circular.
IFF it was safe to just use the vector unit then vector unit based crypto would be
very fast for small buffers as well, and would be even faster for larger buffer
sizes. Saving and restoring up to ~1.5K of context is not cheap.
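As a rough illustration of that circularity, here is a toy user-space cost model. It is only a sketch: the struct, the function names and every cycle count are made-up assumptions for illustration, not measurements and not kernel code. It treats the state save/restore around kernel_fpu_begin()/end() as a fixed per-call cost and shows how that cost alone decides where the vector path starts to win:

```c
#include <stdio.h>

/*
 * Toy cost model. Every number fed into it is a hypothetical
 * assumption for illustration, not a measurement.
 */
struct cost_model {
	double scalar_cycles_per_byte;   /* assumed cost of the "by hand" path */
	double vector_cycles_per_byte;   /* assumed cost of the vector path */
	double fpu_save_restore_cycles;  /* assumed fixed cost of one begin/end pair */
};

/* Scalar path: no fixed overhead, cost grows linearly with length. */
double scalar_cost(const struct cost_model *m, unsigned long len)
{
	return m->scalar_cycles_per_byte * (double)len;
}

/* Vector path: pays the fixed save/restore cost once per call. */
double vector_cost(const struct cost_model *m, unsigned long len)
{
	return m->fpu_save_restore_cycles +
	       m->vector_cycles_per_byte * (double)len;
}

/*
 * Smallest buffer length at which the vector path is strictly cheaper.
 * Returns 0 if the vector path never wins under this model.
 */
unsigned long break_even_len(const struct cost_model *m)
{
	double gain = m->scalar_cycles_per_byte - m->vector_cycles_per_byte;

	if (gain <= 0.0)
		return 0;
	return (unsigned long)(m->fpu_save_restore_cycles / gain) + 1;
}
```

With assumed values of 4 cycles/byte scalar, 1 cycle/byte vector and 300 cycles of save/restore, a 64-byte packet is cheaper by hand, while a 4096-byte buffer is much cheaper on the vector path. Set the fixed cost to zero and the vector path wins from the first byte, which is the point made above.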
Thanks,
Ingo