[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4hFLnY4b8a7z+rWVeayHka4BLZyXse_ExSeRWuBRxjCwA@mail.gmail.com>
Date: Mon, 8 Feb 2021 17:03:25 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Kees Cook <keescook@...omium.org>
Cc: Jonathan Corbet <corbet@....net>, linux-cxl@...r.kernel.org,
Ben Widawsky <ben.widawsky@...el.com>,
Linux ACPI <linux-acpi@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>,
Linux PCI <linux-pci@...r.kernel.org>,
Bjorn Helgaas <helgaas@...nel.org>,
Chris Browy <cbrowy@...ry-design.com>,
Ira Weiny <ira.weiny@...el.com>,
Jon Masters <jcm@...masters.org>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Rafael Wysocki <rafael.j.wysocki@...el.com>,
Randy Dunlap <rdunlap@...radead.org>,
Vishal Verma <vishal.l.verma@...el.com>,
daniel.lll@...baba-inc.com,
"John Groves (jgroves)" <jgroves@...ron.com>,
"Kelley, Sean V" <sean.v.kelley@...el.com>
Subject: Re: [PATCH 08/14] taint: add taint for direct hardware access
On Mon, Feb 8, 2021 at 3:36 PM Dan Williams <dan.j.williams@...el.com> wrote:
>
> On Mon, Feb 8, 2021 at 2:09 PM Kees Cook <keescook@...omium.org> wrote:
> >
> > On Mon, Feb 08, 2021 at 02:00:33PM -0800, Dan Williams wrote:
> > > [ add Jon Corbet as I'd expect him to be Cc'd on anything that
> > > generically touches Documentation/ like this, and add Kees as the last
> > > person who added a taint (tag you're it) ]
> > >
> > > Jon, Kees, are either of you willing to ack this concept?
> > >
> > > Top-posting to add more context for the below:
> > >
> > > This taint is proposed because it has implications for
> > > CONFIG_LOCK_DOWN_KERNEL among other things. These CXL devices
> > > implement memory like DDR would, but unlike DDR there are
> > > administrative / configuration commands that demand kernel
> > > coordination before they can be sent. The posture taken with this
> > > taint is "guilty until proven innocent" for commands that have yet to
> > > be explicitly allowed by the driver. This is different than NVME for
> > > example where an errant vendor-defined command could destroy data on
> > > the device, but there is no wider threat to system integrity. The
> > > taint allows a pressure release valve for any and all commands to be
> > > sent, but flagged with WARN_TAINT_ONCE if the driver has not
> > > explicitly enabled it on an allowed list of known-good / kernel
> > > coordinated commands.
> > >
> > > On Fri, Jan 29, 2021 at 4:25 PM Ben Widawsky <ben.widawsky@...el.com> wrote:
> > > >
> > > > For drivers that moderate access to the underlying hardware it is
> > > > sometimes desirable to allow userspace to bypass restrictions. Once
> > > > userspace has done this, the driver can no longer guarantee the sanctity
> > > > of either the OS or the hardware. When in this state, it is helpful for
> > > > kernel developers to be made aware (via this taint flag) of this fact
> > > > for subsequent bug reports.
> > > >
> > > > Example usage:
> > > > - Hardware xyzzy accepts 2 commands, waldo and fred.
> > > > - The xyzzy driver provides an interface for using waldo, but not fred.
> > > > - quux is convinced they really need the fred command.
> > > > - xyzzy driver allows quux to frob hardware to initiate fred.
> > > > - kernel gets tainted.
> > > > - turns out fred command is borked, and scribbles over memory.
> > > > - developers laugh while closing quux's subsequent bug report.
> >
> > But a taint flag only lasts for the current boot. If this is a drive, it
> > could still be compromised after reboot. It sounds like this taint is
> > really only for ephemeral things? "vendor shenanigans" is a pretty giant
> > scope ...
> >
>
> That is true. This is more about preventing an ecosystem / cottage
> industry of tooling built around bypassing the kernel. So the kernel
> complains loudly and hopefully prevents vendor tooling from
> propagating and instead directs that development effort back to the
> native tooling. However for the rare "I know what I'm doing" cases,
> this tainted kernel bypass lets some experimentation and debug happen,
> but the kernel is transparent that when the capability ships in
> production it needs to be a native implementation.
>
> So it's less, "the system integrity is compromised" and more like
> "you're bypassing the development process that ensures sanity for CXL
> implementations that may take down a system if implemented
> incorrectly". For example, NVME reset is a non-invent, CXL reset can
> be like surprise removing DDR DIMM.
>
> Should this be more tightly scoped to CXL? I had hoped to use this in
> other places in LIBNVDIMM, but I'm ok to lose some generality for the
> specific concerns that make CXL devices different than other PCI
> endpoints.
As I type this out it strikes me that plain WARN already does
TAINT_WARN and meets the spirit of what is trying to be achieved.
Appreciate the skeptical eye Kees, we'll drop this one.
Powered by blists - more mailing lists