lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1457734432.6393.199.camel@hpe.com>
Date:	Fri, 11 Mar 2016 15:13:52 -0700
From:	Toshi Kani <toshi.kani@....com>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	"Luis R. Rodriguez" <mcgrof@...nel.org>,
	Toshi Kani <toshi.kani@...com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Dave Airlie <airlied@...hat.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	linux-arch@...r.kernel.org, X86 ML <x86@...nel.org>,
	Daniel Vetter <daniel.vetter@...el.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Borislav Petkov <bp@...en8.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andy Lutomirski <luto@...nel.org>,
	Brian Gerst <brgerst@...il.com>
Subject: Re: Overlapping ioremap() calls, set_memory_*() semantics

Hi Ingo,

My apology for the delay...

On Wed, 2016-03-09 at 10:15 +0100, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@....com> wrote:
> 
> > On Tue, 2016-03-08 at 13:16 +0100, Ingo Molnar wrote:
> > > * Toshi Kani <toshi.kani@....com> wrote:
> > > 
> > > > > So where is the problem? The memtype implementation and hence
> > > > > most ioremap() users are supposed to be safe. set_memory_*() APIs
> > > > > are supposed to be safe as well, as they too go via the memtype
> > > > > API.
> > > > 
> > > > Let me try to summarize...
> > > > 
> > > > The original issue Luis brought up was that drivers written to work
> > > > with MTRR may create a single ioremap range covering multiple cache
> > > > attributes since MTRR can overwrite cache attribute of a certain
> > > > range.  Converting such drivers with PAT-based ioremap interfaces,
> > > > i.e. ioremap_wc() and ioremap_nocache(), requires a separate
> > > > ioremap map for each cache attribute, which can be challenging as
> > > > it may result in overlapping ioremap ranges (in his term) with
> > > > different cache attributes.
> > > > 
> > > > So, Luis asked about 'sematics of overlapping ioremap()' calls.
> > > >  Hence, I responded that aliasing mapping itself is supported, but
> > > > alias with different cache attribute is not.  We have checks in
> > > > place to detect such condition.  Overlapping ioremap calls with a
> > > > different cache attribute either fails or gets redirected to the
> > > > existing cache attribute on x86.
> > > 
> > > Ok, fair enough!
> > > 
> > > So to go back to the original suggestion from Luis, I've quoted it,
> > > but with a s/overlapping/aliased substitution:
> > > 
> > > > I had suggested long ago then that one possible resolution was for
> > > > us to add an API that *enables* aliased ioremap() calls, and only
> > > > use it on select locations in the kernel. This means we only have
> > > > to convert a few users to that call to white list such semantics,
> > > > and by default we'd disable aliased calls. To kick things off -- is
> > > > this strategy agreeable for all other architectures?
> > > 
> > > I'd say that since the overwhelming majority of ioremap() calls are
> > > not aliased, ever, thus making it 'harder' to accidentally alias is
> > > probably a good idea.
> > 
> > Did you mean 'aliased' or 'aliased with different cache attribute'?
> >  The former check might be too strict.
> 
> I'd say even 'same attribute' aliasing is probably relatively rare.
> 
> And 'different but compatible cache attribute' is in fact more of a sign
> that the driver author does the aliasing for a valid _reason_: to have
> two different types of access methods to the same piece of physical
> address space...

Right.  So, if we change to fail ioremap() on aliased cases, it'd be easier
to start with the different attribute case first.  This case should be rare
enough that we can manage to identify such callers and make them use a new
API as necessary.  If we go ahead to fail any aliased cases, it'd be
challenging to manage without a regression or two.

> > > The memtype infrastructure of phyisical memory ranges in that case
> > > acts as a security measure, to avoid unintended (not just physically
> > > incompatible) aliasing. I suspect it would be helpful during driver
> > > development as well.
> > 
> > The memtype infrastructure does not track caller interfaces.  So, it
> > will check against to any map, i.e. kernel & user map.  I assume a
> > kernel map is created before user map, though.
> 
> I don't understand this, could you give me a specific example?

Multiple pfn map interfaces use the memtype.  So, when ioremap() detects an
aliased map in the memtype, it may be created by ioremap(),
remap_pfn_range(), or any pfn map that is tracked.  So, strictly speaking,
we cannot check against other ioremap().

> > > the other problem listed:
> > > 
> > > > As another problem case, set_memor_*() will not fail on MMIO even
> > > > though set_memor_*() is designed only for RAM.
> > > 
> > > So what does this mean exactly? Having WB caching on MMIO area is not
> > > good, but UC, WC and WB sure is still sensible in some cases, right?
> > 
> > I responded to Luis in other email that:
> > > Drivers use ioremap family with a right cache type when mapping MMIO
> > > ranges, ex. ioremap_wc().  They do not need to change the type to
> > > MMIO. RAM is different since it's already mapped with WB at boot-
> > > time. set_memory_*() allows us to change the type from WB, and put it
> > > back to WB.
> 
> So there's two different uses here:
> 
>  - set_memory_x() to set the caching attribute
>  - set_memory_x() to set any of the RWX access attributes
> 
> I'd in fact suggest we split these two uses to avoid confusion: keep 
> set_memory_x() APIs for the RWX access attributes uses, and introduce
> a new API that sets caching attributes:
> 
> 	set_cache_attr_wc()
> 	set_cache_attr_uc()
> 	set_cache_attr_wb()
> 	set_cache_attr_wt()
> 
> it has the same arguments, so it's basically just a rename initially.

I think the "set_memory_" prefix implies that their target is regular
memory only.

> And at that point we could definitely argue that set_cache_attr_*() APIs
> should probably generate a warning for _RAM_, because they mostly make
> sense for MMIO type of physical addresses, right? Regular RAM should
> always be WB.
> 
> Are there cases where we change the caching attribute of RAM for valid
> reasons, outside of legacy quirks?

ati_create_page_map() is one example that it gets a RAM page
by __get_free_page(), and changes it to UC by calling set_memory_uc().
Since RAM is already mapped with WB, the driver has to change its attribute
when it needs a different cache type.  For MMIO, drivers should be able to
create a map with the right attribute since they own the ranges.

> > > Basically if ioremap_uc() and ioremap_wc() is allowed on MMIO areas,
> > > then I see no reason in principle why it should be invalid to change
> > > the area from UC to WC after it has been ioremap()ed.
> > 
> > The current implementation does not support MMIO.
> >  - It does not track cache attribute correctly for MMIO since it uses
> > __pa().
> 
> Hm, so __pa() works on mmio addresses as well - at least on x86. The
> whole mmtype tree is physical address based - which too is MMIO
> compatible in principle.
>
> The only problem would be if it was fundamentally struct page * based -
> but it's not AFAICS. We have a few APIs that work on struct page *
> arrays, but those are just vectored helper APIs AFAICS.
> 
> What am I missing?

I do not think __pa() returns the right physical address for an ioremap'd
virtual address since its virtual address is allocated from the vmalloc
range.  But we can simply use slow_virt_to_phys(), instead.

> >  - It only supports attribute transition of {WB -> NewType -> WB} for
> > RAM.  RAM is tracked differently that WB is treated as "no map".  So,
> > this transition does not cause a conflict on RAM.  This will causes a
> > conflict on MMIO when it is tracked correctly.   
> 
> That looks like a bug?

This is by design since set_memory_xx was introduced for RAM only.  If we
extend it to MMIO, then we need to change how memtype manages MMIO.

Thanks,
-Toshi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ