Message-ID: <CALCETrXMiSpMMi-4P8FTMeH_0J+6eNj0RAVJDhZYQOZub1jUOA@mail.gmail.com>
Date: Mon, 15 Sep 2014 18:22:56 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Toshi Kani <toshi.kani@...com>
Cc: "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Arnd Bergmann <arnd@...db.de>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Juergen Gross <jgross@...e.com>,
Stefan Bader <stefan.bader@...onical.com>,
Henrique de Moraes Holschuh <hmh@....eng.br>,
Yigal Korman <yigal@...xistor.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Subject: Re: [PATCH v2 6/6] x86, pat: Update documentation for WT changes

On Mon, Sep 15, 2014 at 2:19 PM, Toshi Kani <toshi.kani@...com> wrote:
> On Wed, 2014-09-10 at 15:34 -0600, Toshi Kani wrote:
>> On Wed, 2014-09-10 at 13:29 -0700, Andy Lutomirski wrote:
>> > On Wed, Sep 10, 2014 at 1:12 PM, Toshi Kani <toshi.kani@...com> wrote:
>> > > On Wed, 2014-09-10 at 11:30 -0700, Andy Lutomirski wrote:
>> > >> On Wed, Sep 10, 2014 at 9:51 AM, Toshi Kani <toshi.kani@...com> wrote:
>> > >> > +Drivers may map the entire NV-DIMM range with ioremap_cache and then change
>> > >> > +a specific range to wt with set_memory_wt.
>> > >>
>> > >> That's mighty specific :)
>> > >
>> > > How about below?
>> > >
>> > > Drivers may use set_memory_wt to set WT type for cached reserve ranges.
>> >
>> > Do they have to be cached?
>>
>> Yes, set_memory_xyz only supports WB->type->WB transition.
>>
>> > How about:
>> >
>> > Drivers may call set_memory_wt on ioremapped ranges. In this case,
>> > there is no need to change the memory type back before calling
>> > iounmap.
>> >
>> > (Or only on cached ioremapped ranges if that is, in fact, the case.)
>>
>> Sounds good. Yes, I will use cached ioremapped ranges.
>
> Well, testing "no need to change the memory type back before calling
> iounmap" turns out to be a good test case. I realized that
> set_memory_xyz only works properly for RAM. There are two problems with
> using this interface for ioremapped ranges.
>
> 1) set_memory_xyz calls reserve_memtype() with __pa(addr). However,
> __pa() translates the addr into a fake physical address when it is an
> ioremapped address.
>
> 2) reserve_memtype() does not work for set_memory_xyz. For RAM, the WB
> state is not tracked. Hence, WB->new->WB is not considered a
> conflict. For ioremapped ranges, WB is tracked in the same way as other
> cache types. Hence, WB->new is considered a conflict.
>
> In my previous testing, 2) went undetected since 1) led to using a fake
> physical address which was not tracked for WB. This made ioremapped
> ranges work just like RAM. :-(
>
> Anyway, 1) can be fixed by using slow_virt_to_phys() instead of __pa().
> set_memory_xyz is already slow, and this makes it even slower.
>
> For 2), WB has to be continuously tracked in order to detect aliasing,
> e.g. ioremap_cache and ioremap to the same address. So, I think
> reserve_memtype() needs the following changes:
> - Add a new arg to indicate whether an operation creates a new mapping or
> changes a cache attribute.
> - Track overlapping maps so that a cache type change to an overlapping
> range can be detected and rejected.
>
> This level of change requires a separate set of patches if we want to
> support ioremapped ranges. So, I am considering taking one of the two
> options below.
>
> A) Drop the patch for set_memory_wt.
>
> B) Keep the patch for set_memory_wt, but document that it fails with
> -EINVAL and its use is for RAM only.
>
I vote A. I see no great reason to add code that can't be used. Once
someone needs this ability, they can add it :)

It's too bad that ioremap is called ioremap and not iomap. Otherwise
the natural solution would be to add a different function called
ioremap_wt that's like set_memory_wt but for ioremap ranges. Calling
it ioreremap_wt sounds kind of disgusting :)

--Andy