linux-kernel - Re: Problem with atomic accesses in pstore on some ARM CPUs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CABXOdTfhXhq6pbqAqGQwHcsCXvyshS8JbDGQCBKxfd-2s=M-fw@mail.gmail.com>
Date:	Tue, 16 Aug 2016 13:26:32 -0700
From:	Guenter Roeck <groeck@...gle.com>
To:	Colin Cross <ccross@...roid.com>
Cc:	Mark Rutland <mark.rutland@....com>,
	Will Deacon <will.deacon@....com>,
	Tony Luck <tony.luck@...el.com>,
	Kees Cook <keescook@...omium.org>,
	Jeffy Chen <jeffy.chen@...k-chips.com>,
	Douglas Anderson <dianders@...omium.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Robin Murphy <robin.murphy@....com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: Problem with atomic accesses in pstore on some ARM CPUs

On Tue, Aug 16, 2016 at 10:35 AM, Colin Cross <ccross@...roid.com> wrote:
> On Mon, Aug 15, 2016 at 3:15 PM, Mark Rutland <mark.rutland@....com> wrote:
>> On Tue, Aug 16, 2016 at 08:02:53AM -0700, Guenter Roeck wrote:
>>> On Tue, Aug 16, 2016 at 6:21 AM, Will Deacon <will.deacon@....com> wrote:
>>> > On Tue, Aug 16, 2016 at 06:14:53AM -0700, Guenter Roeck wrote:
>>> >> On Tue, Aug 16, 2016 at 3:32 AM, Robin Murphy <robin.murphy@....com> wrote:
>>> >> > On 16/08/16 00:19, Guenter Roeck wrote:
>>> >> >> we are having a problem with atomic accesses in pstore on some ARM
>>> >> >> CPUs (specifically rk3288 and rk3399). With those chips, atomic
>>> >> >> accesses fail with both pgprot_noncached and pgprot_writecombine
>>> >> >> memory. Atomic accesses do work when selecting PAGE_KERNEL protection.
>>> >> >
>>> >> > What's the pstore backed by? I'm guessing it's not normal DRAM.
>>> >> >
>>> >>
>>> >> it is normal DRAM.
>>> >
>>> > In which case, why does it need to be mapped with weird attributes?
>>> > Is there an alias in the linear map you can use?
>>> >
>>>
>>> I don't really _want_ to do anything besides using pstore as-is, or,
>>> in other words, to have the upstream kernel work with the affected
>>> systems.
>>>
>>> The current pstore code runs the following code for memory with
>>> pfn_valid() = true.
>>>
>>>         if (memtype)
>>>                 prot = pgprot_noncached(PAGE_KERNEL);
>>>         else
>>>                 prot = pgprot_writecombine(PAGE_KERNEL);
>>>         ...
>>>         vaddr = vmap(pages, page_count, VM_MAP, prot);
>>>
>>> It then uses the memory pointed to by vaddr for atomic operations.
>>
>> This means that the generic ramoops / pstore code is making non-portable
>> assumptions about memory types.
>>
>> So _something_ has to happen to that code.
>>
>>> In my case, both protection options don't work. Everything works fine
>>> (or at least doesn't create an exception) if I use
>>>         vaddr = vmap(pages, page_count, VM_MAP, PAGE_KERNEL);
>>> instead.
>>
>> Architecturally, that will give you a memory type to which we can safely use
>> atomics on.
>>
>> It would be nice to know why the ramoops/pstore code must use atomics, and
>> exactly what it's trying to achieve (i.e. is this just for serialisation, or an
>> attempt to ensure persistence).
>>
>> Depending on that, it may be possible to fix things more generically by using
>> memremap by default, for example, and only allowing uncached mappings on those
>> architectures which support them.
>
> persistent_ram uses atomic ops in uncached memory to store the start
> and end positions in the ringbuffer so that the state of the
> ringbuffer will be valid if the kernel crashes at any time.  This was
> inherited from Android's ram_console implementation, and worked
> through armv7.  It has been causing more and more problems recently,
> see for example 027bc8b08242c59e19356b4b2c189f2d849ab660 (pstore-ram:
> Allow optional mapping with pgprot_noncached) and
> 7ae9cb81933515dc7db1aa3c47ef7653717e3090 (pstore-ram: Fix hangs by
> using write-combine mappings).
>
> Maybe it should be replaced with a spinlock in normal ram protecting
> writes to the uncached region.

The necessary functions already exist, and are used for memory mapped
with ioremap() / ioremap_wc(). They were introduced with commit
0405a5cec3 ("pstore/ram: avoid atomic accesses for ioremapped
regions"), and the description in that patch sounds quite similar to
the current problem. Given that, would it be acceptable to remove
buffer_start_add_atomic() and buffer_size_add_atomic(), and always use
buffer_start_add_locked() and buffer_size_add_locked() instead ? Those
functions still use atomic_set() and atomic_read(), which works fine
in my tests. The only difference is that a spinlock in main memory is
used instead of atomic_cmpxchg().

Thanks,
Guenter