Message-ID: <CAFEAcA_dSxRgkvuJW6aHdSv88NkXXHBMAzjyJMTRbW3mXAV3Sg@mail.gmail.com>
Date: Thu, 13 Nov 2014 17:39:00 +0000
From: Peter Maydell <peter.maydell@...aro.org>
To: Will Deacon <will.deacon@....com>
Cc: Chanho Min <chanho.min@....com>,
Russell King <linux@....linux.org.uk>,
Jon Medhurst <tixy@...aro.org>,
Taras Kondratiuk <taras.kondratiuk@...aro.org>,
Olof Johansson <olof@...om.net>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Gunho Lee <gunho.lee@....com>, HyoJun Im <hyojun.im@....com>,
Jongsung Kim <neidhard.kim@....com>,
"linux-man@...r.kernel.org" <linux-man@...r.kernel.org>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
mtk.manpages@...il.com
Subject: Re: [PATCH] ARM: cacheflush: disallow pending signals during cacheflush
On 13 November 2014 11:26, Will Deacon <will.deacon@....com> wrote:
> Whilst I don't think this is the correct solution, I agree that there's
> a potential issue here. We could change the restart return value to
> -ERESTARTNOINTR instead, but I can imagine something like a periodic
> SIGALRM which could prevent a large cacheflush from ever completing.
> Do we actually care about making forward progress in such a scenario?
>
> It is interesting to note that this change has been in mainline since
> May last year without any reported issues. That could be down to a number
> of reasons:
>
> (1) People are using old kernels on ARM
>
> (2) Code doesn't check the return value from the cacheflush system call,
> because it historically always returned 0
...and the documentation comment in the source code didn't say
anything about the syscall having a return value; it only
described the input parameters. I would actually be surprised
if any userspace caller of this syscall checked its return value
(the libgcc cacheflush function used by gcc's clear_cache builtin
doesn't, to pick one widely used example).
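
To illustrate what a caller that does check the result might look
like, here is a minimal sketch in C. Everything named in it
(flush_fn, flush_checked, fake_flush and the retry limit) is
hypothetical and not taken from libgcc or the kernel. The flush
operation is passed in as a callback so the sketch builds on any
architecture; on ARM the callback would presumably wrap the raw
cacheflush syscall. It also assumes that re-issuing the flush over
the whole range after an -EINTR is acceptable, which is
conservative rather than optimal.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: flushes [start, end) and returns 0 on
 * success, or -1 with errno set, the way a raw syscall wrapper would. */
typedef int (*flush_fn)(char *start, char *end);

/* Call flush() and retry while it reports EINTR, giving up after
 * max_tries attempts. Returns 0 on success, -1 with errno set otherwise. */
int flush_checked(flush_fn flush, char *start, char *end, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (flush(start, end) == 0)
            return 0;
        if (errno != EINTR)
            return -1;      /* a real error: do not retry */
    }
    errno = EINTR;          /* still interrupted after max_tries attempts */
    return -1;
}

/* Stand-in for the real flush, for illustration only: fails with EINTR
 * fake_interruptions times, then succeeds. fake_calls counts calls. */
int fake_interruptions = 0;
int fake_calls = 0;

int fake_flush(char *start, char *end)
{
    (void)start;
    (void)end;
    fake_calls++;
    if (fake_interruptions > 0) {
        fake_interruptions--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

The bounded retry is a deliberate choice in this sketch: with a
periodic signal of the kind Will describes, an unbounded loop might
never finish, so the caller gets a failure it can act on instead.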
> (3) People are getting lucky with timing, as this is likely difficult
> to hit
(4) The resulting misbehaviour ("my JIT crashes occasionally and
non-reproducibly at some point possibly some while after the
cacheflush call") will be extremely hard to track back
to this kernel change
> This leaves me with the following questions:
>
> - Has this change been shown to break anything in practice?
> - Can we change the internal return value to -ERESTARTNOINTR?
> - What do we do about kernels that *do* return -EINTR? (>=3.12?)
My suggestion would be "treat this as a bugfix, put it into
stable kernels in the usual way (and assume distros will pick
it up if appropriate)".
> - Can we get a manpage put together to describe this mess?
That would be nice :-)
-- PMM