[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGXu5j+DoXK_9v430xhCzOO1bsuuhe84fjkf7+DZ0r35hUNBBg@mail.gmail.com>
Date: Sat, 7 Sep 2013 20:34:15 -0700
From: Kees Cook <keescook@...omium.org>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: "David S. Miller" <davem@...emloft.net>,
LKML <linux-kernel@...r.kernel.org>,
linux-crypto@...r.kernel.org, Tyler Hicks <tyhicks@...onical.com>
Subject: Re: race condition in crypto larval handling
On Sat, Sep 7, 2013 at 6:32 PM, Herbert Xu <herbert@...dor.apana.org.au> wrote:
> On Fri, Sep 06, 2013 at 04:20:50PM -0700, Kees Cook wrote:
>>
>> In the two-thread situation, the first thread gets a larval with
>> refcnt 2 via crypto_larval_add. (Why 2?) The next thread finds the
>> larval via crypto_larval_add's call to __crypto_alg_lookup() and sees
>> the ref bump to 3. While exiting crypto_alg_mod_lookup, each thread
>> decrements the ref count twice.
>>
>> It seems to me like either each call to crypto_larval_lookup() should
>> result in a refcount bumped by two, or crypto_alg_mod_lookup() should
>> decrement only once, and the initial refcount should be 1 not 2 from
>> crypto_larval_add. But it's not clear to me which is sensible here.
>>
>> What's the right solution here?
>
> First of all thanks a lot for tracking this problem down! It's
> been bothering me for months but I was unable to find a good
> reproducer.
>
> So now that you've identified the problem, the solution is easy.
> crypto_larval_lookup should only ever return a larval if it created
> one. Any larval created earlier must be waited on first before we
> return.
>
> So does this patch fix the crash for you?
>
> diff --git a/crypto/api.c b/crypto/api.c
> index 320ea4d..a2b39c5 100644
> --- a/crypto/api.c
> +++ b/crypto/api.c
> @@ -34,6 +34,8 @@ EXPORT_SYMBOL_GPL(crypto_alg_sem);
> BLOCKING_NOTIFIER_HEAD(crypto_chain);
> EXPORT_SYMBOL_GPL(crypto_chain);
>
> +static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg);
> +
> struct crypto_alg *crypto_mod_get(struct crypto_alg *alg)
> {
> return try_module_get(alg->cra_module) ? crypto_alg_get(alg) : NULL;
> @@ -144,8 +146,11 @@ static struct crypto_alg *crypto_larval_add(const char *name, u32 type,
> }
> up_write(&crypto_alg_sem);
>
> - if (alg != &larval->alg)
> + if (alg != &larval->alg) {
> kfree(larval);
> + if (crypto_is_larval(alg))
> + alg = crypto_larval_wait(alg);
> + }
>
> return alg;
> }
Thanks! This does stop the crash I was seeing on the "race with no
valid alg" path, yay!
However, I noticed on the "good" path (even without the above patch),
I sometimes see a double-kfree triggered by the modprobe process. I
can't, however, see how that's happening, since larval_destroy should
only be called when refcnt == 0.
Here's the debugging output I added that noticed this. pid 15797 and
15798 are my "hash" tool that forks before doing identical AF_ALG
calls for "sha512", which has to be modprobed. The trailing "31" and
"32" was an atomic counter I added just to make sure I wasn't going
crazy:
[ 112.495789] crypto_alg_mod_lookup 15797: looking up [sha512]
[ 112.495815] crypto_larval_lookup 15797: looking up [sha512]
[ 112.495839] crypto_larval_lookup 15797: requesting module [sha512]
[ 112.495915] crypto_alg_mod_lookup 15798: looking up [sha512]
[ 112.495937] crypto_larval_lookup 15798: looking up [sha512]
[ 112.495960] crypto_larval_lookup 15798: requesting module [sha512]
[ 112.498691] crypto_larval_destroy 15808(modprobe): larval
kfree(ffff880123e9da00) refcnt:0 (serial:31)
[ 112.498771] crypto_larval_destroy 15808(modprobe): larval
kfree(ffff880123e9da00) refcnt:0 (serial:32)
[ 112.500888] crypto_larval_lookup 15797: after requesting module
[sha512], got alg ffffffffc0299050
[ 112.500904] crypto_larval_lookup 15797: for [sha512], have alg
ffffffffc0299050
[ 112.500934] crypto_larval_lookup 15797: for [sha512], alg
ffffffffc0299050 is NOT larval
[ 112.500953] crypto_alg_mod_lookup 15797: [sha512] is not larval
ffffffffc0299050
[ 112.501384] crypto_larval_lookup 15798: after requesting module
[sha512], got alg ffffffffc0299050
[ 112.501404] crypto_larval_lookup 15798: for [sha512], have alg
ffffffffc0299050
[ 112.501417] crypto_larval_lookup 15798: for [sha512], alg
ffffffffc0299050 is NOT larval
[ 112.501432] crypto_alg_mod_lookup 15798: [sha512] is not larval
ffffffffc0299050
Regardless, this seems to be a separate bug.
-Kees
--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists