[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87y1487bh2.ffs@tglx>
Date: Tue, 03 Sep 2024 22:01:45 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Jinjie Ruan <ruanjinjie@...wei.com>, linux-kernel@...r.kernel.org,
"Peter Zijlstra
(Intel)" <peterz@...radead.org>, Christophe Leroy
<christophe.leroy@...roup.eu>, Josh Poimboeuf <jpoimboe@...nel.org>
Subject: Re: [PATCH] static_call: Fix a wild-memory-access bug in
static_call_del_module()
On Mon, Sep 02 2024 at 20:01, Jinjie Ruan wrote:
> On 2023/9/15 16:21, Jinjie Ruan wrote:
>>
>> diff --git a/kernel/static_call_inline.c b/kernel/static_call_inline.c
>> index 639397b5491c..e7aa70d33530 100644
>> --- a/kernel/static_call_inline.c
>> +++ b/kernel/static_call_inline.c
>> @@ -256,8 +256,10 @@ static int __static_call_init(struct module *mod,
>> }
>>
>> site_mod = kzalloc(sizeof(*site_mod), GFP_KERNEL);
>> - if (!site_mod)
>> + if (!site_mod) {
>> + key->mods = NULL;
>> return -ENOMEM;
You fail to explain what setting key->mods to NULL prevents and why the
same problem does not happen when the kzalloc() five lines further down
fails.
Actually if you read the comment below carefully then you might find the
reason why the current code explodes and you might figure out why you
can't set the pointer to NULL.
>> + }
>>
>> /*
>> * When the key has a direct sites pointer, extract
>> @@ -422,7 +424,7 @@ static void static_call_del_module(struct module *mod)
>> ;
>>
>> if (!site_mod)
>> - continue;
>> + break;
Now why does this "fix" things?
The unmodified kernel crashes dereferencing key->mods. That means there
is garbage in key->mods. But that garbage is not garbage:
struct static_call_key {
void *func;
union {
/* bit 0: 0 = mods, 1 = sites */
unsigned long type;
struct static_call_mod *mods;
struct static_call_site *sites;
};
};
key->mods is actually key->sites, which holds the static call sites of
the key. It has bit 0 set.
As that code does not expect anything else than a valid static_call_mod
pointer it dereferences it unconditionally and crashes. See?
So why can't you set the pointer to NULL?
If that key is builtin and has actual builtin usage sites, then you
destroy the key. If the memory allocation fail was temporary then a
subsequent update of that static call will find no call site and the
kernel happily invokes the previous function.
Now that the root cause is properly analyzed, the proper fix is
obvious. See uncomplied below.
It does not need to do anything vs. the other kzalloc() fail because
that is harmless. The key has key->mods set, but key->mods->mod and
key->mods->next are both NULL in that case. So the inner loop breaks out
and continues to the next site which has the uninitialized one. If all
keys were already converted to mods _before_ this fail then the loop
simply iterates further, but will nowhere find a site_mod->mod == mod.
Thanks,
tglx
---
--- a/kernel/static_call_inline.c
+++ b/kernel/static_call_inline.c
@@ -411,6 +411,17 @@ static void static_call_del_module(struc
for (site = start; site < stop; site++) {
key = static_call_key(site);
+
+ /*
+ * If the key was not updated due to a memory allocation
+ * failure in __static_call_init() then treating key::sites
+ * as key::mods in the code below would cause random memory
+ * access and #GP. In that case all subsequent sites have
+ * not been touched either, so stop iterating.
+ */
+ if (static_call_key_sites(key))
+ break;
+
if (key == prev_key)
continue;
Powered by blists - more mailing lists