Message-ID: <VI1P193MB07524EFBE97632D575A91EDB99A2A@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM>
Date: Sun, 29 Oct 2023 17:05:54 +0800
From: Juntong Deng <juntong.deng@...look.com>
To: Andrey Konovalov <andreyknvl@...il.com>
Cc: ryabinin.a.a@...il.com, glider@...gle.com, dvyukov@...gle.com,
vincenzo.frascino@....com, akpm@...ux-foundation.org,
kasan-dev@...glegroups.com, linux-mm@...ck.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kernel-mentees@...ts.linuxfoundation.org"
<linux-kernel-mentees@...ts.linuxfoundation.org>
Subject: Re: [RFC] mm/kasan: Add Allocation, Free, Error timestamps to KASAN
report
On 2023/10/26 3:22, Andrey Konovalov wrote:
> On Tue, Oct 17, 2023 at 9:40 PM Juntong Deng <juntong.deng@...look.com> wrote:
>>
>> The idea came from the bug I was fixing recently,
>> 'KASAN: slab-use-after-free Read in tls_encrypt_done'.
>>
>> This bug is caused by subtle race condition, where the data structure
>> is freed early on another CPU, resulting in use-after-free.
>>
>> Like this bug, some of the use-after-free bugs are caused by race
>> condition, but it is not easy to quickly conclude that the cause of the
>> use-after-free is race condition if only looking at the stack trace.
>>
>> I did not think this use-after-free was caused by race condition at the
>> beginning, it took me some time to read the source code carefully and
>> think about it to determine that it was caused by race condition.
>>
>> By adding timestamps for Allocation, Free, and Error to the KASAN
>> report, it will be much easier to determine if use-after-free is
>> caused by race condition.
>
> An alternative would be to add the CPU number to the alloc/free stack
> traces. Something like:
>
> Allocated by task 42 on CPU 2:
> (stack trace)
>
> The bad access stack trace already prints the CPU number.

Yes, that is a great idea, and the CPU number would help a lot.

But I do not think the CPU number can completely replace the free
timestamp, because some objects are legitimately freed on another CPU.
We need the free timestamp to distinguish an object that was freed a
long time ago from one that was freed while the current operation was
in progress.

I think both the CPU number and the timestamp should be displayed. More
information would help us find the real cause of the error faster.
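
To make the distinction concrete, here is a minimal userspace sketch of
how a reader might combine the two pieces of information. It is not
kernel code and not part of any patch: struct access_event,
enum uaf_hint, classify_uaf() and the window_ns threshold are all
hypothetical names invented for this illustration, and the timestamps
are assumed to be nanosecond values taken from the same clock.

```c
#include <stdint.h>

/* Hypothetical record: when and where an event happened.
 * This is an illustration, not a real KASAN data structure. */
struct access_event {
	uint64_t ts_ns;	/* timestamp in nanoseconds (assumed unit) */
	int cpu;	/* CPU the event ran on */
};

/* Hypothetical hints a reader could derive from a report. */
enum uaf_hint {
	UAF_HINT_STALE,		/* freed long before the bad access */
	UAF_HINT_RECENT,	/* freed shortly before, on the same CPU */
	UAF_HINT_RACE_LIKELY	/* freed shortly before, on another CPU */
};

/*
 * Classify a use-after-free from the free and error events.
 * window_ns is an arbitrary threshold chosen by the caller: a free
 * that happened more than window_ns before the bad access is treated
 * as stale. A different CPU alone does not imply a race, which is why
 * the time gap is checked first.
 */
static enum uaf_hint classify_uaf(struct access_event free_ev,
				  struct access_event err_ev,
				  uint64_t window_ns)
{
	/* Clamp to zero if the clocks appear slightly out of order. */
	uint64_t gap_ns = err_ev.ts_ns >= free_ev.ts_ns ?
			  err_ev.ts_ns - free_ev.ts_ns : 0;

	if (gap_ns > window_ns)
		return UAF_HINT_STALE;

	return free_ev.cpu != err_ev.cpu ? UAF_HINT_RACE_LIKELY
					 : UAF_HINT_RECENT;
}
```

With only the CPU number, the stale case and the likely-race case look
the same whenever the free ran on another CPU. The gap between the free
and error timestamps is what separates them. The result is only a hint
for the person reading the report, not proof of a race.
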
Should I implement these features?