[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <VI1P193MB0752E3CA6B2660860BD3923D99BAA@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM>
Date: Thu, 23 Nov 2023 05:36:08 +0800
From: Juntong Deng <juntong.deng@...look.com>
To: Marco Elver <elver@...gle.com>
Cc: glider@...gle.com, dvyukov@...gle.com, akpm@...ux-foundation.org,
kasan-dev@...glegroups.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
linux-kernel-mentees@...ts.linuxfoundation.org
Subject: Re: [PATCH] kfence: Replace local_clock() with
ktime_get_boot_fast_ns()
On 2023/11/23 4:35, Marco Elver wrote:
> On Wed, 22 Nov 2023 at 21:01, Juntong Deng <juntong.deng@...look.com> wrote:
>>
>> The time obtained by local_clock() is the local CPU time, which may
>> drift between CPUs and is not suitable for comparison across CPUs.
>>
>> It is possible for allocation and free to occur on different CPUs,
>> and using local_clock() to record timestamps may cause confusion.
>
> The same problem exists with printk logging.
>
>> ktime_get_boot_fast_ns() is based on clock sources and can be used
>> reliably and accurately for comparison across CPUs.
>
> You may be right here, however, the choice of local_clock() was
> deliberate: it's the same timestamp source that printk uses.
>
> Also, on systems where there is drift, the arch selects
> CONFIG_HAVE_UNSTABLE_SCHED_CLOCK (like on x86) and the drift is
> generally bounded.
>
>> Signed-off-by: Juntong Deng <juntong.deng@...look.com>
>> ---
>> mm/kfence/core.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
>> index 3872528d0963..041c03394193 100644
>> --- a/mm/kfence/core.c
>> +++ b/mm/kfence/core.c
>> @@ -295,7 +295,7 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex
>> track->num_stack_entries = num_stack_entries;
>> track->pid = task_pid_nr(current);
>> track->cpu = raw_smp_processor_id();
>> - track->ts_nsec = local_clock(); /* Same source as printk timestamps. */
>> + track->ts_nsec = ktime_get_boot_fast_ns();
>
> You have ignored the comment placed here - now it's no longer the same
> source as printk timestamps. I think not being able to correlate
> information from KFENCE reports with timestamps in lines from printk
> is worse.
>
> For now, I have to Nack: Unless you can prove that
> ktime_get_boot_fast_ns() can still be correlated with timestamps from
> printk timestamps, I think this change only trades one problem for
> another.
>
> Thanks,
> -- Marco
Honestly, the possibility of accurately matching a message in the printk
log by the timestamp in the kfence report is very low, since allocation
and free do not directly correspond to a certain event.
Since time drifts across CPUs, timestamps may be different even if
allocation and free can correspond to a certain event.
If we really need to find the relevant printk logs by the timestamps in
the kfence report, all we can do is to look for messages that are within
a certain time range.
If we are looking for messages in a certain time range, there is not
much difference between local_clock() and ktime_get_boot_fast_ns().
Also, this patch is in preparation for my next patch.
My next patch is to show the PID, CPU number, and timestamp when the
error occurred, in this case time drift from different CPUs can
cause confusion.
For example, use-after-free caused by a subtle race condition, in which
the time between the free and the error occur will be very close.
Time drift from different CPUs may cause it to appear in the report that
the error timestamp precedes the free timestamp.
Powered by blists - more mailing lists