Message-ID: <MWHPR2201MB10723DDEE1492EA0BB6AEE8CD0D79@MWHPR2201MB1072.namprd22.prod.outlook.com>
Date: Tue, 24 May 2022 03:10:37 +0000
From: "Liu, Congyu" <liu3101@...due.edu>
To: Dmitry Vyukov <dvyukov@...gle.com>
CC: "andreyknvl@...il.com" <andreyknvl@...il.com>,
"kasan-dev@...glegroups.com" <kasan-dev@...glegroups.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2] kcov: update pos before writing pc in trace function
+Andrew Morton
________________________________________
From: Liu, Congyu <liu3101@...due.edu>
Sent: Monday, May 23, 2022 23:08
To: Dmitry Vyukov
Cc: andreyknvl@...il.com; kasan-dev@...glegroups.com; linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] kcov: update pos before writing pc in trace function
It was actually first found in a kernel trace module I wrote for my research
project. For each call instruction, I instrumented one trace function before it
and one trace function after it, and expected the traces generated by the two
to match, since I only instrumented calls that return. But it turned out that
they occasionally did not match, in a non-deterministic manner.
Eventually I figured out that this was caused by trace entries being
overwritten from interrupt context. I then looked to kcov for a solution, but
it suffered from the same issue... so here's this patch :).
________________________________________
From: Dmitry Vyukov <dvyukov@...gle.com>
Sent: Monday, May 23, 2022 4:38
To: Liu, Congyu
Cc: andreyknvl@...il.com; kasan-dev@...glegroups.com; linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] kcov: update pos before writing pc in trace function
On Mon, 23 May 2022 at 07:35, Congyu Liu <liu3101@...due.edu> wrote:
>
> In __sanitizer_cov_trace_pc(), previously we wrote pc before updating pos.
> However, some early interrupt code could bypass the check_kcov_mode()
> check and invoke __sanitizer_cov_trace_pc(). If such an interrupt is raised
> between writing pc and updating pos, the pc could be overwritten by the
> recursive __sanitizer_cov_trace_pc().
>
> As suggested by Dmitry, we could update pos before writing pc to avoid
> such interleaving.
>
> Apply the same change to write_comp_data().
>
> Signed-off-by: Congyu Liu <liu3101@...due.edu>
This version looks good to me.
I wonder how you encountered this? Do you mind sharing a bit about
what you are doing with kcov?
Reviewed-by: Dmitry Vyukov <dvyukov@...gle.com>
Thanks
> ---
> PATCH v2:
> * Update pos before writing pc as suggested by Dmitry.
>
> PATCH v1:
> https://lore.kernel.org/lkml/20220517210532.1506591-1-liu3101@purdue.edu/
> ---
> kernel/kcov.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/kcov.c b/kernel/kcov.c
> index b3732b210593..e19c84b02452 100644
> --- a/kernel/kcov.c
> +++ b/kernel/kcov.c
> @@ -204,8 +204,16 @@ void notrace __sanitizer_cov_trace_pc(void)
> /* The first 64-bit word is the number of subsequent PCs. */
> pos = READ_ONCE(area[0]) + 1;
> if (likely(pos < t->kcov_size)) {
> - area[pos] = ip;
> + /* Previously we write pc before updating pos. However, some
> + * early interrupt code could bypass check_kcov_mode() check
> + * and invoke __sanitizer_cov_trace_pc(). If such interrupt is
> + * raised between writing pc and updating pos, the pc could be
> + * overwritten by the recursive __sanitizer_cov_trace_pc().
> + * Update pos before writing pc to avoid such interleaving.
> + */
> WRITE_ONCE(area[0], pos);
> + barrier();
> + area[pos] = ip;
> }
> }
> EXPORT_SYMBOL(__sanitizer_cov_trace_pc);
> @@ -236,11 +244,13 @@ static void notrace write_comp_data(u64 type, u64 arg1, u64 arg2, u64 ip)
> start_index = 1 + count * KCOV_WORDS_PER_CMP;
> end_pos = (start_index + KCOV_WORDS_PER_CMP) * sizeof(u64);
> if (likely(end_pos <= max_pos)) {
> + /* See comment in __sanitizer_cov_trace_pc(). */
> + WRITE_ONCE(area[0], count + 1);
> + barrier();
> area[start_index] = type;
> area[start_index + 1] = arg1;
> area[start_index + 2] = arg2;
> area[start_index + 3] = ip;
> - WRITE_ONCE(area[0], count + 1);
> }
> }
>
> --
> 2.34.1
>
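To make the interleaving described in the commit message concrete, here is a minimal user-space sketch. It is not the kernel code: `area`, `AREA_WORDS`, `record_old()`, `record_new()`, the `irq_hook_t` callback, and the values 0xAAAA and 0xBBBB are all hypothetical names chosen for illustration, and the "interrupt" is simulated by a plain function call placed in the window between the two stores.

```c
#include <assert.h>
#include <stdint.h>

#define AREA_WORDS 8

/* Hypothetical stand-in for the kcov area: area[0] is the record count. */
static uint64_t area[AREA_WORDS];

/* Hypothetical hook, called where an interrupt is assumed to fire. */
typedef void (*irq_hook_t)(void);

static void reset_area(void)
{
	for (int i = 0; i < AREA_WORDS; i++)
		area[i] = 0;
}

/* Pre-patch order: write pc first, then publish pos. */
static void record_old(uint64_t pc, irq_hook_t irq)
{
	uint64_t pos = area[0] + 1;

	if (pos < AREA_WORDS) {
		area[pos] = pc;
		if (irq)
			irq();	/* simulated interrupt inside the window */
		area[0] = pos;
	}
}

/* Patched order: publish pos first, then write pc. */
static void record_new(uint64_t pc, irq_hook_t irq)
{
	uint64_t pos = area[0] + 1;

	if (pos < AREA_WORDS) {
		area[0] = pos;
		if (irq)
			irq();	/* simulated interrupt inside the window */
		area[pos] = pc;
	}
}

/* Simulated interrupt handlers that record their own pc recursively. */
static void irq_old(void) { record_old(0xBBBB, 0); }
static void irq_new(void) { record_new(0xBBBB, 0); }

void check_interleavings(void)
{
	/* Old order: both calls pick slot 1, so only one record survives. */
	reset_area();
	record_old(0xAAAA, irq_old);
	assert(area[0] == 1);
	assert(area[1] == 0xBBBB);	/* 0xAAAA was overwritten */

	/* New order: the nested call sees the updated count and uses slot 2. */
	reset_area();
	record_new(0xAAAA, irq_new);
	assert(area[0] == 2);
	assert(area[1] == 0xAAAA);
	assert(area[2] == 0xBBBB);
}
```

In this single-threaded sketch, program order alone keeps the two stores in sequence. The actual patch uses WRITE_ONCE() and barrier() so that the compiler keeps the update of pos ahead of the write of pc.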