linux-kernel - Re: [PATCH 1/5] timer: kasan: record and print timer stack

Message-ID: <87d03ulqbp.fsf@nanos.tec.linutronix.de>
Date:   Thu, 13 Aug 2020 13:48:10 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Walter Wu <walter-zh.wu@...iatek.com>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Matthias Brugger <matthias.bgg@...il.com>,
        John Stultz <john.stultz@...aro.org>,
        Stephen Boyd <sboyd@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     kasan-dev@...glegroups.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        wsd_upstream <wsd_upstream@...iatek.com>,
        linux-mediatek@...ts.infradead.org,
        Walter Wu <walter-zh.wu@...iatek.com>
Subject: Re: [PATCH 1/5] timer: kasan: record and print timer stack

Walter,

Walter Wu <walter-zh.wu@...iatek.com> writes:
> This patch records the last two timer queueing stacks and prints

"This patch" is useless information as we already know from the subject
line that this is a patch.

git grep 'This patch' Documentation/process/

> up to 2 timer stacks in KASAN report. It is useful for programmers
> to solve use-after-free or double-free memory timer issues.
>
> When timer_setup() or timer_setup_on_stack() is called, then it
> prepares to use this timer and sets timer callback, we store
> this call stack in order to print it in KASAN report.

we store nothing. Don't impersonate code please.

Also, please structure the changelog so that it's easy to understand
what this is about, instead of telling first what the patch does, then
some half-baked information about why this is useful, followed by more
information about what it does.

Something like this:

  For analysing use after free or double free of objects it is helpful
  to preserve usage history which potentially gives a hint about the
  affected code.

  For timers it has turned out to be useful to record the stack trace
  of the timer init call. <ADD technical explanation why this is useful>

  Record the most recent two timer init calls in KASAN which are printed
  on failure in the KASAN report.

See, this gives a clear context, an explanation why it is useful and a
high level description of what it does. The details are in the patch
itself and do not have to be explained in the changelog.

For the technical explanation which you need to add, you really need to
tell what's the advantage or additional coverage vs. existing debug
facilities like debugobjects. Just claiming that it's useful does not
make an argument.

The UAF problem with timers is nasty. If you free an active timer, then
either the softirq which expires the timer will corrupt potentially
reused memory, or the reuse will corrupt the linked list, which makes the
softirq or some unrelated code which adds/removes a different timer
explode in undebuggable ways. debugobjects prevents that because it
tracks per timer state and invokes the fixup function, which keeps the
system alive and also tells you exactly where the free of the active
object happens. That is the really interesting place to look at. The
init function is pretty uninteresting in that case because what you
really want to know is where the active object is freed.

So if KASAN detects a UAF in the timer softirq, the init trace does not
give any useful information, especially not in cases where the timer is
part of another common and frequently allocated/freed data structure.
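
To make the state-tracking argument concrete, here is a minimal userspace
sketch of the idea: per-object state plus a fixup on the free path. All
names (toy_timer, toy_timer_check_free() and so on) are hypothetical and
only model the behaviour described above; this is not the debugobjects
API and not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel's debugobjects API. */
enum toy_state { TOY_NONE, TOY_INIT, TOY_ACTIVE };

struct toy_timer {
	enum toy_state state;	/* tracked per object */
	int free_violations;	/* how often an active timer was freed */
};

static void toy_timer_init(struct toy_timer *t)
{
	assert(t != NULL);
	t->state = TOY_INIT;
	t->free_violations = 0;
}

static void toy_timer_activate(struct toy_timer *t)
{
	assert(t != NULL);
	t->state = TOY_ACTIVE;
}

/*
 * Called on the free path. Returns true if the free is safe as-is.
 * If the timer is still active, count the violation (a real checker
 * would report the stack of the offending free at this point) and
 * apply the fixup: deactivate the timer so that a later expiry cannot
 * touch the freed memory.
 */
static bool toy_timer_check_free(struct toy_timer *t)
{
	bool ok;

	assert(t != NULL);
	ok = (t->state != TOY_ACTIVE);
	if (!ok) {
		t->free_violations++;
		t->state = TOY_INIT;	/* fixup: models removing the pending timer */
	}
	return ok;
}
```

The point of the sketch is where the check sits: the violation is caught
at the free of the active object, not at init time, so the report points
at the code which actually did something wrong.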

>  static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
>  static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
>  static inline void kasan_record_aux_stack(void *ptr) {}
> +static inline void kasan_record_tmr_stack(void *ptr) {}

Duh, so you are adding per object type functions and storage? That's
going to be a huge copy and pasta orgy as every object requires the same
code and extra storage space.

Why not just use kasan_record_aux_stack() for all of this?

The 'call_rcu', 'timer', 'whatever next' printout is not really required
because the stack trace already tells you the function which was
invoked. If the top of stack (TOS) is call_rcu() or do_init_timer() then
it's entirely clear which object is affected. If the two aux records are
not enough then making the array larger is not the end of the world.
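
A minimal userspace sketch of that idea, assuming a fixed number of aux
slots per object and reducing each recorded stack to the name of its top
frame. The names toy_alloc_meta, toy_record_aux_stack() and
toy_slot_is() are hypothetical and only illustrate the shape of a single
generic recorder; this is not the KASAN implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_AUX_SLOTS 2	/* making this larger is the only change needed */

/* Hypothetical per-object metadata; a "stack" is just its top frame name. */
struct toy_alloc_meta {
	const char *aux_stack[TOY_AUX_SLOTS];	/* [0] is the most recent */
};

/*
 * One generic recorder for every object type: shift the older entries
 * down by one and store the newest record in slot 0.
 */
static void toy_record_aux_stack(struct toy_alloc_meta *meta,
				 const char *top_frame)
{
	size_t i;

	assert(meta != NULL);
	for (i = TOY_AUX_SLOTS - 1; i > 0; i--)
		meta->aux_stack[i] = meta->aux_stack[i - 1];
	meta->aux_stack[0] = top_frame;
}

/*
 * No per-type label is stored: the recorded top frame already says
 * which kind of object the record belongs to.
 */
static int toy_slot_is(const struct toy_alloc_meta *meta, size_t slot,
		       const char *frame)
{
	assert(meta != NULL && slot < TOY_AUX_SLOTS);
	return meta->aux_stack[slot] != NULL &&
	       strcmp(meta->aux_stack[slot], frame) == 0;
}
```

With this shape, a call_rcu() site and a timer init site would both call
the same recorder, and no per-object-type function or storage is needed.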

>  #endif /* CONFIG_KASAN_GENERIC */
>  
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index a5221abb4594..ef2da9ddfac7 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -783,6 +783,8 @@ static void do_init_timer(struct timer_list *timer,
>  	timer->function = func;
>  	timer->flags = flags | raw_smp_processor_id();
>  	lockdep_init_map(&timer->lockdep_map, name, key, 0);
> +
> +	kasan_record_tmr_stack(timer);
>  }

Are you sure this is correct for all timers?

This is also called for timers which are temporarily allocated on stack
and for timers which are statically allocated at compile time. How is
that supposed to work?
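
To illustrate the concern: per-object metadata only exists for memory the
allocator tracks, so a recorder has to tell such objects apart from
on-stack and static ones before it touches any metadata. The sketch below
is a userspace toy with hypothetical names (toy_arena, toy_is_tracked(),
toy_record_stack()); whether kasan_record_tmr_stack() performs a
comparable check is not visible in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for allocator memory: only objects inside
 * this arena are assumed to carry metadata. */
static unsigned char toy_arena[256];
static int toy_records;	/* number of stacks actually recorded */

static bool toy_is_tracked(const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t start = (uintptr_t)toy_arena;

	return p >= start && p < start + sizeof(toy_arena);
}

/*
 * Record only for tracked (heap-like) objects. On-stack and static
 * objects have no metadata slot, so they are skipped instead of
 * having some unrelated memory written.
 */
static bool toy_record_stack(const void *ptr)
{
	assert(ptr != NULL);
	if (!toy_is_tracked(ptr))
		return false;
	toy_records++;
	return true;
}
```

In this model a timer living in the arena gets a record, while an
on-stack or static timer is silently ignored. How the real code handles
these cases is exactly what the changelog should state.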

This kind of thing needs to be explained upfront and not left to the
reviewer as an exercise.

Thanks,

        tglx