linux-kernel - Re: [PATCH 1/5] timer: kasan: record and print timer stack

Message-ID: <1597322937.9999.42.camel@mtksdccf07>
Date:   Thu, 13 Aug 2020 20:48:57 +0800
From:   Walter Wu <walter-zh.wu@...iatek.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Matthias Brugger <matthias.bgg@...il.com>,
        John Stultz <john.stultz@...aro.org>,
        "Stephen Boyd" <sboyd@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        <kasan-dev@...glegroups.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        wsd_upstream <wsd_upstream@...iatek.com>,
        <linux-mediatek@...ts.infradead.org>
Subject: Re: [PATCH 1/5] timer: kasan: record and print timer stack

Hi Thomas,

Please ignore my previous mail. Thanks.


On Thu, 2020-08-13 at 13:48 +0200, Thomas Gleixner wrote:
> Walter,
> 
> Walter Wu <walter-zh.wu@...iatek.com> writes:
> > This patch records the last two timer queueing stacks and prints
> 
> "This patch" is useless information as we already know from the subject
> line that this is a patch.
> 
> git grep 'This patch' Documentation/process/
> 

Thanks for the information.

> > up to 2 timer stacks in KASAN report. It is useful for programmers
> > to solve use-after-free or double-free memory timer issues.
> >
> > When timer_setup() or timer_setup_on_stack() is called, then it
> > prepares to use this timer and sets timer callback, we store
> > this call stack in order to print it in KASAN report.
> 
> we store nothing. Don't impersonate code please.
> 
> Also please structure the changelog in a way that it's easy to
> understand what this is about instead of telling first what the patch
> does and then some half baked information why this is useful followed by
> more information about what it does.
> 
> Something like this:
> 
>   For analysing use after free or double free of objects it is helpful
>   to preserve usage history which potentially gives a hint about the
>   affected code.
> 
>   For timers it has turned out to be useful to record the stack trace
>   of the timer init call. <ADD technical explanation why this is useful>
>  
>   Record the most recent two timer init calls in KASAN which are printed
>   on failure in the KASAN report.
> 
> See, this gives a clear context, an explanation why it is useful and a
> high level description of what it does. The details are in the patch
> itself and do not have to be explained in the changelog.
> 

Thanks for your explanation. Our patches will use this as a template from
now on.

> For the technical explanation which you need to add, you really need to
> tell what's the advantage or additional coverage vs. existing debug
> facilities like debugobjects. Just claiming that it's useful does not
> make an argument.
> 

We originally wanted KASAN to offer similar functionality. It may not be
able to replace debugobjects completely, but KASAN can have this ability.

> The UAF problem with timers is nasty because if you free an active timer
> then either the softirq which expires the timer will corrupt potentially
> reused memory or the reuse will corrupt the linked list which makes the
> softirq or some unrelated code which adds/removes a different timer
> explode in undebuggable ways. debugobject prevents that because it
> tracks per timer state and invokes the fixup function which keeps the
> system alive and also tells you exactly where the free of the active
> object happens which is the really interesting place to look at. The
> init function is pretty uninteresting in that case because you really
> want to know where the freeing of the active object happens.
> 
> So if KASAN detects UAF in the timer softirq then the init trace is not
> giving any information especially not in cases where the timer is part
> of a common and frequently allocated/freed other data structure.
> 

I have no experience with this tool, but I will look into it.
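
The per-object state tracking described in the quoted paragraph above can
be pictured with a small userspace sketch. This is not the debugobjects
API: every name here (toy_timer, toy_timer_free() and so on) is made up for
illustration, and the "fixup" is reduced to deactivating the object before
it is marked freed. It only shows why the free path, rather than the init
path, is where the useful report comes from.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object states; illustration only, not kernel definitions. */
enum toy_state { TOY_NONE, TOY_INIT, TOY_ACTIVE, TOY_FREED };

struct toy_timer {
	enum toy_state state;
	const char *freed_by;	/* call site that freed the timer while active */
};

static void toy_timer_init(struct toy_timer *t)
{
	t->state = TOY_INIT;
	t->freed_by = NULL;
}

static void toy_timer_add(struct toy_timer *t)
{
	t->state = TOY_ACTIVE;
}

static void toy_timer_del(struct toy_timer *t)
{
	t->state = TOY_INIT;
}

/*
 * Free hook: if the object is still active, remember who freed it and
 * deactivate it first (the "fixup"), so nothing expires on freed memory.
 * Returns true when the free hit an active object.
 */
static bool toy_timer_free(struct toy_timer *t, const char *site)
{
	bool was_active = (t->state == TOY_ACTIVE);

	if (was_active) {
		t->freed_by = site;
		toy_timer_del(t);
	}
	t->state = TOY_FREED;
	return was_active;
}
```

Calling toy_timer_init(), toy_timer_add() and then toy_timer_free() on the
same object makes the last call return true and records the freeing call
site, which is the location the review singles out as the interesting one.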

> >  static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
> >  static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
> >  static inline void kasan_record_aux_stack(void *ptr) {}
> > +static inline void kasan_record_tmr_stack(void *ptr) {}
> 
> Duh, so you are adding per object type functions and storage? That's
> going to be a huge copy and pasta orgy as every object requires the same
> code and extra storage space.
> 
> Why not just use kasan_record_aux_stack() for all of this?
> 
> The 'call_rcu' 'timer' 'whatever next' printout is not really required
> because the stack trace already tells you the function which was
> invoked. If TOS is call_rcu() or do_init_timer() then it's entirely
> clear which object is affected. If the two aux records are not enough
> then making the array larger is not the end of the world.
> 

My previous mail said that we will reuse kasan_record_aux_stack() and
keep only aux_stack.
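
As a rough userspace sketch of that idea: one generic record function
keeps the most recent N stack handles per object, regardless of whether the
caller is call_rcu(), a timer init, or anything else. The names toy_meta,
toy_record_aux_stack() and TOY_AUX_SLOTS are hypothetical, and the stack
"handle" is just an integer standing in for a saved stack trace; this is
not the KASAN implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Number of auxiliary records kept per object; enlarge if two is too few. */
#define TOY_AUX_SLOTS 2

/* Hypothetical per-object metadata; slot 0 always holds the newest record. */
struct toy_meta {
	unsigned long aux_stack[TOY_AUX_SLOTS];	/* 0 means "no record" */
};

/*
 * Generic recorder shared by all object types: shift the older records
 * down by one slot, dropping the oldest, then store the new handle.
 */
static void toy_record_aux_stack(struct toy_meta *meta, unsigned long handle)
{
	size_t i;

	assert(meta != NULL);
	for (i = TOY_AUX_SLOTS - 1; i > 0; i--)
		meta->aux_stack[i] = meta->aux_stack[i - 1];
	meta->aux_stack[0] = handle;
}
```

With this shape there is no per-type storage or per-type function, and
keeping more history only means changing TOY_AUX_SLOTS.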

> >  #endif /* CONFIG_KASAN_GENERIC */
> >  
> > diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> > index a5221abb4594..ef2da9ddfac7 100644
> > --- a/kernel/time/timer.c
> > +++ b/kernel/time/timer.c
> > @@ -783,6 +783,8 @@ static void do_init_timer(struct timer_list *timer,
> >  	timer->function = func;
> >  	timer->flags = flags | raw_smp_processor_id();
> >  	lockdep_init_map(&timer->lockdep_map, name, key, 0);
> > +
> > +	kasan_record_tmr_stack(timer);
> >  }
> 
> Are you sure this is correct for all timers?
> 
> This is also called for timers which are temporarily allocated on stack
> and for timers which are statically allocated at compile time. How is
> that supposed to work?
> 

If I understand correctly, the KASAN report has this record only for
SLUB-allocated objects, so what you describe should not be a problem.
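
To illustrate the behaviour I am assuming, here is a userspace sketch in
which the record function returns early for any pointer that does not
belong to the allocator's memory. The toy_heap arena and the helper names
are hypothetical and only stand in for "is this a slab object?"; the actual
KASAN check is not shown in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena standing in for allocator-owned memory. */
static unsigned char toy_heap[4096];

/* True only for addresses inside the toy arena. */
static bool toy_is_heap_object(const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t base = (uintptr_t)toy_heap;

	return p >= base && p < base + sizeof(toy_heap);
}

/*
 * Record only for arena objects. On-stack and statically allocated
 * objects have no per-object metadata, so they are skipped.
 * Returns true if a record was made and bumps the caller's counter.
 */
static bool toy_record_if_heap(const void *ptr, unsigned int *recorded)
{
	if (!toy_is_heap_object(ptr))
		return false;

	(*recorded)++;
	return true;
}
```

Under that assumption, a timer on the stack or in static storage would
simply produce no record instead of touching invalid metadata.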

> These kind of things want to be explained upfront and not left to the
> reviewer as an exercise.
> 

Sorry, my fault. We will be more careful when sending patches in the future.

> Thanks,
> 
>         tglx