[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+Y5n0u=qLA9A=89B07gMVRiQ+6nQaob2_rk_mOOt57iQw@mail.gmail.com>
Date: Mon, 2 May 2016 13:35:34 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: "Luruo, Kuthonuzo" <kuthonuzo.luruo@....com>
Cc: Andrey Ryabinin <aryabinin@...tuozzo.com>,
Alexander Potapenko <glider@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
kasan-dev <kasan-dev@...glegroups.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] kasan: improve double-free detection
On Mon, May 2, 2016 at 1:30 PM, Luruo, Kuthonuzo
<kuthonuzo.luruo@....com> wrote:
> Hi Dmitry,
>
> Thanks very much for your response/review.
>
>> I agree that it's something we need to fix (user-space ASAN does
>> something along these lines). My only concern is increase of
>> kasan_alloc_meta size. It's unnecessary large already and we have a
>> plan to reduce both alloc and free into to 16 bytes. However, it can
>> make sense to land this and then reduce size of headers in a separate
>> patch using a CAS-loop on state.
>
> ok.
>
>> Alexander, what's the state of your
>> patches that reduce header size?
>>
>> I think there is another race. If we have racing frees, one thread
>> sets state to KASAN_STATE_QUARANTINE but does not fill
>> free_info->track yet, at this point another thread does free and
>> reports double-free, but the track is wrong so we will print a bogus
>> stack. The same is true for all other state transitions (e.g.
>> use-after-free racing with the object being pushed out of the
>> quarantine).
>
> Yes, I've noticed this too. For the double-free, in my local tests, the error
> reports from the bad frees get printed only after the successful "good"
> free set the new track info. But then, I was using kasan_report() on the
> error path to get sane reports from the multiple concurrent frees so that
> may have "slowed" down the error paths enough until the good free was
> done. Agreed, the window still exists for a stale (or missing) deallocation
> stack in error report.
>
>> We could introduce 2 helper functions like:
>>
>> /* Sets status to KASAN_STATE_LOCKED if the current status is equal to
>> old_state, returns current state. Waits while current state equals
>> KASAN_STATE_LOCKED. */
>> u32 kasan_lock_if_state_equals(struct kasan_alloc_meta *meta, u32 old_state);
>>
>> /* Changes state from KASAN_STATE_LOCKED to new_state */
>> void kasan_unlock_and_set_status(struct kasan_alloc_meta *meta, u32
>> new_state);
>>
>> Then free can be expressed as:
>>
>> if (kasan_lock_if_state_equals(meta, KASAN_STATE_ALLOC) ==
>> KASAN_STATE_ALLOC) {
>> free_info = get_free_info(cache, object);
>> quarantine_put(free_info, cache);
>> set_track(&free_info->track, GFP_NOWAIT);
>> kasan_poison_slab_free(cache, object);
>> kasan_unlock_and_set_status(meta, KASAN_STATE_QUARANTINE);
>> return true;
>> }
>>
>> And on the reporting path we would need to lock the header, read all
>> state, unlock the header.
>>
>> Does it make sense?
>
> It does, thanks! I'll try to hack up something, though right now, it's not entirely
> clear to me how to achieve this without resorting to a global lock (which would
> be unacceptable).
We can use per-header lock by setting status to KASAN_STATE_LOCKED. A
thread can CAS any status to KASAN_STATE_LOCKED which means that it
locked the header. If any thread tried to modify/read the status and
the status is KASAN_STATE_LOCKED, then the thread waits.
>> Please add the test as well. We need to start collecting tests for all
>> these tricky corner cases.
>
> Sure, a new test can be added for test_kasan.ko. Unlike the other tests, a
> double-free would likely panic the system due to slab corruption. Would it still
> be "KASANic" for kasan_slab_free() to return true after reporting double-free
> attempt error so thread will not call into __cache_free()? How does ASAN
> handle this?
Yes, sure, it is OK to return true from kasan_slab_free() in such case.
Use-space ASAN terminates the process after the first report. We've
decided that in kernel we better continue in best-effort manner. But
after the first report all bets are mostly off (leaking an object is
definitely OK).
Powered by blists - more mailing lists