lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALYGNiPbtav_ChUFW_kkh+HOHVmKeupTRsnKsaShK=p8ZKguBg@mail.gmail.com>
Date:	Fri, 15 Aug 2014 07:36:16 +0400
From:	Konstantin Khlebnikov <koct9i@...il.com>
To:	Rafael Aquini <aquini@...hat.com>
Cc:	Andrey Ryabinin <ryabinin.a.a@...il.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>, vbabka@...e.cz,
	rientjes@...gle.com, Mel Gorman <mgorman@...e.de>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Dave Jones <davej@...hat.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrey Ryabinin <a.ryabinin@...sung.com>, lwoodman@...hat.com,
	Rik van Riel <riel@...hat.com>, jweiner@...hat.com
Subject: Re: mm: compaction: buffer overflow in isolate_migratepages_range

On Fri, Aug 15, 2014 at 2:07 AM, Rafael Aquini <aquini@...hat.com> wrote:
> On Thu, Aug 14, 2014 at 06:43:50PM -0300, Rafael Aquini wrote:
>> On Thu, Aug 14, 2014 at 10:07:40PM +0400, Andrey Ryabinin wrote:
>> > We discussed this with Konstantin and he suggested a better solution for this.
>> > If I understood him correctly the main idea was to store bit
>> > identifying ballon page
>> > in struct page (special value in _mapcount), so we won't need to check
>> > mapping->flags.
>> >
>>
>> Here goes what I thought doing, following that suggestion of Konstantin and yours. (I didn't tested it yet)
>>
>> Comments are welcomed.
>>
>> Cheers,
>> -- Rafael
>>
>> ---- 8< ----
>> From: Rafael Aquini <aquini@...hat.com>
>> Subject: mm: balloon_compaction: enhance balloon_page_movable() checkpoint against races
>>
>> While testing linux-next for the Kernel Address Sanitizer patchset (KASAN)
>> Sasha Levin reported a buffer overflow warning triggered for
>> isolate_migratepages_range(), which lated was discovered happening due to
>> a condition where balloon_page_movable() raced against move_to_new_page(),
>> while the later was copying the page->mapping of an anon page.
>>
>> Because we can perform balloon_page_movable() in a lockless fashion at
>> isolate_migratepages_range(), the dicovered race has unveiled the scheme
>> actually used to spot ballooned pages among page blocks that checks for
>> page_flags_cleared() and dereference page->mapping to check its mapping flags
>> is weak and potentially prone to stumble across another similar conditions
>> in the future.
>>
>> Following Konstantin Khlebnikov's and Andrey Ryabinin's suggestions,
>> this patch replaces the old page->flags && mapping->flags checking scheme
>> with a more simple and strong page->_mapcount read and compare value test.
>> Similarly to what is done for PageBuddy() checks, BALLOON_PAGE_MAPCOUNT_VALUE
>> is introduced here to mark balloon pages. This allows balloon_page_movable()
>> to skip the proven troublesome dereference of page->mapping for flag checking
>> while it goes on isolate_migratepages_range() lockless rounds.
>> page->mapping dereference and flag-checking will be performed later, when
>> all locks are held properly.
>>
>> ---
>>  include/linux/balloon_compaction.h | 61 +++++++++++---------------------------
>>  mm/balloon_compaction.c            | 53 +++++++++++++++++----------------
>>  2 files changed, 45 insertions(+), 69 deletions(-)
>>
>> diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
>> index 089743a..1409ccc 100644
>> --- a/include/linux/balloon_compaction.h
>> +++ b/include/linux/balloon_compaction.h
>> @@ -108,54 +108,29 @@ static inline void balloon_mapping_free(struct address_space *balloon_mapping)
>>  }
>>
>>  /*
>> - * page_flags_cleared - helper to perform balloon @page ->flags tests.
>> + * balloon_page_movable - identify balloon pages that can be moved by
>> + *                     compaction / migration.
>>   *
>> - * As balloon pages are obtained from buddy and we do not play with page->flags
>> - * at driver level (exception made when we get the page lock for compaction),
>> - * we can safely identify a ballooned page by checking if the
>> - * PAGE_FLAGS_CHECK_AT_PREP page->flags are all cleared.  This approach also
>> - * helps us skip ballooned pages that are locked for compaction or release, thus
>> - * mitigating their racy check at balloon_page_movable()
>> + * BALLOON_PAGE_MAPCOUNT_VALUE must be <= -2 but better not too close to
>> + * -2 so that an underflow of the page_mapcount() won't be mistaken
>> + * for a genuine BALLOON_PAGE_MAPCOUNT_VALUE.
>>   */
>> -static inline bool page_flags_cleared(struct page *page)
>> +#define BALLOON_PAGE_MAPCOUNT_VALUE (-256)
>> +static inline bool balloon_page_movable(struct page *page)
>>  {
>> -     return !(page->flags & PAGE_FLAGS_CHECK_AT_PREP);
>> +     return atomic_read(&page->_mapcount) == BALLOON_PAGE_MAPCOUNT_VALUE;
>>  }
>>
>> -/*
>> - * __is_movable_balloon_page - helper to perform @page mapping->flags tests
>> - */
>> -static inline bool __is_movable_balloon_page(struct page *page)
>> +static inline void __balloon_page_set(struct page *page)
>>  {
>> -     struct address_space *mapping = page->mapping;
>> -     return mapping_balloon(mapping);
>> +     VM_BUG_ON_PAGE(!atomic_read(&page->_mapcount) != -1, page);
>> +     atomic_set(&page->_mapcount, BALLOON_PAGE_MAPCOUNT_VALUE);
>>  }
>>
>> -/*
>> - * balloon_page_movable - test page->mapping->flags to identify balloon pages
>> - *                     that can be moved by compaction/migration.
>> - *
>> - * This function is used at core compaction's page isolation scheme, therefore
>> - * most pages exposed to it are not enlisted as balloon pages and so, to avoid
>> - * undesired side effects like racing against __free_pages(), we cannot afford
>> - * holding the page locked while testing page->mapping->flags here.
>> - *
>> - * As we might return false positives in the case of a balloon page being just
>> - * released under us, the page->mapping->flags need to be re-tested later,
>> - * under the proper page lock, at the functions that will be coping with the
>> - * balloon page case.
>> - */
>> -static inline bool balloon_page_movable(struct page *page)
>> +static inline void __balloon_page_clear(struct page *page)
>>  {
>> -     /*
>> -      * Before dereferencing and testing mapping->flags, let's make sure
>> -      * this is not a page that uses ->mapping in a different way
>> -      */
>> -     if (page_flags_cleared(page) && !page_mapped(page) &&
>> -         page_count(page) == 1)
>> -             return __is_movable_balloon_page(page);
>> -
>> -     return false;
>> +     VM_BUG_ON_PAGE(!balloon_page_movable(page), page)
>> +     atomic_set(&page->_mapcount, -1);
>>  }
>>
>>  /*
>> @@ -170,10 +145,8 @@ static inline bool balloon_page_movable(struct page *page)
>>   */
>>  static inline bool isolated_balloon_page(struct page *page)
>>  {
>> -     /* Already isolated balloon pages, by default, have a raised refcount */
>> -     if (page_flags_cleared(page) && !page_mapped(page) &&
>> -         page_count(page) >= 2)
>> -             return __is_movable_balloon_page(page);
>> +     if (balloon_page_movable && page_count(page) > 1)
>> +             return true;
>>
>>       return false;
>>  }
>> @@ -193,6 +166,7 @@ static inline void balloon_page_insert(struct page *page,
>>                                      struct list_head *head)
>>  {
>>       page->mapping = mapping;
>> +     __balloon_page_set(page);
>>       list_add(&page->lru, head);
>>  }
>>
>> @@ -207,6 +181,7 @@ static inline void balloon_page_insert(struct page *page,
>>  static inline void balloon_page_delete(struct page *page)
>>  {
>>       page->mapping = NULL;
>> +     __balloon_page_clear(page);
>>       list_del(&page->lru);
>>  }
>>
>> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
>> index 6e45a50..1cfb254 100644
>> --- a/mm/balloon_compaction.c
>> +++ b/mm/balloon_compaction.c
>> @@ -220,33 +220,32 @@ bool balloon_page_isolate(struct page *page)
>>        * raise its refcount preventing __free_pages() from doing its job
>>        * the put_page() at the end of this block will take care of
>>        * release this page, thus avoiding a nasty leakage.
>> +      *
>> +      * As balloon pages are not isolated from LRU lists, concurrent
>> +      * compaction threads can race against page migration functions
>> +      * as well as race against the balloon driver releasing a page.
>> +      * The aforementioned operations are done under the safety of
>> +      * page lock, so lets be sure we hold it before proceeding the
>> +      * isolations steps here.
>>        */
>> -     if (likely(get_page_unless_zero(page))) {
>> +     if (get_page_unless_zero(page) && trylock_page(page)) {
> Oops, I need to change this one, as we can left behind pages with raised
> refcounts if get_page_unless_zero(page) but trylock_page(page) fails
> here.

Don't hurry. The code in this state for years.
I'm working on patches for this, if everything goes well I'll show it today.
As usual I couldn't stop myself from cleaning the mess, so it will be
bigger than yours.

>
>
>>               /*
>> -              * As balloon pages are not isolated from LRU lists, concurrent
>> -              * compaction threads can race against page migration functions
>> -              * as well as race against the balloon driver releasing a page.
>> -              *
>> -              * In order to avoid having an already isolated balloon page
>> -              * being (wrongly) re-isolated while it is under migration,
>> -              * or to avoid attempting to isolate pages being released by
>> -              * the balloon driver, lets be sure we have the page lock
>> -              * before proceeding with the balloon page isolation steps.
>> +              * A ballooned page, by default, has just one refcount.
>> +              * Prevent concurrent compaction threads from isolating
>> +              * an already isolated balloon page by refcount check.
>>                */
>> -             if (likely(trylock_page(page))) {
>> -                     /*
>> -                      * A ballooned page, by default, has just one refcount.
>> -                      * Prevent concurrent compaction threads from isolating
>> -                      * an already isolated balloon page by refcount check.
>> -                      */
>> -                     if (__is_movable_balloon_page(page) &&
>> -                         page_count(page) == 2) {
>> +             if (balloon_page_movable(page) && page_count(page) == 2) {
>> +                     struct address_space *mapping = page_mapping(page);
>> +                     if (likely(mapping_balloon(mapping))) {
>>                               __isolate_balloon_page(page);
>>                               unlock_page(page);
>>                               return true;
>> +                     } else {
>> +                             dump_page(page, "not movable balloon page");
>> +                             WARN_ON(1);
>>                       }
>> -                     unlock_page(page);
>>               }
>> +             unlock_page(page);
>>               put_page(page);
>>       }
>>       return false;
>> @@ -255,19 +254,21 @@ bool balloon_page_isolate(struct page *page)
>>  /* putback_lru_page() counterpart for a ballooned page */
>>  void balloon_page_putback(struct page *page)
>>  {
>> +     struct address_space *mapping;
>>       /*
>>        * 'lock_page()' stabilizes the page and prevents races against
>>        * concurrent isolation threads attempting to re-isolate it.
>>        */
>>       lock_page(page);
>>
>> -     if (__is_movable_balloon_page(page)) {
>> +     mapping = page_mapping(page);
>> +     if (balloon_page_movable(page) && mapping_balloon(mapping)) {
>>               __putback_balloon_page(page);
>>               /* drop the extra ref count taken for page isolation */
>>               put_page(page);
>>       } else {
>> -             WARN_ON(1);
>>               dump_page(page, "not movable balloon page");
>> +             WARN_ON(1);
>>       }
>>       unlock_page(page);
>>  }
>> @@ -286,16 +287,16 @@ int balloon_page_migrate(struct page *newpage,
>>        */
>>       BUG_ON(!trylock_page(newpage));
>>
>> -     if (WARN_ON(!__is_movable_balloon_page(page))) {
>> +     mapping = page_mapping(page);
>> +     if (!(balloon_page_movable(page) && mapping_balloon(mapping))) {
>>               dump_page(page, "not movable balloon page");
>> -             unlock_page(newpage);
>> -             return rc;
>> +             WARN_ON(1);
>> +             goto out;
>>       }
>>
>> -     mapping = page->mapping;
>>       if (mapping)
>>               rc = __migrate_balloon_page(mapping, newpage, page, mode);
>> -
>> +out:
>>       unlock_page(newpage);
>>       return rc;
>>  }
>> --
>> 1.9.3
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ