linux-kernel - Re: [PATCH] proc/kpageflags: add KPF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Sat, 17 Feb 2018 11:14:10 +0300
From:   Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        Michal Hocko <mhocko@...e.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Nicholas Piggin <npiggin@...il.com>
Subject: Re: [PATCH] proc/kpageflags: add KPF_WAITERS

On 17.02.2018 02:57, Andrew Morton wrote:
> On Sun, 11 Feb 2018 13:36:41 +0300 Konstantin Khlebnikov <khlebnikov@...dex-team.ru> wrote:
> 
>> KPF_WAITERS indicates tasks are waiting for a page lock or writeback.
>> This might be false-positive, in this case next unlock will clear it.
> 
> Well, kpageflags is full of potential false-positives.  Or do you think
> this flag is especially vulnerable?
> 
> In other words, under what circumstances will we have KPF_WAITERS set
> when PG_locked and PG-writeback are clear?

Looks like lock_page() - unlock_page() shouldn't leave longstanding
false-positive: last unlock_page() must clear PG_waiters.

But I've seen them. Probably that was from  wait_on_page_writeback():
it test PG_writeback, set PG_waiters under queue lock unconditionally
and then test PG_writeback again before sleep - and might exit
without wakeup i.e. without clearing PG_waiters.

This could be fixed with extra check for in wait_on_page_bit_common()
under queue lock.

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1087,6 +1087,10 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
                 spin_lock_irq(&q->lock);

                 if (likely(list_empty(&wait->entry))) {
+                       if (unlikely(!test_bit(bit_nr, &page->flags))) {
+                               spin_unlock_irq(&q->lock);
+                               goto try;
+                       }
                         __add_wait_queue_entry_tail(q, wait);
                         SetPageWaiters(page);
                 }
@@ -1098,7 +1102,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
                 if (likely(test_bit(bit_nr, &page->flags))) {
                         io_schedule();
                 }
-
+try:
                 if (lock) {
                         if (!test_and_set_bit_lock(bit_nr, &page->flags))
                                 break;

But this seems redundant.

> 
>> This looks like worth information not only for kernel hacking.
> 
> Why?  What are the use-cases, in detail?  How are we to justify this
> modification?

This bit tells which page or оffset in file is actually wanted
by somebody in the system. That's another way to track where major
faults or writeback blocks something. We don't have to record flow
of events - snapshot of page-flags will show where contention is.

> 
>> In tool page-types in non-raw mode treat KPF_WAITERS without
>> KPF_LOCKED and KPF_WRITEBACK as false-positive and hide it.
> 
>>   fs/proc/page.c                         |    1 +
>>   include/uapi/linux/kernel-page-flags.h |    1 +
>>   tools/vm/page-types.c                  |    7 +++++++
> 
> Please update Documentation/vm/pagemap.txt.
> 

ok