lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160510115320.GJ23576@dhcp22.suse.cz>
Date:	Tue, 10 May 2016 13:53:20 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	"David S. Miller" <davem@...emloft.net>,
	Tony Luck <tony.luck@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Chris Zankel <chris@...kel.net>,
	Max Filippov <jcmvbkbc@...il.com>
Subject: Re: [PATCH 03/11] locking, rwsem: introduce basis for
 down_write_killable

On Tue 10-05-16 19:43:20, Tetsuo Handa wrote:
> I hit "allowing the OOM killer to select the same thread again" problem
> ( http://lkml.kernel.org/r/20160408113425.GF29820@dhcp22.suse.cz ), but
> I think that there is a bug in down_write_killable() series (at least
> "locking, rwsem: introduce basis for down_write_killable" patch).
> 
> Complete log is at http://I-love.SAKURA.ne.jp/tmp/serial-20160510-sem.txt.xz .
[...]
> 2 threads (PID: 1314 and 1443) are sleeping at rwsem_down_read_failed()
> but no thread is sleeping at rwsem_down_write_failed_killable().
> If there is no thread waiting for write lock, threads waiting for read
> lock must be able to run. This suggests that one of threads which was
> waiting for write lock forgot to wake up reader threads.

Or that the write lock holder is still keeping the lock held. I do not
see such a process in your list though. Is it possible that the
debug_show_all_locks would just miss it as it is not sleeping?

> Looking at rwsem_down_read_failed(), reader threads waiting for the
> writer thread to release the lock are waiting on sem->wait_list list.
> Looking at __rwsem_down_write_failed_common(), when the writer thread
> escaped the
> 
>                  /* Block until there are no active lockers. */
>                  do {
>                          if (signal_pending_state(state, current)) {
>                                  raw_spin_lock_irq(&sem->wait_lock);
>                                  ret = ERR_PTR(-EINTR);
>                                  goto out;
>                          }
>                          schedule();
>                          set_current_state(state);
>                  } while ((count = sem->count) & RWSEM_ACTIVE_MASK);
> 
> loop due to SIGKILL, I think that the writer thread needs to check for
> remaining threads on sem->wait_list list and wake up reader threads
> before rwsem_down_write_failed_killable() returns -EINTR.

I am not sure I understand. The rwsem counter is not write locked while
the thread is sleeping and when we fail on the signal pending so readers
should be able to proceed, no?

Or are you suggesting that the failure path should call rwsem_wake? I
do not see __mutex_lock_common for killable wait doing something like
that and rwsem_wake is explicitly documented that it is called after the
lock state has been updated already. Now I might be missing something
subtle here but I guess the code is correct and it is more likely that
the holder of the lock wasn't killed but it is rather holding the lock
and doing something else.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ