linux-kernel - [PATCH v2 00/10] block: fix blktrace debugfs use after free

Message-Id: <20200419194529.4872-1-mcgrof@kernel.org>
Date:   Sun, 19 Apr 2020 19:45:19 +0000
From:   Luis Chamberlain <mcgrof@...nel.org>
To:     axboe@...nel.dk, viro@...iv.linux.org.uk, bvanassche@....org,
        gregkh@...uxfoundation.org, rostedt@...dmis.org, mingo@...hat.com,
        jack@...e.cz, ming.lei@...hat.com, nstange@...e.de,
        akpm@...ux-foundation.org
Cc:     mhocko@...e.com, yukuai3@...wei.com, linux-block@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Luis Chamberlain <mcgrof@...nel.org>
Subject: [PATCH v2 00/10] block: fix blktrace debugfs use after free

Upstream kernel.org korg#205713 [0] states that there is a UAF in the
core debugfs debugfs_remove() function, and has gone through pushing for
a CVE for this, CVE-2019-19770 [1].

This patch series disputes that CVE, and shows how the issue was just
a complex misuse of debugfs within blktrace and fixes it.

On this v2 I've dropped two patches from my last series which are not
needed to ensure we can move back to a synchronous request_queue
removal. I've also addressed Ming's feedback on ensuring we keep
functionality working for when partitions are used for a blktrace.  That
effort led me to ensure we don't try to overwrite the request_queue
debugfs_dir, and to add sanity checks so that we give back only what
is expected.

My v1 patches had also fixed the kernel splat we get when we try to
reproduce the issue:

debugfs: Directory 'loop0' with parent 'block' already present!

This v2 series now provides a clear explanation of *why* this splat was
ultimately one of the reasons we ended up with a crash.

The commit log for the actual fix, patch 3/10, "blktrace: fix debugfs
use after free" has also been extended to provide a better explanation
as to *how* overwriting the debugfs_dir leads to an eventual panic with
blktrace. I hope that helps, as it seems the root cause was still not
well explained in the commit log.

To make review easier, I've also added some helper functions with no
functional changes at first, and only extended them later.

blk_queue_debugfs_register() has also been changed to return an int. We do
this so that we don't make the existing lack of error checking on
register_disk() and add_disk() any worse.
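
As a rough illustration of the idea only, here is a userspace model of a
registration helper that returns an int and refuses to overwrite an already
registered directory. All model_* names and both structs are hypothetical
stand-ins invented for this sketch; this is not the code from the series,
and the choice of -EEXIST is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; not the kernel's real structures. */
struct model_dentry {
	const char *name;
};

struct model_queue {
	const char *disk_name;
	struct model_dentry *debugfs_dir; /* NULL until registered */
};

/*
 * Model of a registration helper that returns an int. A second
 * registration is refused instead of overwriting debugfs_dir, so the
 * caller can see the failure and act on it.
 */
static int model_queue_debugfs_register(struct model_queue *q,
					struct model_dentry *dir)
{
	if (!q || !dir)
		return -EINVAL;

	if (q->debugfs_dir) {
		fprintf(stderr,
			"model: %s: debugfs directory already registered\n",
			q->disk_name);
		return -EEXIST; /* assumed error code for this sketch */
	}

	q->debugfs_dir = dir;
	return 0;
}

/* Forget the directory so that a later registration may succeed. */
static void model_queue_debugfs_unregister(struct model_queue *q)
{
	if (q)
		q->debugfs_dir = NULL;
}
```

In this model a first registration returns 0, a second one on the same
queue returns -EEXIST and leaves the original pointer untouched, and a
registration after unregistering succeeds again.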

After patch 3/10, "blktrace: fix debugfs use after free", is applied
with its pr_warn()s, and prior to reverting back to synchronous request_queue
removal, if we try to reproduce the issue with break-blktrace [2]'s
./run_0001.sh script, we'd see:

blk_debugfs: loop0 : registering request_queue debugfs directory twice is not allowed
blktrace: loop0: request_queue parent is gone

And sometimes only:

blk_debugfs: loop0 : registering request_queue debugfs directory twice is not allowed

After we revert back to synchronous request_queue removal this should no
longer be possible, and if it is, we want to hear about it. To help with
this, two patches are added which change the pr_warn()s to BUG_ON()s after
we flip back to synchronous request_queue removal.
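
To make the difference between deferred and synchronous removal concrete,
here is a small userspace model. It is only a sketch of one reading of the
problem: with deferred removal the old directory name is still present when
the same name is created again, so the creation collides. The model_* names,
the fixed-size name table and the error codes are all assumptions made for
this illustration, not kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_DIRS 8

/* Hypothetical model of the directory names present under one parent. */
struct model_parent {
	const char *names[MODEL_MAX_DIRS];
};

/* Create a directory; fails with -EEXIST if the name is still present. */
static int model_mkdir(struct model_parent *p, const char *name)
{
	int free_slot = -1;

	for (int i = 0; i < MODEL_MAX_DIRS; i++) {
		if (p->names[i] && strcmp(p->names[i], name) == 0)
			return -EEXIST;
		if (!p->names[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOSPC;

	p->names[free_slot] = name;
	return 0;
}

/* Synchronous removal: the name is gone when this returns. */
static void model_rmdir(struct model_parent *p, const char *name)
{
	for (int i = 0; i < MODEL_MAX_DIRS; i++) {
		if (p->names[i] && strcmp(p->names[i], name) == 0)
			p->names[i] = NULL;
	}
}

/* Deferred removal: only records the work, the name stays for now. */
struct model_deferred {
	struct model_parent *parent;
	const char *name;
	bool pending;
};

static void model_remove_async(struct model_deferred *w,
			       struct model_parent *p, const char *name)
{
	w->parent = p;
	w->name = name;
	w->pending = true;
}

/* Stands in for the worker that eventually performs the removal. */
static void model_run_deferred(struct model_deferred *w)
{
	if (w->pending) {
		model_rmdir(w->parent, w->name);
		w->pending = false;
	}
}
```

In this model, calling model_mkdir() for "loop0" between
model_remove_async() and model_run_deferred() fails with -EEXIST, while the
same call after model_rmdir() succeeds.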

Note that on patch 6/10 "blk-debugfs: upgrade warns to BUG_ON() if
directory is already found" I explain the sysfs layout between a gendisk
and the request_queue. By reverting back to synchronous request_queue
removal, if someone manages to figure out a way to create a clash with
registering block devices, we expect to see a sysfs clash now instead of
a clash with debugfs, as the debugfs directory is now always removed
first, prior to clearing out the sysfs dir. *If* there are races
possible in these areas, we want to hear about them, and the BUG_ON()s
should make it clearer *where* the real issue is coming from.

Having an asynchronous request_queue removal has exposed other bugs
lingering around; most importantly, however, I think it's revealing more
of the value of adding error handling for __device_add_disk() and friends.
If it's encouraged I could take a stab at finally addressing that for
good.

You can find this code on my git tree, on the 20200417-blktrace-fixes
branch, which is based on linux-next tag next-20200417 [3].

[0] https://bugzilla.kernel.org/show_bug.cgi?id=205713
[1] https://nvd.nist.gov/vuln/detail/CVE-2019-19770
[2] https://github.com/mcgrof/break-blktrace
[3] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200417-blktrace-fixes

Luis Chamberlain (10):
  block: move main block debugfs initialization to its own file
  blktrace: move blktrace debugfs creation to helper function
  blktrace: fix debugfs use after free
  block: revert back to synchronous request_queue removal
  blktrace: upgrade warns to BUG_ON() on unexpected circumstances
  blk-debugfs: upgrade warns to BUG_ON() if directory is already found
  blktrace: move debugfs file creation to its own function
  blktrace: add checks for created debugfs files on setup
  block: panic if block debugfs dir is not created
  block: put_device() if device_add() fails

 block/Makefile               |   1 +
 block/blk-core.c             |  28 +++++---
 block/blk-debugfs.c          |  39 ++++++++++++
 block/blk-mq-debugfs.c       |   5 --
 block/blk-sysfs.c            |  47 ++++++++------
 block/blk.h                  |  18 ++++++
 block/genhd.c                |   4 +-
 include/linux/blkdev.h       |   7 +-
 include/linux/blktrace_api.h |   1 +
 kernel/trace/blktrace.c      | 120 +++++++++++++++++++++++++++++++----
 10 files changed, 218 insertions(+), 52 deletions(-)
 create mode 100644 block/blk-debugfs.c

-- 
2.25.1