lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8c7b1d37-758a-46b2-b8aa-afd333140bf4@oracle.com>
Date: Wed, 10 Apr 2024 14:24:13 -0500
From: Dave Kleikamp <dave.kleikamp@...cle.com>
To: Divyaank Tiwari <dtiwari@...stonybrook.edu>, shaggy@...nel.org,
        jfs-discussion@...ts.sourceforge.net, linux-kernel@...r.kernel.org,
        brauner@...nel.org, jack@...e.cz, axboe@...nel.dk,
        jinpu.wang@...os.com, dchinner@...hat.com, lizhi.xu@...driver.com,
        johannes.thumshirn@....com
Cc: Yifei Liu <yifeliu@...stonybrook.edu>, Erez Zadok
 <ezk@...stonybrook.edu>,
        Scott Smolka <sas@...stonybrook.edu>,
        Geoff Kuenning <geoff@...hmc.edu>
Subject: Re: [PATCH] fs/jfs: fix order of kfree() and mutex_unlock() within
 lmLogOpen() in jfs_logmgr.c

On 4/10/24 1:48PM, Divyaank Tiwari wrote:
> In the jfs_logmgr.c file, within lmLogOpen() under the “free” label,
> mutex_unlock(&jfs_log_mutex) is called before kfree(log). Through our
> model-checking tool Metis, we found that this is one of the potential
> causes for nondeterministic kernel-hang bugs in JFS. Specifically, this
> indirectly results in the “log” variable being NULL dereferenced in the
> function txEnd() in jfs_txnmgr.c.
> 
> Fix: Swap the order of mutex_unlock(&jfs_log_mutex) and kfree(log).
> 
> We checked the entire JFS codebase, especially the file jfs_logmgr.c where
> log is allocated and kfree’d multiple times, and found that every time,
> except this buggy case, a call to kfree() was followed by a mutex_unlock().
> This ensures that any shared log resources are protected by the
> jfs_log_mutex lock.
> 
> The small patch given below fixes this bug and we are reasonably certain
> that it is the correct fix.  Before this fix, we were able to trigger the
> kernel hang bug fairly quickly through Metis. Through multiple experiments,
> we found that we were able to cause the kernel to hang mostly within a few
> minutes of execution. While we are fairly certain that the patch below
> fixes *a* bug, we believe there’s another race/bug somewhere else that we
> have yet to identify.  With this small fix, when we run Metis, we can still
> trigger a NULL ptr deref of “log” in function txEnd() in jfs_txnmgr.c,
> but now it takes anywhere from 6 to 137 hours (almost 6 days). We are
> hoping that this fix will also help some of the JFS maintainers identify
> other potential races.

I'm really not sure how this fixes anything. When the lock is released, 
nothing should be referencing the data structure. Also, calling kfree() 
doesn't have anything to do with the value of sbi->log, which would be 
the cause of the NULL dereference. It may only affect the timing by 
holding the lock a little longer.

Can you briefly explain the test case? It seems you are using an 
external journal. (I don't think that's very common, which could help 
explain why it hasn't been discovered.) Are more than one file system 
sharing a journal? (I don't know if anyone actually does that, but this 
was designed in from the beginning.)

That said, the patch looks completely safe. I'm not sure if it is 
necessary since nothing should be referencing log any more and a 
use-after-free is just as likely whether or not we hold the lock while 
calling kfree().

Thanks,
Shaggy

> 
> Note: All of our experiments were performed on Linux kernel v6.6.1.
> 
> Signed-off-by: Divyaank Tiwari <dtiwari@...stonybrook.edu>
> Signed-off-by: Yifei Liu <yifeliu@...stonybrook.edu>
> Signed-off-by: Erez Zadok <ezk@...stonybrook.edu>
> Signed-off-by: Scott Smolka <sas@...stonybrook.edu>
> Signed-off-by: Geoff Kuenning <geoff@...hmc.edu>
> ---
>   fs/jfs/jfs_logmgr.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
> index 9609349e92e5..ce9af4ef1e41 100644
> --- a/fs/jfs/jfs_logmgr.c
> +++ b/fs/jfs/jfs_logmgr.c
> @@ -1144,8 +1144,8 @@ int lmLogOpen(struct super_block *sb)
>   	bdev_fput(bdev_file);
>   
>         free:		/* free log descriptor */
> -	mutex_unlock(&jfs_log_mutex);
>   	kfree(log);
> +	mutex_unlock(&jfs_log_mutex);
>   
>   	jfs_warn("lmLogOpen: exit(%d)", rc);
>   	return rc;
> 
> base-commit: 2c71fdf02a95b3dd425b42f28fd47fb2b1d22702

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ