lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJfpegtLjFVRLxeUUGjT1V0iQ8+pwsFn0t2n2yOVfqSjn_7bPg@mail.gmail.com>
Date: Tue, 15 Apr 2025 11:39:31 +0200
From: Miklos Szeredi <miklos@...redi.hu>
To: Jeff Layton <jlayton@...nel.org>
Cc: Joanne Koong <joannelkoong@...il.com>, linux-fsdevel@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] fuse: add a new "connections" file to show longest
 waiting reqeust

On Mon, 3 Feb 2025 at 20:57, Jeff Layton <jlayton@...nel.org> wrote:
>
> Add a new file to the "connections" directory that shows how long (in
> seconds) the oldest fuse_req in the processing hash or pending queue has
> been waiting.
>
> Signed-off-by: Jeff Layton <jlayton@...nel.org>
> ---
> This is based on top of Joanne's timeout patches, as it requires the
> "create_time" field in fuse_req.  We have some internal detection of
> hung fuse server processes that relies on seeing elevated values in the
> "waiting" sysfs file. The problem with that method is that it can't
> detect when highly serialized workloads on a FUSE mount are hung. This
> adds another metric that we can use to detect this situation.
> ---
> Changes in v2:
> - use list_first_entry_or_null() when checking hash lists
> - take fiq->lock when checking pending list
> - ensure that if there are no waiting reqs, that the output will be 0
> - use time_before() to compare jiffies values
> - no need to hold fc->lock when walking pending queue
> - Link to v1: https://lore.kernel.org/r/20250203-fuse-sysfs-v1-1-36faa01f2338@kernel.org
> ---
>  fs/fuse/control.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/fuse/fuse_i.h  |  2 +-
>  2 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/fs/fuse/control.c b/fs/fuse/control.c
> index 2a730d88cc3bdb50ea1f8a3185faad5f05fc6e74..b27f2120499826040af77d7662d2dad0e9f37ee6 100644
> --- a/fs/fuse/control.c
> +++ b/fs/fuse/control.c
> @@ -180,6 +180,57 @@ static ssize_t fuse_conn_congestion_threshold_write(struct file *file,
>         return ret;
>  }
>
> +/* Show how long (in s) the oldest request has been waiting */
> +static ssize_t fuse_conn_oldest_read(struct file *file, char __user *buf,
> +                                     size_t len, loff_t *ppos)
> +{
> +       char tmp[32];
> +       size_t size;
> +       unsigned long now = jiffies;
> +       unsigned long oldest = now;
> +
> +       if (!*ppos) {
> +               struct fuse_conn *fc = fuse_ctl_file_conn_get(file);
> +               struct fuse_iqueue *fiq = &fc->iq;
> +               struct fuse_dev *fud;
> +               struct fuse_req *req;
> +
> +               if (!fc)
> +                       return 0;
> +
> +               spin_lock(&fc->lock);
> +               list_for_each_entry(fud, &fc->devices, entry) {
> +                       struct fuse_pqueue *fpq = &fud->pq;
> +                       int i;
> +
> +                       spin_lock(&fpq->lock);
> +                       for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
> +                               /*
> +                                * Only check the first request in the queue. The
> +                                * assumption is that the one at the head of the list
> +                                * will always be the oldest.
> +                                */
> +                               req = list_first_entry_or_null(&fpq->processing[i],
> +                                                              struct fuse_req, list);
> +                               if (req && time_before(req->create_time, oldest))
> +                                       oldest = req->create_time;

Couldn't this be merged with the timeout expiry code?  I.e. implement
get_oldest_req_time() helper, the result of which could be compared
against req_timeout.


> +                       }
> +                       spin_unlock(&fpq->lock);
> +               }
> +               spin_unlock(&fc->lock);
> +
> +               spin_lock(&fiq->lock);
> +               req = list_first_entry_or_null(&fiq->pending, struct fuse_req, list);
> +               if (req && time_before(req->create_time, oldest))
> +                       oldest = req->create_time;
> +               spin_unlock(&fiq->lock);
> +
> +               fuse_conn_put(fc);
> +       }
> +       size = sprintf(tmp, "%ld\n", (now - oldest)/HZ);

now - oldest will always be zero if *ppos != 0.  It would be much more
logical to return an error for *ppos != 0, then to return success with
a nonsense value.

use_conn_limit_write() already does so, and existing read callbacks
could be changed to do the same with a very slight risk of a
regression.  But for a new one, I don't think there's any worries.

Thanks,
Miklos


Thanks,
Miklos

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ