Message-ID: <Yt8p+5FrU3XpFlxv@monkey>
Date: Mon, 25 Jul 2022 16:40:43 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Miaohe Lin <linmiaohe@...wei.com>
Cc: akpm@...ux-foundation.org, songmuchun@...edance.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] hugetlbfs: fix confusing hugetlbfs stat
On 07/23/22 10:56, Miaohe Lin wrote:
> On 2022/7/23 6:55, Mike Kravetz wrote:
> > On 07/22/22 14:38, Miaohe Lin wrote:
> >> On 2022/7/22 8:28, Mike Kravetz wrote:
> >>> On 07/21/22 21:16, Miaohe Lin wrote:
> >>>> When the size option is not specified, f_blocks, f_bavail and f_bfree will
> >>>> be set to -1 instead of 0. Likewise, when nr_inodes is not specified, f_files
> >>>> and f_ffree will be set to -1 too. Check max_hpages and max_inodes against
> >>>> -1 first to make sure 0 is reported for max/free/used when no limit is set,
> >>>> as the comment states.
> >>>
> >>> Just curious, where are you seeing values reported as -1? The check
> >>
> >> From the standard statvfs() function.
> >>
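(For reference, a quick sketch like the one below shows the raw statvfs
values; the /mnt/huge mount point is only an example and assumes a hugetlbfs
instance is already mounted there. Untested as written here.)

#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
	struct statvfs sv;

	/* /mnt/huge is only an example hugetlbfs mount point */
	if (statvfs("/mnt/huge", &sv) != 0) {
		perror("statvfs");
		return 1;
	}
	/* print the raw values so a 0 vs (fsblkcnt_t)-1 report is obvious */
	printf("f_blocks=%llu f_bfree=%llu f_bavail=%llu\n",
	       (unsigned long long)sv.f_blocks,
	       (unsigned long long)sv.f_bfree,
	       (unsigned long long)sv.f_bavail);
	printf("f_files=%llu f_ffree=%llu\n",
	       (unsigned long long)sv.f_files,
	       (unsigned long long)sv.f_ffree);
	return 0;
}
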
> >>> for sbinfo->spool was supposed to handle these cases. Seems like it
> >>
> >> sbinfo->spool could be created when ctx->max_hpages == -1 while
> >> ctx->min_hpages != -1 in hugetlbfs_fill_super.
> >>
> >>> should handle the max_hpages == -1 case. But, it doesn't look like it
> >>> considers the max_inodes == -1 case.
> >>>
> >>> If I create/mount a hugetlb filesystem without specifying size or nr_inodes,
> >>> df seems to report zero instead of -1.
> >>>
> >>> Just want to understand the reasoning behind the change.
> >
> > Thanks for the additional information (and test program)!
> >
> > From the hugetlbfs documentation:
> > "If the ``size``, ``min_size`` or ``nr_inodes`` option is not provided on
> > command line then no limits are set."
> >
> > So, having those values set to -1 indicates there is no limit set.
> >
> > With this change, 0 is reported for the case where there is no limit set as
> > well as the case where the max value is 0.
>
> IMHO, 0 should not be a valid max value; otherwise there would be no hugetlb
> pages to use. It should mean there's no limit. But maybe I'm wrong.
I agree that 0 as a max value makes little sense. However, it is allowed
today and from what I can tell it is file system specific. So, there is no
defined behavior.
>
> >
> > There may be some value in reporting -1 as is done today.
>
> There is still an inconsistency:
>
> If ``size`` and ``min_size`` aren't specified, the reported max value is 0.
> But if ``min_size`` is specified while ``size`` isn't, the reported max value
> is -1.
>
Agree that this is inconsistent and confusing.
In the case where min_size is specified and size is not, reporting -1 for size
may still make sense. min_size specifies how many pages are reserved for use
by the filesystem. The only required relation between min_size and size is
that if size is specified, then min_size must be smaller. Otherwise, it makes
no sense to reserve pages (min_size) that cannot be used.
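
As a concrete illustration of that case, here is a rough sketch (needs root;
the mount point and size are only examples) that mounts with just min_size=
and then looks at what statvfs reports for f_blocks:

#include <stdio.h>
#include <sys/mount.h>
#include <sys/statvfs.h>

int main(void)
{
	struct statvfs sv;

	/* only min_size= is given, so a subpool is created while max_hpages
	 * stays at -1; the path and size are illustrative */
	if (mount("none", "/mnt/huge", "hugetlbfs", 0, "min_size=4M") != 0) {
		perror("mount");
		return 1;
	}
	if (statvfs("/mnt/huge", &sv) != 0) {
		perror("statvfs");
		return 1;
	}
	/* per the discussion above: -1 is expected here, but 0 if the
	 * filesystem had been mounted with no options at all */
	printf("f_blocks=%llu\n", (unsigned long long)sv.f_blocks);
	umount("/mnt/huge");
	return 0;
}
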
> > To be honest, I am not sure what is the correct behavior here. Unless
> > there is a user visible issue/problem, I am hesitant to change. Other
> > opinions are welcome.
>
> Yes, it might be better to keep it as is. Maybe we could change the comment
> to reflect the current behavior, like below?
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 44da9828e171..f03b1a019cc0 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1080,7 +1080,7 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
> buf->f_bsize = huge_page_size(h);
> if (sbinfo) {
> spin_lock(&sbinfo->stat_lock);
> - /* If no limits set, just report 0 for max/free/used
> + /* If no limits set, just report 0 or -1 for max/free/used
> * blocks, like simple_statfs() */
> if (sbinfo->spool) {
> spin_lock_irq(&sbinfo->spool->lock);
>
> >
>
> No strong opinion on keeping this patch or the above change. Many thanks for your comment and reply. :)
>
I am fine with the comment change. Thanks for reading through the code and
trying to make sense of it!
--
Mike Kravetz