Message-ID: <20200329081923.GD2454444@unreal>
Date:   Sun, 29 Mar 2020 11:19:23 +0300
From:   Leon Romanovsky <leon@...nel.org>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     Jaewon Kim <jaewon31.kim@...sung.com>, vbabka@...e.cz,
        adobriyan@...il.com, akpm@...ux-foundation.org, labbott@...hat.com,
        sumit.semwal@...aro.org, minchan@...nel.org, ngupta@...are.org,
        sergey.senozhatsky.work@...il.com, kasong@...hat.com,
        bhe@...hat.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        jaewon31.kim@...il.com, linux-api@...r.kernel.org,
        kexec@...ts.infradead.org
Subject: Re: [RFC PATCH v2 1/3] meminfo_extra: introduce meminfo extra

On Sun, Mar 29, 2020 at 09:23:04AM +0200, Greg KH wrote:
> On Sun, Mar 29, 2020 at 10:19:07AM +0300, Leon Romanovsky wrote:
> > On Tue, Mar 24, 2020 at 09:53:16PM +0900, Jaewon Kim wrote:
> > >
> > >
> > > > On 2020-03-24 20:46, Greg KH wrote:
> > > > On Tue, Mar 24, 2020 at 08:37:38PM +0900, Jaewon Kim wrote:
> > > >>
> > > > >> On 2020-03-24 19:11, Greg KH wrote:
> > > >>> On Tue, Mar 24, 2020 at 06:11:17PM +0900, Jaewon Kim wrote:
> > > > >>>> On 2020-03-23 18:53, Greg KH wrote:
> > > >>>>>> +int register_meminfo_extra(atomic_long_t *val, int shift, const char *name)
> > > >>>>>> +{
> > > >>>>>> +	struct meminfo_extra *meminfo, *memtemp;
> > > >>>>>> +	int len;
> > > >>>>>> +	int error = 0;
> > > >>>>>> +
> > > >>>>>> +	meminfo = kzalloc(sizeof(*meminfo), GFP_KERNEL);
> > > >>>>>> +	if (!meminfo) {
> > > >>>>>> +		error = -ENOMEM;
> > > >>>>>> +		goto out;
> > > >>>>>> +	}
> > > >>>>>> +
> > > >>>>>> +	meminfo->val = val;
> > > >>>>>> +	meminfo->shift_for_page = shift;
> > > >>>>>> +	strncpy(meminfo->name, name, NAME_SIZE);
> > > >>>>>> +	len = strlen(meminfo->name);
> > > >>>>>> +	meminfo->name[len] = ':';
> > > >>>>>> +	strncpy(meminfo->name_pad, meminfo->name, NAME_BUF_SIZE);
> > > >>>>>> +	while (++len < NAME_BUF_SIZE - 1)
> > > >>>>>> +		meminfo->name_pad[len] = ' ';
> > > >>>>>> +
> > > >>>>>> +	spin_lock(&meminfo_lock);
> > > >>>>>> +	list_for_each_entry_rcu(memtemp, &meminfo_head, list) {
> > > >>>>>> +		if (memtemp->val == val) {
> > > >>>>>> +			error = -EINVAL;
> > > >>>>>> +			break;
> > > >>>>>> +		}
> > > >>>>>> +	}
> > > >>>>>> +	if (!error)
> > > >>>>>> +		list_add_tail_rcu(&meminfo->list, &meminfo_head);
> > > >>>>>> +	spin_unlock(&meminfo_lock);
> > > >>>>> If you have a lock, why are you needing rcu?
> > > > >>>> I think _rcu should be removed from list_for_each_entry_rcu.
> > > > >>>> But I'm confused about what you meant.
> > > > >>>> I used rcu_read_lock in __meminfo_extra,
> > > > >>>> and I think spin_lock is also needed for addition and deletion to handle multiple modifiers.
> > > >>> If that's the case, then that's fine, it just didn't seem like that was
> > > >>> needed.  Or I might have been reading your rcu logic incorrectly...
> > > >>>
> > > >>>>>> +	if (error)
> > > >>>>>> +		kfree(meminfo);
> > > >>>>>> +out:
> > > >>>>>> +
> > > >>>>>> +	return error;
> > > >>>>>> +}
> > > >>>>>> +EXPORT_SYMBOL(register_meminfo_extra);
> > > >>>>> EXPORT_SYMBOL_GPL()?  I have to ask :)
> > > >>>> I can use EXPORT_SYMBOL_GPL.
> > > >>>>> thanks,
> > > >>>>>
> > > >>>>> greg k-h
> > > >>>>>
> > > >>>>>
> > > >>>> Hello
> > > >>>> Thank you for your comment.
> > > >>>>
> > > > >>>> By the way, there was an unresolved discussion on the v1 patch, as I mentioned in the cover letter.
> > > > >>>> I'd like to hear your opinion on this /proc/meminfo_extra node.
> > > >>> I think it is the propagation of an old and obsolete interface that you
> > > >>> will have to support for the next 20+ years and yet not actually be
> > > >>> useful :)
> > > >>>
> > > > >>>> Do you think this is meaningful, or can it not co-exist with another future
> > > > >>>> sysfs-based API?
> > > >>> What sysfs-based API?
> > > > >> Please refer to the mail thread on the v1 patch set - https://lkml.org/lkml/fancy/2020/3/10/2102
> > > > >> especially the discussion with Leon Romanovsky at https://lkml.org/lkml/fancy/2020/3/16/140
> > > > I really do not understand what you are referring to here, sorry.   I do
> > > > not see any sysfs-based code in that thread.
> > > Sorry. I did not see actual code either.
> > > Hello Leon Romanovsky, could you elaborate on your plan regarding the sysfs stuff?
> >
> > Sorry for being late; I wasn't in "To:", so I missed the whole discussion.
> >
> > Greg,
> >
> > We need the exposed information for the memory optimizations (debug, not
> > production) of our high speed NICs. Our devices (mlx5) allocate a lot of
> > memory, so optimization there can help us scale in SRIOV mode more easily
> > and be less constrained by memory.
>
> Great, then use debugfs and expose whatever you want in whatever way
> you want, no restrictions there, you do not need any type of kernel-wide
> /proc file for that today.

No argument here; I just gave you an example of why Jaewon's idea is worth exploring.
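
For illustration only: if a driver exposed its counters through a debugfs
file using the "Name:   <value> kB" line layout that /proc/meminfo uses, a
userspace debugging tool could parse each line roughly as sketched below.
The function name, the 32-byte name limit, and the assumption that such a
debugfs file would reuse that layout are hypothetical and do not come from
the patch.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative limit only; not taken from the quoted patch. */
#define EXAMPLE_NAME_MAX 32

/*
 * Parse one "Name:   <value> kB" line (the /proc/meminfo layout).
 * `name` must point to at least EXAMPLE_NAME_MAX bytes.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_kb_line(const char *line, char *name, unsigned long *kb)
{
	char unit[3] = "";

	/*
	 * %31[^:] reads up to 31 characters before the colon; the blanks
	 * in the format skip the padding in front of the number and unit.
	 */
	if (sscanf(line, " %31[^:]: %lu %2s", name, kb, unit) != 3)
		return -1;

	return strcmp(unit, "kB") == 0 ? 0 : -1;
}
```

A caller would read the file line by line (for example with fgets) and
pass each line to parse_kb_line(), skipping lines that return -1.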

>
> > I want to emphasize that I don't like the idea of extending the /proc/*
> > interface, because it is going to be painful to grep on large machines with
> > many devices. And I don't like the idea that every driver will need to
> > register into this interface, because it will be abused almost immediately.
>
> I agree.
>
> > My proposal was to create a new sysfs file by driver/core and put all the
> > information there automatically; for example, it could be
> > /sys/devices/pci0000:00/0000:00:0c.0/meminfo
> >                                      ^^^^^^^
>
> Nope, again, use debugfs, as sysfs is only one-value-per-file.

Anything other than /proc and a single global file for the whole kernel
is fine by me. Debugfs is more than enough for us.
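
To make the one-value-per-file point concrete, here is a rough userspace
sketch of how a tool might read such an attribute and reject a file that
holds more than one token. The function name is hypothetical, and the
sketch assumes the attribute contains a single decimal integer.

```c
#include <stdio.h>

/*
 * Read a file that is expected to hold exactly one integer, as the
 * sysfs one-value-per-file convention implies.
 * Returns 0 on success, -1 on open failure, parse failure, or when
 * anything other than whitespace follows the number.
 */
static int read_single_value(const char *path, long long *out)
{
	FILE *f = fopen(path, "r");
	char extra;
	int n;

	if (!f)
		return -1;

	/* A second successful conversion means the file had extra content. */
	n = fscanf(f, "%lld %c", out, &extra);
	fclose(f);

	return n == 1 ? 0 : -1;
}
```

A multi-line, multi-value dump like the meminfo-style output discussed in
this thread would fail this check, which is why debugfs is the better fit
for it.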

Thanks

>
> thanks,
>
> greg k-h
