Message-Id: <20211014002132.ee7668a4790ea75b0f7a9ceb@kernel.org>
Date:   Thu, 14 Oct 2021 00:21:32 +0900
From:   Masami Hiramatsu <mhiramat@...nel.org>
To:     Beau Belgrave <beaub@...ux.microsoft.com>
Cc:     rostedt@...dmis.org, linux-trace-devel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] user_events: Enable user processes to create and write
 to trace events

On Mon, 11 Oct 2021 09:25:23 -0700
Beau Belgrave <beaub@...ux.microsoft.com> wrote:

> On Fri, Oct 08, 2021 at 06:22:58PM +0900, Masami Hiramatsu wrote:
> > > > I'm not sure about this point. Do you mean a 1 fd == 1 event model?
> > > > 
> > > Yeah, I like the idea of not having an fd per event.
> > 
> > Ah, OK. I misunderstood the idea.
> > The per-FD model sounds like having an events/user-events/*/marker file.
> > 
> Thanks for the back and forth, I appreciate your time on this.
> 
> Yes, in my mind there are two options to avoid kernel memory usage
> per-event.
> 
> 1.
> We have an array per file struct that is independently ref-counted.
> This is required to ensure lifetime requirements and to ensure user code
> cannot access other user events that might have been free'd outside of
> the lifetime and cause a kernel crash.
> 
> This approach also requires 2 ints to be returned: 1 for the status
> page, the other a local index for the write into the above per-file
> struct array.
> 
> This is likely the most complex method due to its lifetime and RCU
> synchronization requirements. However, it represents the least memory to
> both kernel and user space.
> 
> 2.
> We have an anon_inode FD that gets installed into the user process and
> returned via the ioctl from user_events tracefs file. The file struct
> backing the FD is shared by all user mode processes for that event. Like
> having an inject/marker file per-event in the user_events subsystem.
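
For reference, option 1 can be modeled in user space. The following is a
minimal Python sketch, not kernel code: the names Registry, OpenFile and
UserEvent are hypothetical, the status index is just a counter, and the
RCU and locking that the real implementation would need are left out. It
only shows the per-file array, the ref-counting, and the two returned
ints (status index and local write index).

```python
# User-space model only. All names here are hypothetical, and the kernel
# version would additionally need RCU and locking.
class UserEvent:
    """One registered event, shared by every open file that refers to it."""

    def __init__(self, name, status_index):
        self.name = name
        self.status_index = status_index  # slot in the shared status page
        self.refs = 0                     # number of per-file slots pointing here


class Registry:
    """Global name -> event table (stand-in for the user_events subsystem)."""

    def __init__(self):
        self.events = {}
        self.next_status = 0

    def get(self, name):
        """Look up or create an event and take one reference on it."""
        ev = self.events.get(name)
        if ev is None:
            ev = UserEvent(name, self.next_status)
            self.next_status += 1
            self.events[name] = ev
        ev.refs += 1
        return ev

    def put(self, ev):
        """Drop one reference; the event goes away with its last user."""
        ev.refs -= 1
        if ev.refs == 0:
            del self.events[ev.name]


class OpenFile:
    """Per-file state: an array of event references indexed locally."""

    def __init__(self, registry):
        self.registry = registry
        self.slots = []

    def register(self, name):
        """Return the two ints: (status page index, local write index)."""
        ev = self.registry.get(name)
        self.slots.append(ev)
        return ev.status_index, len(self.slots) - 1

    def write(self, write_index, payload):
        """A write can only reach events registered on this file."""
        if not 0 <= write_index < len(self.slots):
            raise IndexError("write index not registered on this file")
        return self.slots[write_index].name, payload

    def close(self):
        """Releasing the file drops every reference it holds."""
        for ev in self.slots:
            self.registry.put(ev)
        self.slots = []
```

Because the write index is local to the file, a process cannot name an
event it never registered, which is the lifetime property described
above.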

Is it safe to share the same file structure among all processes?
(Sharing an FD via IPC may do the same thing?)

> This approach requires an FD to be returned, and either an int for the
> status page or the returned FD could expose the ID via another IOCTL
> being issued.

OK, I would suggest adding an events/user-events/*/marker file
(which returns that shared file struct backed FD) so that simple
user scripts can also send events (these may not use ioctl; they would
just write the events). But this can be done afterwards anyway.
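
As a rough illustration of what such a script could look like, here is a
minimal Python sketch. It assumes the marker file exists and accepts one
event per write(2) call; that file is only a suggestion at this point,
so both the path and the payload format are hypothetical, and the
function name write_event is made up for the example.

```python
import os


def write_event(marker_path, payload):
    """Send one event payload with a single write(2) call.

    marker_path is a placeholder for a per-event marker file; payload is
    raw bytes. Returns the number of bytes written.
    """
    fd = os.open(marker_path, os.O_WRONLY)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)
```

No ioctl is involved, so the same thing could be done from a shell
script with a plain redirect, as long as the payload goes out in one
write.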

> This is the simplest method since the FD manages the lifetime, when FD
> is released so is the shared file struct. Kernel side memory is reduced
> to only unique events that are actively being used. There is no RCU or
> synchronization beyond the FD lifetime. User mode processes do
> incur an FD per event within their file descriptor table, so their
> events are charged against their per-process FD limit (not necessarily
> a bad thing).

Yeah, usually the FD ulimit will be much bigger than the number of events.
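
A process can check this for itself. The sketch below is only an
illustration in Python: it reads the soft RLIMIT_NOFILE limit and tests
whether a given number of per-event FDs fits. The function name
events_fit and the default reserve of 64 descriptors for the process's
other files are assumptions, not anything from the patch.

```python
import resource


def events_fit(n_events, reserve=64):
    """Return True if n_events per-event FDs fit under the soft FD limit.

    reserve is an assumed allowance for the process's other descriptors.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True
    return n_events + reserve <= soft
```

Running ulimit -n in a shell reports the same soft limit.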

> 
> This also seems to follow the pre-existing patterns of tracefs
> (trace_marker, inject, format, etc all have a shared file available to
> user-processes that have been granted access). For our case, we want
> that, but we want the access boundary to be whoever has access to the
> user_events_* tracefs files. We don't want to open up all of tracefs
> widely.

I think it could be a user choice, and it is possible to add special
access rights for user-events. Anyway, this is an advanced item.

> 
> > > I want to make
> > > sure the complexity is worth it. Is the overhead of an FD per event in
> > > user space too much?
> > 
> > It depends on the use case and how many events you want to use with
> > user-events. If there are hundreds of events, that will consume
> > kernel resources and /proc/*/fd/ will be filled with the events' fds.
> > But if there are only a few events, I think it is no problem.
> > 
> In our own use case this will be low due to the way we plan to use the
> events. However, I am not sure others will follow that :)

I'm just concerned about whether qemu would consider using this interface
for their event log :)

Thank you,

-- 
Masami Hiramatsu <mhiramat@...nel.org>
