linux-kernel - Re: [PATCH v2] vmpressure: implement strict mode

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130627093407.0466ced2@redhat.com>
Date:	Thu, 27 Jun 2013 09:34:07 -0400
From:	Luiz Capitulino <lcapitulino@...hat.com>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	minchan@...nel.org, anton@...msg.org, akpm@...ux-foundation.org,
	kmpark@...radead.org, hyunhee.kim@...sung.com
Subject: Re: [PATCH v2] vmpressure: implement strict mode

On Thu, 27 Jun 2013 11:26:16 +0200
Michal Hocko <mhocko@...e.cz> wrote:

> On Wed 26-06-13 23:17:12, Luiz Capitulino wrote:
> > Currently, an eventfd is notified for the level it's registered for
> > _plus_ higher levels.
> > 
> > This is a problem if an application wants to implement different
> > actions for different levels. For example, an application might want
> > to release 10% of its cache on level low, 50% on medium and 100% on
> > critical. To do this, an application has to register a different
> > eventfd for each pressure level. However, fd low is always going to
> > be notified and and all fds are going to be notified on level critical.
> > 
> > Strict mode solves this problem by strictly notifiying an eventfd
> > for the pressure level it registered for. This new mode is optional,
> > by default we still notify eventfds on higher levels too.
> > 
> > Signed-off-by: Luiz Capitulino <lcapitulino@...hat.com>
> 
> Two nits bellow but it looks good in general to me. You can add my
> Reviewed-by: Michal Hocko <mhocko@...e.cz>

Thanks.

> I still think that edge triggering makes some sense but that one might
> be rebased on top of this patch. We should still figure out whether the
> edge triggering is the right approach for the use case Hyunhee Kim wants
> it for so the strict mode should go first IMO.

I also think that edge triggering makes sense.

> > ---
> > 
> > o v2
> > 
> >  - Improve documentation
> >  - Use a bit to store mode instead of a bool
> >  - Minor changelog changes
> > 
> >  Documentation/cgroups/memory.txt | 26 ++++++++++++++++++++++----
> >  mm/vmpressure.c                  | 26 ++++++++++++++++++++++++--
> >  2 files changed, 46 insertions(+), 6 deletions(-)
> > 
> > diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> > index ddf4f93..412872b 100644
> > --- a/Documentation/cgroups/memory.txt
> > +++ b/Documentation/cgroups/memory.txt
> > @@ -791,6 +791,22 @@ way to trigger. Applications should do whatever they can to help the
> >  system. It might be too late to consult with vmstat or any other
> >  statistics, so it's advisable to take an immediate action.
> >  
> > +Applications can also choose between two notification modes when
> > +registering an eventfd for memory pressure events:
> > +
> > +When in "non-strict" mode, an eventfd is notified for the specific level
> > +it's registered for and higher levels. For example, an eventfd registered
> > +for low level is also going to be notified on medium and critical levels.
> > +This mode makes sense for applications interested on monitoring reclaim
> > +activity or implementing simple load-balacing logic. The non-strict mode
> > +is the default notification mode.
> > +
> > +When in "strict" mode, an eventfd is strictly notified for the pressure
> > +level it's registered for. For example, an eventfd registered for the low
> > +level event is not going to be notified when memory pressure gets into
> > +medium or critical levels. This allows for more complex logic based on
> > +the actual pressure level the system is experiencing.
> 
> It would be also fair to mention that there is no guarantee that lower
> levels are signaled before higher so nobody should rely on seeing LOW
> before MEDIUM or CRITICAL.

I think this is implied. Actually, as an user of this interface I didn't
expect this to happen until I read the code.

> 
> > +
> >  The events are propagated upward until the event is handled, i.e. the
> >  events are not pass-through. Here is what this means: for example you have
> >  three cgroups: A->B->C. Now you set up an event listener on cgroups A, B
> > @@ -807,12 +823,14 @@ register a notification, an application must:
> >  
> >  - create an eventfd using eventfd(2);
> >  - open memory.pressure_level;
> > -- write string like "<event_fd> <fd of memory.pressure_level> <level>"
> > +- write string like "<event_fd> <fd of memory.pressure_level> <level> [strict]"
> >    to cgroup.event_control.
> >  
> > -Application will be notified through eventfd when memory pressure is at
> > -the specific level (or higher). Read/write operations to
> > -memory.pressure_level are no implemented.
> > +Applications will be notified through eventfd when memory pressure is at
> > +the specific level or higher. If strict is passed, then applications
> > +will only be notified when memory pressure reaches the specified level.
> > +
> > +Read/write operations to memory.pressure_level are no implemented.
> >  
> >  Test:
> >  
> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > index 736a601..ba5c17e 100644
> > --- a/mm/vmpressure.c
> > +++ b/mm/vmpressure.c
> > @@ -138,8 +138,16 @@ struct vmpressure_event {
> >  	struct eventfd_ctx *efd;
> >  	enum vmpressure_levels level;
> >  	struct list_head node;
> > +	unsigned int mode;
> 
> You would fill up a hole between level and node if you move it up on
> 64b. Doesn't matter much but why not do it...

You want me to respin?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/