linux-kernel - Re: [RFC mm][PATCH 2/5] percpu cached mm counter

Message-Id: <20091211095159.6472a009.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Fri, 11 Dec 2009 09:51:59 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Minchan Kim <minchan.kim@...il.com>
Cc:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	cl@...ux-foundation.org,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	mingo@...e.hu
Subject: Re: [RFC mm][PATCH 2/5] percpu cached mm counter

On Fri, 11 Dec 2009 09:40:07 +0900
Minchan Kim <minchan.kim@...il.com> wrote:
> > static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
> >  {
> > -       return (unsigned long)atomic_long_read(&(mm)->counters[member]);
> > +       long ret;
> > +       /*
> > +        * Because this counter is loosely synchronized with percpu cached
> > +        * information, it's possible that value gets to be minus. For user's
> > +        * convenience/sanity, avoid returning minus.
> > +        */
> > +       ret = atomic_long_read(&(mm)->counters[member]);
> > +       if (unlikely(ret < 0))
> > +               return 0;
> > +       return (unsigned long)ret;
> >  }
> 
> Now, your sync point is only task switching time.
> So we can't show exact number if many counting of mm happens
> in short time.(ie, before context switching).
> It isn't matter?
> 
I don't think it matters, for two reasons.

1. Consider servers that need continuous memory-usage monitoring with
tools such as ps or top. With 2000 processes, "ps -elf" takes 0.8 sec.
Because system admins know that gathering process information consumes
a fair amount of CPU, they will not run it that frequently (I hope).

2. When a chain of page faults occurs continuously over some period, a
memory-usage monitor only sees a snapshot of the current numbers, and the
moment that snapshot captures is always random. No one can get a precise
number in that kind of situation.
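
To make the trade-off concrete, here is a minimal userspace sketch of a
loosely synchronized counter. It is only an analogue under stated
assumptions, not the patch: shared_counter, cached_delta,
add_counter_cached(), sync_counter() and get_counter() are hypothetical
names, a thread-local variable stands in for the percpu cache, and
sync_counter() stands in for the sync point at task switch.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for mm->counters[member]. */
static atomic_long shared_counter;

/* Hypothetical stand-in for the percpu cached delta (one per thread here). */
static _Thread_local long cached_delta;

/* Fast path: touch only the local cache, no atomic operation. */
static void add_counter_cached(long value)
{
	cached_delta += value;
}

/* Sync point: fold the cached delta into the shared counter. */
static void sync_counter(void)
{
	if (cached_delta) {
		atomic_fetch_add(&shared_counter, cached_delta);
		cached_delta = 0;
	}
}

/*
 * Reader: the shared value can lag behind, or even be negative while
 * another thread still holds an unsynced positive delta, so clamp to 0.
 */
static unsigned long get_counter(void)
{
	long ret = atomic_load(&shared_counter);

	if (ret < 0)
		return 0;
	return (unsigned long)ret;
}
```

In this sketch a reader that calls get_counter() between
add_counter_cached() and sync_counter() sees the old value. That is the
same kind of staleness a monitor already gets from sampling at a random
moment.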



> >
> >  static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
<snip>

> > Index: mmotm-2.6.32-Dec8/kernel/sched.c
> > ===================================================================
> > --- mmotm-2.6.32-Dec8.orig/kernel/sched.c
> > +++ mmotm-2.6.32-Dec8/kernel/sched.c
> > @@ -2858,6 +2858,7 @@ context_switch(struct rq *rq, struct tas
> >        trace_sched_switch(rq, prev, next);
> >        mm = next->mm;
> >        oldmm = prev->active_mm;
> > +
> >        /*
> >         * For paravirt, this is coupled with an exit in switch_to to
> >         * combine the page table reload and the switch backend into
> > @@ -5477,6 +5478,11 @@ need_resched_nonpreemptible:
> >
> >        if (sched_feat(HRTICK))
> >                hrtick_clear(rq);
> > +       /*
> > +        * sync/invaldidate per-cpu cached mm related information
> > +        * before taling rq->lock. (see include/linux/mm.h)
> 
> taling => taking
> 
> > +        */
> > +       sync_mm_counters_atomic();
> 
> It's my above concern.
> before the process schedule out, we could get the wrong info.
> It's not realistic problem?
> 
I don't think so, at least for now.

Thanks,
-Kame
