Date:   Fri, 24 Nov 2017 12:02:36 +0900
From:   Byungchul Park <>
To:     Michal Hocko <>
Subject: Re: [PATCH 1/3] lockdep: Apply crossrelease to PG_locked locks

On Thu, Nov 16, 2017 at 02:07:46PM +0100, Michal Hocko wrote:
> On Thu 16-11-17 21:48:05, Byungchul Park wrote:
> > On 11/16/2017 9:02 PM, Michal Hocko wrote:
> > > for each struct page. So you are doubling the size. Who is going to
> > > enable this config option? You are moving this to page_ext in a later
> > > patch which is a good step but it doesn't go far enough because this
> > > still consumes those resources. Is there any problem to make this
> > > kernel command line controllable? Something we do for page_owner for
> > > example?
> > 
> > Sure. I will add it.
> > 
> > > Also it would be really great if you could give us some measures about
> > > the runtime overhead. I do not expect it to be very large but this is
> > 
> > The major overhead would come from the amount of additional memory
> > consumption for 'lockdep_map's.
> yes
> > Do you want me to measure the overhead by the additional memory
> > consumption?
> > 
> > Or do you expect another overhead?
> I would be also interested how much impact this has on performance. I do
> not expect it would be too large but having some numbers for cache cold
> parallel kbuild or other heavy page lock workloads.

Hello Michal,

I measured 'cache cold parallel kbuild' on my qemu machine. The results
vary a lot between runs, so I cannot say for sure, but I think there is
no meaningful difference between before and after applying crossrelease
to page locks.

Actually, I expect little overhead in lock_page() and unlock_page() even
after applying crossrelease to page locks. The only overhead I expect is
a bit of additional memory consumption for the per-page 'lockdep_map's.

I ran the following commands within "QEMU x86_64 4GB memory 4 cpus":

   make clean
   echo 3 > /proc/sys/vm/drop_caches
   time make -j4

The results are:

   # w/o page lock tracking

   At the 1st try,
   real     5m28.105s
   user     17m52.716s
   sys      3m8.871s

   At the 2nd try,
   real     5m27.023s
   user     17m50.134s
   sys      3m9.289s

   At the 3rd try,
   real     5m22.837s
   user     17m34.514s
   sys      3m8.097s

   # w/ page lock tracking

   At the 1st try,
   real     5m18.158s
   user     17m18.200s
   sys      3m8.639s

   At the 2nd try,
   real     5m19.329s
   user     17m19.982s
   sys      3m8.345s

   At the 3rd try,
   real     5m19.626s
   user     17m21.363s
   sys      3m9.869s

I think there's no meaningful difference on my small machine.
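
To make the comparison easier to read, the 'real' times above can be
reduced to a mean and a min-to-max spread per configuration. The snippet
below is only a rough sketch for that arithmetic: the helper names
(to_seconds, summarize) and the dictionary labels are made up for
illustration, and three runs per configuration are far too few to draw a
statistical conclusion.

```python
from statistics import mean

# 'real' wall-clock times copied from the six runs reported above.
REAL_TIMES = {
    "without": ["5m28.105s", "5m27.023s", "5m22.837s"],  # w/o page lock tracking
    "with": ["5m18.158s", "5m19.329s", "5m19.626s"],     # w/ page lock tracking
}


def to_seconds(stamp: str) -> float:
    """Convert a time(1)-style 'XmY.ZZZs' string to seconds (hypothetical helper)."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def summarize(stamps):
    """Return (mean, max - min) in seconds for a list of time strings."""
    secs = [to_seconds(s) for s in stamps]
    return mean(secs), max(secs) - min(secs)


summary = {label: summarize(stamps) for label, stamps in REAL_TIMES.items()}

for label, (avg, spread) in summary.items():
    print(f"{label:8s} mean={avg:.3f}s  spread={spread:.3f}s")
```

With only three samples on a qemu guest, a difference of a few seconds
between the two means should be read as noise rather than as a speedup
or a slowdown.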

