lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 27 Apr 2018 08:30:21 +0100 From: Mel Gorman <mgorman@...hsingularity.net> To: Ingo Molnar <mingo@...nel.org> Cc: Peter Zijlstra <peterz@...radead.org>, Mike Galbraith <efault@....de>, Matt Fleming <matt@...eblueprint.co.uk>, LKML <linux-kernel@...r.kernel.org> Subject: Re: [PATCH] sched/numa: Stagger NUMA balancing scan periods for new threads On Fri, Apr 27, 2018 at 08:50:57AM +0200, Ingo Molnar wrote: > > 4.17.0-rc1 4.17.0-rc1 > > vanilla stagger-v1r1 > > sys-time-bt.D 48.78 ( 0.00%) 48.22 ( 1.15%) > > sys-time-cg.D 25.31 ( 0.00%) 26.63 ( -5.22%) > > sys-time-ep.D 1.65 ( 0.00%) 0.62 ( 62.42%) > > sys-time-is.D 40.05 ( 0.00%) 24.45 ( 38.95%) > > sys-time-lu.D 37.55 ( 0.00%) 29.02 ( 22.72%) > > sys-time-mg.D 47.52 ( 0.00%) 34.92 ( 26.52%) > > sys-time-sp.D 119.01 ( 0.00%) 109.05 ( 8.37%) > > sys-time-ua.D 51.52 ( 0.00%) 45.13 ( 12.40%) > > > > NUMA scan activity is reduced as well as other balancing activity. > > > > NUMA alloc local 1042828 1342670 > > NUMA base PTE updates 140481138 93577468 > > NUMA huge PMD updates 272171 180766 > > NUMA page range updates 279832690 186129660 > > NUMA hint faults 1395972 1193897 > > NUMA hint local faults 877925 855053 > > NUMA hint local percent 62 71 > > NUMA pages migrated 12057909 9158023 > > Looks like a nice reduction in scanning overhead - which was always the main worry > with the fault based active NUMA balancing technique. > > I have a couple of minor code cleanliness nit: > No problem. > > +void init_numa_balancing(unsigned long clone_flags, struct task_struct *p) > > +{ > > + int mm_users = 0; > > + > > + if (p->mm) { > > + mm_users = atomic_read(&p->mm->mm_users); > > + if (mm_users == 1) { > > + p->mm->numa_next_scan = jiffies + msecs_to_jiffies(sysctl_numa_balancing_scan_delay); > > + p->mm->numa_scan_seq = 0; > > + } > > + } > > + p->node_stamp = 0ULL; > > + p->numa_scan_seq = p->mm ? p->mm->numa_scan_seq : 0; > > + p->numa_scan_period = sysctl_numa_balancing_scan_delay; > > + p->numa_work.next = &p->numa_work; > > + p->numa_faults = NULL; > > + p->last_task_numa_placement = 0; > > + p->last_sum_exec_runtime = 0; > > + p->numa_group = NULL; > > While this is pre-existing code that you moved, could we please use a bit more > organization to make this more readable: > > p->node_stamp = 0ULL; > p->numa_scan_seq = p->mm ? p->mm->numa_scan_seq : 0; > p->numa_scan_period = sysctl_numa_balancing_scan_delay; > p->numa_work.next = &p->numa_work; > p->numa_faults = NULL; > p->last_task_numa_placement = 0; > p->last_sum_exec_runtime = 0; > p->numa_group = NULL; > > ? > > This form made me notice a detail: the 0ULL asymmetry looks weird, this integer > literal type specification is entirely superfluous here, we can just write '0'. > Can do. I'll revise it and should send a new version on Monday. You're right that the type is superfluous, this was simply a code movement so I kept the structure. > > + /* New address space */ > > + if (!(clone_flags & CLONE_VM)) { > > + p->numa_preferred_nid = -1; > > + return; > > + } > > + > > + /* New thread, use existing preferred nid but stagger scans */ > > + if (p->mm) { > > + unsigned int delay; > > + > > + delay = min_t(unsigned int, task_scan_max(current), > > + current->numa_scan_period * mm_users * NSEC_PER_MSEC); > > + delay += 2 * TICK_NSEC; > > + p->numa_preferred_nid = current->numa_preferred_nid; > > + p->node_stamp = delay; > > + } > > So this is a fork time function, shouldn't p->numa_preferred_nid be equal to > current->numa_preferred_nid already? > It should but I see no harm in being explicit. > This is what happens in the !p->mm && CLONE_VM case anyway, right? > !p->mm should never have changed numa_preferred_nid and it's useless information anyway. Kernel threads do not have a user address space to sample and migrate. > So we could leave out the superfluous assignment and make it clear in a comment > that we inherit the parent's ->numa_preferred_nid intentionally. > I can do that. > Also, there's a lot of p->mm use, could we add this helper local variable to > simplify the code some more: > > struct mm_struct *mm = p->mm; > > ? That should be harmless. Thanks! -- Mel Gorman SUSE Labs
Powered by blists - more mailing lists