Message-ID: <CAK-uSPqSOHZ7AptEQjLu4TLXPxdQacZx4uhD5hu-Wvabta8nqg@mail.gmail.com>
Date: Wed, 31 Aug 2016 14:27:21 +0100
From: Andriy Tkachuk <andriy.tkachuk@...gate.com>
To: linux-kernel@...r.kernel.org
Subject: Re: mm: kswapd struggles reclaiming the pages on 64GB server
Alright - after disabling the memory cgroup, everything works perfectly
with the patch, even with the default vm parameters.
Here are some vmstat results to compare. Now:
# vmstat 60
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
4 0 67606176 375196 38708 1385896 0 74 23 1266751 198073 103648 6 7 86 1 0
3 0 67647904 394872 38612 1371200 0 695 18 1371067 212143 93917 7 8 85 1 0
2 0 67648016 375796 38676 1382812 1 2 13 1356271 215123 115987 6 7 85 1 0
3 0 67657392 378336 38744 1383468 1 157 15 1383591 213694 102457 6 7 86 1 0
6 0 67659088 367856 38796 1388696 1 28 26 1330238 208377 111469 6 7 86 1 0
2 0 67701344 407320 38680 1371004 0 704 34 1255911 203308 126458 8 8 82 3 0
4 0 67711920 402296 38776 1380836 0 176 8 1308525 201451 93053 6 7 86 1 0
8 0 67721264 376676 38872 1394816 0 156 14 1409726 218269 108127 7 8 85 1 0
18 0 67753872 395568 38896 1397144 0 544 16 1288576 201680 105980 6 7 86 1 0
2 0 67755544 362960 38992 1411744 0 28 17 1458453 232544 127088 6 7 85 1 0
4 0 67784056 376684 39088 1410924 0 475 25 1385831 218800 110344 6 7 85 1 0
2 0 67816104 393108 38800 1384108 1 535 17 1336857 208551 105872 6 7 85 1 0
7 0 67816104 399492 38820 1387096 0 0 17 1280630 205478 109499 6 7 86 1 0
1 0 67821648 375284 38908 1397132 1 93 15 1343042 208363 98031 6 7 85 1 0
1 0 67823512 363828 38924 1402388 0 31 15 1366995 212606 101328 6 7 85 1 0
5 0 67864264 416720 38784 1374480 1 680 21 1372581 210256 95369 7 8 83 3 0
Swapping works smoothly, more than enough memory is available for
caching, and cpu-wait is about 1.
Before:
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
3 2 13755748 334968 2140 63780 6684 0 7644 21 3122 7704 0 9 83 8 0
2 2 13760380 333628 2140 62468 4572 7764 4764 9129 3326 8678 0 10 83 7 0
2 2 13761072 332888 2140 62608 4576 4256 4616 4470 3377 8906 0 10 82 7 0
2 2 13760812 341532 2148 62644 5388 3532 5996 3996 3451 7521 0 10 83 7 0
3 3 13757648 335116 2148 62944 6176 0 6480 238 3412 8905 0 10 83 7 0
2 2 13752936 331908 2148 62336 7488 0 7628 201 3433 7483 0 10 83 7 0
2 2 13752520 344428 2148 69412 5292 2160 15820 2324 7254 15960 0 11 82 7 0
3 2 13750856 338056 2148 69864 5576 0 5984 28 3384 8060 0 10 84 6 0
2 2 13748836 331516 2156 70116 6076 0 6376 44 3683 6941 2 10 82 6 0
2 2 13750184 335732 2148 70764 3544 2664 4252 2692 3682 8435 3 10 83 4 0
2 4 13747528 338492 2144 70872 9520 3152 9688 3176 4846 7013 1 10 82 7 0
3 2 13756580 341752 2144 71060 9020 14740 9148 14764 4167 8024 1 10 80 9 0
2 2 13749484 336900 2144 71504 6444 0 6916 24 3613 8472 1 10 82 7 0
2 2 13740560 333148 2152 72480 6932 0 7952 44 3891 6819 1 10 82 7 0
2 2 13734456 330896 2148 72920 12228 1736 12488 1764 3454 9321 2 9 82 8 0
The system got into classic thrashing and never came out of it.
Now:
# cat /proc/vmstat | egrep 'nr_.*active_|pg(steal|scan|refill).*_normal|nr_vmscan_write|nr_swap|pgact'
nr_inactive_anon 7546598
nr_active_anon 7547226
nr_inactive_file 175973
nr_active_file 179439
nr_vmscan_write 17862257
pgactivate 213529452
pgrefill_normal 50400148
pgsteal_kswapd_normal 55904846
pgsteal_direct_normal 2417827
pgscan_kswapd_normal 76263257
pgscan_direct_normal 3213568
Before:
# cat /proc/vmstat | egrep 'nr_.*active_|pg(steal|scan|refill).*_normal|nr_vmscan_write|nr_swap|pgact'
nr_inactive_anon 695534
nr_active_anon 14427464
nr_inactive_file 2786
nr_active_file 2698
nr_vmscan_write 1740097
pgactivate 115697891
pgrefill_normal 33345818
pgsteal_kswapd_normal 367908859
pgsteal_direct_normal 681266
pgscan_kswapd_normal 10255454426
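To put rough numbers on the difference (my own back-of-the-envelope
reading of the counters above): before, kswapd scanned ~10.3 billion
pages to steal ~368 million, i.e. roughly 28 pages scanned per page
reclaimed; now it is ~76 million scanned for ~56 million stolen, i.e.
about 1.4 scans per reclaim.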
Here is the patch again for convenience:
--- linux-3.10.0-229.20.1.el7.x86_64.orig/mm/page_alloc.c	2015-09-24 15:47:25.000000000 +0000
+++ linux-3.10.0-229.20.1.el7.x86_64/mm/page_alloc.c	2016-08-15 09:49:46.922240569 +0000
@@ -5592,16 +5592,7 @@
  */
 static void __meminit calculate_zone_inactive_ratio(struct zone *zone)
 {
-	unsigned int gb, ratio;
-
-	/* Zone size in gigabytes */
-	gb = zone->managed_pages >> (30 - PAGE_SHIFT);
-	if (gb)
-		ratio = int_sqrt(10 * gb);
-	else
-		ratio = 1;
-
-	zone->inactive_ratio = ratio;
+	zone->inactive_ratio = 1;
 }
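
To show what that one-liner changes (my own illustration, not part of
the original patch): here is a minimal userspace sketch, assuming a
single ~64GB zone, of the ratio the removed formula would have picked.
int_sqrt() is approximated here with floor(sqrt()).

#include <math.h>
#include <stdio.h>

int main(void)
{
	unsigned int gb = 64;   /* assumed zone size in gigabytes */
	/* same formula as the removed code: int_sqrt(10 * gb) */
	unsigned int ratio = gb ? (unsigned int)sqrt(10.0 * gb) : 1;

	printf("inactive_ratio = %u\n", ratio);  /* prints 25 for 64GB */
	printf("inactive anon target ~ 1/%u of the anon LRU\n", ratio + 1);
	return 0;
}

If I read the inactive_anon_is_low() heuristic right, a ratio of 25
means reclaim only tries to keep about 1/26 of the anonymous pages on
the inactive list, so there is very little anon that kswapd is willing
to scan and swap out at any moment; with the ratio forced to 1, roughly
half of the anon LRU stays inactive.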
Hope it will help someone facing similar problems.
Regards,
Andriy
On Tue, Aug 23, 2016 at 4:14 PM, Andriy Tkachuk
<andriy.tkachuk@...gate.com> wrote:
> Well, as it turned out, the patch did not affect the problem at all,
> since the memory cgroup was on (in which case the zone's inactive_ratio
> is not used; the ratio is calculated directly in
> mem_cgroup_inactive_anon_is_low()). So the patch will be retested with
> the memory cgroup off.
>
> Andriy
>
> On Mon, Aug 22, 2016 at 11:46 PM, Andriy Tkachuk
> <andriy.tkachuk@...gate.com> wrote:
>> On Mon, Aug 22, 2016 at 7:37 PM, Andriy Tkachuk
>> <andriy.tkachuk@...gate.com> wrote:
>>>
>>> The following patch resolved the problem:
>>> ...
>>
>> Sorry, I was too hasty in sending the good news. As it turned out, the
>> problem is still there:
>>