Message-id: <5DE6F4AE-F3F9-4C52-9DFC-E066D9DD5EDC@apple.com>
Date: Fri, 02 Aug 2019 11:00:55 -0700
From: Masoud Sharbiani <msharbiani@...le.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: gregkh@...uxfoundation.org, hannes@...xchg.org,
vdavydov.dev@...il.com, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1.
> On Aug 2, 2019, at 7:41 AM, Michal Hocko <mhocko@...nel.org> wrote:
>
> On Fri 02-08-19 07:18:17, Masoud Sharbiani wrote:
>>
>>
>>> On Aug 2, 2019, at 12:40 AM, Michal Hocko <mhocko@...nel.org> wrote:
>>>
>>> On Thu 01-08-19 11:04:14, Masoud Sharbiani wrote:
>>>> Hey folks,
>>>> I’ve come across an issue that affects most of 4.19, 4.20 and 5.2 linux-stable kernels that has only been fixed in 5.3-rc1.
>>>> It was introduced by
>>>>
>>>> 29ef680 memcg, oom: move out_of_memory back to the charge path
>>>
>>> This commit shouldn't really change the OOM behavior for your particular
>>> test case. It would have changed MAP_POPULATE behavior but your usage is
>>> triggering the standard page fault path. The only difference with
>>> 29ef680 is that the OOM killer is invoked during the charge path rather
>>> than on the way out of the page fault.
>>>
>>> Anyway, I tried to run your test case in a loop and leaker always ends
>>> up being killed as expected with 5.2. See the below oom report. There
>>> must be something else going on. How much swap do you have on your
>>> system?
>>
>> I do not have swap defined.
>
> OK, I have retested with swap disabled and again everything seems to be
> working as expected. The oom happens earlier because I do not have to
> wait for the swap to get full.
>
In my tests (with the script provided), it loops for only 11 iterations before hanging and printing the soft lockup message.
> Which fs do you use to write the file that you mmap?
/dev/sda3 on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
The call trace in the soft lockup report shows that it is going through __xfs_filemap_fault():
[ 561.452933] watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [leaker:3261]
[ 561.459904] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod iTCO_wdt gpio_ich iTCO_vendor_support dcdbas ipmi_ssif intel_powerclamp coretemp kvm_intel ses ipmi_si kvm enclosure scsi_transport_sas ipmi_devintf irqbypass pcspkr lpc_ich sg joydev ipmi_msghandler wmi acpi_power_meter acpi_cpufreq xfs libcrc32c ata_generic sd_mod pata_acpi ata_piix libata megaraid_sas crc32c_intel serio_raw bnx2 bonding
[ 561.495979] CPU: 4 PID: 3261 Comm: leaker Tainted: G I L 5.3.0-rc2+ #10
[ 561.503704] Hardware name: Dell Inc. PowerEdge R710/0YDJK3, BIOS 6.4.0 07/23/2013
[ 561.511168] RIP: 0010:lruvec_lru_size+0x49/0xf0
[ 561.515687] Code: 41 89 ed b8 ff ff ff ff 45 31 f6 49 c1 e5 03 eb 19 48 63 d0 4c 89 e9 48 03 8b 88 00 00 00 48 8b 14 d5 60 a9 92 94 4c 03 34 11 <48> c7 c6 80 7c bf 94 89 c7 e8 89 d3 59 00 3b 05 27 eb ff 00 72 d1
[ 561.534418] RSP: 0018:ffffb5f886a3f640 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 561.541968] RAX: 0000000000000002 RBX: ffff96fca3bba400 RCX: 00003ef5d82059f0
[ 561.549085] RDX: ffff9702a7a40000 RSI: 0000000000000010 RDI: ffffffff94bf7c80
[ 561.556202] RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff94ae1c00
[ 561.563318] R10: ffff96fcc7802520 R11: 0000000000000000 R12: 0000000000000004
[ 561.570435] R13: 0000000000000008 R14: 0000000000000000 R15: 0000000000000000
[ 561.577553] FS: 00007f5522602740(0000) GS:ffff9702a7a80000(0000) knlGS:0000000000000000
[ 561.585623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 561.591352] CR2: 00007fba755f95b0 CR3: 0000000c646dc000 CR4: 00000000000006e0
[ 561.598468] Call Trace:
[ 561.600907] shrink_node_memcg+0xc8/0x790
[ 561.604905] ? shrink_slab+0x245/0x280
[ 561.608644] ? mem_cgroup_iter+0x10a/0x2c0
[ 561.612728] shrink_node+0xcd/0x490
[ 561.616208] do_try_to_free_pages+0xda/0x3a0
[ 561.620466] ? mem_cgroup_select_victim_node+0x43/0x2f0
[ 561.625678] try_to_free_mem_cgroup_pages+0xe7/0x1c0
[ 561.630629] try_charge+0x246/0x7a0
[ 561.634107] mem_cgroup_try_charge+0x6b/0x1e0
[ 561.638453] ? mem_cgroup_commit_charge+0x5a/0x110
[ 561.643231] __add_to_page_cache_locked+0x195/0x330
[ 561.648100] ? scan_shadow_nodes+0x30/0x30
[ 561.652184] add_to_page_cache_lru+0x39/0xa0
[ 561.656442] iomap_readpages_actor+0xf2/0x230
[ 561.660787] iomap_apply+0xa3/0x130
[ 561.664266] iomap_readpages+0x97/0x180
[ 561.668091] ? iomap_migrate_page+0xe0/0xe0
[ 561.672266] read_pages+0x57/0x180
[ 561.675657] __do_page_cache_readahead+0x1ac/0x1c0
[ 561.680436] ondemand_readahead+0x168/0x2a0
[ 561.684606] filemap_fault+0x30d/0x830
[ 561.688343] ? flush_tlb_func_common.isra.8+0x147/0x230
[ 561.693554] ? __mod_lruvec_state+0x40/0xe0
[ 561.697726] ? alloc_set_pte+0x4e6/0x5b0
[ 561.701669] __xfs_filemap_fault+0x61/0x190 [xfs]
[ 561.706361] __do_fault+0x38/0xb0
[ 561.709666] __handle_mm_fault+0xbee/0xe90
[ 561.713750] handle_mm_fault+0xe2/0x200
[ 561.717574] __do_page_fault+0x224/0x490
[ 561.721485] do_page_fault+0x31/0x120
[ 561.725137] page_fault+0x3e/0x50
[ 561.728439] RIP: 0033:0x400c5a
[ 561.731483] Code: 45 c0 48 89 c6 bf 77 0e 40 00 b8 00 00 00 00 e8 3c fb ff ff c7 45 dc 00 00 00 00 eb 36 8b 45 dc 48 63 d0 48 8b 45 c0 48 01 d0 <0f> b6 00 0f be c0 01 45 e8 8b 45 dc 25 ff 0f 00 00 85 c0 75 10 8b
[ 561.750214] RSP: 002b:00007fffba1d9450 EFLAGS: 00010206
[ 561.755426] RAX: 00007f550346b000 RBX: 0000000000000000 RCX: 000000000000001a
[ 561.762542] RDX: 0000000001c4c000 RSI: 000000007fffffe5 RDI: 0000000000000000
[ 561.769659] RBP: 00007fffba1da4a0 R08: 0000000000000000 R09: 00007f552206c20d
[ 561.776775] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000400850
[ 561.783892] R13: 00007fffba1da580 R14: 0000000000000000 R15: 0000000000000000
If I switch the backing file to an ext4 filesystem (on a separate hard drive), it OOMs.
If I switch the file used to /dev/zero, it OOMs:
…
Todal sum was 0. Loop count is 11
Buffer is @ 0x7f2b66c00000
./test-script-devzero.sh: line 16: 3561 Killed ./leaker -p 10240 -c 100000
> Or could you try to
> simplify your test even further? E.g. does everything work as expected
> when doing anonymous mmap rather than file backed one?
It also OOMs with MAP_ANON.
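For anyone who wants to compare the backing types side by side, here is a minimal sketch of the access pattern being discussed: map a region, then read one byte from every page so that each page is faulted in. This is not the leaker program itself (its source is not shown in this message); the function names and the one-byte-per-page stride are assumptions made for illustration, and the sketch does not set up the memory cgroup limit.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def touch_pages(buf, length):
    """Read one byte from each page of buf, forcing a page fault per page.

    Returns the sum of the bytes read so the reads have a visible result.
    """
    total = 0
    for offset in range(0, length, PAGE):
        total += buf[offset]
    return total


def map_and_touch(length, path=None):
    """Map `length` bytes and touch every page.

    path=None  -> anonymous mapping (the MAP_ANON case)
    path given -> read-only mapping backed by that file
                  (a file on xfs or ext4, or /dev/zero)
    """
    if path is None:
        buf = mmap.mmap(-1, length)  # fd of -1 requests an anonymous mapping
        try:
            return touch_pages(buf, length)
        finally:
            buf.close()
    with open(path, "rb") as f:
        buf = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ)
        try:
            return touch_pages(buf, length)
        finally:
            buf.close()
```

Run inside a memory cgroup with a limit smaller than the mapped length, the file-backed call should take the filemap_fault() path seen in the trace above, while the anonymous call should not.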
Hope that helps.
Masoud
> --
> Michal Hocko
> SUSE Labs