linux-kernel - Re: Mysterious CFQ crash and RCU

Message-ID: <1306344530.21978.5.camel@t41.thuisdomein>
Date:	Wed, 25 May 2011 19:28:48 +0200
From:	Paul Bolle <pebolle@...cali.nl>
To:	Jens Axboe <jaxboe@...ionio.com>
Cc:	"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
	Vivek Goyal <vgoyal@...hat.com>,
	linux kernel mailing list <linux-kernel@...r.kernel.org>
Subject: Re: Mysterious CFQ crash and RCU

On Wed, 2011-05-25 at 10:46 +0200, Jens Axboe wrote:
> I don't think we are dealing with bad RCU usage in CFQ. My gut tells me
> that this is related to the merging of cooperating queues. It fits
> roughly with the time frame of when this issue started occurring, and
> some of that reference logic looks fragile/racy.
> 
> So if you _can_ test a patch easily, please try this one. It'll disable
> that logic.

I'm sorry, but with that patch (adapted to our previous discussion, so
simply returning NULL) applied I still hit the same Oops:

[  417.526021] Oops: 0000 [#1] SMP 
[  417.526021] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
[  417.526021] Modules linked in: cfq_iosched cpufreq_ondemand acpi_cpufreq mperf bnep bluetooth nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 ip6t_REJECT nf_defrag_ipv4 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables arc4 ppdev ath5k snd_intel8x0m snd_intel8x0 ath snd_ac97_codec mac80211 microcode ac97_bus snd_seq snd_seq_device snd_pcm cfg80211 joydev pcspkr thinkpad_acpi parport_pc e1000 rfkill parport snd_timer snd iTCO_wdt soundcore snd_page_alloc i2c_i801 iTCO_vendor_support uinput ipv6 yenta_socket video radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[  417.526021] 
[  417.526021] Pid: 30030, comm: mandb Not tainted 2.6.39-0.local5.fc16.i686 #1 IBM        /       
[  417.526021] EIP: 0060:[<f7efe929>] EFLAGS: 00010202 CPU: 0
[  417.526021] EIP is at call_for_each_cic+0x29/0x44 [cfq_iosched]
[  417.526021] EAX: 00000001 EBX: 6b6b6b6b ECX: 00000246 EDX: c0aa4a98
[  417.526021] ESI: f2f53580 EDI: f7efec18 EBP: edda5f18 ESP: edda5f0c
[  417.526021]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  417.526021] Process mandb (pid: 30030, ti=edda4000 task=f6a1d4c0 task.ti=edda4000)
[  417.526021] Stack:
[  417.526021]  f2f53580 f6a1d4c0 f6a1d890 edda5f20 f7efe956 edda5f2c c05e0506 f2f53580
[  417.526021]  edda5f40 c05e0596 f6a1d4c0 00000012 edda5f74 edda5f8c c044149f f646631c
[  417.526021]  f64662c0 00000009 f6a1d4c0 00000007 f6a1d6c4 f6a1d4b8 f6a1d6c4 00000001
[  417.526021] Call Trace:
[  417.526021]  [<f7efe956>] cfq_free_io_context+0x12/0x14 [cfq_iosched]
[  417.526021]  [<c05e0506>] put_io_context+0x34/0x5c
[  417.526021]  [<c05e0596>] exit_io_context+0x68/0x6d
[  417.526021]  [<c044149f>] do_exit+0x63e/0x661
[  417.526021]  [<c04416d9>] do_group_exit+0x63/0x86
[  417.526021]  [<c0441714>] sys_exit_group+0x18/0x18
[  417.526021]  [<c081cc9f>] sysenter_do_call+0x12/0x38
[  417.526021] Code: 5d c3 55 89 e5 57 56 53 3e 8d 74 26 00 89 c6 89 d7 e8 01 db ff ff 8b 5e 4c e8 50 5b 55 c8 85 c0 74 05 e8 b7 ff ff ff 85 db 74 11 <8b> 03 0f 18 00 90 8d 53 d8 89 f0 ff d7 8b 1b eb dd e8 10 db ff 
[  417.526021] EIP: [<f7efe929>] call_for_each_cic+0x29/0x44 [cfq_iosched] SS:ESP 0068:edda5f0c
[  417.526021] CR2: 000000006b6b6b6b
[  417.717510] ---[ end trace 24344cc07101e5e5 ]---

(That last sysfs file apparently was because I now had to switch from
deadline to cfq manually.)
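
One detail in the trace above: EBX and CR2 both hold 6b6b6b6b. That is
the byte 0x6b repeated, which the kernel's slab debugging writes into
freed objects (POISON_FREE in include/linux/poison.h), so the fault is
at least consistent with call_for_each_cic following a pointer read out
of already-freed memory. Below is a minimal sketch for spotting such
values in a pasted oops. The helper name poisoned_values and the regular
expression are illustrative assumptions, written for the 32-bit register
layout shown here, not an existing tool.

```python
import re

# Byte written into freed slab objects when slab poisoning is enabled
# (POISON_FREE in include/linux/poison.h).
POISON_FREE = 0x6B


def poisoned_values(oops_text, byte=POISON_FREE):
    """Return (name, value) pairs whose low 32 bits are `byte` repeated.

    Hypothetical helper: only looks at "EAX: xxxxxxxx"-style register
    fields and the CR2 line of an i386 oops.
    """
    pattern = f"{byte:02x}" * 4
    hits = []
    for name, value in re.findall(
        r"\b(E[A-Z]{2}|CR2): ([0-9a-f]{8,16})\b", oops_text
    ):
        # CR2 may be printed with 16 hex digits, so compare the low 8 only.
        if value[-8:] == pattern:
            hits.append((name, value))
    return hits


if __name__ == "__main__":
    sample = (
        "EAX: 00000001 EBX: 6b6b6b6b ECX: 00000246 EDX: c0aa4a98\n"
        "ESI: f2f53580 EDI: f7efec18 EBP: edda5f18 ESP: edda5f0c\n"
        "CR2: 000000006b6b6b6b\n"
    )
    for name, value in poisoned_values(sample):
        print(name, value)
```

A hit only says the value looks like poison. It does not say which
object was freed or who freed it.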



Paul Bolle
