linux-kernel - Doubt - facing a cpu softlockup during _raw_spin_unlock

Message-ID: <CAGvB2rPzAPB2PTsegwyKhQ0omq70r1DZP4NjX7hmdj=NP6ao0w@mail.gmail.com>
Date:   Sat, 28 Jan 2017 20:15:56 +0530
From:   Suraj Choudhari <surajschoudhari@...il.com>
To:     linux-kernel@...r.kernel.org
Subject: Doubt - facing a cpu softlockup during _raw_spin_unlock_irqrestore

Hello,

I have a few questions regarding a CPU soft lockup I am seeing with the
4.4 kernel on SLES 12.

Here are the details:
1) The thread that triggers the soft lockup was releasing a spinlock.
The soft lockup is reported while it is running '_raw_spin_unlock_irqrestore'.
That thread kept getting rescheduled and hit the soft lockup
repeatedly.

2) At the time of the lockup, all the IO threads had been sleeping for
1 hour in the schedule() function, specifically in the wait inside
get_request().

My questions:

1) What could cause the IO threads not to be scheduled for 1 hour,
while the thread causing the lockup is readily rescheduled?

2) What could cause some IO threads to wait in get_request()
for more than 1 hour?

From the get_request() implementation, I could see that
__get_request() may fail with ENODEV or ENOMEM.
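
For reference, a minimal userspace sketch of how an error-encoded
pointer can be decoded, assuming __get_request() in this kernel reports
failure through the kernel's ERR_PTR() convention (an errno stored as a
negative value in the pointer). The helper names ptr_is_err() and
ptr_to_errno() are hypothetical; only MAX_ERRNO mirrors the kernel's
include/linux/err.h.

```c
#include <errno.h>
#include <stdint.h>

/* Same limit the kernel uses in include/linux/err.h. */
#define MAX_ERRNO 4095

/* Hypothetical helper: nonzero if the raw pointer value falls in the
 * top MAX_ERRNO addresses, which is where ERR_PTR(-errno) values live. */
static int ptr_is_err(uintptr_t raw)
{
        return raw >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical helper: the positive errno encoded in the pointer,
 * or 0 if the value is not an error-encoded pointer. */
static int ptr_to_errno(uintptr_t raw)
{
        return ptr_is_err(raw) ? (int)-(intptr_t)raw : 0;
}
```

With this, a raw value of 0xfffffffffffffff4 on a 64-bit system would
decode to 12 (ENOMEM), while an ordinary request address decodes to 0.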

I tried to find the return value of __get_request() using the
SystemTap probe below, but it did not print the value of the request
pointer.

probe kernel.statement("get_request@...ck/blk-core.c:1246")
{
        printf("localvars1246:%s \n", $$locals);
}

output - localvars1246:is_sync=? wait={...} rl=? rq=?

So I could not determine exactly why __get_request() may be failing for
the IO threads.

3) Any suggestions on how to fix such a soft lockup in
_raw_spin_unlock_irqrestore? [I was thinking of calling 'cond_resched()'
in the thread hitting the soft lockup.]

Thanks & Regards,
Suraj