netdev - Re: Kernel crash after using new Intel NIC (igb)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4DDECA9B.8080206@fb.com>
Date:	Thu, 26 May 2011 14:48:11 -0700
From:	Arun Sharma <asharma@...com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Maximilian Engelhardt <maxi@...monizer.de>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	StuStaNet Vorstand <vorstand@...sta.mhn.de>
Subject: Re: Kernel crash after using new Intel NIC (igb)

On 5/26/11 12:47 PM, Eric Dumazet wrote:

> You dont get the problem. Problem is : We can do the empty() test only
> if protected by the lock.
>
> If not locked, result can be wrong. [ false positive or negative ]
>


Agreed. Failing to unlink from unused list when we should have sounds wrong.

>> The list modification under unused_peers.lock looks generally safe. But
>> the control flow (based on refcnt) done outside the lock might have races.
>>
>
> "might" is not a good word when dealing with this ;)

Potential race in the current code:

initial refcnt = 1

        T1:                                        T2

atomic_dec_and_lock(refcnt)
// refcnt == 0

                                        atomic_add_unless(refcnt)
                                        unlink_from_unused()

list_add_tail(unused)
                                        // T2 using "unused" entry


> Did you test my fix ?

I could try it on one or two machines - but it won't tell us anything 
for weeks if not months. Unfortunately my next window to try a new 
kernel on a large enough sample is several months away.

>
> Its doing the right thing : Using refcnt as the only marker to say if
> the item must be removed from unused list (and lock the central lock
> protecting this list only when needed)
>
> Since we already must do an atomic operation on refcnt, using
> atomic_inc_return [ or similar full barrier op ] is enough to tell us
> the truth.

Yeah - using the refcnt seems better than list_empty(), but I'm not sure 
that your patch addresses the race above.

  -Arun
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html