linux-kernel - Re: [PATCH v2] wireless: mark expected switch fall-throughs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com>
Date:   Tue, 23 Oct 2018 12:58:19 +0200
From:   "Gustavo A. R. Silva" <gustavo@...eddedor.com>
To:     Johannes Berg <johannes@...solutions.net>,
        "David S. Miller" <davem@...emloft.net>
Cc:     linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH v2] wireless: mark expected switch fall-throughs


On 10/23/18 10:59 AM, Gustavo A. R. Silva wrote:
> 
> On 10/23/18 9:01 AM, Johannes Berg wrote:
>> On Tue, 2018-10-23 at 02:13 +0200, Gustavo A. R. Silva wrote:
>>> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
>>> where we are expecting to fall through.
>>>
>>> Warning level 3 was used: -Wimplicit-fallthrough=3
>>>
>>> This code was not tested and GCC 7.2.0 was used to compile it.
>>
>> Look, I'm not going to make this any clearer: I'm not applying patches
>> like that where you've invested no effort whatsoever on verifying that
>> they're correct.
>>
> 
> How do you suggest me to verify that every part is correct in this type
> of patches?
> 

BTW... I'm under the impression you think that I don't even look at
the code. Is that correct?

I've been working on this for quite a while, and in every case I try to
understand the code in terms of the context in which every warning is
reported.

I look for dependencies between variables in adjacent switch cases, that could
make think it might be OK for some of those cases to fall through to the one
below. I check the names of the case labels to see if there might be any relation
between them, or if they are totally diferent as it might be the case for labels
that indicate the transmision(FOO_TX) or reception of data(FOO_RX). Something
similar to the latter is the case with on/off logic (FOO_ON/FOO_OFF). These are
some of the things I review in the code, so I can have an idea if the warning is
a false positive or an actual bug.

Here is a bug I found yesterday at drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c

5690         case WLAN_CIPHER_SUITE_CCMP:
5691                 key->flags |= IEEE80211_KEY_FLAG_SW_MGMT_TX;
5692                 break;
5693         case WLAN_CIPHER_SUITE_TKIP:
5694                 key->flags |= IEEE80211_KEY_FLAG_GENERATE_MMIC;
5695         default:
5696                 return -EOPNOTSUPP;
5697         }

Notice how the absence of a break statement is very suspicious in case
WLAN_CIPHER_SUITE_TKIP, because the code for case WLAN_CIPHER_SUITE_CCMP
is pretty similar and in that case there is a break at the bottom. Now,
that's not the only thing that looks supicious: in the absence of a break,
the code falls through to the default case, which returns the error value
-EOPNOTSUPP. So, even when key->flags is updated, the code always returns
an error for case WLAN_CIPHER_SUITE_TKIP. This analysis led me to think
that this is an actual bug, so I sent a patch to fix it:

https://lore.kernel.org/patchwork/patch/1002440/

Notice that this bug has been there since 2015:
commit 26f1fad29ad973b0fb26a9ca3dcb2a73dde781aa


Now, let's take a look at the following warning in net/wireless/chan.c:

net/wireless/chan.c: In function ‘cfg80211_chandef_usable’:
net/wireless/chan.c:748:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (!ht_cap->ht_supported)
      ^
net/wireless/chan.c:750:2: note: here
  case NL80211_CHAN_WIDTH_20_NOHT:
  ^~~~

This is the piece of code at net/wireless/chan.c:740:

 740         case NL80211_CHAN_WIDTH_5:
 741                 width = 5;
 742                 break;
 743         case NL80211_CHAN_WIDTH_10:
 744                 prohibited_flags |= IEEE80211_CHAN_NO_10MHZ;
 745                 width = 10;
 746                 break;
 747         case NL80211_CHAN_WIDTH_20:
 748                 if (!ht_cap->ht_supported)
 749                         return false;
 750         case NL80211_CHAN_WIDTH_20_NOHT:
 751                 prohibited_flags |= IEEE80211_CHAN_NO_20MHZ;
 752                 width = 20;
 753                 break;
 754         case NL80211_CHAN_WIDTH_40:
 755                 width = 40;
 756                 if (!ht_cap->ht_supported)
 757                         return false;
 758                 if (!(ht_cap->cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40) ||
 759                     ht_cap->cap & IEEE80211_HT_CAP_40MHZ_INTOLERANT)
 760                         return false;
 761                 if (chandef->center_freq1 < control_freq &&
 762                     chandef->chan->flags & IEEE80211_CHAN_NO_HT40MINUS)
 763                         return false;
 764                 if (chandef->center_freq1 > control_freq &&
 765                     chandef->chan->flags & IEEE80211_CHAN_NO_HT40PLUS)
 766                         return false;
 767                 break;

Notice that the warning was reported at line 748, but I'm including more code
here to make it explicitly clear that I not only focus my atention in a very
narrowed piece of code around the warning.

Now, I don't see anything supicious. The labels NL80211_CHAN_WIDTH_20 and
NL80211_CHAN_WIDTH_20_NOHT seem to be related, and it's even less supicious
when there is an explicit line of code that breaks the switch under certain
conditions, just at the bottom of the "case", as is the case with lines 748
and 749:

 748                 if (!ht_cap->ht_supported)
 749                         return false;

Also, no default case returning an error every time if the code falls through
to the next case. Lastly, I also check the age of the code, but this only after
I have analyzed it as explained above.


I do this analysis for every warning.  Now, when I say I haven't tested the code
is because I don't have any log as evidence for anything. Not that I haven't put
any effort on trying to understand it and its context.  When I started working on
this task there were more than 2000 of these issues, now there are around 600 left.

I have fixed many bugs on the way, so a good amount of work is being invested on
this, and it's paying off. :)


Now, let me ask you this question:

It would be easier for you to review this patch if I turn it into a series?

I can do that without a problem.

Thanks!
--
Gustavo