Recently I upgraded a customer’s wireless controllers to the latest Cisco 5520 WLCs, but because their environment had a mix of brand new access points and somewhat old ones, I had to use an outdated version of code that resulted in some weird client issues on the new APs.
The customer installed many new Cisco 2802 access points, which was perfectly fine and easy, except that they were still running hundreds of 2602s and dozens of 3500s. This meant that even though the controllers were brand new and running plenty of new APs, I had to use version 8.3.143 so that both the older access points and the newest ones would join the new controllers.
That was no big deal since 8.3.143 is a stable code version, and the sheer number of 2602s meant we really had no alternative anyway. After migrating all the old APs to the new controllers, I made sure to test a couple of 2802s prior to deploying the rest. My test 2802i access points joined the controller with no issue, and I connected with both my phone and laptop just fine. I experienced no delays, no issues browsing the internet, and no flakiness. Shortly thereafter I successfully joined all the new APs. This was a super-easy project, and everything seemed to be working just fine.
However, a couple of days later I learned that end-users weren’t able to browse the internet when connected to one of the new 2802i or 2802e access points. This didn’t make any sense to me because I had tested that particular platform and could also see many clients with correct IP addresses associated to those APs.
An on-site visit helped me understand better that the issue was related to TCP specifically and not simply connectivity. I was able to ping to my heart’s content, and navigating to several websites by IP rather than hostname seemed to work reasonably well. This led me to believe that this was some sort of really weird DNS issue, but after a few minutes of testing I found that even navigating by IP was unreliable and took an extremely long time.
After some googling and reaching out to a colleague, I learned that the default TCP MSS (maximum segment size) value on older versions of Cisco WLC code is set to 1363, which doesn’t work well with the latest access points. I’m not exactly sure why, but some extra padding is added to segments when clients are connected to the newest Cisco APs, which pushes the segment size past this limit and therefore degrades TCP communication.
I decreased the TCP MSS in the global configuration to 1250, which immediately resolved the issue. To do the same, navigate to the Wireless menu and select Global Configuration from the left menu pane.
From there, change the TCP MSS to the value you want, then click Apply and Save Configuration.
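To put the two MSS values in perspective, here is a rough sketch of the arithmetic in Python. It assumes plain IPv4 and TCP headers with no options (20 bytes each) and a standard 1500-byte MTU between the AP and the controller; those numbers and the function names are my own assumptions for illustration, not something I pulled from the controller. It only shows how many bytes each MSS leaves for whatever the AP adds when it tunnels client traffic back to the WLC. It does not tell you how many bytes the 2802s actually add.

```python
# Assumptions (not taken from the WLC itself): IPv4 and TCP headers without
# options, and a 1500-byte MTU on the path between the AP and the controller.
IPV4_HEADER = 20
TCP_HEADER = 20
PATH_MTU = 1500


def inner_packet_size(mss: int) -> int:
    """Size of the client's IP packet when it carries a full-size TCP segment."""
    return mss + IPV4_HEADER + TCP_HEADER


def tunnel_headroom(mss: int, path_mtu: int = PATH_MTU) -> int:
    """Bytes left for AP-to-controller encapsulation before the MTU is exceeded."""
    return path_mtu - inner_packet_size(mss)


for mss in (1363, 1250):
    print(
        f"MSS {mss}: inner packet {inner_packet_size(mss)} bytes, "
        f"headroom {tunnel_headroom(mss)} bytes"
    )
```

Under those assumptions, an MSS of 1363 leaves 97 bytes for tunnel overhead, while 1250 leaves 210. If the newer APs need more than 97 bytes, full-size segments would have to be fragmented or dropped, which would fit the symptoms: small pings work fine while TCP sessions crawl.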
I’m curious to know exactly what the newer APs do to increase segment size (and why), but I did also learn that the newest versions of WLC code set it to 1250 by default anyway. I haven’t used 8.3 in a while, and certainly not with 2800 series access points, so this was an interesting one to figure out.
I hope this helps in your wireless troubleshooting journeys.
Thanks,
Phil