Wi-Fi / WLAN Performance Management and Optimization
Veli-Pekka Ketonen, CTO, 7signal Solutions
Topics
1. The Wi-Fi Performance Challenge
2. Factors Impacting Performance
3. The Wi-Fi Performance Cycle
4. 10-step performance optimization flow
5. Selected example data
6. Summary / Questions
Wi-Fi Networks are Everywhere!
But they are transitioning from nice-to-have to must-have.
Challenges with mission-critical Wi-Fi networks:
- Connection issues with new devices and machines
- Bottlenecks from increasing data traffic
- Dropped or noisy voice calls
- Challenging physical environments
- Conditions that change hourly, daily and weekly
Dependable Wi-Fi is Costly and Complex
[Figure: cost needed to achieve reliability ($) plotted against complexity of network (number of access points, clients, applications). Curve shown: reactive focus based on complaints. Complexity drivers along the curve: BYOD, video apps, virtual desktop, location services, mobile computing, guest networks, voice over Wi-Fi.]
2. Factors impacting the performance
Improper Antenna Selection / Placement
- Antenna gain pattern
- Antenna gain direction
- Behind a metal grid?
- Near a conductive or dense surface?
- For common ceiling-mounted APs, a sideways, down-tilted pattern is the most useful: maximum gain sideways, down-tilted pattern, attenuation upwards
RF power isn't always what your datasheet and settings tell you
Impact of:
- AP/device model
- Rate/MCS
- HT 20/40/80
- Assumed MIMO gain
- Assumed diversity/STBC gain
- Antenna gain
- Channel number, regulation
- Passing the type approval
- Back-annotation reliability
Lower the output power and use antenna gain to reach further with higher rates.
RF power level is not that simple
[Figure: stacked power contributions on a scale of +8, +11, +14, +17 and +20 dBm, annotated with 180 Mbit/s and 300 Mbit/s. Contributions listed from the top: MIMO/TX diversity gain, +3 dB; no high MCS/rates, +3 dB; HT40 -> HT20, +2 dB; antenna gain, +3 dB; radio output (no antenna), HT40, highest MCS.]
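The stacked contributions on this slide are plain dB addition. The following is a minimal sketch, assuming that the +8 dBm starting point is the lowest tick of the slide's scale and that the four listed gains simply add. The function and dictionary names are illustrative and do not come from any vendor tool.

```python
def effective_power_dbm(radio_output_dbm, gains_db):
    """Add a set of gains (dB) to a conducted radio output (dBm)."""
    return radio_output_dbm + sum(gains_db.values())


def dbm_to_mw(dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10)


# Gains as listed on the slide. The 8 dBm base used below is an assumption
# read from the lowest tick of the slide's power scale.
SLIDE_GAINS_DB = {
    "antenna gain": 3.0,
    "HT40 -> HT20": 2.0,
    "no high MCS/rates": 3.0,
    "MIMO/TX diversity gain": 3.0,
}

if __name__ == "__main__":
    total = effective_power_dbm(8.0, SLIDE_GAINS_DB)
    print(f"effective power: {total:.1f} dBm ({dbm_to_mw(total):.1f} mW)")
```

With these inputs the gains add 11 dB on top of the bare radio output, which is more than a tenfold difference in milliwatts. That is why the configured power value alone says little about the level a client actually sees.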
WLAN Transmit Power Control (TPC) can create issues
- A common implementation measures neighbor AP signal levels and keeps them below a fixed value
- A high received neighbor AP level may drive AP power down
- Power levels may drift to the end of the allowed range and cause a lack of coverage
- Clients commonly transmit at +10 to +15 dBm; running APs at much lower levels unbalances the link budget. Both uplink and downlink coverage are needed!
[Figure: floor plan of rooms, marking the area where coverage is lost after AP power has drifted down.]
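The imbalance described above can be put into numbers. This is a minimal sketch under simplifying assumptions: the path loss is the same in both directions, and antenna gains and receiver sensitivities are identical at both ends. The example power levels and the 85 dB path loss are hypothetical.

```python
def received_level_dbm(tx_power_dbm, path_loss_db):
    """Received signal level for a given transmit power and path loss."""
    return tx_power_dbm - path_loss_db


def link_imbalance_db(ap_tx_dbm, client_tx_dbm):
    """Downlink minus uplink received level, assuming reciprocal path loss.

    A negative value means the client hears the AP more weakly than the
    AP hears the client.
    """
    return ap_tx_dbm - client_tx_dbm


if __name__ == "__main__":
    ap_tx, client_tx, path_loss = 5.0, 15.0, 85.0  # hypothetical values
    print("downlink:", received_level_dbm(ap_tx, path_loss), "dBm")
    print("uplink:  ", received_level_dbm(client_tx, path_loss), "dBm")
    print("imbalance:", link_imbalance_db(ap_tx, client_tx), "dB")
```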
Channel & Utilization Issues
- Channel overlap
- APs outside the channel grid
- HT conflicts
- Number of APs/SSIDs
- Empty AP vs. loaded AP
Allocate channels properly
- Use all the spectrum you have
- The most important way to increase capacity: avoid interference and lower utilization!
- Some devices do not support all 5 GHz channels, but try really hard to use all available channels
- Channel automation parameters may help the network converge towards a better channel plan
- If not, use a manual channel plan
[Figure: two example channel layouts. In one, nearly every AP sits on channel 1 (1, 1, 6, 1, 1, 1): without a very good reason this should never happen. In the other, APs alternate between channels 1, 6 and 11.]
Sometimes channel automation is not working well and needs help
[Figure: two panels, labeled "Continuous channel switching" and "More stable operation".]
Too high rates cause high retries
- WLAN AP rate control often uses rates that are too high
- This causes a high number of retries, which has a negative impact on performance
[Figure: optimal rate marked.]
* Lakshmanan et al.: On Link Rate Adaptation in 802.11n WLANs
* Haratcherev et al.: Automatic IEEE 802.11 Rate Control for Streaming Applications
What can rates and retries tell you?
- Retries HIGH, data rates/MCS HIGH (typical in WLAN): unstable, high jitter, packet loss, limited capacity
- Retries HIGH, data rates/MCS LOW: very slow, at the coverage boundary
- Retries LOW, data rates/MCS HIGH (target): good coverage, reliable operation, high speed and capacity
- Retries LOW, data rates/MCS LOW: speed limited, working OK
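The four quadrants can be applied to measured data with a small classifier. This is only a sketch: the slide gives no numeric boundaries, so both thresholds below (10% retries, half of the maximum rate) are hypothetical defaults that should be tuned per network.

```python
def classify_link(retry_rate, rate_fraction,
                  retry_threshold=0.10, rate_threshold=0.5):
    """Map a link onto the rates/retries quadrants.

    retry_rate:    fraction of frames that were retried (0.0-1.0)
    rate_fraction: average data rate divided by the maximum supported rate
    Both thresholds are hypothetical and not taken from the slide.
    """
    high_retries = retry_rate >= retry_threshold
    high_rates = rate_fraction >= rate_threshold
    if high_retries and high_rates:
        return "unstable, high jitter, packet loss, limited capacity"
    if high_retries:
        return "very slow, at the coverage boundary"
    if high_rates:
        return "good coverage, reliable operation, high speed and capacity"
    return "speed limited, working OK"


if __name__ == "__main__":
    print(classify_link(retry_rate=0.30, rate_fraction=0.9))
    print(classify_link(retry_rate=0.03, rate_fraction=0.8))
```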
Non-Wi-Fi Interference
- Bluetooth
- Microwave
- Video cameras
- Medical devices
Legacy mode drives speed down
- The largest impact comes from 802.11b protection
- When an AP detects an associated 802.11b client, it turns on protection mode (signaled in beacons and probe responses). An AP may also turn this on when it detects another AP using protection mode.
- When protection mode is on, all clients need to start using either RTS/CTS or CTS-to-Self protection to avoid collisions
- This introduces significant overhead that usually limits throughput and capacity considerably
- If 802.11b support is off, it's useful to try to remove 802.11b devices completely; otherwise they keep probing with 802.11b rates
TCP does not like lost packets or delay
- TCP uses a mechanism called slow start
- If a packet loss occurs, TCP assumes that it is due to network congestion and takes steps to rapidly reduce the offered load to the network
- With slow start, TCP starts increasing the rate again when consecutive acknowledgements are received properly
- Slow start may perform poorly on wireless networks that are losing packets
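The effect of a single loss on the sending rate can be illustrated with a toy model. This sketch assumes a simplified Tahoe-style sender (window doubles per round trip up to a threshold, grows by one segment afterwards, and restarts from one segment after a loss). Real TCP stacks use more refined recovery, so the numbers are illustrative only, and the function name and parameters are hypothetical.

```python
def simulate_cwnd(rounds, loss_rounds=(), initial_ssthresh=64):
    """Return the congestion window (in segments) at the start of each RTT.

    loss_rounds lists the round-trip indices in which a loss is detected.
    """
    cwnd, ssthresh = 1, initial_ssthresh
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Loss is taken as congestion: halve the threshold, restart.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: exponential growth per round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: linear growth.
            cwnd += 1
    return history


if __name__ == "__main__":
    clean = simulate_cwnd(8)
    lossy = simulate_cwnd(8, loss_rounds={3})
    print("no loss: ", clean, "segments sent:", sum(clean))
    print("one loss:", lossy, "segments sent:", sum(lossy))
```

In this model a single lost packet in the fourth round trip leaves the sender far behind the loss-free case for the rest of the run, even though the radio link itself may have had spare capacity.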
Retries at different layers using TCP
- User: may lose patience in 4-10 s
- Application (Layer 5-7): varies. Desktop virtualization is sometimes used to help with Layer 1-4 problems
- TCP (Layer 4): not ACK'd within 2x RTT? -> Resend with slow start
- WLAN (Layer 1-2): not ACK'd? -> Resend, 7-25 times
[Figure: user data passing down through the layers; each marker is a data packet, for illustration purposes only.]
Retries at different layers using UDP
- User: VoIP call, etc.
- Application (Layer 5-7)
- UDP (Layer 4): UDP does not retransmit; a lost packet is permanently lost
- WLAN (Layer 1-2): not ACK'd? -> Resend, 7-25 times
- Result: jitter and packet loss
[Figure: user data passing down through the layers; each marker is a data packet, for illustration purposes only.]
Layer 2 packet fragmentation makes radio more robust
- Unfragmented, all goes well: #1 (1500 B), ACK, #2 (1500 B), ACK. Good efficiency.
- Unfragmented, with an error: #1 (1500 B), no ACK (lost or any error), #1 retry 1 (1500 B). If an error is detected, the content of the whole 1500 B packet is lost and needs to be retransmitted.
- Fragmented: #1 to #4 (750 B each), each followed by an ACK. The probability of an error in a smaller packet is lower, and transmitting it takes less time in the first place.
- Fragmenting packets increases robustness, but increases overhead
- Aggregating (e.g. Block ACK) reduces robustness, but increases efficiency
- The fragmentation threshold default is usually 2346 B (> 1500 B, so no fragmenting)
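The trade-off between robustness and overhead can be estimated with a simple error model. This is a sketch under stated assumptions: bit errors are independent, retries are unlimited, all fragments have the same size, and each frame carries a fixed overhead. The 40-byte overhead and the bit error rates are hypothetical values chosen for illustration.

```python
def frame_success_probability(bit_error_rate, frame_bytes):
    """Probability that every bit of a frame arrives intact."""
    return (1.0 - bit_error_rate) ** (8 * frame_bytes)


def expected_bytes_on_air(payload_bytes, fragment_bytes, bit_error_rate,
                          overhead_bytes=40):
    """Expected bytes transmitted, retries included, to deliver a payload.

    overhead_bytes is a hypothetical per-frame header/trailer cost.
    """
    fragments = -(-payload_bytes // fragment_bytes)  # ceiling division
    frame_bytes = fragment_bytes + overhead_bytes
    p_ok = frame_success_probability(bit_error_rate, frame_bytes)
    # Each fragment needs 1 / p_ok transmissions on average.
    return fragments * frame_bytes / p_ok


if __name__ == "__main__":
    for ber in (0.0, 1e-5, 1e-4):
        whole = expected_bytes_on_air(1500, 1500, ber)
        split = expected_bytes_on_air(1500, 750, ber)
        print(f"BER {ber:g}: unfragmented {whole:.0f} B, fragmented {split:.0f} B")
```

On a clean channel the fragmented transfer costs more because of the extra per-frame overhead. As the error rate grows, the smaller frames win because each retry repeats less data.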
Higher QoS helps prioritize data
- Voice (VO), Video (VI), Best Effort (BE) and Background (BK) classes
* Source: IEEE 802.11-08/1214-02-00aa, 802.11 QoS Tutorial
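Frames are sorted into these four classes according to their 802.1D user priority (0-7). The sketch below shows the standard mapping used by WMM/EDCA; the function and table names are illustrative.

```python
# Standard mapping from 802.1D user priority (UP) to WMM access category.
UP_TO_ACCESS_CATEGORY = {
    1: "BK", 2: "BK",   # Background
    0: "BE", 3: "BE",   # Best Effort
    4: "VI", 5: "VI",   # Video
    6: "VO", 7: "VO",   # Voice
}


def access_category(user_priority):
    """Return the access category (BK, BE, VI or VO) for a user priority."""
    if user_priority not in UP_TO_ACCESS_CATEGORY:
        raise ValueError("user priority must be in the range 0-7")
    return UP_TO_ACCESS_CATEGORY[user_priority]


if __name__ == "__main__":
    for up in range(8):
        print(up, access_category(up))
```

Note that user priority 0 (the default for unmarked traffic) lands in Best Effort, above priorities 1 and 2, which map to Background.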
3. The Wi-Fi Performance Cycle
Answering the Wi-Fi Challenge
Problem -> Solution
- Wait for complaints -> Proactive measurements
- Limited view of network -> Check end-to-end performance
- Little historical data -> Analyze historical trends
- Guess at service levels -> Use metrics-based reporting
- Remote issues costly to resolve -> Centralize diagnosis of problems
Bending the Cost Curve
[Figure: cost needed to achieve reliability ($) plotted against complexity of network (number of access points, clients, applications). Two curves: reactive focus based on complaints, and proactive focus based on continuous measurements. Complexity drivers along the curves: BYOD, video apps, virtual desktop, location services, mobile computing, guest networks, voice over Wi-Fi.]
Performance Management with a Systematic Approach
- Simulate client traffic (active tests)
- Listen to AP / client traffic (passive tests)
[Figure: sensor, management station and access point(s).]
The Eye's Capabilities
- Synthetic Tests: end-to-end view at the application layer; data and voice quality measurements (throughput, packet loss, latency, jitter)
- Traffic Analysis: radio frame header analysis for traffic flow between clients and APs; KPIs for each client, SSID, AP, band and antenna beam
- RF Analysis: AP settings, capabilities, signal levels, channels and noise levels; KPIs for each AP, channel and antenna beam
- Spectrum Analysis: high resolution (280 kHz) for the ISM band; interference source analysis with compass directional data on beams
- Full Packet Capture: capture remotely; easy export to Wireshark or another tool
The Wi-Fi Performance Cycle
"If you can't measure it, you can't manage it!" - Peter Drucker
[Figure: cycle diagram with the stages Measure, Analyze, Optimize, Verify and Assure.]
4. Optimization flow, 10-step process
The most important KPIs
- End user metrics (active tests): connection success, throughput, packet loss, latency, jitter, voice quality (MOS)
- Layer 2 / Layer 1 metrics (passive tests): data rates, retry rates, utilization, traffic volume, channels, signal level, spectrum data
[Figure labels: Assess, Optimize.]
Optimization flow at a glance
1. Preparations and baseline: ensure that APs and antennas are positioned correctly. Collect baseline data for a few days, check the WLAN SW release, upgrade.
2. Channel plan: maximize available spectrum, organize channels for maximum capacity potential. Use a manual channel plan in dense areas.
3. Minimize utilization: minimize utilization due to unnecessary 802.11 traffic (number of SSIDs, standards, beaconing, probing, data rates, protection, etc.).
4. Adjust power levels: adjust AP power levels and TPC settings for improved SNR at both ends.
5. Reduce non-WLAN interference: remove non-WLAN interference as much as possible. There is always interference; understand whether it has significant impact.
6. Improve radio robustness: make the radio more robust towards remaining interference/noise (increased power, dropping max MCS, fragmentation, directional antennas).
7. Prioritize and balance traffic: QoS categories, AP power levels, load balancing, SSID strategy, roaming.
8. LAN/WAN capabilities: ensure sufficient LAN/WAN capacity and performance are present.
9. Improve client operation: drivers, location, models, settings.
10. Physical network changes: if performance is not sufficient, consider HW changes (directional antennas, add/move APs, replace equipment, end user devices).
#1. Understand the baseline
- Collect and review all radio parameter settings
- Verify AP type, antenna performance and placement
- Collect baseline performance data for 3-5 days
- Understand peaks and valleys in performance
- Nighttime data is extremely useful: if an empty network can't provide good throughput, it won't do so under load either!
- Analyze and find likely bottlenecks
- Draft a plan for optimization steps
- Make small changes and verify each step
#2. Plan the channels carefully
- Understand the number of APs per channel in the whole area
- Use the maximum amount of radio spectrum and channels
- Align all APs to a common channel grid (1, 6, 11, etc.)
- Fix the HT bonding side, HT40+ or HT40-
- Do not overlap a bonded channel with a main channel
- If automation does not provide a balanced plan, assign channels manually
- Rotate channels evenly within a floor
- Rotate with an offset between floors
- Remove out-of-grid devices if possible
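The "rotate evenly within a floor, rotate with an offset between floors" rule can be sketched as follows. This is a starting point for a manual plan, not a replacement for a site survey: it assumes APs are listed in order of physical adjacency and stacked in roughly the same positions on each floor. The function name and layout model are hypothetical.

```python
def rotate_channels(aps_per_floor, floors, channels=(1, 6, 11)):
    """Build a channel plan that rotates through the grid on each floor
    and shifts the rotation by one position per floor.

    Returns {floor_index: [channel for each AP position]}.
    """
    plan = {}
    for floor in range(floors):
        plan[floor] = [
            channels[(position + floor) % len(channels)]
            for position in range(aps_per_floor)
        ]
    return plan


if __name__ == "__main__":
    for floor, assigned in rotate_channels(aps_per_floor=6, floors=3).items():
        print(f"floor {floor}: {assigned}")
```

With the one-step offset, an AP never shares a channel with the AP next to it on the same floor or with the AP at the same position one floor up or down. The same function works for a 5 GHz channel list passed in as `channels`.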
#3. Minimize utilization
- Reduce the number of SSIDs per AP to a maximum of 3-4. Note: every SSID sends its own beacon, day and night. It's common for networks to run at high utilization without clients!
- Remove 802.11b rates (1, 2, 5.5, 11 Mbit/s) and their support
- Remove low MCS and SS multiples
- Increase the beacon interval from 100 ms to 300 ms. Note: some devices do not allow this, e.g. Vocera badges, older VoIP phones and older equipment in general
- Increase the CCA threshold (RX SOP, or similar term)
- Remove printers and other devices that keep the air busy
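The beacon load of an idle network can be estimated with simple arithmetic. This is a rough sketch: the 250-byte beacon size is an assumption (real beacons vary with the advertised capabilities), beacons are assumed to go out at 1 Mbit/s with the 192 µs long preamble, and contention gaps between frames are ignored. All function and parameter names are illustrative.

```python
def beacon_airtime_us(beacon_bytes=250, rate_mbps=1.0, preamble_us=192.0):
    """Airtime of one beacon in microseconds (size is an assumed value)."""
    return preamble_us + beacon_bytes * 8 / rate_mbps


def beacon_utilization(aps_on_channel, ssids_per_ap, interval_ms=100.0,
                       **beacon_params):
    """Fraction of airtime on one channel consumed by beacons alone."""
    beacons_per_interval = aps_on_channel * ssids_per_ap
    airtime_us = beacons_per_interval * beacon_airtime_us(**beacon_params)
    return airtime_us / (interval_ms * 1000.0)


if __name__ == "__main__":
    for interval in (100.0, 300.0):
        share = beacon_utilization(5, 4, interval_ms=interval)
        print(f"5 APs x 4 SSIDs, {interval:.0f} ms interval: {share:.1%}")
    # Same load with beacons sent at 6 Mbit/s OFDM (20 us preamble).
    fast = beacon_utilization(5, 4, rate_mbps=6.0, preamble_us=20.0)
    print(f"beacons at 6 Mbit/s: {fast:.1%}")
```

Under these assumptions, five APs audible on one channel with four SSIDs each spend a large share of the airtime on beacons before any client sends data. Fewer SSIDs, a longer beacon interval and removing the 802.11b rates each cut that share.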
#4. Adjust power levels
- Define a limited range for TPC algorithms instead of the default
- Observe power level changes from the metrics as well. Do they correlate with the settings?
- Assign a 3-5 dB higher power range for 5 GHz than for 2.4 GHz
- Use manual power levels if TPC does not yield good results
- If possible, do not exceed the power level that still supports all data rates/MCSs. Consider compensating with higher-gain antennas if needed
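One reason the 5 GHz band needs more power is that path loss grows with frequency. The sketch below uses the standard free-space path loss formula to compare the two bands. Free space is an idealization, indoor propagation differs, and the example frequencies (2437 MHz and 5500 MHz) are merely representative channel centers, so treat the result as an indication rather than a setting.

```python
import math


def free_space_path_loss_db(distance_km, frequency_mhz):
    """Free-space path loss: 20*log10(d_km) + 20*log10(f_MHz) + 32.44 dB."""
    return 20 * math.log10(distance_km) + 20 * math.log10(frequency_mhz) + 32.44


def band_loss_difference_db(low_mhz=2437.0, high_mhz=5500.0, distance_km=0.02):
    """Extra free-space loss at the higher frequency over the same distance."""
    return (free_space_path_loss_db(distance_km, high_mhz)
            - free_space_path_loss_db(distance_km, low_mhz))


if __name__ == "__main__":
    print(f"2.4 GHz loss at 20 m: {free_space_path_loss_db(0.02, 2437.0):.1f} dB")
    print(f"5 GHz loss at 20 m:   {free_space_path_loss_db(0.02, 5500.0):.1f} dB")
    print(f"difference:           {band_loss_difference_db():.1f} dB")
```

The difference depends only on the frequency ratio, not on the distance.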
#5. Reduce non-Wi-Fi interference
- Interference is present, always! Understand the level of impact
- How are end user metrics impacted? Correlate spectrum data with metrics
- Analyze the spectrum: where does the noise come from?
- Bluetooth is the most common non-WLAN source: keyboard, mouse, headset, handheld readers
- There are many other potential sources, especially in the 2.4 GHz band
- Remove sources when possible
- Observe the impact on throughput and other end user metrics when changes are made
- If changes are helping, it's visible in the active test data
#6. Improve WLAN robustness
- Remove the highest rates/MCS (most sensitive)
- Run voice SSIDs in 802.11g/a-only mode, without 802.11n
- Use radio packet fragmentation
- Enable interference-resistant mode if supported
#7. Prioritize and balance traffic
- Separate SSIDs (but keep the quantity to a minimum)
- Assign QoS classes with WMM (Wi-Fi Multimedia, formerly Wireless Multimedia Extensions)
- Adjust relative AP power levels to move clients
- Consider using load balancing, band steering/select and admission control features
- Available features differ by vendor
#8. Ensure sufficient LAN/WAN capacity
- Observe utilization at the switch/router interfaces
- Observe packet loss metrics
- Internet connection speed may be a bottleneck at remote sites
- Always routing data packets through the controller may impact performance
- Understand what throughput is sufficient for the end user and dimension connections accordingly
#9. Improve client operation
- Review all client devices and understand where their antennas are
- Ensure that antennas are not hidden within metal enclosures and have space to operate properly
- Upgrade WLAN drivers
- Turn roaming aggressiveness to medium or low
- Adjust the client power level
- CTS-to-Self may be more efficient than RTS/CTS
#10. Physical changes to network
- Move APs
- Add APs
- Upgrade APs
- Use good-quality external antennas of the right type
Every network can be made to perform well!
5. Examples
Akron Children's Medical Center
Uplink throughput
- Average improved from ~11 to ~14 Mbit/s (27%)
- The worst APs improved from ~4 to ~13 Mbit/s (225%)
[Figure: uplink throughput over time, annotated with: antenna change ready, channel change, power level change, codec changes, core LAN upgrade.]
Downlink throughput
- Average improved from 13 to 17 Mbit/s (30%)
- The worst APs improved from 7 to 15 Mbit/s (110%)
[Figure: downlink throughput over time, annotated with: antenna change ready, channel change, power level change, codec changes, core LAN upgrade.]
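The percentages on these two slides follow from the before/after throughput values. The short sketch below recomputes them; since the slide values are approximate readings, the results can differ from the quoted figures by a few points. The function name is illustrative.

```python
def improvement_percent(before, after):
    """Relative improvement from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    cases = {
        "uplink average (11 -> 14 Mbit/s)": (11, 14),
        "uplink worst APs (4 -> 13 Mbit/s)": (4, 13),
        "downlink average (13 -> 17 Mbit/s)": (13, 17),
        "downlink worst APs (7 -> 15 Mbit/s)": (7, 15),
    }
    for label, (before, after) in cases.items():
        print(f"{label}: {improvement_percent(before, after):.0f}%")
```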
Packet loss
- From ~2.5% to ~0.5%
[Figure: packet loss over time, annotated with: antenna change ready, channel change, power level change, codec changes, core LAN upgrade.]
University, Iowa
Downlink throughput (daily)
Downlink throughput daily averages have improved 50%
[Figure: daily downlink throughput with seven annotated changes.]
1st) Disabling power saving
2nd) Disabling b data rates, area 1
3rd) Disabling b data rates in other locations
4th) New channel plan, areas 1 & 2
5th) New TxPwr settings in XXX and channel plan in YYY
6th) Beacon interval change
7th) Channel re-plan, area 3, 2.4 GHz
Downlink throughput (hourly)
Minimum values increase up to ~10x
[Figure: hourly downlink throughput with seven annotated changes.]
1st) Disabling power saving
2nd) Disabling b data rates, area 1
3rd) Disabling b data rates in other locations
4th) New channel plan, areas 1 & 2
5th) New TxPwr settings in XXX and channel plan in YYY
6th) Beacon interval change
7th) Channel re-plan, area 3, 2.4 GHz
Avans University of Applied Sciences
TCP downlink throughput
- 900% improvement on the 1st floor
- 100% improvement on the ground floor
[Figure: TCP downlink throughput over time with change markers 1-5. Labeled changes: HT40, more channels, beacon 300 ms, AP power levels.]
HTTP downlink throughput
- 90% / 50% improvements
[Figure: HTTP downlink throughput over time with change markers 1-5.]
Voice Quality (MOS), downlink, hourly
- +0.25 MOS on the ground floor
- +0.25 MOS on the 1st floor
[Figure: hourly downlink MOS over time with change markers 1-5.]
Network latency (RTT)
- 50% improvement on the 1st floor
[Figure: network round-trip time over time with change markers 1-5.]
Performance Dashboard
[Figure: two dashboard views, before analysis and optimization and after analysis and optimization.]
6. Summary
Summary
- Wi-Fi is very sensitive to its surroundings and network parameters, even though it somehow works almost no matter where you put it
- Performance can often be improved significantly by adjusting the network parameters
- Relevant continuous data is needed to validate changes
- Knowledge of WLAN/RF is needed to decide on the actions
- Optimization requires a pragmatic approach
Thank You!
Email: veli-pekka.ketonen@7signal.com
Presentation: http://go.7signal.com/surfwlpc
www.7signal.com
@7signal