
WebRTC 102: #5 Understanding Call Quality

The key metrics for measuring call quality in WebRTC audio-video calls

WebRTC has become a popular choice for real-time communication applications due to its ease of use and low latency. However, like any technology, it is not without its flaws. There are several common problems that we often encounter with WebRTC calls. These problems can affect the call quality and the overall user experience.

At Dyte, we do our best to give our users an issue-free experience. However, several factors, such as network conditions, OS, audio, and video devices, remain out of our control, causing issues in the call experience of our users. In this blog post, we will discuss the key metrics for measuring call quality in WebRTC audio-video calls and learn how to track and improve these metrics to enhance the user experience.

But first, we will categorize the issues that dampen call experience into the following buckets.

1. One-way audio or video: One-way media refers to issues when the receiver can't hear/see you, but you can hear/see them (or vice versa).

Possible causes for this would be:

  • It can be caused by incorrect network setup or network-level blocking, which can lead to ICE/STUN binding failure for the client and the failure of the media connection.
  • Another possible cause of one-way audio or video is a poor internet connection. When the connection is weak or unstable, it can affect the quality of audio and video transmission. Additionally, participants may accidentally mute their microphones or turn their cameras off, leading to one-way audio or video.

2. Audio and video lag: The time taken from the source media capture to playback on the receiver’s end is called end-to-end latency. If this latency metric is significantly high, the media stops being perceived as real-time, and the user has a degraded experience that we call audio and video lag. Typically anything beyond 300ms of end-to-end latency is considered sub-optimal.

It leads to a disjointed conversation and choppy video, negatively impacting the user experience.

One common cause of this issue is high network Round Trip Time (RTT), which measures the time taken for a packet to travel from the sender to the receiver and back. High RTT values can indicate network congestion.

Improving network bandwidth is the first step in addressing this problem, as it can reduce RTT and improve the overall quality of the call. There are several ways to do so, such as using a wired connection instead of a wireless one, upgrading the internet plan, and limiting the number of devices connected to the network during the call.

In addition to improving the network bandwidth, other strategies can be implemented to reduce RTT and enhance call quality. These include implementing congestion control strategies, error correction techniques, and bandwidth management strategies.

Congestion control strategies, like TCP-friendly rate control, can prevent network congestion and packet loss, helping ensure that packets are not dropped and call quality is maintained. Error correction techniques, such as Forward Error Correction (FEC), can help recover lost packets and reduce the risk of audio and video lag. Bandwidth management strategies, such as dynamic bitrate adaptation, can help optimize bandwidth usage and ensure a smooth and seamless user experience.
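To make dynamic bitrate adaptation concrete, here is a minimal sketch of a loss-based controller in the additive-increase/multiplicative-decrease (AIMD) style. The thresholds and step sizes are illustrative assumptions, not WebRTC's actual internal values:

```javascript
// Sketch: a simple loss-based bitrate controller (AIMD-style). The 10% / 2%
// loss thresholds and the 50 kbps probe step are illustrative assumptions.
function nextBitrate(currentBps, fractionLost, { min = 100_000, max = 2_500_000 } = {}) {
  if (fractionLost > 0.1) {
    // Heavy loss: back off multiplicatively, proportional to the loss seen.
    return Math.max(min, Math.round(currentBps * (1 - 0.5 * fractionLost)));
  }
  if (fractionLost < 0.02) {
    // Clean network: probe upward additively.
    return Math.min(max, currentBps + 50_000);
  }
  // Moderate loss: hold steady.
  return currentBps;
}
```

In a real pipeline, the `fractionLost` input would come from RTCP receiver reports (surfaced by `getStats()`), and the output would feed the encoder's target bitrate.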

3. Choppy or broken media: When both parties can connect, continuous moments of lossy or bad audio make the conversation difficult to understand.

High jitter is one of the primary reasons for this issue. Jitter refers to the variation in the time it takes for a packet to travel from the sender to the receiver. With high jitter, packets can arrive out of order, resulting in choppy or broken audio.
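The `jitter` value that WebRTC reports is an exponentially smoothed estimate of this packet-to-packet transit variation, defined in RFC 3550 §6.4.1. A minimal sketch, assuming send and receive timestamps in the same clock units (e.g. milliseconds):

```javascript
// Sketch: the interarrival jitter estimator from RFC 3550 §6.4.1, the basis
// for the `jitter` field reported by getStats().
function createJitterEstimator() {
  let prev = null;
  let jitter = 0;
  return function update(sendTs, recvTs) {
    if (prev !== null) {
      // D = change in transit time between consecutive packets.
      const d = Math.abs((recvTs - prev.recvTs) - (sendTs - prev.sendTs));
      // Exponentially smoothed with gain 1/16, as the RFC specifies.
      jitter += (d - jitter) / 16;
    }
    prev = { sendTs, recvTs };
    return jitter;
  };
}
```

When packets arrive at a steady cadence the estimate stays near zero; any variation in transit time pushes it up, and the receiver's jitter buffer must grow to absorb it.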

One strategy for reducing jitter is to implement congestion control techniques. Another technique is to use error correction methods, such as Forward Error Correction (FEC), which can help recover lost packets and reduce the risk of choppy or broken audio.

For video, this will result in an unsteady FPS with missing frames causing video freezes.

4. Low-quality video: When the resolution of the video is below the acceptable threshold, this can result in a blurry or pixelated image that is difficult to see.

  • If spatial simulcast is active and the receiver network conditions are inadequate to handle a higher-quality stream variant, bandwidth estimation on the SFU side will switch to a low-quality stream.
  • On the sender side, there could be multiple reasons for low-quality output.
    • If the processing power available is insufficient to encode high quality, the encoder will switch the output to low quality.
    • The encoder also receives network feedback from bandwidth estimation (BWE) and uses it to adjust quality; if the network is not adequate to transmit the stream at higher quality, it will switch to lower quality.
    • The encoder can also be given a hint about whether, under constrained conditions, it should preserve resolution or frames per second. The default is “balanced,” so this might not be configured correctly for applications preferring quality preservation.
    • Another possible cause might be the camera feed input being low quality and not supporting higher resolutions.
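The resolution-vs-FPS hint mentioned above is exposed through `degradationPreference` on the sender's parameters. A minimal sketch, assuming a browser `RTCRtpSender` (the helper only builds and applies the parameter patch, so it also runs against a stub):

```javascript
// Sketch: tell the encoder to sacrifice frame rate before resolution when
// constrained. 'maintain-resolution' favors sharpness; the default is
// 'balanced'. In a browser, `sender` is an RTCRtpSender.
async function preferResolution(sender) {
  const params = sender.getParameters();
  params.degradationPreference = 'maintain-resolution';
  await sender.setParameters(params);
  return params;
}
```

For a screen-share track, `'maintain-resolution'` is usually the right choice; for a talking-head camera feed, `'maintain-framerate'` often looks better.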

5. Laggy video and low audio/echo: When the frames per second of the video are less than ideal, the result is jarring, stuttery video playback.

  • If temporal simulcast is active and the receiver network conditions are inadequate to handle a higher FPS stream variant, bandwidth estimation on the SFU side will switch to a lower temporal layer.

Low audio or echo can be problematic, making it difficult to hear and understand what the other participant is saying. Measuring and optimizing call quality is essential for enhancing the user experience in WebRTC audio-video calls.


Metrics to look out for in a WebRTC call

  • Packet loss is one of the most important metrics to track in measuring call quality in WebRTC. This is because packet loss can result in audio and video quality degradation, frustrating users. When packets are lost, the receiver may experience choppy audio or video or even complete loss of audio or video. This is why it is important to keep packet loss to a minimum to ensure a smooth and seamless user experience.
    • getStats - fractionLost

  • Round Trip Time (RTT) is another essential metric to track in measuring call quality in WebRTC. RTT measures the time it takes for a packet to travel from the sender to the receiver and back. High RTT values can indicate network congestion, leading to packet loss and poor call quality. When RTT is high, users may experience delays in audio or video or may even experience a complete loss of them. By tracking RTT, developers can identify network areas that are causing congestion and take steps to alleviate it.
    • getStats - roundTripTime

  • Jitter is the variation in the time taken for a packet to travel from sender to receiver. High jitter values can lead to audio and video distortions, which can be frustrating for users. By tracking the jitter, developers can identify areas of the network that are causing the jitter and take steps to alleviate it.
    • getStats - jitter

  • Picture Loss Indication (PLI) is a signal from the receiver to the sender indicating that a video frame has been lost. This metric is important to track because lost video frames can result in stuttering or freezing of the video stream, which can frustrate users. By tracking PLI, developers can identify when and where video frames are being lost and take steps to improve the quality of the video stream.
    • getStats - pliCount

  • Frames dropped counts frames lost during transmission or decoding, resulting in a choppy or jerky video during a call.
    • getStats - framesDropped

  • Frames per second (FPS) is the number of frames displayed per second in a video. In WebRTC, we can measure FPS by counting the number of frames displayed in a given time interval. When the frame rate is too low, users may experience stuttering or freezing of the video stream.
    • getStats - framesPerSecond

  • Concealment events occur when the receiver's jitter buffer has to synthesize audio to cover missing or late packets, resulting in audible artifacts in the stream.
    • getStats - concealmentEvents

  • Network resources, such as bandwidth and CPU usage, can also impact call quality. When bandwidth or CPU resources are insufficient, users may experience poor call quality. By tracking these resources, developers can identify when and where resources are being stretched too thin and take steps to optimize them.
    • getStats - qualityLimitationReason, qualityLimitationDuration, qualityLimitationResolutionChanges

  • Total processing delay is the time media spends being processed at each end — capture, encode, decode, and render — on top of the network transit time. In WebRTC, high processing delay adds to end-to-end latency and affects the overall quality of the call.
    • getStats - totalProcessingDelay

  • Jitter buffer delay is the time it takes for the receiver to buffer incoming packets before playing them out. In WebRTC, high jitter buffer delay can cause delays and affect the overall quality of the call.
    • getStats - jitterBufferDelay
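The `getStats()` fields called out above live on different stats objects: receive-side metrics on `inbound-rtp`, RTT and fraction lost on `remote-inbound-rtp`, and quality-limitation data on `outbound-rtp`. A minimal sketch that pulls them into one summary (field availability varies by browser, so treat missing fields as expected):

```javascript
// Sketch: collect the quality metrics above from a getStats() report,
// which is a Map-like collection of stats objects keyed by id.
function summarizeStats(report) {
  const summary = {};
  report.forEach((s) => {
    if (s.type === 'inbound-rtp') {
      summary.jitter = s.jitter;
      summary.jitterBufferDelay = s.jitterBufferDelay;
      summary.framesPerSecond = s.framesPerSecond;
      summary.framesDropped = s.framesDropped;
      summary.pliCount = s.pliCount;
      summary.concealmentEvents = s.concealmentEvents;
    } else if (s.type === 'remote-inbound-rtp') {
      // Sender-side view of the remote receiver: RTT and loss live here.
      summary.roundTripTime = s.roundTripTime;
      summary.fractionLost = s.fractionLost;
    } else if (s.type === 'outbound-rtp') {
      summary.qualityLimitationReason = s.qualityLimitationReason;
    }
  });
  return summary;
}
```

In the browser you would feed this the result of `await peerConnection.getStats()`.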

Error correction is another essential strategy for improving call quality in WebRTC. When packets are lost, error correction techniques such as Forward Error Correction (FEC) can help recover them. FEC adds redundancy to the sent data, allowing lost packets to be reconstructed. Using FEC, developers can help ensure that lost packets do not result in poor call quality.
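The simplest FEC flavor is XOR parity (the scheme in RFC 5109): one recovery packet protects a group, and any single lost packet in the group can be rebuilt by XOR-ing the survivors with the parity packet. A toy sketch of the core idea:

```javascript
// Sketch: XOR-parity FEC. The parity packet is the byte-wise XOR of the
// group; XOR-ing it with the surviving packets reconstructs the lost one.
function xorPackets(packets) {
  const len = Math.max(...packets.map((p) => p.length));
  const parity = new Uint8Array(len);
  for (const p of packets) {
    for (let i = 0; i < p.length; i++) parity[i] ^= p[i];
  }
  return parity;
}

// Recover a single lost packet from the survivors plus the parity packet.
function recoverLost(survivors, parity) {
  return xorPackets([...survivors, parity]);
}
```

The trade-off is the one the blog post implies: the redundancy costs extra bandwidth up front, but avoids the round trip a retransmission would need.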

Bandwidth management is also important for ensuring good call quality in WebRTC. When insufficient bandwidth is available, users may experience poor call quality. Bandwidth management techniques such as dynamic bitrate adaptation can help optimize bandwidth usage, allowing for a smooth and seamless user experience.


Monitoring these metrics

To monitor these metrics, WebRTC developers can use getStats() to extract them from the client-side RTCPeerConnection.
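One practical wrinkle: counters like `packetsLost` and `packetsReceived` are cumulative, so per-interval rates have to be derived from deltas between samples. A minimal sketch of a sampler (`pc` is anything with a `getStats()` that resolves to a stats report, which makes it easy to stub in tests):

```javascript
// Sketch: sample getStats() periodically and derive interval packet loss
// from the cumulative counters, which only ever grow.
function createLossSampler(pc) {
  let prev = { lost: 0, received: 0 };
  return async function sample() {
    const report = await pc.getStats();
    let lost = 0;
    let received = 0;
    report.forEach((s) => {
      if (s.type === 'inbound-rtp') {
        lost += s.packetsLost ?? 0;
        received += s.packetsReceived ?? 0;
      }
    });
    const dLost = lost - prev.lost;
    const dRecv = received - prev.received;
    prev = { lost, received };
    const total = dLost + dRecv;
    return total > 0 ? dLost / total : 0;
  };
}
```

In production you would call `sample()` on a timer (every few seconds) and ship the results to your observability pipeline.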

While the extraction on the client side is simple, having an observability pipeline to ingest these metrics, persist them, and get valuable data out of them is not straightforward. The easiest way to do that is with products like testRTC’s watchRTC, which provides end-to-end monitoring capabilities. However, once you want to get beyond the basics and need custom monitoring/analysis, you might have to build this independently, just as we did at Dyte.

Such observability allows you to look into specific user sessions, understand how the call performed, and find the root cause of potential issues in case the user did not have the best experience. It also allows you to monitor aggregate call performance data system-wide, enabling precise analysis.

For example:

  • Understand how performance is affected between different browser versions like from Chrome 110 to Chrome 111
  • Understand how specific regions or internet service providers perform and understand their connectivity to your media servers

However, sometimes getStats() does not give you the complete picture, and you have to run an analysis at the TCP/UDP layer. Getting this data is not easy, as it is not available from inside a browser; it involves running tools such as tcpdump to capture low-level packet dumps (pcaps), which can then be viewed in tools like Wireshark.

At these lower layers, when there is network congestion, packets can be lost, resulting in poor call quality. Congestion control strategies such as TCP-friendly rate control can help prevent network congestion and packet loss. By limiting the rate at which packets are sent, congestion control can help ensure that packets are not lost and that call quality is maintained.
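TCP-friendly rate control (TFRC, RFC 5348) caps the send rate using the TCP throughput equation, so a TFRC flow competes fairly with TCP on the same bottleneck. A sketch of the equation, using the RFC's common simplifications (one packet acknowledged per ACK, retransmit timeout approximated as four RTTs):

```javascript
// Sketch: the TCP throughput equation from RFC 5348. s = segment size in
// bytes, rtt = round trip time in seconds, p = loss event rate (0..1).
// Assumptions: b = 1 packet per ACK, t_RTO ≈ 4 * rtt, as the RFC suggests.
function tfrcRateBps(s, rtt, p) {
  const b = 1;
  const tRto = 4 * rtt;
  const denom =
    rtt * Math.sqrt((2 * b * p) / 3) +
    tRto * 3 * Math.sqrt((3 * b * p) / 8) * p * (1 + 32 * p * p);
  return (8 * s) / denom; // bits per second
}
```

The shape of the equation is what matters here: the allowed rate falls as either the loss event rate or the RTT rises, which is exactly the back-off behavior that keeps a media flow from starving competing traffic.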


Conclusion

Measuring and optimizing call quality is essential for enhancing the user experience in WebRTC audio-video calls. Developers can identify and mitigate quality issues by tracking metrics such as packet loss, RTT, PLI, frame rate, jitter, and network resources. By implementing congestion control, error correction, and bandwidth management strategies, developers can improve call quality and provide a smoother, more seamless real-time communication experience for users.

With the right tools and techniques, developers can ensure that their WebRTC applications provide the highest quality audio and video experience possible.

If you have any thoughts or feedback, please get in touch with me on LinkedIn. Stay tuned for more related blog posts in the future!

If you haven't heard about Dyte yet, head over to dyte.io to learn how we are revolutionizing communication through our SDKs and libraries and how you can get started quickly on your 10,000 free minutes, which renew every month. You can reach us at support@dyte.io or ask our developer community if you have any questions.
