Wireshark Sniffing
Ghislain Ndeuchi, 2010-01-28
Setting Switzernet VoIP account on X-Lite
Basic introduction to SIP and RTP
Capturing DTMF using Wireshark
The aim of this tutorial is to show how Wireshark can be used for VoIP packet sniffing. Problems occurring in a network can often be understood by capturing packets and examining their contents. Wireshark is one of the best-known tools for this kind of task.
This document goes through the installation of Wireshark and X-Lite under Windows XP. It also explains how to set up X-Lite using Switzernet VoIP parameters. Then, it presents the two main protocols used in VoIP communication, SIP and RTP, by discussing them on the basis of the capture results. We show how DTMF signals are passed in VoIP communication and how to save RTP packets into an audio file. A sample file containing a two-way voice communication is provided.
Wireshark is an open source application and may be downloaded freely from www.wireshark.org. Installation is straightforward. To install on Windows using the executable package:
Note: If the security window pops up, click the "Run" button to allow the installation to take place.
Figure 1: Wireshark setup wizard
At this point, the installer asks if you want to install WinPcap.
Figure 2: WinPcap adding on Wireshark installation
The WinPcap installer will launch during Wireshark installation.
Figure 3: WinPcap installer windows
Figure 4: WinPcap installation completion
Figure 5: Wireshark installation completion
Double click the Wireshark shortcut on the start menu. This will open the Wireshark main screen.
Figure 6: Wireshark main window
X-Lite is a free softphone application and may be downloaded from www.counterpath.com/x-lite.html. Installation is straightforward. To install on Windows using the executable package:
Note: If the security window pops up, click the "Run" button to allow the installation to take place.
Figure 7: Windows XP warning security windows
Figure 8: X-Lite setup wizard
Figure 9: X-Lite End License Agreement
Figure 10: X-Lite Ready to install window
Then, the installation will start; wait a few seconds for the operation to complete. When it has finished, click the "Finish" button to close the X-Lite installer window.
This section of the document describes how to set up a VoIP account using X-Lite. For the purposes of a quality check of the Switzernet voice service, a sample of voice communication will be captured with Wireshark and saved in an audio file.
The X-Lite interface is quite easy to manage, and the account is configured as follows:
Figure 11: X-Lite main window
1- User ID: Enter your phone account number starting with "41"
Ex: 41215500315
2- Domain: Enter the domain name provided in your Switzernet welcome letter
Ex: sip16.youroute.net
3- Password: Enter your "VoIP password" provided in your Switzernet welcome letter
Ex: xxxxxxx
4- Display name: Enter your phone account number to identify your VoIP line
Ex: 41215500315
5- Authorization name: Enter your phone account number
Ex: 41215500315
Figure 12: X-Lite account setting window
Afterwards, the window should look like this.
Figure 13: Well connected X-Lite client
If you reached this step, your VoIP account is ready for calls.
SIP and RTP are the foundation of today's VoIP communication. RTP is responsible for media streaming between parties. SIP is responsible for signalling, routing, locating the called party, and establishing and disconnecting calls.
RTP stands for Real-Time Transport Protocol. As a transport protocol, it uses UDP ports to pass voice data between the caller and the called party. RTP was first designed on the traditional analog telephone communication model, meaning that the caller should know the IP address and port used by the called party. The problem is that, on the present Internet architecture, the source may know neither the destination IP address nor the port number. This makes RTP quite inconvenient when used alone, since the parties have no way to find one another. This is why the SIP protocol was introduced. For further reading, please refer to the RTP RFC, http://www.ietf.org/rfc/rfc1889.txt (since obsoleted by RFC 3550).
SIP stands for Session Initiation Protocol. Its goal is to help RTP by starting the negotiation of the media types and formats between the caller and the called party. In doing so, it conveys the IP address and the port that RTP will use during the conversation. For further reading, please refer to the SIP RFC: http://www.ietf.org/rfc/rfc3261.txt
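To illustrate how the media address and port reach RTP, the sketch below parses a Session Description Protocol (SDP) body of the kind a SIP INVITE usually carries. The sample body, the port number, and the function name are hypothetical and are not taken from the capture discussed in this tutorial.

```python
def parse_sdp_media(sdp_text):
    """Return (ip, port, payload_types) for the first audio stream in an SDP body."""
    ip = None
    port = None
    payload_types = []
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("c=") and ip is None:
            # c=IN IP4 192.168.1.77 -> network type, address type, address
            ip = line[2:].split()[2]
        elif line.startswith("m=audio") and port is None:
            # m=audio 40000 RTP/AVP 0 8 101 -> media, port, transport, formats
            fields = line[2:].split()
            port = int(fields[1])
            payload_types = [int(f) for f in fields[3:]]
    return ip, port, payload_types


# Hypothetical SDP body, for illustration only.
SAMPLE_SDP = """v=0
o=- 1 1 IN IP4 192.168.1.77
s=call
c=IN IP4 192.168.1.77
t=0 0
m=audio 40000 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
"""

print(parse_sdp_media(SAMPLE_SDP))
```

The returned address and port are where the peer is expected to send its RTP packets; the payload type list names the codecs offered.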
When troubleshooting account format problems, NAT issues, authentication, routing, and one-way audio or no-audio issues, capturing and analyzing packets is often helpful.
Wireshark can be used for troubleshooting many other problems appearing in an IP-based network, but we focus on VoIP packets only.
For now, this section is restricted to using Wireshark to sniff SIP packets, without analyzing them. For further reading, follow this link: http://www.linuxjournal.com/article/9398
The picture below shows the connection path used in our example.
Figure 14: System architecture
In our example, it is the Realtek RTL8169/8110 Family Gigabit Ethernet card.
Figure 15: Network card selection
After the Ethernet interface is selected, the screen should look like the screenshot below. Wireshark shows the capture of all packets passing through the selected Ethernet card.
Enter "sip" in the "Filter" area to reduce the capture's output to only SIP packets. Then, click "Apply".
Figure 16: Wireshark all packets capture
Dial the destination phone number on X-Lite (00415500328) and click the "Call" button.
Figure 17: Dial phone number on X-Lite
Wireshark will start showing the captured SIP packets
Figure 18: Wireshark sip packets capture
This example shows an outgoing call attempt which is cancelled after 10 seconds. The exchange shows 3 SIP transactions.
1- The first transaction consists of an unauthenticated INVITE sent from the UA to the server, the server's provisional reply "100 Trying", the server's authentication failure reply "401 Unauthorized", and the phone's "ACK", which completes the SIP transaction. http://www.faqs.org/rfcs/rfc3261.html
2- The second INVITE transaction is authenticated and accepted. The provisional reply "100" is followed by the in-progress reply "180 Ringing".
3- The third transaction takes place within the time span of the second one. It consists of a user-initiated CANCEL and the server's "200" reply (meaning OK).
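The grouping of messages into transactions can be sketched as follows. This is a minimal illustration, assuming that each message has already been reduced to a label, the branch parameter of its top Via header, and the method named in its CSeq header. The branch values are made up, and the message list mirrors only the messages described above.

```python
def group_transactions(messages):
    """Group (label, branch, cseq_method) tuples into SIP transactions.

    A transaction is keyed here by the top Via branch and the CSeq method.
    An ACK sent for a non-2xx final response reuses the INVITE's branch,
    so it is counted as part of that INVITE transaction.
    """
    transactions = {}
    for label, branch, cseq_method in messages:
        method = "INVITE" if cseq_method == "ACK" else cseq_method
        transactions.setdefault((branch, method), []).append(label)
    return transactions


# Hypothetical branch values; the labels follow the call described above.
MESSAGES = [
    ("INVITE", "z9hG4bK-a", "INVITE"),
    ("100 Trying", "z9hG4bK-a", "INVITE"),
    ("401 Unauthorized", "z9hG4bK-a", "INVITE"),
    ("ACK", "z9hG4bK-a", "ACK"),
    ("INVITE (with credentials)", "z9hG4bK-b", "INVITE"),
    ("100 Trying", "z9hG4bK-b", "INVITE"),
    ("180 Ringing", "z9hG4bK-b", "INVITE"),
    ("CANCEL", "z9hG4bK-b", "CANCEL"),
    ("200 OK", "z9hG4bK-b", "CANCEL"),
]

for key, labels in group_transactions(MESSAGES).items():
    print(key, labels)
```

The CANCEL shares the branch of the INVITE it cancels but is still a transaction of its own, because its CSeq method differs.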
To analyze the SIP packets:
Select Statistics → Flow Graph in the menu bar
Figure 19: Wireshark flow diagram selection
Select "OK"
Figure 20: Wireshark Flow diagram options
This is the resulting flow diagram of the communication.
Figure 21: Wireshark flow diagram
This flow diagram gives a good overview of the messages exchanged between the caller and the server. The destination phone does not appear in this graph because there is a server in the middle, an opaque Back-to-Back User Agent (B2BUA), and the packets captured on the server side are missing from our example.
The other end of this communication (the called side) cannot be viewed from the client side. All transactions between the called party and the caller can be observed on the server side with tools such as ngrep, log files, or Wireshark. So, this graph shows the packets exchanged between the originating party and the SIP server.
192.168.1.77 = X-Lite (UA)
91.121.70.119 = SIP server (B2BUA)
The process is divided into 4 parts:
Three way handshake (Authentication process)
1- The client sends an "INVITE" to the server asking to talk with 0041215500328@sip16.youroute.net (the destination, which can be reached through the domain server named sip16.youroute.net; note that the server can further change the domain during the lookup process).
2- The server replies with a status message with the code "100". This acknowledges the INVITE and states that another action has to be executed before calling the destination number (in our case, contacting the authentication server via the RADIUS protocol).
3- The server sends a status message to the caller with the code "401". This means the caller has to send its authentication information. In fact, the first thing the server does here is reject the first call attempt in order to ask the caller to authenticate first. The "401" response contains the "challenge" for the digest authentication. http://tools.ietf.org/html/rfc5090
4- The client replies with an "ACK" request to the "401" message received from the SIP server, which closes the first transaction.
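As a sketch of how a client can answer the "401" challenge, the function below computes a digest response following the HTTP Digest scheme (RFC 2617), which SIP reuses. The realm, password, and nonce are invented for illustration, and the real values exchanged in the capture may differ.

```python
import hashlib


def md5_hex(text):
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the 'response' value of a Digest Authorization header."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    if qop:
        # With qop (e.g. "auth"), the nonce count and client nonce are mixed in.
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical credentials and challenge values.
response = digest_response(
    username="41215500315",
    realm="sip16.youroute.net",      # assumed realm
    password="xxxxxxx",              # placeholder password
    method="INVITE",
    uri="sip:0041215500328@sip16.youroute.net",
    nonce="0123456789abcdef",        # made-up nonce from the 401 challenge
)
print(response)
```

The client places this value in the Authorization header of the second INVITE, so the password itself never travels over the network.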
The example below shows the capture of a normal SIP call in which voice communication is established.
The part circled in red represents the difference from the previous call, where the user cancelled the call.
1- Here you can see the status code "200 OK" sent by the B2BUA to the UA (X-Lite), meaning the called party answered the call.
2- Next, the caller sends an "ACK" for the message received from the B2BUA.
3- The B2BUA sends a "BYE" request to the caller. The purpose is to notify the caller that the called party has hung up the phone.
4- The caller then replies with the status code "200 OK" to close the communication properly.
The picture below is the flow diagram of a normal SIP transaction.
RTP is in charge of passing data from one end to another.
Enter "rtp" in the "Filter" area to reduce the capture's output to only RTP packets. Then click "Apply".
Figure 22: Wireshark rtp packets capture
This screenshot shows the captured RTP packets and the codec used during the transmission (G.711 in this example). Compared to SIP, there are many more packets in the capture output. This is because the voice signal is sampled and carried in RTP packets, not in SIP (as some may believe). SIP is only used to manage the communication (taking care of session establishment, call keep-alive messages, and codec changes). G.711 is carried in RTP using the PCMU and PCMA payload types defined in RFC 3551; the payload format for its wideband extension, G.711.1, is specified in http://tools.ietf.org/html/rfc5391
To analyze the RTP packets:
Select Telephony → RTP → Stream Analysis in the menu bar
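To make the packet volume concrete, the sketch below decodes the 12-byte fixed RTP header and works out the packet rate for G.711. The sample packet is synthetic, and the 20 ms packetization interval is an assumption (a common default), not a value read from the capture.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header (RFC 3550) into a dictionary."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,            # top two bits, 2 for RTP
        "marker": bool(second >> 7),      # top bit of the second byte
        "payload_type": second & 0x7F,    # 0 = PCMU (G.711 mu-law)
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Synthetic packet: version 2, payload type 0, followed by 160 payload bytes.
sample_packet = struct.pack("!BBHII", 0x80, 0, 1234, 160000, 0xDEADBEEF) + bytes(160)
header = parse_rtp_header(sample_packet)
print(header)

SAMPLE_RATE_HZ = 8000   # G.711 sampling rate, one byte per sample
PACKET_MS = 20          # assumed packetization interval
packets_per_second = 1000 // PACKET_MS
payload_bytes = SAMPLE_RATE_HZ * PACKET_MS // 1000
print(packets_per_second, "packets/s in each direction,", payload_bytes, "bytes each")
```

Under these assumptions, a one-minute call produces 3000 RTP packets in each direction, against a handful of SIP messages.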
Figure 23: Wireshark rtp packet analysis selection
The following new window pops up
Figure 24: Wireshark stream analysis
In this window, you can see the upstream and downstream communication by switching between the "Forward Direction" and "Reverse Direction" tabs. The forward direction corresponds to the audio transmitted by our phone and the reverse direction to the audio received by our phone. The window also shows the number of lost RTP packets and the total number of packets transmitted. RTP sequence numbers and the RTCP counters are described in http://www.ietf.org/rfc/rfc3550.txt
The bottom area shows a row of buttons that give access to different Wireshark functions. The next section looks at the "Player" button in particular.
Wireshark can reassemble the voice data contained in RTP packets and regenerate the voice communication in one or both directions.
To do this, in the RTP Stream Analysis window above (Figure 24: Wireshark stream analysis), click the "Player" button. A new window pops up.
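A lost-packet count of this kind can be estimated from the RTP sequence numbers alone. The function below is a simplified sketch, not Wireshark's actual algorithm: it handles the 16-bit counter wrapping around, but assumes that packets are not reordered across a wraparound.

```python
def count_lost(sequence_numbers):
    """Return (expected, lost) for 16-bit RTP sequence numbers in arrival order."""
    if not sequence_numbers:
        return 0, 0
    cycles = 0
    previous = sequence_numbers[0]
    extended = [previous]
    for seq in sequence_numbers[1:]:
        # A large backwards jump means the 16-bit counter wrapped around.
        if seq < previous and previous - seq > 0x8000:
            cycles += 1
        extended.append(cycles * 0x10000 + seq)
        previous = seq
    expected = max(extended) - min(extended) + 1
    received = len(set(extended))   # duplicates are counted once
    return expected, expected - received


# Packets 65535 and 2 never arrived in this made-up sequence.
print(count_lost([65533, 65534, 0, 1, 3]))
```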
Click on the "Decode" button within RTP Player window
Figure 25: VoIP-RTP player (Before decode)
Select the voice communication you want to hear by clicking on the checkbox next to it
Click on the "Play" button
Figure 26: VoIP-RTP Player (After decode)
After you click the "Play" button, and provided your speakers are connected to the computer, you will hear the communication that took place.
The payload of the RTP packets can be saved into an audio file and played with another audio player such as Windows Media Player or RealPlayer.
To do this, in the RTP Stream Analysis window above, click the "Save Payload…" button. A new window pops up.
Figure 27: Saving RTP payload into .au file
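For readers curious about what such a file contains, the sketch below builds a minimal .au file in memory from a list of RTP payloads. It assumes the payloads are G.711 mu-law (PCMU) samples at 8000 Hz; the function name and the silent test payloads are illustrative, and this is not a description of how Wireshark itself writes the file.

```python
import struct

AU_MAGIC = 0x2E736E64       # the four ASCII bytes ".snd"
AU_HEADER_SIZE = 24         # minimal header, no annotation field
AU_ENCODING_MULAW = 1       # 8-bit G.711 mu-law


def build_au(payloads, sample_rate=8000, channels=1):
    """Return the bytes of an .au file holding the concatenated payloads."""
    data = b"".join(payloads)
    header = struct.pack(
        ">IIIIII",
        AU_MAGIC,
        AU_HEADER_SIZE,
        len(data),
        AU_ENCODING_MULAW,
        sample_rate,
        channels,
    )
    return header + data


# Two hypothetical 20 ms payloads of mu-law silence (0xFF).
au_bytes = build_au([b"\xff" * 160, b"\xff" * 160])
print(len(au_bytes), "bytes")
```

Writing `au_bytes` to a file opened in binary mode gives a file that common audio players can open.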
In VoIP, there are two ways to transmit DTMF digitally: in the RTP stream and via SIP packets. Our example shows the transmission of DTMF in the RTP stream.
The sample is a call I had with Fust customer support to ask whether I could buy a laptop in their Crissier shop. Click here to listen to the voice sample
Often, when you call a company helpdesk line, there is an IVR system asking you to press a number on the dial pad to select a specific service http://en.wikipedia.org/wiki/Interactive_voice_response. On an analogue line, those numbers are modulated into dual-frequency sounds and transmitted as audio signals. The use of voice codecs in VoIP seriously affects these audio signals: after encoding and decoding, they are often no longer recognized as DTMF signals. The solution is to recognize DTMF at the emitting end, transmit the digits digitally, and regenerate the tones at the receiving end.
Select Telephony → VoIP Calls in the menu bar
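The dual-frequency encoding can be illustrated with the standard DTMF keypad layout, in which each key combines one low (row) frequency with one high (column) frequency. The function names, the tone duration, and the amplitude scaling below are choices made for this sketch.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COLUMN_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for one keypad key."""
    if len(key) != 1:
        raise ValueError("expected a single key")
    for row_index, row in enumerate(KEYPAD):
        column_index = row.find(key)
        if column_index != -1:
            return ROW_HZ[row_index], COLUMN_HZ[column_index]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_s=0.1, sample_rate=8000):
    """Return the tone as floats in [-1, 1]: the sum of two sine waves."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / sample_rate)
               + math.sin(2 * math.pi * high * n / sample_rate))
        for n in range(count)
    ]


print(dtmf_frequencies("5"))
print(len(dtmf_samples("5")), "samples")
```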
Figure 28: VoIP calls selection
Select the exact phone call to trace and click the “Flow” button
Figure 29: VoIP Calls analysis
There we go. As shown in the graph, X-Lite transmits the DTMF signals digitally within the RTP stream. Most UAs can be configured to transmit DTMF via one or a combination of the following methods: RTP, SIP, and audio.
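DTMF carried digitally inside RTP is commonly encoded as "telephone-event" payloads (RFC 2833, updated by RFC 4733). Assuming that format is in use, the sketch below decodes the four-byte event payload; the sample bytes are made up and do not come from the capture.

```python
import struct

EVENT_KEYS = "0123456789*#ABCD"   # event codes 0 to 15


def parse_telephone_event(payload):
    """Decode a 4-byte telephone-event payload into a dictionary."""
    if len(payload) < 4:
        raise ValueError("a telephone-event payload is 4 bytes long")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "key": EVENT_KEYS[event] if event < len(EVENT_KEYS) else None,
        "end": bool(flags & 0x80),     # E bit: last packet of this event
        "volume": flags & 0x3F,        # power level, expressed in -dBm0
        "duration": duration,          # in RTP timestamp units
    }


# Made-up payload: key "5", end of event, volume 10, duration 800 units
# (100 ms at an 8000 Hz clock).
sample_event = struct.pack("!BBH", 5, 0x80 | 10, 800)
print(parse_telephone_event(sample_event))
```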
Figure 30: VoIP Graph-Analysis-DTMF signal observation
We can also filter on RTP and then open its Graph Analysis. The window should look like this:
Figure 31: RTP Graph-Analysis-DTMF signal observation
In this document, we have shown how to install Wireshark and X-Lite, how to capture and understand a basic SIP exchange, the difference between SIP and RTP, how to capture and save voice, and how to capture DTMF signals.
B2BUA:
A Back-to-Back User Agent is a SIP call-controlling component logically positioned between the IMS (IP Multimedia Subsystem) and external networks. It handles all SIP (Session Initiation Protocol) signalling, including session attempts, subscriptions, instant messaging, etc., as well as signalling where the flows may be forwarded without B2BUA intervention.
Downstream:
The speed at which information is received from the Internet. The speed is sometimes shown as X /Y where X is the downstream speed and Y is the upstream speed.
DTMF:
Dual-Tone Multi-Frequency is a signaling method developed by Bell Labs for sending telephone dialing information over the same analog, voice-quality phone lines that carry voice. Each digit is encoded as the sum of two sine-wave bursts of different frequencies. The two-tone method was chosen because it can be reliably distinguished from voice, and normal phone conversations are highly unlikely to falsely trigger the DTMF receiver. DTMF was the basis for "TouchTone" (a former trademark of AT&T), the pushbutton system that replaced mechanical rotary dial telephones.
IP:
Internet Protocol defines the way data packets, also called datagrams, should be moved between the source and the destination. More technically, it can be defined as the network layer protocol in the TCP/IP communications protocol suite.
Packet:
A packet is a unit of data transmitted over the network in a packet-switched system. It consists of a header that stores the destination address, a data area which carries the information that is being transmitted, and a trailer which contains information to prevent errors during transmission.
Payload:
Information contained in a packet
Protocol:
It is a convention or standard that defines the procedures to be adopted regarding the transmission of data between two computing end points. These procedures include the way the sending device should sign off a message or how the receiving device should indicate the receipt of a message. Similarly, the protocols also lay down guidelines for error checking, data compression, and other relevant operational details.
RTP:
Real-Time Transport Protocol. An Internet protocol that typically runs over UDP. RTP is designed to provide end-to-end network transport functions for applications transmitting real-time data, such as audio, video, or simulation data, over multicast or unicast network services. RTP provides services such as payload type identification, sequence numbering, time-stamping, and delivery monitoring to real-time applications.
SIP:
SIP, which is the acronym of Session Initiation Protocol, is an IP telephony signaling protocol. It is primarily used for voice over IP (VoIP) calls, though with some extensions it can also be used for instant messaging. It is less complex than H.323, the other IP telephony protocol.
Softphone:
This is a software application that is installed on the user’s PC. It uses Voice over IP technology to route voice calls over the net and provides several value-added features, such as call forwarding, conference calling, and integration with applications such as Outlook for automatic dialing. The audio is provided through a microphone and speakers plugged into the sound card. The only limitation of a softphone is that the phone call has to be made through a PC. Many softphones are free VoIP software downloads.
UAC:
User Agent Client is the client application that initiates the SIP request.
UAS:
User Agent Server is the server application that contacts the user when a SIP request is received, and then returns a response on behalf of the user. The response accepts, rejects, or redirects the request.
Upstream:
This refers to sending of data from a client machine across the Internet. With cable modems and ADSL, upstream speeds are slower than downstream speeds.
VoIP:
Voice over IP is the technology that is used to transmit voice over the Internet. The voice is first converted into digital data, which is then organized into small packets. These packets are stamped with the destination IP address and routed over the Internet. At the receiving end, the digital data is reconverted into voice and fed into the user’s phone.