  • Technische Universität Ilmenau

    Faculty of Electrical Engineering and Information Technology

    Diploma thesis (Diplomarbeit)

    Further development of VoIP softphone based on 'Microsoft RTC Client API'

    Submitted by: Carla García Sánchez

    Submitted on: 15. 11. 2006

    Date of birth:

    Degree programme: Electrical Engineering and Information Technology

    Prepared in the department: Communication Networks

    Faculty of Electrical Engineering and Information Technology

    Responsible professor: Prof. Dr. rer. nat. habil. Jochen Seitz

    Academic supervisor: Dipl.-Ing. Yevgeniy Yeryomin

  • Acknowledgements

    Many people have helped me in one way or another during the course of this project.

    Through these lines, I would like to express to them my most sincere gratitude.

    To my professors, thank you for guiding and advising me at every moment. Every suggestion has helped to improve this work. I appreciate all the support from the personnel of the department of Communication Networks.

    To my family and friends, thank you for your unconditional support and for encouraging me in the hardest and most stressful moments. I appreciate that you have been there for me and trusted me. I am especially grateful to my roommates and close friends in Ilmenau, who have shared everyday life with me these last months.

    Finally, I would like to thank TU - Ilmenau for allowing me to develop this project.

    Once again, thank you everyone.

  • Abstract

    VoIP has become a widespread technology because it enhances real-time communication, making it easier and more natural regardless of where people are located. Voice over Internet Protocol (VoIP), as its name suggests, is a technology that enables voice communication over IP networks.

    This project continues the development of a SIP-based VoIP softphone that was implemented as part of a PhD thesis in the department of Communication Networks. One of the aims of the project is to study the availability of this technology in a mobile environment and the adaptation of the softphone to mobile devices.

    A softphone is a software application used to place telephone calls from a computer to other softphones or to conventional telephones by means of VoIP technology. In addition, it supports extra functionalities that offer the end user services that would not be possible with the current telephone network, for example locating users regardless of where they are connected, or multi-party videoconference calls.

    Before beginning the development of the software application, it is essential to understand the operation and structure of softphones based on the Session Initiation Protocol (SIP), the protocol responsible for establishing the VoIP session between users. For that purpose, the first part of this project consists of a survey of VoIP technology and the protocols related to the VoIP environment, such as the Session Initiation Protocol, the Session Description Protocol (SDP) and the Real-time Transport Protocol (RTP).

    Nowadays, there are many types of softphones, running on diverse operating systems and programmed in different languages. Although they must follow the same basic structure, they differ in the extra features they provide and in the platform on which they are built. The application in this project uses the Microsoft RTC Client API, which supplies the libraries and interfaces required to implement the functionalities of the VoIP protocols mentioned above.

    Some of the new features that will be added to this software application are:

    • Management of the contact list: It allows users to store information about their contacts and to access it easily. Furthermore, it informs users about the presence status of their buddies.

    • Videoconference call: Multimedia calls with both audio and video make communication between people more lifelike.

    Although only a few functionalities are going to be developed, the capabilities of the softphone could be extended with new ones according to users' future needs and communication requirements.

  • Contents

    1 VoIP Technology based on SIP
      1.1 Introduction
      1.2 VoIP Features
      1.3 Advantages
      1.4 Types of VoIP calls
      1.5 Operation
      1.6 VoIP protocols
        1.6.1 Session Initiation Protocol (SIP)
          1.6.1.1 Introduction
          1.6.1.2 Protocol Design
          1.6.1.3 SIP Clients and Servers
          1.6.1.4 SIP Messages
        1.6.2 Session Description Protocol (SDP)
          1.6.2.1 Introduction
          1.6.2.2 Operation
        1.6.3 Real-time Transport Protocol (RTP)
          1.6.3.1 Real-time Transport Control Protocol (RTCP)
      1.7 VoIP Clients
        1.7.1 VoIP Clients running on different OS
        1.7.2 VoIP Clients for mobile devices
        1.7.3 Structure and operation of softphones
          1.7.3.1 Registration procedure
          1.7.3.2 Multimedia session establishment
        1.7.4 Softphones for Windows Mobile OS
        1.7.5 OS for mobile devices

    2 Microsoft RTC Client API
      2.1 Introduction
      2.2 Object Model Overview
      2.3 Architecture
      2.4 .NET Platform
        2.4.1 Introduction
        2.4.2 Operation
        2.4.3 Advantages

    3 Development of VoIP softphone for Windows 2000/XP
      3.1 Understanding the code source
      3.2 New functionalities
        3.2.1 Volume bar for microphone and speakers
        3.2.2 Sending DTMF signals
        3.2.3 Addition of videoconference
        3.2.4 Contact List
        3.2.5 Encryption of media
      3.3 Testing the program and results
        3.3.1 Volume bar for microphone and speakers
        3.3.2 Sending DTMF signals
        3.3.3 Addition of videoconference
        3.3.4 Contact List
        3.3.5 Encryption of media
      3.4 Software tools

    4 Adaptation of the VoIP softphone for mobile devices

    5 UML Structure
      5.1 Class diagram
      5.2 Use case diagram
      5.3 Sequence diagram
      5.4 State diagram
        5.4.1 Buddy state diagram
        5.4.2 Watcher state diagram
        5.4.3 Session state diagram
        5.4.4 Client state diagram

    6 Getting Started
      6.1 Software requirements
      6.2 Getting an account
      6.3 Description of Graphical User Interface

    A UML Diagrams
      A.1 Class diagram
      A.2 Use case diagram
      A.3 Sequence diagram
      A.4 Buddy state diagram
      A.5 Watcher state diagram
      A.6 Session state diagram
      A.7 Client state diagram

    Bibliography

    List of Figures

    List of Tables

    List of Abbreviations and Symbols

    Thesis of Diplomarbeit

    Erklärung (Declaration)

  • 1 VoIP Technology based on SIP

    1.1 Introduction

    VoIP (Voice over Internet Protocol) is simply the transmission of voice traffic over IP-based networks. It is also called IP telephony, Internet telephony, broadband telephony or digital phone. Companies providing VoIP service are usually known as VoIP providers, and the protocols used to route voice signals over the IP network are known as VoIP protocols. Although the Internet Protocol (IP) was originally designed for data networking, its success in becoming a world standard for data has contributed to its use for voice networking.

    VoIP uses a broadband Internet connection for routing telephone calls, as opposed to conventional switching and fibre optic alternatives. This lowers the cost of communication for consumers. Perhaps the most interesting point of the technology for the user is that the existing infrastructure does not need to be reconfigured. The only requirement is to combine the Internet connection and a conventional phone into a single service with software and hardware support.

    1.2 VoIP Features

    The biggest advantage of VoIP is that customers can make and receive calls from anywhere in the world where a broadband Internet connection is available, without changing their phone number. This is known as mobility. A person does not need multiple numbers (office, home, mobile, and so on), because calls can be automatically routed to the VoIP phone where the user is registered. Customers can take their IP phones with them on national and international trips and still access what is essentially their domestic phone line.

    There are also softphones, which are software applications that bring VoIP services to a desktop or laptop computer. Some even present an interface that looks like a telephone, with which VoIP calls can be placed to anybody around the world through a standard broadband connection.

    Most VoIP services include caller ID, call waiting, call transfer, repeat dialling and multi-party conference calls. For additional features such as call filtering, call forwarding or sending calls directly to voice mail, the service provider may charge an additional fee. Most VoIP services also allow users to check their voice mail over the web, or to have messages attached to an e-mail that is sent to their PDA or PC. The facilities and components provided by VoIP phone system suppliers and service operators may vary significantly, because not all of them support the same functionalities.

    1.3 Advantages

    Since calls can be placed across the Internet, using the Internet connection for both data traffic and voice calls allows consumers to save money. Cost reduction can therefore be the major reason to switch to VoIP for telephone service: for instance, the cost of a call is independent of its destination, so there is no extra charge for long distances.

    VoIP is able to provide some additional features which make this technology even more attractive and which may be difficult to obtain from conventional telecommunication companies, such as:

    • Incoming phone calls can be automatically routed to your VoIP phone, regardless of where you are connected to the network.

    • Call center agents using VoIP phones can work from anywhere with a sufficiently fast and stable Internet connection.

    • Other features: multi-party conference calls, call forwarding, automatic redial, caller ID, and so forth.

    1.4 Types of VoIP calls

    There are three ways of connecting to a VoIP network:

    • Using a VoIP telephone.

    • Using a conventional telephone with a VoIP adapter.

    • Using a computer with speakers and a microphone.

    VoIP telephone calls are routed to other VoIP devices or to normal telephones on the PSTN (Public Switched Telephone Network). Depending on the devices at each end, there are four types of VoIP calls:

    • PC-to-Phone call: from a VoIP device to a conventional telephone.

    • PC-to-PC call: from a VoIP device to another VoIP device.

    • Phone-to-PC call: from a conventional telephone to a VoIP device.

    • Phone-to-Phone call: from a conventional phone to another conventional phone.

    Note that the VoIP device is not necessarily a PC.

    1.5 Operation

    In the most common VoIP setup, the end user establishes a high-speed broadband connection using a router and a VoIP gateway. Instead of using a standard telephone line, the router sends telephone calls over the Internet connection. The VoIP gateway, placed close to the Internet connection, is responsible for connecting the VoIP network with the PSTN (Public Switched Telephone Network). All transmitted data (SIP signalling, audio/video data and so on) is divided into smaller pieces called packets before being sent over the Internet. These packets are sent to their final destination, and the instructions for reassembling them into an understandable form are embedded in them. For a call to a conventional telephone, the packets then pass through a VoIP gateway, where they are converted back into the original format for the PSTN, and the call is routed to the number the caller has dialled, blending old and new technology in a seamless way.

    1.6 VoIP protocols

    This section describes the main protocols required to implement a SIP-based VoIP softphone.

    1.6.1 Session Initiation Protocol (SIP)

    1.6.1.1 Introduction

    Session Initiation Protocol (SIP) is an application-layer control protocol that can establish, modify, and terminate multimedia sessions (conferences) such as Internet telephony calls. These sessions can include one or more participants, and new participants can be invited and media streams added or removed, because SIP is a flexible and transparent protocol that allows features to be added to existing sessions.

    The main signalling functions of the protocol are detailed below:

    • Location of the end user, to guarantee communication regardless of where he or she is.

    • Determination of the availability of the end user to establish a session.

    • Determination of the media capabilities, allowing media negotiation between the participants involved in the communication.

    • Negotiation of the features supported by the end users.

    • Modification of the parameters or features of an already established session.

    SIP does not provide services by itself; rather, it offers signalling capabilities on which different services can be built. Consequently, SIP must work in concert with other protocols in order to meet the requirements of the users. In spite of that, the functionality and operation of SIP are completely independent of those other protocols, because SIP is only involved in the signalling portion of a communication session.

    One obvious example is a VoIP call, where SIP is responsible for managing the session, the Real-time Transport Protocol (RTP) for delivering real-time data, and the Session Description Protocol (SDP) for describing the multimedia session.

    1.6.1.2 Protocol Design

    SIP is a peer-to-peer protocol. This means that SIP capabilities are implemented in the communicating endpoints, not in the network.

    As explained previously, SIP is an application-layer protocol in the TCP/IP model. The protocol structure can be divided into four logical layers:

    • Lowest layer: it is responsible for the syntax and encoding of SIP messages.

    • Second layer (Transport layer): it defines how requests are sent and responses are received on the client side, and how requests are received and responses are sent on the server side, over the network. There is a transport layer in every SIP element.

    • Third layer (Transaction process layer): it matches responses to the requests that have been transmitted through the transport layer, and also handles retransmissions and timeouts.

    • Upper layer (Transaction user): all SIP elements except the stateless proxy are transaction users. This layer can be regarded as the one that uses and completes the work of the transaction process layer.

    Figure 1.1: A typical SIP network with gateways

    1.6.1.3 SIP Clients and Servers

    There are five SIP entities, whose behaviour is detailed as follows.

    • User Agent Client (UAC): builds SIP requests and sends them to the UAS.

    • User Agent Server (UAS): receives and handles SIP requests from the UAC and prepares SIP responses.

    • Stateless Proxy Server: receives requests from the transport layer and routes them to the next hop using the message content, but without storing any information related to the request. For that reason, it is unable to distinguish between an original message and a retransmission. Stateless proxies do not run any SIP timers and cannot generate provisional responses like 100 Trying or 180 Ringing.

    • Stateful Proxy Server: analyses the received requests more deeply than the stateless proxy. It validates the request and its recipient, routes the message and stores state information. Stateful proxies use timers to determine whether a message must be retransmitted when no response is received. Furthermore, they can demand user agent authentication.

    • Registrar Server: a server that receives and handles REGISTER requests. The user information contained in these messages is validated (user agent authentication is required) and used to determine the user's location in the network. User agents send this type of request periodically in order to update their location information.
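    The registrar behaviour described in the last item can be sketched as a small table of location bindings. This is a minimal illustration, not the implementation used in this project: the class name LocationService, its methods, the example addresses and the handling of an expiry interval (including the convention that an interval of 0 removes the binding) are assumptions made for the example.

```python
import time


class LocationService:
    """Hypothetical in-memory store of registrar bindings (illustration only)."""

    def __init__(self):
        # address-of-record -> (contact URI, absolute expiry time in seconds)
        self._bindings = {}

    def register(self, aor, contact, expires, now=None):
        """Store or refresh a binding; an expiry interval of 0 removes it."""
        now = time.time() if now is None else now
        if expires == 0:
            self._bindings.pop(aor, None)
        else:
            self._bindings[aor] = (contact, now + expires)

    def lookup(self, aor, now=None):
        """Return the current contact URI, or None if unknown or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(aor)
        if binding is None or binding[1] <= now:
            return None
        return binding[0]
```

    A user agent that re-registers before its binding expires stays reachable, which is why REGISTER requests are sent periodically.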

    Figure 1.2: SIP clients and servers


    1.6.1.4 SIP Messages

    Two kinds of SIP messages are defined:

    • Request: it is sent from the client to the server.

    • Response: it is sent from the server to the client.

    They also differ in their syntax and in the fields that form the message.

    The SIP specification defines six main requests (also called methods):

    • INVITE: Invites a user to take part in a session.

    • ACK: Acknowledges the reception of a final response to an INVITE request.

    • BYE: Ends an existing session.

    • CANCEL: Cancels a pending transaction.

    • OPTIONS: Asks for information about a server's capabilities.

    • REGISTER: Informs about the user's current location.

    The following figures show some examples of SIP message exchanges:

    Figure 1.3: Example of SIP INVITE (messages exchanged between user1, ProxyServer and user2: INVITE, 180 RINGING, OK, ACK, BYE, OK)


    Figure 1.4: Example of SIP REGISTER

    Supplementary requests have been defined in SIP extensions, such as Session Initiation Protocol (SIP)-Specific Event Notification (RFC 3265), which adds the SUBSCRIBE and NOTIFY methods. This document describes how UACs can subscribe to specific events, such as the presence of their contacts, and how they are notified of these events.

    Figure 1.5: Example of a SIP extension: SUBSCRIBE - NOTIFY

    Each response message has a status code which indicates the outcome of the request. According to the first digit of the status code, SIP responses are classified into six groups or families:

    1xx : Provisional - Informs about the status of a received request that is still being processed.

    2xx : Success - Indicates that a request has been successfully processed.

    3xx : Redirection - The request cannot be handled as sent. The client must take further action, such as sending the request to a different address.

    4xx : Client Error - The request has not succeeded because of a client error. Further action needs to be taken according to the response, such as modifying the original request.

    5xx : Server Error - The request has not succeeded because of a server error, and the server is not able to process it.

    6xx : Global Failure - The request cannot be processed. The client should not retry it.
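    The classification by first digit can be expressed in a few lines of code. The following sketch is only an illustration; the function names classify_status and is_final are chosen for this example and do not come from the RTC Client API or any other library.

```python
# Map the first digit of a SIP status code to its response class.
SIP_RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def classify_status(code: int) -> str:
    """Return the response class of a three-digit SIP status code."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a valid SIP status code: {code}")
    return SIP_RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """Provisional (1xx) responses are not final; all other classes are."""
    classify_status(code)  # raises ValueError for codes outside 100..699
    return code >= 200
```

    For example, classify_status(180) yields "Provisional" for a 180 Ringing response, while a 200 OK response falls into the "Success" class and ends the transaction.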

    The most important fields of a SIP message are:

    • Request-URI: It normally contains the SIP URI of the To field (except in a REGISTER request, where it refers to the domain in which the registrar server is located).

    • Via: It indicates the transport used for the transmission of the message and the location to which the response must be sent. There can be several Via fields, one added at each hop, so that the response can be routed back along the same path. This field must also contain a branch parameter, which identifies the transaction and has the same value at both the UAC and the server.

    • To: It contains the SIP address of the request's recipient. This address is a SIP URI.

    • From: It contains the SIP address of the user who has sent the request (its value is the same as that of the To field only in a REGISTER request).

    • Call-ID: It is a unique identifier for each call that allows the server to detect delayed messages that have arrived out of order.

    • CSeq: It contains a sequence number and the method name. The sequence number is incremented by one for each new request sent by the same user, which allows lost messages to be detected and the order of messages to be maintained.

    • Contact: It contains a SIP URI of the user's current location.

    • Max-Forwards: It is an integer used to limit the number of hops of a request on the way to its destination. Its initial value is usually 70, to guarantee reception, and it is decremented by one at each hop. If it reaches 0 before the request arrives at its destination, the request is rejected.

    • Content-Type: This field describes the media type of the message body sent to the recipient. It is only present if the body is not empty. For a session description, it is application/sdp, indicating that the SIP message includes an SDP packet with the session description.

    • Content-Length: It contains the size of the message body sent to the recipient, as a decimal number of octets.
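    To make the fields above concrete, the sketch below parses a hand-written INVITE request into its request line, header fields and body. It is a simplified illustration under stated assumptions: the addresses, tag, branch and Call-ID values are invented for the example, and the parser ignores details of the real grammar such as case-insensitive header names, compact header forms and folded lines.

```python
# A hand-written example request; all addresses and identifiers are hypothetical.
EXAMPLE_INVITE = (
    "INVITE sip:user2@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc1.example.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: <sip:user2@example.com>\r\n"
    "From: <sip:user1@example.com>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc1.example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Contact: <sip:user1@pc1.example.com>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_message(text):
    """Split a SIP message into (start line, header dict, body)."""
    head, _, body = text.partition("\r\n\r\n")  # an empty line ends the headers
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # A header such as Via may occur several times, so keep a list.
        headers.setdefault(name.strip(), []).append(value.strip())
    return lines[0], headers, body


def parse_request_line(start_line):
    """Return (method, Request-URI, SIP version) of a request line."""
    method, request_uri, version = start_line.split(" ")
    return method, request_uri, version


start_line, headers, body = parse_sip_message(EXAMPLE_INVITE)
method, request_uri, version = parse_request_line(start_line)
```

    In this example the Request-URI equals the URI in the To field, as described above, and Content-Length is 0 because the request carries no body.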

    1.6.2 Session Description Protocol (SDP)

    1.6.2.1 Introduction

    In order to establish videoconferences, VoIP calls or other types of session, it is necessary to communicate media capabilities, transport addresses and other session description information to the end users.

    SDP defines a standard representation that provides this information to the participants in a form they can interpret, so that they can decide whether to participate in a session.

    SDP is simply a format for session description. It does not provide any transport method or negotiation mechanism of its own, and it can work in conjunction with different transport protocols as suitable. One example is SIP, which carries SDP in its messages.

    An SDP session description must include the following information: IP address, port number, media type and media encoding format. Moreover, SDP contains extra information such as the subject of the session, start and stop times, or contact information about the session.

    These are the most important fields in an SDP packet:

    • Session description

    – v: It shows the version of the Session Description Protocol.

    – o: It contains the originator of the session (username and user address) and a session identifier.

    – s: It is the session name.

    – c: It contains connection information, including network type, address type and connection address.

    – b: It specifies the proposed bandwidth to be used by the session or media.

    • Time description

    – t: It indicates the start and stop times of a session. If these values are zero, the session is considered permanent.

    • Media description (repeated for each type of media)

    – m: It contains the media type, port, transport protocol and media formats.

    – k: If the packet is transported over a secure and trusted channel, this field is used to convey encryption keys.

    – a: It defines media attributes. Normally, there are many lines of this kind.
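    The following sketch shows how these fields appear in practice. It parses a small hand-written session description into session-level fields and media descriptions. The user name, addresses and port are invented for the example, and the parser is deliberately minimal: it does not validate field order or contents.

```python
# A hand-written example description; names, addresses and ports are hypothetical.
EXAMPLE_SDP = """\
v=0
o=user1 2890844526 2890844526 IN IP4 192.0.2.10
s=VoIP call
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""


def parse_sdp(text):
    """Return (session-level fields, list of media descriptions)."""
    session = {}
    media = []
    current = session  # lines before the first "m=" belong to the session
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        if key == "m":
            kind, port, proto, *formats = value.split()
            current = {"type": kind, "port": int(port),
                       "proto": proto, "formats": formats}
            media.append(current)
        else:
            # Fields such as "a" may repeat, so collect them in lists.
            current.setdefault(key, []).append(value)
    return session, media


session, media = parse_sdp(EXAMPLE_SDP)
```

    Here the single media description offers audio on port 49170 with two formats, and the two a= lines that follow the m= line are stored with that media description rather than with the session.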

    1.6.2.2 Operation

    This section describes the negotiation method by which two participants agree on the parameters needed to establish a media session using SIP. This method is known as the offer/answer model, because one participant offers a description of his/her available media streams and the other participant answers the offer. Both offer and answer have to be valid SDP messages, following the recommendations in RFC 4566.

    The offer must contain all the media streams the offerer wants to use, including the IP addresses and ports on which to receive them. For each media stream, the RTP payload type and the codecs have to be specified. If the offer contains multiple formats for one media stream, any of them can be used during the session, and they have to be listed in order of preference. The other participant should use the format highest in the list, if possible.

    The answer must contain a corresponding media stream for each stream in the offer, indicating the IP addresses and ports on which to receive them. Besides, it must state which media streams and codecs are supported. If there are no media formats in common for a single media stream, that stream must be rejected by setting its port to zero. If there are no media formats in common for any media stream, the whole media session must be rejected.

    When the participant who sent the offer receives the answer, he/she must identify the accepted streams and formats and can start sending and receiving media.

    Since SIP allows the parameters of an established session to be modified, either participant can generate a new offer at any time in order to update the session with a new negotiation.
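    The answering rules above can be sketched as follows. This is a simplified illustration, not a complete offer/answer implementation: streams are represented as plain dictionaries, and the function names and the supported and local_ports parameters are assumptions made for the example.

```python
def build_answer(offer, supported, local_ports):
    """Build one answer stream per offered stream.

    offer:       list of {"type", "port", "formats"} dicts, with formats
                 listed in the offerer's order of preference
    supported:   media type -> set of formats the answerer supports
    local_ports: media type -> port on which the answerer receives media
    """
    answer = []
    for stream in offer:
        kind = stream["type"]
        # Keep the offerer's preference order among the common formats.
        common = [f for f in stream["formats"] if f in supported.get(kind, ())]
        if common:
            answer.append({"type": kind, "port": local_ports[kind],
                           "formats": common})
        else:
            # No format in common: reject this stream with port zero.
            answer.append({"type": kind, "port": 0,
                           "formats": list(stream["formats"])})
    return answer


def session_accepted(answer):
    """The session is rejected if every stream in the answer was rejected."""
    return any(stream["port"] != 0 for stream in answer)


offer = [
    {"type": "audio", "port": 49170, "formats": ["0", "8"]},
    {"type": "video", "port": 51372, "formats": ["34"]},
]
answer = build_answer(offer, supported={"audio": {"0", "8"}},
                      local_ports={"audio": 40000})
```

    In this example the answerer supports only audio, so the audio stream is accepted on its local port while the video stream is rejected with port zero; the session as a whole is still accepted.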


    1.6.3 Real-time Transport Protocol (RTP)

    RTP is an application-layer protocol which defines a standardized packet format for delivering data with real-time characteristics, such as audio and video, over the Internet. The services provided by RTP include payload type identification, sequence numbering, timestamping, and delivery monitoring. Although RTP does not by itself guarantee quality of service or timely delivery, it includes functionality for detecting some of the problems caused by transmission over an unreliable IP network, such as packet loss, variable transport delay, out-of-sequence packet arrival or asymmetric routing.

    As with SDP, RTP is not itself responsible for packet delivery; it relies on a transport protocol such as UDP (User Datagram Protocol) or TCP (Transmission Control Protocol) for this functionality. RTP packets cannot be transmitted over the network on their own. They are usually encapsulated in UDP packets.

    The payload field of the RTP packet contains the real-time data, while the information about it, such as the source, size and format, is transported in the header fields.

    The complete header structure of a RTP packet is detailed below:

    • Version (V): This field identifies the version of RTP.

    • Padding (P): If the padding bit is set, the packet contains one or more additional
      padding octets at the end which are not part of the payload. Padding may be needed
      by some encryption algorithms. Otherwise, padding should only be applied, if it is
      needed, to the last packet.

    • Extension (X): If the extension bit is set, the fixed header is followed by exactly
      one header extension. This extension mechanism allows individual implementations to
      experiment with new payload-format-independent functions that require additional
      information to be carried in the RTP data packet header. In any other case, it may
      be ignored.

    • CSRC count (CC): It contains the number of contributing source (CSRC) identifiers
      that follow the fixed header.

    • Marker (M): It is used to carry profile-specific information in some applications.

    • Payload type (PT): It identifies the format of the RTP payload and determines how
      the application interprets it.

    • Sequence number: It is used to detect lost or out-of-sequence packets and to restore
      the original packet order. Its initial value is selected randomly for the first
      packet of a stream, and then its value is incremented by one for each RTP packet
      sent.

    • Timestamp: It reflects the sampling instant of the first byte in the RTP payload.
      Several consecutive packets may carry the same timestamp value if they were
      generated at the same instant, for example when they belong to the same video frame.
      The audio and video packets of a media session are delivered over different channels
      and ports. For that reason, this field is very important to allow the receiver to
      reconstruct the timing of the audio/video data and, furthermore, to synchronize a
      complete videoconference, for example.

    • Synchronization Source (SSRC): The synchronization source is a random number used to
      identify the source of an RTP stream within an RTP session. A receiver may get
      several RTP streams at the same time, even from the same endpoint, but two different
      synchronization sources will not have the same SSRC identifier in the same session,
      so it is possible to tell the original source of each stream apart.

    • CSRC list: It identifies the contributing sources for the payload contained in this
      packet. The maximum number of contributing sources that can be identified is 15.
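
    The field layout listed above can be illustrated with a short decoding routine. The
    following is only a minimal sketch in Python, assuming the 12-byte fixed header of
    RFC 3550; the names RtpHeader and parse_rtp_header are illustrative and do not belong
    to any library used in this work. Header extensions and padding are not handled.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc_list: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header and the optional CSRC list."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = first & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet is shorter than its CSRC count implies")
    csrc_list = struct.unpack("!%dI" % csrc_count, packet[12:end])
    return RtpHeader(
        version=first >> 6,
        padding=bool(first & 0x20),
        extension=bool(first & 0x10),
        csrc_count=csrc_count,
        marker=bool(second & 0x80),
        payload_type=second & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
        csrc_list=csrc_list,
    )


# Illustrative packet: version 2, no padding, no extension, no CSRC, payload type 111.
raw = struct.pack("!BBHII", 0x80, 111, 33555, 3169076983, 2281082149) + b"\x59\x94"
header = parse_rtp_header(raw)
print(header.version, header.payload_type, header.sequence_number)
```

    Everything after the fixed header and the CSRC list is the payload, so a receiver
    could obtain it as packet[12 + 4 * header.csrc_count:] when no header extension is
    present.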

    RTP supports, but does not itself provide, encryption of the media flows. Generally,
    IPSec or SRTP is used for this purpose.

    RTP is said to consist of two distinct protocols:

    • Real-time Transport Protocol (RTP): it conveys real-time data.

    • Real-time Transport Control Protocol (RTCP): it carries information regarding the
      quality of the RTP session and the participants in the session.

    1.6.3.1 Real-time Transport Control Protocol (RTCP)

    RTP Control Protocol (RTCP) is a communication protocol that provides control
    information and quality feedback associated with a data flow of a multimedia
    application. It works in concert with RTP in the transport and packaging of
    multimedia data, but it does not transport any media data itself. This protocol
    gathers connection statistics and information about sent bytes, lost packets, jitter
    and so forth. It is important to notice that RTCP by itself does not offer any kind
    of authentication or encryption of the flows.

    The information provided by this protocol is used to control the flow and the
    congestion in the network. For example, if the statistics show a large number of lost
    packets, the sender can adapt its transmission by limiting the flow or by switching
    the media stream to a codec that requires a lower bit rate. RTCP packets are also
    used to detect and diagnose problems in the network.

    In addition, participants in a session use RTCP packets to exchange some basic
    identity data, such as the user name and the domain they are using.

    The types of RTCP packets are:

    • SR: Sender report, for transmission and reception statistics from participants that
      are active senders.

    • RR: Receiver report, for reception statistics from participants that are not active
      senders.

    • SDES: Source description items.

    • BYE: Indicates end of participation.

    • APP: Application-specific functions.

    The RTCP packet structure depends on the type of packet. The packet structure detailed
    below corresponds to a sender report (SR) packet. The only difference between the
    sender report (SR) and receiver report (RR) forms, besides the packet type code, is
    that the sender report includes a 20-byte sender information section for use by active
    senders. This kind of packet is more complex than the others and has a greater number
    of fields.

    • Version: It identifies the version of RTP, which is the same in RTCP packets as in
      RTP data packets.

    • Padding: This field has the same function as in RTP packets, but applied to the
      RTCP packet.

    • Reception report count (RC): It defines the number of reception report blocks
      contained in this packet. Its value can be zero.

    • Packet type (PT): In this case, it contains the constant 200 to identify this as an
      RTCP SR packet.

    • Length: It is the length of this RTCP packet in 32-bit words minus one, including
      the header and any padding.

    • SSRC: It is the synchronization source identifier for the originator of this SR
      packet. It should match the SSRC field of the RTP packets sent by that source.

    • Network Time Protocol (NTP) timestamp: It indicates the wall-clock time (absolute
      date and time) when the report was sent, so that it can be used in combination with
      the timestamps returned in reception reports from other receivers to measure the
      round-trip propagation to those receivers. It has two subfields: most significant
      word (MSW) and least significant word (LSW).

    • RTP timestamp: It corresponds to the same time as the NTP timestamp, but in the same
      units and with the same random offset as the RTP timestamps in data packets. This
      timestamp may not be equal to the RTP timestamp in any adjacent data packet. Rather,
      it must be calculated from the corresponding NTP timestamp using the relationship
      between the RTP timestamp counter and real time.

    • Sender’s packet count: It contains the total number of RTP data packets transmitted
      by the sender from the start of the transmission until the time this SR packet was
      generated. This count should be reset if the sender changes its SSRC identifier.

    • Sender’s octet count: It contains the total number of payload octets transmitted in
      RTP data packets by the sender from the start of the transmission until the time
      this SR packet was generated. This count should be reset if the sender changes its
      SSRC identifier.

    • Source identifier (SSRC): It is the SSRC identifier of the RTP source to which the
      information in this reception report block refers.

    • Fraction lost: It gives the fraction of RTP packets from the SSRC source that have
      been lost since the previous SR or RR packet was sent.

    • Cumulative number of packets lost: It refers to the total number of RTP packets from
      the SSRC source that have been lost since the start of reception. This number is
      calculated as the number of packets expected minus the number of packets actually
      received.

    • Extended highest sequence number received: The least significant 16 bits contain the
      highest sequence number received in an RTP data packet from the SSRC source, and the
      most significant 16 bits extend that sequence number with the corresponding count of
      sequence number cycles (it is calculated according to the algorithm in Appendix A.1
      of RFC 3550).

    • Interarrival jitter: It is an estimate of the statistical variance of the RTP data
      packet interarrival time, measured in timestamp units and expressed as an unsigned
      integer.

    • Last SR timestamp (LSR): It contains the middle 32 bits out of the 64 in the NTP
      timestamp received as part of the most recent RTCP sender report (SR) packet from
      the SSRC source. If no SR has been received yet, the field is set to zero.

    • Delay since last SR (DLSR): It is the delay, expressed in units of 1/65536 seconds,
      between the reception of the last SR packet from the SSRC source and the sending of
      this reception report block. If no SR packet has been received yet from the SSRC,
      the DLSR field is set to zero.
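
    Two of the fields above are meant to be combined with local measurements. The sketch
    below, written in Python, follows the formulas given in RFC 3550: the round-trip time
    is the arrival time of the report minus LSR minus DLSR, and the jitter is a running
    estimate updated with a gain of 1/16. The function names are illustrative only, and
    the arrival time is assumed to be already expressed in the same 32-bit format as LSR.

```python
def round_trip_time(arrival: int, lsr: int, dlsr: int) -> float:
    """Round-trip time in seconds computed from a reception report block.

    All three arguments use the 32-bit "middle of NTP" format:
    16 bits of seconds and 16 bits of fraction (units of 1/65536 s).
    """
    if lsr == 0:
        raise ValueError("no SR received yet, round-trip time is undefined")
    delay = (arrival - lsr - dlsr) & 0xFFFFFFFF  # tolerate 32-bit wrap-around
    return delay / 65536.0


def update_jitter(jitter: float, transit_prev: int, transit_now: int) -> float:
    """One step of the interarrival jitter estimator.

    A transit value is (arrival time - RTP timestamp), both in timestamp units.
    """
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0


# Numbers of the worked example in RFC 3550, section 6.4.1.
print(round_trip_time(0xB7108000, 0xB7052000, 0x00054000))  # 6.125
```

    The division by 16 smooths the estimate, so a single late packet changes the reported
    jitter only slightly.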

    1.7 VoIP Clients

    Nowadays, there is a large number of different VoIP clients. The following tables show
    some examples of freely available VoIP softphones for desktop computers and some
    others for mobile devices. There is not much information about VoIP clients for mobile
    devices because most of them are distributed under proprietary licenses.

    1.7.1 VoIP Clients running on different OS

    | Application | Operating System | License | Language | Other |
    |---|---|---|---|---|
    | ATLSIP | Linux, Windows | MPL, GPL, LGPL | C++ | It is written using the Active Template Library. |
    | Ekiga | Linux, Mac OS X, BSD | GNU/GPL | C++ | |
    | Eyeball Messenger | Linux and uClinux, Windows 2000/XP, Windows Mobile, Windows CE | Proprietary | | It is based on Eyeball Messenger SDK. It is available in PC, PDA and embedded platforms. |
    | FreeSWITCH | Linux, Windows, Mac OS X, BSD, Solaris | Open source | C++ | |
    | KCall | Linux | GNU/GPL | | It is a VoIP application for the KDE desktop environment. |
    | Kphone | Linux | GNU/GPL | C++ | It uses Qt. |
    | Linphone | Linux, Windows XP | Freeware | C | It uses eXosip (SIP user agent library based on libosip2), mediastreamer2 (powerful library to make audio/video streaming and processing) and ortp (RTP library). |
    | Minisip | Linux, Windows XP | GNU/GPL | | It will be soon available on Pocket PC. |
    | MjUA | | GNU/GPL | Java | It is based on MjSIP stack. |
    | OpenWengo | Linux, Windows, Mac OS X | GNU/GPL | C++ | |
    | OpenZoep | Windows | GNU/GPL | C++ | |
    | PhoneGaim | Linux, Windows | GNU/GPL | | |
    | PJSUA | Linux, Windows, Windows CE/Mobile, Mac OS X, Symbian OS | GNU/GPL | C++ | It is based on PJSIP stack. |
    | SFLphone | Linux | GNU/GPL | C++ | It should be portable to BSD operating systems. |
    | Shtoom | Linux, Windows, Mac OS X | GNU/GPL | Python | |
    | SIPCommunicator | Linux, Windows, Mac OS X | GNU/GPL | Java | |
    | sipXphone | Linux, Windows | GNU/GPL | Java | |
    | TudoMais | Windows | GNU/GPL | Java | |
    | Twinkle | Linux | GNU/GPL | C++ | It uses KDE libraries. |
    | VMukti | Windows | Open source | C# | It is based on .NET 3.0. |
    | WxCommunicator | Windows XP/2000 | GNU/GPL | C++ | It is based on sipXtapi client library and wxWidgets 2.8.4 GUI library. |
    | XMeeting | Mac OS X | Open source | | |
    | YATE | Linux, Windows | GNU/GPL | C++ | It supports scripting in various programming languages (such as embedded PHP, Python and Perl). |
    | YeaPhone | Linux | GNU/GPL | | It is based on the Linphone stack. |
    | Zap | Linux, Windows, Mac OS | Open source | JavaScript | It is based on Mozilla. |

    Table 1.1: VoIP Clients running on different OS

    1.7.2 VoIP Clients for mobile devices

    | Application | Operating System | Others |
    |---|---|---|
    | AGEphone | Windows Mobile 5.0 for Pocket PC | It is based on microSIP stack, developed in C/C++. |
    | Articulation | Palm OS 5.0 or greater | |
    | BeWip | Windows Mobile OS | |
    | CiceroPhone | Windows Mobile 5.0, Windows PPC2003, Symbian OS | |
    | ExpressTalk | Windows Pocket PC, Windows Mobile OS | |
    | eyeP Phone Desktop | Windows Pocket PC 2003 | |
    | iFon | Windows Mobile, Windows CE 4.X | |
    | Microsoft Office Communicator Mobile | Windows Mobile 2003 SE for Pocket PC smartphone, Mobile 5.0 for Pocket PC and Smartphone | It is based on the user interface of Microsoft Office Communicator 2005 desktop client. |
    | Microsoft Portrait | Windows Mobile 5.0 Pocket PC | It is a research prototype for mobile video communication. |
    | MoviVoip | Palm OS 5.0 or greater | |
    | OnePhone | Windows Mobile 5.0, Symbian OS, uLinux Mobile | |
    | SJPhone | Windows Pocket PC 2003 | |
    | Solegy Softphone | | It is based on their ServicePDQ platform and using opensourcesip and opensipstack. |
    | speaQ | Windows Mobile 5.0 PDA Edition, Linux/Qtopia | |
    | The FirstHand Mobile Console | Windows Mobile 5.0 PDA Edition | |
    | VOYP | Palm OS | |
    | Woize | Windows Mobile 5.0, Windows Mobile 2003 for PocketPC | |
    | X-Pro | Windows Mobile 2003 | |

    Table 1.2: VoIP Clients for mobile devices

    1.7.3 Structure and operation of softphones

    As a general definition, a softphone (a blend of software and telephone) is a piece of
    software used to place telephone calls from a computer to other softphones or to
    conventional telephones over the Internet. It is therefore part of a VoIP environment
    and makes use of the protocols previously described: SIP, SDP and RTP.

    Nowadays, there are many implementations available. These softphones can differ in
    their license (closed proprietary software, freeware, open source, GPL/GNU), system
    requirements, operating system or programming language, but their structure and
    operation must follow the same fundamental guidelines.

    In order to develop a VoIP softphone, we can choose between two principal methods:

    • Using some libraries, platform or API, like the RTC Client API from Microsoft, where
      all the necessary protocols are defined and implemented following their RFC
      documents.

    • Programming step by step all the features, functionalities, requirements,
      parameters, and so forth that are defined in the RFC documents of each needed
      protocol.

    In this case, if we are interested in implementing a VoIP softphone based on SIP, it
    is necessary to take these documents into consideration:

    • RFC 3261: Session Initiation Protocol (SIP)

    • RFC 4566: Session Description Protocol (SDP)

    • RFC 1889/3550: A Transport Protocol for Real-Time Applications (RTP and RTCP).

    Regardless of the method we choose to develop the softphone, it must contain at least
    the following parts:

    • SIP package: it should define all the required classes and methods to provide and
      manage SIP services.

    • SDP package: it should define all the required classes and methods to provide and
      manage SDP services.

    • RTP package: it should define all the required classes and methods to provide and
      manage RTP services. This package should also provide RTCP services.

    • User interface: it should help the end user to interact with the softphone and use
      its functionalities, independently of how it is implemented.

    These protocol packages must provide methods to build, send, receive, analyze and
    process packets. As explained before, none of these protocols provides its own way of
    being transmitted over the network; instead, their messages are usually encapsulated
    in UDP packets. Consequently, the softphone must also define some classes and methods
    to send, receive and process IP/UDP packets, including the sockets and the ports that
    are needed for the communication channels. Normally, two channels are used for SIP
    messages (sending and receiving) and another two channels for RTP packets (sending
    and receiving).
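
    A communication channel of this kind can be sketched with the standard socket module
    of Python. This is only an illustration under simple assumptions: both ends run on
    the local machine, the operating system chooses the ports, and the helper names
    open_udp_channel and loopback_demo are hypothetical. A real softphone would bind the
    SIP channel to a known port and pass complete SIP or RTP packets as payload.

```python
import socket


def open_udp_channel(host="127.0.0.1", port=0):
    """Create a UDP socket bound to host:port (port 0 lets the OS pick one)."""
    channel = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    channel.bind((host, port))
    channel.settimeout(2.0)  # avoid blocking forever if nothing arrives
    return channel


def loopback_demo():
    """Send one datagram between two local channels and return what arrived."""
    sender = open_udp_channel()
    receiver = open_udp_channel()
    try:
        # Hypothetical payload; a real softphone would pass a SIP message here.
        sender.sendto(b"OPTIONS sip:peer@127.0.0.1 SIP/2.0\r\n\r\n",
                      receiver.getsockname())
        data, source = receiver.recvfrom(2048)
        return data, source == sender.getsockname()
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    payload, same_source = loopback_demo()
    print(payload.split(b"\r\n")[0].decode(), same_source)
```

    Since UDP is connectionless, the same socket could be used for both sending and
    receiving; separate channels are simply one way of organizing the code.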

    The best way to analyze and understand the operation of the VoIP softphones based

    on SIP is by means of some examples.

    1.7.3.1 Registration procedure

    In this example, the user carla1 wants to register in a registrar server. The following

    picture shows the SIP message exchange between the client and the server.

    Figure 1.6: SIP registration procedure

    Firstly, the user sends a SIP REGISTER request to a registrar server. The SIP package
    must build a SIP request including the header fields that have already been explained
    in 1.6.1.4.
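
    As an illustration of this step, the following Python sketch assembles the text of a
    minimal REGISTER request. It is not the code of the RTC Client API; the function name
    build_register, the user alice and the addresses registrar.example.com and 192.0.2.10
    are hypothetical, and only the header fields required by RFC 3261 plus Contact and
    Expires are included.

```python
import uuid


def build_register(user, registrar, local_ip, local_port, cseq=1, expires=120):
    """Return the text of a minimal SIP REGISTER request (RFC 3261 syntax)."""
    address_of_record = "sip:%s@%s" % (user, registrar)
    # The branch parameter must start with the magic cookie "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        "REGISTER sip:%s SIP/2.0" % registrar,
        "Via: SIP/2.0/UDP %s:%d;branch=%s" % (local_ip, local_port, branch),
        "Max-Forwards: 70",
        "From: <%s>;tag=%s" % (address_of_record, uuid.uuid4().hex),
        "To: <%s>" % address_of_record,
        "Call-ID: %s@%s" % (uuid.uuid4().hex, local_ip),
        "CSeq: %d REGISTER" % cseq,
        "Contact: <sip:%s@%s:%d>" % (user, local_ip, local_port),
        "Expires: %d" % expires,
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical user and addresses, for illustration only.
print(build_register("alice", "registrar.example.com", "192.0.2.10", 5060))
```

    When the request has to be repeated with credentials, the same Call-ID would normally
    be kept and the CSeq number incremented.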

    The SIP server receives the message and uses the information to manage the request.

    Meanwhile, it sends a provisional response (100 TRYING) to the user in order to

    indicate that it is performing some action and does not yet have a definitive response.

    The UAC processes the TRYING response and waits for a new response from the

    server.

    After that, the UAC receives the response of the server, 401 UNAUTHORIZED.

    This response indicates that the request requires the user to perform authentication

    because it does not contain the proper credentials.

    The UAC receives this message and processes the information. It should rebuild the
    SIP request, adding an Authorization header field to the message. The Authorization
    field value consists of credentials containing the authentication information of the
    UAC for the requested action, as well as the parameters required in support of
    authentication and replay protection.

    The new message will have the form:

    Request-Line: REGISTER sip:141.24.93.180 SIP/2.0

    Message Header

    Via: SIP/2.0/UDP 141.24.172.62:15966

    Max-Forwards: 70

    From: ;tag=6a09ca1e73c846b681c288ed49dbd071;epid=e04c9989ea

    To:

    Call-ID: [email protected]

    CSeq: 2 REGISTER

    Contact: ;methods="INVITE, MESSAGE, INFO, SUBSCRIBE, OPTIONS, BYE, CANCEL, NOTIFY, ACK, REFER"

    User-Agent: RTC/1.2.4949

    Authorization: Digest username="carla1", realm="asterisk", algorithm=MD5, uri="sip:141.24.93.180", nonce="6c839ae6", response="feea44258515c993945155793cc1c8d6"

    Event: registration

    Allow-Events: presence

    Content-Length: 0

    In this message, the CSeq field value has been incremented and the new Authorization
    field has been included. Once again, the server sends a provisional response while it
    is analyzing the request.
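
    The response parameter of the Authorization field is an MD5 digest. The sketch below
    shows, in Python, how such a value can be computed according to RFC 2617 when no qop
    parameter is present, as in the message above. The function names are illustrative,
    and the password "secret" is a placeholder: the real password of carla1 is not part
    of the captured message, so the value printed here is not expected to match the
    response shown above.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hexadecimal MD5 digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(HA1 ":" nonce ":" HA2)."""
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))  # user credentials
    ha2 = md5_hex("%s:%s" % (method, uri))                   # request being protected
    return md5_hex("%s:%s:%s" % (ha1, nonce, ha2))


# "secret" is a placeholder password used only for this illustration.
print(digest_response("carla1", "asterisk", "secret",
                      "REGISTER", "sip:141.24.93.180", "6c839ae6"))
```

    Because the nonce chosen by the server takes part in the computation, a captured
    response cannot simply be replayed with a different nonce.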

    Finally, the request is successfully accepted and processed by the registrar server,
    which answers with an OK response:

    Status-Line: SIP/2.0 200 OK

    Message Header

    Via: SIP/2.0/UDP 141.24.172.62:15966;received=141.24.172.62

    From: ;tag=6a09ca1e73c846b681c288ed49dbd071;epid=e04c9989ea

    To: ;tag=as45ef369c

    Call-ID: [email protected]

    CSeq: 2 REGISTER

    User-Agent: Asterisk PBX

    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY

    Expires: 120

    Contact: ;expires=120

    Date: Thu, 31 May 2007 07:34:12 GMT

    Content-Length: 0

    The UAC processes the new response. The Contact field carries an expires parameter,
    which indicates, in seconds, how long the registration remains valid. Before this
    interval elapses, the UAC should send another REGISTER request in order to refresh
    the registration and inform the server about its current location.

    Although this is a complete and normal registration procedure, many other outcomes
    are possible depending on the SIP server responses, and the UAC must be able to
    process and manage each of them.
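
    The handling of the expiration interval can be sketched as follows in Python. The
    parser is deliberately simplified (it ignores folded header lines and compact header
    names), the helper names are hypothetical, and re-registering after half of the
    lifetime is only one possible policy, not a rule taken from the RFC. The contact
    address in the sample response is a placeholder.

```python
import re


def parse_sip_response(message: str):
    """Split a SIP response into (status_code, reason, headers)."""
    head = message.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return int(code), reason, headers


def registration_lifetime(headers, default=3600):
    """Seconds the registration stays valid: Contact expires, else Expires."""
    for contact in headers.get("contact", []):
        match = re.search(r";\s*expires\s*=\s*(\d+)", contact, re.IGNORECASE)
        if match:
            return int(match.group(1))
    if headers.get("expires"):
        return int(headers["expires"][0])
    return default


def refresh_delay(lifetime, fraction=0.5):
    """Hypothetical policy: re-register after a fraction of the lifetime."""
    return lifetime * fraction


# Abridged response with a placeholder contact address.
response = ("SIP/2.0 200 OK\r\n"
            "CSeq: 2 REGISTER\r\n"
            "Expires: 120\r\n"
            "Contact: <sip:alice@192.0.2.10:5060>;expires=120\r\n"
            "Content-Length: 0\r\n\r\n")
status, reason, headers = parse_sip_response(response)
print(status, reason, refresh_delay(registration_lifetime(headers)))
```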

    1.7.3.2 Multimedia session establishment

    When a UAC wants to initiate a session, it originates an INVITE request. The INVITE

    request asks a server to establish a session. The handshake of the SIP messages is shown in

    the picture.

    [Figure: SIP message sequence between carla1, the proxy server and carla2. Messages
    shown: INVITE, 407 PROXY AUTHENTICATION REQUIRED, ACK, INVITE, TRYING, 180 RINGING,
    OK, ACK, BYE, OK]

    Figure 1.7: Multimedia SIP session establishment

    In this example, the user carla1 wants to initiate a multimedia call with the user
    carla2. The SIP package must build a SIP INVITE request, similar to the REGISTER
    request, following what was explained in 1.6.1.4. In the same way, the SDP package
    has to create an SDP packet with the header fields detailed in 1.6.2.1.

    The complete message is sent. The proxy server receives the message and processes the
    request. The request does not contain any Authorization field and the server requires
    client authentication. For that reason, the server sends a 407 PROXY AUTHENTICATION
    REQUIRED response to inform the client.

    The UAC receives this message and processes the information. This response is very
    similar to the 401 UNAUTHORIZED. The UAC sends an ACK message and rebuilds the SIP
    request, adding a Proxy-Authorization header field with the proper credentials of the
    client.

    Request-Line: INVITE sip:[email protected] SIP/2.0

    Message Header

    Via: SIP/2.0/UDP 141.24.172.97:16580

    Max-Forwards: 70

    From: "carla1" ;tag=f7231a4fdbf547408da053a9f31b0fdb;epid=99274a963c

    To:

    Call-ID: [email protected]

    CSeq: 2 INVITE

    Contact:

    User-Agent: RTC/1.2

    Proxy-Authorization: Digest username="carla1", realm="iptel.org", algorithm=md5, uri="sip:[email protected]", nonce="465e8da1a386b50d9c5608de3da1961645aa7abd", response="61ecb5b096f0de7146ec001b7f469f85"

    Content-Type: application/sdp

    Content-Length: 679

    Message body

    Session Description Protocol

    Session Description Protocol Version (v): 0

    Owner/Creator, Session Id (o): - 0 0 IN IP4 141.24.172.97

    Session Name (s): session

    Connection Information (c): IN IP4 141.24.172.97

    Bandwidth Information (b): CT:1000

    Time Description, active time (t): 0 0

    Media Description, name and address (m): audio 62410 RTP/AVP 97 111 112 6 0 8 4 5 3 101

    Encryption Key (k): base64:I/EwJ93tvnk62iBdgpAUBAtqQDDacCxMqae5MDj1i4A

    Media Attribute (a): rtpmap:97 red/8000

    Media Attribute (a): rtpmap:111 SIREN/16000

    Media Attribute (a): fmtp:111 bitrate=16000

    Media Attribute (a): rtpmap:112 G7221/16000

    Media Attribute (a): fmtp:112 bitrate=24000

    Media Attribute (a): rtpmap:6 DVI4/16000

    Media Attribute (a): rtpmap:0 PCMU/8000

    Media Attribute (a): rtpmap:8 PCMA/8000

    Media Attribute (a): rtpmap:4 G723/8000

    Media Attribute (a): rtpmap:5 DVI4/8000

    Media Attribute (a): rtpmap:3 GSM/8000

    Media Attribute (a): rtpmap:101 telephone-event/8000

    Media Attribute (a): fmtp:101 0-16

    Media Attribute (a): encryption:optional

    Media Description, name and address (m): video 45406 RTP/AVP 34 31

    Encryption Key (k): base64:YQdcx0AUpMFh2+xW8A1qcZ8NTeN6qEEcjoyfejsVgmo

    Media Attribute (a): rtpmap:34 H263/90000

    Media Attribute (a): rtpmap:31 H261/90000

    Media Attribute (a): encryption:optional

    This message is received by the proxy server, which routes it to the destination user
    or to another proxy server. Normally, if the message is routed to another proxy
    server, that server sends a provisional response to the first proxy server and routes
    the message to the destination user.

    The UAC processes the TRYING response and waits for a new response from the server.
    When the destination user receives the INVITE request, it sends a provisional RINGING
    message indicating that the request is being processed. Then, the proxy server
    forwards the response using the information in the Via field.

    The UAC processes the RINGING message and waits for a non-provisional response from
    the server. When the destination user finally accepts the call, its UAC sends an OK
    response to its proxy server. This proxy forwards the message using the Via field,
    and the next proxy server forwards it again in the same way to the UAC that issued
    the original request.

    Status-Line: SIP/2.0 200 OK

    Message Header

    Via: SIP/2.0/UDP 141.24.172.97:16580;rport=1477

    From: "carla1" ;tag=f7231a4fdbf547408da053a9f31b0fdb;epid=99274a963c

    To: ;tag=ac93304b84c644cc88c52d415d14caac

    Call-ID: [email protected]

    CSeq: 2 INVITE

    Record-Route:

    Record-Route:

    Record-Route:

    Contact:

    User-Agent: RTC/1.2

    Content-Type: application/sdp

    Content-Length: 708

    P-Behind-NAT: Yes

    Message body

    Session Description Protocol

    Session Description Protocol Version (v): 0

    Owner/Creator, Session Id (o): - 0 0 IN IP4 141.24.92.247

    Session Name (s): session

    Connection Information (c): IN IP4 213.192.59.66

    Bandwidth Information (b): CT:1000

    Time Description, active time (t): 0 0

    Media Description, name and address (m): audio 6260 RTP/AVP 97 111 112 6 0 8 4 5 3 101

    Encryption Key (k): base64:t7AMm5DBS7WjFacOTXt9B+vImF15vDxVPGVgL8fY5GY

    Media Attribute (a): rtpmap:97 red/8000

    Media Attribute (a): rtpmap:111 SIREN/16000

    Media Attribute (a): fmtp:111 bitrate=16000

    Media Attribute (a): rtpmap:112 G7221/16000

    Media Attribute (a): fmtp:112 bitrate=24000

    Media Attribute (a): rtpmap:6 DVI4/16000

    Media Attribute (a): rtpmap:0 PCMU/8000

    Media Attribute (a): rtpmap:8 PCMA/8000

    Media Attribute (a): rtpmap:4 G723/8000

    Media Attribute (a): rtpmap:5 DVI4/8000

    Media Attribute (a): rtpmap:3 GSM/8000

    Media Attribute (a): rtpmap:101 telephone-event/8000

    Media Attribute (a): fmtp:101 0-16

    Media Attribute (a): encryption:optional

    Media Description, name and address (m): video 42648 RTP/AVP 34 31

    Encryption Key (k): base64:rjR7lz3kmGVpahZErkninx1gTFaZMyNr+Y35W1pSHtg

    Media Attribute (a): recvonly

    Media Attribute (a): rtpmap:34 H263/90000

    Media Attribute (a): rtpmap:31 H261/90000

    Media Attribute (a): encryption:optional

    Media Attribute (a): nortpproxy:yes

    This response indicates that the session has been accepted, but possibly not with the
    parameters originally offered. The UAC must process the message, verify that it
    belongs to the original INVITE request using the CSeq field and analyze the content
    of the SDP packet again. The INVITE request sent from carla1 to carla2 offered a
    multimedia session with audio and video, but the OK response indicates that the other
    user accepts audio in both directions and video only in the receive direction
    (recvonly). In order to finish the establishment of the session with the
    corresponding parameters and media types, the UAC must send an ACK message.
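
    The analysis of the answer can be illustrated with a small Python routine that reads
    the direction attribute of each media description. It works on the raw SDP text (the
    lines v=, m=, a= and so on), not on the decoded form shown above, and the function
    name media_directions is hypothetical. The sample body is an abridged version of the
    answer from carla2. If a body contained two streams of the same media type, this
    simplified version would keep only the last one.

```python
def media_directions(sdp: str):
    """Map each media type of an SDP body to its direction attribute."""
    directions = {}
    current = None
    session_direction = "sendrecv"  # default when no attribute is given
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            current = line[2:].split()[0]
            directions[current] = session_direction
        elif line in ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"):
            if current is None:
                session_direction = line[2:]   # session-level attribute
            else:
                directions[current] = line[2:]  # media-level attribute
    return directions


# Abridged raw form of the SDP answer discussed above.
answer = """v=0
o=- 0 0 IN IP4 141.24.92.247
s=session
c=IN IP4 213.192.59.66
t=0 0
m=audio 6260 RTP/AVP 97 111 112 6 0 8 4 5 3 101
a=rtpmap:0 PCMU/8000
m=video 42648 RTP/AVP 34 31
a=recvonly
a=rtpmap:34 H263/90000
"""
print(media_directions(answer))
```

    For this answer the routine reports sendrecv for audio and recvonly for video, which
    means that carla2 is willing to receive video but will not send it.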

    The task of the proxy servers is to help the two UACs locate and contact each other.
    They should not store any information about the fact that there is a session
    established between the users. Once the ACK is received by the destination UAC, the
    two UACs can start exchanging RTP packets. It is important to realize that media is
    usually transmitted end-to-end and not through any proxy server.

    According to the type of media and the rest of the parameters present in the SDP
    session description, the RTP package should provide the way to generate RTP packets
    and send them through the network encapsulated in IP/UDP packets. The complete header
    structure of an RTP packet was detailed in 1.6.3.

    The RTP payload follows the RTP header. This is an example of a real RTP packet:

    Real-Time Transport Protocol

    10.. .... = Version: RFC 1889 Version (2)

    ..0. .... = Padding: False

    ...0 .... = Extension: False

    .... 0000 = Contributing source identifiers count: 0

    0... .... = Marker: False

    Payload type: SIREN (111)

    Sequence number: 33555

    Timestamp: 3169076983

    Synchronization Source identifier: 2281082149

    Payload: 5994FCBD0BD53F49C59C69B1E9F73449C6D6DC1CC62A6294...

    As mentioned before, there is another protocol that works in concert with RTP, the
    RTP Control Protocol (RTCP), which provides control information and quality feedback
    associated with a data flow of a multimedia application.

    This is an example of a real RTCP SR packet related to the previous RTP packet. It
    contains all the header fields explained in 1.6.3.1.

    Real-time Transport Control Protocol (Sender Report)

    10.. .... = Version: RFC 1889 Version (2)

    ..0. .... = Padding: False

    ...0 0001 = Reception report count: 1

    Packet type: Sender Report (200)

    Length: 12

    Sender SSRC: 2281082149

    Timestamp, MSW: 3389590617 (0xca090c59)

    Timestamp, LSW: 2918510592 (0xadf4f000)

    [MSW and LSW as NTP timestamp: May 31, 2007 08:56:57.6795 UTC]

    RTP timestamp: 3169077087

    Sender’s packet count: 47

    Sender’s octet count: 2280

    Source 1

    Identifier: 3436236073

    SSRC contents

    Fraction lost: 0 / 256

    Cumulative number of packets lost: 16777213

    Extended highest sequence number received: 52467

    Sequence number cycles count: 0

    Highest sequence number received: 52467

    Interarrival jitter: 3

    Last SR timestamp: 200748212 (0x0bf72cb4)

    Delay since last SR timestamp: 29204
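
    Some of the values in this sender report are easier to read after a small conversion.
    The following Python sketch, with illustrative function names, converts the NTP
    timestamp into a calendar date, interprets the cumulative loss counter as the signed
    24-bit number defined in RFC 3550, splits the extended sequence number and expresses
    the length and DLSR fields in bytes and seconds.

```python
from datetime import datetime, timedelta, timezone

NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)  # NTP counts seconds from 1900


def ntp_to_datetime(msw: int, lsw: int) -> datetime:
    """Convert the two 32-bit NTP words (seconds, fraction) to a UTC datetime."""
    return NTP_EPOCH + timedelta(seconds=msw + lsw / 2**32)


def signed_24bit(value: int) -> int:
    """Interpret a 24-bit field as a two's complement signed number."""
    return value - (1 << 24) if value & 0x800000 else value


def split_extended_sequence(value: int):
    """Return (cycle count, highest sequence number) of the 32-bit field."""
    return value >> 16, value & 0xFFFF


def rtcp_length_in_bytes(length_field: int) -> int:
    """The length field counts 32-bit words minus one."""
    return (length_field + 1) * 4


print(ntp_to_datetime(3389590617, 2918510592))  # date of the report
print(signed_24bit(16777213))                   # cumulative packets lost
print(split_extended_sequence(52467))           # cycles and sequence number
print(rtcp_length_in_bytes(12))                 # packet size in bytes
print(29204 / 65536)                            # DLSR in seconds
```

    Read this way, the cumulative number of packets lost is -3 rather than a very large
    loss. RFC 3550 allows a negative value, which can occur when duplicate packets are
    received. The length of 52 bytes matches an 8-byte header, 20 bytes of sender
    information and one 24-byte reception report block.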

    In a single media session, many RTP and RTCP packets are transmitted. The packets
    shown here are only small examples of their structure and operation.

    The media session finishes when one of the UACs sends a BYE request. In this session,
    user carla1 wants to end the call, so its UAC creates and sends a BYE message. The
    other UAC answers with an OK message, and the media session is finished with the
    reception of this message.

    In summary, the basic operation of a UAC consists of preparing the communication
    channels and building, sending, receiving and processing packets of the different
    protocols that are needed in a complete VoIP environment.

    1.7.4 Softphones for Windows Mobile OS

    In the previous section, the main protocols needed to develop a VoIP softphone and
    their operation were explained. The structure of a softphone on Windows Mobile OS is
    the same as on other operating systems: it must implement or make use of SIP, SDP and
    RTP packages in the way described in 1.7.3.

    The specific characteristics of softphones running on Windows Mobile OS lie in the
    system requirements. There is not much information about them because these
    softphones are distributed under closed proprietary licenses, but some common
    specifications are listed below:

    • Minimum Size Requirement: 64MB ROM, 32MB RAM

    • ARM processor

    • Audio codec support: G711

    One example of these softphones is AGEphone, which is based on the microSIP stack.
    These are its system requirements:

    • CPU: ARM-type CPU 200 MHz or above

    • Memory: 64MB or above

    • Free Disk Space: 600kb or above

    • Connection: upstream and downstream of 29.2 kbps each or above

    This protocol stack supports the G.711 and GSM 6.10 codecs and provides the following protocols:

    • RFC3261: Session Initiation Protocol (SIP)

    • RFC2327: Session Description Protocol (SDP)

    • RFC1889: A Transport Protocol for Real-Time Applications (RTP)

    • RFC2833: RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals

    • RFC3489: STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)

    • RFC3581: An Extension to the SIP for Symmetric Response Routing

    • UPnP: Universal Plug and Play IGD
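
    The 29.2 kbps connection figure quoted above can be reproduced with a back-of-the-envelope calculation if one assumes the GSM 6.10 codec (33-byte frames every 20 ms), one frame per RTP packet, and 40 bytes of IPv4, UDP and RTP headers per packet. These assumptions are not stated in the source; the sketch below only shows that they are consistent with the quoted number.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no header extensions


def rtp_bandwidth_bps(payload_bytes: int, frame_ms: int) -> int:
    """IP-level bit rate of one RTP stream carrying one codec frame per packet."""
    packet_bits = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return packet_bits * 1000 // frame_ms


print(rtp_bandwidth_bps(33, 20))   # GSM 6.10: 33-byte frame every 20 ms
print(rtp_bandwidth_bps(160, 20))  # G.711: 160 bytes of samples every 20 ms
```

    Under the same assumptions, a G.711 stream would need about 80 kbps in each direction, so the stated minimum would only be sufficient for the GSM 6.10 codec.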

    1.7.5 OS for mobile devices

    Symbian OS is an operating system produced by Symbian Ltd. It has been designed for

    mobile devices with associated libraries, user interface frameworks and reference implementations

    of common tools. It runs exclusively on ARM (Advanced RISC Machine) processors.

    Windows Mobile is a compact operating system combined with a suite of basic applications

    for mobile devices based on the Microsoft Win32 API. Devices which run Windows Mobile

    include Pocket PCs, Smartphones, and Portable Media Centers (portable media player devices).

    It is designed to be somewhat similar to desktop versions of Windows.

    Nokia OS (NOS) is an informal name for the operating system in many Nokia mobile

    phones. There is no such product or trademark. Officially it is referred to as ISA (Information

    Source Adapter) platform. It is only available for Nokia’s internal use. It is not licensed to

    anyone else yet. No direct API is provided either, but most ISA phones can be programmed

    with J2ME.

    Operating System Embedded (OSE) is a real-time embedded operating system created by

    the Swedish firm ENEA. OSE uses signaling in the form of messages passed to and from

    processes in the system. Messages are stored in a queue attached to each process.

    BlackBerry OS runs on an Intel 80386 microprocessor and all devices include an embedded

    RIM (Research In Motion) wireless modem for wireless data access. This OS supports

    multitasking and multithreaded applications. Developers familiar with other operating

    systems such as Windows and Mac OS will feel at home in the BlackBerry environment. All

    applications interact with the underlying operating system (and other applications) through

    the exchange of event messages. As in most operating systems, a C language API is provided

    for direct access to the system.

    Palm OS is a compact operating system developed and licensed by PalmSource, Inc. for

    personal digital assistants (PDAs) manufactured by various licensees. It is designed to be

    easy-to-use and similar to desktop operating systems such as Microsoft Windows. Palm OS

    is combined with a suite of basic applications including address book, clock, note pad, sync,

    memo viewer and security software. Palm OS licensees decide which applications are included

    on their Palm OS devices. The applications are primarily coded in C/C++, and a Java Runtime

    Environment is also available for the platform.

    Linux is a Unix-like computer operating system family that uses the Linux kernel. A Linux

    system which includes system utilities and libraries from the GNU Project is sometimes

    referred to as GNU/Linux. The methodical design of Linux made it possible to adapt it

    to a wide range of computing platforms in spite of being originally developed for Intel 386

    processors. Of particular interest in this context are the ARM based architectures, as many

    embedded systems and mobile devices are powered by ARM processors. Linux is a prominent

    example of free software and of open source development. Its underlying source code is

    available for anyone to use, modify, and redistribute freely, and in some instances the entire

    operating system consists of free/open source software.

    It is said that Mobile Linux and Mobile Java are becoming a powerful combination. While Linux

    is evolving into a major standard for mobile device operating systems, Java is becoming

    a standard at the software application level. The J2ME/MIDP specifications have been

    adopted by all major mobile phone manufacturers. The MIDP (Mobile Information Device

    Profile) comprises a set of Java APIs that provide a J2ME (Java 2 Micro Edition)

    runtime environment for mobile information devices.

    Mac OS X is a line of proprietary graphical operating systems developed, marketed, and

    sold by Apple Inc., the latest of which is pre-loaded on all currently shipping Macintosh

    computers. Mac OS X is a Unix-like operating system. A version of this operating system has been

    developed for the iPhone handheld device and gives access to true desktop-class applications

    and software, including rich HTML email, full-featured web browsing, and applications such

    as calendar, text messaging, notes, and address book. The iPhone is fully multitasking.

    2 Microsoft RTC Client API

    2.1 Introduction

    The Real-time Communications Client Application Programming Interface enables developers

    to build applications for integrated multimodal communications. It provides the necessary

    structure and interfaces to establish PC-PC, PC-phone, or phone-phone calls, Instant Messaging

    (IM), application sharing, and whiteboard sessions over the Internet. Furthermore,

    multimedia sessions can be set up on PC-PC calls, and Presence information on a list of

    contacts is also supported.

    The RTC Client API can be used from C++ or any other programming language

    that can access COM components. This includes .NET languages, such as Microsoft Visual

    Basic .NET and Microsoft Visual C#, which can access the RTC Client API through COM

    interoperability.

    The main functionalities supported by the RTC Client API are:

    • Registration and provisioning

    • Publishing presence

    • Contact management

    • Polling presence

    • Instant Messaging

    • Multimedia calls

    • Call control

    • Session negotiation

    • User search

    • Authentication

    • Signalling privacy

    • Media privacy

    2.2 Object Model Overview

    The basic coding model for RTC is COM (Component Object Model). The main objects

    used for communication in RTC are Client, Session, Profile, Participant, Buddy and Watcher

    objects, and the interfaces used to create and manage them are IRTCClient, IRTCSession,

    IRTCProfile, IRTCParticipant, IRTCBuddy, and IRTCWatcher, respectively.

    Figure 2.1: RTC Client COM Objects

    The client object is the basis of the RTC Client. It establishes the session types and the

    session parameters, the preferred audio and video devices and other media capabilities. This

    object is necessary to construct the rest of the objects.

    The session object is used to manage all the tasks related to the real-time session such

    as: initiating, answering, or terminating sessions, adding or removing participants, adding

    security media or storing information about media types. There are four kinds of sessions:

    PC-to-PC, PC-to-phone, phone-to-phone, and instant messaging.

    The profile object provides a way to get information from a user profile. This profile

    includes information about the client account (username, password, SIP server), supported session

    types and capabilities, authentication, transport protocol and so forth. After initializing

    RTC, the client application creates and enables a profile.

    The participant object contains all the information and methods associated with users

    who take part in a session. Each of these users is called a ’participant’ and is represented

    by a different participant object.

    The buddy object is used to get and put information about the user’s contacts. It provides

    data like the name or the status of the contact. This object is created when a user adds a

    new contact to his contact list.

    The watcher object is used to get and put information about the state of a watcher. When

    a user adds a new buddy, this buddy creates a watcher object for the user in order to maintain

    information about his presence.

    The buddy and the watcher objects are used to manage the presence information.

    2.3 Architecture

    To provide its functionality, the RTC Client API uses industry standard protocols like:

    • Session Initiation Protocol (SIP)

    • Session Description Protocol (SDP)

    • Real-time Transport Protocol (RTP)

    • PSTN/Internet Interworking (PINT)

    2.4 .NET Platform

    2.4.1 Introduction

    .NET is a software platform that connects information, systems, people and devices. The .NET

    platform connects a great variety of personal and business technologies, from mobile

    phones to corporate servers, allowing access to important information wherever and

    whenever it is needed. Built on XML Web Services standards, .NET

    allows systems and applications (new or existing) to connect their data and transactions

    independently of the version of the operating system, the type of computer or mobile device that

    is used, or the programming language used to create them.

    Code written on the .NET Framework platform is called managed code. Regardless of

    which .NET language is employed, the output of the language compiler is a representation

    of the same logic in an intermediate language named CIL (Common Intermediate Language)

    or MSIL (Microsoft Intermediate Language). The programming languages that can be used

    in the .NET platform are C#, C++, Visual Basic .NET, J#, JScript .NET, Windows PowerShell,

    IronPython, and F#.

    2.4.2 Operation

    The operation of the .NET platform rests on three main elements:

    • .NET languages, which were enumerated above.

    • Base Class Library (BCL), which is a library of types available to all .NET languages and provides many classes covering common functions, including

    file reading and writing, graphic rendering, database interaction and XML document

    manipulation.

    • Common Language Runtime (CLR), which is explained below.

    Figure 2.2: Overview of the Common Language Infrastructure

    The most important component of the .NET Framework is the Common Language Infrastructure.

    The CLI is responsible for providing a language platform for application development

    and execution, including components for exception handling, security, interoperability,

    and so forth. Microsoft’s implementation of the CLI is called the Common Language

    Runtime (CLR). The CLR is composed of four primary parts:

    • Common Type System (CTS)

    • Common Language Specification (CLS)

    • Just-In-Time Compiler (JIT)

    • Virtual Execution System (VES)

    Managed code is compiled down to a combination of MSIL and metadata. These are

    combined into a Portable Executable (PE) file, which can then be executed on any CLR-capable machine. When

    this executable is run, the JIT compiler starts compiling the CIL (MSIL) down to native code. The result

    is that all .NET Framework components run as native code. Code that requires the CLR at

    run-time in order to execute is referred to as managed code. The purpose of the CLR is to

    control the execution of the code that runs on the .NET Framework.

    2.4.3 Advantages

    For software developers, the .NET Framework is an important change. It offers capabilities

    and responsibilities that had previously been provided individually by programming

    languages and tools from various sources. Incorporating these features into the operating

    system brings a number of advantages, including:

    • Assuring the availability of framework features to all programs written in any of the .NET languages.

    • Providing programmers with a common means of accessing framework features, regardless of programming language.

    • Guaranteeing common behaviour within the framework, regardless of programming language.

    • Allowing the operating system to provide some guarantees of program behaviour that it could not otherwise offer.

    • Reducing the complexity and limitations of program-to-program communication, even when those programs are written in different .NET languages.

    3 Development of VoIP softphone for Windows 2000/XP

    3.1 Understanding the source code

    Before continuing to develop the softphone, it is essential to understand how it has

    been built. This means knowing the softphone structure and the different

    functionalities that are already implemented.

    By analyzing the source code of the application and its behaviour during execution, it is possible

    to identify the following operations:

    • Initializing RTC Client object: creates the client object.

    • Listening on RTC Events: allows the client to determine which specific events the application needs and ignore the rest.

    • Creating and enabling a profile: creates a profile with the configuration parameters of the client object in order to register a user in a server and creates the profile object.

    • Handling events: identifies and controls incoming events.

    • Starting a session and making a call: configures the type of session, adds a participant and creates the session object.

    • Answering a call: manages an incoming call.

    • Terminating a call: finishes an existing session.

    • Disabling the profile: deregisters a user and disables the profile.

    • Shutting down the client: stops the operation of the client object and disables the rest of the existing objects.

    These basic steps make up the softphone framework and allow it to fulfil its

    main purpose: the transmission of voice over the Internet by means of SIP session establishment.
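
    The order in which these operations may follow one another can be sketched as a small transition table. The Python model below is purely hypothetical: the step names are illustrative labels and not RTC Client API identifiers, and event handling is omitted because it runs throughout the lifetime of the client rather than at one point in the sequence.

```python
# Hypothetical model of the operation order listed above.
ALLOWED_NEXT = {
    "start": {"init_client"},
    "init_client": {"listen_events"},
    "listen_events": {"enable_profile"},
    "enable_profile": {"start_session", "answer_call", "disable_profile"},
    "start_session": {"terminate_call"},
    "answer_call": {"terminate_call"},
    "terminate_call": {"start_session", "answer_call", "disable_profile"},
    "disable_profile": {"shutdown_client"},
    "shutdown_client": set(),
}


def is_valid_sequence(steps):
    """Return True if every step is allowed to follow the one before it."""
    state = "start"
    for step in steps:
        if step not in ALLOWED_NEXT.get(state, set()):
            return False
        state = step
    return True


one_call = ["init_client", "listen_events", "enable_profile",
            "start_session", "terminate_call",
            "disable_profile", "shutdown_client"]
print(is_valid_sequence(one_call))
print(is_valid_sequence(["init_client", "start_session"]))
```

    The second sequence is rejected because, in this model, a session cannot be started before events are registered and a profile is enabled.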

    3.2 New functionalities

    Nowadays, a typical VoIP softphone includes some features that are not strictly related to

    its primary operation but add useful capabilities and services

    for the user. For that purpose, five new functionalities have been added to the

    softphone. Each of them is explained below.

    3.2.1 Volume bar for microphone and speakers

    This functionality allows the user to configure and adjust the audio settings. In order to

    increase or decrease the volume level of the microphone or speakers, the user only needs to

    move the Microphone Volume or Speakers Volume trackbar, respectively.

    Furthermore, the Audio and Video Tuning Wizard helps the user to verify that his camera,

    speakers, and microphone are working properly. Before using the Wizard, it is important to

    perform the following:

    • Close all other programs that show video or play or record sound.

    • Make sure that the camera, speakers, and microphone are plugged in and turned on.

    These functionalities are implemented by the client object and the methods used are

    included in the RTCClientClass class.

    • set_Volume(RTC_AUDIO_DEVICE enDevice, long lVolume), where the input parameters are the audio device type (microphone or speakers) and the volume level.

    • InvokeTuningWizard()

    3.2.2 Sending DTMF signals

    Dual-tone multi-frequency (DTMF) is a system of signal tones used in telecommunications. When

    the user presses a dial-pad button corresponding to a digit, two tones of specific frequencies

    are sent. The receiver, normally a switching centre, can decode and detect which digit was

    pressed. The tones are divided into two groups (low and high) within the voice frequency

    band, and each DTMF signal uses one from each group. These signals are used in different

    applications including voice mail, help desks, telephone banking, and so forth, to select some

    configuration options or manage remote control systems, for instance.

    The following table shows the frequencies associated with each decimal digit:

    Button or Digit    Low frequency (Hz)    High frequency (Hz)
    1                  697                   1209
    2                  697                   1336
    3                  697                   1477
    4                  770                   1209
    5                  770                   1336
    6                  770                   1477
    7                  852                   1209
    8                  852                   1336
    9                  852                   1477
    0                  941                   1336

    Table 3.1: DTMF frequencies

    This functionality is provided by the client object. The method used is SendDTMF(RTC_DTMF

    enDTMF), included in the RTCClientClass class. The input parameter is an enumeration

    that specifies which DTMF should be sent.

    This method sends a DTMF tone to the active session and plays a feedback tone on the RTC

    default audio device.
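
    The frequency pairs of Table 3.1 are enough to synthesize the tone for a digit: each signal is simply the sum of one low-group and one high-group sine wave. The Python sketch below illustrates this with an assumed sample rate of 8000 Hz and a tone length of 100 ms; it is independent of how the RTC Client API generates or transmits the tones internally, which the text does not specify.

```python
import math

# Frequency pairs (low, high) in Hz, copied from Table 3.1.
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "0": (941, 1336),
}


def dtmf_samples(digit, duration_s=0.1, sample_rate=8000):
    """Return float samples in [-1, 1]: the average of the two sine tones."""
    low, high = DTMF_FREQS[digit]
    count = int(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / sample_rate)
               + math.sin(2 * math.pi * high * n / sample_rate))
        for n in range(count)
    ]


samples = dtmf_samples("5")
print(len(samples), DTMF_FREQS["5"])
```

    A receiver can identify the digit by detecting which low and which high frequency are present in the signal.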

    3.2.3 Addition of videoconference

    Videoconference calls can be used in many different situations, which is one of

    the reasons the technology is so popular. Although many people use videoconferencing

    recreationally, general uses include business meetings, educational

    training and collaboration among health officials. In fact, videoconferencing has been used

    in a wide variety of fields such as telemedicine, telecommunications, education,

    surveillance, security, emergency response, and so on.

    Perhaps the biggest benefit videoconferencing offers is the ability to meet with people in

    remote locations regardless of time, distance or cost. It can be used to keep in touch

    with the entire world without leaving home.

    ’A picture says a thousand words’. Videoconferencing does not replace real-life meetings, but

    enhances ’face-to-face’ communication, making it easier and more natural, regardless of where

    people are located.

    After establishing the audio/video session between two participants, the client object

    processes the received and sent video data. This object gets the incoming and outgoing video streams

    and shows each of them in a different video window. The method used is get_IVideoWindow(RTC_VIDEO_DEVICE

    enDevice, out IVideoWindow VWindow), included in the RTCClientClass class. The input

    parameter is an enumeration that specifies the video device (receive/preview); the output

    parameter refers to an interface used to control the video window properties.

    3.2.4 Contact List

    The contact list, also called address book, is a feature which allows users to store their

    friends’ personal information locally. It also lets users know whether their contacts are online.

    Users can call their friends with just a few clicks. It is easy, fast and convenient. All

    the information about the user’s buddies is persisted in a file on the user’s computer.

    The service that makes this possible is the presence information service. It is responsible

    for updating the contacts’ presence status and notifying them of the user’s status. Calls are made

    through a registrar server that maintains current location information of the contacts.

    The first stage consists of registering the user on the SIP server and enabling presence.

    The presence service can be enabled before registering the user’s profile on the server. The main

    steps are: create profile - enable presence - set presence status - enable profile.

    Once the profile is registered and presence is enabled, adding a new contact to the address

    book is simple. The IRTCClientPresence interface provides methods to add a buddy, remove a

    buddy, enumerate watchers, set local presence status, and so forth.

    If the buddy object is successfully created, using the IRTCBuddy interface the client object

    will be able to get the buddy’s presentity URI, name of the buddy, buddy’s SIP number, the

    buddy’s status, and some other data associated with the buddy.

    The contact list can be recovered by querying the client object using the IRTCClientPresence

    interface. From this interface, the contacts can be enumerated by calling the EnumerateBuddies

    method.

    3.2.5 Encryption of media

    It is essential to be aware of the risks of using VoIP, especially in the case of telephony,

    an application of vital importance. Combining telephony and computing

    also introduces security holes and dangers. Unsecured VoIP communication offers

    hackers a great opportunity for undesirable activities. Hackers can record calls as audio

    files, resend calls, make calls with a false identity, generate busy tones or manipulate call

    queues. Many programs for these purposes are available on the Internet. For that reason,

    a VoIP application must ensure security. Confidentiality, integrity and authenticity of data

    must be guaranteed at all times.

    In cryptography, encryption is the process of transforming information to make it unreadable

    to anyone except those possessing special knowledge, usually referred to as a key.

    One might think that encrypting media flows is sufficient to secure a VoIP

    communication, but this is not the case. Some media encryption protocols, like

    Secure Real-time Transport Protocol (SRTP), do not provide any method for key exchange

    or key management and rely on SIP signalling for this purpose. So, if SIP signalling is

    not encrypted or protected by some mechanism, anyone could obtain this key. In conclusion, it

    is necessary to encrypt both the media associated with a session and all SIP traffic to guarantee a

    secure VoIP communication.
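
    To make this risk concrete, the sketch below assumes that the SRTP key is carried in an SDP a=crypto attribute (the SDP Security Descriptions mechanism), which is one common way of exchanging SRTP keys through SIP signalling; the text does not say which mechanism is used. The key material is dummy data generated inside the example. It shows that anyone who can read an unprotected SDP body can recover the key with a few lines of Python.

```python
import base64


def make_crypto_line(master_key_and_salt: bytes) -> str:
    """Build an SDP crypto attribute carrying the key in base64 ('inline:')."""
    encoded = base64.b64encode(master_key_and_salt).decode("ascii")
    return "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:" + encoded


def extract_key(sdp_body: str) -> bytes:
    """Recover the key material from the first a=crypto line of an SDP body."""
    for line in sdp_body.splitlines():
        if line.startswith("a=crypto:"):
            key_param = line.split()[2]  # third field, "inline:<base64>"
            encoded = key_param.split(":", 1)[1].split("|")[0]
            return base64.b64decode(encoded)
    raise ValueError("no a=crypto attribute found")


# Dummy 30-byte value: 16-byte master key plus 14-byte salt for this suite.
secret = bytes(range(30))
sdp = "m=audio 49170 RTP/SAVP 0\r\n" + make_crypto_line(secret) + "\r\n"

# An eavesdropper who sees the cleartext SDP obtains exactly the same bytes.
print(extract_key(sdp) == secret)
```

    This is why encrypting the media alone is not enough: the signalling that carries the key needs protection as well.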

    SIP is not an easy protocol to secure. Encrypting the whole message would be the

    best means of ensuring security; however, SIP requests and responses cannot be entirely encrypted

    because some message fields, like Via, need to be read and modified by, for example,

    proxy servers. For that reason, it is recommended to use low-layer security mechanisms for

    SIP because they work hop-by-hop. In these kinds of mechanisms, servers are authenticated,

    so the end users can be sure with whom they are communicating.

    Transport or network layer security encrypts signaling traffic, guaranteeing message confidentiality,

    integrity, and, sometimes, authentication. RFC 3261 proposes two

    ways for securing the transport and network layer: Internet Protocol Security (IPSec) and

    Transport Layer Security (TLS).

    IPSec is a set of network-layer protocols for securing Internet Protocol (IP) communications.

    IPSec also includes protocols for cryptographic key establishment. It can be used with

    TC