FIX32/iFIX Networking Architecture

Upload: javierlera

Post on 29-Nov-2014


Contents

Discussion of FIX32 Client/Server Network Architectural Components
    Network Component Breakdown
        TCPTASK
        NBTASK
        NDK/NOH
        NNTABLE
        CONMGR
        DBASRV
        RDA
        UDBA
        LDBA
        EDA
        NAM
        NAC
        FTK/STK/LTK
    Tweaking the Communication Process
        RDAThrottle
        Session Timers
        Dynamic Connections
IPC Mechanisms Used By FIX32
    Windows Sockets Interface
    NetBIOS Interface
Differences Between FIX32 and iFIX Network Communication
    Multiple LAN Support
    LAN Redundancy
    Performance Improvements
Tying Everything Together: A Close Examination of FIX32 Client/Server Communication

Discussion of FIX32 Client/Server Network Architectural Components

Probably the best approach to learning FIX32/iFIX networking is to understand the “big picture”. Being able to picture a working block diagram of the FIX server and client networking components will make your troubleshooting efforts much more efficient.

The contents and examples in this white paper are geared towards the FIX32 networking model. No troubleshooting is covered; just a detailed discussion of the various networking components, how they interact with one another, and how FIX32 client/server communication is accomplished. It is assumed that we have a working client/server connection regardless of transport protocol.

What you will learn can also be applied to iFIX networking. There are some minor differences between FIX32 and iFIX networking, which are discussed in the section “Differences Between FIX32 and iFIX Network Communication” later in this paper.

Network Component Breakdown

A block diagram of the FIX32 client/server model is shown in Figure 1. Components colored in gray represent those found on a FIX32 node with SCADA support enabled. All remaining blocks are common to both FIX clients and servers. From this point forward, the word “client” represents a FIX node without SCADA support enabled in the SCU file. The word “server” means just the opposite.

One essential component not pictured in Figure 1 is the license key. Keys come in two forms: a physical parallel port key or a soft key (keyless option). From a customer support standpoint, the keyless option should never be advertised to customers. This type of key is only provided to customers by our sales division under special circumstances. It is important to be aware that a key affixed to a FIX32/iFIX node does not necessarily mean that networking has been enabled for that machine. Networking is one of the various options that can be enabled on keys. Some FIX packages sold to OEMs by our sales division will not have the networking option enabled, probably because the machine(s) will be standalone or will be using our STK.


Figure 1: FIX32 Client/Server Model

TCPTASK

TCPTASK is a service that has the NDK/NOH interface on one side and the Windows Sockets Interface (WINSOCK) on the other. Its duties include constructing WINSOCK library calls, performing bookkeeping on session timers (Receive) and outstanding messages, and handling/routing NDK/NOH communication for NAC, NAM, CONMGR, and DBASRV.

When messages come in on the server side, TCPTASK looks at the contents of the packets to find out which are database requests or alarm requests. These requests are then placed into the appropriate queues in the global memory buffer space.

Before client connection requests can be received, TCPTASK needs to set up a TCP listening port by calling WINSOCK. This results in the creation of a socket with the server’s IP address and TCP port number 2010. When the listening port detects a client connection request, a new socket is generated that contains the client side port number and IP address. All sockets created for client requests flow through the same port number 2010. Each socket created for a client will actually have two logical connections (assuming Alarm Network Service is enabled on the server SCU). The first logical connection is for data and the other is for alarming.
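The listening-port behavior described above can be illustrated with ordinary TCP sockets. This is a minimal sketch, not TCPTASK code: the function names are hypothetical, the loopback address is an assumption for the demonstration, and the demo binds to an operating-system-assigned port by default so that it runs on machines where 2010 is already taken.

```python
import socket

FIX_TCP_PORT = 2010  # listening port named in the text


def open_listener(host="127.0.0.1", port=FIX_TCP_PORT):
    """Create a listening TCP socket (hypothetical helper, not a FIX API)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener


def demo(port=0):
    """Connect one client and report the address pair seen on each side.

    port=0 asks the operating system for any free port; pass FIX_TCP_PORT
    to mimic the port used by FIX32.
    """
    listener = open_listener(port=port)
    server_addr = listener.getsockname()
    client = socket.create_connection(server_addr)
    accepted, peer_addr = listener.accept()  # new socket for this client
    result = {
        "listening": server_addr,
        "accepted_local": accepted.getsockname(),
        "accepted_peer": peer_addr,
        "client_local": client.getsockname(),
    }
    for sock in (client, accepted, listener):
        sock.close()
    return result


if __name__ == "__main__":
    for name, addr in demo().items():
        print(name, addr)
```

The accepted socket reports the same local port as the listening socket, while its peer address carries the client's IP address and dynamically assigned port.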

TCPTASK has one table associated with it: the Socket Table (SKT). This table exists in the global memory buffer space. The SKT stores the list of socket records in use, which represent connections to remote nodes. Information on timeouts, IP address, and transactions per second is recorded. Each socket record contains various counters that TCPTASK updates with communication data to manage the connection.

NBTASK

NBTASK is a service that has the NDK/NOH interface on one side and the NetBIOS interface on the other. Its responsibilities include constructing calls to the NetBIOS interface, performing bookkeeping on session timers (Send and Receive) and outstanding messages, and handling/routing NDK/NOH communication for NAC, NAM, CONMGR, and DBASRV.

With existing client connections on the server side, NBTASK looks at the contents of the packets to find out which are database requests or alarm requests. These requests are then placed into the appropriate queues in the global memory buffer space. If the Alarm Network Service on the server SCU has been enabled, there will be two sessions per client connection. One will be for data and the other for alarming. The lack of bi-directional communication on a NetBIOS session is the reason why two sessions are required.

NBTASK has one table associated with it: the NetBIOS Table (NBT). This table exists in the global memory buffer space. The NBT stores the list of NBT records, which represent connections with remote nodes. Information on LANA 0, rate of transactions, Network Control Block (NCB) usage, and maximum number of records in use at one time is also retained. Each record contains data on whether the connection should be cleaned up or not and helpful reply/request logic.

NDK/NOH

Network Driver Toolkit (NDK) and Network Object Handler (NOH) contain dynamic link libraries that allow exe-to-exe communication between TCPTASK/NBTASK and other executables. NDK/NOH calls set up the global memory buffer space that allows this interaction between two disparate executables, and they also indicate to RDA how packets will be sized for transmission. Global memory can be accessed by any executable. From a Windows NT security perspective, two executables cannot directly communicate with one another, so global memory acts as a communication conduit between the two.

NDK/NOH roots come from a past decision to allow the possibility of other transport protocols (IPX for example) to be used besides TCP/IP and NetBEUI. This was supposed to come in the form of a toolkit to be provided to developers, but the idea for a toolkit was rejected.

NNTABLE

NNTABLE is an executable launched automatically during the FIX32 startup process. It is responsible for creating two tables that CONMGR examines to form remote connections. The two tables reside in the global memory buffer space and are named the Logical Name Table (LNT) and Network Name Table (NNT).


CONMGR

CONMGR is responsible for establishing/reestablishing communications and periodically checks for connection status. It is responsible for managing the data connection and it accomplishes this through a heartbeat mechanism on the client node using the SCU configurable session timers Keep Alive and Inactivity.

Two tables are used by CONMGR: the Logical Name Table (LNT) and Network Name Table (NNT). The records in these tables represent nodes on the FIX network. The NNT records represent actual nodes on the network while the LNT records represent “virtual nodes”, which are groups of one or more actual nodes that can be treated as a single node for the purpose of data and alarm communications. SCADA redundancy relies on the entries in LNT to work correctly.

The CONMGR on a client is responsible for detecting communication losses and reestablishing them, without having to restart client nodes. Even if data will not be requested, it will connect or reconnect. CONMGR was designed to be very efficient in managing connections because from a TCP/IP standpoint, WINSOCK is not very reliable in doing this.

When the CONMGR on a client detects a connection loss to a server, an immediate connect attempt (one shot) is made. If there is no response to this connect, then the client will try to connect every 20 seconds. Also, any period of inactivity that exceeds 20 seconds results in CONMGR checking the connection. By default, the Keep Alive timer is set to 20 seconds, and this is the timer CONMGR uses to check for inactivity.
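The retry behavior just described can be sketched as a small schedule calculation. The function names below are hypothetical, and the sketch assumes that the first retry occurs one interval after the one-shot attempt, which the text does not state explicitly.

```python
KEEP_ALIVE_SECONDS = 20  # default Keep Alive value given in the text


def connect_attempt_times(loss_time, horizon, retry_interval=KEEP_ALIVE_SECONDS):
    """Return the times (in seconds) of connect attempts after a loss.

    The first attempt is the immediate one-shot; later attempts repeat
    every retry_interval seconds up to and including horizon.
    """
    times = [loss_time]
    t = loss_time + retry_interval
    while t <= horizon:
        times.append(t)
        t += retry_interval
    return times


def needs_connection_check(last_activity, now, keep_alive=KEEP_ALIVE_SECONDS):
    """True when the period of inactivity exceeds the Keep Alive timer."""
    return (now - last_activity) > keep_alive


if __name__ == "__main__":
    print(connect_attempt_times(loss_time=0, horizon=60))
    print(needs_connection_check(last_activity=0, now=25))
```

For a loss at time 0 and a 60 second window, the attempts fall at 0, 20, 40, and 60 seconds.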

Each record stored in NNT contains connection statistics for a remote node. It is this information that CONMGR uses to manage this connection. Figure 3 displays a flowchart of CONMGR managerial duties.


Figure 3: Logic that CONMGR uses to manage connections

DBASRV

Database Server (DBASRV) is a service that runs on server nodes only. When SCADA support has been enabled in the SCU, this service is made available. DBASRV is responsible for handling incoming process database value reads and writes from clients. As requests are received, they are placed into the global memory buffers defined by NDK/NOH. This memory acts as a temporary storage location to hold these requests until DBASRV can process them.

RDA

Remote Data Access (RDA) is responsible for breaking up and piecing together FIX information for transmission. RDA is a dynamic link library that is called by EDA applications. Knowledge of how to break up FIX information for transmission (packet size) is provided by NDK/NOH. All EDA traffic flows through RDA. This traffic flow is controlled by RDAThrottle, a FIX registry entry that tells RDA how many requests can be made at a single time to a server.

UDBA

Universal Database Access (UDBA) is a dynamic link library that is responsible for deciding where the information that EDA is requesting resides. The information may come from the local process database or from a data system integrated into our product via System Extension Toolkit (STK). If the data is requested locally, UDBA calls the Local Database Access (LDBA) dynamic link library.

LDBA

Local Database Access (LDBA) is a dynamic link library that is responsible for deciding if the information requested will come from a standard FIX block or from one created with the Loadable Block Toolkit (LTK).

EDA

Easy Database Access (EDA) is a dynamic link library that receives requests from VIEW or from any EDA compliant application created with the FIX Integration Toolkit (FTK). If the request is to the local node, EDA calls UDBA, otherwise RDA is called.
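The routing decisions made by EDA, UDBA, and LDBA can be summarized in a short sketch. This only restates the decision logic described above; the function, its parameters, and the string labels are hypothetical and do not correspond to any real FIX32 API.

```python
def route_request(target_node, local_node, source="process_db", block_type="standard"):
    """Return the chain of components a data request would pass through.

    source:     "process_db" for the local process database, or "stk" for a
                data system integrated with the System Extension Toolkit.
    block_type: "standard" for a standard FIX block, or "ltk" for a block
                built with the Loadable Block Toolkit.
    """
    if target_node != local_node:
        # Remote request: EDA hands the request to RDA for transmission.
        return ["EDA", "RDA"]
    path = ["EDA", "UDBA"]
    if source == "stk":
        path.append("STK data system")
    else:
        path.append("LDBA")
        path.append("LTK block" if block_type == "ltk" else "standard block")
    return path


if __name__ == "__main__":
    print(route_request("SCADA1", "VIEW1"))
    print(route_request("SCADA1", "SCADA1"))
    print(route_request("SCADA1", "SCADA1", block_type="ltk"))
```

The node names SCADA1 and VIEW1 are placeholders chosen for the example.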

NAM

Network Alarm Manager (NAM) is a service that is also available once SCADA support has been enabled. NAM is responsible for establishing alarm sessions, handling local and remote NAC requests, and sending new alarms. NAM uses four alarm queues, which are contained in the same global memory buffer space that DBASRV uses: NAM Receive Queue, NAM Control Queue, Startup Queue, and Summary Startup Queue. The NAM Receive Queue stores alarm information from the server itself and from other servers. These are not requests, but actual alarm or message data. This data is then available to any alarm client that has an established alarm session request fulfilled and listed in the NAM Control Queue.

When a session is first established between NAM and an alarm client, NAM may also be responsible for sending past alarm data to clients if the server has been configured to do so. This is dependent on whether or not the Alarm Startup Queue service has been enabled in the server SCU. If so, alarm data contained in the Startup Queue and Summary Startup Queue will be sent to the alarm client.

NAC

Network Alarm Client (NAC) is a service that runs on both server and client machines. Its responsibilities include initiating alarm connections, sending operator messages, receiving alarm/message data either locally or over the network, and managing alarm connection(s) by sending “heartbeats” periodically to remote NAM(s). Depending on what is needed to manage the alarm connection, NAC will send a heartbeat or a connect request roughly every 2 minutes. The heartbeat interval is hard coded and not configurable. A response to the heartbeat must be received within the Receive session timer setting; otherwise NAC issues a request to reconnect. In maintaining alarm connections, NAM is responsible for establishing/reestablishing connections, but it will not do so until told to by a remote NAC.

NAC has two alarm queues: NAC Send Queue and NAC Control Queue. Both queues reside in the global memory buffer space. The NAC Control Queue contains NAM responses to requests that NAC has made to form alarm connections. The NAC Send Queue only exists on server machines and receives alarms that SAC generates. This alarm data is then made available to alarm clients by placing it into the NAM Receive Queue. Figure 2 displays a flowchart of NAC alarm communication management.

Figure 2: Flowchart of NAC alarm communication management


FTK/STK/LTK

FIX32 has 3 major toolkits: FIX Integration Toolkit (FTK), System Extension Toolkit (STK), and Loadable Block Toolkit (LTK). These products allow seasoned developers to tailor their FIX32 implementation as required.

The FIX Integration Toolkit is a set of application program interfaces and libraries that allow a customer with development experience in VB or C++ to write programs that directly read, write, and modify their process database. This toolkit also contains a help file that briefly describes each API and how it is used, and a set of functions that allow the customer to access their historical data as well. We recommend the FIX Integration Toolkit to customers who have project specifications that require high-speed database access and modification.

The System Extension Toolkit is a set of application program interface descriptions and libraries that guide customers with extensive development experience on how to adapt a native data engine to our graphical front end. The STK enables customers to integrate their data systems into our product. It replaces our process database, including SAC. For customers who understand our product and the components that make up the CORE, this toolkit sits at the LDBA level. The toolkit contains libraries and a Word document that describes the APIs and how to develop with them.

The Loadable Block Toolkit is a set of libraries and instructions on how to write a custom block for our database system. This toolkit is designed for experienced developers who understand pointer usage, structures, and function tables. It allows customers to create blocks for specific data, native systems, and custom designed data manipulation.

Tweaking the Communication Process

Although the default FIX32 communication settings are usually sufficient for proper FIX network communication, there may be situations where these settings need to be adjusted to better suit your implementation. These situations may involve a large FIX32 network that contains a myriad of clients, a slow communication medium between two separate FIX32 subnets or networks, an overloaded server, connection time charges, or occasional interest in data or alarms.

This section discusses ways to tweak FIX32 networking. These adjustments do not in any way remove or bypass any of the FIX network architectural objects discussed in the section “Discussion of FIX32 Client/Server Network Architectural Components”. All they do is affect how some of the components function.


RDAThrottle

Around the FIX 6.0 era, it was noted that if a good number of clients were connected to a server, any client opening a picture (especially one that was unresolved) would dominate more server time than the other clients. This was due to the flurry of activity View generated to initially populate a picture with values or to map object data sources in pictures to associated tags in a process database. When this situation occurred, other client nodes noticed a performance drop in how their picture values were updated. It was decided that some sort of “throttle” was needed to control how RDA behaved. This throttle had to be configurable, so an entry for it was placed into the registry. The result was the RDAThrottle registry setting.

RDAThrottle applies only to FIX nodes that make outgoing data requests (clients). FIX 6.0 had a default value of 5 for this setting. Later FIX32 versions now use 30. All FIX HMI products besides iFIX use a packet size of 1,400 bytes, so FIX 6.0 will send (5) 1,400 byte packets, and later versions would send (30) 1,400 byte packets. If we use the term “message” instead of packet, one message is 1,400 bytes.

An example of when RDAThrottle should be used would be a picture opening for the first time on client A. Say that this picture is unresolved and will require 70 messages to build the picture. All other clients are just requesting picture values. If you could see the contents of the DBASRV queue in global memory on the server, you would see a chunk of 30 messages from client A, a few small requests from other clients, another chunk of 30 messages from client A, a few small requests again from other clients, and finally a chunk of ten messages from client A. Figure 4 gives a pictorial representation of the example just described. Notice how an RDAThrottle setting of 30 on client A does not allow it to dominate DBASRV server resources.
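The chunking in this example is simple arithmetic, sketched below. The function name is hypothetical; the 1,400 byte message size and the throttle values 5 and 30 come from the text.

```python
MESSAGE_BYTES = 1400  # FIX32 message (packet) size given in the text


def throttle_chunks(total_messages, rda_throttle):
    """Split total_messages into chunks of at most rda_throttle messages."""
    if rda_throttle <= 0:
        raise ValueError("rda_throttle must be positive")
    chunks = []
    remaining = total_messages
    while remaining > 0:
        size = min(rda_throttle, remaining)
        chunks.append(size)
        remaining -= size
    return chunks


if __name__ == "__main__":
    chunks = throttle_chunks(70, 30)
    print(chunks)
    print([n * MESSAGE_BYTES for n in chunks])
```

With a throttle of 30, the 70 messages go out as chunks of 30, 30, and 10 messages, so the largest burst is 30 x 1,400 = 42,000 bytes. With the FIX 6.0 default of 5, the same picture would take 14 chunks of 5 messages (7,000 bytes each), leaving more room for other clients between chunks.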

RDAThrottle is really designed for pictures that are not resolved. Once a picture is open, requests for picture values are small when compared to requests for building a picture. The only time this setting should be altered is when a tremendous number of client nodes are communicating with a server. In this situation, the default value (30) would be lowered, and the change should be applied to ALL clients, not just to machines experiencing performance drops, so that they play nice together.


Figure 4: Pictorial example of RDAThrottle usage

Session Timers

FIX32 uses network session timers to manage connection lifetimes. These timers are configured in the SCU and apply to all outgoing connections.

The four timeout values are described below:

Send - defines the amount of time that a client waits for a request to the server to be acknowledged. This acknowledgement is different than the message response. If this timer expires, the session ends.

Receive - defines the amount of time that a client waits for a reply from the server.

Keep Alive - defines the amount of time that, if no activity has occurred over an established connection, a client waits before sending a heartbeat message.

Inactivity - defines the amount of time that, if no data activity has occurred over an established dynamic connection, a client waits before removing the dynamic connection from the list of outgoing connections. If this timer expires, the session ends. The Inactivity timer only applies to outgoing connections and specifically Dynamic Connections. Connections listed in the Remote Node List in the SCU will not be disconnected after a period of inactivity.


TCPTASK uses only the Send and Receive session timers. When running FIX32 over TCP/IP, the effective session timeout value is either the Send timer or the Receive timer; whichever is greater. If this timer expires, the session ends. The Send/Receive session timers do not alter any TCP/IP protocol registry settings. They are only used in the TCPTASK communication management process. NBTASK provides the NetBIOS interface with the values of the Send and Receive timers when it needs to form a session. CONMGR only uses the Keep Alive and Inactivity session timers for its connection management.
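The division of timers between components, and the "whichever is greater" rule for TCP/IP, can be sketched as follows. The class and function names are hypothetical, and the timer values in the usage lines are illustrative only; the text gives a default only for Keep Alive (20 seconds).

```python
from dataclasses import dataclass


@dataclass
class SessionTimers:
    """The four SCU session timers, in seconds."""
    send: int
    receive: int
    keep_alive: int
    inactivity: int


def effective_tcp_timeout(timers):
    """Effective session timeout over TCP/IP: the greater of Send and Receive."""
    return max(timers.send, timers.receive)


def timers_used_by(component, timers):
    """Return the timers each component consults, per the description above."""
    if component in ("TCPTASK", "NBTASK"):
        return {"send": timers.send, "receive": timers.receive}
    if component == "CONMGR":
        return {"keep_alive": timers.keep_alive, "inactivity": timers.inactivity}
    raise ValueError(f"unknown component: {component}")


if __name__ == "__main__":
    # Illustrative values only, not documented defaults (except keep_alive).
    timers = SessionTimers(send=30, receive=60, keep_alive=20, inactivity=300)
    print(effective_tcp_timeout(timers))
    print(timers_used_by("CONMGR", timers))
```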

Dynamic Connections

Dynamic Connections is a FIX32 feature that creates server connections “on the fly” if a picture should be opened that contains a server name not listed on the client SCU as a Configured Remote Node. Once a dynamic connection is made to the server, it remains connected even if the picture or View is closed.

You may wish to use Dynamic Connections to manage connection lifetime for either or both of the following reasons:

You are charged for connection time and you want to minimize it: Wide area connections such as ISDN can charge for connect time. You should enable the Inactivity timer on the client that is connecting to the server via ISDN.

You have only occasional interest in data or alarms: Even if you have only occasional interest in server data, it makes sense to use a dynamic connection rather than add that server name to your Remote Node List. Using dynamic connections will ensure that you stop receiving server alarms when you stop requesting data. You should enable the Inactivity timer on the client to close the connection when data requests stop.

The definition of inactivity is that there are 0 data messages for the configured amount of time. Data messages are any messages initiated by any application but Connection Manager or NAC/NAM. The following messages will keep a connection alive:

Real time data requests and writes

Database Builder messages, including database reload messages.

Alarm acknowledgments

Alarms are not data and they will not keep a session alive. For this purpose, alarms include:

Block alarms

Alarm management messages (NAC management messages)

Operator messages.

Recipe messages.

SAC restart messages

There are also connection management and "heartbeat" messages sent periodically, and these will not keep the connection alive either.
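The inactivity rule above amounts to a filter on message types. The sketch below encodes the two lists as sets; the string labels and function names are hypothetical, chosen only to mirror the categories named in the text.

```python
# Message kinds that count as data and keep a dynamic connection alive.
KEEPS_ALIVE = {
    "realtime_data_request",
    "realtime_data_write",
    "database_builder",
    "alarm_acknowledgment",
}

# Message kinds that do not reset the Inactivity timer.
DOES_NOT_KEEP_ALIVE = {
    "block_alarm",
    "alarm_management",
    "operator_message",
    "recipe_message",
    "sac_restart",
    "connection_management",
    "heartbeat",
}


def last_data_activity(message_log, connected_at=0):
    """Return the time of the most recent data message, or connected_at."""
    times = [ts for ts, kind in message_log if kind in KEEPS_ALIVE]
    return max(times, default=connected_at)


def is_inactive(message_log, now, inactivity_timeout, connected_at=0):
    """True when no data message arrived within inactivity_timeout seconds."""
    return now - last_data_activity(message_log, connected_at) >= inactivity_timeout


if __name__ == "__main__":
    log = [(10, "realtime_data_request"), (50, "heartbeat"), (90, "block_alarm")]
    print(is_inactive(log, now=100, inactivity_timeout=60))
```

In the usage lines, the heartbeat at 50 seconds and the block alarm at 90 seconds are ignored, so the connection is judged against the data request at 10 seconds.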


The Inactivity timer does not affect incoming connections or alarm connections. A server should not terminate the connection just because the view node is not requesting data. Server connections should however be cleaned up after the session is dropped by the client. Alarm connections should be cleaned up when their accompanying data session is cleaned up.

A dynamic connection that never gets successfully established is inactive. An example of this is a mistyped SCADA node name or the name of a SCADA that no longer exists. When the inactivity timer expires, its allocated session/socket should be released, and its LNT_REC and NNT_REC should be removed.

IPC Mechanisms Used By FIX32

Interprocess communication (IPC) mechanisms allow communication and data sharing between applications. Most current operating systems provide the following IPC mechanisms:

Named pipes

Mailslots

NetBIOS

Windows Sockets

Remote Procedure Calls (RPC)

Network Dynamic Data Exchange (NetDDE)

NBTASK uses the NetBIOS interface while TCPTASK uses the Windows Sockets Interface.

Windows Sockets Interface

The Windows Sockets specification defines a network-programming interface for Microsoft Windows. Using WINSOCK permits your application to communicate across any network that conforms to the Windows Sockets API.

Sockets provide a bi-directional channel for exchanging data between networked computers. When a WINSOCK compliant application (FIX32 for example) creates a socket, it specifies the IP address of the intended host, the transport protocol used (TCP/UDP), and the port that the particular application will use.

When a server first starts up, TCPTASK creates a socket with listening port 2010. Each client connection results in the construction of another socket that uses the same port number. The port numbers on the client side are assigned dynamically by the operating system when a request for a socket is made by TCPTASK on that node.

A port functions as a multiplexed message queue (can receive more than one message at a time) and this is what allows two logical connections to be formed on the same socket: the first connection by TCPTASK for data, and the other for alarming by NAC.

NetBIOS Interface


Network Basic Input/Output System (NetBIOS) defines a software interface and a naming convention, not a protocol. The NetBEUI protocol, introduced by IBM in 1985, provided a protocol for programs designed around the NetBIOS interface. However, NetBEUI is a small protocol with no networking layer and because of this, it is not a routable protocol suitable for medium-to-large intranets.

A NetBIOS “session” is a logical connection between any two processes on the network. For FIX32 to establish a session, NBTASK on the server side issues an NCBLISTEN command to prepare to open a session. The server is now ready for client connection requests. When a client wants to connect to the server, it issues an NCBCALL command. After a session is established, the computers can exchange data using NetBIOS commands. Since NetBIOS sessions are uni-directional, separate sessions are needed for data and alarming.

Differences Between FIX32 and iFIX Network Communication

FIX32 and iFIX network architecture are very similar in design. If you fully understand how the underlying FIX32 network components function and communicate with one another, you understand iFIX networking. There is only one minor exception though, which we will discuss shortly.

Advances in networking technology, Windows operating systems, and computer resources helped shape decisions on iFIX development. These changes allowed iFIX to make use of multiple network interface cards and introduced two major network performance enhancements for iFIX data/alarming transmission.

Multiple LAN Support

FIX32 NBTASK only attempted to communicate over LANA 0 when the SCU was configured to use the NetBEUI protocol. The problem with this was that other protocols and additional network interface cards could be installed on a system, and if this was the case, it introduced additional network paths (LANA numbers) through the NetBIOS interface. Since FIX32 only used LANA 0, there was a possibility that LANA 0 was not using NetBEUI. If a decision was made to use NetBEUI, and this protocol was not tied to LANA 0, reconfiguration was needed. This involved either changes to LANA number configuration, command line switches when loading NBTASK, or changes in the NETWORK.INI file. TCPTASK did not have this issue because WINSOCK was used, not NetBIOS over TCP/IP (NetBT).

A similar issue existed with TCPTASK. If a machine contained more than one network interface card (NIC), there was more than one IP address. TCPTASK, in this case, would use the first IP address listed in the operating system, and this sometimes resulted in FIX32 trying to communicate with a completely different network instead of the FIX network on the other NIC.

Fortunately, iFIX provides multiple LAN support. NBTASK in iFIX examines all LANA numbers to see which one is using NetBEUI. No user configuration is required for LANA numbers. With TCPTASK, all IP addresses and NICs are noted. Now, when iFIX starts, it examines all available paths for the protocol configured in the SCU and uses whichever path connects to the server first.


Multiple LAN support allows path exclusion. More than likely, only one path is associated with an iFIX network, so the ability to disable other network paths is available. This feature prevents iFIX from connecting over a non-iFIX network and, in particular, from consuming bandwidth on that network.

LAN Redundancy

LAN redundancy is another new network feature in iFIX that takes advantage of the ability of iFIX to detect and work with multiple network paths. With LAN redundancy, you can install two NICs in a machine, each connected to the same iFIX network/subnet. If a client detects a communication failure on one path, it simply uses the other one.

The difference between multiple LAN support and LAN redundancy is that LAN redundancy monitors the next available non-active route. A heartbeat is sent over the non-active route by CONMGR based on the value of the Keep Alive session timer.

A new table in global memory is required to keep track of redundant LAN connections. This new table is called the Logical Connection Table (LCT). The information in this table indicates LAN failovers and which connection is the active one.

So in the end you have two available paths; one active and the other inactive. What makes LAN redundancy stand apart from multiple LAN support is the managed backup connection.
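The active/standby arrangement can be modeled in a few lines. This is only an illustration of the idea: the text does not describe the actual layout of the Logical Connection Table, so the class, its fields, and the path names below are all hypothetical.

```python
class LogicalConnection:
    """Hypothetical record tracking one active path and one standby path."""

    def __init__(self, primary, backup):
        self.active = primary
        self.standby = backup
        self.failovers = 0  # number of times the active path has changed

    def report_failure(self, path):
        """Swap to the standby path when the active path fails.

        A failure reported on the standby path leaves the active path alone.
        Returns the path that is active after the report.
        """
        if path == self.active:
            self.active, self.standby = self.standby, self.active
            self.failovers += 1
        return self.active


if __name__ == "__main__":
    conn = LogicalConnection(primary="NIC-A", backup="NIC-B")
    conn.report_failure("NIC-A")
    print(conn.active, conn.standby, conn.failovers)
```

A fuller model would also record the health of the standby path, since the text notes that CONMGR sends heartbeats over the non-active route.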

Performance Improvements

Two major performance improvements were introduced into the iFIX product. The first is the elimination of DBASRV. It was noted that DBASRV created a bottleneck because TCPTASK in FIX32 could only communicate with DBASRV through global memory. DBASRV functionality has now been incorporated into TCPTASK in iFIX.

The second performance enhancement was packet size. In the FIX32 era, packet size was limited to 1,400 bytes. Even though packet size was no longer an issue as FIX32 advanced, a larger size was never adopted because backwards compatibility would be needed if earlier FIX32 node versions existed on the network. The overhead of small packets was causing performance degradation. A decision was made to increase the packet size to 16K in iFIX. This is possible due to new logic incorporated in RDA. RDA is smart enough to know how to format or piece together 1,400 byte and 16K packets.
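A quick calculation shows why the larger packet size reduces overhead. The sketch assumes that 16K means 16,384 bytes and ignores any per-packet headers; the function name and the 98,000 byte payload are illustrative.

```python
FIX32_PACKET_BYTES = 1400
IFIX_PACKET_BYTES = 16 * 1024  # assumption: "16K" means 16,384 bytes


def packets_needed(payload_bytes, packet_bytes):
    """Number of packets required to carry payload_bytes (ceiling division)."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    return -(-payload_bytes // packet_bytes)


if __name__ == "__main__":
    payload = 70 * FIX32_PACKET_BYTES  # 98,000 bytes, illustrative
    print(packets_needed(payload, FIX32_PACKET_BYTES))
    print(packets_needed(payload, IFIX_PACKET_BYTES))
```

Under these assumptions, a 98,000 byte transfer needs 70 packets at 1,400 bytes but only 6 packets at 16K.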

Tying Everything Together: A Close Examination of FIX32 Client/Server Communication

Now that we have a good understanding of FIX32 network architecture, let us see how everything ties together. Our examination assumes a single client/server model, network alarming, and that the TCP/IP protocol is configured in the SCU on both machines (since TCP/IP is the most commonly used protocol with FIX32). Figure 5 displays a flowchart that can be applied to either a client or server node. It describes the successive communication tasks performed on startup of a FIX32 node.


Figure 5: Flowchart of client/server startup tasks to form FIX network connection
