rina detailed components overview and implementation discussion

RINA Workshop The Pouzin Society RINA Detailed Overview and Implementation Discussions RINA Workshop. Barcelona, January 22-24 2013

Upload: irati-project

Post on 24-May-2015


TRANSCRIPT

Page 1: RINA detailed components overview and implementation discussion

RINA Workshop

The Pouzin Society

RINA Detailed Overview and Implementation Discussions

RINA Workshop. Barcelona, January 22-24 2013

Page 2: RINA detailed components overview and implementation discussion

Overview

• Distributed Applications
  – Naming, Flows, Application API
  – Common Application Connection Establishment and CDAP
  – SDU Protection
• The DIF and the IPC Process
  – Block Diagram (Reference vs. Implementation Architecture)
  – RIB and RIB operations
  – Enrollment
  – Flow Allocation
  – Transport protocol: EFCP, DTP, DTCP
  – Relaying and Multiplexing: RMT
  – Routing and Forwarding
  – Resource Allocation
• Shim DIF
  – Internet/IP/TCP/UDP use with RINA
• IDD
• Misc. topics: Network Mgt., Security, …
• Demo storyboard
• RINABand

Page 3: RINA detailed components overview and implementation discussion

DISTRIBUTED APPLICATIONS: Naming, Flows and Application API

Page 4: RINA detailed components overview and implementation discussion

Distributed applications

• For A and B to communicate, they need:
  – A means to identify each other -> Application process naming
  – A medium that provides a communication service -> Flows
  – A way to tell the communication medium that they want resources allocated for a particular communication, with certain quality requirements -> Communication medium API
  – A shared domain of discourse -> Objects
  – Optionally, a way to verify who they are talking to (authenticate), and to negotiate which protocol will carry the data they exchange and which concrete encoding will be used -> Application connection
  – A method to carry their discourse (objects) -> Application protocol

[Figure: Appl. Process A and Appl. Process B, each holding a local handle to a particular instance of a communication, joined by a flow through the medium that enables applications to communicate; the application connection and the application protocol run over the flow.]

Page 5: RINA detailed components overview and implementation discussion

Distributed applications in the Internet

• For A and B to communicate, they need:
  – Application process naming: There are no names for applications; IP addresses and ports are all we have (URLs are pathnames to the applications)
  – Flows: Only 2 types, TCP (with some variants) or UDP, each with fixed characteristics
  – Communication medium API: An application needs to know the PoA and port that the other application is attached to in order to allocate a flow, and has no means to express the desired properties of the flow
  – Objects: Vary depending on the application protocol used
  – Application connection: Applications have to know in advance which application protocol is going to be used; authentication is done through separate protocols
  – Application protocol: Many protocols and encodings, tailored to different purposes

[Figure: applications A and B communicating over the medium, annotated for the Internet. Application names do not exist, so an IP address or domain name is used; the handle is a well-known port and is no longer local; the flow is a TCP connection or UDP flow; the application connection is not a generic mechanism and is partially provided through different protocols; the application protocols are many: HTTP, SMTP, FTP, Telnet, RTP, SNMP, SSH, XMPP, …]

Page 6: RINA detailed components overview and implementation discussion

Distributed applications in RINA

• For A and B to communicate, they need:
  – Application process naming: Complete application naming theory; no addresses internal to the communication medium are exposed to apps
  – Flows: Flows can have a myriad of characteristics, tailored to different application requirements
  – Communication medium API: Request allocation of flows to other applications by name; request desired properties for each flow
  – Objects: Each application decides on their contents and encoding
  – Application connection: Generic application connection establishment procedure, where different authentication policies can be plugged in
  – Application protocol: A single application protocol that can have multiple encodings: CDAP

[Figure: applications A and B communicating over the medium, annotated for RINA. Both ends are identified by application names and hold a local handle (portId); flows can have different QoS characteristics; the application connection is a generic mechanism with different authentication policies; the application protocol is CDAP.]

Page 7: RINA detailed components overview and implementation discussion

RINA API

• The RINA API needs to be different in many ways from conventional Internet operations (usually “sockets”)
  – Application flows connect named Applications, not addresses/ports
  – RINA takes responsibility for locating the destination of a flow request, whereas current practice is to use macro-for-the-address mapping (e.g., DNS) and then use the absolute address returned plus a “known port”
    • RINA Applications are “registered” in order to be found by their name
  – Applications can be reached at multiple points (AEs, more later)
  – Applications can reject a request for a flow before it is created and then authenticate the requestor before establishing a connection
  – RINA allows a flexible specification of the requested quality/properties of a flow
  – RINA transport occurs in application-defined units (“Service Data Units” or SDUs) vs. a stream of unstructured bytes that forces applications to do their own delimiting into meaningful units
• Current APIs don’t provide access to the full set of RINA benefits
  – Though it is not strictly necessary to have a common cross-system API for RINA, it would still be a Good Thing to have, as sockets was for IP

Page 8: RINA detailed components overview and implementation discussion

Naming – Points to Remember

• The Internet does not name applications
  – The Internet doesn’t really name nodes/applications: everything addressable is reached using “absolute addresses” (IP addresses, ports)
  – DNS is a name-to-number mapping, but applications contact other applications using the absolute addresses returned
    • There is no virtual addressing (NAT is arguably a step toward it, but interacts in complex ways with DNS)
• RINA is different
  – Applications are named
    • There can be multiple simultaneously executing instances of an application, so they must be distinguished. The “application instance” is an integral part of the application name.
    • An application may contain multiple “Application Entities” (next slide)
  – Applications do not know “addresses” of each other, only names

Page 9: RINA detailed components overview and implementation discussion

Application naming

• The Application Process Model
  – Application Process Name: the name of the app
  – Application Process Instance: differentiates specific instances of the same app
  – Application Entity Name: names the part of the application concerned with communication, which is associated with a subset of all the existing application objects
  – Application Entity Instance: refers to a particular instantiation of an application entity (that is, a particular instance of a communication, associated with a concrete instantiation of an application protocol and a set of objects)

[Figure: an Application containing two Application Entities, followed by an example. In the public Internet, a Browser application (instance 1) running the Gmail app (instance 1) has HTTP AE instances connected by TCP connections to HTTP AE instances of Gmail Server application instances 1 and 2. In a private network, DRDA AE instances of the Gmail Server applications are connected by TCP connections to DRDA AE instances of DB Server application instance 1.]
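The four-part name can be sketched as a small value type. This is only an illustration: the field names, the string-valued instances and the "unset part matches anything" rule are assumptions made for the example, not something the slides define.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApplicationName:
    """Four-part name from the Application Process Model (field names are illustrative)."""
    process_name: str                        # Application Process Name
    process_instance: Optional[str] = None   # Application Process Instance
    entity_name: Optional[str] = None        # Application Entity Name
    entity_instance: Optional[str] = None    # Application Entity Instance

    def matches(self, other: "ApplicationName") -> bool:
        """True if `other` is covered by this name; unset parts act as wildcards (assumed rule)."""
        pairs = [
            (self.process_name, other.process_name),
            (self.process_instance, other.process_instance),
            (self.entity_name, other.entity_name),
            (self.entity_instance, other.entity_instance),
        ]
        return all(mine is None or mine == theirs for mine, theirs in pairs)


# Labels borrowed from the figure: Gmail Server instance 1, HTTP AE instance 1.
gmail_http = ApplicationName("Gmail Server", "1", "HTTP", "1")
any_gmail = ApplicationName("Gmail Server")
```

With these definitions, `any_gmail.matches(gmail_http)` holds while the reverse does not, which is one way a flow request addressed to "any instance" could be resolved to a concrete one.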

Page 10: RINA detailed components overview and implementation discussion

Flows

• Instantiation of a communication service between applications
  – A flow is locally identified by an app through the use of a port-id
  – Flows transport well-defined units of application data (SDUs, Service Data Units)
• A flow has some externally visible properties:
  – Bandwidth related
    • Average bandwidth
    • Average SDU bandwidth
    • Peak BW duration
    • Peak SDU BW duration
  – Undetected Bit Error Rate
  – Partial Delivery of SDUs allowed?
  – In order delivery of SDUs required?
  – Maximum allowable gap between SDUs?
  – Maximum delay
  – Maximum jitter

Page 11: RINA detailed components overview and implementation discussion

The IPC API

• Presents the service provided by a DIF: a communication flow between applications, with certain quality attributes.
• 6 operations:
  • portId _allocateFlow(destAppName, List<QoSParams>)
  • void _write(portId, sdu)
  • sdu _read(portId)
  • void _deallocate(portId)
  • void _registerApp(appName, List<difName>)
  • void _unregisterApp(appName, List<difName>)
• QoSParams are defined in a technology-agnostic way
  • Bandwidth-related, delay, jitter, in-order-delivery, loss rates, …
• Aid to adoption: faux sockets API.
  • Presents the sockets API to the applications, but internally maps the calls to the IPC API
  • Current applications can be deployed in RINA networks untouched, but won’t enjoy all RINA features
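To make the six operations concrete, here is a toy, single-process stand-in for a DIF. It is a sketch under stated assumptions: `ToyDif` and its method names are hypothetical, QoS parameters are accepted but ignored, and `allocate_flow` returns both port-ids only so the example can play both ends (the API above returns just the local portId).

```python
from collections import deque
from itertools import count


class ToyDif:
    """In-memory stand-in for a DIF exposing the six IPC API operations."""

    def __init__(self, name):
        self.name = name
        self._registered = set()     # application names reachable in this DIF
        self._queues = {}            # port-id -> SDUs waiting to be read on that port
        self._peer = {}              # port-id -> port-id at the other end of the flow
        self._next_port = count(1)

    def register_app(self, app_name):
        self._registered.add(app_name)

    def unregister_app(self, app_name):
        self._registered.discard(app_name)

    def allocate_flow(self, dest_app_name, qos_params=None):
        """Allocate a flow by destination name; qos_params is ignored in this sketch."""
        if dest_app_name not in self._registered:
            raise LookupError(f"{dest_app_name!r} is not registered in DIF {self.name!r}")
        local, remote = next(self._next_port), next(self._next_port)
        self._queues[local], self._queues[remote] = deque(), deque()
        self._peer[local], self._peer[remote] = remote, local
        return local, remote

    def write(self, port_id, sdu):
        # Each SDU is queued for the peer as one unit, never split or merged.
        self._queues[self._peer[port_id]].append(bytes(sdu))

    def read(self, port_id):
        queue = self._queues[port_id]
        return queue.popleft() if queue else None

    def deallocate(self, port_id):
        remote = self._peer.pop(port_id)
        self._peer.pop(remote, None)
        self._queues.pop(port_id, None)
        self._queues.pop(remote, None)


dif = ToyDif("demo.DIF")
dif.register_app("echo-server")
client_port, server_port = dif.allocate_flow("echo-server", {"delay": 50})
dif.write(client_port, b"hello")
dif.write(client_port, b"world")
received = [dif.read(server_port), dif.read(server_port)]
dif.deallocate(client_port)
```

The point of the sketch is the shape of the interface: the caller names the destination application, never an address, and data moves in whole SDUs.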

Page 12: RINA detailed components overview and implementation discussion

IPC API Implementation
i2CAT/TSSG Prototype: General design

• Design goal: Portability to multiple Operating Systems (take advantage of Java)
• The RINA Library is part of the application, and provides both a Sockets and a Native RINA API (they can be part of the same library or be split into two packages).
• The IPC Manager is the point of entry to the “RINA stack” running on the computer. It hosts the IPC Processes, manages their lifecycle (creation, deletion) and acts as a broker between the RINA Library and the IPC Processes.
• Local TCP connections are the means of communication between Apps (running the RINA Library) and the IPC Manager.
• Use of blocking I/O: one thread per TCP connection

[Figure: a Java Application containing the RINA App Library (Sockets API and Native RINA API) is connected through local TCP connections to the IPC Manager, which hosts IPC Process 1 and IPC Process 2.]

Page 13: RINA detailed components overview and implementation discussion

IPC API Implementation
i2CAT/TSSG Prototype: behavior of a “client” RINA application

[Figure: sequence diagram between the RINA App Library and the IPC Manager, in three phases]

Flow Allocation
• RINA App Library: open a new socket
• RINA App Library: send CDAP M_CREATE message with a FlowService object
• IPC Manager: map App Name to DIF, find an IPC Process that is a member of the DIF, invoke allocateFlow
• IPC Manager: send CDAP M_CREATE_R message with the FlowService object

Data transfer
• RINA App Library: send delimited SDU (byte[]) to deliver data
• IPC Manager: cause the IPC Process to transfer the data over the flow
• IPC Manager: send delimited SDU (byte[]) to deliver data
• RINA App Library: keep data in buffer until read by the app, or notify app

Flow Deallocation
• RINA App Library: close socket
• IPC Manager: cause the IPC Process to deallocate the flow; close socket
• IPC Manager: received delete flow request; close socket
• RINA App Library: close socket (on response message or timer)

Page 14: RINA detailed components overview and implementation discussion

IPC API Implementation
i2CAT/TSSG Prototype: behavior of a “server” RINA application (I)

[Figure: sequence diagram between the RINA App Library and the IPC Manager]

App registration
• RINA App Library: open a new socket
• RINA App Library: send CDAP M_START message with an AppRegistration object
• RINA App Library: start a new Server Socket at port X and listen for incoming requests
• IPC Manager: update IDD table and related Flow Allocator directories. RegisterApp contains source app naming info, an optional list of DIF names and the socket number
• IPC Manager: send CDAP M_START_R message (success or not, reason)

Flow allocation (for each new incoming flow request)
• IPC Manager: incoming flow allocation request; if the destination app is registered, open a new socket to port X
• IPC Manager: send CDAP M_CREATE message with a FlowService object
• RINA App Library: start a new thread for the flow; decide whether to accept the connection
• RINA App Library: send CDAP M_CREATE_R message with the FlowService object
• IPC Manager: cause the IPC Process to send the allocate response back

Page 15: RINA detailed components overview and implementation discussion

IPC API Implementation
i2CAT/TSSG Prototype: behavior of a “server” RINA application (II)

[Figure: sequence diagram between the RINA App Library and the IPC Manager]

Data transfer
• RINA App Library: delimited SDU to write data (byte[])
• IPC Manager: cause the IPC Process to transfer the data over the flow
• IPC Manager: delimited SDU to read data (byte[])
• RINA App Library: keep data in buffer until read by the app, or notify app

Flow Deallocation
• IPC Manager: incoming flow deallocation request; socket closed
• RINA App Library: socket close; close socket, stop thread
• IPC Manager: cause the IPC Process to send the deallocate response back

App unregistration
• RINA App Library: close socket
• IPC Manager: update IDD table and related IPC Process directories; close socket
• RINA App Library: on timer or directly, close socket and close server socket
• IPC Manager: on timer, if not already closed, close socket
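Because the library and the IPC Manager talk over local TCP connections, which carry a byte stream, each SDU has to be delimited before it is written. The slides do not say which framing the prototype uses; the sketch below assumes a 4-byte big-endian length prefix, and `delimit` and `Undelimiter` are hypothetical names.

```python
import struct

_HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian length prefix


def delimit(sdu: bytes) -> bytes:
    """Prefix one SDU with its length so it can be recovered from a byte stream."""
    return _HEADER.pack(len(sdu)) + sdu


class Undelimiter:
    """Rebuilds whole SDUs from arbitrary chunks read off a TCP connection."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every SDU that is now complete."""
        self._buffer += chunk
        sdus = []
        while len(self._buffer) >= _HEADER.size:
            (length,) = _HEADER.unpack_from(self._buffer)
            end = _HEADER.size + length
            if len(self._buffer) < end:
                break  # wait for the rest of this SDU
            sdus.append(bytes(self._buffer[_HEADER.size:end]))
            del self._buffer[:end]
        return sdus


stream = delimit(b"first SDU") + delimit(b"") + delimit(b"second")
rx = Undelimiter()
# Deliver the stream in awkward 5-byte pieces, as TCP is free to do.
out = [sdu for i in range(0, len(stream), 5) for sdu in rx.feed(stream[i:i + 5])]
```

The receiver hands the application the same SDU boundaries the sender wrote, regardless of how the stream was chunked in between.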

Page 16: RINA detailed components overview and implementation discussion

DISTRIBUTED APPLICATIONS: Common Application Establishment Phase and CDAP

Page 17: RINA detailed components overview and implementation discussion

CACEP: Common Application Connection Establishment Phase

Once application processes have a communication flow between them, they have to set up an application connection before being able to exchange any further information.

The application connection allows the two communicating apps to:
• Exchange naming information with their apposite, optionally authenticating it
• Agree on an application protocol and/or syntax version for the application data exchange phase

[Figure: four steps between Appl. Process A and Appl. Process B over a flow provided by the DIF]
1) Message 1: M_CONNECT (srcName, destName, credentials, proto, syntax version)
2) Messages 2 to N: optional messages exchanging authentication information
3) Message N+1: M_CONNECT_R (result, reason, options)
4) Application data transfer phase: the processes exchange data using an application protocol
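The responding side of this exchange can be sketched as follows. Messages are plain dictionaries, the numeric result codes are invented for the example, and the optional authentication messages are collapsed into a single pluggable policy callable; none of these details come from the CDAP specification.

```python
class Responder:
    """Responding side of CACEP; messages are plain dicts for illustration."""

    def __init__(self, name, supported, authenticate=None):
        self.name = name
        self.supported = set(supported)     # accepted (proto, syntax version) pairs
        # Pluggable authentication policy; the default accepts everyone.
        self.authenticate = authenticate or (lambda src, credentials: True)
        self.connected_to = set()

    def on_connect(self, msg: dict) -> dict:
        """Check an M_CONNECT and build the M_CONNECT_R (result codes are hypothetical)."""
        reply = {"opCode": "M_CONNECT_R", "result": 0, "reason": "", "options": {}}
        if msg["destName"] != self.name:
            reply.update(result=1, reason="unknown destination application")
        elif (msg["proto"], msg["syntaxVersion"]) not in self.supported:
            reply.update(result=2, reason="unsupported protocol or syntax version")
        elif not self.authenticate(msg["srcName"], msg["credentials"]):
            reply.update(result=3, reason="authentication failed")
        else:
            self.connected_to.add(msg["srcName"])
        return reply


def password_policy(src, credentials):
    """Toy policy: accept a fixed shared password."""
    return credentials == {"password": "s3cret"}


b = Responder("B", {("CDAP", 1)}, authenticate=password_policy)
connect = {"opCode": "M_CONNECT", "srcName": "A", "destName": "B",
           "credentials": {"password": "s3cret"}, "proto": "CDAP", "syntaxVersion": 1}
ok = b.on_connect(connect)
bad = b.on_connect({**connect, "credentials": {"password": "guess"}})
```

Swapping `password_policy` for another callable changes the authentication behaviour without touching the connection logic, which is the sense in which policies are "plugged in".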

Page 18: RINA detailed components overview and implementation discussion

CDAP

• The Common Distributed Application Protocol (CDAP) is the application protocol used by IPC Processes to exchange shared state (IPC Processes are Application Processes)
• It is also recommended that all RINA applications use it for exchanging shared state (when anything but an amorphous flow of bytes is needed); legacy apps can use whatever they want
• The CDAP Specification defines the complete set of operations and messages, as well as their fields
  – Connection establishment (Connect, Disconnect, authentication)
  – Object operations: create, delete, read, write, start, stop
• The set of objects and the meaning of operations are not dictated by CDAP proper; that is an application concern
  – IPC Processes are applications, and manipulate a set of objects, but none of them are dictated by CDAP
• Messages can be encoded in any agreed-upon way
  – As long as the applications agree, e.g., via CACEP exchange
  – We currently use GPB, have experimented with JSON

Page 19: RINA detailed components overview and implementation discussion

CDAP operates on objects

• All objects CDAP operates on have the following attributes:
  – ObjectClass
    • Class (data type and representation) of an object
  – ObjectName
    • An identifier of an object, unique within the objects of the same class
  – ObjectInstance
    • An alias of objectClass + objectName, uniquely identifies an object
  – ObjectValue
    • The actual value of the object, encoded as desired by the application
• Two modifiers, scope and filter, can be applied to any CDAP operation; they enable a single message to affect multiple objects that form a hierarchy:
  – Scope: An integer indicating how many levels below the selected object the operation has to be applied.
  – Filter: A predicate function that evaluates whether the operation should be applied to each individual object within the scope.
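Scope and filter amount to a bounded tree walk with a predicate. The sketch below assumes that scope 0 selects only the addressed object and that the filter is also evaluated on that object; `RibObject`, `select` and the object names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterator, List, Optional


@dataclass
class RibObject:
    object_class: str
    object_name: str
    value: Any = None
    children: List["RibObject"] = field(default_factory=list)


def select(root: RibObject, scope: int = 0,
           filter_fn: Optional[Callable[[RibObject], bool]] = None) -> Iterator[RibObject]:
    """Yield the objects an operation applies to.

    scope=0 selects only `root`; scope=N also descends N levels below it.
    filter_fn, if given, is evaluated on every object within the scope.
    """
    if filter_fn is None or filter_fn(root):
        yield root
    if scope > 0:
        for child in root.children:
            yield from select(child, scope - 1, filter_fn)


tree = RibObject("flow set", "/flows", children=[
    RibObject("flow", "/flows/1", {"qosId": 1}, [RibObject("connection", "/flows/1/conn")]),
    RibObject("flow", "/flows/2", {"qosId": 2}),
])
one_level = [o.object_name for o in select(tree, scope=1)]
only_flows = [o.object_name
              for o in select(tree, scope=2, filter_fn=lambda o: o.object_class == "flow")]
```

A read with `scope=2` and the class filter touches both flow objects in one request, skipping the container and the connection object.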

Page 20: RINA detailed components overview and implementation discussion

CDAP, AEs and the OIB/RIB

• All the objects an Application Process knows about are locally “stored” in the Object Information Base/Resource Information Base.
  – The RIB may be an actual database, or just a logical representation of all the information known by an application process.
• In RINA there’s only a single application protocol: CDAP. Then why are there different AEs?
  – Each AE is able to operate on a subset of the RIB

[Figure: Application Processes 1, 2 and 3, each with its own OIB/RIB. AE instances of type 1 and of type 2 in neighbouring processes are paired, and each pair communicates using CDAP.]

Page 21: RINA detailed components overview and implementation discussion

CDAP Implementation

• CDAP messages comprise a sequence of 1 or more fields
  – The one always-present field is the message type
• Each field has an identifying name or numeric tag (depending on encoding), and a value
• The field names or tag values (for GPB encoding), value types, and presence/absence of particular fields in CDAP messages of each type are defined in the CDAP Specification
• One supported type for an object value is an embedded message, not understood by CDAP itself, but transported unchanged to the apposite
  – The message declarations for IPC Process object values are not part of CDAP, but part of the IPC Process Object Dictionary definition. Other applications define their own object types

Page 22: RINA detailed components overview and implementation discussion

CDAP Implementation (cont.)

• For Google Protocol Buffers (GPB) syntax, a freely-available compiler can produce code in several languages to construct a valid CDAP message and to access the fields of one: https://developers.google.com/protocol-buffers/
  – .proto files describe the field values and types
  – GPB is being used because of its simplicity, compact representation, support (Google uses it heavily), a freely-available high-quality tool, a simple definition language, and general acceptance by the developer community
  – XML, ASN.1, JSON, or other representations would also work as a concrete syntax to encode CDAP messages
• i2CAT’s implementation uses Java code produced by the Google protoc GPB compiler
• TRIA’s implementation uses a table-driven parser/generator that also accepts/generates JSON

Page 23: RINA detailed components overview and implementation discussion

Example Message Definition

    message qosCube_t {  // a QoS cube specification
        required uint32 qosId = 1;                     // identifies the QoS cube
        optional string name = 2;                      // a human-readable name for the QoS cube
        optional uint64 averageBandwidth = 3;          // in bytes/s, a value of 0 indicates 'don't care'
        optional uint64 averageSDUBandwidth = 4;       // in bytes/s, a value of 0 indicates 'don't care'
        optional uint32 peakBandwidthDuration = 5;     // in ms, a value of 0 indicates 'don't care'
        optional uint32 peakSDUBandwidthDuration = 6;  // in ms, a value of 0 indicates 'don't care'
        optional double undetectedBitErrorRate = 7;    // a value of 0 indicates 'don't care'
        optional bool partialDelivery = 8;             // indicates if partial delivery of SDUs is allowed or not
        optional bool order = 9;                       // indicates if SDUs have to be delivered in order
        optional int32 maxAllowableGapSdu = 10;        // indicates the maximum gap allowed in SDUs; a gap of N SDUs
                                                       // is considered the same as all SDUs delivered.
                                                       // A value of -1 indicates 'Any'
        optional uint32 delay = 11;                    // in milliseconds, indicates the maximum delay allowed in
                                                       // this flow. A value of 0 indicates 'don't care'
        optional uint32 jitter = 12;                   // in milliseconds, indicates the maximum jitter allowed in
                                                       // this flow. A value of 0 indicates 'don't care'
    }
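The 'don't care' convention suggests how a flow request might be matched against the QoS cubes a DIF offers. The following is a sketch under assumptions: it mirrors only a few qosCube_t fields, and the matching rule (a cube that leaves a bound at 0 gives no guarantee, so it fails any request that sets that bound) is a guess at reasonable semantics rather than a rule taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class QosCube:
    """Python mirror of a few qosCube_t fields; 0 means "don't care", as in the .proto."""
    qos_id: int
    name: str = ""
    average_bandwidth: int = 0   # bytes/s
    delay: int = 0               # ms, maximum delay
    jitter: int = 0              # ms, maximum jitter
    order: bool = False          # in-order delivery of SDUs


def satisfies(cube: QosCube, wanted: QosCube) -> bool:
    """True if `cube` meets every requirement that `wanted` actually cares about."""
    if wanted.average_bandwidth and cube.average_bandwidth < wanted.average_bandwidth:
        return False
    for bound in ("delay", "jitter"):
        limit, offered = getattr(wanted, bound), getattr(cube, bound)
        # A cube with no stated bound (0) cannot satisfy a request that sets one.
        if limit and (offered == 0 or offered > limit):
            return False
    if wanted.order and not cube.order:
        return False
    return True


def pick_cube(cubes, wanted):
    """Return the first cube that satisfies the request, or None."""
    return next((c for c in cubes if satisfies(c, wanted)), None)


cubes = [
    QosCube(1, "best effort"),
    QosCube(2, "reliable, low delay", average_bandwidth=1_000_000, delay=20, jitter=5, order=True),
]
chosen = pick_cube(cubes, QosCube(0, delay=50, order=True))
```

Here a request for at most 50 ms of delay with in-order delivery skips the best-effort cube and lands on cube 2, while a request with every field left at its default would take the first cube.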

Page 24: RINA detailed components overview and implementation discussion

DISTRIBUTED APPLICATIONS: SDU Protection

Page 25: RINA detailed components overview and implementation discussion

SDU Protection

• Applications may have different levels of trust in the communication mediums they use
  – Need for a way to protect the SDUs they send through the flows
• SDU Protection module protects outgoing SDUs and unprotects incoming SDUs
• Can perform the following functions (configurable through policies)
  • Encryption (Integrity and confidentiality)
  • Compression
  • Error detection (CRCs, FECs)
  • Time To Live

[Figure: App A and App B exchange SDUs over a flow through the medium that enables applications to communicate. Inside App A, the SDU Protection module sits between the application and the flow: unprotected SDUs on the application side, protected SDUs on the flow side, for both outbound and inbound SDUs.]
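A minimal sketch of such a module, covering only two of the listed functions (compression and CRC-based error detection) with Python's standard `zlib`. Encryption and Time To Live are deliberately left out, and the class name and wire layout (body followed by a 4-byte CRC-32 trailer) are assumptions for illustration.

```python
import struct
import zlib

_TRAILER = struct.Struct("!I")  # CRC-32 appended to each protected SDU (assumed layout)


class SduProtection:
    """Compression and CRC error detection; encryption and TTL are left out of this sketch."""

    def __init__(self, compress: bool = True):
        self.compress = compress  # policy knob: both ends must agree on it

    def protect(self, sdu: bytes) -> bytes:
        """Applied to outbound SDUs before they are written to the flow."""
        body = zlib.compress(sdu) if self.compress else sdu
        return body + _TRAILER.pack(zlib.crc32(body))

    def unprotect(self, protected: bytes) -> bytes:
        """Applied to inbound SDUs; raises ValueError if the CRC does not match."""
        body, trailer = protected[:-_TRAILER.size], protected[-_TRAILER.size:]
        (expected,) = _TRAILER.unpack(trailer)
        if zlib.crc32(body) != expected:
            raise ValueError("CRC mismatch: SDU corrupted in transit")
        return zlib.decompress(body) if self.compress else body


module = SduProtection()
wire = module.protect(b"payload " * 8)
restored = module.unprotect(wire)
corrupted = bytes([wire[0] ^ 0xFF]) + wire[1:]   # flip the bits of the first byte
```

Calling `module.unprotect(corrupted)` raises `ValueError`, so a damaged SDU is dropped instead of being handed to the application.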

Page 26: RINA detailed components overview and implementation discussion

IPC PROCESS: Block diagram, architecture reference vs. implementation

Page 27: RINA detailed components overview and implementation discussion

Levels of Abstraction (Abstraction is Invariance)

In decreasing level of abstraction, from invariant to variable:
• Reference Model
• Service Definitions
• Protocols
• Procedures
• Policies
• Implementation

Page 28: RINA detailed components overview and implementation discussion

IPC Process

• The IPC Process is an entity that provides IPC services for applications running on the same system
  – It is an application, and uses RINA application operations to do everything it does
  – It may or may not be an “OS Process”
  – There is no set model for how to implement it, and there can be very different implementations, based on OS, scale, and many other concerns
  – In some implementations, it will become part of the OS, just as IP networking is now
  – In some implementations, it will operate as “middleware”, atop the OS and its normal networking layer
• All IPC Processes do similar things. WHAT they do is described in the Reference Architecture, but there are many feasible Implementation Architectures for HOW those functions get done. We’ll examine a few today

Page 29: RINA detailed components overview and implementation discussion

Block Diagram (Reference Arch.)

• You’ve seen the RINA Reference Architecture (RA) partitioning of the IPC Process
  – This describes the basic mechanisms and what they communicate among themselves to perform the total functionality ascribed to an IPC Process
• Implementers use the RA as a guide to create an Implementation Architecture
  – Driven by their particular requirements, implementation target, and end-use
    • Language, use of OS features, flow of control approach, etc., can all be different, but they all need to implement the RA
  – Modules may be different, but ALL RA functions will be present in a complete implementation, and will communicate with the same functions as they do in the RA
  – There can also be multiple different implementations of the same Implementation Architecture (e.g., ports to different OSs)

Page 30: RINA detailed components overview and implementation discussion

Block Diagram (Reference Arch.)

[Figure: three systems (Host, Router, Host), each containing IPC Processes and a Mgmt Agent, with Appl. Processes on the hosts; IPC Processes in different systems form DIFs. Applications use an IPC Process through the IPC API. Inside the IPC Process, functions are arranged by increasing timescale (functions performed less often) and complexity:
• Data Transfer: SDU Delimiting, Data Transfer, Relaying and Multiplexing, SDU Protection
• Data Transfer Control: Transmission Control, Retransmission Control, Flow Control
• Layer Management: RIB Daemon, RIB, CDAP Parser/Generator, CACEP, Enrollment, Flow Allocation, Resource Allocation, Forwarding Table Generator (Routing), Authentication
Other labels in the figure: State Vector, IPC Resource Mgt., Inter DIF Directory, Multiplexing, IPC Mgt. Tasks, Other Mgt. Tasks, Application Specific Tasks.]

Page 31: RINA detailed components overview and implementation discussion

TRIA IMPLEMENTATION

Page 32: RINA detailed components overview and implementation discussion

Overall Goals and Approach

• Provide a framework to test and debug the new protocols
  – Use a single-threaded state machine model to simplify locking and increase repeatability
  – Operate entirely at user (application) level for easier debugging
• Anticipate the desire to move some portions (which ones were as yet unknown) into the OS kernel eventually
  – Coded in C
  – Memory/buffering/time-management operations similar to those available inside the UNIX/Linux OS
• Anticipate future porting to multiple targets
  – Use standard POSIX/UNIX capabilities common on all or most platforms, avoid extensions that impair portability
  – Test on MacOS (Mach-based UNIX) and Linux
  – Test on large and small systems (Intel and ARM-based)
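The single-threaded model can be illustrated with a small readiness loop. The TRIA code is written in C; this sketch uses Python's standard `selectors` module only to show the control flow (wait for readiness, run one handler to completion, repeat), and the function and handler names are hypothetical.

```python
import selectors
import socket


def run_event_loop(selector):
    """Single-threaded loop: wait for readiness, then run each handler to completion."""
    while selector.get_map():            # stop once no file objects remain registered
        for key, _mask in selector.select(timeout=1.0):
            key.data(key.fileobj)        # the handler registered with this file object


received = []


def on_readable(sock):
    """Handler for one socket; no locking needed because only one handler runs at a time."""
    data = sock.recv(4096)
    if data:
        received.append(data)
    else:                                # peer closed: stop watching this socket
        selector.unregister(sock)
        sock.close()


selector = selectors.DefaultSelector()
left, right = socket.socketpair()
right.setblocking(False)
selector.register(right, selectors.EVENT_READ, on_readable)

left.sendall(b"event 1")
left.close()
run_event_loop(selector)
selector.close()
```

Since handlers never run concurrently, shared state such as `received` needs no locks, and replaying the same input produces the same sequence of handler calls.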

Page 33: RINA detailed components overview and implementation discussion

Major Parts of the Implementation

• Infrastructure
  – Main program, select (event) loop, state machine framework, file management, non-blocking I/O, delimiting, pseudo-files (internal IPC, Shim DIF), memory and message pools, timers, startup/shutdown, configuration parsing, logging and debug utilities, GPB and JSON utilities
• CDAP
  – Table-driven CDAP msg. parse/build, connection state machine
• RIB
  – Node allocation, lookup, RIB Daemon operations on nodes
  – Object Manager mechanism for operations on objects
• IPC Process
  – Per-DIF management (RIB Daemon, enrollment, startup), FA, FAI, DTP/DTCP, Network Management client interface, Shim DIF, routing, IPC Process-specific Object Managers
• RINA native API Library
• Tests, including RINABAND

Page 34: RINA detailed components overview and implementation discussion

High-Level Block Diagram

[Figure: UNIX/Linux processes and their connections. Labels in the figure: User Application, NetMgr App. and IDD Application, each using the RINA API; IPCMGR Process containing the Flow Allocator, Flow Al. Instances, EFCP Instances, RMT, Per-DIF Manager, Routing Computation, RIB DataBase, NetMgr Agent/Directory Server, Logger, Authentication Database and SHIM DIF; (N-1) DIF flows reached through a device driver file (I/O device) or an (N-1) FAI socket (RINA DIF).]

Page 35: RINA detailed components overview and implementation discussion

I2CAT/TSSG IMPLEMENTATION

Page 36: RINA detailed components overview and implementation discussion

Overall Goals and Approach

• Provide an open source initial RINA implementation that can be used for education and quick prototyping, as well as to exercise and improve the RINA specs.
  – Easy to develop, OS-independent language: Java
  – Code structured to be modular and extensible: use OSGi as a component framework (Eclipse Virgo implementation)
  – Portable to different operating systems: only use the OS-dependent Java features that are available in most OSs (sockets)
• Make it possible to set up relatively complex scenarios with few hardware resources
  – Use the TINOS protocol experimentation framework, developed by TSSG, to emulate multiple “hardware” nodes within the same Java process.
• i2CAT/TSSG’s RINA implementation is part of the TINOS project, as one of the “protocol stacks” available.
  – Reuse of TINOS compile/build infrastructure
  – Maximize synergies between both projects: single development community (hosted at github)
  – Integration with TINOS will be easier (not done yet)

Page 37: RINA detailed components overview and implementation discussion

Major Parts of the Implementation

• Infrastructure
  – Virgo OSGi core (handles the lifecycle of the different components, called bundles in OSGi parlance), single thread pool, blocking I/O, configuration parsing (JSON library), sockets, Google Guava library (Java has no unsigned types, thanks!), Google Java GPB implementation, Java timers, delimiting, object encoding/decoding
• IPC Manager
  – RINA side of the IPC API, IPC Process Lifecycle Management, will host management agent and IDD (not implemented yet), console service (local administration)
• “Normal IPC Process”
  – RIB, RIB Daemon, CDAP Parser/generator, Enrollment task, Flow Allocator, Resource Allocator, EFCP, RMT, SDU Protection
• Shim IPC Process for IP Layers
  – Setup and management of TCP and UDP flows as per the shim DIF spec
• RINA Application Library
  – Native RINA API and faux sockets API
• Test applications
  – RINABand, Echo server & client, simple chat application

Page 38: RINA detailed components overview and implementation discussion

General design (I)

[Figure: two kinds of OS process (Java VM instantiations).
• Application processes: Client Application 1 with the RINA Lib opens, for each flow, a local TCP connection to port 32771. Server Application 1 with the RINA Lib opens a local TCP connection to port 32771 for the registration and listens for local TCP connections at port X (dynamically assigned); for each flow to server application 1, a local TCP connection is opened to port X.
• RINA process, running on the Virgo OSGi Kernel: the IPC Manager (Console Service for local administration, listening for local TCP connections at port 32766; Application Service, listening for local TCP connections at port 32771; IPC Process Lifecycle Management (“IRM”); IDD), the Normal IPC Process components (IPC Service, Delimiter, RIB Daemon, RMT, Encoder, CDAP Session Manager, Flow Allocator, Enrollment Task, GPB parser, Resource Allocator, EFCP, SDU Protection) and the Shim IPC Process for IP (IPC Service, Flow Allocator), which listens for TCP connections and UDP datagrams at IPa:portb and carries flows to/from other shim IPC Processes.
NOTE: there could be multiple “systems” within the same Java VM once fully integrated with TINOS.]

Page 39: RINA detailed components overview and implementation discussion

General design (II)

Page 40: RINA detailed components overview and implementation discussion

Why TINOS?
Larger experimentation scenarios with less infrastructure

• With TINOS, multiple nodes can be created within the same JVM, with different network connectivity to each other and to other JVMs (TINOS uses an adapted IP stack from JNode, and XMPP, for this)

[Figure: several Java Virtual Machines, each hosting one or more nodes. Each node stacks a DIF on Shim DIFs over IP (JNode or the OS stack) over Data Links. The JVMs are interconnected through a LAN, the XMPP network and the public Internet.]

Page 41: RINA detailed components overview and implementation discussion

ALTERNATIVE IMPLEMENTATIONS: Some Implementation Architectures with Interesting Properties

Page 42: RINA detailed components overview and implementation discussion

RINA in the OS Kernel

• Make RINA a “native” networking API
  – New/Extended OS system calls provide full RINA capability
  – Move (at least) DTP/DTCP into the OS kernel for speed

[Figure: in Application Space, the Apps and the IPC Process use New/Extended OS API Calls. In the OS Kernel sit the DTP/DTCP Flow State, the RMT and the Forwarding Table, above Network Device 1 and Network Device 2. A “Network Device” might be a Shim DIF or a RINA DIF.]

Page 43: RINA detailed components overview and implementation discussion

RINA Split Between H/W and S/W

• RINA RMT/DTP performed in hardware
  – Software still does DTCP and the remainder of the IPC Process functions
  – Transiting PDUs need not be processed by software

[Figure: three levels: Application Space, OS Kernel and Hardware/Firmware. The Apps and the IPC Process use New/Extended OS API Calls; DTCP stays in software; the DTP Flow State, the RMT and the Forwarding Table sit in Hardware/Firmware with Network Interface 1 and Network Interface 2.]

Page 44: RINA detailed components overview and implementation discussion

IPC PROCESS (CONTINUED)
RIB and RIB operations

Page 45: RINA detailed components overview and implementation discussion

RIB and RIB Operations

• The Resource Information Base (RIB) is a virtual object database
– Each AE projects a view over the underlying objects
– The RIB holds the shared state of the communication instances between applications

• The IPC Processes communicate by exchanging operations on RIB objects
– The only operations are: create, delete, read, write, start, and stop
– These operations cause objects to perform appropriate actions (defined in an object dictionary)
– There is a particular tree of RIB objects defined for IPC Process use (any other application can define its own tree)

Page 46: RINA detailed components overview and implementation discussion

A Few Thoughts on the RIB Daemon

• A generalization of Event Management and Routing Update
– Elsewhere (circa 1988) I said Event Management is the hypothalamus of network management and looks like this:

[Figure: event management block diagram. Labels: Rcv Events, Logging, Subscription Service, Subscript Def File, Add/Delete Subscription, Filter control, To Other Management Applications.]

Page 47: RINA detailed components overview and implementation discussion

A Few Thoughts on the RIB Daemon

• Generalizing routing update adds a capability for managing a periodic and/or event-driven data distribution and replication strategy.

[Figure: the same block diagram, extended. Labels: Rcv Events, Logging, Subscription Service, Subscript Def File, Add/Delete Subscription, Filter control, To Other Management Applications, Replication Optimizer, Write Subscriptions. Note on the figure: does this imply an opportunity for a journaling RIB for some data?]

Page 48: RINA detailed components overview and implementation discussion

A Few Thoughts on the RIB Daemon

• So, re-arranging and re-labeling for our current problem:

[Figure: RIB Daemon block diagram. Labels: Incoming CDAP PDUs, CDAP Processing, Logging, Subscription Service, Subscript Def File, Add/Delete Subscriptions From Tasks, To Requesting Tasks, Event Subscriptions, Write Subscriptions, Replication Optimizer. Notes on the figure: an opportunity for a journaling RIB for some data? Reads and writes go to an actual store or to other tasks or task data structures, e.g. the DT-state vector.]
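The subscription behaviour in the diagram can be sketched in a few lines. This is only an illustration, not the prototype's code: the class `RIBDaemon`, its method names, and the dictionary shape used for a PDU are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class RIBDaemon:
    """Toy subscription service: logs each incoming CDAP PDU and hands it
    to every task subscribed to its (opcode, object class) pair."""

    def __init__(self):
        self.log = []                           # every PDU seen, in arrival order
        self.subscriptions = defaultdict(list)  # (opcode, obj_class) -> callbacks

    def subscribe(self, opcode, obj_class, callback):
        self.subscriptions[(opcode, obj_class)].append(callback)

    def unsubscribe(self, opcode, obj_class, callback):
        self.subscriptions[(opcode, obj_class)].remove(callback)

    def on_cdap_pdu(self, pdu):
        """pdu is a dict with at least 'opcode' and 'obj_class' (assumed shape)."""
        self.log.append(pdu)
        key = (pdu["opcode"], pdu["obj_class"])
        # Copy the list so a callback may unsubscribe while we iterate.
        for callback in list(self.subscriptions[key]):
            callback(pdu)


daemon = RIBDaemon()
seen_by_flow_allocator = []
# The Flow Allocator subscribes to flow-creation requests.
daemon.subscribe("M_CREATE", "flow", seen_by_flow_allocator.append)

daemon.on_cdap_pdu({"opcode": "M_CREATE", "obj_class": "flow"})
daemon.on_cdap_pdu({"opcode": "M_READ", "obj_class": "neighbor"})
```

Both PDUs end up in the log, but only the first is delivered to the subscriber. Replication and write subscriptions from the diagram are left out to keep the sketch short.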

Page 49: RINA detailed components overview and implementation discussion

RIB Implementation

• Our protocol exchanges refer to objects by name and/or object-id (a number)
– We haven’t started using object-ids yet, but the intent was to make the protocol exchanges more compact
– We will standardize, through PSOC, the object names/ids that need to be the same for consistent RINA implementations

• The RIB appears as a tree-structured database with objects at its leaf nodes. Leaves are named with the full absolute pathname from the root to the leaf.

• We operate on an object by sending the operation and the operand object’s name/id (and a value, if appropriate)
– The reference model has a “RIB Daemon” that performs the operation; in practice, this may be subsumed into other entities
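A minimal sketch of this model follows, assuming a flat dictionary keyed by absolute path name stands in for the tree. The classes `RIB` and `RIBObject` and the `apply` method are hypothetical names for illustration; the slides do not describe the prototype's actual classes.

```python
class RIBObject:
    """A leaf object; 'started' models the effect of start/stop."""

    def __init__(self, value=None):
        self.value = value
        self.started = False


class RIB:
    OPERATIONS = ("create", "delete", "read", "write", "start", "stop")

    def __init__(self):
        self.objects = {}  # absolute path name -> RIBObject

    def apply(self, operation, name, value=None):
        """Perform one of the six operations on the object called 'name'."""
        if operation not in self.OPERATIONS:
            raise ValueError(f"unknown operation: {operation}")
        if operation == "create":
            if name in self.objects:
                raise KeyError(f"object already exists: {name}")
            self.objects[name] = RIBObject(value)
            return None
        obj = self.objects[name]  # KeyError if the object does not exist
        if operation == "delete":
            del self.objects[name]
        elif operation == "read":
            return obj.value
        elif operation == "write":
            obj.value = value
        elif operation == "start":
            obj.started = True
        elif operation == "stop":
            obj.started = False
        return None


rib = RIB()
rib.apply("create", "/daf/management/naming/address", 25)
rib.apply("write", "/daf/management/naming/address", 26)
address = rib.apply("read", "/daf/management/naming/address")
```

Keying on the full path keeps lookups trivial; a real implementation would probably also index by object-id once those are in use.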

Page 50: RINA detailed components overview and implementation discussion

Naming conventions for IPC Processes

• Application names:
– Can be anything; it would probably be useful to include some indication of physical location (to facilitate management, for no other purpose).

• Application instances:
– Not used in principle, since in normal operation there should be no need to connect to a concrete instance of an application process (default to 1).

• Two Application Entities:
– Management AE: flows established to/from here are used to establish application connections to neighboring IPC Processes and to exchange layer management information using CDAP.
– Data Transfer AE: flows established to/from here are used by the RMT to transport “data transfer SDUs”.

Page 51: RINA detailed components overview and implementation discussion

Current tree of objects

/
/daf
/daf/management
/daf/management/operationalstatus
/daf/management/naming
/daf/management/naming/address
/daf/management/naming/applicationprocessname
/daf/management/naming/whatevercastnames
/daf/management/neighbors
/dif
/dif/ipc
/dif/ipc/datatransfer
/dif/ipc/datatransfer/constants
/dif/management
/dif/management/flowallocator
/dif/management/flowallocator/qoscubes
/dif/management/flowallocator/directoryforwardingtableentries
/dif/resourceallocation
/dif/resourceallocation/flowallocator
/dif/resourceallocation/flowallocator/flows
/dif/resourceallocation/nminus1flowmanager
/dif/resourceallocation/nminus1flowmanager/nminus1flows
/dif/resourceallocation/pduforwardingtable

Page 52: RINA detailed components overview and implementation discussion

IPC PROCESS
Enrollment

Page 53: RINA detailed components overview and implementation discussion

Enrollment

• Enrollment is the process by which an IPC Process communicates with another IPC Process to join a DIF
– And acquires enough information to start operating as a member of the DIF
– After enrollment, the newly enrolled IPC Process is able to create and accept flows between it and other IPC Processes in the DIF

• Enrollment on the Internet
– For TCP/IP it is mostly nonexistent or done by ad-hoc/manual means (DHCP provides a bit of the required functionality)
– In IEEE 802.11 the procedure for joining a network is almost identical to what RINA predicts. The BSSID is a DIF-name.
– Similarly, there is enrollment in 802.1q (VLANs).
– These were done independently, which is a confirmation of the theory.

Page 54: RINA detailed components overview and implementation discussion

Start at the Beginning
Joining a DIF

• A wants to join DIF beta, of which B is a member. First it needs to establish communication with beta. So A’s DIF Management task, using DIF A’s IPC Manager (not shown), does an allocate(beta, as good QoS as it can get). The name beta is a whatevercast name for the set containing the addresses of all members of beta; its rule returns the address of an IPC Process with a common (N-1)-DIF. The whatevercast name is resolved by the (N-1)-DIF.

• The Allocate creates a flow between A and B. They exchange CDAP connect requests, followed by whatever authentication is required to establish an application connection between A and B. (Actually between A and beta: B is acting as an agent or representative for beta.)

• Then A and B exchange initialization information. Primarily, B is telling A what its DIF-internal name (address) is and populating A’s RIB with the current information on the DIF. We will come back to this.

[Figure: IPC Process A, which wants to join the DIF of which B is a member, and IPC Process B, a member of DIF beta, each with a DIF Management task, connected over a common (N-1)-DIF. Arrows: establish connection and authenticate; initialization information.]

Page 55: RINA detailed components overview and implementation discussion

A is now a member of beta

• There is now an application connection between the IPC management components of A and B.
– All connections between members of a DAF are managed by their IPC management component.
– Any management component can send on the flows managed by IPC.
– All incoming PDUs are delivered to the RIB Daemon.
– The RIB Daemon is a subscription service, essentially a generalization of both routing update and event management. When any CDAP PDU arrives, it is logged and distributed to the tasks that have subscribed to be notified.
– The Flow Allocator subscribes to Create/Delete Flow Req. (The Flow Allocator will update the RIB after processing the request.)

[Figure: IPC Processes A and B, each with an IPC Management component and a RIB Daemon, joined by an application connection over the (N-1)-DIF.]

Page 56: RINA detailed components overview and implementation discussion

Enrollment Exchange

• There are several enrollment situations that IPC Processes encounter when connecting, for example:
– An IPC Process that is not enrolled connects to an IPC Process that is not enrolled in a DIF: the two form a DIF
– An IPC Process that is not enrolled connects to an IPC Process that is already enrolled in a DIF: it joins the DIF
– An IPC Process that is enrolled makes a connection to a neighbor that is enrolled: they now have a new route for flows

• An IPC Process can be in either role, as initiator or target

• The information exchanged in some cases can be reduced to minimize enrollment time

Page 57: RINA detailed components overview and implementation discussion

Enrollment Procedure I

• When the New Member receives the M_Connect Response, it copies Current_Address to Saved_Address and sends
– M_Start Enrollment(address, Address_expiration_time, other data about New Member)

• /* The New Member is telling the Existing Member what it knows. Primarily this is derived from the address (NULL or not), and the expiration life-time of the address if non-NULL. Since addresses are generally assigned for hours or minutes, tight time synchronization is not required. (Even for DIFs with fast turnover, fairly long assignment times are still prudent.) */

• The Existing Member sends
– M_Start_R Enrollment(address (potentially different), Application Process Name, Current_Address, Address_Expiration).

Page 58: RINA detailed components overview and implementation discussion

Enrollment Procedure II

• Using the information provided by the New Member, the Existing Member sends
– M_Create (zero or more) to initialize the static and near-static information required.

• When it has finished, and the New Member has sent all the necessary M_Create_Rs, the Existing Member sends a
– M_Stop Enrollment (Immediate: Boolean)

• The New Member may Read any additional information not provided by the Existing Member.
– M_Read (zero or more)
– M_Stop_R Enrollment

• If the Immediate Boolean is True, the New Member is free to transition to the Operational state.

• If the Immediate Boolean is False, the New Member cannot transition to the Operational state until an M_Start Operation is received.

Page 59: RINA detailed components overview and implementation discussion

Enrollment Procedure III

• The New Member is free to Read any information not provided by the Existing Member. Once these Reads are completed, the Existing Member sends:
– M_Start Operation

• The New Member sends
– M_Start_R Operation

• Invoke RIB Update of dynamic information, which will cause others to send data to the New Member.
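Procedures I to III can be read as a small state machine on the New Member's side. The sketch below is one possible rendering, not the prototype's implementation: the class `NewMember`, the state names, and the tuple format of the messages are hypothetical, the optional M_Read exchange is omitted, and the `max_pdu_size` value is made up for the example.

```python
class NewMember:
    """New Member side of enrollment, starting after the M_Connect Response."""

    def __init__(self, address=None):
        self.address = address
        self.state = "CONNECTED"
        self.rib = {}
        self.sent = []  # messages this side would transmit, in order

    def start(self):
        self.sent.append(("M_START", "Enrollment", self.address))
        self.state = "WAIT_START_R"

    def receive(self, message, payload=None):
        if self.state == "WAIT_START_R" and message == "M_START_R":
            self.address = payload  # possibly a newly assigned address
            self.state = "WAIT_STOP"
        elif self.state == "WAIT_STOP" and message == "M_CREATE":
            name, value = payload
            self.rib[name] = value
            self.sent.append(("M_CREATE_R", name, None))
        elif self.state == "WAIT_STOP" and message == "M_STOP":
            immediate = payload
            self.sent.append(("M_STOP_R", "Enrollment", None))
            # Immediate=True lets the New Member go operational right away.
            self.state = "OPERATIONAL" if immediate else "WAIT_START_OPERATION"
        elif self.state == "WAIT_START_OPERATION" and message == "M_START":
            self.sent.append(("M_START_R", "Operation", None))
            self.state = "OPERATIONAL"
        else:
            raise RuntimeError(f"unexpected {message} in state {self.state}")


member = NewMember()  # no address yet, i.e. not a member of the DIF
member.start()
member.receive("M_START_R", 25)
member.receive("M_CREATE", ("/dif/ipc/datatransfer/constants", {"max_pdu_size": 1500}))
member.receive("M_STOP", False)
state_before_start = member.state
member.receive("M_START")
```

With Immediate set to False the member waits in `WAIT_START_OPERATION` until the M_Start Operation arrives; passing True to the M_STOP step would take it straight to `OPERATIONAL`.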

Page 60: RINA detailed components overview and implementation discussion

Example Message Sequence
Skipping Application connection setup (CACEP)

• One IPC Process is a member of a DIF, another one is not

[Sequence diagram between the Joining IPC Process and the Member IPC Process; step numbers as on the slide.]

1. Joining -> Member: M_START (Enrollment_Info_object{address=null})
2. Member: The joining IPC Process has no address, so it is not a member of the DIF. Assign a valid address and reply.
3. Member -> Joining: M_START_R (ok, Enrollment_Info_object{address=25})
4. Joining: Got a positive response and an address. Wait for the M_STOP enrollment request; the RIB Daemon processes the M_CREATE messages.
4. Member: Send DIF static info (whatevercast names, data transfer constants, QoS cubes, supported policy sets) and dynamic info (neighbours, directory forwarding table entries) through a series of M_CREATE messages.
5. Member -> Joining: M_CREATE (DIF_info1) ... M_CREATE (DIF_infoN)
6. Member: Once all the information is sent, send the stop enrollment request (informing the enrollee whether it has to wait for the START operation request) and wait for the response.
7. Member -> Joining: M_STOP (Enrollment{allowed_to_start_early=true})
8. Joining: Check if I got enough data to start. If more info is required, send M_READ requests on specific objects (not the case). I’m enrolled! Now, if I have a DIF in common with one or more of the neighbors (I’m multihomed), I could enroll with them as well (next slide).
9. Joining -> Member: M_STOP_R (ok)
10. Member: Got STOP response. He’s enrolled! Send M_START message (no answer required).
11. Member -> Joining: M_START (operationalStatus)
12. Joining: Ignore if started earlier, or start now (consider enrolled now).

Page 61: RINA detailed components overview and implementation discussion

Example Message Sequence
Skipping Application connection setup (CACEP)

• Both IPC Processes are members of the same DIF

[Sequence diagram between the Joining IPC Process (also a member) and the Member IPC Process; step numbers as on the slide.]

1. Joining -> Member: M_START (Enrollment_Info_object{address=25})
2. Member: The joining IPC Process has a valid address, so it is a member of the DIF. Reply.
3. Member -> Joining: M_START_R (ok, Enrollment_Info_object{address=25})
4. Joining: Got a positive response and my address is still valid. Wait for the M_STOP enrollment request; the RIB Daemon processes the M_CREATE messages.
4. Member: Send the DIF’s dynamic info only (neighbours, directory forwarding table entries) through a series of M_CREATE messages.
5. Member -> Joining: M_CREATE (DIF_info1) ... M_CREATE (DIF_infoN)
6. Member: Once all the information is sent, send the stop enrollment request.
7. Member -> Joining: M_STOP (Enrollment{allowed_to_start_early=true})
8. Joining: Check if I got enough data to start. If more info is required, send M_READ requests on specific objects (not the case). The member I’ve talked to is now my neighbor!
9. Joining -> Member: M_STOP_R (ok)
10. Member: Got STOP response. He’s my neighbor! Send start message, no response required.
11. Member -> Joining: M_START (operationalStatus)
12. Joining: Ignore if started earlier, or start now (consider enrolled now).

Page 62: RINA detailed components overview and implementation discussion

IPC PROCESS
Flow Allocation

Page 63: RINA detailed components overview and implementation discussion

Flow Allocator

• When an Application Process generates an Allocate request, the Flow Allocator creates a flow allocator instance to manage each new flow.

• The instance is responsible for managing the flow and deallocating the ports
– DTP/DTCP instances are deleted automatically after 2MPL with no traffic.

• When it is given an Allocate Request, it does the following:

[Figure: Allocate(Dest-Appl-Name, QoS parameters) arriving at the Flow Allocator, which consults the Local Dir Cache and the Dir Forwarding Table.]

Page 64: RINA detailed components overview and implementation discussion

Details of the Allocation Data Flow: I

• Upon initialization, the FA subscribes to create/delete flow objects.

• The FAI is handed an allocate request. After determining that it is well formed, it must find the destination application.

• It consults the Directory Forwarding Table (dotted arrow). The table maps the dest-appl in the request to a “Next Place” to look for it (IPC Process @).

• That points to either a nearest-neighbor management flow (if it is multihomed there will be more than one) or a connection that does not go to a nearest neighbor but uses the data transfer AE. This connection was created by the management task and is available to all tasks within the IPC Process.

[Figure: Allocate(dest-appl, desired_flow_properties) enters the FAI, which consults the Directory Forwarding Table (Appl-names -> Next Place) and issues Create Flow(dest-appl, stuff) through IPC/RMT; the FA subscribes to create/delete Flow objects at the RIB Daemon. EFCP is shown alongside.]

Page 65: RINA detailed components overview and implementation discussion

Details of the Allocation Data Flow: II

• When a Create Flow Request arrives, the RIB Daemon forwards it to the FAI for inspection.

• If the FA determines that dest-appl is not here, then it consults the Directory Forwarding Table as before to determine where to send it next.

• If dest-appl is here, then . . .

[Figure: Create Flow(dest-appl, stuff) arrives through IPC/RMT at the RIB Daemon and is passed to the FA, which consults the Directory Forwarding Table (Appl-names -> Next Place) and forwards Create Flow(dest-appl, stuff) onward. EFCP is shown alongside.]

Page 66: RINA detailed components overview and implementation discussion

Details of the Allocation Data Flow: III

• When the Create Flow Req arrives, it is passed to the Flow Allocator.
• The Flow Allocator looks it up in the table and determines that dest-appl is here. It determines whether or not the requestor has access to dest-appl.
• If it does, dest-appl is instantiated if necessary and given an allocate indicate.
• It responds with an allocate confirm; if positive, data transfer can commence.
• In either case, or earlier if access was denied, a Create Flow Resp is sent back with the appropriate response.

[Figure: Create Flow Req(dest-appl, stuff) arrives through IPC/RMT and the RIB Daemon at the FAI, which consults the Directory Forwarding Table (Appl-names -> Next Place), exchanges Allocate Indicate and Allocate Confirm with the Dest Appl, and returns a Create Flow Resp. The Dest Appl then reads/writes via EFCP.]

Page 67: RINA detailed components overview and implementation discussion

Implementation of the Flow Allocator
Application registration and DirectoryForwardingTable

• DirectoryForwardingTable maps ApNames to the @ of the IPC Processes where they are currently registered.
– Updated by local application registration events (through the IPC API)
– Updated by remote application registration events (through remote CDAP messages processed by the RIB Daemon)
– Updated by timers (to discard stale entries)

• Distributed database, with several strategies for implementation (the larger the DIF, the more complex it becomes)
– Compromise between the load of messages to update the database vs. the timeliness of the data in each DB
– Fully replicated vs. partially replicated

• Current implementation: simple, only for small DIFs.
– Fully replicated database (all the IPC Processes know about all the registered applications in the DIF)
– Each time a local application registers/unregisters, the FA sends a CDAP M_CREATE message to all its nearest neighbors
– Each time a new mapping is learned (from a remote update), if the value of that mapping changed, the FA sends a CDAP M_CREATE message to all its nearest neighbors, except for the one that notified the update
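The fully replicated strategy described above amounts to flooding with duplicate suppression. A minimal sketch, assuming direct method calls stand in for CDAP M_CREATE messages on the management flows; the class `FlowAllocator` here, its methods, the `connect` helper, and the application name are hypothetical, and neither unregistration nor the stale-entry timers are modelled.

```python
class FlowAllocator:
    def __init__(self, address):
        self.address = address
        self.directory = {}  # application name -> address where it is registered
        self.neighbors = []  # nearest-neighbor FlowAllocator objects

    def register_local(self, ap_name):
        self._learn(ap_name, self.address, learned_from=None)

    def on_remote_update(self, ap_name, address, sender):
        self._learn(ap_name, address, learned_from=sender)

    def _learn(self, ap_name, address, learned_from):
        if self.directory.get(ap_name) == address:
            return  # value unchanged: do not propagate, which stops loops
        self.directory[ap_name] = address
        for neighbor in self.neighbors:
            if neighbor is not learned_from:
                # Stands in for a CDAP M_CREATE sent to this neighbor.
                neighbor.on_remote_update(ap_name, address, sender=self)


def connect(a, b):
    """Make a and b nearest neighbors of each other."""
    a.neighbors.append(b)
    b.neighbors.append(a)


a, b, c = FlowAllocator(1), FlowAllocator(2), FlowAllocator(3)
connect(a, b)
connect(b, c)
connect(c, a)  # a triangle, so an update can come back to its origin
a.register_local("rina.apps.echo")
```

The "only if the value changed" rule is what terminates the flood in the triangle: when the update returns to `a`, the mapping is already present and nothing more is sent.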

Page 68: RINA detailed components overview and implementation discussion

Implementation of the Flow Allocator
i2CAT: Management of flows and Interaction with EFCP

[Sequence diagram: two IPC Processes in a DIF, each with an Appl. Process, a Flow Allocator, an FAI, and a DTP/DTCP instance; step numbers as on the slide.]

1. Source Appl. Process -> Flow Allocator: Allocate Request (destAPName, QoS Params)
2. Source Flow Allocator: Map request into policies, see if it is feasible. Search for the dest app. in the directory. Create FAI. Create DTP/DTCP instance.
3. Source FAI -> destination: M_CREATE(Flow object)
4. Destination Flow Allocator: Create FAI. Check access control and policies to see if the flow is feasible.
5. Destination FAI -> Appl. Process: allocation_requested(srcApName)
6. Appl. Process -> destination FAI: allocation_response(result)
7. Destination: Create DTP/DTCP instance
8. Destination FAI -> source: M_CREATE_R(Flow object)
9. Source FAI -> Appl. Process: Allocate Response(result)

• When the flow has been established, 1 incoming and 1 outgoing queue are allocated at the layer boundary by the FAI

• Also, a new EFCP StateVector for the connection (1 per flow right now) is instantiated at the DataTransferAE, as well as 2 queues for queuing PDUs to/from the RMT

Page 69: RINA detailed components overview and implementation discussion

IPC PROCESS
Transport protocol: EFCP

Page 70: RINA detailed components overview and implementation discussion

EFCP: Error and Flow Control Protocol

• Based on delta-t with mechanism and policy separated.
– Naturally cleaves into Data Transfer and Data Transfer Control

• Data Transfer consists of tightly bound mechanisms
– Roughly similar to IP+UDP

• Data Transfer Control, if present, consists of loosely bound mechanisms.
– Flow control and retransmission (ack) control

• One or more instances per flow; policies driven by the QoS parameters.
– The Flow Allocator translates the QoS parameters into suitable policies.
– In parallel, might be used for things like p2p [sic] do.
– Used serially, avoids the need for a separate security connection as in IPsec.

• Comes in several syntactic flavors based on the length of (address, connection-endpoint-id and sequence number)
– Addresses: 8, 16, 32, 64, 128, variable.
– CEP-id: 8, 16, 32, 64
– Sequence: 4, 8, 16, 32, 64

[Figure: the Data Transfer Protocol and Data Transfer Control sharing a State Vector.]

Page 71: RINA detailed components overview and implementation discussion

EFCP: separation of port allocation from synchronization

[Figure labels: Synchronization (EFCP state machines, data transfer), Connection Endpoint, Connection, Port Allocation (FA dialogue, IPC Process management), Port-id.]

• Separating port allocation from synchronization, unlike TCP, has interesting security implications (more on this later).

• Port Allocation state is created/deleted based on explicit requests
– Local applications through the IPC API (allocate/deallocate flows)
– Remote CREATE/DELETE Flow requests from other IPC Processes

• Synchronization state is refreshed every time a DTP/DTCP packet is sent/received
– If no packet is received after a certain amount of time, the state is discarded

Page 72: RINA detailed components overview and implementation discussion

Intro to Delta-T
Timer-based connection management

• All connections exist all the time; the protocol just needs to keep caches of the state for those that have carried traffic recently
– When a PDU is received for a certain connectionId, the state of the connection is refreshed
– After a certain amount of time with no traffic, the state is discarded

• What amount of time with no traffic is necessary to safely discard the send/receive state and ensure that:
– No packets from a previous connection are accepted in a new connection
– The receiving side doesn’t close until it has received all the retransmissions of the sending side and can unambiguously respond to them
– The sending side doesn’t close until it has received an Ack for all its transmitted data, or has allowed time for an Ack of its final retransmission to return before reporting a give-up failure.
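The "cache of state" view can be sketched as below. This is an illustration only: the class `ConnectionStateCache`, the `next_seq` field, and the 30-second timeout are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class ConnectionStateCache:
    """Keeps per-connection state only while traffic is recent."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.entries = {}  # connection id -> (state dict, time of last PDU)

    def on_pdu(self, conn_id, now):
        """Refresh the state for conn_id, creating fresh state if none is cached."""
        state, last_seen = self.entries.get(conn_id, (None, None))
        if state is None or now - last_seen > self.idle_timeout:
            state = {"next_seq": 0}  # fresh state, with no handshake needed
        self.entries[conn_id] = (state, now)
        return state

    def expire(self, now):
        """Discard state for connections idle longer than the timeout."""
        stale = [c for c, (_, t) in self.entries.items()
                 if now - t > self.idle_timeout]
        for conn_id in stale:
            del self.entries[conn_id]
        return stale


cache = ConnectionStateCache(idle_timeout=30.0)
first = cache.on_pdu("cep-1", now=0.0)
first["next_seq"] = 7
same = cache.on_pdu("cep-1", now=10.0)  # traffic within the timeout: state kept
removed = cache.expire(now=50.0)        # idle for 40 s, so the state is discarded
fresh = cache.on_pdu("cep-1", now=60.0) # later traffic simply starts new state
```

How long `idle_timeout` must be for this to be safe is exactly the question the bullets above pose.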

Page 73: RINA detailed components overview and implementation discussion

Intro to Delta-T (II)
Timer-based connection management

• MPL: Max time to traverse a network

• A: Max time the receiver will wait before sending an acknowledgement

• R: Max time a sender will keep retransmitting a packet

• deltaT = MPL + A + R

• Watson showed that send state can be safely discarded after a period of 3deltaT with no traffic, and receive state can be discarded after a period of 2deltaT with no traffic

[Figure: sender/receiver timeline showing PDU 1, the PDU 1 ACK, and repeated transmissions of PDU 2, with the MPL, A, and R intervals marked.]

• No SYNs or FINs are necessary (compared to TCP) -> simpler, more robust

• Implication of Watson’s results:
– If MPL cannot be bounded, then there is no way to have reliable data transport, therefore it cannot be IPC
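As a worked example of the formula, the following computes the two discard timeouts from the three bounds. The function names and the numeric values (2 s, 0.5 s, 1.5 s) are made up for illustration; only the formula and the 3x / 2x multipliers come from the slide.

```python
def delta_t(mpl, a, r):
    """deltaT = MPL + A + R, all in the same time unit."""
    return mpl + a + r


def discard_timeouts(mpl, a, r):
    """Idle periods after which send and receive state may be discarded."""
    dt = delta_t(mpl, a, r)
    return {"send_state": 3 * dt, "receive_state": 2 * dt}


# Illustrative bounds in seconds: MPL = 2, A = 0.5, R = 1.5, so deltaT = 4.
timeouts = discard_timeouts(mpl=2.0, a=0.5, r=1.5)
```

With these bounds the sender keeps its state for 12 s of silence and the receiver for 8 s.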

Page 74: RINA detailed components overview and implementation discussion

Data Transfer Protocol (DTP)

• Notice that the flow is a straight shot: very little processing, and if there is anything to do, it is moved to the side. The most complex thing DTP does is reassembly and ordering.

• If there is a DTCP instance for this flow:
– If the flow control window closes, PDUs are shunted to the flow control Q.
– If the flow does retransmission, a copy of the PDU is put on the rexmsnQ.

• These PDUs are now DTCP’s responsibility to send when appropriate.

[Figure: DTP processing pipelines above the RMT. Outbound: Delimit SDU, Fragment/Concatenate, Sequence/Address, CRC, with the RexmsnQ and ClsdWinQ to the side. Inbound: CRC, Sequencing/Strip, Reassembly/Separation, Delimiting, with the Reassmb/SeqQ and InboundQ. DTCP PDUs are handled separately.]

Page 75: RINA detailed components overview and implementation discussion

Data Transfer PDU Contents

• Version: 8 bits (optionally used, absent in current prototypes)
• Destination-Address: Addr-Length
• Source-Address: Addr-Length
• Flow-id: Struct
– QoS-id: 8 bits
– Destination-CEP-id: Port-id-Length
– Source-CEP-id: Port-id-Length
• PDUType: 8 bits
• Flags: 8 bits
• PDU-Length: LengthLength
• SequenceNumber: SequenceNumberLength
• Sequence User-Data{DelimitedSDU* | SDUFrag}

Page 76: RINA detailed components overview and implementation discussion

DTP PDU Parsing Example (DEMO)

    int policy = dtc.EFCPEncodingPolicyType;

    switch ( policy ) {
    case PDUVERSION_DEMOPROFILE:
        /* Demo profile: 16-bit addresses and CEP-ids, 8-bit QoS-id,
           PDU type and flags, 32-bit sequence number. */
        NEXT16(destAddr);
        NEXT16(srcAddr);
        NEXT16(destCEPID);
        NEXT16(srcCEPID);
        NEXT8(qosID);
        NEXT8(pduType);
        NEXT8(flags);
        NEXT32(pduSeqNumber);
        break;
    }
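For readers who want to experiment with the demo profile outside the prototype, the same field order can be parsed with Python's `struct` module. This is a sketch under assumptions: big-endian byte order is assumed (the slide does not state it), and `DEMO_HEADER`, `DTPHeader`, and `parse_demo_header` are hypothetical names, as are the field values in the usage lines.

```python
import struct
from collections import namedtuple

# Field order follows the demo profile above; byte order is an assumption.
# H = 16 bits, B = 8 bits, I = 32 bits; ">" means big-endian with no padding.
DEMO_HEADER = struct.Struct(">HHHHBBBI")

DTPHeader = namedtuple(
    "DTPHeader",
    "dest_addr src_addr dest_cep_id src_cep_id qos_id pdu_type flags seq_number",
)


def parse_demo_header(pdu: bytes):
    """Split a PDU into its demo-profile header and the user data after it."""
    if len(pdu) < DEMO_HEADER.size:
        raise ValueError("PDU shorter than the demo-profile header")
    header = DTPHeader._make(DEMO_HEADER.unpack_from(pdu))
    return header, pdu[DEMO_HEADER.size:]


# Build a PDU with illustrative field values, then parse it back.
raw = DEMO_HEADER.pack(25, 17, 3, 4, 1, 0x80, 0, 42) + b"hello"
header, user_data = parse_demo_header(raw)
```

The header is 15 bytes in this profile (four 16-bit fields, three 8-bit fields, one 32-bit field); other EFCP encoding policies would need their own format string.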

Page 77: RINA detailed components overview and implementation discussion

DTP Policies

• UnknownFlowPolicy – When a PDU arrives for a Data Transfer Flow terminating in this IPC Process and there is no active DTSV, this policy consults the ResourceAllocator to determine what to do.

• SDUReassemblyTimer Policy – Used when fragments of an SDU are being reassembled and not all of the fragments needed to complete the SDU have arrived. Typical behavior would be to discard all PDUs associated with the SDU being reassembled.

• SDUGapTimer Policy – Used when the SDUGapTimer expires and the PDUs needed to deliver a sequence of SDUs with no gaps greater than MaxGapAllowed have not been received. Typically, the action would be to signal an error or abort the flow.

• ClsdWindPolicy – Determines what to do if the PDU should not be passed to the RMT.

• MaxPDUSize – The maximum size in bytes of a PDU in this DIF.

• MaxFlowPDUSize – The maximum size in bytes of a PDU on this flow.

• SeqRollOverThres – The value at which a new flow is created and assigned to this Port-id to support data integrity.

• MaxGapAllowed – The maximum gap in SDUs that can be delivered to the (N)-DIF port without compromising the requested QoS.

Page 78: RINA detailed components overview and implementation discussion

Data Transfer Control Protocol

• For flows with retransmission (acks) and/or flow control, a DTP flow requires a DTCP companion.

• DTCP controls flow volume; the RMT controls the combined flow rate of (N-1)-flows.
– Congestion control is provided by (N-1)-flows

• Notice there is no explicit synchronization. This is enforced by the bounds on the 3 timers Watson found are necessary. Retransmission Control bounds two of them: RTT and retries. Max PDU Lifetime is bounded by PDUProtection (TTL) or by the propagation time on an (N-1)-DIF that does not relay, e.g. a wire.

[Figure: DTP above the RMT, with the DT-SV shared between DTP and DTCP. DTCP contains Rexmsn Ctl and Flow Ctl and owns the Re-xmsn Q and the Flow Control Q. Data flow and control flow are drawn as separate paths.]

Page 79: RINA detailed components overview and implementation discussion

DTCP Policies
General policies & parameters

• TA – Maximum time an ack is delayed before sending

• TG – Maximum time to exhaust retries.

• TimeUnit – for rate-based flow control, i.e. # of PDUs sent per TimeUnit

• FlowInitPolicy – Data Transfer Control initialization policy

• SVUpdatePolicy – Updates the State Vector on arrival of a TransferPDU

• LostControlPDUPolicy – What to do if a Control PDU is lost?

Page 80: RINA detailed components overview and implementation discussion

DTCP Policies
Retransmission control

• RTTEstimator Policy – the algorithm for estimating RTT

• RetransmissionTimerExpiryPolicy – what to do when a Retransmission Timer expires, if the action is not to retransmit all PDUs with sequence numbers less than this one.

• ReceiverRetransmission Policy – executed by the receiver to determine when to positively or negatively ack PDUs.

• SenderAck Policy – provides some discretion on when PDUs may be deleted from the ReTransmissionQ. This is useful for multicast and similar situations where one might want to delay discarding PDUs from the retransmission queue.

• SenderAckList Policy – similar to the previous one, for selective ack

Page 81: RINA detailed components overview and implementation discussion

DTCP Policies
Flow control

• InitialCredit Policy – sets the initial amount of credit on the flow.

• InitialRate Policy – sets the initial sending rate to be allowed on the flow.

• ReceivingFlowControlPolicy – on receipt of a Transfer PDU, can update the flow control allocations.

• UpdateCredit Policy – determines how to update the Credit field, i.e. whether the value is absolute or relative to the sequence number.

• FlowControlOverrun Policy – what action to take if the credit or rate has been exceeded.

• ReconcileFlowConflict Policy – used when both credit- and rate-based flow control are in use and they disagree on whether the PM can send or receive data.
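To make the credit-related policies more concrete, here is a small sender-side sketch of window-based flow control. It is one plausible reading, not the specified behaviour: the class `CreditSender`, the "right window edge" variable, and the interpretation of absolute versus relative credit (relative meaning "added to an acknowledged sequence number") are all assumptions made for the example.

```python
from collections import deque


class CreditSender:
    """Sends PDUs while credit lasts; holds the rest until credit is updated."""

    def __init__(self, initial_credit):
        self.next_seq = 0
        self.right_window_edge = initial_credit - 1  # highest sequence number allowed
        self.closed_window_q = deque()
        self.sent = []

    def send(self, sdu):
        seq = self.next_seq
        self.next_seq += 1
        if seq <= self.right_window_edge:
            self.sent.append((seq, sdu))
        else:
            self.closed_window_q.append((seq, sdu))  # window closed: hold the PDU

    def update_credit(self, credit, acked_seq=None):
        """Absolute edge if acked_seq is None, otherwise relative to acked_seq."""
        if acked_seq is None:
            self.right_window_edge = credit
        else:
            self.right_window_edge = acked_seq + credit
        # Release any held PDUs that now fit inside the window.
        while (self.closed_window_q
               and self.closed_window_q[0][0] <= self.right_window_edge):
            self.sent.append(self.closed_window_q.popleft())


sender = CreditSender(initial_credit=2)
for sdu in ("a", "b", "c"):
    sender.send(sdu)
queued_before = len(sender.closed_window_q)   # "c" is held: only 2 credits
sender.update_credit(credit=2, acked_seq=1)   # edge moves to 3, releasing "c"
```

An InitialCredit policy would choose the constructor argument, and an UpdateCredit policy would decide which of the two `update_credit` forms applies.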

Page 82: RINA detailed components overview and implementation discussion

IPC PROCESS
Relaying and Multiplexing Task

Page 83: RINA detailed components overview and implementation discussion

Relaying and Multiplexing Task

• Outbound, this is the first queuing we hit, and even here it may not be necessary (see below).

• DTP flows are classed by the QoS-id part of the connection-id. RMT policy determines the servicing of the queues, for each PDU consulting the forwarding table and posting it to the proper (N-1)-port.
– Because PDUs are completely formed, RMT need not distinguish locally generated PDUs from those that arrived on an (N-1)-port.

• The natural structure of the 3 kinds of “boxes” is such as to limit the number of (N-1)-ports.

[Figure: PDUs from EFCP and (N-1)-DIF flows enter the RMT queues; the Forwarding Table selects the ports onto (N-1)-DIF A and (N-1)-DIF B.]

Page 84: RINA detailed components overview and implementation discussion

RMT Implementation Issues

• RMT is the main place where QoS policies operate

• Multiplexing requires flow control, buffering, and policies for how to manage queue space and I/O bandwidth
– RMT may discard inbound PDUs it has no place for
– RMT uses a policy to decide which outbound flow’s data will be sent when it can next send a PDU to an (N-1)Flow. A primary input to this decision is the QoS cube of the flow
– Various flow control methods to push back to the application may be used to prevent having to discard outbound data. For example, don’t take an outbound PDU from a flow until the destination (N-1)Flow is known to be able to accept it

• RMT accesses the Forwarding Table to choose the (N-1)Flow to send a PDU over

• The RMT also may receive a PDU from an (N-1)Flow that needs forwarding, refer to the Forwarding Table, and place it on an outbound (N-1)Flow
– This also has flow control/resource management implications

Page 85: RINA detailed components overview and implementation discussion

IPC PROCESS
Routing and forwarding

Page 86: RINA detailed components overview and implementation discussion

Routing and Forwarding

• When a local application generates an outbound PDU for a remote application, RMT locates the appropriate outbound (N-1)DIF flow by using the last-known address for the destination application
– This uses the “forwarding table”, a mapping of address+QoS to a specific (N-1)DIF flow. This is in general a many-to-many mapping.

• The forwarding table is also used to determine which outbound flow to use to forward a PDU going to a destination other than the current IPC Process

• The forwarding table can be computed in the same way it’s usually done: periodic recomputation, based on neighbor and link performance updates
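The address+QoS lookup described above can be sketched as a dictionary keyed by that pair. This is only an illustration: the class `ForwardingTable`, the use of port-ids to name (N-1) flows, the "take the first entry" choice, and the numeric values are assumptions, and no route computation is shown.

```python
class ForwardingTable:
    def __init__(self):
        self.entries = {}  # (destination address, qos_id) -> list of (N-1) port-ids

    def add(self, dest_addr, qos_id, port_id):
        self.entries.setdefault((dest_addr, qos_id), []).append(port_id)

    def lookup(self, dest_addr, qos_id):
        """Return the (N-1) port-id on which to post a PDU for this destination."""
        ports = self.entries.get((dest_addr, qos_id))
        if not ports:
            raise LookupError(
                f"no (N-1) flow for address {dest_addr}, QoS {qos_id}")
        return ports[0]  # trivial choice; a real policy could balance load


table = ForwardingTable()
table.add(dest_addr=25, qos_id=1, port_id=7)
table.add(dest_addr=25, qos_id=2, port_id=9)  # same destination, different QoS
table.add(dest_addr=26, qos_id=1, port_id=7)  # two destinations share one flow
chosen = table.lookup(25, 2)
```

Storing a list per key is what allows the mapping to be many-to-many: one destination can be reachable over several (N-1) flows, and one flow can serve several destinations.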

Page 87: RINA detailed components overview and implementation discussion

Routing and Forwarding

• DTP PDUs with a non-local destination transit through the RMT
• Route update messages maintain the forwarding table

[Figure: the RMT diagram again (PDUs from EFCP and (N-1)-DIF flows, Queues, Forwarding Table, Ports, (N-1)-DIF A and (N-1)-DIF B), now with a DTP PDU transiting between the two (N-1)-DIFs and Route Update Messages feeding a Compute Forwarding Table task.]

Page 88: RINA detailed components overview and implementation discussion

Complications in Implementing

• Protection on PDUs must be checked and potentially removed before RMT can examine the PCI
– E.g., the PDU could be encrypted or could be coded with redundant coding to reduce error rates

• RMT uses the decoded PCI to determine if the PDU is for a local destination, and if so for which flow

• If the PDU is determined to be transiting the IPC Process, the exiting (N-1)Flow must be identified and appropriate protection re-computed if anything has changed
– Protection needs to be recomputed only if the PDU changes
– E.g., hop count, if present, would be decremented by RMT
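The check, modify, re-protect cycle for a transiting PDU can be sketched with a CRC as the only protection. Everything about the layout here is an assumption made for illustration: a one-byte hop count at offset 0, a 4-byte CRC-32 trailer, and the function names `protect`, `unprotect`, and `relay`. The slides do not specify the prototype's PDU protection format.

```python
import zlib


def protect(pci_and_data: bytes) -> bytes:
    """Append a CRC-32 trailer computed over the whole PDU body."""
    return pci_and_data + zlib.crc32(pci_and_data).to_bytes(4, "big")


def unprotect(pdu: bytes) -> bytes:
    """Verify and strip the CRC-32 trailer, or raise if it does not match."""
    body, trailer = pdu[:-4], pdu[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != trailer:
        raise ValueError("CRC mismatch: PDU discarded")
    return body


def relay(pdu: bytes) -> bytes:
    """Check protection, decrement the hop count (byte 0 here), re-protect."""
    body = bytearray(unprotect(pdu))
    if body[0] == 0:
        raise ValueError("hop count exhausted: PDU discarded")
    body[0] -= 1  # the PDU changed, so the CRC must be recomputed
    return protect(bytes(body))


incoming = protect(bytes([3]) + b"payload")
outgoing = relay(incoming)
```

If the relay did not touch the hop count, the incoming PDU could be forwarded unchanged and the recomputation skipped, which is the optimization the last bullet points at.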

Page 89: RINA detailed components overview and implementation discussion

IPC PROCESS

Resource Allocator

Page 90: RINA detailed components overview and implementation discussion

Resource Allocator

• The resource allocator is the core of management in the IPC Process. The degree of decentralization depends on the policies and how it is used.

• The RA has a set of meters and dials that it can manipulate. The meters fall into 3 categories:
– Traffic characteristics from the user of the DIF
– Traffic characteristics of incoming and outgoing flows
– Information from other members of the DIF

• The dials:
– Creation/Deletion of QoS Classes
– Data Transfer QoS Sets
– Modifying Data Transfer Policy Parameters
– Creation/Deletion of RMT Queues
– Modify RMT Queue Servicing
– Creation/Deletion of (N-1)-flows
– Assignment of RMT Queues to (N-1)-flows
– Forwarding Table Generator Output

Page 91: RINA detailed components overview and implementation discussion

Resource Allocator: i2CAT Implementation

• Only a small subset of the RA functionality is implemented: management of N-1 flows
– Request flows to one or more N-1 DIFs
– Register the IPC Process in underlying N-1 DIFs. Process flow requests that have the IPC Process as a target (accept/deny them)

• Current policy
– Before initiating CACEP and enrollment to a neighbor, allocate a management N-1 flow to its Management AE. If enrollment is successful, allocate a data transfer flow to the Data Transfer AE of this neighbor.

[Figure: IPC Process containing the N-1 Flow Manager, on top of N-1 DIF A and N-1 DIF B]
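The current policy can be read as a two-step sequence: a management flow first, then a data transfer flow only if enrollment succeeds. The sketch below expresses that ordering; `allocate_flow` and `enroll` are hypothetical callables standing in for the N-1 Flow Manager and for CACEP plus enrollment, and the AE name strings are illustrative, not taken from the i2CAT code.

```python
def connect_to_neighbor(neighbor, allocate_flow, enroll):
    """Apply the "management flow first, data flow on success" policy.

    allocate_flow(app_name, ae_name) returns a port id for a new N-1 flow.
    enroll(port_id) runs CACEP and enrollment and returns True on success.
    Both callables are assumptions made for this sketch.
    """
    mgmt_port = allocate_flow(neighbor, "Management")
    if not enroll(mgmt_port):
        # Enrollment failed: no data transfer flow is requested.
        return {"management": mgmt_port, "data": None}
    data_port = allocate_flow(neighbor, "Data Transfer")
    return {"management": mgmt_port, "data": data_port}


# Minimal stand-ins to exercise the policy.
_next_port = [100]


def fake_allocate_flow(app_name, ae_name):
    _next_port[0] += 1
    return _next_port[0]


result = connect_to_neighbor("neighbor-1", fake_allocate_flow, lambda port: True)
```

Keeping the two flows separate means management traffic (enrollment, RIB updates) never competes with data transfer PDUs on the same N-1 flow.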

Page 92: RINA detailed components overview and implementation discussion

SHIM DIF

Page 93: RINA detailed components overview and implementation discussion

The Shim DIF

• Sits above a non-RINA transport (e.g., wire, Internet, LAN) and presents enough of the RINA API to allow an application to treat the transport as a RINA DIF

• Some transports present a poor match to RINA

• Luckily, the IPC Process is an undemanding RINA application; it needs only unreliable flows to neighbors to operate

• The non-RINA transport configuration information needed may be configured statically, or by using some non-RINA method (e.g., DNS)

• To date, we have created Shim DIFs over IP
– There are many practical issues with this mapping

Page 94: RINA detailed components overview and implementation discussion

The Shim DIF

[Figure: two Shim DIFs, one over the public Internet and one over a private IP layer. Labels: Appl. Process; DIF; IPC Process; “Shim IPC Process”; Shim DIF; UDP flow; TCP flow(s)]

• The “shim IPC Process” for IP layers is not a “real IPC Process”. It just presents an IP layer as if it were a regular DIF
– Wraps the IP layer with the DIF interface
– Maps the names of the IPC Processes of the layer above to IP addresses in the IP layer
– Creates TCP and/or UDP flows based on the QoS requested by an “allocate request”
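A minimal sketch of the two mappings a shim IPC Process performs: name to IP address, and requested QoS to TCP or UDP. The directory contents, the process names, the addresses (taken from the 192.0.2.0/24 documentation range) and the single `reliable` flag standing in for the QoS request are all assumptions for illustration.

```python
import socket

# Hypothetical static directory: IPC Process name -> (IP address, port).
DIRECTORY = {
    "ipcp-a.example": ("192.0.2.10", 32769),
    "ipcp-b.example": ("192.0.2.20", 32769),
}


def choose_transport(reliable):
    """Map the requested QoS (reduced here to one flag) to a socket type."""
    return socket.SOCK_STREAM if reliable else socket.SOCK_DGRAM


def allocate_request(dest_name, reliable, directory=DIRECTORY):
    """Resolve the destination name and create an unconnected socket for it.

    Returns (socket, (ip, port)); the caller would connect() or sendto().
    """
    address = directory[dest_name]  # KeyError if the name is unknown
    sock = socket.socket(socket.AF_INET, choose_transport(reliable))
    return sock, address


sock, address = allocate_request("ipcp-a.example", reliable=False)
sock.close()
```

A static dictionary corresponds to the "configured statically" option mentioned for the shim DIF; a DNS lookup could replace it without changing the interface.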

Page 95: RINA detailed components overview and implementation discussion

IP/TCP/UDP Practical Issues

• DNS does not provide “application names”
– An IP address (or FQDN) plus port is closer, but not exact

• NAT blocks incoming traffic unless a port is opened
– Outgoing traffic generally opens a (high) port for incoming traffic
– Manually opening ports for incoming flow requests takes administration/configuration effort, so it needs to be minimized

• Sharing a TCP flow among multiple RINA flows creates potential flow-control and starvation problems, so only UDP flows are usable for DTP traffic
– This forces one unique port number per application

• Incoming TCP connections from an IP address are not self-identifying w.r.t. the originating application (in general, outgoing ports are not well-known ports)
– We introduced a Shim-DIF-specific PDU type to handle this
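The slides do not give the format of the Shim-DIF-specific PDU, so the following is only a guess at the general idea: the first bytes sent on a new TCP connection name the originating application, so the receiver does not have to infer it from the ephemeral source port. The type code, the field layout and the function names are hypothetical.

```python
import struct

IDENTIFY = 0x01  # hypothetical PDU type code
HEADER = struct.Struct("!BH")  # type (1 byte), name length (2 bytes)


def encode_identify(app_name):
    """Build an identification PDU carrying the source application name."""
    name = app_name.encode("utf-8")
    return HEADER.pack(IDENTIFY, len(name)) + name


def decode_identify(data):
    """Parse an identification PDU and return the source application name."""
    pdu_type, length = HEADER.unpack_from(data)
    if pdu_type != IDENTIFY:
        raise ValueError("not an identification PDU")
    name = data[HEADER.size:HEADER.size + length]
    if len(name) != length:
        raise ValueError("truncated identification PDU")
    return name.decode("utf-8")


pdu = encode_identify("RINABand-client")
```

The length prefix matters because TCP is a byte stream: without it the receiver could not tell where the name ends and the first regular PDU begins.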

Page 96: RINA detailed components overview and implementation discussion

MISC. TOPICS

Page 97: RINA detailed components overview and implementation discussion

INTER-DIF DIRECTORY (IDD)

Page 98: RINA detailed components overview and implementation discussion

IDD

• IDD instances will communicate with IPC Processes
– Local communication (same node) may be the OS’s IPC or RINA flows, depending on implementation architecture
• The Reference Architecture does not mandate a method

• IDD instances will communicate with one another
– Since instances are (generally) on different nodes, standard RINA application flows will be used
– This is an area where standard approaches (protocols, AEs, …) can be adopted, but there is no requirement to adopt a single model for all DIFs and sets of DIFs

Page 99: RINA detailed components overview and implementation discussion

Network Management

• Network management operations work the same way the IPC Process works in general: operations on objects

• There is additional security (mostly OS provided) on access to the IPC Process by a local Network Management client application

• There will be RIB objects that Network Management can see and has rights to operate on, which remote IPC Processes cannot

• Network Management can cross DIF boundaries, since multiple DIFs may belong to the same management domain

Page 100: RINA detailed components overview and implementation discussion

Security

• All applications have the option to identify and accept/reject incoming flow requests from other applications

• An OS may choose to limit which applications have the right to access a particular DIF/IPC Process

• Encryption is per-DIF; if an application wants its own SDUs hidden from the IPC Process it is using to communicate via a DIF, it can encrypt them: the IPC Process never looks inside them
– Since the IPC Process is an application, this goes for its PDUs as well

Page 101: RINA detailed components overview and implementation discussion

CONCLUSIONS

Page 102: RINA detailed components overview and implementation discussion

Conclusions

• The RINA Architecture is implementable
– Three implementations are in various stages of completion, pushing one another
– The size and complexity of implementation is modest (we are currently using simple policies)

• There are many reasonable implementation approaches
– Different requirements and OSs may lead to different partitioning, language, and overall approach
– What would we do differently in our next implementation? Discussion to follow!

• With working implementations in place, bringing up a new one is much less difficult than the first ones
– Most problems will be with the new implementation
– It is also beneficial to the existing ones: new implementations can cause an existing one to follow new paths and uncover latent defects

• We welcome new partners and new implementations!

Page 103: RINA detailed components overview and implementation discussion

DISCUSSION

Page 104: RINA detailed components overview and implementation discussion

DEMO STORYBOARD

Page 105: RINA detailed components overview and implementation discussion

RINABand Test Application

• Client specifies test parameters
– Num_flows, SDU size, SDUs per flow, who sends data, reliable/unreliable flows

• Client sets up a number of flows and, once they are set up, the test starts
– Client, server or both send the agreed number of SDUs over the flows

• Test ends when all the SDUs have been received at the receiving side(s) or a timer fires (counting time without receiving SDUs)

• Client displays stats of the test
– SDUs sent/received (number, Mbps), % of lost SDUs

[Figure: RINABand client and RINABand server (Instance 1) over a DIF. Labels: Control AE, 1 flow for test control; Data AE (Inst 7), N flows for test data]
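The statistics the client displays can be derived from four numbers. The sketch below shows one plausible way to compute them; the function name, the parameter names, and the convention of measuring throughput over received SDUs (rather than sent ones) are assumptions, not taken from the RINABand code.

```python
def summarize_test(sdus_sent, sdus_received, sdu_size_bytes, duration_s):
    """Return the per-test figures: counts, throughput in Mbps, loss in %."""
    if sdus_sent > 0:
        lost_pct = 100.0 * (sdus_sent - sdus_received) / sdus_sent
    else:
        lost_pct = 0.0
    if duration_s > 0:
        # Throughput over what actually arrived: bits / seconds / 10^6.
        mbps = (sdus_received * sdu_size_bytes * 8) / (duration_s * 1_000_000)
    else:
        mbps = 0.0
    return {
        "sdus_sent": sdus_sent,
        "sdus_received": sdus_received,
        "mbps": mbps,
        "lost_pct": lost_pct,
    }


# Hypothetical run: 1000 SDUs of 1500 bytes sent, 990 received in 2 seconds.
stats = summarize_test(1000, 990, 1500, 2.0)
```

With several flows in one test, the same function could be applied per flow and to the totals.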

Page 106: RINA detailed components overview and implementation discussion

Demo scenario

• The public Internet shim DIF provides direct connectivity to all the IPC Processes in the RINA-Demo.DIF
– This doesn’t necessarily need to be the case; it depends on how the public Internet shim DIF “directory” is populated

[Figure: demo topology across three layers (“Public Internet layer”, Public Internet shim DIF, RINA-Demo.DIF) with sites in Florida, Castelldefels and Barcelona. Labels include the names Barcelona.i2CAT, Castefa.i2CAT, bigslug.TRIA-Fl, radio.TRIA-Fl and azathoth.TRIA-Fl; the hosts 84.88.40.70, 84.88.40.71, 147.83.207.208 and Tria-fl.dyndns.org; RINABand instances 1, 4 and 6; and assorted address and port numbers]

Page 107: RINA detailed components overview and implementation discussion

Demo scenario (near future)

• Missing a bit of functionality to reach this
– Routing computation
– Flow Allocator should do relaying of M_CREATE, M_CREATE_R and M_DELETE Flow requests

[Figure: near-future demo topology. Labels: “Public Internet layer”; Public Internet shim DIF; RINA-Demo.DIF; LAN shim DIF; WiFi LAN]

Page 108: RINA detailed components overview and implementation discussion

Demo storyboard

• IPC Process Creation
– As a member of a DIF (show RIB)
– Not a member of any DIF (show RIB)

• Enrollment (show RIB after joining)
– Unenrolled member contacts enrolled member
– Enrolled member contacts enrolled member
– Member goes away and joins again

• Application registration
– RINABand application(s) registering at DIF (show FA directory update)
– RINABand applications unregistering (show FA directory update)

• Flow allocation
– Establish flows and send data with the RINABand client. Show throughput, stats…