

1. Explain the Entity Relationship model and all three levels of an E-R diagram.

    An Entity Relationship model (ER model) is an abstract way to describe a database.

    It is a visual representation of different data using conventions that describe how these

    data are related to each other.

    There are three basic elements in ER models:

    Entities are the things about which we seek information. Attributes are the data we collect about the entities. Relationships provide the structure needed to draw information from multiple entities.

    Symbols used in E-R Diagram:

    Entity - rectangle; Attribute - oval; Relationship - diamond; Link - line

    Entities and Attributes

    Entity Type: a set of similar objects or a category of entities that are well defined.

    A rectangle represents an entity set, e.g. students, courses. We often just say entity and mean entity type.

    Attribute: describes one aspect of an entity type; usually [and best when] single valued and indivisible (atomic).

    Represented by an oval on the E-R diagram, e.g. name, maximum enrollment.


    Types of Attribute:

    Simple and Composite Attribute

    A simple attribute consists of a single atomic value and cannot be subdivided. For example, the attributes age, sex, etc. are simple attributes.

    A composite attribute is an attribute that can be further subdivided. For example, the attribute ADDRESS can be subdivided into street, city, state, and zip code.

    Simple Attribute: an attribute that consists of a single atomic value.

    Example: Salary, age, etc.

    Composite Attribute: an attribute whose value is not atomic.

    Example: Address = House_no : City : State

    Name = First Name : Middle Name : Last Name

    Single Valued and Multi Valued Attributes

    A single valued attribute can have only a single value. For example, a person can have only one date of birth, one age, etc. A single valued attribute can be simple or composite: date of birth is a composite attribute and age is a simple attribute, but both are single valued attributes.

    Multivalued attributes can have multiple values. For instance, a person may have multiple phone numbers, multiple degrees, etc. Multivalued attributes are shown by a double line connecting to the entity in the ER diagram.

    Single Valued Attribute: an attribute that holds a single value.

    Example 1: Age

    Example 2: City

    Example 3: Customer id

    Multi Valued Attribute: an attribute that holds multiple values.

    Example 1: A customer can have multiple phone numbers, email ids, etc.

    Example 2: A person may have several college degrees

    Stored and Derived Attributes

    The value of a derived attribute is derived from a stored attribute. For example, the date of birth of a person is a stored attribute. The value of the attribute AGE can be derived by subtracting the Date of Birth (DOB) from the current date. The stored attribute supplies a value to the related derived attribute.

    Stored Attribute: An attribute that supplies a value to the related attribute.

    Example: Date of Birth


    Derived Attribute: an attribute whose value is derived from a stored attribute.

    Example: age, whose value is derived from the stored attribute Date of Birth.
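    As a small illustration (not part of the original notes), the derivation of AGE from the stored Date of Birth can be sketched in Python; the dates used are invented:

    from datetime import date

    def derive_age(date_of_birth, today):
        # AGE is not stored; it is derived from the stored attribute Date of Birth.
        years = today.year - date_of_birth.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
            years -= 1
        return years

    print(derive_age(date(1990, 6, 15), date.today()))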

    Keys

    Super key: an attribute or set of attributes that uniquely identifies an entity; there can be many of these.

    Composite key: a key requiring more than one attribute.

    Candidate key: a superkey such that no proper subset of its attributes is also a superkey (a minimal superkey: it has no unnecessary attributes).

    Primary key: the candidate key chosen to be used for identifying entities and accessing records. Unless otherwise noted, "key" means primary key.

    Alternate key: a candidate key not used as the primary key.

    Secondary key: an attribute or set of attributes commonly used for accessing records, but not necessarily unique.

    Foreign key: an attribute that is the primary key of another table and is used to establish a relationship with that table, where it appears as an attribute also.
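    To make the key definitions concrete, here is a rough Python sketch that tests whether a set of attributes is a superkey of a small sample relation (the student data and attribute names are invented for illustration):

    def is_superkey(rows, attributes):
        # A set of attributes is a superkey if no two rows agree on all of them.
        seen = set()
        for row in rows:
            key = tuple(row[a] for a in attributes)
            if key in seen:
                return False
            seen.add(key)
        return True

    students = [
        {"roll_no": 1, "name": "Asha", "dept": "CS"},
        {"roll_no": 2, "name": "Ravi", "dept": "CS"},
        {"roll_no": 3, "name": "Asha", "dept": "EE"},
    ]

    print(is_superkey(students, ["roll_no"]))          # True: minimal, so a candidate (here primary) key
    print(is_superkey(students, ["roll_no", "name"]))  # True: a superkey, but not minimal
    print(is_superkey(students, ["name"]))             # False: names repeat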

    Graphical Representation in E-R diagram

    Rectangle - entity

    Ellipse - attribute (underlined attributes are [part of] the primary key)

    Double ellipse - multi-valued attribute

    Dashed ellipse - derived attribute, e.g. age is derivable from birthdate and the current date.

    Relationships

    Relationship: connects two or more entities into an association/relationship.

    Example: John majors in Computer Science.

    Relationship Type: a set of similar relationships.


    Student (entity type) is related to Department (entity type) by MajorsIn (relationship type).

    Relationship Types may also have attributes in the E-R model. When they are mapped to

    the relational model, the attributes become part of the relation. Represented by a diamond

    on E-R diagram.

    Cardinality of Relationships

    Cardinality is the number of entity instances to which another entity set can map under the

    relationship. This does not reflect a requirement that an entity has to participate in a

    relationship. Participation is another concept.

    One-to-one: X-Y is 1:1 when each entity in X is associated with at most one entity in Y,

    and each entity in Y is associated with at most one entity in X.

    One-to-many: X-Y is 1:M when each entity in X can be associated with many entities in

    Y, but each entity in Y is associated with at most one entity in X.

    Many-to-many: X-Y is M:M if each entity in X can be associated with many entities in Y, and each entity in Y is associated with many entities in X ("many" means one or more, and sometimes zero).
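    As an informal illustration, the cardinality of a relationship instance can be classified from its (X, Y) pairs; the sample data below (departments offering courses) is invented:

    def classify_cardinality(pairs):
        # Map each X to the set of Ys it relates to, and vice versa.
        x_to_y, y_to_x = {}, {}
        for x, y in pairs:
            x_to_y.setdefault(x, set()).add(y)
            y_to_x.setdefault(y, set()).add(x)
        x_many = any(len(ys) > 1 for ys in x_to_y.values())  # some X relates to many Ys
        y_many = any(len(xs) > 1 for xs in y_to_x.values())  # some Y relates to many Xs
        if not x_many and not y_many:
            return "1:1"
        if x_many and not y_many:
            return "1:M"
        if y_many and not x_many:
            return "M:1"
        return "M:M"

    # Each department offers many courses, but each course belongs to one department.
    offers = [("CS", "Databases"), ("CS", "Operating Systems"), ("EE", "Circuits")]
    print(classify_cardinality(offers))  # 1:M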


    Relationship Participation Constraints

    Total participation

    Every member of the entity set must participate in the relationship. Represented by a double line from the entity rectangle to the relationship diamond.

    E.g., a Class entity cannot exist unless related to a Faculty member entity in this example (not necessarily at Juniata). You can set this double line in Dia. In a relational model we will use the references clause.

    Key constraint

    If every entity participates in exactly one relationship, both a total participation and a key constraint hold. E.g., if a class is taught by only one faculty member.

    Partial participation

    Not every entity instance must participate. Represented by a single line from the entity rectangle to the relationship diamond. E.g., a Textbook entity can exist without being related to a Class, or vice versa.


    Strong and Weak Entities

    Strong Entity Vs Weak Entity

    An entity set that does not have sufficient

    attributes to form a primary key is termed as a

    weak entity set. An entity set that has a

    primary key is termed as strong entity set.

    A weak entity is existence dependent; that is, the existence of a weak entity depends on the existence of an identifying entity set. The discriminator (or partial key) is used to distinguish the entities of a weak entity set. The primary key of a weak entity set is formed by the primary key of the identifying entity set and the discriminator of the weak entity set. A weak entity is indicated by a double rectangle in the ER diagram. We underline the discriminator of a weak entity set with a dashed line in the ER diagram.


2. Make an ER diagram of a Library Management System (all three levels).

    A Library Management System (LMS) provides a simple GUI (graphical user interface) for the library staff to manage the functions of the library effectively. Usually when a book is returned or issued, it is noted down in a register, after which data entry is done to update the status of the books on a moderate scale. This process takes some time and proper updating cannot be guaranteed. Such anomalies in the updating process can cause loss of books, so a more user-friendly interface that could update the database instantly is in great demand in libraries.

    E-R Diagram for LMS:


3. Explain a decision table and its parts. Make a decision table for a report card.

    A decision table is an excellent tool to use in both testing and requirements management.

    Essentially it is a structured exercise to formulate requirements when dealing with

    complex business rules. Decision tables are used to model complicated logic. They can

    make it easy to see that all possible combinations of conditions have been considered and

    when conditions are missed, it is easy to see this.

    A decision table is a good way to deal with combinations of things (e.g. inputs). This technique is sometimes also referred to as a cause-effect table. The reason for this is that there is an associated logic diagramming technique called cause-effect graphing which was sometimes used to help derive the decision table (Myers describes this as a combinatorial logic network). However, most people find it more useful just to use the table. Decision tables provide a systematic way of stating complex business rules, which is useful for developers as well as for testers.

    Decision tables can be used in test design whether or not they are used in specifications, as they help testers explore the effects of combinations of different inputs and other software states that must correctly implement business rules.

    Helping the developers to do a better job can also lead to better relationships with them. Testing combinations can be a challenge, as the number of combinations can

    often be huge. Testing all combinations may be impractical if not impossible. We

    have to be satisfied with testing just a small subset of combinations but making the

    choice of which combinations to test and which to leave out is also important. If

    you do not have a systematic way of selecting combinations, an arbitrary subset

    will be used and this may well result in an ineffective test effort.

    The four quadrants:

        Conditions  |  Condition alternatives
        Actions     |  Action entries

    Each decision corresponds to a variable, relation or predicate whose possible values are

    listed among the condition alternatives. Each action is a procedure or operation to perform,

    and the entries specify whether (or in what order) the action is to be performed for the set

    of condition alternatives the entry corresponds to. Many decision tables include in their

    condition alternatives the don't care symbol, a hyphen. Using don't cares can simplify


    decision tables, especially when a given condition has little influence on the actions to be

    performed. In some cases, entire conditions thought to be important initially are found to

    be irrelevant when none of the conditions influence which actions are performed.

    Aside from the basic four quadrant structure, decision tables vary widely in the way the

    condition alternatives and action entries are represented. Some decision tables use simple

    true/false values to represent the alternatives to a condition (akin to if-then-else), other

    tables may use numbered alternatives (akin to switch-case), and some tables even use

    fuzzy logic or probabilistic representations for condition alternatives. In a similar way,

    action entries can simply represent whether an action is to be performed (check the actions

    to perform), or in more advanced decision tables, the sequencing of actions to perform

    (number the actions to perform).
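    As a hedged illustration of the four quadrants (and of the report card asked for in the question), the Python rules below encode conditions on marks, True/False/don't-care condition alternatives, and grade-assignment actions; the thresholds are invented:

    # Conditions: marks >= 75, marks >= 60, marks >= 40.
    # Condition alternatives: True, False, or None for "don't care".
    # Actions: assign grade A / B / C / Fail.
    RULES = [
        ((True,  None,  None),  "A"),
        ((False, True,  None),  "B"),
        ((False, False, True),  "C"),
        ((False, False, False), "Fail"),
    ]

    def grade(marks):
        conditions = (marks >= 75, marks >= 60, marks >= 40)
        for alternatives, action in RULES:
            # A rule fires when every non-don't-care entry matches the actual condition value.
            if all(want is None or want == got
                   for want, got in zip(alternatives, conditions)):
                return action

    for marks in (82, 65, 45, 20):
        print(marks, grade(marks))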


4. Explain various types of cohesion and coupling, along with diagrams.

    In software engineering, coupling or dependency is the degree to which each program

    module relies on each one of the other modules.

    Coupling is usually contrasted with cohesion. Low coupling often correlates with high

    cohesion, and vice versa. The software quality metrics of coupling and cohesion were

    invented by Larry Constantine, an original developer of Structured Design who was also

    an early proponent of these concepts (see also SSADM). Low coupling is often a sign of a

    well-structured computer system and a good design, and when combined with high

    cohesion, supports the general goals of high readability and maintainability.

    In computer programming, cohesion refers to the degree to which the elements of a

    module belong together. Thus, it is a measure of how strongly related each piece of

    functionality expressed by the source code of a software module is.

    Cohesion is an ordinal type of measurement and is usually expressed as high cohesion

    or low cohesion when being discussed. Modules with high cohesion tend to be

    preferable because high cohesion is associated with several desirable traits of software

    including robustness, reliability, reusability, and understandability whereas low cohesion

    is associated with undesirable traits such as being difficult to maintain, difficult to test,

    difficult to reuse, and even difficult to understand.

    Cohesion is often contrasted with coupling, a different concept. High cohesion often

    correlates with loose coupling, and vice versa. The software quality metrics of coupling

    and cohesion were invented by Larry Constantine based on characteristics of good

    programming practices that reduced maintenance and modification costs.

    Types of coupling


    Conceptual model of coupling

    Coupling can be "low" (also "loose" and "weak") or "high" (also "tight" and "strong").

    Some types of coupling, in order of highest to lowest coupling, are as follows:

    Procedural programming

    A module here refers to a subroutine of any kind, i.e. a set of one or more statements

    having a name and preferably its own set of variable names.

    Content coupling (high)

    Content coupling (also known as Pathological coupling) occurs when one module

    modifies or relies on the internal workings of another module (e.g., accessing local

    data of another module).

    Therefore changing the way the second module produces data (location, type, timing) will lead to changing the dependent module.

    Common coupling

    Common coupling (also known as Global coupling) occurs when two modules

    share the same global data (e.g., a global variable).

    Changing the shared resource implies changing all the modules using it.

    External coupling

    External coupling occurs when two modules share an externally imposed data

    format, communication protocol, or device interface. This is basically related to the

    communication to external tools and devices.

    Control coupling

    Control coupling is one module controlling the flow of another, by passing it

    information on what to do (e.g., passing a what-to-do flag).

    Stamp coupling (Data-structured coupling)

    Stamp coupling occurs when modules share a composite data structure and use

    only a part of it, possibly a different part (e.g., passing a whole record to a function

    that only needs one field of it).

    This may lead to changing the way a module reads a record because a field that the

    module does not need has been modified.

    Data coupling

    Data coupling occurs when modules share data through, for example, parameters. Each datum is an elementary piece, and these are the only data shared (e.g., passing an integer to a function that computes a square root). A short sketch contrasting data and control coupling appears after this list of coupling types.


    Message coupling (low)

    This is the loosest type of coupling. It can be achieved by state decentralization (as

    in objects) and component communication is done via parameters or message

    passing (see Message passing).

    No coupling

    Modules do not communicate at all with one another.

    Object-oriented programming

    Subclass Coupling

    Describes the relationship between a child and its parent. The child is connected to

    its parent, but the parent is not connected to the child.

    Temporal coupling

    When two actions are bundled together into one module just because they happen to occur at the same time.

    In recent work various other coupling concepts have been investigated and used as

    indicators for different modularization principles used in practice.
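    The sketch below (invented functions, Python) contrasts control coupling, where a caller passes a what-to-do flag that steers the callee, with data coupling, where each routine receives only the elementary data it needs:

    import math

    # Control coupling: the flag tells the callee which processing steps to take.
    def compute(value, operation_flag):
        if operation_flag == "sqrt":
            return math.sqrt(value)
        elif operation_flag == "square":
            return value * value

    # Data coupling: only elementary data items are passed, one routine per function.
    def square_root(value):
        return math.sqrt(value)

    def square(value):
        return value * value

    print(compute(9, "sqrt"), square_root(9))  # same result, different degrees of coupling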

    Disadvantages

    Tightly coupled systems tend to exhibit the following developmental characteristics,

    which are often seen as disadvantages:

    1. A change in one module usually forces a ripple effect of changes in other modules.

    2. Assembly of modules might require more effort and/or time due to the increased inter-module dependency.

    3. A particular module might be harder to reuse and/or test because dependent modules must be included.

    Performance issues

    Whether loosely or tightly coupled, a system's performance is often reduced by message and parameter creation, transmission, translation (e.g. marshaling) and message interpretation. A simple message (which might be a reference to a string, array or data structure) requires less overhead than a complicated message such as a SOAP message.

    Longer messages require more CPU and memory to produce. To optimize runtime

    performance, message length must be minimized and message meaning must be

    maximized.


    Message Transmission Overhead and Performance

    Since a message must be transmitted in full to retain its complete meaning,

    message transmission must be optimized. Longer messages require more CPU and

    memory to transmit and receive. Also, when necessary, receivers must reassemble

    a message into its original state to completely receive it. Hence, to optimize

    runtime performance, message length must be minimized and message meaning

    must be maximized.

    Message Translation Overhead and Performance

    Message protocols and messages themselves often contain extra information (i.e.,

    packet, structure, definition and language information). Hence, the receiver often

    needs to translate a message into a more refined form by removing extra characters and structure information and/or by converting values from one type to another. Any sort of translation increases CPU and/or memory overhead. To optimize

    runtime performance, message form and content must be reduced and refined to

    maximize its meaning and reduce translation.

    Message Interpretation Overhead and Performance

    All messages must be interpreted by the receiver. Simple messages such as integers

    might not require additional processing to be interpreted. However, complex

    messages such as SOAP messages require a parser and a string transformer for

    them to exhibit intended meanings. To optimize runtime performance, messages

    must be refined and reduced to minimize interpretation overhead.

    Solutions

    One approach to decreasing coupling is functional design, which seeks to limit the responsibilities of modules along functionality. Coupling increases between two classes A and B if:

    A has an attribute that refers to (is of type) B.
    A calls on services of an object B.
    A has a method that references B (via return type or parameter).
    A is a subclass of (or implements) class B.

    Low coupling refers to a relationship in which one module interacts with another module

    through a simple and stable interface and does not need to be concerned with the other

    module's internal implementation (see Information Hiding).


    Systems such as CORBA or COM allow objects to communicate with each other without

    having to know anything about the other object's implementation. Both of these systems

    even allow for objects to communicate with objects written in other languages.

    Coupling versus Cohesion

    Coupling and Cohesion are terms which occur together very frequently. Coupling refers to

    the interdependencies between modules, while cohesion describes how related are the

    functions within a single module. Low cohesion implies that a given module performs

    tasks which are not very related to each other and hence can create problems as the

    module becomes large.

    Module coupling

    Coupling in software engineering describes a version of metrics associated with this concept.

    For data and control flow coupling:
    di: number of input data parameters
    ci: number of input control parameters
    do: number of output data parameters
    co: number of output control parameters

    For global coupling:
    gd: number of global variables used as data
    gc: number of global variables used as control

    For environmental coupling:
    w: number of modules called (fan-out)
    r: number of modules calling the module under consideration (fan-in)

    Coupling(C) = 1 - 1 / (di + 2*ci + do + 2*co + gd + 2*gc + w + r)

    Coupling(C) makes the value larger the more coupled the module is. This number ranges from approximately 0.67 (low coupling) to 1.0 (highly coupled).

    For example, if a module has only a single input and output data parameter:
    C = 1 - 1 / (1 + 0 + 1 + 0 + 0 + 0 + 1 + 0) = 1 - 1/3 = 0.67

    If a module has 5 input and output data parameters, an equal number of control parameters, and accesses 10 items of global data, with a fan-in of 3 and a fan-out of 4:
    C = 1 - 1 / (5 + 2*5 + 5 + 2*5 + 10 + 0 + 4 + 3) = 1 - 1/47 = 0.98
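    Using the formulation written out above (an assumption reconstructed from the stated 0.67-1.0 range), the two worked examples can be reproduced with a short Python function:

    def module_coupling(di=0, ci=0, do=0, co=0, gd=0, gc=0, w=0, r=0):
        # C = 1 - 1 / (di + 2*ci + do + 2*co + gd + 2*gc + w + r)
        return 1 - 1 / (di + 2 * ci + do + 2 * co + gd + 2 * gc + w + r)

    # A single input and output data parameter, calling one other module:
    print(round(module_coupling(di=1, do=1, w=1), 2))   # 0.67 (low coupling)

    # 5 data parameters in and out, 5 control parameters in and out,
    # 10 global data items, fan-out 4, fan-in 3:
    print(round(module_coupling(di=5, ci=5, do=5, co=5, gd=10, w=4, r=3), 2))  # 0.98 (highly coupled)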


    COUPLING

    An indication of the strength of interconnections between program units.

    Highly coupled systems have program units that are dependent on each other. Loosely coupled systems are made up of units that are independent or almost independent.

    Modules are independent if they can function completely without the presence of the other. Obviously, we can't have modules completely independent of each other: they must interact so that they can produce the desired outputs. The more connections between modules, the more dependent they are, in the sense that more information about one module is required to understand the other module.

    Three factors: number of interfaces, complexity of interfaces, type of info flow along interfaces.

    Want to minimize number of interfaces between modules, minimize the complexity of

    each interface, and control the type of info flow. An interface of a module is used to pass

    information to and from other modules.

    In general, modules are tightly coupled if they use shared variables or if they exchange control info.

    Loose coupling if info is held within a unit and interfaces with other units go via parameter lists. Tight coupling if global data is shared.

    If you need only one field of a record, don't pass the entire record. Keep each interface as simple and

    small as possible.

    Two types of info flow: data or control.

    Passing or receiving back control info means that the action of the module will depend on this control info, which makes it difficult to understand the module.

    Interfaces with only data communication result in the lowest degree of coupling, followed by interfaces that only transfer control data. Coupling is highest if the data is hybrid.

    Ranked highest to lowest:

    1. Content coupling: one module directly references the contents of the other. Occurs when one module modifies local data values or instructions in another module (this can happen in assembly language), when one refers to local data in another module, or when one branches into a local label of another.


    2. Common coupling: access to global data; modules bound together by global data structures.

    3. Control coupling: passing control flags (as parameters or globals) so that one module controls the sequence of processing steps in another module.

    4. Stamp coupling: similar to common coupling except that global variables are shared selectively among routines that require the data, e.g. packages in Ada. More desirable than common coupling because fewer modules will have to be modified if a shared data structure is modified. The entire data structure is passed but only parts of it are needed.

    5. Data coupling: use of parameter lists to pass data items between routines.

    COHESION

    A measure of how well the parts of a module fit together. A component should implement a single logical function or single logical entity. All the parts should contribute to the implementation.

    Many levels of cohesion:

    1. Coincidental cohesion: the parts of a component are not related but are simply bundled into a single component. Harder to understand and not reusable.

    2. Logical association: similar functions such as input, error handling, etc. are put together. The functions fall in the same logical class. A flag may be passed to determine which ones are executed. The interface is difficult to understand; code for more than one function may be intertwined, leading to severe maintenance problems. Difficult to reuse.

    3. Temporal cohesion: all statements activated at a single time, such as start up or shut down, are brought together (initialization, clean up). The functions are weakly related to one another, but more strongly related to functions in other modules, so many modules may need to change when doing maintenance.

    4. Procedural cohesion: a single control sequence, e.g., a loop or sequence of decision statements. Often cuts across functional lines. May contain only part of a complete function or parts of several functions. The functions are still weakly connected, and again unlikely to be reusable in another product.


    5. Communicational cohesion: the parts operate on the same input data or produce the same output data. The module may be performing more than one function. Generally acceptable if alternate structures with higher cohesion cannot be easily identified. Still problems with reusability.

    6. Sequential cohesion: output from one part serves as input for another part. May contain several functions or parts of different functions.

    7. Informational cohesion: performs a number of functions, each with its own entry point, with independent code for each function, all performed on the same data structure. Different from logical cohesion because the functions are not intertwined.

    8. Functional cohesion: each part is necessary for the execution of a single function, e.g., compute a square root or sort the array. Usually reusable in other contexts; maintenance is easier.

    9. Type cohesion: modules that support a data abstraction.

    This is not strictly a linear scale. Functional cohesion is much stronger than the rest, while the first two are much weaker than the others. Often many levels may be applicable when considering two elements of a module. The cohesion of a module is considered to be the highest level of cohesion that is applicable to all elements in the module.
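    A minimal Python sketch (invented functions) of the two extremes: coincidental cohesion, where unrelated operations are bundled into one component, versus functional cohesion, where every statement contributes to a single function:

    # Coincidental cohesion: unrelated operations bundled into one component.
    def misc_utilities(text, numbers):
        banner = text.upper()   # string formatting
        total = sum(numbers)    # arithmetic
        return banner, total    # nothing ties these two results together

    # Functional cohesion: every statement serves one function - sorting an array.
    def sort_array(values):
        result = list(values)
        for i in range(1, len(result)):      # simple insertion sort
            key = result[i]
            j = i - 1
            while j >= 0 and result[j] > key:
                result[j + 1] = result[j]
                j -= 1
            result[j + 1] = key
        return result

    print(misc_utilities("report", [1, 2, 3]))
    print(sort_array([5, 2, 9, 1]))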


5. Explain project selection techniques and the data dictionary with the help of examples.

    One of the biggest decisions that any organization would have to make is related to the

    projects they would undertake. Once a proposal has been received, there are numerous

    factors that need to be considered before an organization decides to take it up.

    The most viable option needs to be chosen, keeping in mind the goals and requirements of

    the organization. How is it then that you decide whether a project is viable? How do you

    decide if the project at hand is worth approving? This is where project selection methods

    come in use.

    Choosing a project using the right method is therefore of utmost importance. This is what

    will ultimately define the way the project is to be carried out.

    But the question then arises as to how you would go about finding the right methodology

    for your particular organization. At this instance, you would need careful guidance in the project selection criteria, as a small mistake could be detrimental to your project as a

    whole, and in the long run, the organization as well.

    Selection Methods

    There are various project selection methods practised by the modern business

    organizations. These methods have different features and characteristics. Therefore, each

    selection method is best for different organizations.

    Although there are many differences between these project selection methods, usually the

    underlying concepts and principles are the same.

    Following is an illustration of two of such methods (Benefit Measurement and

    Constrained Optimization methods):


    As the value of one project would need to be compared against the other projects, you

    could use the benefit measurement methods. This could include various techniques, of

    which the following are the most common:

    You and your team could come up with certain criteria that you want your ideal project objectives to meet. You could then give each project scores based on how

    they rate in each of these criteria and then choose the project with the highest

    score.

    When it comes to the Discounted Cash Flow method, the future value of a project is ascertained by considering the present value and the interest earned on the money.

    The higher the present value of the project, the better it would be for your

    organization.

    The rate of return received from the money is what is known as the IRR. Here again, you need to be looking for a high rate of return from the project.

    The mathematical approach is commonly used for larger projects. The constrained

    optimization methods require several calculations in order to decide on whether or not a

    project should be rejected.

    Cost-benefit analysis is used by several organizations to assist them to make their

    selections. Going by this method, you would have to consider all the positive aspects of

    the project which are the benefits and then deduct the negative aspects (or the costs) from

    the benefits. Based on the results you receive for different projects, you could choose

    which option would be the most viable and financially rewarding.

    These benefits and costs need to be carefully considered and quantified in order to arrive

    at a proper conclusion. Questions that you may want to consider asking in the selection

    process are:

    Would this decision help me to increase organizational value in the long run?
    How long will the equipment last?
    Would I be able to cut down on costs as I go along?

    In addition to these methods, you could also consider choosing based on opportunity cost -

    When choosing any project, you would need to keep in mind the profits that you would

    make if you decide to go ahead with the project.

    Profit optimization is therefore the ultimate goal. You need to consider the difference

    between the profits of the project you are primarily interested in and the next best

    alternative.
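    The discounted cash flow comparison described above can be sketched in a few lines of Python; the cash flows and the 10% discount rate below are invented purely for illustration:

    def net_present_value(rate, cash_flows):
        # cash_flows[0] is the initial outlay (negative); later entries are yearly inflows.
        return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))

    project_a = [-100000, 40000, 40000, 40000, 40000]
    project_b = [-100000, 10000, 30000, 50000, 70000]

    for name, flows in (("Project A", project_a), ("Project B", project_b)):
        print(name, round(net_present_value(0.10, flows), 2))
    # Under this method, the project with the higher present value is the more attractive one.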


    Implementation of the Chosen Method:

    The methods mentioned above can be carried out in various combinations. It is best that

    you try out different methods, as in this way you would be able to make the best decision

    for your organization considering a wide range of factors rather than concentrating on just

    a few. Careful consideration would therefore need to be given to each project.

    Conclusion:

    In conclusion, you would need to remember that these methods are time-consuming, but

    are absolutely essential for efficient business planning.

    It is always best to have a good plan from the inception, with a list of criteria to be

    considered and goals to be achieved. This will guide you through the entire selection

    process and will also ensure that you do make the right choice.

    A data dictionary is a collection of data about data. It maintains information about the definition, structure, and use of each data element that an organization uses.

    There are many attributes that may be stored about a data element. Typical attributes used

    in CASE tools (Computer Assisted Software Engineering) are:

    Name
    Aliases or synonyms
    Default label
    Description
    Source(s)
    Date of origin
    Users
    Programs in which used
    Change authorizations
    Access authorization
    Data type
    Length
    Units (cm., degrees C, etc.)
    Range of values
    Frequency of use
    Input/output/local
    Conditional values
    Parent structure


    Subsidiary structures
    Repetitive structures
    Physical location: record, file, database

    A data dictionary is invaluable for documentation purposes, for keeping control

    information on corporate data, for ensuring consistency of elements between

    organizational systems, and for use in developing databases.

    Data dictionary software packages are commercially available, often as part of a CASE

    package or DBMS. DD software allows for consistency checks and code generation. It is

    also used in DBMSs to generate reports.

    The terms data dictionary and data repository are used to indicate a more general software utility than a catalogue. A catalogue is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as DDL and DML compilers,

    the query optimiser, the transaction processor, report generators, and the constraint

    enforcer. On the other hand, a data dictionary is a data structure that stores metadata, i.e.,

    (structured) data about data. The software package for a stand-alone data dictionary or

    data repository may interact with the software modules of the DBMS, but it is mainly used

    by the designers, users and administrators of a computer system for information resource

    management. These systems are used to maintain information on system hardware and

    software configuration, documentation, application and users as well as other information

    relevant to system administration.

    If a data dictionary system is used only by the designers, users, and administrators and not

    by the DBMS software, it is called a passive data dictionary. Otherwise, it is called an active data dictionary. When a passive data dictionary is updated, it

    is done so manually and independently from any changes to a DBMS (database) structure.

    With an active data dictionary, the dictionary is updated first and changes occur in the

    DBMS automatically as a result.

    Database users and application developers can benefit from an authoritative data

    dictionary document that catalogs the organization, contents, and conventions of one or

    more databases. This typically includes the names and descriptions of various tables

    (records or Entities) and their contents (fields) plus additional details, like the type and

    length of each data element. Another important piece of information that a data dictionary

    can provide is the relationship between Tables. This is sometimes referred to in Entity-


    Relationship diagrams, or if using Set descriptors, identifying in which Sets database

    Tables participate.

    In an active data dictionary constraints may be placed upon the underlying data. For

    instance, a Range may be imposed on the value of numeric data in a data element (field),

    or a Record in a Table may be FORCED to participate in a set relationship with another

    Record-Type. Additionally, a distributed DBMS may have certain location specifics

    described within its active data dictionary (e.g. where Tables are physically located).

    The data dictionary consists of record types (tables) created in the database by system-generated command files, tailored for each supported back-end DBMS. Command files

    contain SQL Statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER

    TABLE (for referential integrity), etc., using the specific statement required by that type

    of database. There is no universal standard as to the level of detail in such a document.

    Middleware

    In the construction of database applications, it can be useful to introduce an additional

    layer of data dictionary software, i.e. middleware, which communicates with the

    underlying DBMS data dictionary. Such a "high-level" data dictionary may offer

    additional features and a degree of flexibility that goes beyond the limitations of the native

    "low-level" data dictionary, whose primary purpose is to support the basic functions of the

    DBMS, not the requirements of a typical application. For example, a high-level data

    dictionary can provide alternative entity-relationship models tailored to suit different

    applications that share a common database. Extensions to the data dictionary also can

    assist in query optimization against distributed databases. Additionally, DBA functions are

    often automated using restructuring tools that are tightly coupled to an active data

    dictionary.

    Software frameworks aimed at rapid application development sometimes include high-

    level data dictionary facilities, which can substantially reduce the amount of programming

    required to build menus, forms, reports, and other components of a database application,

    including the database itself. For example, PHPLens includes a PHP class library to

    automate the creation of tables, indexes, and foreign key constraints portably for multiple

    databases. Another PHP-based data dictionary, part of the RADICORE toolkit,

    automatically generates program objects, scripts, and SQL code for menus and forms with

    data validation and complex joins. For the ASP.NET environment, Base One's data


    dictionary provides cross-DBMS facilities for automated database creation, data

    validation, performance enhancement (caching and index utilization), application security,

    and extended data types. Visual DataFlex features the ability to use DataDictionaries as class files to form a middle layer between the user interface and the

    underlying database. The intent is to create standardized rules to maintain data integrity

    and enforce business rules throughout one or more related applications.

    Platform-specific examples

    Data description specifications (DDS) allow the developer to describe data attributes in

    file descriptions that are external to the application program that processes the data, in the

    context of an IBM System i.

    The table below is an example of a typical data dictionary entry. The IT staff uses this to

    develop and maintain the database.

    Field Name      Data Type   Other information
    CustomerID      Autonumber  Primary key field
    Title           Text        Lookup: Mr, Mrs, Miss, Ms; field size 4
    Surname         Text        Field size 15; indexed
    FirstName       Text        Field size 15
    DateOfBirth     Date/Time   Format: Medium Date; range check: >= 01/01/1930
    HomeTelephone   Text        Field size: 12; presence check
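    A passive data dictionary entry like the one above can also be expressed directly as metadata and used to validate records; the Python sketch below copies a few rules loosely from the table and invents the sample record:

    import datetime

    DATA_DICTIONARY = {
        "Title":         {"type": str, "max_length": 4,
                          "lookup": {"Mr", "Mrs", "Miss", "Ms"}},
        "Surname":       {"type": str, "max_length": 15, "required": True},
        "DateOfBirth":   {"type": datetime.date,
                          "min_value": datetime.date(1930, 1, 1)},
        "HomeTelephone": {"type": str, "max_length": 12, "required": True},
    }

    def validate(record):
        errors = []
        for field, rules in DATA_DICTIONARY.items():
            value = record.get(field)
            if value is None:
                if rules.get("required"):
                    errors.append(field + ": missing (presence check)")
                continue
            if not isinstance(value, rules["type"]):
                errors.append(field + ": wrong data type")
                continue
            if "max_length" in rules and len(value) > rules["max_length"]:
                errors.append(field + ": exceeds field size")
            if "lookup" in rules and value not in rules["lookup"]:
                errors.append(field + ": not in lookup list")
            if "min_value" in rules and value < rules["min_value"]:
                errors.append(field + ": fails range check")
        return errors

    print(validate({"Title": "Dr", "Surname": "Sharma",
                    "DateOfBirth": datetime.date(1925, 5, 1)}))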


6. Explain data flow diagrams and pseudocode, with the difference between a physical DFD and a logical DFD (any five points).

    To understand the differences between a physical and logical DFD, we need to know what

    DFD is. A DFD stands for data flow diagram and it helps in representing graphically the

    flow of data in an organization, particularly its information system. A DFD enables a user

    to know where information comes in, where it goes inside the organization and how it

    finally leaves the organization. A DFD does not give information about whether the processing of information takes place sequentially or in parallel. There are

    two types of DFDs known as physical and logical DFD. Though both serve the same

    purpose of representing data flow, there are some differences between the two that will be

    discussed in this article.

    Any DFD begins with an overview DFD that describes in a nutshell the system to be designed. A logical data flow diagram, as the name indicates, concentrates on the business

    and tells about the events that take place in a business and the data generated from each

    such event. A physical DFD, on the other hand is more concerned with how the flow of

    information is to be represented. It is a usual practice to use DFDs for representation of

    logical data flow and processing of data. However, it is prudent to evolve a logical DFD

    after first developing a physical DFD that reflects all the persons in the organization

    performing various operations and how data flows between all these persons.

    What is the difference between Physical DFD and Logical DFD?

    While there is no requirement in a logical DFD to depict how the system is constructed, a physical DFD must show how the system has been constructed. There

    are certain features of logical DFD that make it popular among organizations. A logical

    DFD makes it easier to communicate for the employees of an organization, leads to more

    stable systems, allows for better understanding of the system by analysts, is flexible and

    easy to maintain, and allows the user to remove redundancies easily. On the other hand, a

    physical DFD is clear on division between manual and automated processes, gives detailed

    description of processes, identifies temporary data stores, and adds more controls to make

    the system more efficient and simple.

    Data Flow Diagrams (DFDs) are used to show the flow of data through a system in terms of the inputs, processes, and outputs.


    External Entities

    Data either comes from or goes to External Entities. They are either the source or

    destination (sometimes called a source or sink) of data, which is considered to be external

    to the system. It could be people or groups that provide or input data to the system or who receive data from the system. Defined by an oval (see below) and identified by a noun.

    External Entities are not part of the system but are needed to provide sources of data used

    by the system. Fig 1 below shows an example of an External Entity

    Fig 1 External Entity

    Processes and Data Flows

    Data passed to, or from, an External Entity must be processed in some way. The passing

    of data (flow of data) is shown on the DFD as an arrow. The direction of the arrow

    defines the direction of the flow of data. All data flows to and from External Entities to

    Processes and vice versa need to be named. Fig 2 below shows an example of a data flow:

    Fig 2 Data Flow

    A Process processes data that emanates from external entities or data stores. The process could be manual, mechanised, or automated/computed. A data process will use or alter the

    data in some way. Identified from a scenario by a verb or action. Each process is given a

    unique number and is also given a name. An example of a Process is shown in Fig 3

    below:

    Fig 3 - Process

    (Figure content: Fig 1 shows the external entity Customer, Fig 2 the data flow Customer details, and Fig 3 the process 1 Add New Customer.)


    Data Stores

    A Data Store is a point where data is

    held and receives or provides data

    through data flows. Examples of data

    stores are transaction records, data files, reports, and documents. Could be a filing cabinet

    or magnetic media. Data stores are named in the singular and numbered. A manual store

    such as a filing cabinet is numbered with an M prefix. A D is used as a prefix for an

    electronic store such as a relational table. An example of an electronic data store is

    shown in Fig 4 below

    Fig 4 Data Store

    Rules

    There are certain rules that must be applied when drawing DFDs. These are explained

    below:

    An external entity cannot be connected to another external entity by a data flow.
    An external entity cannot be connected directly to a data store.
    An external entity must pass data to, or receive data from, a process using a data flow.
    A data store cannot be directly connected to another data store.
    A data store cannot be directly connected to an external entity.
    A data store can pass data to, or receive data from, a process.
    A process can pass data to and receive data from another process.
    Data must flow from an external entity to a process and then be passed on to another process or a data store.

    A matrix for the above rules is shown in Fig 5 below.

    Fig 5 DFD Rules

              Entity   Process   Store
    Entity    No       Yes       No
    Process   Yes      Yes       Yes
    Store     No       Yes       No



    There are different levels of DFDs depending on the level of detail shown

    Level 0 or context diagram

    The context diagram shows the top-level process, the whole system, as a single process

    rectangle. It shows all external entities and all data flows to and from the system.

    Analysts draw the context diagram first to show the high-level processing in a system. An

    example of a Context Diagram is shown in Fig 6 below:

    Fig 6 Context Diagram for a Car Sales System

    Level 1 DFD

    This level of DFD shows all external entities that are on the context diagram, all the high-

    level processes and all data stores used in the system. Each high-level process may

    contain sub-processes. These are shown on lower level DFDs.

    (Fig 6 content: the Bilbos Car Sales system shown as a single process, with external entities Customer and Management and data flows including customer details, new car details, monthly report details, invoice details, updated customer details, Customer Order details and staff details.)


    A Level 1 DFD for the Car Sales scenario is shown in Fig 7 below:

    Fig 7 Level 1 DFD for a Car Sales System

    (Fig 7 content: external entities Customer and Management; processes 1 Add New Customer, 2 Create Monthly Sales Report, 3 Add New Sale, 4 Add New Car Details, 5 Update Customer, 6 Create Customer Invoice, 7 Add Staff Details; data stores D1 Customer, D2 Car, D3 Sales, D4 Staff; data flows including customer details, car details, sales details, staff details, new car details, updated customer details, Customer Order details, invoice details and monthly report details.)


    Level 2 DFDs

    Each Level 1 DFD process may contain further internal processes. These are shown on

    the Level 2 DFD. The numbering system used in the Level 1 DFD is continued, and each process in the Level 2 DFD is prefixed by the Level 1 DFD number followed by a unique number for each process, i.e. for process 1, sub-processes 1.1, 1.2, 1.3, etc. See Fig 8 below.

    Fig 8 Level 2 DFD for Level 1 Process Add New Sale

    Each of the Level 2 DFDs could also have sub-processes and could be decomposed

    further into lower level DFDs i.e. 1.1.1, 1.1.2, 1.1.3 etc

    More than 3 levels for a DFD would become unmanageable.

    Lowest Level DFDs and Process Specification

    Once the DFD has been decomposed into its lowest level, each of the lower level DFDs

    can be described using pseudo-code (structured English), flow chart or similar process

    specification method that can be used by a programmer to code each process or function.

    For example, the Level 2 DFD for the Add New Sale process could be described as being

    a process that contains 3 sub-processes, Validate Order, Add Staff to Order and Generate

    New Sale. The structured English could be written thus:

    Open Customer File
    If existing customer
        Check Customer Details
    Else
        Add customer details



    End If
    Open Car File
    If car available then
        Open Sale File
        Add customer to sale
        Set car to unavailable
        Add car to sale
        Add staff details
        Calculate price
        Generate Invoice
        Close Sale File
        Close Customer File
        Close Car File
        Inform User of successful sale
        Exit process
    Else
        Inform User of problem
        Exit process
        Close Customer File
        Close Car File
    End If

    The above example is not carved in stone as the analyst may decide to write separate

    functions to validate customer and car details and that the Generate New Sale process

    could include other sub-processes.

    All that matters is that the underlying processing logic solves the problem.

    For example, if you look at Figure 8 there is a process named Validate Order, which has a dual purpose of checking both the customer details (is the customer a current customer? if not, add them to the customer file) and the car details (is the car available? if not, stop the sale process). A separate process called Validate Order could be created, but I have written the structured English to show a logical sequence: only if the car is available do we begin the transaction of creating the sale.

    I have also assumed that the staff dealing with the sale will know their own details so there

    would not be a need for the process named Add Staff to Order.

    Like all analysis and design processes, the process of producing DFDs and writing

    structured English is an iterative process
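    For comparison only, the same Add New Sale logic can be expressed in executable form; the Python sketch below uses in-memory dictionaries in place of the customer, car and sales files, and every record layout is invented:

    customers = {"C1": {"name": "A. Jones"}}
    cars = {"V7": {"model": "Hatchback", "price": 9500, "available": True}}
    sales = []

    def add_new_sale(customer_id, customer_details, car_id, staff_details):
        # Validate Order: check the customer, adding them if they are new.
        if customer_id not in customers:
            customers[customer_id] = customer_details
        car = cars.get(car_id)
        if car is None or not car["available"]:
            return "problem: car not available"   # inform user of the problem
        # Generate New Sale: only begun once the car is known to be available.
        car["available"] = False
        sale = {"customer": customer_id, "car": car_id,
                "staff": staff_details, "price": car["price"]}
        sales.append(sale)
        return "sale recorded, invoice total " + str(sale["price"])

    print(add_new_sale("C2", {"name": "B. Singh"}, "V7", {"staff_id": "S1"}))
    print(add_new_sale("C1", {"name": "A. Jones"}, "V7", {"staff_id": "S1"}))  # car already sold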


7. Explain coding techniques and types of codes.

    It is required that information must be encoded into signals before it can be transported

    across communication media. In more precise words we may say that the waveform

    pattern of voltage or current used to represent the 1s and 0s of a digital signal on a

    transmission link is called digital to digital line encoding. There are different encoding

    schemes available:

    Digital-to-Digital Encoding

    It is the representation of digital information by a digital signal.

    There are basically the following types of digital-to-digital encoding available: Unipolar, Polar, and Bipolar.

    Unipolar

    Unipolar encoding uses only one level: value 1 is a positive value and 0 remains idle. Since unipolar line encoding has one of its states at 0 volts, it is also called Return to Zero (RTZ), as shown in the figure. A common example of unipolar line encoding is the TTL logic levels used in computers and digital logic.

    Unipolar encoding has a DC (Direct Current) component and therefore cannot travel through media such as microwaves or transformers. It has a low noise margin and needs extra hardware for synchronization purposes. It is well suited where the signal path is short. For long distances, it produces stray capacitance in the transmission medium and therefore it never returns to zero, as shown in the figure.


    Polar

    Polar encoding uses two levels of voltage, say positive and negative. For example, the RS-232D interface uses polar line encoding. The signal does not return to zero; it is either a positive voltage or a negative voltage. Polar encoding may be classified as non-return to zero (NRZ), return to zero (RZ) and biphase. NRZ may be further divided into NRZ-L and NRZ-I. Biphase also has two different categories: Manchester and Differential Manchester encoding. Polar line encoding is the simplest pattern that eliminates most of the residual DC problem. The figure shows polar line encoding. It has the same problem of synchronization as unipolar encoding. The added benefit of polar encoding is that it reduces the power required to transmit the signal by one-half.

    Non-Return to Zero (NRZ)

    In NRZ-L, the level of the signal is 1 if the amplitude is positive and 0 in case of negative amplitude.

    In NRZ-I, whenever a positive amplitude or bit 1 appears in the signal, the signal gets inverted.

    The figure explains the concepts of NRZ-L and NRZ-I more precisely.


    Return to Zero (RZ)

    RZ uses three values to represent the signal: positive, negative, and zero. Bit 1 is represented when the signal changes from positive to zero. Bit 0 is represented when the signal changes from negative to zero. The figure explains the RZ concept.

    Biphase

    Biphase is implemented in two different ways as Manchester and Differential Manchester

    encoding.

    In Manchester encoding, a transition happens at the middle of each bit period. A low-to-high transition represents a 1 and a high-to-low transition represents a 0. In Differential Manchester encoding, a transition at the beginning of a bit time represents a zero.

    These encodings can detect errors during transmission because of the transition during every bit period. Therefore, the absence of a transition indicates an error condition.


    They have no DC component and there is always a transition available for synchronizing the receive and transmit clocks.

    Bipolar

    Bipolar encoding uses three voltage levels: positive, negative, and zero. Bit 0 occurs at the zero level of amplitude. Bit 1 occurs alternately at the positive and negative voltage levels, and the scheme is therefore also called Alternate Mark Inversion (AMI). There is no DC component because of the alternate polarity of the pulses for 1s. The figure describes bipolar encoding.
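    To illustrate the differences between these schemes, the rough Python sketch below maps a bit string to idealised signal levels (+1, -1, 0) under NRZ-L, Manchester and bipolar AMI; the level conventions chosen are one common option among several:

    def nrz_l(bits):
        # NRZ-L: one level for the whole bit period; here 1 -> +1 and 0 -> -1.
        return [+1 if b == "1" else -1 for b in bits]

    def manchester(bits):
        # Manchester: a transition in the middle of every bit period;
        # a 1 is low-to-high and a 0 is high-to-low (as described above).
        levels = []
        for b in bits:
            levels += [-1, +1] if b == "1" else [+1, -1]
        return levels

    def bipolar_ami(bits):
        # AMI: 0 -> zero level; successive 1s alternate polarity, so there is no DC component.
        levels, last_mark = [], -1
        for b in bits:
            if b == "0":
                levels.append(0)
            else:
                last_mark = -last_mark
                levels.append(last_mark)
        return levels

    data = "101100"
    print(nrz_l(data))
    print(manchester(data))
    print(bipolar_ami(data))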

    Analog to Digital

    Analog to digital encoding is the representation of analog information by a digital signal.

    These include PAM (Pulse Amplitude Modulation), and PCM (Pulse Code Modulation).

    Digital to Analog

    These include ASK (Amplitude Shift Keying), FSK (Frequency Shift Keying), PSK (Phase Shift Keying), QPSK (Quadrature Phase Shift Keying), and QAM (Quadrature Amplitude Modulation).

    Analog to Analog

    These are the Amplitude Modulation, Frequency Modulation and Phase Modulation techniques.

    Codecs (Coders and Decoders)

    Codec stands for coder/decoder in data communication. The conversion of

    analog to digital is necessary in situations where it is advantageous to send analog

    information across a digital circuit. Certainly, this is often the case in carrier networks,

    where huge volumes of analog voice are digitized and sent across high capacity, digital

    circuits. The device that accomplishes the analog to digital conversion is known as a


    codec. Codecs code an analog input into a digital format on the transmitting side of the

    connection, reversing the process, or decoding the information on the receiving side, in

    order to reconstitute the analog signal. Codecs are widely used to convert analog voice

    and video to digital format, and to reverse the process on the receiving end.


8. Explain algorithms to detect errors (the modulus-eleven code and the modulus-N code) with the help of algorithms and examples.

    In information theory and coding theory with applications in computer science and

    telecommunication, error detection and correction or error control are techniques that

    enable reliable delivery of digital data over unreliable communication channels. Many

    communication channels are subject to channel noise, and thus errors may be introduced

    during transmission from the source to a receiver. Error detection techniques allow

    detecting such errors, while error correction enables reconstruction of the original data.

    Error correction may generally be realized in two different ways:

    Automatic repeat request (ARQ) (sometimes also referred to as backward error correction): This is an error control technique whereby an error detection scheme is combined with requests for retransmission of erroneous data. Every block of data received is checked using the error detection code, and if the check fails, retransmission of the data is requested; this may be done repeatedly, until the data can be verified.

    Forward error correction (FEC): The sender encodes the data using an error-correcting code (ECC) prior to transmission. The additional information (redundancy) added by the code is used by the receiver to recover the original data. In general, the reconstructed data is what is deemed the "most likely" original data.

    ARQ and FEC may be combined, such that minor errors are corrected without

    retransmission, and major errors are corrected via a request for retransmission: this is

    called hybrid automatic repeat-request (HARQ).

    Error detection is most commonly realized using a suitable hash function (or checksum

    algorithm). A hash function adds a fixed-length tag to a message, which enables receivers

    to verify the delivered message by recomputing the tag and comparing it with the one

    provided.

    There exists a vast variety of different hash function designs. However, some are of

    particularly widespread use because of either their simplicity or their suitability for

    detecting certain kinds of errors (e.g., the cyclic redundancy check's performance in

    detecting burst errors).

    Random-error-correcting codes based on minimum distance coding can provide a suitable

    alternative to hash functions when a strict guarantee on the minimum number of errors to

    be detected is desired. Repetition codes, described below, are special cases of error-


    correcting codes: although rather inefficient, they find applications for both error

    correction and detection due to their simplicity.

    Repetition codes

    A repetition code is a coding scheme that repeats the bits across a channel to achieve

    error-free communication. Given a stream of data to be transmitted, the data is divided

    into blocks of bits. Each block is transmitted some predetermined number of times. For

    example, to send the bit pattern "1011", the four-bit block can be repeated three times, thus

    producing "1011 1011 1011". However, if this twelve-bit pattern was received as "1010

    1011 1011" where the first block is unlike the other two it can be determined that an

    error has occurred.

    Repetition codes are very inefficient, and can be susceptible to problems if the error occurs

    in exactly the same place for each group (e.g., "1010 1010 1010" in the previous example would be detected as correct). The advantage of repetition codes is that they are extremely

    simple, and are in fact used in some transmissions of numbers stations.
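
    The example above can be reproduced with a minimal sketch (assumed, not from the text) of a repetition-3 scheme: each block is transmitted three times and the receiver takes a per-bit majority vote, which corrects the single corrupted copy.

    from collections import Counter

    def repetition_encode(block, m=3):
        # transmit the whole block m times, e.g. "1011" -> ["1011", "1011", "1011"]
        return [block] * m

    def repetition_decode(copies):
        # per-bit majority vote across the received copies
        decoded = ""
        for position in zip(*copies):
            decoded += Counter(position).most_common(1)[0][0]
        return decoded

    received = ["1010", "1011", "1011"]        # one bit corrupted in the first copy
    print(received[0] != received[1])          # True  - the disagreement flags an error
    print(repetition_decode(received))         # "1011" - the majority vote removes it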

    Parity bits

    A parity bit is a bit that is added to a group of source bits to ensure that the number of set

    bits (i.e., bits with value 1) in the outcome is even or odd. It is a very simple scheme that

    can be used to detect single or any other odd number (i.e., three, five, etc.) of errors in the

    output. An even number of flipped bits will make the parity bit appear correct even though

    the data is erroneous.

    Extensions and variations on the parity bit mechanism are horizontal redundancy checks,

    vertical redundancy checks, and "double," "dual," or "diagonal" parity (used in RAID-DP).
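
    A short sketch of even parity (illustrative only): one bit is appended so the count of 1s is even; a single flipped bit is caught, while a second flip restores even parity and slips through, exactly as described above.

    def add_even_parity(bits):
        # append one bit so that the total number of 1s is even
        return bits + [sum(bits) % 2]

    def check_even_parity(codeword):
        # the codeword is accepted only if the overall count of 1s is even
        return sum(codeword) % 2 == 0

    sent = add_even_parity([1, 0, 1, 1, 0, 1, 0])   # -> [1, 0, 1, 1, 0, 1, 0, 0]
    print(check_even_parity(sent))                  # True

    sent[2] ^= 1                  # one bit flipped in transit: detected
    print(check_even_parity(sent))                  # False

    sent[3] ^= 1                  # a second flip restores even parity: missed
    print(check_even_parity(sent))                  # True, although the data is wrong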

    Checksums

    A checksum of a message is a modular arithmetic sum of message code words of a fixed

    word length (e.g., byte values). The sum may be negated by means of a ones'-complement

    operation prior to transmission to detect errors resulting in all-zero messages.

    Checksum schemes include parity bits, check digits, and longitudinal redundancy checks.

    Some checksum schemes, such as the Damm algorithm, the Luhn algorithm, and the

    Verhoeff algorithm, are specifically designed to detect errors commonly introduced by

    humans in writing down or remembering identification numbers.
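
    One simple form of such a checksum can be sketched as follows (an illustration only, assuming 8-bit words and an end-around-carry sum): the sender transmits the ones'-complement of the modular sum, and the receiver's sum over the data plus the checksum should come out as all ones.

    def ones_complement_sum(words, word_bits=8):
        # modular sum of the code words, folding any carry back in (end-around carry)
        mask = (1 << word_bits) - 1
        total = 0
        for w in words:
            total += w
            total = (total & mask) + (total >> word_bits)
        return total

    def make_checksum(data, word_bits=8):
        # negate (ones' complement) the sum before transmission
        return (~ones_complement_sum(data, word_bits)) & ((1 << word_bits) - 1)

    def verify(data, checksum, word_bits=8):
        # an error-free transfer sums to all ones
        return ones_complement_sum(list(data) + [checksum], word_bits) == (1 << word_bits) - 1

    message = [0x12, 0xA4, 0x7F, 0x03]
    c = make_checksum(message)
    print(verify(message, c))                    # True
    print(verify([0x12, 0xA4, 0x7E, 0x03], c))   # False - the change is detected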

    Cyclic redundancy checks (CRCs)

    A cyclic redundancy check (CRC) is a single-burst-error-detecting cyclic code and non-

    secure hash function designed to detect accidental changes to digital data in computer


    networks. It is not suitable for detecting maliciously introduced errors. It is characterized

    by specification of a so-called generator polynomial, which is used as the divisor in a

    polynomial long division over a finite field, taking the input data as the dividend, and

    where the remainder becomes the result.

    Cyclic codes have favorable properties in that they are well suited for detecting burst

    errors. CRCs are particularly easy to implement in hardware, and are therefore commonly

    used in digital networks and storage devices such as hard disk drives.

    Even parity is a special case of a cyclic redundancy check, where the single-bit CRC is

    generated by the divisor x + 1.
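
    A toy sketch of the polynomial long division described above, using the small generator x^3 + x + 1 (chosen here only for illustration): the sender appends the remainder as check bits, and the receiver accepts the codeword only if re-dividing it leaves an all-zero remainder.

    def crc_divide(bits, generator):
        # polynomial long division over GF(2); subtraction mod 2 is XOR
        bits = list(bits)
        for i in range(len(bits) - len(generator) + 1):
            if bits[i]:
                for j, g in enumerate(generator):
                    bits[i + j] ^= g
        return bits[-(len(generator) - 1):]      # the remainder (the CRC bits)

    def crc_encode(data, generator):
        # append zero check bits, divide, and replace them with the remainder
        remainder = crc_divide(data + [0] * (len(generator) - 1), generator)
        return data + remainder

    data      = [1, 0, 1, 1, 0, 1]
    generator = [1, 0, 1, 1]                     # x^3 + x + 1
    codeword  = crc_encode(data, generator)

    print(crc_divide(codeword, generator))       # [0, 0, 0] -> accepted
    corrupted = codeword[:]
    corrupted[2] ^= 1
    print(crc_divide(corrupted, generator))      # non-zero remainder -> error detected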

    Cryptographic hash functions

    The output of a cryptographic hash function, also known as a message digest, can provide

    strong assurances about data integrity, whether changes to the data are accidental (e.g., due to transmission errors) or maliciously introduced. Any modification to the data will

    likely be detected through a mismatching hash value. Furthermore, given some hash value,

    it is infeasible to find some input data (other than the one given) that will yield the same

    hash value. If an attacker can change not only the message but also the hash value, then a

    keyed hash or message authentication code (MAC) can be used for additional security.

    Without knowing the key, it is infeasible for the attacker to calculate the correct keyed

    hash value for a modified message.

    Error-correcting codes

    Any error-correcting code can be used for error detection. A code with minimum

    Hamming distance, d, can detect up to d - 1 errors in a code word. Using minimum-

    distance-based error-correcting codes for error detection can be suitable if a strict limit on

    the minimum number of errors to be detected is desired.

    Codes with minimum Hamming distance d = 2 are degenerate cases of error-correcting

    codes, and can be used to detect single errors. The parity bit is an example of a single-

    error-detecting code.
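
    As a small illustration (the toy code book below is assumed, not taken from the text), the minimum Hamming distance of a code can be computed directly, and d - 1 then gives the number of errors per code word that are guaranteed to be detectable.

    def hamming_distance(a, b):
        # number of bit positions in which two equal-length code words differ
        return sum(x != y for x, y in zip(a, b))

    codebook = ["00000", "01011", "10101", "11110"]   # a toy (5,2) code
    d_min = min(hamming_distance(a, b)
                for i, a in enumerate(codebook)
                for b in codebook[i + 1:])
    print(d_min)        # 3
    print(d_min - 1)    # 2 -> up to two bit errors per code word are always detectable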

    In digital data transmission, errors occur due to noise. The probability of error, or bit error rate, depends on the signal-to-noise ratio, the modulation type, and the method of

    demodulation.


    The bit error rate, p, may be expressed as

        p = (number of errors in N bits) / N,   for N large.

    For example, if p = 0.1 we would expect, on average, 1 error in every 10 bits. A value of p = 0.1 is actually stating that every bit has a 1/10 probability of being in error.

    Depending on the type of system and many other factors, error rates typically range from 10^-1 to 10^-5 or better.

    Information transfer via a digital system is usually packaged into a structure (a block of bits) called a message block or frame. A typical message block contains the following:

    - Synchronization pattern to mark the start of the message block
    - Destination and sometimes source addresses
    - System control / commands
    - Information
    - Error control coding check bits

    The total number of bits in the block may vary widely (from, say, 32 bits to several hundred bits) depending on the requirement.

    Clearly, if the bits are subjected to an error rate p, there is some probability that a message

    block will be received with 1 or more bits in error. In order to counteract the effects of

    errors, error control coding techniques are used to either:

    a) detect errors (error detection), or
    b) correct errors (error detection and correction).

    Broadly, there are two types of error control codes:

    a) Block Codes (parity codes, array codes, repetition codes, cyclic codes, etc.)
    b) Convolutional Codes


    BLOCK CODES

    A block code is a coding technique which generates C check bits for M message bits to

    give a stand-alone block of M + C = N bits.

    The sync bits are usually not included in the error control coding because message

    synchronization must be achieved before the message and check bits can be processed.

    The code rate is given by

        Rate = M / N = M / (M + C)

    where M = number of message bits, C = number of check bits, and N = M + C = total number of bits.

    The code rate is a measure of the proportion of freely user-assigned bits (M) to the total bits in the block (N).

    For example,

    i) A single parity bit (C = 1) applied to a block of 7 message bits gives a code rate

        R = 7 / (7 + 1) = 7/8

    ii) A (7,4) cyclic code has N = 7, M = 4, so the code rate is

        R = 4/7


    iii) A repetition-m code, in which each bit or message is transmitted m times and the receiver carries out a majority vote on each bit, has a code rate

        Rate = M / (mM) = 1/m

    DETECTION AND CORRECTION

    Consider messages transferred from a Source to a Destination, and assume that the

    Destination is able to check the received messages and detect errors.

    If no errors are detected, the Destination will accept the messages.

    If errors are detected, there are two forms of error corrections.

    a) Automatic Retransmission Request (ARQ)

    In an ARQ system, the destination sends an acknowledgment (ACK) message back to the source if no errors are detected, and a negative acknowledgment (NAK) message back to the source if errors are detected.

    If the source receives an ACK for a message it sends the next message. If the source receives a NAK it repeats the same message. This process repeats until all the messages are accepted by the destination.


    b) Forward Error Correction (FEC)

    The error control code may be powerful enough to allow the destination to attempt to

    correct the errors by further processing. This is called Forward Error Correction; no ACKs or NAKs are required.

    Many systems are hybrid in that they use both ARQ (ACK/NAK) and FEC strategies for

    error correction.
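
    A toy stop-and-wait ARQ loop is sketched below (purely illustrative: the error check is abstracted into a 'corrupted' flag and the channel is simulated with a random error probability). The source keeps retransmitting a message until the destination's check passes and an ACK comes back.

    import random

    def noisy_channel(frame, p_error=0.3):
        # stand-in for a real channel: the frame arrives corrupted with probability p_error
        return {"data": frame, "corrupted": random.random() < p_error}

    def arq_send(frames, max_tries=10):
        delivered = []
        for frame in frames:
            for _ in range(max_tries):
                received = noisy_channel(frame)
                if not received["corrupted"]:            # destination's error check passes
                    delivered.append(received["data"])   # destination replies with ACK
                    break
                # otherwise the destination replies with NAK and the source retransmits
        return delivered

    print(arq_send(["msg1", "msg2", "msg3"]))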

    Successful, False & Lost Message Transfer

    The process of checking the received messages for errors gives two possible outcomes:

    a) Errors not detected - the message is accepted.
    b) Errors detected - the message is rejected.

    An error not being detected does not mean that errors are not present; error control codes cannot detect every possible error or combination of errors. However, if errors are not detected the destination has no alternative but to accept the message, true or false. That is, if errors are not detected we may conclude either


    a) that there were no errors, i.e. the messages accepted are true (a successful message transfer), or
    b) that there were undetected errors, i.e. the messages accepted were false (a false message transfer).

    If errors are detected, the destination does not accept the message and may either request a

    re-transmission (ARQ-system) or process the block further in an attempt to correct the

    error (FEC).

    In processing the block for error correction, there are again two possible outcomes:

    a) the processor may get it right, i.e. correct the errors and give a successful message transfer, or
    b) the processor may get it wrong, i.e. not correct the errors, in which case there is a false message transfer.

    Some codes have a range of ability to detect and correct errors. For example a code may

    be able to detect and correct 1 error (single bit error) and detect 2,3 and 4 bits in error, but

    not correct them. Thus even with FEC, some messages may still be rejected and we think

    of these as lost messages. These ideas are illustrated below:


    MESSAGE TRANSFERS

    Consider message transfer between two computers, e.g. where it is required to transfer the

    contents of Computer A to Computer B.

    (Figure: messages transferred from Computer A to Computer B)

    As discussed, of the messages transferred to Computer B, some may be rejected (lost)

    and some will be accepted, and will be either true (successful transfer) or false.

    Obviously the requirement is for a high probability of successful transfer (ideally = 1), low

    probability of false transfer (ideally = 0) and a low probability of lost messages. In

    particular the false rate should be kept low, even at the expense of an increased lost

    message rate.

    Note that in some messages there may be in-built redundancy, for example in the text message

        REPAUT FOR WEDLESDAY (REPORT FOR WEDNESDAY)

    which a reader can correct from context. However, if this is followed by a date such as 10 JUNE, an error in the digits could not be corrected from context.

    Other examples where there is little or no redundancy are car registration numbers, account numbers, etc., which are generally numeric or unstructured alphanumeric information.

    There is thus a need for a low false rate appropriate to the function of the system and it is

    important for the information in Computer B to be correct even if it takes a long time to

    transfer.

    Error control coding may be considered further in two main ways.

    In terms of System Performance, i.e. the probabilities of successful, false and lost message transfer. In this case we only need to know what the error detection / correction code can do in terms of its ability to detect and correct errors (which depends on the Hamming distance).


    In terms of the Error Control Code itself, i.e. the structure, operation, characteristics and implementation of various types of codes.

    SYSTEM PERFORMANCE

    In order to determine system performance in terms of successful, false and lost message

    transfers it is necessary to know:

    1) the probability of error, or b.e.r., p;
    2) the number of bits in the message block, N;
    3) the ability of the code to detect / correct errors, usually expressed as a minimum Hamming distance, dmin, for the code.

    Since we know the b.e.r., p, and the number of bits in the block, N, we can apply the equation below:

        P(R) = [ N! / ( R! (N - R)! ) ] p^R (1 - p)^(N - R)        (note: 0! = 1, 1! = 1)

    This gives the probability of R errors in an N-bit block subject to a bit error rate p.

    Hence, for an N-bit block we can determine the probability of no errors in the block (R = 0), i.e. an error-free block:

        P(0) = [ N! / ( 0! N! ) ] p^0 (1 - p)^N = (1 - p)^N

    the probability of 1 error in the block (R = 1):

        P(1) = [ N! / ( 1! (N - 1)! ) ] p^1 (1 - p)^(N - 1) = N p (1 - p)^(N - 1)

    the probability of 2 errors in the block (R = 2):

        P(2) = [ N! / ( 2! (N - 2)! ) ] p^2 (1 - p)^(N - 2)

    and similarly for R = 3, R = 4, etc., giving P(3), P(4), P(5), ..., P(N).
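
    This binomial probability is straightforward to compute; the sketch below uses an assumed block size N = 32 and bit error rate p = 0.01 purely for illustration.

    from math import comb

    def p_errors(N, R, p):
        # probability of exactly R bit errors in an N-bit block with bit error rate p
        return comb(N, R) * (p ** R) * ((1 - p) ** (N - R))

    N, p = 32, 0.01
    print(p_errors(N, 0, p))    # probability of an error-free block, (1 - p)^N
    print(p_errors(N, 1, p))    # probability of exactly one error, N p (1 - p)^(N - 1)
    print(p_errors(N, 2, p))    # probability of exactly two errors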

    MINIMUM HAMMING DISTANCE

    The minimum Hamming distance of an error control code is a parameter which indicates

    the worst case ability of the code to detect/correct errors. In general, codes will perform

    better than indicated by the minimum Hamming distance.


    Let dmin = minimum Hamming distance,
    l = number of bit errors detected,
    t = number of bit errors corrected.

    It may be shown that

        dmin = l + t + 1,   with t <= l

    For a given dmin, there is a range of (worst-case) options from just error detection to error detection / correction.

    For example, suppose a code has a dmin= 6.

    Since, dmin = l + t + 1

    We have as options

    1) 6 = 5 + 0 + 1   {detect up to 5 errors, no correction}
    2) 6 = 4 + 1 + 1   {detect up to 4 errors, correct 1 error}
    3) 6 = 3 + 2 + 1   {detect up to 3 errors, correct 2 errors}

    After this, t>l, i.e. cannot go further, since we cannot correct more errors than can be

    detected.

    In option 1), up to 5 errors can be detected i.e. 1,2,3,4 or 5 errors detected, but there is no

    error correction.

    In option 2), up to 4 errors can be detected i.e. 1,2,3,4 errors detected, and 1 error can be

    corrected.

    In option 3), up to 3 errors can be detected i.e. 1,2,3 errors detected, and 1 and 2 errors can

    be corrected.

    Hence a given code can give several decoding (error detection / correction) options at the receiver. In an ARQ system with no FEC, we would implement option 1, i.e. detect as many errors as possible.

    If FEC were to be used, we might choose option 3, which allows 1 and 2 errors in a block to be detected and corrected; 3 errors can be detected but not corrected, and these messages could be rejected and recovered by ARQ.

    For option 3, for example, if 4 or more errors occurred, these would not be detected and the messages would be accepted but would be false messages.

    Fortunately, the higher the number of errors, the lower the probability that they will occur, for reasonable values of p.
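
    The worst-case options for a given dmin can be enumerated mechanically, as in this small sketch (illustrative only):

    def detection_correction_options(d_min):
        # all (detect l, correct t) splits allowed by d_min = l + t + 1 with t <= l
        options = []
        t = 0
        while True:
            l = d_min - 1 - t
            if t > l:
                break
            options.append((l, t))
            t += 1
        return options

    print(detection_correction_options(6))    # [(5, 0), (4, 1), (3, 2)]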


    From the above, we may conclude that:

    Message transfers are successful if no errors occur, or if up to t errors occur which are corrected, i.e.

        Probability of success = P(0) + P(1) + ... + P(t)

    Message transfers are lost if between t + 1 and l errors are detected but not corrected, i.e.

        Probability of lost = P(t+1) + P(t+2) + ... + P(l)

    Message transfers are false if l + 1 or more errors occur, i.e.

        Probability of false = P(l+1) + P(l+2) + ... + P(N)

    Example

    Using dmin = 6, option 2 (t = 1, l = 4):

        Probability of successful transfer = P(0) + P(1)
        Probability of lost messages = P(2) + P(3) + P(4)
        Probability of false messages = P(5) + P(6) + ... + P(N)
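
    Putting the pieces together, the worked example can be evaluated numerically once a block size and bit error rate are assumed (N = 32 and p = 0.01 below are arbitrary illustrative values; the three probabilities always sum to 1).

    from math import comb

    def p_errors(N, R, p):
        return comb(N, R) * (p ** R) * ((1 - p) ** (N - R))

    def transfer_probabilities(N, p, t, l):
        # success: 0..t errors (none, or corrected); lost: t+1..l errors (detected, rejected);
        # false: l+1..N errors (undetected, so the block is wrongly accepted)
        p_success = sum(p_errors(N, i, p) for i in range(0, t + 1))
        p_lost    = sum(p_errors(N, i, p) for i in range(t + 1, l + 1))
        p_false   = sum(p_errors(N, i, p) for i in range(l + 1, N + 1))
        return p_success, p_lost, p_false

    # dmin = 6 used as (t = 1, l = 4), as in the example above
    success, lost, false_rate = transfer_probabilities(32, 0.01, t=1, l=4)
    print(success, lost, false_rate)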


    9. Explain back-up-plans

    In information technology, a backup, or the process of backing up, refers to the copying

    and archiving of computer data so it may be used to restore the original after a data loss

    event. The verb form is to back up, in two words, whereas the noun is backup.

    Backups have two distinct purposes. The primary purpose is to recover data after its loss,

    be it by data deletion or corruption. Data loss can be a common experience of computer

    users. A 2008 survey found that 66% of respondents had lost files on their home PC. The

    secondary purpose of backups is to recover data from an earlier time, according to a user-

    defined data retention policy, typically configured within a backup application for how

    long copies of data are required. Though backups popularly represent a simple form of

    disaster recovery, and should be part of a disaster recovery plan, by themselves, backups

    should not alone be considered disaster recovery. One reason for this is that not all backup systems or backup applications are able to reconstitute a computer system or other

    complex configurations such as a computer cluster, active directory servers, or a database

    server, by restoring only data from a backup.

    Since a backup system contains at least one copy of all data worth saving, the data storage

    requirements can be significant. Organizing this storage space and managing the backup

    process can be a complicated undertaking. A data repository model can be used to provide

    structure to the storage. Nowadays, there are many different types of data storage devices

    that are useful for making backups. There are also many different ways in which these

    devices can be arranged to provide geographic redundancy, data security, and portability.

    Before data is sent to its storage location, it is selected, extracted, and manipulated. Many

    different techniques have been developed to optimize the backup procedure. These include

    optimizations for dealing with open files and live data sources as well as compression,

    encryption, and de-duplication, among others. Every backup scheme should include dry

    runs that validate the reliability of the data being backed up. It is important to recognize

    the limitations and human factors involved in any backup scheme.

    Because data is the heart of the enterprise, it's crucial for you to protect it. And to protect

    your organization's data, you need to implement a data backup and recovery plan. Backing

    up files can protect against accidental loss of user data, database corruption, hardware

    failures, and even natural disasters. It's your job as an administrator to make sure that

    backups are performed and that backup tapes are stored in a secure location.


    Creating a Backup and Recovery Plan

    Data backup is an insurance plan. Important files are accidentally deleted all the time.

    Mission-critical data can become corrupt. Natural disasters can leave your office in ruin.

    With a solid backup and recovery plan, you can recover from any of these. Without one,

    you're left with nothing to fall back on.

    Figuring Out a Backup Plan

    It takes time to create and implement a backup and recovery plan. You'll need to figure out

    what data needs to be backed up, how often the data should be backed up, and more. To

    help you create a plan, consider the following:

    How important is the data on your systems? The importance of data can go a long way in helping you determine if you need to back it up, as well as when and how it should be backed up. For critical data, such as a database, you'll want to have redundant backup sets that extend back for several backup periods. For less important data, such as daily user files, you won't need such an elaborate backup plan, but you'll need to back up the data regularly and ensure that the data can be recovered easily.

    What type of information does the data contain? Data that doesn't seem important to you may be very important to someone else. Thus, the type of information the data contains can help you determine if you need to back up the data, as well as when and how the data should be backed up.

    How often does the data change? The frequency of change can affect your decision on how often the data should be backed up. For example, data that changes daily should be backed up daily.

    How quickly do you need to recover the data? Time is an important factor in creating a backup plan. For critical systems, you may need to get back online swiftly. To do this, you may need to alter your backup plan.

    Do you have the equipment to perform backups? You must have backup hardware to perform backups. To perform timely backups, you may need several backup devices and several sets of backup media. Backup hardware includes tape drives, optical drives, and removable disk drives. Generally, tape drives are less expensive but slower than other types of drives.

    Who will be responsible for the backup and recovery plan? Ideally, someone should be a primary contact for the organization's backup and recovery plan. This


    person may also be responsible for performing the actual backup and recovery of

    data.

    What is the best time to schedule backups? Scheduling backups when system use is as low as possible will speed the backup process. However, you can't always schedule backups for off-peak hours. So you'll need to carefully plan when key system data is backed up.

    Do you need to store backups off-site? Storing copies of backup tapes off-site is essential to recovering your systems in the case of a natural disaster. In your off-site storage location, you should also include copies of the software you may need to install to reestablish operational systems.

    The Basic Types of Backup

    There are many techniques for backing up files. The techniques you use will depend on the type of data you're backing up, how convenient you want the recovery process to be, and more.

    If you view the properties of a file or directory in Windows Explorer, you'll note an

    attribute called Archive. This attribute often is used to determine whether a file or

    directory should be backed up. If the attribute is on, the file or directory may need to be

    backed up. The basic types of backups you can perform include

    Normal/full backups - All files that have been selected are backed up, regardless of the setting of the archive attribute. When a file is backed up, the archive attribute is cleared. If the file is later modified, this attribute is set, which indicates that the file needs to be backed up.

    Copy backups - All files that have been selected are backed up, regardless of the setting of the archive attribute. Unlike a normal backup, the archive attribute on files isn't modified. This allows you to perform other types of backups on the files at a later date.

    Differential backups - Designed to create backup copies of files that have changed since the last normal backup. The presence of the archive attribute indicates that the file has been modified and only files with this attribute are backed up. However, the archive attribute on files isn't modified. This allows you to perform other types of backups on the files at a later date.

    Incremental backups - Designed to create backups of files that have changed since the most recent normal or incremental backup. The presence of the archive


    attribute indicates that the file has been modified and only files with this attribute

    are backed up. When a file is backed up, the archive attribute is cleared. If the file

    is later modified, this attribute is set, which indicates that the file needs to be

    backed up.

    Daily backups - Designed to back up files using the modification date on the file itself. If a file has been modified on the same day as the backup, the file will be backed up. This technique doesn't change the archive attributes of files.
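
    A toy model (illustrative only; the file names are made up) of how the archive attribute drives the full, differential and incremental backup types described above:

    # each file maps to True when its archive attribute is set (changed since last capture)
    archive = {"report.doc": True, "db.mdb": True, "notes.txt": True}

    def full_backup(files):
        # back up everything and clear every archive attribute
        backed_up = list(files)
        for f in files:
            files[f] = False
        return backed_up

    def incremental_backup(files):
        # back up only files whose archive attribute is set, then clear it
        backed_up = [f for f, changed in files.items() if changed]
        for f in backed_up:
            files[f] = False
        return backed_up

    def differential_backup(files):
        # back up files whose archive attribute is set, but leave the attribute alone
        return [f for f, changed in files.items() if changed]

    print(full_backup(archive))           # all three files; attributes cleared
    archive["report.doc"] = True          # the file is edited after the full backup
    print(differential_backup(archive))   # ['report.doc'] - attribute stays set
    print(differential_backup(archive))   # ['report.doc'] again (grows from the last full)
    print(incremental_backup(archive))    # ['report.doc'] - attribute now cleared
    print(incremental_backup(archive))    # [] - nothing changed since the last incremental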

    In your backup plan you'll probably want to perform full backups on a weekly basis and

    supplement this with daily, differential, or incremental backups. You may also want to

    create an extended backup set for monthly and quarterly backups that includes additional

    files that aren't being backed up regularly.

    Tip: You'll often find that weeks or months can go by before anyone notices that a file or data source is missing. This doesn't mean the file isn't important. Although some types of

    data aren't used often, they're still needed. So don't forget that you may also want to create

    extra sets of backups for monthly or quarterly periods, or both, to ensure that you can

    recover historical data over time.

    Differential and Incremental Backups

    The difference between differential and incremental backups is extremely important. To

    understand the distinction between them, examine Table 1. As it shows, with differential

    backups you back up all the files that have changed since the last full backup (which

    means that the size of the differential backup grows over time). With incremental backups,

    you only back up files that have changed since the most recent full or incremental backup

    (which means the size of the incremental backup is usually much smaller than a full

    backup).

    Table 1. Incremental and Differential Backup Techniques

    Day of Week | Weekly Full Backup with Daily Differential Backup        | Weekly Full Backup with Daily Incremental Backup
    Sunday      | A full backup is performed.                              | A full backup is performed.
    Monday      | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Sunday.
    Tuesday     | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Monday.
    Wednesday   | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Tuesday.
    Thursday    | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Wednesday.
    Friday      | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Thursday.
    Saturday    | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Friday.
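
    The practical consequence of Table 1 is what a restore needs: the last full backup plus a single differential, versus the last full backup plus every incremental made since it. A small sketch (the day names and set labels are illustrative):

    def restore_sets(strategy, failure_day, week):
        # which backup sets are needed to restore on failure_day, following Table 1
        # (full backup on Sunday, one daily backup on each following day)
        day_index = week.index(failure_day)
        if strategy == "differential":
            # last full backup plus only the most recent differential
            return ["Sunday full"] + ([failure_day + " differential"] if day_index else [])
        if strategy == "incremental":
            # last full backup plus every incremental since it, applied in order
            return ["Sunday full"] + [d + " incremental" for d in week[1:day_index + 1]]
        raise ValueError("unknown strategy")

    week = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
    print(restore_sets("differential", "Thursday", week))
    # ['Sunday full', 'Thursday differential']
    print(restore_sets("incremental", "Thursday", week))
    # ['Sunday full', 'Monday incremental', 'Tuesday incremental',
    #  'Wednesday incremental', 'Thursday incremental']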

    Once you determine what data you're going to back up and how often, you can select

    backup devices and media that support these choices. These are covered in the next

    section.

    Selecting Backup Devices and Media

    Many tools are available for backing up data. Some are fast and expensive. Others are

    slow but very reliable. The backup solution that's right for your organization depends on

    many factors, including

    Capacity - The amount of data that you need to back up on a routine basis. Can the backup hardware support the required load given your time and resource constraints?

    Reliability - The reliability of the backup hardware and media. Can you afford to sacrifice reliability to meet budget or time needs?

    Extensibility - The extensibility of the backup solution. Will this solution meet your needs as the organization grows?

    Speed - The speed with which data can be backed up and recovered. Can you afford to sacrifice speed to reduce costs?

    Cost - The cost of the backup solution. Does it fit into your budget?

    Common Backup Solutions

    Capacity, reliability, extensibility, speed, and cost are the issues driving your backup plan.

    If you understand how these issues affect your organization, you'll be on track to select an

    appropriate backup solution. Some of the most commonly used backup solutions include

    Tape drives - Tape drives are the most common backup devices. Tape drives use magnetic tape cartridges to store data. Magnetic tapes are relatively inexpensive

    bu