

1. Explain the Entity Relationship model and all three levels of an E-R diagram.

    An Entity Relationship model (ER model) is an abstract way to describe a database.

    It is a visual representation of different data using conventions that describe how these

    data are related to each other.

    There are three basic elements in ER models:

    Entities are the things about which we seek information. Attributes are the data we collect about the entities. Relationships provide the structure needed to draw information from multiple entities.

    Symbols used in E-R Diagram:

    Entity - rectangle; Attribute - oval; Relationship - diamond; Link - line

    Entities and Attributes

    Entity Type: a set of similar objects or a category of entities that are well defined.

    A rectangle represents an entity set, e.g. students, courses. We often just say entity and mean entity type.

    Attribute: describes one aspect of an entity type; usually [and best when] single valued and indivisible (atomic).

    Represented by an oval on the E-R diagram, e.g. name, maximum enrollment.


    Types of Attribute:

    Simple and Composite Attribute

    A simple attribute consists of a single atomic value and cannot be subdivided. For example, the attributes age, sex, etc. are simple attributes.

    A composite attribute is an attribute that can be further subdivided. For example, the attribute ADDRESS can be subdivided into street, city, state, and zip code.

    Simple Attribute: an attribute that consists of a single atomic value.

    Example: Salary, age, etc.

    Composite Attribute: an attribute whose value is not atomic.

    Example: Address = House_no : City : State

    Name = First Name : Middle Name : Last Name

    Single Valued and Multi Valued Attributes

    A single valued attribute can have only a single value. For example, a person can have only one date of birth, one age, etc. A single valued attribute can be simple or composite: date of birth is a composite attribute and age is a simple attribute, but both are single valued attributes.

    Multivalued attributes can have multiple values. For instance, a person may have multiple phone numbers, multiple degrees, etc. Multivalued attributes are shown by a double line connecting to the entity in the ER diagram.

    Single Valued Attribute: an attribute that holds a single value.

    Example 1: Age

    Example 2: City

    Example 3: Customer id

    Multi Valued Attribute: an attribute that holds multiple values.

    Example 1: A customer can have multiple phone numbers, email ids, etc.

    Example 2: A person may have several college degrees

    Stored and Derived Attributes

    The value of a derived attribute is derived from a stored attribute. For example, the date of birth of a person is a stored attribute. The value of the attribute AGE can be derived by subtracting the Date of Birth (DOB) from the current date. The stored attribute supplies a value to the related derived attribute.

    Stored Attribute: An attribute that supplies a value to the related attribute.

    Example: Date of Birth


    Derived Attribute: an attribute whose value is derived from a stored attribute.

    Example: age, whose value is derived from the stored attribute Date of Birth.
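    As a small illustration (not part of the original notes), the derivation of AGE from the stored Date of Birth can be sketched in Python; the dates used are invented:

    from datetime import date

    def derive_age(date_of_birth, today):
        # AGE is not stored; it is derived from the stored attribute Date of Birth.
        years = today.year - date_of_birth.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
            years -= 1
        return years

    print(derive_age(date(1990, 6, 15), date.today()))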

    Keys

    Super key: an attribute or set of attributes that uniquely identifies an entity; there can be many of these.

    Composite key: a key requiring more than one attribute.

    Candidate key: a superkey such that no proper subset of its attributes is also a superkey (a minimal superkey: it has no unnecessary attributes).

    Primary key: the candidate key chosen to be used for identifying entities and accessing records. Unless otherwise noted, "key" means primary key.

    Alternate key: a candidate key not used as the primary key.

    Secondary key: an attribute or set of attributes commonly used for accessing records, but not necessarily unique.

    Foreign key: an attribute that is the primary key of another table and is used to establish a relationship with that table, where it appears as an attribute also.
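    To make the key definitions concrete, here is a rough Python sketch that tests whether a set of attributes is a superkey of a small sample relation (the student data and attribute names are invented for illustration):

    def is_superkey(rows, attributes):
        # A set of attributes is a superkey if no two rows agree on all of them.
        seen = set()
        for row in rows:
            key = tuple(row[a] for a in attributes)
            if key in seen:
                return False
            seen.add(key)
        return True

    students = [
        {"roll_no": 1, "name": "Asha", "dept": "CS"},
        {"roll_no": 2, "name": "Ravi", "dept": "CS"},
        {"roll_no": 3, "name": "Asha", "dept": "EE"},
    ]

    print(is_superkey(students, ["roll_no"]))          # True: minimal, so a candidate (here primary) key
    print(is_superkey(students, ["roll_no", "name"]))  # True: a superkey, but not minimal
    print(is_superkey(students, ["name"]))             # False: names repeat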

    Graphical Representation in E-R diagram

    Rectangle - entity

    Ellipse - attribute (underlined attributes are [part of] the primary key)

    Double ellipse - multi-valued attribute

    Dashed ellipse - derived attribute, e.g. age is derivable from birthdate and the current date.

    Relationships

    Relationship: connects two or more entities into an association/relationship.

    Example: John majors in Computer Science.

    Relationship Type: a set of similar relationships.


    Student (entity type) is related to Department (entity type) by MajorsIn (relationship type).

    Relationship Types may also have attributes in the E-R model. When they are mapped to

    the relational model, the attributes become part of the relation. Represented by a diamond

    on E-R diagram.

    Cardinality of Relationships

    Cardinality is the number of entity instances to which another entity set can map under the

    relationship. This does not reflect a requirement that an entity has to participate in a

    relationship. Participation is another concept.

    One-to-one: X-Y is 1:1 when each entity in X is associated with at most one entity in Y,

    and each entity in Y is associated with at most one entity in X.

    One-to-many: X-Y is 1:M when each entity in X can be associated with many entities in

    Y, but each entity in Y is associated with at most one entity in X.

    Many-to-many: X-Y is M:M if each entity in X can be associated with many entities in Y, and each entity in Y is associated with many entities in X ("many" means one or more, and sometimes zero).
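    As an informal illustration, the cardinality of a relationship instance can be classified from its (X, Y) pairs; the sample data below (departments offering courses) is invented:

    def classify_cardinality(pairs):
        # Map each X to the set of Ys it relates to, and vice versa.
        x_to_y, y_to_x = {}, {}
        for x, y in pairs:
            x_to_y.setdefault(x, set()).add(y)
            y_to_x.setdefault(y, set()).add(x)
        x_many = any(len(ys) > 1 for ys in x_to_y.values())  # some X relates to many Ys
        y_many = any(len(xs) > 1 for xs in y_to_x.values())  # some Y relates to many Xs
        if not x_many and not y_many:
            return "1:1"
        if x_many and not y_many:
            return "1:M"
        if y_many and not x_many:
            return "M:1"
        return "M:M"

    # Each department offers many courses, but each course belongs to one department.
    offers = [("CS", "Databases"), ("CS", "Operating Systems"), ("EE", "Circuits")]
    print(classify_cardinality(offers))  # 1:M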


    Relationship Participation Constraints

    Total participation

    Every member of the entity set must participate in the relationship. Represented by a double line from the entity rectangle to the relationship diamond.

    E.g., a Class entity cannot exist unless related to a Faculty member entity in this example (not necessarily at Juniata). You can set this double line in Dia. In a relational model we will use the references clause.

    Key constraint

    If every entity participates in exactly one relationship, both a total participation and a key constraint hold. E.g., if a class is taught by only one faculty member.

    Partial participation

    Not every entity instance must participate. Represented by a single line from the entity rectangle to the relationship diamond. E.g., a Textbook entity can exist without being related to a Class, or vice versa.


    Strong and Weak Entities

    Strong Entity Vs Weak Entity

    An entity set that does not have sufficient

    attributes to form a primary key is termed as a

    weak entity set. An entity set that has a

    primary key is termed as strong entity set.

    A weak entity is existence dependent; that is, the existence of a weak entity depends on the existence of an identifying entity set. The discriminator (or partial key) is used to distinguish the entities of a weak entity set. The primary key of a weak entity set is formed by the primary key of the identifying entity set and the discriminator of the weak entity set. A weak entity is indicated by a double rectangle in the ER diagram. We underline the discriminator of a weak entity set with a dashed line in the ER diagram.


2. Make an ER diagram of a Library Management System (all three levels).

    A Library Management System (LMS) provides a simple GUI (graphical user interface) for the library staff to manage the functions of the library effectively. Usually when a book is returned or issued, it is noted down in a register, after which data entry is done to update the status of the books on a moderate scale. This process takes some time and proper updating cannot be guaranteed. Such anomalies in the updating process can cause loss of books, so a more user-friendly interface that could update the database instantly is in great demand in libraries.

    E-R Diagram for LMS:


3. Explain a decision table and its parts. Make a decision table for a report card.

    A decision table is an excellent tool to use in both testing and requirements management.

    Essentially it is a structured exercise to formulate requirements when dealing with

    complex business rules. Decision tables are used to model complicated logic. They can

    make it easy to see that all possible combinations of conditions have been considered and

    when conditions are missed, it is easy to see this.

    A decision table is a good way to deal with combinations of things (e.g. inputs). This technique is sometimes also referred to as a cause-effect table. The reason for this is that there is an associated logic diagramming technique called cause-effect graphing which was sometimes used to help derive the decision table (Myers describes this as a combinatorial logic network). However, most people find it more useful just to use the table. Decision tables provide a systematic way of stating complex business rules, which is useful for developers as well as for testers.

    Decision tables can be used in test design whether or not they are used in specifications, as they help testers explore the effects of combinations of different inputs and other software states that must correctly implement business rules.

    Helping the developers to do a better job can also lead to better relationships with them. Testing combinations can be a challenge, as the number of combinations can

    often be huge. Testing all combinations may be impractical if not impossible. We

    have to be satisfied with testing just a small subset of combinations but making the

    choice of which combinations to test and which to leave out is also important. If

    you do not have a systematic way of selecting combinations, an arbitrary subset

    will be used and this may well result in an ineffective test effort.

    The four quadrants:

        Conditions  |  Condition alternatives
        Actions     |  Action entries

    Each decision corresponds to a variable, relation or predicate whose possible values are

    listed among the condition alternatives. Each action is a procedure or operation to perform,

    and the entries specify whether (or in what order) the action is to be performed for the set

    of condition alternatives the entry corresponds to. Many decision tables include in their

    condition alternatives the don't care symbol, a hyphen. Using don't cares can simplify


    decision tables, especially when a given condition has little influence on the actions to be

    performed. In some cases, entire conditions thought to be important initially are found to

    be irrelevant when none of the conditions influence which actions are performed.

    Aside from the basic four quadrant structure, decision tables vary widely in the way the

    condition alternatives and action entries are represented. Some decision tables use simple

    true/false values to represent the alternatives to a condition (akin to if-then-else), other

    tables may use numbered alternatives (akin to switch-case), and some tables even use

    fuzzy logic or probabilistic representations for condition alternatives. In a similar way,

    action entries can simply represent whether an action is to be performed (check the actions

    to perform), or in more advanced decision tables, the sequencing of actions to perform

    (number the actions to perform).
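    As a hedged illustration of the four quadrants (and of the report card asked for in the question), the Python rules below encode conditions on marks, True/False/don't-care condition alternatives, and grade-assignment actions; the thresholds are invented:

    # Conditions: marks >= 75, marks >= 60, marks >= 40.
    # Condition alternatives: True, False, or None for "don't care".
    # Actions: assign grade A / B / C / Fail.
    RULES = [
        ((True,  None,  None),  "A"),
        ((False, True,  None),  "B"),
        ((False, False, True),  "C"),
        ((False, False, False), "Fail"),
    ]

    def grade(marks):
        conditions = (marks >= 75, marks >= 60, marks >= 40)
        for alternatives, action in RULES:
            # A rule fires when every non-don't-care entry matches the actual condition value.
            if all(want is None or want == got
                   for want, got in zip(alternatives, conditions)):
                return action

    for marks in (82, 65, 45, 20):
        print(marks, grade(marks))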


4. Explain various types of cohesion and coupling, along with diagrams.

    In software engineering, coupling or dependency is the degree to which each program

    module relies on each one of the other modules.

    Coupling is usually contrasted with cohesion. Low coupling often correlates with high

    cohesion, and vice versa. The software quality metrics of coupling and cohesion were

    invented by Larry Constantine, an original developer of Structured Design who was also

    an early proponent of these concepts (see also SSADM). Low coupling is often a sign of a

    well-structured computer system and a good design, and when combined with high

    cohesion, supports the general goals of high readability and maintainability.

    In computer programming, cohesion refers to the degree to which the elements of a

    module belong together. Thus, it is a measure of how strongly related each piece of

    functionality expressed by the source code of a software module is.

    Cohesion is an ordinal type of measurement and is usually expressed as high cohesion

    or low cohesion when being discussed. Modules with high cohesion tend to be

    preferable because high cohesion is associated with several desirable traits of software

    including robustness, reliability, reusability, and understandability whereas low cohesion

    is associated with undesirable traits such as being difficult to maintain, difficult to test,

    difficult to reuse, and even difficult to understand.

    Cohesion is often contrasted with coupling, a different concept. High cohesion often

    correlates with loose coupling, and vice versa. The software quality metrics of coupling

    and cohesion were invented by Larry Constantine based on characteristics of good

    programming practices that reduced maintenance and modification costs.

    Types of coupling


    Conceptual model of coupling

    Coupling can be "low" (also "loose" and "weak") or "high" (also "tight" and "strong").

    Some types of coupling, in order of highest to lowest coupling, are as follows:

    Procedural programming

    A module here refers to a subroutine of any kind, i.e. a set of one or more statements

    having a name and preferably its own set of variable names.

    Content coupling (high)

    Content coupling (also known as Pathological coupling) occurs when one module

    modifies or relies on the internal workings of another module (e.g., accessing local

    data of another module).

    Therefore changing the way the second module produces data (location, type, timing) will lead to changing the dependent module.

    Common coupling

    Common coupling (also known as Global coupling) occurs when two modules

    share the same global data (e.g., a global variable).

    Changing the shared resource implies changing all the modules using it.

    External coupling

    External coupling occurs when two modules share an externally imposed data

    format, communication protocol, or device interface. This is basically related to the

    communication to external tools and devices.

    Control coupling

    Control coupling is one module controlling the flow of another, by passing it

    information on what to do (e.g., passing a what-to-do flag).

    Stamp coupling (Data-structured coupling)

    Stamp coupling occurs when modules share a composite data structure and use

    only a part of it, possibly a different part (e.g., passing a whole record to a function

    that only needs one field of it).

    This may lead to changing the way a module reads a record because a field that the

    module does not need has been modified.

    Data coupling

    Data coupling occurs when modules share data through, for example, parameters. Each datum is an elementary piece, and these are the only data shared (e.g., passing an integer to a function that computes a square root). A short sketch contrasting data and control coupling appears after this list of coupling types.


    Message coupling (low)

    This is the loosest type of coupling. It can be achieved by state decentralization (as

    in objects) and component communication is done via parameters or message

    passing (see Message passing).

    No coupling

    Modules do not communicate at all with one another.

    Object-oriented programming

    Subclass Coupling

    Describes the relationship between a child and its parent. The child is connected to

    its parent, but the parent is not connected to the child.

    Temporal coupling

    When two actions are bundled together into one module just because they happen to occur at the same time.

    In recent work various other coupling concepts have been investigated and used as

    indicators for different modularization principles used in practice.
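    The sketch below (invented functions, Python) contrasts control coupling, where a caller passes a what-to-do flag that steers the callee, with data coupling, where each routine receives only the elementary data it needs:

    import math

    # Control coupling: the flag tells the callee which processing steps to take.
    def compute(value, operation_flag):
        if operation_flag == "sqrt":
            return math.sqrt(value)
        elif operation_flag == "square":
            return value * value

    # Data coupling: only elementary data items are passed, one routine per function.
    def square_root(value):
        return math.sqrt(value)

    def square(value):
        return value * value

    print(compute(9, "sqrt"), square_root(9))  # same result, different degrees of coupling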

    Disadvantages

    Tightly coupled systems tend to exhibit the following developmental characteristics,

    which are often seen as disadvantages:

    1. A change in one module usually forces a ripple effect of changes in other modules.

    2. Assembly of modules might require more effort and/or time due to the increased inter-module dependency.

    3. A particular module might be harder to reuse and/or test because dependent modules must be included.

    Performance issues

    Whether loosely or tightly coupled, a system's performance is often reduced by message and parameter creation, transmission, translation (e.g. marshaling) and message interpretation. A simple message (which might be a reference to a string, array or data structure) requires less overhead than a complicated message such as a SOAP message.

    Longer messages require more CPU and memory to produce. To optimize runtime

    performance, message length must be minimized and message meaning must be

    maximized.


    Message Transmission Overhead and Performance

    Since a message must be transmitted in full to retain its complete meaning,

    message transmission must be optimized. Longer messages require more CPU and

    memory to transmit and receive. Also, when necessary, receivers must reassemble

    a message into its original state to completely receive it. Hence, to optimize

    runtime performance, message length must be minimized and message meaning

    must be maximized.

    Message Translation Overhead and Performance

    Message protocols and messages themselves often contain extra information (i.e.,

    packet, structure, definition and language information). Hence, the receiver often

    needs to translate a message into a more refined form by removing extra characters and structure information and/or by converting values from one type to another. Any sort of translation increases CPU and/or memory overhead. To optimize

    runtime performance, message form and content must be reduced and refined to

    maximize its meaning and reduce translation.

    Message Interpretation Overhead and Performance

    All messages must be interpreted by the receiver. Simple messages such as integers

    might not require additional processing to be interpreted. However, complex

    messages such as SOAP messages require a parser and a string transformer for

    them to exhibit intended meanings. To optimize runtime performance, messages

    must be refined and reduced to minimize interpretation overhead.

    Solutions

    One approach to decreasing coupling is functional design, which seeks to limit the responsibilities of modules along functionality. Coupling increases between two classes A and B if:

    A has an attribute that refers to (is of type) B.
    A calls on services of an object B.
    A has a method that references B (via return type or parameter).
    A is a subclass of (or implements) class B.

    Low coupling refers to a relationship in which one module interacts with another module

    through a simple and stable interface and does not need to be concerned with the other

    module's internal implementation (see Information Hiding).


    Systems such as CORBA or COM allow objects to communicate with each other without

    having to know anything about the other object's implementation. Both of these systems

    even allow for objects to communicate with objects written in other languages.

    Coupling versus Cohesion

    Coupling and Cohesion are terms which occur together very frequently. Coupling refers to

    the interdependencies between modules, while cohesion describes how related are the

    functions within a single module. Low cohesion implies that a given module performs

    tasks which are not very related to each other and hence can create problems as the

    module becomes large.

    Module coupling

    Coupling in software engineering describes a version of metrics associated with this concept.

    For data and control flow coupling:
    di: number of input data parameters
    ci: number of input control parameters
    do: number of output data parameters
    co: number of output control parameters

    For global coupling:
    gd: number of global variables used as data
    gc: number of global variables used as control

    For environmental coupling:
    w: number of modules called (fan-out)
    r: number of modules calling the module under consideration (fan-in)

    Coupling(C) = 1 - 1 / (di + 2*ci + do + 2*co + gd + 2*gc + w + r)

    Coupling(C) makes the value larger the more coupled the module is. This number ranges from approximately 0.67 (low coupling) to 1.0 (highly coupled).

    For example, if a module has only a single input and output data parameter:
    C = 1 - 1 / (1 + 0 + 1 + 0 + 0 + 0 + 1 + 0) = 1 - 1/3 = 0.67

    If a module has 5 input and output data parameters, an equal number of control parameters, and accesses 10 items of global data, with a fan-in of 3 and a fan-out of 4:
    C = 1 - 1 / (5 + 2*5 + 5 + 2*5 + 10 + 0 + 4 + 3) = 1 - 1/47 = 0.98
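    Using the formulation written out above (an assumption reconstructed from the stated 0.67-1.0 range), the two worked examples can be reproduced with a short Python function:

    def module_coupling(di=0, ci=0, do=0, co=0, gd=0, gc=0, w=0, r=0):
        # C = 1 - 1 / (di + 2*ci + do + 2*co + gd + 2*gc + w + r)
        return 1 - 1 / (di + 2 * ci + do + 2 * co + gd + 2 * gc + w + r)

    # A single input and output data parameter, calling one other module:
    print(round(module_coupling(di=1, do=1, w=1), 2))   # 0.67 (low coupling)

    # 5 data parameters in and out, 5 control parameters in and out,
    # 10 global data items, fan-out 4, fan-in 3:
    print(round(module_coupling(di=5, ci=5, do=5, co=5, gd=10, w=4, r=3), 2))  # 0.98 (highly coupled)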


    COUPLING

    An indication of the strength of interconnections between program units.

    Highly coupled systems have program units that are dependent on each other. Loosely coupled systems are made up of units that are independent or almost independent.

    Modules are independent if they can function completely without the presence of the other. Obviously, we can't have modules completely independent of each other: they must interact so that they can produce the desired outputs. The more connections between modules, the more dependent they are, in the sense that more information about one module is required to understand the other module.

    Three factors: number of interfaces, complexity of interfaces, type of info flow along interfaces.

    Want to minimize number of interfaces between modules, minimize the complexity of

    each interface, and control the type of info flow. An interface of a module is used to pass

    information to and from other modules.

    In general, modules are tightly coupled if they use shared variables or if they exchange control info.

    Loose coupling if info is held within a unit and interfaces with other units go via parameter lists. Tight coupling if global data is shared.

    If you need only one field of a record, don't pass the entire record. Keep each interface as simple and

    small as possible.

    Two types of info flow: data or control.

    Passing or receiving back control info means that the action of the module will depend on this control info, which makes it difficult to understand the module.

    Interfaces with only data communication result in the lowest degree of coupling, followed by interfaces that only transfer control data. Coupling is highest if the data is hybrid.

    Ranked highest to lowest:

    1. Content coupling: one module directly references the contents of the other. Occurs when one module modifies local data values or instructions in another module (this can happen in assembly language), when one refers to local data in another module, or when one branches into a local label of another.


    2. Common coupling: access to global data; modules bound together by global data structures.

    3. Control coupling: passing control flags (as parameters or globals) so that one module controls the sequence of processing steps in another module.

    4. Stamp coupling: similar to common coupling except that global variables are shared selectively among routines that require the data, e.g. packages in Ada. More desirable than common coupling because fewer modules will have to be modified if a shared data structure is modified. The entire data structure is passed but only parts of it are needed.

    5. Data coupling: use of parameter lists to pass data items between routines.

    COHESION

    A measure of how well the parts of a module fit together. A component should implement a single logical function or single logical entity. All the parts should contribute to the implementation.

    Many levels of cohesion:

    1. Coincidental cohesion: the parts of a component are not related but are simply bundled into a single component. Harder to understand and not reusable.

    2. Logical association: similar functions such as input, error handling, etc. are put together. The functions fall in the same logical class. A flag may be passed to determine which ones are executed. The interface is difficult to understand; code for more than one function may be intertwined, leading to severe maintenance problems. Difficult to reuse.

    3. Temporal cohesion: all statements activated at a single time, such as start up or shut down, are brought together (initialization, clean up). The functions are weakly related to one another, but more strongly related to functions in other modules, so many modules may need to change when doing maintenance.

    4. Procedural cohesion: a single control sequence, e.g., a loop or sequence of decision statements. Often cuts across functional lines. May contain only part of a complete function or parts of several functions. The functions are still weakly connected, and again unlikely to be reusable in another product.


    5. Communicational cohesion: the parts operate on the same input data or produce the same output data. The module may be performing more than one function. Generally acceptable if alternate structures with higher cohesion cannot be easily identified. Still problems with reusability.

    6. Sequential cohesion: output from one part serves as input for another part. May contain several functions or parts of different functions.

    7. Informational cohesion: performs a number of functions, each with its own entry point, with independent code for each function, all performed on the same data structure. Different from logical cohesion because the functions are not intertwined.

    8. Functional cohesion: each part is necessary for the execution of a single function, e.g., compute a square root or sort the array. Usually reusable in other contexts; maintenance is easier.

    9. Type cohesion: modules that support a data abstraction.

    This is not strictly a linear scale. Functional cohesion is much stronger than the rest, while the first two are much weaker than the others. Often many levels may be applicable when considering two elements of a module. The cohesion of a module is considered to be the highest level of cohesion that is applicable to all elements in the module.
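    A minimal Python sketch (invented functions) of the two extremes: coincidental cohesion, where unrelated operations are bundled into one component, versus functional cohesion, where every statement contributes to a single function:

    # Coincidental cohesion: unrelated operations bundled into one component.
    def misc_utilities(text, numbers):
        banner = text.upper()   # string formatting
        total = sum(numbers)    # arithmetic
        return banner, total    # nothing ties these two results together

    # Functional cohesion: every statement serves one function - sorting an array.
    def sort_array(values):
        result = list(values)
        for i in range(1, len(result)):      # simple insertion sort
            key = result[i]
            j = i - 1
            while j >= 0 and result[j] > key:
                result[j + 1] = result[j]
                j -= 1
            result[j + 1] = key
        return result

    print(misc_utilities("report", [1, 2, 3]))
    print(sort_array([5, 2, 9, 1]))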


5. Explain project selection techniques and the data dictionary with the help of examples.

    One of the biggest decisions that any organization would have to make is related to the

    projects they would undertake. Once a proposal has been received, there are numerous

    factors that need to be considered before an organization decides to take it up.

    The most viable option needs to be chosen, keeping in mind the goals and requirements of

    the organization. How is it then that you decide whether a project is viable? How do you

    decide if the project at hand is worth approving? This is where project selection methods

    come in use.

    Choosing a project using the right method is therefore of utmost importance. This is what

    will ultimately define the way the project is to be carried out.

    But the question then arises as to how you would go about finding the right methodology

    for your particular organization. At this instance, you would need careful guidance in the project selection criteria, as a small mistake could be detrimental to your project as a

    whole, and in the long run, the organization as well.

    Selection Methods

    There are various project selection methods practised by the modern business

    organizations. These methods have different features and characteristics. Therefore, each

    selection method is best for different organizations.

    Although there are many differences between these project selection methods, usually the

    underlying concepts and principles are the same.

    Following is an illustration of two of such methods (Benefit Measurement and

    Constrained Optimization methods):


    As the value of one project would need to be compared against the other projects, you

    could use the benefit measurement methods. This could include various techniques, of

    which the following are the most common:

    You and your team could come up with certain criteria that you want your ideal project objectives to meet. You could then give each project scores based on how

    they rate in each of these criteria and then choose the project with the highest

    score.

    When it comes to the Discounted Cash Flow method, the future value of a project is ascertained by considering the present value and the interest earned on the money.

    The higher the present value of the project, the better it would be for your

    organization.

    The rate of return received from the money is what is known as the IRR. Here again, you need to be looking for a high rate of return from the project.

    The mathematical approach is commonly used for larger projects. The constrained

    optimization methods require several calculations in order to decide on whether or not a

    project should be rejected.

    Cost-benefit analysis is used by several organizations to assist them to make their

    selections. Going by this method, you would have to consider all the positive aspects of

    the project which are the benefits and then deduct the negative aspects (or the costs) from

    the benefits. Based on the results you receive for different projects, you could choose

    which option would be the most viable and financially rewarding.

    These benefits and costs need to be carefully considered and quantified in order to arrive

    at a proper conclusion. Questions that you may want to consider asking in the selection

    process are:

    Would this decision help me to increase organizational value in the long run?
    How long will the equipment last?
    Would I be able to cut down on costs as I go along?

    In addition to these methods, you could also consider choosing based on opportunity cost -

    When choosing any project, you would need to keep in mind the profits that you would

    make if you decide to go ahead with the project.

    Profit optimization is therefore the ultimate goal. You need to consider the difference

    between the profits of the project you are primarily interested in and the next best

    alternative.
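    The discounted cash flow comparison described above can be sketched in a few lines of Python; the cash flows and the 10% discount rate below are invented purely for illustration:

    def net_present_value(rate, cash_flows):
        # cash_flows[0] is the initial outlay (negative); later entries are yearly inflows.
        return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))

    project_a = [-100000, 40000, 40000, 40000, 40000]
    project_b = [-100000, 10000, 30000, 50000, 70000]

    for name, flows in (("Project A", project_a), ("Project B", project_b)):
        print(name, round(net_present_value(0.10, flows), 2))
    # Under this method, the project with the higher present value is the more attractive one.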


    Implementation of the Chosen Method:

    The methods mentioned above can be carried out in various combinations. It is best that

    you try out different methods, as in this way you would be able to make the best decision

    for your organization considering a wide range of factors rather than concentrating on just

    a few. Careful consideration would therefore need to be given to each project.

    Conclusion:

    In conclusion, you would need to remember that these methods are time-consuming, but

    are absolutely essential for efficient business planning.

    It is always best to have a good plan from the inception, with a list of criteria to be

    considered and goals to be achieved. This will guide you through the entire selection

    process and will also ensure that you do make the right choice.

    A data dictionary is a collection of data about data. It maintains information about the definition, structure, and use of each data element that an organization uses.

    There are many attributes that may be stored about a data element. Typical attributes used

    in CASE tools (Computer Assisted Software Engineering) are:

    Name
    Aliases or synonyms
    Default label
    Description
    Source(s)
    Date of origin
    Users
    Programs in which used
    Change authorizations
    Access authorization
    Data type
    Length
    Units (cm., degrees C, etc.)
    Range of values
    Frequency of use
    Input/output/local
    Conditional values
    Parent structure


    Subsidiary structures
    Repetitive structures
    Physical location: record, file, database

    A data dictionary is invaluable for documentation purposes, for keeping control

    information on corporate data, for ensuring consistency of elements between

    organizational systems, and for use in developing databases.

    Data dictionary software packages are commercially available, often as part of a CASE

    package or DBMS. DD software allows for consistency checks and code generation. It is

    also used in DBMSs to generate reports.

    The terms data dictionary and data repository are used to indicate a more general software utility than a catalogue. A catalogue is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as DDL and DML compilers,

    the query optimiser, the transaction processor, report generators, and the constraint

    enforcer. On the other hand, a data dictionary is a data structure that stores metadata, i.e.,

    (structured) data about data. The software package for a stand-alone data dictionary or

    data repository may interact with the software modules of the DBMS, but it is mainly used

    by the designers, users and administrators of a computer system for information resource

    management. These systems are used to maintain information on system hardware and

    software configuration, documentation, application and users as well as other information

    relevant to system administration.

    If a data dictionary system is used only by the designers, users, and administrators and not

    by the DBMS software, it is called a passive data dictionary. Otherwise, it is called an active data dictionary. When a passive data dictionary is updated, it

    is done so manually and independently from any changes to a DBMS (database) structure.

    With an active data dictionary, the dictionary is updated first and changes occur in the

    DBMS automatically as a result.

    Database users and application developers can benefit from an authoritative data

    dictionary document that catalogs the organization, contents, and conventions of one or

    more databases. This typically includes the names and descriptions of various tables

    (records or Entities) and their contents (fields) plus additional details, like the type and

    length of each data element. Another important piece of information that a data dictionary

    can provide is the relationship between Tables. This is sometimes referred to in Entity-


    Relationship diagrams, or if using Set descriptors, identifying in which Sets database

    Tables participate.

    In an active data dictionary constraints may be placed upon the underlying data. For

    instance, a Range may be imposed on the value of numeric data in a data element (field),

    or a Record in a Table may be FORCED to participate in a set relationship with another

    Record-Type. Additionally, a distributed DBMS may have certain location specifics

    described within its active data dictionary (e.g. where Tables are physically located).

    The data dictionary consists of record types (tables) created in the database by system-generated command files, tailored for each supported back-end DBMS. Command files

    contain SQL Statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER

    TABLE (for referential integrity), etc., using the specific statement required by that type

    of database. There is no universal standard as to the level of detail in such a document.

    Middleware

    In the construction of database applications, it can be useful to introduce an additional

    layer of data dictionary software, i.e. middleware, which communicates with the

    underlying DBMS data dictionary. Such a "high-level" data dictionary may offer

    additional features and a degree of flexibility that goes beyond the limitations of the native

    "low-level" data dictionary, whose primary purpose is to support the basic functions of the

    DBMS, not the requirements of a typical application. For example, a high-level data

    dictionary can provide alternative entity-relationship models tailored to suit different

    applications that share a common database. Extensions to the data dictionary also can

    assist in query optimization against distributed databases. Additionally, DBA functions are

    often automated using restructuring tools that are tightly coupled to an active data

    dictionary.

    Software frameworks aimed at rapid application development sometimes include high-

    level data dictionary facilities, which can substantially reduce the amount of programming

    required to build menus, forms, reports, and other components of a database application,

    including the database itself. For example, PHPLens includes a PHP class library to

    automate the creation of tables, indexes, and foreign key constraints portably for multiple

    databases. Another PHP-based data dictionary, part of the RADICORE toolkit,

    automatically generates program objects, scripts, and SQL code for menus and forms with

    data validation and complex joins. For the ASP.NET environment, Base One's data


    dictionary provides cross-DBMS facilities for automated database creation, data

    validation, performance enhancement (caching and index utilization), application security,

    and extended data types. Visual DataFlex features the ability to use DataDictionaries as class files to form a middle layer between the user interface and the

    underlying database. The intent is to create standardized rules to maintain data integrity

    and enforce business rules throughout one or more related applications.

    Platform-specific examples

    Data description specifications (DDS) allow the developer to describe data attributes in

    file descriptions that are external to the application program that processes the data, in the

    context of an IBM System i.

    The table below is an example of a typical data dictionary entry. The IT staff uses this to

    develop and maintain the database.

    Field Name      Data Type   Other information
    CustomerID      Autonumber  Primary key field
    Title           Text        Lookup: Mr, Mrs, Miss, Ms; field size 4
    Surname         Text        Field size 15; indexed
    FirstName       Text        Field size 15
    DateOfBirth     Date/Time   Format: Medium Date; range check: >= 01/01/1930
    HomeTelephone   Text        Field size: 12; presence check
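    A passive data dictionary entry like the one above can also be expressed directly as metadata and used to validate records; the Python sketch below copies a few rules loosely from the table and invents the sample record:

    import datetime

    DATA_DICTIONARY = {
        "Title":         {"type": str, "max_length": 4,
                          "lookup": {"Mr", "Mrs", "Miss", "Ms"}},
        "Surname":       {"type": str, "max_length": 15, "required": True},
        "DateOfBirth":   {"type": datetime.date,
                          "min_value": datetime.date(1930, 1, 1)},
        "HomeTelephone": {"type": str, "max_length": 12, "required": True},
    }

    def validate(record):
        errors = []
        for field, rules in DATA_DICTIONARY.items():
            value = record.get(field)
            if value is None:
                if rules.get("required"):
                    errors.append(field + ": missing (presence check)")
                continue
            if not isinstance(value, rules["type"]):
                errors.append(field + ": wrong data type")
                continue
            if "max_length" in rules and len(value) > rules["max_length"]:
                errors.append(field + ": exceeds field size")
            if "lookup" in rules and value not in rules["lookup"]:
                errors.append(field + ": not in lookup list")
            if "min_value" in rules and value < rules["min_value"]:
                errors.append(field + ": fails range check")
        return errors

    print(validate({"Title": "Dr", "Surname": "Sharma",
                    "DateOfBirth": datetime.date(1925, 5, 1)}))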


6. Explain data flow diagrams and pseudocode, with the difference between a physical DFD and a logical DFD (any five points).

    To understand the differences between a physical and logical DFD, we need to know what

    DFD is. A DFD stands for data flow diagram and it helps in representing graphically the

    flow of data in an organization, particularly its information system. A DFD enables a user

    to know where information comes in, where it goes inside the organization and how it

    finally leaves the organization. A DFD does not give information about whether the processing of information takes place sequentially or in parallel. There are

    two types of DFDs known as physical and logical DFD. Though both serve the same

    purpose of representing data flow, there are some differences between the two that will be

    discussed in this article.

    Any DFD begins with an overview DFD that describes in a nutshell the system to be designed. A logical data flow diagram, as the name indicates, concentrates on the business

    and tells about the events that take place in a business and the data generated from each

    such event. A physical DFD, on the other hand is more concerned with how the flow of

    information is to be represented. It is a usual practice to use DFDs for representation of

    logical data flow and processing of data. However, it is prudent to evolve a logical DFD

    after first developing a physical DFD that reflects all the persons in the organization

    performing various operations and how data flows between all these persons.

    What is the difference between Physical DFD and Logical DFD?

    While there is no requirement in a logical DFD to depict how the system is constructed, a physical DFD must show how the system has been constructed. There

    are certain features of logical DFD that make it popular among organizations. A logical

    DFD makes it easier to communicate for the employees of an organization, leads to more

    stable systems, allows for better understanding of the system by analysts, is flexible and

    easy to maintain, and allows the user to remove redundancies easily. On the other hand, a

    physical DFD is clear on division between manual and automated processes, gives detailed

    description of processes, identifies temporary data stores, and adds more controls to make

    the system more efficient and simple.

    Data Flow Diagrams (DFDs) are used to show the flow of data through a system in terms of the inputs, processes, and outputs.


    External Entities

    Data either comes from or goes to External Entities. They are either the source or

    destination (sometimes called a source or sink) of data, which is considered to be external

    to the system. It could be people or groups that provide or input data to the system or who receive data from the system. Defined by an oval (see below) and identified by a noun.

    External Entities are not part of the system but are needed to provide sources of data used

    by the system. Fig 1 below shows an example of an External Entity

    Fig 1 External Entity

    Processes and Data Flows

    Data passed to, or from, an External Entity must be processed in some way. The passing

    of data (flow of data) is shown on the DFD as an arrow. The direction of the arrow

    defines the direction of the flow of data. All data flows to and from External Entities to

    Processes and vice versa need to be named. Fig 2 below shows an example of a data flow:

    Fig 2 Data Flow

    A Process processes data that emanates from external entities or data stores. The process could be manual, mechanised, or automated/computed. A data process will use or alter the

    data in some way. Identified from a scenario by a verb or action. Each process is given a

    unique number and is also given a name. An example of a Process is shown in Fig 3

    below:

    Fig 3 - Process

    (Figure content: Fig 1 shows the external entity Customer, Fig 2 the data flow Customer details, and Fig 3 the process 1 Add New Customer.)


    Data Stores

    A Data Store is a point where data is

    held and receives or provides data

    through data flows. Examples of data

    stores are transaction records, data files, reports, and documents. Could be a filing cabinet

    or magnetic media. Data stores are named in the singular and numbered. A manual store

    such as a filing cabinet is numbered with an M prefix. A D is used as a prefix for an

    electronic store such as a relational table. An example of an electronic data store is

    shown in Fig 4 below

    Fig 4 Data Store

    Rules

    There are certain rules that must be applied when drawing DFDs. These are explained

    below:

    An external entity cannot be connected to another external entity by a data flow.
    An external entity cannot be connected directly to a data store.
    An external entity must pass data to, or receive data from, a process using a data flow.
    A data store cannot be directly connected to another data store.
    A data store cannot be directly connected to an external entity.
    A data store can pass data to, or receive data from, a process.
    A process can pass data to and receive data from another process.
    Data must flow from an external entity to a process and then be passed on to another process or a data store.

    A matrix for the above rules is shown in Fig 5 below.

    Fig 5 DFD Rules

              Entity   Process   Store
    Entity    No       Yes       No
    Process   Yes      Yes       Yes
    Store     No       Yes       No



    There are different levels of DFDs depending on the level of detail shown

    Level 0 or context diagram

    The context diagram shows the top-level process, the whole system, as a single process

    rectangle. It shows all external entities and all data flows to and from the system.

    Analysts draw the context diagram first to show the high-level processing in a system. An

    example of a Context Diagram is shown in Fig 6 below:

    Fig 6 Context Diagram for a Car Sales System

    Level 1 DFD

    This level of DFD shows all external entities that are on the context diagram, all the high-

    level processes and all data stores used in the system. Each high-level process may

    contain sub-processes. These are shown on lower level DFDs.

    (Fig 6 content: the Bilbos Car Sales system shown as a single process, with external entities Customer and Management and data flows including customer details, new car details, monthly report details, invoice details, updated customer details, Customer Order details and staff details.)


    A Level 1 DFD for the Car Sales scenario is shown in Fig 7 below:

    Fig 7 Level 1 DFD for a Car Sales System

    (Fig 7 content: external entities Customer and Management; processes 1 Add New Customer, 2 Create Monthly Sales Report, 3 Add New Sale, 4 Add New Car Details, 5 Update Customer, 6 Create Customer Invoice, 7 Add Staff Details; data stores D1 Customer, D2 Car, D3 Sales, D4 Staff; data flows including customer details, car details, sales details, staff details, new car details, updated customer details, Customer Order details, invoice details and monthly report details.)


    Level 2 DFDs

    Each Level 1 DFD process may contain further internal processes. These are shown on

    the Level 2 DFD. The numbering system used in the Level 1 DFD is continued, and each process in the Level 2 DFD is prefixed by the Level 1 DFD number followed by a unique number for each process, i.e. for process 1, sub-processes 1.1, 1.2, 1.3, etc. See Fig 8 below.

    Fig 8 Level 2 DFD for Level 1 Process Add New Sale

    Each of the Level 2 DFDs could also have sub-processes and could be decomposed

    further into lower level DFDs i.e. 1.1.1, 1.1.2, 1.1.3 etc

    More than 3 levels for a DFD would become unmanageable.

    Lowest Level DFDs and Process Specification

    Once the DFD has been decomposed into its lowest level, each of the lower level DFDs

    can be described using pseudo-code (structured English), flow chart or similar process

    specification method that can be used by a programmer to code each process or function.

    For example, the Level 2 DFD for the Add New Sale process could be described as being

    a process that contains 3 sub-processes, Validate Order, Add Staff to Order and Generate

    New Sale. The structured English could be written thus:

    Open Customer File
    If existing customer
        Check Customer Details
    Else
        Add customer details



    End If
    Open Car File
    If car available then
        Open Sale File
        Add customer to sale
        Set car to unavailable
        Add car to sale
        Add staff details
        Calculate price
        Generate Invoice
        Close Sale File
        Close Customer File
        Close Car File
        Inform User of successful sale
        Exit process
    Else
        Inform User of problem
        Exit process
        Close Customer File
        Close Car File
    End If

    The above example is not carved in stone as the analyst may decide to write separate

    functions to validate customer and car details and that the Generate New Sale process

    could include other sub-processes.

    All that matters is that the underlying processing logic solves the problem.

    For example, if you look at Figure 8 there is a process named Validate Order, which has a dual purpose of checking both the customer details (is the customer a current customer? if not, add them to the customer file) and the car details (is the car available? if not, stop the sale process). A separate process called Validate Order could be created, but I have written the structured English to show a logical sequence: only if the car is available do we begin the transaction of creating the sale.

    I have also assumed that the staff dealing with the sale will know their own details so there

    would not be a need for the process named Add Staff to Order.

    Like all analysis and design processes, the process of producing DFDs and writing

    structured English is an iterative process
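    For comparison only, the same Add New Sale logic can be expressed in executable form; the Python sketch below uses in-memory dictionaries in place of the customer, car and sales files, and every record layout is invented:

    customers = {"C1": {"name": "A. Jones"}}
    cars = {"V7": {"model": "Hatchback", "price": 9500, "available": True}}
    sales = []

    def add_new_sale(customer_id, customer_details, car_id, staff_details):
        # Validate Order: check the customer, adding them if they are new.
        if customer_id not in customers:
            customers[customer_id] = customer_details
        car = cars.get(car_id)
        if car is None or not car["available"]:
            return "problem: car not available"   # inform user of the problem
        # Generate New Sale: only begun once the car is known to be available.
        car["available"] = False
        sale = {"customer": customer_id, "car": car_id,
                "staff": staff_details, "price": car["price"]}
        sales.append(sale)
        return "sale recorded, invoice total " + str(sale["price"])

    print(add_new_sale("C2", {"name": "B. Singh"}, "V7", {"staff_id": "S1"}))
    print(add_new_sale("C1", {"name": "A. Jones"}, "V7", {"staff_id": "S1"}))  # car already sold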


7. Explain coding techniques and types of codes.

    It is required that information must be encoded into signals before it can be transported

    across communication media. In more precise words we may say that the waveform

    pattern of voltage or current used to represent the 1s and 0s of a digital signal on a

    transmission link is called digital to digital line encoding. There are different encoding

    schemes available:

    Digital-to-Digital Encoding

    It is the representation of digital information by a digital signal.

    There are basically the following types of digital-to-digital encoding available: Unipolar, Polar, and Bipolar.

    Unipolar

    Unipolar encoding uses only one level: value 1 is a positive value and 0 remains idle. Since unipolar line encoding has one of its states at 0 volts, it is also called Return to Zero (RTZ), as shown in the figure. A common example of unipolar line encoding is the TTL logic levels used in computers and digital logic.

    Unipolar encoding has a DC (Direct Current) component and therefore cannot travel through media such as microwaves or transformers. It has a low noise margin and needs extra hardware for synchronization purposes. It is well suited where the signal path is short. For long distances, it produces stray capacitance in the transmission medium and therefore it never returns to zero, as shown in the figure.


    Polar

    Polar encoding uses two levels of voltage, say positive and negative. For example, the RS-232D interface uses polar line encoding. The signal does not return to zero; it is either a positive voltage or a negative voltage. Polar encoding may be classified as non-return to zero (NRZ), return to zero (RZ) and biphase. NRZ may be further divided into NRZ-L and NRZ-I. Biphase also has two different categories: Manchester and Differential Manchester encoding. Polar line encoding is the simplest pattern that eliminates most of the residual DC problem. The figure shows polar line encoding. It has the same problem of synchronization as unipolar encoding. The added benefit of polar encoding is that it reduces the power required to transmit the signal by one-half.

    Non-Return to Zero (NRZ)

    In NRZ-L, the level of the signal is 1 if the amplitude is positive and 0 in case of negative amplitude.

    In NRZ-I, whenever a positive amplitude or bit 1 appears in the signal, the signal gets inverted.

    The figure explains the concepts of NRZ-L and NRZ-I more precisely.


    Return to Zero (RZ)

    RZ uses three values to represent the signal: positive, negative, and zero. Bit 1 is represented when the signal changes from positive to zero. Bit 0 is represented when the signal changes from negative to zero. The figure explains the RZ concept.

    Biphase

    Biphase is implemented in two different ways as Manchester and Differential Manchester

    encoding.

    In Manchester encoding, a transition happens at the middle of each bit period. A low-to-high transition represents a 1 and a high-to-low transition represents a 0. In Differential Manchester encoding, a transition at the beginning of a bit time represents a zero.

    These encodings can detect errors during transmission because of the transition during every bit period. Therefore, the absence of a transition indicates an error condition.


    They have no DC component and there is always a transition available for synchronizing the receive and transmit clocks.

    Bipolar

    Bipolar encoding uses three voltage levels: positive, negative, and zero. Bit 0 occurs at the zero level of amplitude. Bit 1 occurs alternately at the positive and negative voltage levels, and the scheme is therefore also called Alternate Mark Inversion (AMI). There is no DC component because of the alternate polarity of the pulses for 1s. The figure describes bipolar encoding.
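    To illustrate the differences between these schemes, the rough Python sketch below maps a bit string to idealised signal levels (+1, -1, 0) under NRZ-L, Manchester and bipolar AMI; the level conventions chosen are one common option among several:

    def nrz_l(bits):
        # NRZ-L: one level for the whole bit period; here 1 -> +1 and 0 -> -1.
        return [+1 if b == "1" else -1 for b in bits]

    def manchester(bits):
        # Manchester: a transition in the middle of every bit period;
        # a 1 is low-to-high and a 0 is high-to-low (as described above).
        levels = []
        for b in bits:
            levels += [-1, +1] if b == "1" else [+1, -1]
        return levels

    def bipolar_ami(bits):
        # AMI: 0 -> zero level; successive 1s alternate polarity, so there is no DC component.
        levels, last_mark = [], -1
        for b in bits:
            if b == "0":
                levels.append(0)
            else:
                last_mark = -last_mark
                levels.append(last_mark)
        return levels

    data = "101100"
    print(nrz_l(data))
    print(manchester(data))
    print(bipolar_ami(data))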

    Analog to Digital

    Analog to digital encoding is the representation of analog information by a digital signal.

    These include PAM (Pulse Amplitude Modulation), and PCM (Pulse Code Modulation).

    Digital to Analog

    These include ASK (Amplitude Shift Keying), FSK (Frequency Shift Keying), PSK (Phase Shift Keying), QPSK (Quadrature Phase Shift Keying), and QAM (Quadrature Amplitude Modulation).

    Analog to Analog

    These are the Amplitude Modulation, Frequency Modulation and Phase Modulation techniques.

    Codecs (Coders and Decoders)

    Codec stands for coder/decoder in data communication. The conversion of

    analog to digital is necessary in situations where it is advantageous to send analog

    information across a digital circuit. Certainly, this is often the case in carrier networks,

    where huge volumes of analog voice are digitized and sent across high capacity, digital

    circuits. The device that accomplishes the analog to digital conversion is known as a


    codec. Codecs code an analog input into a digital format on the transmitting side of the

    connection, reversing the process, or decoding the information on the receiving side, in

    order to reconstitute the analog signal. Codecs are widely used to convert analog voice

    and video to digital format, and to reverse the process on the receiving end.


8. Explain algorithms to detect errors (the modulus-eleven code and the modulus-N code) with the help of algorithms and examples.

    In information theory and coding theory with applications in computer science and

    telecommunication, error detection and correction or error control are techniques that

    enable reliable delivery of digital data over unreliable communication channels. Many

    communication channels are subject to channel noise, and thus errors may be introduced

    during transmission from the source to a receiver. Error detection techniques allow

    detecting such errors, while error correction enables reconstruction of the original data.

    Error correction may generally be realized in two different ways:

    Automatic repeat request (ARQ) (sometimes also referred to as backward error correction): This is an error control technique whereby an error detection scheme is combined with requests for retransmission of erroneous data. Every block of data received is checked using the error detection code, and if the check fails, retransmission of the data is requested; this may be done repeatedly, until the data can be verified.

    Forward error correction (FEC): The sender encodes the data using an error-correcting code (ECC) prior to transmission. The additional information (redundancy) added by the code is used by the receiver to recover the original data. In general, the reconstructed data is what is deemed the "most likely" original data.

    ARQ and FEC may be combined, such that minor errors are corrected without

    retransmission, and major errors are corrected via a request for retransmission: this is

    called hybrid automatic repeat-request (HARQ).

    Error detection is most commonly realized using a suitable hash function (or checksum

    algorithm). A hash function adds a fixed-length tag to a message, which enables receivers

    to verify the delivered message by recomputing the tag and comparing it with the one

    provided.

    There exists a vast variety of different hash function designs. However, some are of

    particularly widespread use because of either their simplicity or their suitability for

    detecting certain kinds of errors (e.g., the cyclic redundancy check's performance in

    detecting burst errors).

    Random-error-correcting codes based on minimum distance coding can provide a suitable

    alternative to hash functions when a strict guarantee on the minimum number of errors to

    be detected is desired. Repetition codes, described below, are special cases of error-


    correcting codes: although rather inefficient, they find applications for both error

    correction and detection due to their simplicity.

    Repetition codes

    A repetition code is a coding scheme that repeats the bits across a channel to achieve

    error-free communication. Given a stream of data to be transmitted, the data is divided

    into blocks of bits. Each block is transmitted some predetermined number of times. For

    example, to send the bit pattern "1011", the four-bit block can be repeated three times, thus

    producing "1011 1011 1011". However, if this twelve-bit pattern was received as "1010

    1011 1011" where the first block is unlike the other two it can be determined that an

    error has occurred.

    Repetition codes are very inefficient, and can be susceptible to problems if the error occurs

    in exactly the same place for each group (e.g., "1010 1010 1010" in the previous example would be detected as correct). The advantage of repetition codes is that they are extremely

    simple, and are in fact used in some transmissions of numbers stations.
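
    The example above can be reproduced with a minimal sketch (assumed, not from the text) of a repetition-3 scheme: each block is transmitted three times and the receiver takes a per-bit majority vote, which corrects the single corrupted copy.

    from collections import Counter

    def repetition_encode(block, m=3):
        # transmit the whole block m times, e.g. "1011" -> ["1011", "1011", "1011"]
        return [block] * m

    def repetition_decode(copies):
        # per-bit majority vote across the received copies
        decoded = ""
        for position in zip(*copies):
            decoded += Counter(position).most_common(1)[0][0]
        return decoded

    received = ["1010", "1011", "1011"]        # one bit corrupted in the first copy
    print(received[0] != received[1])          # True  - the disagreement flags an error
    print(repetition_decode(received))         # "1011" - the majority vote removes it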

    Parity bits

    A parity bit is a bit that is added to a group of source bits to ensure that the number of set

    bits (i.e., bits with value 1) in the outcome is even or odd. It is a very simple scheme that

    can be used to detect single or any other odd number (i.e., three, five, etc.) of errors in the

    output. An even number of flipped bits will make the parity bit appear correct even though

    the data is erroneous.

    Extensions and variations on the parity bit mechanism are horizontal redundancy checks,

    vertical redundancy checks, and "double," "dual," or "diagonal" parity (used in RAID-DP).
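
    A short sketch of even parity (illustrative only): one bit is appended so the count of 1s is even; a single flipped bit is caught, while a second flip restores even parity and slips through, exactly as described above.

    def add_even_parity(bits):
        # append one bit so that the total number of 1s is even
        return bits + [sum(bits) % 2]

    def check_even_parity(codeword):
        # the codeword is accepted only if the overall count of 1s is even
        return sum(codeword) % 2 == 0

    sent = add_even_parity([1, 0, 1, 1, 0, 1, 0])   # -> [1, 0, 1, 1, 0, 1, 0, 0]
    print(check_even_parity(sent))                  # True

    sent[2] ^= 1                  # one bit flipped in transit: detected
    print(check_even_parity(sent))                  # False

    sent[3] ^= 1                  # a second flip restores even parity: missed
    print(check_even_parity(sent))                  # True, although the data is wrong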

    Checksums

    A checksum of a message is a modular arithmetic sum of message code words of a fixed

    word length (e.g., byte values). The sum may be negated by means of a ones'-complement

    operation prior to transmission to detect errors resulting in all-zero messages.

    Checksum schemes include parity bits, check digits, and longitudinal redundancy checks.

    Some checksum schemes, such as the Damm algorithm, the Luhn algorithm, and the

    Verhoeff algorithm, are specifically designed to detect errors commonly introduced by

    humans in writing down or remembering identification numbers.
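
    One simple form of such a checksum can be sketched as follows (an illustration only, assuming 8-bit words and an end-around-carry sum): the sender transmits the ones'-complement of the modular sum, and the receiver's sum over the data plus the checksum should come out as all ones.

    def ones_complement_sum(words, word_bits=8):
        # modular sum of the code words, folding any carry back in (end-around carry)
        mask = (1 << word_bits) - 1
        total = 0
        for w in words:
            total += w
            total = (total & mask) + (total >> word_bits)
        return total

    def make_checksum(data, word_bits=8):
        # negate (ones' complement) the sum before transmission
        return (~ones_complement_sum(data, word_bits)) & ((1 << word_bits) - 1)

    def verify(data, checksum, word_bits=8):
        # an error-free transfer sums to all ones
        return ones_complement_sum(list(data) + [checksum], word_bits) == (1 << word_bits) - 1

    message = [0x12, 0xA4, 0x7F, 0x03]
    c = make_checksum(message)
    print(verify(message, c))                    # True
    print(verify([0x12, 0xA4, 0x7E, 0x03], c))   # False - the change is detected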

    Cyclic redundancy checks (CRCs)

    A cyclic redundancy check (CRC) is a single-burst-error-detecting cyclic code and non-

    secure hash function designed to detect accidental changes to digital data in computer


    networks. It is not suitable for detecting maliciously introduced errors. It is characterized

    by specification of a so-called generator polynomial, which is used as the divisor in a

    polynomial long division over a finite field, taking the input data as the dividend, and

    where the remainder becomes the result.

    Cyclic codes have favorable properties in that they are well suited for detecting burst

    errors. CRCs are particularly easy to implement in hardware, and are therefore commonly

    used in digital networks and storage devices such as hard disk drives.

    Even parity is a special case of a cyclic redundancy check, where the single-bit CRC is

    generated by the divisor x + 1.
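
    A toy sketch of the polynomial long division described above, using the small generator x^3 + x + 1 (chosen here only for illustration): the sender appends the remainder as check bits, and the receiver accepts the codeword only if re-dividing it leaves an all-zero remainder.

    def crc_divide(bits, generator):
        # polynomial long division over GF(2); subtraction mod 2 is XOR
        bits = list(bits)
        for i in range(len(bits) - len(generator) + 1):
            if bits[i]:
                for j, g in enumerate(generator):
                    bits[i + j] ^= g
        return bits[-(len(generator) - 1):]      # the remainder (the CRC bits)

    def crc_encode(data, generator):
        # append zero check bits, divide, and replace them with the remainder
        remainder = crc_divide(data + [0] * (len(generator) - 1), generator)
        return data + remainder

    data      = [1, 0, 1, 1, 0, 1]
    generator = [1, 0, 1, 1]                     # x^3 + x + 1
    codeword  = crc_encode(data, generator)

    print(crc_divide(codeword, generator))       # [0, 0, 0] -> accepted
    corrupted = codeword[:]
    corrupted[2] ^= 1
    print(crc_divide(corrupted, generator))      # non-zero remainder -> error detected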

    Cryptographic hash functions

    The output of a cryptographic hash function, also known as a message digest, can provide

    strong assurances about data integrity, whether changes to the data are accidental (e.g., due to transmission errors) or maliciously introduced. Any modification to the data will

    likely be detected through a mismatching hash value. Furthermore, given some hash value,

    it is infeasible to find some input data (other than the one given) that will yield the same

    hash value. If an attacker can change not only the message but also the hash value, then a

    keyed hash or message authentication code (MAC) can be used for additional security.

    Without knowing the key, it is infeasible for the attacker to calculate the correct keyed

    hash value for a modified message.

    Error-correcting codes

    Any error-correcting code can be used for error detection. A code with minimum

    Hamming distance, d, can detect up to d - 1 errors in a code word. Using minimum-

    distance-based error-correcting codes for error detection can be suitable if a strict limit on

    the minimum number of errors to be detected is desired.

    Codes with minimum Hamming distance d = 2 are degenerate cases of error-correcting

    codes, and can be used to detect single errors. The parity bit is an example of a single-

    error-detecting code.
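
    As a small illustration (the toy code book below is assumed, not taken from the text), the minimum Hamming distance of a code can be computed directly, and d - 1 then gives the number of errors per code word that are guaranteed to be detectable.

    def hamming_distance(a, b):
        # number of bit positions in which two equal-length code words differ
        return sum(x != y for x, y in zip(a, b))

    codebook = ["00000", "01011", "10101", "11110"]   # a toy (5,2) code
    d_min = min(hamming_distance(a, b)
                for i, a in enumerate(codebook)
                for b in codebook[i + 1:])
    print(d_min)        # 3
    print(d_min - 1)    # 2 -> up to two bit errors per code word are always detectable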

    In digital data transmission, errors occur due to noise. The probability of error, or bit error rate, depends on the signal-to-noise ratio, the modulation type, and the method of

    demodulation.


    The bit error rate, p, may be expressed as

        p = (number of errors in N bits) / N,   for N large.

    For example, if p = 0.1 we would expect, on average, 1 error in every 10 bits. A value of p = 0.1 is actually stating that every bit has a 1/10 probability of being in error.

    Depending on the type of system and many other factors, error rates typically range from 10^-1 to 10^-5 or better.

    Information transfer via a digital system is usually packaged into a structure (a block of bits) called a message block or frame. A typical message block contains the following:

    - Synchronization pattern to mark the start of the message block
    - Destination and sometimes source addresses
    - System control / commands
    - Information
    - Error control coding check bits

    The total number of bits in the block may vary widely (from, say, 32 bits to several hundred bits) depending on the requirement.

    Clearly, if the bits are subjected to an error rate p, there is some probability that a message

    block will be received with 1 or more bits in error. In order to counteract the effects of

    errors, error control coding techniques are used to either:

    a) detect errors (error detection), or
    b) correct errors (error detection and correction).

    Broadly, there are two types of error control codes:

    a) Block Codes (parity codes, array codes, repetition codes, cyclic codes, etc.)
    b) Convolutional Codes


    BLOCK CODES

    A block code is a coding technique which generates C check bits for M message bits to

    give a stand-alone block of M + C = N bits.

    The sync bits are usually not included in the error control coding because message

    synchronization must be achieved before the message and check bits can be processed.

    The code rate is given by

        Rate = M / N = M / (M + C)

    where M = number of message bits, C = number of check bits, and N = M + C = total number of bits.

    The code rate is a measure of the proportion of freely user-assigned bits (M) to the total bits in the block (N).

    For example,

    i) A single parity bit (C = 1) applied to a block of 7 message bits gives a code rate

        R = 7 / (7 + 1) = 7/8

    ii) A (7,4) cyclic code has N = 7, M = 4, so the code rate is

        R = 4/7


    iii) A repetition-m code, in which each bit or message is transmitted m times and the receiver carries out a majority vote on each bit, has a code rate

        Rate = M / (mM) = 1/m

    DETECTION AND CORRECTION

    Consider messages transferred from a Source to a Destination, and assume that the

    Destination is able to check the received messages and detect errors.

    If no errors are detected, the Destination will accept the messages.

    If errors are detected, there are two forms of error corrections.

    a) Automatic Retransmission Request (ARQ)

    In an ARQ system, the destination sends an acknowledgment (ACK) message back to the source if no errors are detected, and a negative acknowledgment (NAK) message back to the source if errors are detected.

    If the source receives an ACK for a message it sends the next message. If the source receives a NAK it repeats the same message. This process repeats until all the messages are accepted by the destination.


    b) Forward Error Correction (FEC)

    The error control code may be powerful enough to allow the destination to attempt to

    correct the errors by further processing. This is called Forward Error Correction; no ACKs or NAKs are required.

    Many systems are hybrid in that they use both ARQ (ACK/NAK) and FEC strategies for

    error correction.
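
    A toy stop-and-wait ARQ loop is sketched below (purely illustrative: the error check is abstracted into a 'corrupted' flag and the channel is simulated with a random error probability). The source keeps retransmitting a message until the destination's check passes and an ACK comes back.

    import random

    def noisy_channel(frame, p_error=0.3):
        # stand-in for a real channel: the frame arrives corrupted with probability p_error
        return {"data": frame, "corrupted": random.random() < p_error}

    def arq_send(frames, max_tries=10):
        delivered = []
        for frame in frames:
            for _ in range(max_tries):
                received = noisy_channel(frame)
                if not received["corrupted"]:            # destination's error check passes
                    delivered.append(received["data"])   # destination replies with ACK
                    break
                # otherwise the destination replies with NAK and the source retransmits
        return delivered

    print(arq_send(["msg1", "msg2", "msg3"]))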

    Successful, False & Lost Message Transfer

    The process of checking the received messages for errors gives two possible outcomes:

    a) Errors not detected - the message is accepted.
    b) Errors detected - the message is rejected.

    An error not being detected does not mean that errors are not present; error control codes cannot detect every possible error or combination of errors. However, if errors are not detected the destination has no alternative but to accept the message, true or false. That is, if errors are not detected we may conclude either


    a) that there were no errors, i.e. the messages accepted are true (a successful message transfer), or
    b) that there were undetected errors, i.e. the messages accepted were false (a false message transfer).

    If errors are detected, the destination does not accept the message and may either request a

    re-transmission (ARQ-system) or process the block further in an attempt to correct the

    error (FEC).

    In processing the block for error correction, there are again two possible outcomes:

    a) the processor may get it right, i.e. correct the errors and give a successful message transfer, or
    b) the processor may get it wrong, i.e. not correct the errors, in which case there is a false message transfer.

    Some codes have a range of ability to detect and correct errors. For example a code may

    be able to detect and correct 1 error (single bit error) and detect 2,3 and 4 bits in error, but

    not correct them. Thus even with FEC, some messages may still be rejected and we think

    of these as lost messages. These ideas are illustrated below:


    MESSAGE TRANSFERS

    Consider message transfer between two computers, e.g. where it is required to transfer the

    contents of Computer A to Computer B.

    (Figure: messages transferred from Computer A to Computer B)

    As discussed, of the messages transferred to Computer B, some may be rejected (lost)

    and some will be accepted, and will be either true (successful transfer) or false.

    Obviously the requirement is for a high probability of successful transfer (ideally = 1), low

    probability of false transfer (ideally = 0) and a low probability of lost messages. In

    particular the false rate should be kept low, even at the expense of an increased lost

    message rate.

    Note that in some messages there may be in-built redundancy, for example in the text message

        REPAUT FOR WEDLESDAY (REPORT FOR WEDNESDAY)

    which a reader can correct from context. However, if this is followed by a date such as 10 JUNE, an error in the digits could not be corrected from context.

    Other examples where there is little or no redundancy are car registration numbers, account numbers, etc., which are generally numeric or unstructured alphanumeric information.

    There is thus a need for a low false rate appropriate to the function of the system and it is

    important for the information in Computer B to be correct even if it takes a long time to

    transfer.

    Error control coding may be considered further in two main ways.

    In terms of System Performance, i.e. the probabilities of successful, false and lost message transfer. In this case we only need to know what the error detection / correction code can do in terms of its ability to detect and correct errors (which depends on the Hamming distance).


    In terms of the Error Control Code itself, i.e. the structure, operation, characteristics and implementation of various types of codes.

    SYSTEM PERFORMANCE

    In order to determine system performance in terms of successful, false and lost message

    transfers it is necessary to know:

    1) the probability of error, or b.e.r., p;
    2) the number of bits in the message block, N;
    3) the ability of the code to detect / correct errors, usually expressed as a minimum Hamming distance, dmin, for the code.

    Since we know the b.e.r., p, and the number of bits in the block, N, we can apply the equation below:

        P(R) = [ N! / ( R! (N - R)! ) ] p^R (1 - p)^(N - R)        (note: 0! = 1, 1! = 1)

    This gives the probability of R errors in an N-bit block subject to a bit error rate p.

    Hence, for an N-bit block we can determine the probability of no errors in the block (R = 0), i.e. an error-free block:

        P(0) = [ N! / ( 0! N! ) ] p^0 (1 - p)^N = (1 - p)^N

    the probability of 1 error in the block (R = 1):

        P(1) = [ N! / ( 1! (N - 1)! ) ] p^1 (1 - p)^(N - 1) = N p (1 - p)^(N - 1)

    the probability of 2 errors in the block (R = 2):

        P(2) = [ N! / ( 2! (N - 2)! ) ] p^2 (1 - p)^(N - 2)

    and similarly for R = 3, R = 4, etc., giving P(3), P(4), P(5), ..., P(N).
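
    This binomial probability is straightforward to compute; the sketch below uses an assumed block size N = 32 and bit error rate p = 0.01 purely for illustration.

    from math import comb

    def p_errors(N, R, p):
        # probability of exactly R bit errors in an N-bit block with bit error rate p
        return comb(N, R) * (p ** R) * ((1 - p) ** (N - R))

    N, p = 32, 0.01
    print(p_errors(N, 0, p))    # probability of an error-free block, (1 - p)^N
    print(p_errors(N, 1, p))    # probability of exactly one error, N p (1 - p)^(N - 1)
    print(p_errors(N, 2, p))    # probability of exactly two errors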

    MINIMUM HAMMING DISTANCE

    The minimum Hamming distance of an error control code is a parameter which indicates

    the worst case ability of the code to detect/correct errors. In general, codes will perform

    better than indicated by the minimum Hamming distance.


    Let dmin = minimum Hamming distance,
    l = number of bit errors detected,
    t = number of bit errors corrected.

    It may be shown that

        dmin = l + t + 1,   with t <= l

    For a given dmin, there is a range of (worst-case) options from just error detection to error detection / correction.

    For example, suppose a code has a dmin= 6.

    Since, dmin = l + t + 1

    We have as options

    1) 6 = 5 + 0 + 1   {detect up to 5 errors, no correction}
    2) 6 = 4 + 1 + 1   {detect up to 4 errors, correct 1 error}
    3) 6 = 3 + 2 + 1   {detect up to 3 errors, correct 2 errors}

    After this, t>l, i.e. cannot go further, since we cannot correct more errors than can be

    detected.

    In option 1), up to 5 errors can be detected i.e. 1,2,3,4 or 5 errors detected, but there is no

    error correction.

    In option 2), up to 4 errors can be detected i.e. 1,2,3,4 errors detected, and 1 error can be

    corrected.

    In option 3), up to 3 errors can be detected i.e. 1,2,3 errors detected, and 1 and 2 errors can

    be corrected.

    Hence a given code can give several decoding (error detection / correction) options at the receiver. In an ARQ system with no FEC, we would implement option 1, i.e. detect as many errors as possible.

    If FEC were to be used, we might choose option 3, which allows 1 and 2 errors in a block to be detected and corrected; 3 errors can be detected but not corrected, and these messages could be rejected and recovered by ARQ.

    For option 3, for example, if 4 or more errors occurred, these would not be detected and the messages would be accepted but would be false messages.

    Fortunately, the higher the number of errors, the lower the probability that they will occur, for reasonable values of p.
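
    The worst-case options for a given dmin can be enumerated mechanically, as in this small sketch (illustrative only):

    def detection_correction_options(d_min):
        # all (detect l, correct t) splits allowed by d_min = l + t + 1 with t <= l
        options = []
        t = 0
        while True:
            l = d_min - 1 - t
            if t > l:
                break
            options.append((l, t))
            t += 1
        return options

    print(detection_correction_options(6))    # [(5, 0), (4, 1), (3, 2)]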


    From the above, we may conclude that:

    Message transfers are successful if no errors occur, or if up to t errors occur which are corrected, i.e.

        Probability of success = P(0) + P(1) + ... + P(t)

    Message transfers are lost if between t + 1 and l errors are detected but not corrected, i.e.

        Probability of lost = P(t+1) + P(t+2) + ... + P(l)

    Message transfers are false if l + 1 or more errors occur, i.e.

        Probability of false = P(l+1) + P(l+2) + ... + P(N)

    Example

    Using dmin = 6, option 2 (t = 1, l = 4):

        Probability of successful transfer = P(0) + P(1)
        Probability of lost messages = P(2) + P(3) + P(4)
        Probability of false messages = P(5) + P(6) + ... + P(N)
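
    Putting the pieces together, the worked example can be evaluated numerically once a block size and bit error rate are assumed (N = 32 and p = 0.01 below are arbitrary illustrative values; the three probabilities always sum to 1).

    from math import comb

    def p_errors(N, R, p):
        return comb(N, R) * (p ** R) * ((1 - p) ** (N - R))

    def transfer_probabilities(N, p, t, l):
        # success: 0..t errors (none, or corrected); lost: t+1..l errors (detected, rejected);
        # false: l+1..N errors (undetected, so the block is wrongly accepted)
        p_success = sum(p_errors(N, i, p) for i in range(0, t + 1))
        p_lost    = sum(p_errors(N, i, p) for i in range(t + 1, l + 1))
        p_false   = sum(p_errors(N, i, p) for i in range(l + 1, N + 1))
        return p_success, p_lost, p_false

    # dmin = 6 used as (t = 1, l = 4), as in the example above
    success, lost, false_rate = transfer_probabilities(32, 0.01, t=1, l=4)
    print(success, lost, false_rate)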


    9. Explain back-up-plans

    In information technology, a backup, or the process of backing up, refers to the copying

    and archiving of computer data so it may be used to restore the original after a data loss

    event. The verb form is to back up, in two words, whereas the noun is backup.

    Backups have two distinct purposes. The primary purpose is to recover data after its loss,

    be it by data deletion or corruption. Data loss can be a common experience of computer

    users. A 2008 survey found that 66% of respondents had lost files on their home PC. The

    secondary purpose of backups is to recover data from an earlier time, according to a user-

    defined data retention policy, typically configured within a backup application for how

    long copies of data are required. Though backups popularly represent a simple form of

    disaster recovery, and should be part of a disaster recovery plan, by themselves, backups

    should not alone be considered disaster recovery. One reason for this is that not all backup systems or backup applications are able to reconstitute a computer system or other

    complex configurations such as a computer cluster, active directory servers, or a database

    server, by restoring only data from a backup.

    Since a backup system contains at least one copy of all data worth saving, the data storage

    requirements can be significant. Organizing this storage space and managing the backup

    process can be a complicated undertaking. A data repository model can be used to provide

    structure to the storage. Nowadays, there are many different types of data storage devices

    that are useful for making backups. There are also many different ways in which these

    devices can be arranged to provide geographic redundancy, data security, and portability.

    Before data is sent to its storage location, it is selected, extracted, and manipulated. Many

    different techniques have been developed to optimize the backup procedure. These include

    optimizations for dealing with open files and live data sources as well as compression,

    encryption, and de-duplication, among others. Every backup scheme should include dry

    runs that validate the reliability of the data being backed up. It is important to recognize

    the limitations and human factors involved in any backup scheme.

    Because data is the heart of the enterprise, it's crucial for you to protect it. And to protect

    your organization's data, you need to implement a data backup and recovery plan. Backing

    up files can protect against accidental loss of user data, database corruption, hardware

    failures, and even natural disasters. It's your job as an administrator to make sure that

    backups are performed and that backup tapes are stored in a secure location.


    Creating a Backup and Recovery Plan

    Data backup is an insurance plan. Important files are accidentally deleted all the time.

    Mission-critical data can become corrupt. Natural disasters can leave your office in ruin.

    With a solid backup and recovery plan, you can recover from any of these. Without one,

    you're left with nothing to fall back on.

    Figuring Out a Backup Plan

    It takes time to create and implement a backup and recovery plan. You'll need to figure out

    what data needs to be backed up, how often the data should be backed up, and more. To

    help you create a plan, consider the following:

    How important is the data on your systems? The importance of data can go a long way in helping you determine if you need to back it up, as well as when and how it should be backed up. For critical data, such as a database, you'll want to have redundant backup sets that extend back for several backup periods. For less important data, such as daily user files, you won't need such an elaborate backup plan, but you'll need to back up the data regularly and ensure that the data can be recovered easily.

    What type of information does the data contain? Data that doesn't seem important to you may be very important to someone else. Thus, the type of information the data contains can help you determine if you need to back up the data, as well as when and how the data should be backed up.

    How often does the data change? The frequency of change can affect your decision on how often the data should be backed up. For example, data that changes daily should be backed up daily.

    How quickly do you need to recover the data? Time is an important factor in creating a backup plan. For critical systems, you may need to get back online swiftly. To do this, you may need to alter your backup plan.

    Do you have the equipment to perform backups? You must have backup hardware to perform backups. To perform timely backups, you may need several backup devices and several sets of backup media. Backup hardware includes tape drives, optical drives, and removable disk drives. Generally, tape drives are less expensive but slower than other types of drives.

    Who will be responsible for the backup and recovery plan? Ideally, someone should be a primary contact for the organization's backup and recovery plan. This


    person may also be responsible for performing the actual backup and recovery of

    data.

    What is the best time to schedule backups? Scheduling backups when system use is as low as possible will speed the backup process. However, you can't always schedule backups for off-peak hours. So you'll need to carefully plan when key system data is backed up.

    Do you need to store backups off-site? Storing copies of backup tapes off-site is essential to recovering your systems in the case of a natural disaster. In your off-site storage location, you should also include copies of the software you may need to install to reestablish operational systems.

    The Basic Types of Backup

    There are many techniques for backing up files. The techniques you use will depend on the type of data you're backing up, how convenient you want the recovery process to be, and more.

    If you view the properties of a file or directory in Windows Explorer, you'll note an

    attribute called Archive. This attribute often is used to determine whether a file or

    directory should be backed up. If the attribute is on, the file or directory may need to be

    backed up. The basic types of backups you can perform include

    Normal/full backups - All files that have been selected are backed up, regardless of the setting of the archive attribute. When a file is backed up, the archive attribute is cleared. If the file is later modified, this attribute is set, which indicates that the file needs to be backed up.

    Copy backups - All files that have been selected are backed up, regardless of the setting of the archive attribute. Unlike a normal backup, the archive attribute on files isn't modified. This allows you to perform other types of backups on the files at a later date.

    Differential backups - Designed to create backup copies of files that have changed since the last normal backup. The presence of the archive attribute indicates that the file has been modified and only files with this attribute are backed up. However, the archive attribute on files isn't modified. This allows you to perform other types of backups on the files at a later date.

    Incremental backups - Designed to create backups of files that have changed since the most recent normal or incremental backup. The presence of the archive


    attribute indicates that the file has been modified and only files with this attribute

    are backed up. When a file is backed up, the archive attribute is cleared. If the file

    is later modified, this attribute is set, which indicates that the file needs to be

    backed up.

    Daily backups - Designed to back up files using the modification date on the file itself. If a file has been modified on the same day as the backup, the file will be backed up. This technique doesn't change the archive attributes of files.
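
    A toy model (illustrative only; the file names are made up) of how the archive attribute drives the full, differential and incremental backup types described above:

    # each file maps to True when its archive attribute is set (changed since last capture)
    archive = {"report.doc": True, "db.mdb": True, "notes.txt": True}

    def full_backup(files):
        # back up everything and clear every archive attribute
        backed_up = list(files)
        for f in files:
            files[f] = False
        return backed_up

    def incremental_backup(files):
        # back up only files whose archive attribute is set, then clear it
        backed_up = [f for f, changed in files.items() if changed]
        for f in backed_up:
            files[f] = False
        return backed_up

    def differential_backup(files):
        # back up files whose archive attribute is set, but leave the attribute alone
        return [f for f, changed in files.items() if changed]

    print(full_backup(archive))           # all three files; attributes cleared
    archive["report.doc"] = True          # the file is edited after the full backup
    print(differential_backup(archive))   # ['report.doc'] - attribute stays set
    print(differential_backup(archive))   # ['report.doc'] again (grows from the last full)
    print(incremental_backup(archive))    # ['report.doc'] - attribute now cleared
    print(incremental_backup(archive))    # [] - nothing changed since the last incremental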

    In your backup plan you'll probably want to perform full backups on a weekly basis and

    supplement this with daily, differential, or incremental backups. You may also want to

    create an extended backup set for monthly and quarterly backups that includes additional

    files that aren't being backed up regularly.

    Tip: You'll often find that weeks or months can go by before anyone notices that a file or data source is missing. This doesn't mean the file isn't important. Although some types of

    data aren't used often, they're still needed. So don't forget that you may also want to create

    extra sets of backups for monthly or quarterly periods, or both, to ensure that you can

    recover historical data over time.

    Differential and Incremental Backups

    The difference between differential and incremental backups is extremely important. To

    understand the distinction between them, examine Table 1. As it shows, with differential

    backups you back up all the files that have changed since the last full backup (which

    means that the size of the differential backup grows over time). With incremental backups,

    you only back up files that have changed since the most recent full or incremental backup

    (which means the size of the incremental backup is usually much smaller than a full

    backup).

    Table 1. Incremental and Differential Backup Techniques

    Day of Week | Weekly Full Backup with Daily Differential Backup        | Weekly Full Backup with Daily Incremental Backup
    Sunday      | A full backup is performed.                              | A full backup is performed.
    Monday      | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Sunday.
    Tuesday     | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Monday.
    Wednesday   | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Tuesday.
    Thursday    | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Wednesday.
    Friday      | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Thursday.
    Saturday    | A differential backup contains all changes since Sunday. | An incremental backup contains changes since Friday.
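
    The practical consequence of Table 1 is what a restore needs: the last full backup plus a single differential, versus the last full backup plus every incremental made since it. A small sketch (the day names and set labels are illustrative):

    def restore_sets(strategy, failure_day, week):
        # which backup sets are needed to restore on failure_day, following Table 1
        # (full backup on Sunday, one daily backup on each following day)
        day_index = week.index(failure_day)
        if strategy == "differential":
            # last full backup plus only the most recent differential
            return ["Sunday full"] + ([failure_day + " differential"] if day_index else [])
        if strategy == "incremental":
            # last full backup plus every incremental since it, applied in order
            return ["Sunday full"] + [d + " incremental" for d in week[1:day_index + 1]]
        raise ValueError("unknown strategy")

    week = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
    print(restore_sets("differential", "Thursday", week))
    # ['Sunday full', 'Thursday differential']
    print(restore_sets("incremental", "Thursday", week))
    # ['Sunday full', 'Monday incremental', 'Tuesday incremental',
    #  'Wednesday incremental', 'Thursday incremental']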

    Once you determine what data you're going to back up and how often, you can select

    backup devices and media that support these choices. These are covered in the next

    section.

    Selecting Backup Devices and Media

    Many tools are available for backing up data. Some are fast and expensive. Others are

    slow but very reliable. The backup solution that's right for your organization depends on

    many factors, including

    Capacity - The amount of data that you need to back up on a routine basis. Can the backup hardware support the required load given your time and resource constraints?

    Reliability - The reliability of the backup hardware and media. Can you afford to sacrifice reliability to meet budget or time needs?

    Extensibility - The extensibility of the backup solution. Will this solution meet your needs as the organization grows?

    Speed - The speed with which data can be backed up and recovered. Can you afford to sacrifice speed to reduce costs?

    Cost - The cost of the backup solution. Does it fit into your budget?

    Common Backup Solutions

    Capacity, reliability, extensibility, speed, and cost are the issues driving your backup plan.

    If you understand how these issues affect your organization, you'll be on track to select an

    appropriate backup solution. Some of the most commonly used backup solutions include

    Tape drives - Tape drives are the most common backup devices. Tape drives use magnetic tape cartridges to store data. Magnetic tapes are relatively inexpensive

    bu