compiler khata

8/10/2019 Compiler Khata


Compiler Design

Problem No.    Name of the problem

01. Write a C program for developing a lexical analyzer (LA) that will eliminate white spaces from a source program in C and collect numbers.

02. Write a C program for developing a lexical analyzer (LA) that will eliminate white spaces from a source program in C, collect numbers as tokens, and also display the token value as an attribute.

03. Write a C program for developing a lexical analyzer (LA) that will recognize all basic data types of C.

04. Write a C program for developing a lexical analyzer (LA) that will recognize all keywords of C.

05. Write a C program for developing a lexical analyzer (LA) that will eliminate white spaces and comments from a C program.

06. Write a C program for developing a lexical analyzer (LA) that will recognize variables of a C source program.

07. Write a C program for developing a lexical analyzer (LA) that will generate tokens for a given statement of a C source program.

08. Design a compiler front-end based on the syntax-directed translation technique that will function as an infix translator for a language consisting of a sequence of expressions terminated by semicolons.



Problem No.01

Problem Name:

Write a C program for developing a lexical analyzer(LA) that will eliminate

white spaces from a source program in C and collect numbers.

Problem analysis:

Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the characters in the assignment statement

position := initial + rate * 60

would be grouped into the following tokens:

1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60

The blanks separating the characters of these tokens would normally be eliminated during lexical analysis.

The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis. This interaction, summarized schematically in Fig. (a), is commonly implemented by making the lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next token" command from the parser, the lexical analyzer reads input characters until it can identify the next token.

[Figure: source program, lexical analyzer, parser, symbol table]

Fig. (a): Interaction of lexical analyzer with parser.

Since the lexical analyzer is the part of the compiler that reads the source text, it may also perform certain secondary tasks at the user interface. One such task is stripping out from the source program comments and white space in the form of blank, tab, and newline characters. The lexical analyzer may keep track of the number of newline characters seen, so that a line number can be associated with an error message.
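A minimal sketch of these two secondary tasks, working on a string in memory rather than a file. The function name strip_white_space is an assumption made for this example, and the caller is assumed to supply a destination buffer at least as large as the source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies src to dst without blanks, tabs and
   newlines, and returns the number of newline characters seen so that
   a caller could attach a line number to an error message. */
int strip_white_space(const char *src, char *dst)
{
    int lines = 0;
    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        if (*src == '\n')
            lines++;                 /* count the line, drop the character */
        else if (*src != ' ' && *src != '\t')
            *dst++ = *src;           /* keep everything else */
    }
    *dst = '\0';
    return lines;
}
```

For the two-line input "a = 1;\n\tb = 2;\n" the function writes "a=1;b=2;" to the destination and reports 2 newlines.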

The purpose of the lexical analyzer is to allow white space and numbers to appear within

expressions.

[Figure: the lexical analyzer uses getchar() to read a character, pushes back c using ungetc(c,stdin), and returns the token to the caller]

Fig. (b): Implementing the interaction of source program

Figure (b) suggests how the lexical analyzer, written as the function lexan in C, can be implemented. The routines getchar and ungetc from the standard include-file <stdio.h> take care of input buffering; lexan reads and pushes back input characters by calling the routines getchar and ungetc respectively. With c declared to be a character, the pair of statements

c = getchar(); ungetc(c, stdin);

leaves the input stream undisturbed. The call of getchar assigns the next input character to c; the call of ungetc pushes back the value of c onto the standard input stdin.
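The read-then-push-back pair can be wrapped into a small look-ahead helper. The sketch below is an illustration under two assumptions: the name peek_char is invented here, and it takes an arbitrary FILE * (using getc instead of getchar) so that it is not tied to stdin. The variable is declared int rather than char so that EOF can be told apart from a real character.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the next character of the stream
   without consuming it, or EOF at end of input. */
int peek_char(FILE *in)
{
    int c;
    assert(in != NULL);
    c = getc(in);          /* read the next input character */
    if (c != EOF)
        ungetc(c, in);     /* push it back: the stream is undisturbed */
    return c;
}
```

Two consecutive calls return the same character, and a following getc still delivers it, which is the sense in which the stream is left undisturbed.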

If the implementation language does not allow data structures to be returned from functions, then tokens and their attributes have to be passed separately. The function lexan returns an integer encoding of a token. A token, such as num, can then be encoded by an integer larger than any integer encoding a character, say 256. We define:

#define NUM 256

The function lexan returns NUM when a sequence of digits is seen in the input. A global variable tokenval is set to the value of the sequence of digits. Thus, if a 7 is followed immediately by a 6 in the input, tokenval is assigned the integer value 76.
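One possible shape for lexan, following the description above. It is a sketch rather than the program used in this problem: it takes a FILE * parameter (an assumption, so it can read from any stream), and every character that is not white space or a digit is simply returned as its own token.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define NUM 256      /* larger than any integer encoding a character */

int tokenval = -1;   /* attribute of the most recent NUM token */

/* Returns NUM for a sequence of digits (value left in tokenval),
   otherwise the character itself, or EOF at end of input. */
int lexan(FILE *in)
{
    int c;
    assert(in != NULL);
    for (;;) {
        c = getc(in);
        if (c == ' ' || c == '\t' || c == '\n')
            continue;                       /* strip white space */
        if (isdigit(c)) {
            tokenval = 0;
            while (isdigit(c)) {            /* accumulate the number */
                tokenval = tokenval * 10 + (c - '0');
                c = getc(in);
            }
            if (c != EOF)
                ungetc(c, in);              /* push back the look-ahead */
            return NUM;
        }
        return c;
    }
}
```

On the input "76 + 5" successive calls return NUM (with tokenval 76), the character '+', NUM (with tokenval 5), and then EOF.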


Code

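#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t, f = 0;   /* t: current character, f: previous character */
    int n;
    FILE *f1, *f2;

    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL)
        return 1;

    while ((t = getc(f1)) != EOF)
    {
        if (t == ' ')
            ;                       /* eliminate blanks */
        else if (isdigit(t) && f != '_')
        {
            /* The source is truncated after "if(65"; a test for a
               preceding letter is assumed here, so that digits inside
               names such as f1 are copied and not collected. */
            if (isalpha(f))
                putc(t, f2);
            else
            {
                n = 0;
                while (isdigit(t))
                {
                    putc(t, f2);
                    n = n * 10 + (t - '0');
                    t = getc(f1);
                }
                printf("%d\n", n);  /* the collected number */
                if (t != EOF && t != ' ')
                    putc(t, f2);    /* keep the character after the number */
            }
        }
        else
            putc(t, f2);
        f = t;
    }
    fclose(f1);
    fclose(f2);
    return 0;
}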

INPUT:

void main(){

FILE *f1,*f2;long int a;

char c[100];f1=fopen ("testinput.cpp","r");f2=fopen("testoutput.cpp","w");

while(fscanf(f1,"%s",c)!=EOF) /* reading value from file */{int line=1;

if(c[0]=='\n'){fprintf(f2,"\n",);line++;}

else if(!isdigit(c[0]))



fprintf(f2,"%s",c);else if(isdigit(c[0])

{a=c[0]-'10';int i=1;

j=120;

while(isdigit(c[i])){a=a*10+c[i]-'0';i++;}

printf("Number %ld in line no. %d\n",a,line);

}}}

OUTPUT:

voidmain()

{

FILE*f1,*f2;longinta;charc[100];

f1=fopen("testinput.cpp","r");f2=fopen("testoutput.cpp","w");while(fscanf(f1,"%s",c)!=EOF)/*readingvaluefromfile*/{

intline=Num(1);if(c[0]=='\n'){fprintf(f2,"\n",);line++;}

elseif(!isdigit(c[0]))fprintf(f2,"%s",c);

elseif(isdigit(c[0]){a=c[0]-'Num(10)';

inti=Num(1);j=Num(120);

while(isdigit(c[i])){a=a*Num(10)+c[i]-'Num(0)';

i++;}

printf("Number%ldinlineno.%d\n",a,line);}}}

Result and Discussion:

This program has been written in C/C++ and successfully eliminates white space from a source program and collects numbers as Num.



Problem No.02

Problem Name:

Write a C program for developing a lexical analyzer(LA) that will eliminate white

spaces from a source program in C and collect numbers as token and then also

display the token and token value attribute.

Problem analysis:

Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the characters in the assignment statement

position := initial + rate * 60

would be grouped into the following tokens:

1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60

The blanks separating the characters of these tokens would normally be eliminated during lexical analysis.







[Figure: source program, lexical analyzer, parser, symbol table]




The lexical analyzer may keep track of the number of newline characters seen, so that a line number can be associated with an error message.

Tokens: The smallest individual units in a source program are known as tokens.

CODE

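#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int t, f = 0;   /* t: current character, f: previous character */
    int n;
    FILE *f1, *f2;

    /* The fopen lines are missing from the source; the paths below are
       assumed to be the same as in Problem No.01. */
    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL)
        return 1;

    printf(" Token Token value as attributes\n-----------------------------------------");

    /* The loop header is also missing; it is assumed to match Problem No.01. */
    while ((t = getc(f1)) != EOF)
    {
        if (t == ' ')
            ;                       /* eliminate blanks */
        else if (isdigit(t) && f != '_')
        {
            /* Truncated in the source after "if(65"; a test for a
               preceding letter is assumed. */
            if (isalpha(f))
                putc(t, f2);
            else
            {
                n = 0;
                while (isdigit(t))
                {
                    putc(t, f2);
                    n = n * 10 + (t - '0');
                    t = getc(f1);
                }
                printf("\n num %d", n);   /* token and its value */
                if (t != EOF && t != ' ')
                    putc(t, f2);
            }
        }
        else
            putc(t, f2);
        f = t;
    }
    fclose(f1);
    fclose(f2);
    return 0;
}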



INPUT:

void main(){

FILE *f1,*f2;long int a;char c[100];f1=fopen ("testinput.cpp","r");

f2=fopen("testoutput.cpp","w");while(fscanf(f1,"%s",c)!=EOF)

{int line=1;if(c[0]=='\n')

{fprintf(f2,"\n",);line++;}else if(!isdigit(c[0]))fprintf(f2,"%s",c);

else if(isdigit(c[0]){a=c[0]-'10';int i=1;

j=120;

}}}

OUTPUT:

voidmain(){

FILE*f1,*f2;

longinta;charc[100];f1=fopen("testinput.cpp","r");f2=fopen("testoutput.cpp","w");while(fscanf(f1,"%s",c)!=EOF)

{intline=1;if(c[0]=='\n'){fprintf(f2,"\n",);line++;}elseif(!isdigit(c[0]))

fprintf(f2,"%s",c);elseif(isdigit(c[0])

{a=c[0]-'10';

inti=1;j=120;}

}}



NUM 1
NUM 10
NUM 1
NUM 120
NUM 10
NUM 0


Result and Discussion:

This program has been written in C/C++ and successfully eliminates white space from a source program, collects numbers as tokens, and displays each token together with its value as an attribute.



Problem No.03

Problem Name:

Write a C program for developing a lexical analyzer(LA) that will recognize all

basic data types of C.

Problem analysis:







[Figure: source program, lexical analyzer, parser, symbol table]






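The lexical analyzer may keep track of the number of newline characters seen, so that a line number can be associated with an error message. The basic data types in a C program are int, float, char, double, and long int.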
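As a small illustration, the test for a basic data type can be isolated in one function. The name is_basic_type is an assumption of this sketch. Because words are compared one at a time, long int is covered by matching "long" and "int" separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if word names a basic data type of C,
   0 otherwise. */
int is_basic_type(const char *word)
{
    static const char *types[] = { "int", "float", "char", "double", "long" };
    size_t i;
    assert(word != NULL);
    for (i = 0; i < sizeof types / sizeof types[0]; i++)
        if (strcmp(types[i], word) == 0)
            return 1;
    return 0;
}
```

A word such as "float" is accepted, while an identifier such as "main" or a longer word such as "integer" is rejected because the comparison is on the whole word.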

CODE

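#include <stdio.h>
#include <string.h>

int main(void)
{
    char ch[100];   /* buffer for one whitespace-delimited word */
    FILE *f1;

    /* The fopen line is missing from the source; "input.txt" is a
       placeholder file name. */
    f1 = fopen("input.txt", "r");
    if (f1 == NULL)
        return 1;

    while (fscanf(f1, "%99s", ch) == 1)
    {
        if (strcmp("int", ch) == 0 || strcmp("char", ch) == 0 ||
            strcmp("float", ch) == 0 || strcmp("double", ch) == 0)
            printf("%s\n", ch);
    }
    fclose(f1);
    return 0;
}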

INPUT:

int main(){

int a,b,c;float s;chart s;

}

OUTPUT:

int

float

char


Result and Discussion:

This program has been written in C/C++ and successfully recognizes the basic data types of C.



Problem No.04

Problem Name:

Write a C program for developing a lexical analyzer(LA) that will recognize all

Keywords of C.

Problem analysis:





Upon receiving a "get next token" command from the parser, the lexical analyzer reads input characters until it can identify the next token.

[Figure: source program, lexical analyzer, parser, symbol table]






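The lexical analyzer may keep track of the number of newline characters seen, so that a line number can be associated with an error message. The keywords of the C language are:

auto, break, case, char, const, continue, default, do, double, else, enum, extern, float, for, goto, if, int, long, register, return, short, signed, sizeof, static, struct, switch, typedef, union, unsigned, void, volatile, while.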
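Since the keyword list is fixed, one way to test a word against it is a binary search over a sorted table. The sketch below uses the standard library routine bsearch; the names keywords, cmp_words and is_keyword are assumptions of this example and do not come from the lab program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The 32 keywords, kept in strcmp order so that bsearch can be used. */
static const char *keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof",
    "static", "struct", "switch", "typedef", "union", "unsigned", "void",
    "volatile", "while"
};

/* Comparison callback: both arguments point to a (const char *). */
static int cmp_words(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: returns 1 if word is a keyword of C, 0 otherwise. */
int is_keyword(const char *word)
{
    assert(word != NULL);
    return bsearch(&word, keywords,
                   sizeof keywords / sizeof keywords[0],
                   sizeof keywords[0], cmp_words) != NULL;
}
```

A linear scan with strcmp, as in the program of this problem, gives the same answers; the sorted table only reduces the number of comparisons per word.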


CODE

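#include <stdio.h>
#include <string.h>

int main(void)
{
    char t[100];    /* buffer for one whitespace-delimited word */
    const char *k[] = {"auto","break","case","void","char","int","const","continue","default",
                       "do","double","else","enum","extern","float","if","while","for"};
    int n = sizeof k / sizeof k[0];   /* number of keywords in the table */
    int i;
    FILE *f1;

    /* The fopen line is missing from the source; "input.txt" is a
       placeholder file name. */
    f1 = fopen("input.txt", "r");
    if (f1 == NULL)
        return 1;

    while (fscanf(f1, "%99s", t) == 1)
    {
        /* The source breaks off after "for(i=0;i"; the loop below is an
           assumed completion that prints every word found in the table. */
        for (i = 0; i < n; i++)
            if (strcmp(k[i], t) == 0)
                printf("%s\n", t);
    }
    fclose(f1);
    return 0;
}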



INPUT:

#include #include #include

#include

int main(void){

int i,j;

for(j=0;j



Problem No.05

Problem Name:

Write a C program for developing a lexical analyzer(LA) that will eliminate white

spaces and comments from a source program in C .

Problem analysis:

Linear analysis is called lexical analysis or scanning. For example, in lexical analysis the characters in the assignment statement

position := initial + rate * 60

would be grouped into the following tokens:

1. The identifier position
2. The assignment symbol :=
3. The identifier initial
4. The plus sign
5. The identifier rate
6. The multiplication sign
7. The number 60

The blanks separating the characters of these tokens would normally be eliminated during lexical analysis.


The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis. This interaction, summarized schematically in Fig. (a), is commonly implemented by making the lexical analyzer a subroutine or a coroutine of the parser. Upon receiving a "get next token" command from the parser, the lexical analyzer reads input characters until it can identify the next token.

[Figure: source program, lexical analyzer, parser, symbol table]



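The lexical analyzer may keep track of the number of newline characters seen, so that a line number can be associated with an error message.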
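Comment removal can be sketched on a string in memory before it is applied to files. The function name strip_comments is an assumption of this example, and the destination buffer is assumed to be at least as large as the source. The sketch does not treat string literals specially, so a comment opener inside a quoted string would also be removed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies src to dst, dropping block comments
   and the white space characters blank, tab and newline. */
void strip_comments(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {
            src += 2;                          /* skip the opener */
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/'))
                src++;                         /* skip the comment body */
            if (*src != '\0')
                src += 2;                      /* skip the closer */
        } else if (*src == ' ' || *src == '\t' || *src == '\n') {
            src++;                             /* drop white space */
        } else {
            *dst++ = *src++;                   /* keep everything else */
        }
    }
    *dst = '\0';
}
```

A lone division sign is kept, because a comment is recognized only when the slash is immediately followed by a star.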

CODE

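#include <stdio.h>

int main(void)
{
    int t, t1;      /* t: current character, t1: character after a '/' */
    int prev;
    FILE *f1, *f2;

    /* The fopen lines and the loop header are missing from the source;
       they are assumed to be the same as in Problem No.01. */
    f1 = fopen("c:\\compile\\input.txt", "r");
    f2 = fopen("c:\\compile\\output.txt", "w");
    if (f1 == NULL || f2 == NULL)
        return 1;

    while ((t = getc(f1)) != EOF)
    {
        if (t == ' ')
            ;                       /* eliminate blanks */
        else if (t == '/')
        {
            t1 = getc(f1);
            if (t1 == '*')
            {
                /* inside a comment: skip up to and including the closer */
                prev = 0;
                while ((t = getc(f1)) != EOF)
                {
                    if (prev == '*' && t == '/')
                        break;
                    prev = t;
                }
            }
            else
            {
                putc(t, f2);        /* a lone '/' is ordinary text */
                if (t1 != EOF)
                    ungetc(t1, f1); /* re-scan the character after it */
            }
        }
        else
            putc(t, f2);
    }
    fclose(f1);
    fclose(f2);
    return 0;
}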

INPUT:

#include

#includevoid main(){

clrscr();int p,q,m,n;

printf("How many line ");scanf("%d",&n);/* n is the number of input*/

printf("\n\n");

for(p=1;p



printf("%2d",(m--%10));printf("\n");

}getch();}

OUTPUT:

#include

#includevoidmain(){

clrscr();intp,q,m,n;

printf("Howmanyline");scanf("%d",&n);

printf("\n\n");

for(p=1;p



Problem No.06

Problem Name:

Write a C program for developing a lexical analyzer(LA) that will generate token

for a given statement of C source program.

Problem analysis:







[Figure: source program, lexical analyzer, parser, symbol table]






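The lexical analyzer may keep track of the number of newline characters seen, so that a line number can be associated with an error message. The smallest individual units in a source program are known as tokens.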

CODING:

#include <stdio.h>
#include <ctype.h>
#include <string.h>

int keyword(const char buf[]);

/* Keyword table, terminated by a NULL sentinel. */
const char *key[] = {"auto","break","case","char","const","continue","default","do","double","else","enum",
    "extern","float","for","goto","if","int","long","register","return","short","signed","sizeof",
    "static","struct","switch","typedef","union","unsigned","void","volatile","while",NULL};

int main(void)
{
    int c;          /* int, so that EOF can be detected */
    char buf[100];
    FILE *f;

    f = fopen("c6input.cpp", "r");
    if (f == NULL)
        return 1;
    c = getc(f);
    printf("Token Attribute value:\n");
    while (c != EOF)
    {
        int i = 0;
        if (isalpha(c))
        {
            /* identifier or keyword: letter, then letters, digits or '_' */
            while ((isalnum(c) || c == '_') && i < 99)
            {
                buf[i++] = (char)c;
                c = getc(f);
            }
            buf[i] = '\0';
            if (keyword(buf) == 0)
                printf("ID %s\n", buf);
            else
                printf("%s %s\n", buf, buf);
        }
        else if (isdigit(c))
        {
            /* integer part of a number */
            int a = c - '0';
            c = getc(f);
            while (isdigit(c))
            {
                a = a * 10 + c - '0';
                c = getc(f);
            }
            if (c == '.')
            {
                /* fractional part, kept as text */
                char b[10];
                int j = 0;
                c = getc(f);
                while (isdigit(c) && j < 9)
                {
                    b[j++] = (char)c;
                    c = getc(f);
                }
                b[j] = '\0';
                printf("Num %d.%s\n", a, b);
            }
            else
                printf("Num %d\n", a);
        }
        /* The source reads c==''||c=='=' here; the '<' and '>' tests
           appear to have been lost in extraction and are restored. */
        else if (c == '<' || c == '>' || c == '=')
        {
            int k = c;
            c = getc(f);
            if (c == '=')
            {
                printf("RE %c%c\n", k, c);
                c = getc(f);
            }
            else
                printf("RE %c\n", k);
        }
        else
        {
            if (c != '\n' && c != ' ')
                printf("Punchuation %c\n", c);
            c = getc(f);
        }
    }
    fclose(f);
    return 0;
}

/* Returns 1 if buf is a keyword of C, 0 otherwise. */
int keyword(const char buf[])
{
    int i;
    for (i = 0; key[i] != NULL; i++)
        if (strcmp(key[i], buf) == 0)
            return 1;
    return 0;
}

INPUT:

(i



OUTPUT:

Token Attribute value:
Punchuation (
ID i
RE
