
Operator Precedence Parsing

Operator Precedence Grammar-

 

A grammar that satisfies the following two conditions is called an Operator Precedence Grammar-
  • There exists no production rule that contains ε on its RHS.
  • There exists no production rule that contains two non-terminals adjacent to each other on its RHS.

 

  • It represents a small class of grammars.
  • But it is an important class because of its widespread applications.

 

Examples-

E → E + E | E x E | id is an operator precedence grammar, since no production contains ε on its RHS and no production contains two adjacent non-terminals.

E → EAE | id together with A → + | x is not an operator precedence grammar, since the RHS EAE contains two non-terminals adjacent to each other.

Operator Precedence Parser-

 

A parser that parses an operator precedence grammar is called an Operator Precedence Parser.

 

Designing Operator Precedence Parser-

 

In operator precedence parsing,

  • Firstly, we define precedence relations between every pair of terminal symbols.
  • Secondly, we construct an operator precedence table.

 

Defining Precedence Relations-

 

The precedence relations are defined using the following rules-

 

Rule-01:

 

  • If precedence of b is higher than precedence of a, then we define a < b
  • If precedence of b is same as precedence of a, then we define a = b
  • If precedence of b is lower than precedence of a, then we define a > b

 

Rule-02:

 

  • An identifier is always given higher precedence than any other symbol.
  • The $ symbol is always given the lowest precedence.

 

Rule-03:

 

  • If two operators have the same precedence, then we go by checking their associativity.

 

Parsing A Given String-

 

The given input string is parsed using the following steps-

 

Step-01:

 

Insert the following-

  • The $ symbol at the beginning and at the end of the input string.
  • The precedence relation between every two symbols of the string, by referring to the operator precedence table.

 

Step-02:

 

  • Scan the string from the LHS in the forward direction until the first > symbol is encountered.
  • Keep a pointer at that location.

 

Step-03:

 

  • From that location, scan the string in the backward direction until a < symbol is encountered.
  • Keep a pointer at that location.

 

Step-04:

 

  • Everything that lies between the < and > symbols forms the handle.
  • Replace the handle with the head (the LHS non-terminal) of the corresponding production.

 

Step-05:

 

Keep repeating the cycle from Step-02 to Step-04 until the start symbol is reached.
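
The Step-02 to Step-04 cycle can also be sketched as a short routine. The Python fragment below is only an illustration (the list representation of the relation string and the use of a single generic head E are assumptions, not part of the method); it operates on a string that already carries the precedence relations inserted in Step-01-

def reduce_once(symbols):
    # Illustrative sketch (not from the article): symbols is a list such as
    # ['$', '<', 'id', '>', '+', ...] produced by Step-01.
    right = symbols.index('>')                                  # Step-02: first > from the left
    left = max(i for i in range(right) if symbols[i] == '<')    # Step-03: nearest < before it
    handle = [s for s in symbols[left + 1:right] if s not in ('<', '=', '>')]
    # Step-04: the handle is replaced with a generic head E
    return symbols[:left] + ['E'] + symbols[right + 1:], handle

Once no > symbol remains, the precedence relations between the remaining terminals are inserted again and the cycle repeats (Step-05), exactly as the worked traces in the problems below show.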

 

Advantages-

 

The advantages of operator precedence parsing are-

  • The implementation is very easy and simple.
  • The parser is quite powerful for expressions in programming languages.

 

Disadvantages-

 

The disadvantages of operator precedence parsing are-

  • The handling of tokens known to have two different precedences (such as the minus sign, which may be unary or binary) becomes difficult.
  • Only a small class of grammars can be parsed using this parser.

 

Important Note-

 

  • In practice, the operator precedence table is not stored by operator precedence parsers.
  • This is because it occupies a large amount of space.
  • Instead, operator precedence parsers are implemented using operator precedence functions.

 

Operator Precedence Functions-

 

Precedence functions map the terminal symbols to integers.

 

  • To decide the precedence relation between symbols, a numerical comparison is performed.
  • It reduces the space complexity to a large extent.

 

Also Read- Shift Reduce Parsing

 

PRACTICE PROBLEMS BASED ON OPERATOR PRECEDENCE PARSING-

 

Problem-01:

 

Consider the following grammar-

E → EAE | id

A → + | x

Construct the operator precedence parser and parse the string id + id x id.

 

Solution-

 

Step-01:

 

We convert the given grammar into an equivalent operator precedence grammar.

The equivalent operator precedence grammar is-

E → E + E | E x E | id

 

Step-02:

 

The terminal symbols in the grammar are { id, + , x , $ }

We construct the operator precedence table as-

 

        id      +       x       $
id              >       >       >
+       <       >       <       >
x       <       >       >       >
$       <       <       <

Operator Precedence Table

 

Parsing Given String-

 

Given string to be parsed is id + id x id.

We follow the following steps to parse the given string-

 

Step-01:

 

We insert $ symbol at both ends of the string as-

$ id + id x id $

 

We insert precedence operators between the string symbols as-

$ < id > + < id > x < id > $

 

Step-02:

 

We scan and parse the string as-

 

$ < id > + < id > x < id > $

$ E + < id > x < id > $

$ E + E x < id > $

$ E + E x E $

$ + x $

$ < + < x > $

$ < + > $

$ $
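
The same scan-and-reduce process can be written as a small program. The sketch below is purely illustrative (the dictionary encoding of the table and the choice to keep only terminal symbols on the stack, as the trace above also does, are assumptions); it accepts the string id + id x id using the operator precedence table constructed in Step-02-

# Illustrative sketch; the table encoding and the terminal-only stack are assumptions.
table = {
    'id': {'+': '>', 'x': '>', '$': '>'},
    '+':  {'id': '<', '+': '>', 'x': '<', '$': '>'},
    'x':  {'id': '<', '+': '>', 'x': '>', '$': '>'},
    '$':  {'id': '<', '+': '<', 'x': '<'},
}

def parse(tokens):
    stack = ['$']                          # Step-01: $ at both ends
    tokens = tokens + ['$']
    i = 0
    while not (stack == ['$'] and tokens[i] == '$'):
        a, b = stack[-1], tokens[i]
        rel = table[a].get(b)
        if rel in ('<', '='):              # shift and keep scanning forward
            stack.append(b)
            i += 1
        elif rel == '>':                   # a handle ends here: reduce (Step-02 to Step-04)
            while True:
                popped = stack.pop()
                if table[stack[-1]].get(popped) == '<':
                    break                  # the popped symbols formed the handle
        else:
            raise SyntaxError('no precedence relation between ' + a + ' and ' + b)
    return True                            # only $ remains: the string is accepted

print(parse(['id', '+', 'id', 'x', 'id']))    # True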

 

Problem-02:

 

Consider the following grammar-

S → ( L ) | a

L → L , S | S

Construct the operator precedence parser and parse the string ( a , ( a , a ) ).

 

Solution-

 

The terminal symbols in the grammar are a , ( , ) and the comma, together with the end marker $.

We construct the operator precedence table as-

 

        a       (       )       ,       $
a                       >       >       >
(       <       <       =       <
)                       >       >       >
,       <       <       >       >
$       <       <

Operator Precedence Table

 

Parsing Given String-

 

Given string to be parsed is ( a , ( a , a ) ).

We follow the following steps to parse the given string-

 

Step-01:

 

We insert $ symbol at both ends of the string as-

$ ( a , ( a , a ) ) $

 

We insert precedence operators between the string symbols as-

$ < ( < a > , < ( < a > , < a > ) > ) > $

 

Step-02:

 

We scan and parse the string as-

 

$ < ( < a > , < ( < a > , < a > ) > ) > $

$ < ( S , < ( < a > , < a > ) > ) > $

$ < ( S , < ( S , < a > ) > ) > $

$ < ( S , < ( S , S ) > ) > $

$ < ( S , < ( L , S ) > ) > $

$ < ( S , < ( L ) > ) > $

$ < ( S , S ) > $

$ < ( L , S ) > $

$ < ( L ) > $

$ < S > $

$ $

 

Problem-03:

 

Consider the following grammar-

E → E + E | E x E | id

  1. Construct Operator Precedence Parser.
  2. Find the Operator Precedence Functions.

 

Solution-

 

The terminal symbols in the grammar are { + , x , id , $ }

We construct the operator precedence table as-

 

f ↓ \ g →   id      +       x       $
id                  >       >       >
+           <       >       <       >
x           <       >       >       >
$           <       <       <

Operator Precedence Table

 

The graph representing the precedence functions is drawn by placing an edge from g_b to f_a for every relation a < b, and an edge from f_a to g_b for every relation a > b. (The graph figure is omitted here.)

 

 

Here, the longest paths are-

  • f_id → g_x → f_+ → g_+ → f_$
  • g_id → f_x → g_x → f_+ → g_+ → f_$

 

The resulting precedence functions are-

 

        +       x       id      $
f       2       4       4       0
g       1       3       5       0
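
As a quick check (not part of the original solution), the functions can be verified against the operator precedence table: for every pair with a < b we must have f(a) < g(b), and for every pair with a > b we must have f(a) > g(b)-

# Illustrative check; the dictionaries simply restate the table and functions above.
f = {'+': 2, 'x': 4, 'id': 4, '$': 0}
g = {'+': 1, 'x': 3, 'id': 5, '$': 0}

relations = {('id', '+'): '>', ('id', 'x'): '>', ('id', '$'): '>',
             ('+', 'id'): '<', ('+', '+'): '>', ('+', 'x'): '<', ('+', '$'): '>',
             ('x', 'id'): '<', ('x', '+'): '>', ('x', 'x'): '>', ('x', '$'): '>',
             ('$', 'id'): '<', ('$', '+'): '<', ('$', 'x'): '<'}

for (a, b), rel in relations.items():
    assert (f[a] < g[b]) if rel == '<' else (f[a] > g[b]), (a, rel, b)
print('precedence functions agree with the table')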

 


Quadruples, Triples and Indirect Triples

Three Address Code-

 

Before you go through this article, make sure that you have gone through the previous article on Three Address Code.

 

We have discussed-

  • Three Address Code is a form of an intermediate code.
  • It is generated by the compiler for implementing Code Optimization.
  • It uses at most three addresses to represent any statement.

 

In this article, we will discuss Implementation of Three Address Code.

 

Implementation-

 

Three Address Code is implemented as a record with the address fields.

 

The commonly used representations for implementing Three Address Code are-

 

 

  1. Quadruples
  2. Triples
  3. Indirect Triples

 

1. Quadruples-

 

In quadruples representation, each instruction is split into the following four fields-

op, arg1, arg2, result

Here-

  • The op field is used for storing the internal code of the operator.
  • The arg1 and arg2 fields are used for storing the two operands used.
  • The result field is used for storing the result of the expression.

 

Exceptions

 

There are following exceptions-

 

Exception-01:

 

To represent the statement x = op y, we place-

  • op in the operator field
  • y in the arg1 field
  • x in the result field
  • arg2 field remains unused

 

Exception-02:

 

To represent the statement like param t1, we place-

  • param in the operator field
  • t1 in the arg1 field
  • Neither arg2 field nor result field is used

 

Exception-03:

 

To represent the unconditional and conditional jump statements, we place label of the target in the result field.
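
A possible in-memory form of such records is sketched below (the namedtuple layout and the field names are assumptions made only for illustration); it shows the normal binary form together with the three exceptions-

from collections import namedtuple

# Illustrative layout; the record type and field names are assumptions.
Quad = namedtuple('Quad', 'op arg1 arg2 result')

quads = [
    Quad('+',      'b',  'c',  't1'),   # t1 = b + c : normal form, all four fields used
    Quad('uminus', 'y',  None, 'x'),    # x = uminus y : Exception-01, arg2 unused
    Quad('param',  't1', None, None),   # param t1 : Exception-02, arg2 and result unused
    Quad('goto',   None, None, 'L1'),   # jump : Exception-03, target label kept in result
]

for q in quads:
    print(q)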

 

2. Triples-

 

In triples representation,

  • References to the positions of other instructions are used as operands.
  • Temporary variables are not used.

 

3. Indirect Triples-

 

  • This representation is an enhancement over the triples representation.
  • It uses an additional instruction array that lists pointers to the triples in the desired order.
  • Thus, results are referred to through these pointers rather than by position.
  • It allows the optimizer to easily re-order the sub-expressions for producing optimized code.
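
A rough sketch of the two layouts for the statement t = a + b x c is given below (the tuple representation and the use of plain integer positions for references are assumptions)-

# Illustrative layout; operands that are integers denote positions of earlier triples.
triples = [
    ('x', 'b', 'c'),    # (0)  b x c
    ('+', 'a', 0),      # (1)  a + (0), no temporary variable is needed
]

# Indirect triples: a separate instruction list holds pointers (here, indices) into the
# triple table, so instructions can be re-ordered without renumbering the arguments.
instruction_list = [0, 1]

for position in instruction_list:
    print(position, triples[position])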

 

PRACTICE PROBLEMS BASED ON QUADRUPLES, TRIPLES & INDIRECT TRIPLES-

 

Problem-01:

 

Translate the following expression to quadruple, triple and indirect triple-

a + b x c / e ↑ f + b x a

 

Solution-

 

Three Address Code for the given expression is-

 

T1 = e ↑ f

T2 = b x c

T3 = T2 / T1

T4 = b x a

T5 = a + T3

T6 = T5 + T4

 

Now, we write the required representations-

 

Quadruple Representation-

 

Location    Op    Arg1    Arg2    Result
(0)         ↑     e       f       T1
(1)         x     b       c       T2
(2)         /     T2      T1      T3
(3)         x     b       a       T4
(4)         +     a       T3      T5
(5)         +     T5      T4      T6

 

Triple Representation-

 

Location    Op    Arg1    Arg2
(0)         ↑     e       f
(1)         x     b       c
(2)         /     (1)     (0)
(3)         x     b       a
(4)         +     a       (2)
(5)         +     (4)     (3)

 

Indirect Triple Representation-

 

Location    Statement
35          (0)
36          (1)
37          (2)
38          (3)
39          (4)
40          (5)

 

Location    Op    Arg1    Arg2
(0)         ↑     e       f
(1)         x     b       c
(2)         /     (1)     (0)
(3)         x     b       a
(4)         +     a       (2)
(5)         +     (4)     (3)

 

Problem-02:

 

Translate the following expression to quadruple, triple and indirect triple-

a = b x – c + b x – c

 

Solution-

 

Three Address Code for the given expression is-

 

T1 = uminus c

T2 = b x T1

T3 = uminus c

T4 = b x T3

T5 = T2 + T4

a = T5

 

Now, we write the required representations-

 

Quadruple Representation-

 

Location    Op        Arg1    Arg2    Result
(1)         uminus    c               T1
(2)         x         b       T1      T2
(3)         uminus    c               T3
(4)         x         b       T3      T4
(5)         +         T2      T4      T5
(6)         =         T5              a

 

Triple Representation-

 

Location    Op        Arg1    Arg2
(1)         uminus    c
(2)         x         b       (1)
(3)         uminus    c
(4)         x         b       (3)
(5)         +         (2)     (4)
(6)         =         a       (5)

 

Indirect Triple Representation-

 

Location    Statement
35          (1)
36          (2)
37          (3)
38          (4)
39          (5)
40          (6)

 

Location    Op        Arg1    Arg2
(1)         uminus    c
(2)         x         b       (1)
(3)         uminus    c
(4)         x         b       (3)
(5)         +         (2)     (4)
(6)         =         a       (5)

 


Code Optimization | Code Optimization Techniques

Code Optimization-

 

Code Optimization is an approach to enhance the performance of the code.

 

The process of code optimization involves-

  • Eliminating the unwanted code lines
  • Rearranging the statements of the code

 

Advantages-

 

The optimized code has the following advantages-

  • Optimized code has faster execution speed.
  • Optimized code utilizes the memory efficiently.
  • Optimized code gives better performance.

 

Code Optimization Techniques-

 

Important code optimization techniques are-

 

 

  1. Compile Time Evaluation
  2. Common sub-expression elimination
  3. Dead Code Elimination
  4. Code Movement
  5. Strength Reduction

 

1. Compile Time Evaluation-

 

Two techniques that fall under compile time evaluation are-

 

A) Constant Folding-

 

In this technique,

  • As the name suggests, it involves folding the constants.
  • The expressions that contain the operands having constant values at compile time are evaluated.
  • Those expressions are then replaced with their respective results.

 

Example-

 

Circumference of Circle  = (22/7) x Diameter

 

Here,

  • This technique evaluates the expression 22/7 at compile time.
  • The expression is then replaced with its result 3.14.
  • This saves the time at run time.

 

B) Constant Propagation-

 

In this technique,

  • If a variable has been assigned a constant value, then that variable is replaced with its constant value in the rest of the program during compilation.
  • The condition is that the value of the variable must not be altered in between.

 

Example-

 

pi = 3.14

radius = 10

Area of circle = pi x radius x radius

 

Here,

  • This technique substitutes the value of variables ‘pi’ and ‘radius’ at compile time.
  • It then evaluates the expression 3.14 x 10 x 10.
  • The expression is then replaced with its result 314.
  • This saves the time at run time.
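
A combined sketch of the two techniques is given below (the three-address style tuples and the helper name fold_and_propagate are assumptions made for illustration): known constants are first propagated into each statement, and any statement whose operands are then all constants is folded at compile time-

# Illustrative sketch; statements are (target, op, arg1, arg2) and op None means a copy.
OPS = {'+': lambda a, b: a + b, 'x': lambda a, b: a * b, '/': lambda a, b: a / b}

def fold_and_propagate(code):
    consts, out = {}, []
    for target, op, a, b in code:
        a, b = consts.get(a, a), consts.get(b, b)          # constant propagation
        if op is None:                                     # plain copy such as pi = 3.14
            value = a
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            value = OPS[op](a, b)                          # constant folding at compile time
        else:
            out.append((target, op, a, b))                 # cannot fold this statement
            consts.pop(target, None)                       # target is no longer a known constant
            continue
        consts[target] = value
        out.append((target, None, value, None))
    return out

# pi = 3.14 ; radius = 10 ; t = pi x radius ; area = t x radius
print(fold_and_propagate([('pi', None, 3.14, None),
                          ('radius', None, 10, None),
                          ('t', 'x', 'pi', 'radius'),
                          ('area', 'x', 't', 'radius')]))
# area is already 314 (up to floating point rounding) before the program ever runs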

 

2. Common Sub-Expression Elimination-

 

An expression that has already been computed and appears again in the code is called a Common Sub-Expression.

 

In this technique,

  • As the name suggests, it involves eliminating the common sub expressions.
  • The redundant expressions are eliminated to avoid their re-computation.
  • The already computed result is used in the further program when required.

 

Example-

 

Code Before Optimization-

S1 = 4 x i
S2 = a[S1]
S3 = 4 x j
S4 = 4 x i  // Redundant Expression
S5 = n
S6 = b[S4] + S5

Code After Optimization-

S1 = 4 x i
S2 = a[S1]
S3 = 4 x j
S5 = n
S6 = b[S1] + S5
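
A minimal local pass that performs this elimination is sketched below (the tuple layout and operator mnemonics such as 'a[]' are assumptions chosen only to mirror the example above, and the pass assumes that no operand is redefined between the two occurrences)-

# Illustrative sketch; statements are (result, op, arg1, arg2) tuples.
def eliminate_cse(code):
    seen = {}       # (op, arg1, arg2) -> name that already holds this value
    rename = {}     # redundant temporary -> earlier equivalent name
    out = []
    for result, op, a, b in code:
        a, b = rename.get(a, a), rename.get(b, b)
        key = (op, a, b)
        if key in seen:
            rename[result] = seen[key]          # drop the redundant statement
        else:
            seen[key] = result
            out.append((result, op, a, b))
    return out

before = [('S1', 'x', '4', 'i'),
          ('S2', 'a[]', 'S1', None),       # S2 = a[S1]   ('a[]' is a made-up mnemonic)
          ('S3', 'x', '4', 'j'),
          ('S4', 'x', '4', 'i'),           # redundant: 4 x i is already held in S1
          ('S5', 'copy', 'n', None),       # S5 = n
          ('S6', 'b[]+', 'S4', 'S5')]      # S6 = b[S4] + S5
print(eliminate_cse(before))               # S4 disappears and S6 now reads b[S1] + S5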

 

3. Code Movement-

 

In this technique,

  • As the name suggests, it involves movement of the code.
  • The code present inside the loop is moved out if its result does not change across the iterations of the loop.
  • Otherwise, such code unnecessarily gets executed again and again with each iteration of the loop.
  • This leads to the wastage of time at run time.

 

Example-

 

Code Before Optimization-

for ( int j = 0 ; j < n ; j ++ )
{
    x = y + z ;
    a[j] = 6 x j ;
}

Code After Optimization-

x = y + z ;
for ( int j = 0 ; j < n ; j ++ )
{
    a[j] = 6 x j ;
}

 

4. Dead Code Elimination-

 

In this technique,

  • As the name suggests, it involves eliminating the dead code.
  • The statements of the code that never execute, are unreachable, or whose output is never used are eliminated.

 

Example-

 

Code Before Optimization-

i = 0 ;
if ( i == 1 )
{
    a = x + 5 ;
}

Code After Optimization-

i = 0 ;

 

5. Strength Reduction-

 

In this technique,

  • As the name suggests, it involves reducing the strength of expressions.
  • This technique replaces the expensive and costly operators with the simple and cheaper ones.

 

Example-

 

Code Before Optimization-  B = A x 2
Code After Optimization-  B = A + A

 

Here,

  • The expression “A x 2” is replaced with the expression “A + A”.
  • This is because the cost of multiplication operator is higher than that of addition operator.
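
A tiny illustrative sketch of such a replacement is given below; multiplication by 2 becomes an addition, and multiplication by any other power of two becomes a shift, which is also a standard strength reduction-

# Illustrative sketch; a statement is the tuple (op, arg1, arg2).
def reduce_strength(op, a, b):
    if op == 'x' and b == 2:
        return ('+', a, a)                           # B = A x 2  becomes  B = A + A
    if op == 'x' and isinstance(b, int) and b > 0 and b & (b - 1) == 0:
        return ('<<', a, b.bit_length() - 1)         # A x 8 becomes A << 3 (a cheaper shift)
    return (op, a, b)                                # nothing to reduce

print(reduce_strength('x', 'A', 2))    # ('+', 'A', 'A')
print(reduce_strength('x', 'A', 8))    # ('<<', 'A', 3)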

 


Three Address Code | Examples

Three Address Code-

 

Three Address Code is a form of an intermediate code.

 

The characteristics of Three Address instructions are-

  • They are generated by the compiler for implementing Code Optimization.
  • They use at most three addresses to represent any statement.
  • They are implemented as a record with the address fields.

 

General Form-

 

In general, Three Address instructions are represented as-

 

a = b op c

 

Here,

  • a, b and c are the operands.
  • Operands may be constants, names, or compiler generated temporaries.
  • op represents the operator.

 

Examples-

 

Examples of Three Address instructions are-

  • a = b + c
  • c = a x b

 

Common Three Address Instruction Forms-

 

The common forms of Three Address instructions are-

 

1. Assignment Statement-

 

x = y op z and x = op y

 

Here,

  • x, y and z are the operands.
  • op represents the operator.

 

It assigns the result obtained after solving the right side expression of the assignment operator to the left side operand.

 

2. Copy Statement-

 

x = y

 

Here,

  • x and y are the operands.
  • = is an assignment operator.

 

It copies and assigns the value of operand y to operand x.

 

3. Conditional Jump-

 

If x relop y goto X

 

Here,

  • x & y are the operands.
  • X is the tag or label of the target statement.
  • relop is a relational operator.

 

If the condition “x relop y” gets satisfied, then-

  • The control is sent directly to the location specified by label X.
  • All the statements in between are skipped.

 

If the condition “x relop y” fails, then-

  • The control is not sent to the location specified by label X.
  • The next statement appearing in the usual sequence is executed.

 

4. Unconditional Jump-

 

goto X

 

Here, X is the tag or label of the target statement.

 

On executing the statement,

  • The control is sent directly to the location specified by label X.
  • All the statements in between are skipped.

 

5. Procedure Call-

 

param x
call p
return y

 

Here, p is a function which takes x as a parameter and returns y.

 

PRACTICE PROBLEMS BASED ON THREE ADDRESS CODE-

 

To solve the problems, learn about the precedence relations and associativity of operators.

 

Problem-01:

 

Write Three Address Code for the following expression-

a = b + c + d

 

Solution-

 

Since + is left associative, the given expression is solved as ( ( b + c ) + d ).

 

 

Three Address Code for the given expression is-

(1) T1 = b + c

(2) T2 = T1 + d

(3) a = T2

 

Problem-02:

 

Write Three Address Code for the following expression-

-(a x b) + (c + d) – (a + b + c + d)

 

Solution-

 

Three Address Code for the given expression is-

(1) T1 = a x b

(2) T2 = uminus T1

(3) T3 = c + d

(4) T4 = T2 + T3

(5) T5 = a + b

(6) T6 = T3 + T5

(7) T7 = T4 – T6

 

Problem-03:

 

Write Three Address Code for the following expression-

If A < B then 1 else 0

 

Solution-

 

Three Address Code for the given expression is-

(1) If (A < B) goto (4)

(2) T1 = 0

(3) goto (5)

(4) T1 = 1

(5)

 

Problem-04:

 

Write Three Address Code for the following expression-

If A < B and C < D then t = 1 else t = 0

 

Solution-

 

Three Address Code for the given expression is-

(1) If (A < B) goto (3)

(2) goto (4)

(3) If (C < D) goto (6)

(4) t = 0

(5) goto (7)

(6) t = 1

(7)

 

Also Read- More Problems On Three Address Code

 


Directed Acyclic Graphs | DAGs | Examples

Directed Acyclic Graph-

 

Directed Acyclic Graph (DAG) is a special kind of Abstract Syntax Tree.

 

  • Each node of it contains a unique value.
  • It does not contain any cycles in it, hence called Acyclic.

 

Optimization Of Basic Blocks-

 

DAG is a very useful data structure for implementing transformations on Basic Blocks.

 

  • A DAG is constructed for optimizing the basic block.
  • A DAG is usually constructed using Three Address Code.
  • Transformations such as dead code elimination and common sub expression elimination are then applied.

 

Properties-

 

  • Reachability relation forms a partial order in DAGs.
  • Both transitive closure & transitive reduction are uniquely defined for DAGs.
  • Topological Orderings are defined for DAGs.

 

Applications-

 

DAGs are used for the following purposes-

  • To determine the expressions which have been computed more than once (called common sub-expressions).
  • To determine the names whose computation has been done outside the block but which are used inside the block.
  • To determine the statements of the block whose computed value can be made available outside the block.
  • To simplify the list of Quadruples by not executing the assignment instructions x:=y unless they are necessary and eliminating the common sub-expressions.

 

Construction of DAGs-

 

The following rules are used for the construction of DAGs-

 

Rule-01:

 

In a DAG,

  • Interior nodes always represent the operators.
  • Exterior nodes (leaf nodes) always represent the names, identifiers or constants.

 

Rule-02:

 

While constructing a DAG,

  • A check is made to find if there exists any node with the same value.
  • A new node is created only when there does not exist any node with the same value.
  • This action helps in detecting the common sub-expressions and avoiding the re-computation of the same.

 

Rule-03:

 

The assignment instructions of the form x:=y are not performed unless they are necessary.
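
These rules can be sketched as a small value-numbering routine (an illustrative assumption, not the article's code; the copy statements of Rule-03 are left out for brevity): every name points at the DAG node that currently holds its value, and a new interior node is created only when no node with the same operator and children already exists-

# Illustrative sketch; code is a list of (result, op, arg1, arg2) statements.
def build_dag(code):
    nodes = []         # each node: (op or 'leaf', left child index, right child index)
    node_of = {}       # (op, left, right) -> node index
    points_to = {}     # variable name -> index of the node holding its current value

    def node_for(name):
        if name in points_to:                 # the name already labels a DAG node
            return points_to[name]
        key = ('leaf', name, None)            # otherwise it is a leaf node (Rule-01)
        if key not in node_of:
            node_of[key] = len(nodes)
            nodes.append(key)
        return node_of[key]

    for result, op, a, b in code:
        key = (op, node_for(a), node_for(b) if b is not None else None)
        if key not in node_of:                # Rule-02: reuse an existing node if possible
            node_of[key] = len(nodes)
            nodes.append(key)
        points_to[result] = node_of[key]      # attach the result name to that node
    return nodes, points_to

# Using the three address code of Problem-01 below: a + b is built once and then reused.
nodes, labels = build_dag([('T1', '+', 'a', 'b'),
                           ('T2', '+', 'T1', 'c'),
                           ('T3', 'x', 'T1', 'T2')])
print(labels)    # T1, T2 and T3 each label one interior node; a and b are shared leaves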

 

Also Read- Code Optimization

 

PRACTICE PROBLEMS BASED ON DIRECTED ACYCLIC GRAPHS-

 

Problem-01:

 

Consider the following expression and construct a DAG for it-

         ( a + b ) x ( a + b + c )

 

Solution-

 

Three Address Code for the given expression is-

 

T1 = a + b

T2 = T1 + c

T3 = T1 x T2

 

Now, Directed Acyclic Graph is-

 

 

NOTE

 

From the constructed DAG, we observe-

  • The common sub-expression (a+b) has been expressed into a single node in the DAG.
  • The computation is carried out only once and stored in the identifier T1 and reused later.

 

This illustrates how the construction scheme of a DAG identifies the common sub-expression and helps in eliminating its re-computation later.

 

Problem-02:

 

Consider the following expression and construct a DAG for it-

( ( ( a + a ) + ( a + a ) ) + ( ( a + a ) + ( a + a ) ) )

 

Solution-

 

Directed Acyclic Graph for the given expression is-

 

 

Problem-03:

 

Consider the following block and construct a DAG for it-

 

(1) a = b x c

(2) d = b

(3) e = d x c

(4) b = e

(5) f = b + c

(6) g = f + d

 

Solution-

 

Directed Acyclic Graph for the given block is-

 

 

Problem-04:

 

Optimize the block in the Problem-03.

 

Solution-

 

Step-01:

 

Firstly, construct a DAG for the given block (already done above).

 

Step-02:

 

Now, the optimized block can be generated by traversing the DAG.

  • The common sub-expression e = d x c which is actually b x c (since d = b) is eliminated.
  • The dead code b = e is eliminated.

 

The optimized block is-

 

(1) a = b x c

(2) d = b

(3) f = a + c

(4) g = f + d

 

Problem-05:

 

Consider the following basic block-

 

B10:

S1 = 4 x I

S2 = addr(A) – 4

S3 = S2[S1]

S4 = 4 x I

S5 = addr(B) – 4

S6 = S5[S4]

S7 = S3 x S6

S8 = PROD + S7

PROD = S8

S9 = I + 1

I = S9

If I <= 20 goto L10

 

  1. Draw a directed acyclic graph and identify local common sub-expressions.
  2. After eliminating the common sub-expressions, re-write the basic block.

 

Solution-

 

Directed Acyclic Graph for the given basic block is-

 

 

In this code fragment,

  • 4 x I is a common sub-expression. Hence, we can eliminate S4 because S1 = S4.
  • We can optimize S8 = PROD + S7 and PROD = S8 as PROD = PROD + S7.
  • We can optimize S9 = I + 1 and I = S9 as I = I + 1.

 

After eliminating S4, S8 and S9, we get the following basic block-

 

B10:

S1 = 4 x I

S2 = addr(A) – 4

S3 = S2[S1]

S5 = addr(B) – 4

S6 = S5[S1]

S7 = S3 x S6

PROD = PROD + S7

I = I + 1

If I <= 20 goto L10

 
