
Operator Precedence Parsing

Operator Precedence Grammar-

 

A grammar that satisfies the following two conditions is called an Operator Precedence Grammar-
  • There exists no production rule that contains ε on its RHS.
  • There exists no production rule that contains two non-terminals adjacent to each other on its RHS.

 

  • It represents a small class of grammars.
  • But it is an important class because of its widespread applications.

 

Examples-

E → E + E | E x E | id is an operator precedence grammar, since no production contains ε on its RHS and no production contains two adjacent non-terminals.

E → EAE | id together with A → + | x is not an operator precedence grammar, since the RHS EAE contains two non-terminals adjacent to each other.

Operator Precedence Parser-

 

A parser that parses an operator precedence grammar is called an Operator Precedence Parser.

 

Designing Operator Precedence Parser-

 

In operator precedence parsing,

  • Firstly, we define precedence relations between every pair of terminal symbols.
  • Secondly, we construct an operator precedence table.

 

Defining Precedence Relations-

 

The precedence relations are defined using the following rules-

 

Rule-01:

 

  • If precedence of b is higher than precedence of a, then we define a < b
  • If precedence of b is same as precedence of a, then we define a = b
  • If precedence of b is lower than precedence of a, then we define a > b

 

Rule-02:

 

  • An identifier is always given higher precedence than any other symbol.
  • The $ symbol is always given the lowest precedence.

 

Rule-03:

 

  • If two operators have the same precedence, then we go by checking their associativity.

 

Parsing A Given String-

 

The given input string is parsed using the following steps-

 

Step-01:

 

Insert the following-

  • The $ symbol at the beginning and at the end of the input string.
  • The precedence relation between every two symbols of the string, by referring to the operator precedence table.

 

Step-02:

 

  • Scan the string from the LHS in the forward direction until the first > symbol is encountered.
  • Keep a pointer at that location.

 

Step-03:

 

  • From that location, scan the string in the backward direction until a < symbol is encountered.
  • Keep a pointer at that location.

 

Step-04:

 

  • Everything that lies between the < and > symbols forms the handle.
  • Replace the handle with the head (the LHS non-terminal) of the corresponding production.

 

Step-05:

 

Keep repeating the cycle from Step-02 to Step-04 until the start symbol is reached.
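
The Step-02 to Step-04 cycle can also be sketched as a short routine. The Python fragment below is only an illustration (the list representation of the relation string and the use of a single generic head E are assumptions, not part of the method); it operates on a string that already carries the precedence relations inserted in Step-01-

def reduce_once(symbols):
    # Illustrative sketch (not from the article): symbols is a list such as
    # ['$', '<', 'id', '>', '+', ...] produced by Step-01.
    right = symbols.index('>')                                  # Step-02: first > from the left
    left = max(i for i in range(right) if symbols[i] == '<')    # Step-03: nearest < before it
    handle = [s for s in symbols[left + 1:right] if s not in ('<', '=', '>')]
    # Step-04: the handle is replaced with a generic head E
    return symbols[:left] + ['E'] + symbols[right + 1:], handle

Once no > symbol remains, the precedence relations between the remaining terminals are inserted again and the cycle repeats (Step-05), exactly as the worked traces in the problems below show.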

 

Advantages-

 

The advantages of operator precedence parsing are-

  • The implementation is very easy and simple.
  • The parser is quite powerful for expressions in programming languages.

 

Disadvantages-

 

The disadvantages of operator precedence parsing are-

  • The handling of tokens known to have two different precedences (such as the minus sign, which may be unary or binary) becomes difficult.
  • Only a small class of grammars can be parsed using this parser.

 

Important Note-

 

  • In practice, the operator precedence table is not stored by operator precedence parsers.
  • This is because it occupies a large amount of space.
  • Instead, operator precedence parsers are implemented using operator precedence functions.

 

Operator Precedence Functions-

 

Precedence functions map the terminal symbols to integers.

 

  • To decide the precedence relation between symbols, a numerical comparison is performed.
  • It reduces the space complexity to a large extent.

 

Also Read- Shift Reduce Parsing

 

PRACTICE PROBLEMS BASED ON OPERATOR PRECEDENCE PARSING-

 

Problem-01:

 

Consider the following grammar-

E → EAE | id

A → + | x

Construct the operator precedence parser and parse the string id + id x id.

 

Solution-

 

Step-01:

 

We convert the given grammar into an equivalent operator precedence grammar.

The equivalent operator precedence grammar is-

E → E + E | E x E | id

 

Step-02:

 

The terminal symbols in the grammar are { id, + , x , $ }

We construct the operator precedence table as-

 

        id      +       x       $
id              >       >       >
+       <       >       <       >
x       <       >       >       >
$       <       <       <

Operator Precedence Table

 

Parsing Given String-

 

Given string to be parsed is id + id x id.

We follow the following steps to parse the given string-

 

Step-01:

 

We insert $ symbol at both ends of the string as-

$ id + id x id $

 

We insert precedence operators between the string symbols as-

$ < id > + < id > x < id > $

 

Step-02:

 

We scan and parse the string as-

 

$ < id > + < id > x < id > $

$ E + < id > x < id > $

$ E + E x < id > $

$ E + E x E $

$ + x $

$ < + < x > $

$ < + > $

$ $
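
The same scan-and-reduce process can be written as a small program. The sketch below is purely illustrative (the dictionary encoding of the table and the choice to keep only terminal symbols on the stack, as the trace above also does, are assumptions); it accepts the string id + id x id using the operator precedence table constructed in Step-02-

# Illustrative sketch; the table encoding and the terminal-only stack are assumptions.
table = {
    'id': {'+': '>', 'x': '>', '$': '>'},
    '+':  {'id': '<', '+': '>', 'x': '<', '$': '>'},
    'x':  {'id': '<', '+': '>', 'x': '>', '$': '>'},
    '$':  {'id': '<', '+': '<', 'x': '<'},
}

def parse(tokens):
    stack = ['$']                          # Step-01: $ at both ends
    tokens = tokens + ['$']
    i = 0
    while not (stack == ['$'] and tokens[i] == '$'):
        a, b = stack[-1], tokens[i]
        rel = table[a].get(b)
        if rel in ('<', '='):              # shift and keep scanning forward
            stack.append(b)
            i += 1
        elif rel == '>':                   # a handle ends here: reduce (Step-02 to Step-04)
            while True:
                popped = stack.pop()
                if table[stack[-1]].get(popped) == '<':
                    break                  # the popped symbols formed the handle
        else:
            raise SyntaxError('no precedence relation between ' + a + ' and ' + b)
    return True                            # only $ remains: the string is accepted

print(parse(['id', '+', 'id', 'x', 'id']))    # True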

 

Problem-02:

 

Consider the following grammar-

S → ( L ) | a

L → L , S | S

Construct the operator precedence parser and parse the string ( a , ( a , a ) ).

 

Solution-

 

The terminal symbols in the grammar are a , ( , ) and the comma, together with the end marker $.

We construct the operator precedence table as-

 

        a       (       )       ,       $
a                       >       >       >
(       <       <       =       <
)                       >       >       >
,       <       <       >       >
$       <       <

Operator Precedence Table

 

Parsing Given String-

 

Given string to be parsed is ( a , ( a , a ) ).

We follow the following steps to parse the given string-

 

Step-01:

 

We insert $ symbol at both ends of the string as-

$ ( a , ( a , a ) ) $

 

We insert precedence operators between the string symbols as-

$ < ( < a > , < ( < a > , < a > ) > ) > $

 

Step-02:

 

We scan and parse the string as-

 

$ < ( < a > , < ( < a > , < a > ) > ) > $

$ < ( S , < ( < a > , < a > ) > ) > $

$ < ( S , < ( S , < a > ) > ) > $

$ < ( S , < ( S , S ) > ) > $

$ < ( S , < ( L , S ) > ) > $

$ < ( S , < ( L ) > ) > $

$ < ( S , S ) > $

$ < ( L , S ) > $

$ < ( L ) > $

$ < S > $

$ $

 

Problem-03:

 

Consider the following grammar-

E → E + E | E x E | id

  1. Construct Operator Precedence Parser.
  2. Find the Operator Precedence Functions.

 

Solution-

 

The terminal symbols in the grammar are { + , x , id , $ }

We construct the operator precedence table as-

 

f ↓ \ g →   id      +       x       $
id                  >       >       >
+           <       >       <       >
x           <       >       >       >
$           <       <       <

Operator Precedence Table

 

The graph representing the precedence functions is drawn by placing an edge from g_b to f_a for every relation a < b, and an edge from f_a to g_b for every relation a > b. (The graph figure is omitted here.)

 

 

Here, the longest paths are-

  • f_id → g_x → f_+ → g_+ → f_$
  • g_id → f_x → g_x → f_+ → g_+ → f_$

 

The resulting precedence functions are-

 

        +       x       id      $
f       2       4       4       0
g       1       3       5       0
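
As a quick check (not part of the original solution), the functions can be verified against the operator precedence table: for every pair with a < b we must have f(a) < g(b), and for every pair with a > b we must have f(a) > g(b)-

# Illustrative check; the dictionaries simply restate the table and functions above.
f = {'+': 2, 'x': 4, 'id': 4, '$': 0}
g = {'+': 1, 'x': 3, 'id': 5, '$': 0}

relations = {('id', '+'): '>', ('id', 'x'): '>', ('id', '$'): '>',
             ('+', 'id'): '<', ('+', '+'): '>', ('+', 'x'): '<', ('+', '$'): '>',
             ('x', 'id'): '<', ('x', '+'): '>', ('x', 'x'): '>', ('x', '$'): '>',
             ('$', 'id'): '<', ('$', '+'): '<', ('$', 'x'): '<'}

for (a, b), rel in relations.items():
    assert (f[a] < g[b]) if rel == '<' else (f[a] > g[b]), (a, rel, b)
print('precedence functions agree with the table')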

 


Quadruples, Triples and Indirect Triples

Three Address Code-

 

Before you go through this article, make sure that you have gone through the previous article on Three Address Code.

 

We have discussed-

  • Three Address Code is a form of an intermediate code.
  • It is generated by the compiler for implementing Code Optimization.
  • It uses at most three addresses to represent any statement.

 

In this article, we will discuss Implementation of Three Address Code.

 

Implementation-

 

Three Address Code is implemented as a record with the address fields.

 

The commonly used representations for implementing Three Address Code are-

 

 

  1. Quadruples
  2. Triples
  3. Indirect Triples

 

1. Quadruples-

 

In quadruples representation, each instruction is split into the following four fields-

op, arg1, arg2, result

Here-

  • The op field is used for storing the internal code of the operator.
  • The arg1 and arg2 fields are used for storing the two operands used.
  • The result field is used for storing the result of the expression.

 

Exceptions

 

There are following exceptions-

 

Exception-01:

 

To represent the statement x = op y, we place-

  • op in the operator field
  • y in the arg1 field
  • x in the result field
  • arg2 field remains unused

 

Exception-02:

 

To represent the statement like param t1, we place-

  • param in the operator field
  • t1 in the arg1 field
  • Neither arg2 field nor result field is used

 

Exception-03:

 

To represent the unconditional and conditional jump statements, we place label of the target in the result field.
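
A possible in-memory form of such records is sketched below (the namedtuple layout and the field names are assumptions made only for illustration); it shows the normal binary form together with the three exceptions-

from collections import namedtuple

# Illustrative layout; the record type and field names are assumptions.
Quad = namedtuple('Quad', 'op arg1 arg2 result')

quads = [
    Quad('+',      'b',  'c',  't1'),   # t1 = b + c : normal form, all four fields used
    Quad('uminus', 'y',  None, 'x'),    # x = uminus y : Exception-01, arg2 unused
    Quad('param',  't1', None, None),   # param t1 : Exception-02, arg2 and result unused
    Quad('goto',   None, None, 'L1'),   # jump : Exception-03, target label kept in result
]

for q in quads:
    print(q)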

 

2. Triples-

 

In triples representation,

  • References to the positions of other instructions are used as operands.
  • Temporary variables are not used.

 

3. Indirect Triples-

 

  • This representation is an enhancement over the triples representation.
  • It uses an additional instruction array that lists pointers to the triples in the desired order.
  • Thus, results are referred to through these pointers rather than by position.
  • It allows the optimizer to easily re-order the sub-expressions for producing optimized code.
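
A rough sketch of the two layouts for the statement t = a + b x c is given below (the tuple representation and the use of plain integer positions for references are assumptions)-

# Illustrative layout; operands that are integers denote positions of earlier triples.
triples = [
    ('x', 'b', 'c'),    # (0)  b x c
    ('+', 'a', 0),      # (1)  a + (0), no temporary variable is needed
]

# Indirect triples: a separate instruction list holds pointers (here, indices) into the
# triple table, so instructions can be re-ordered without renumbering the arguments.
instruction_list = [0, 1]

for position in instruction_list:
    print(position, triples[position])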

 

PRACTICE PROBLEMS BASED ON QUADRUPLES, TRIPLES & INDIRECT TRIPLES-

 

Problem-01:

 

Translate the following expression to quadruple, triple and indirect triple-

a + b x c / e ↑ f + b x a

 

Solution-

 

Three Address Code for the given expression is-

 

T1 = e ↑ f

T2 = b x c

T3 = T2 / T1

T4 = b x a

T5 = a + T3

T6 = T5 + T4

 

Now, we write the required representations-

 

Quadruple Representation-

 

Location    Op    Arg1    Arg2    Result
(0)         ↑     e       f       T1
(1)         x     b       c       T2
(2)         /     T2      T1      T3
(3)         x     b       a       T4
(4)         +     a       T3      T5
(5)         +     T5      T4      T6

 

Triple Representation-

 

Location    Op    Arg1    Arg2
(0)         ↑     e       f
(1)         x     b       c
(2)         /     (1)     (0)
(3)         x     b       a
(4)         +     a       (2)
(5)         +     (4)     (3)

 

Indirect Triple Representation-

 

Location    Statement
35          (0)
36          (1)
37          (2)
38          (3)
39          (4)
40          (5)

 

Location    Op    Arg1    Arg2
(0)         ↑     e       f
(1)         x     b       c
(2)         /     (1)     (0)
(3)         x     b       a
(4)         +     a       (2)
(5)         +     (4)     (3)

 

Problem-02:

 

Translate the following expression to quadruple, triple and indirect triple-

a = b x – c + b x – c

 

Solution-

 

Three Address Code for the given expression is-

 

T1 = uminus c

T2 = b x T1

T3 = uminus c

T4 = b x T3

T5 = T2 + T4

a = T5

 

Now, we write the required representations-

 

Quadruple Representation-

 

Location    Op        Arg1    Arg2    Result
(1)         uminus    c               T1
(2)         x         b       T1      T2
(3)         uminus    c               T3
(4)         x         b       T3      T4
(5)         +         T2      T4      T5
(6)         =         T5              a

 

Triple Representation-

 

Location    Op        Arg1    Arg2
(1)         uminus    c
(2)         x         b       (1)
(3)         uminus    c
(4)         x         b       (3)
(5)         +         (2)     (4)
(6)         =         a       (5)

 

Indirect Triple Representation-

 

Location    Statement
35          (1)
36          (2)
37          (3)
38          (4)
39          (5)
40          (6)

 

Location    Op        Arg1    Arg2
(1)         uminus    c
(2)         x         b       (1)
(3)         uminus    c
(4)         x         b       (3)
(5)         +         (2)     (4)
(6)         =         a       (5)

 


Code Optimization | Code Optimization Techniques

Code Optimization-

 

Code Optimization is an approach to enhance the performance of the code.

 

The process of code optimization involves-

  • Eliminating the unwanted code lines
  • Rearranging the statements of the code

 

Advantages-

 

The optimized code has the following advantages-

  • Optimized code has faster execution speed.
  • Optimized code utilizes the memory efficiently.
  • Optimized code gives better performance.

 

Code Optimization Techniques-

 

Important code optimization techniques are-

 

 

  1. Compile Time Evaluation
  2. Common sub-expression elimination
  3. Dead Code Elimination
  4. Code Movement
  5. Strength Reduction

 

1. Compile Time Evaluation-

 

Two techniques that fall under compile time evaluation are-

 

A) Constant Folding-

 

In this technique,

  • As the name suggests, it involves folding the constants.
  • The expressions that contain the operands having constant values at compile time are evaluated.
  • Those expressions are then replaced with their respective results.

 

Example-

 

Circumference of Circle  = (22/7) x Diameter

 

Here,

  • This technique evaluates the expression 22/7 at compile time.
  • The expression is then replaced with its result 3.14.
  • This saves the time at run time.

 

B) Constant Propagation-

 

In this technique,

  • If a variable has been assigned a constant value, then that variable is replaced with its constant value in the rest of the program during compilation.
  • The condition is that the value of the variable must not be altered in between.

 

Example-

 

pi = 3.14

radius = 10

Area of circle = pi x radius x radius

 

Here,

  • This technique substitutes the value of variables ‘pi’ and ‘radius’ at compile time.
  • It then evaluates the expression 3.14 x 10 x 10.
  • The expression is then replaced with its result 314.
  • This saves the time at run time.
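
A combined sketch of the two techniques is given below (the three-address style tuples and the helper name fold_and_propagate are assumptions made for illustration): known constants are first propagated into each statement, and any statement whose operands are then all constants is folded at compile time-

# Illustrative sketch; statements are (target, op, arg1, arg2) and op None means a copy.
OPS = {'+': lambda a, b: a + b, 'x': lambda a, b: a * b, '/': lambda a, b: a / b}

def fold_and_propagate(code):
    consts, out = {}, []
    for target, op, a, b in code:
        a, b = consts.get(a, a), consts.get(b, b)          # constant propagation
        if op is None:                                     # plain copy such as pi = 3.14
            value = a
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            value = OPS[op](a, b)                          # constant folding at compile time
        else:
            out.append((target, op, a, b))                 # cannot fold this statement
            consts.pop(target, None)                       # target is no longer a known constant
            continue
        consts[target] = value
        out.append((target, None, value, None))
    return out

# pi = 3.14 ; radius = 10 ; t = pi x radius ; area = t x radius
print(fold_and_propagate([('pi', None, 3.14, None),
                          ('radius', None, 10, None),
                          ('t', 'x', 'pi', 'radius'),
                          ('area', 'x', 't', 'radius')]))
# area is already 314 (up to floating point rounding) before the program ever runs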

 

2. Common Sub-Expression Elimination-

 

An expression that has already been computed and appears again in the code is called a Common Sub-Expression.

 

In this technique,

  • As the name suggests, it involves eliminating the common sub expressions.
  • The redundant expressions are eliminated to avoid their re-computation.
  • The already computed result is used in the further program when required.

 

Example-

 

Code Before Optimization-

S1 = 4 x i
S2 = a[S1]
S3 = 4 x j
S4 = 4 x i  // Redundant Expression
S5 = n
S6 = b[S4] + S5

Code After Optimization-

S1 = 4 x i
S2 = a[S1]
S3 = 4 x j
S5 = n
S6 = b[S1] + S5
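
A minimal local pass that performs this elimination is sketched below (the tuple layout and operator mnemonics such as 'a[]' are assumptions chosen only to mirror the example above, and the pass assumes that no operand is redefined between the two occurrences)-

# Illustrative sketch; statements are (result, op, arg1, arg2) tuples.
def eliminate_cse(code):
    seen = {}       # (op, arg1, arg2) -> name that already holds this value
    rename = {}     # redundant temporary -> earlier equivalent name
    out = []
    for result, op, a, b in code:
        a, b = rename.get(a, a), rename.get(b, b)
        key = (op, a, b)
        if key in seen:
            rename[result] = seen[key]          # drop the redundant statement
        else:
            seen[key] = result
            out.append((result, op, a, b))
    return out

before = [('S1', 'x', '4', 'i'),
          ('S2', 'a[]', 'S1', None),       # S2 = a[S1]   ('a[]' is a made-up mnemonic)
          ('S3', 'x', '4', 'j'),
          ('S4', 'x', '4', 'i'),           # redundant: 4 x i is already held in S1
          ('S5', 'copy', 'n', None),       # S5 = n
          ('S6', 'b[]+', 'S4', 'S5')]      # S6 = b[S4] + S5
print(eliminate_cse(before))               # S4 disappears and S6 now reads b[S1] + S5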

 

3. Code Movement-

 

In this technique,

  • As the name suggests, it involves movement of the code.
  • The code present inside the loop is moved out if its result does not change across the iterations of the loop.
  • Otherwise, such code unnecessarily gets executed again and again with each iteration of the loop.
  • This leads to the wastage of time at run time.

 

Example-

 

Code Before Optimization-

for ( int j = 0 ; j < n ; j ++ )
{
    x = y + z ;
    a[j] = 6 x j ;
}

Code After Optimization-

x = y + z ;
for ( int j = 0 ; j < n ; j ++ )
{
    a[j] = 6 x j ;
}

 

4. Dead Code Elimination-

 

In this technique,

  • As the name suggests, it involves eliminating the dead code.
  • The statements of the code that never execute, are unreachable, or whose output is never used are eliminated.

 

Example-

 

Code Before Optimization-

i = 0 ;
if ( i == 1 )
{
    a = x + 5 ;
}

Code After Optimization-

i = 0 ;

 

5. Strength Reduction-

 

In this technique,

  • As the name suggests, it involves reducing the strength of expressions.
  • This technique replaces the expensive and costly operators with the simple and cheaper ones.

 

Example-

 

Code Before Optimization-  B = A x 2
Code After Optimization-  B = A + A

 

Here,

  • The expression “A x 2” is replaced with the expression “A + A”.
  • This is because the cost of multiplication operator is higher than that of addition operator.
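
A tiny illustrative sketch of such a replacement is given below; multiplication by 2 becomes an addition, and multiplication by any other power of two becomes a shift, which is also a standard strength reduction-

# Illustrative sketch; a statement is the tuple (op, arg1, arg2).
def reduce_strength(op, a, b):
    if op == 'x' and b == 2:
        return ('+', a, a)                           # B = A x 2  becomes  B = A + A
    if op == 'x' and isinstance(b, int) and b > 0 and b & (b - 1) == 0:
        return ('<<', a, b.bit_length() - 1)         # A x 8 becomes A << 3 (a cheaper shift)
    return (op, a, b)                                # nothing to reduce

print(reduce_strength('x', 'A', 2))    # ('+', 'A', 'A')
print(reduce_strength('x', 'A', 8))    # ('<<', 'A', 3)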

 


Three Address Code | Examples

Three Address Code-

 

Three Address Code is a form of an intermediate code.

 

The characteristics of Three Address instructions are-

  • They are generated by the compiler for implementing Code Optimization.
  • They use at most three addresses to represent any statement.
  • They are implemented as a record with the address fields.

 

General Form-

 

In general, Three Address instructions are represented as-

 

a = b op c

 

Here,

  • a, b and c are the operands.
  • Operands may be constants, names, or compiler generated temporaries.
  • op represents the operator.

 

Examples-

 

Examples of Three Address instructions are-

  • a = b + c
  • c = a x b

 

Common Three Address Instruction Forms-

 

The common forms of Three Address instructions are-

 

1. Assignment Statement-

 

x = y op z and x = op y

 

Here,

  • x, y and z are the operands.
  • op represents the operator.

 

It assigns the result obtained after solving the right side expression of the assignment operator to the left side operand.

 

2. Copy Statement-

 

x = y

 

Here,

  • x and y are the operands.
  • = is an assignment operator.

 

It copies and assigns the value of operand y to operand x.

 

3. Conditional Jump-

 

If x relop y goto X

 

Here,

  • x & y are the operands.
  • X is the tag or label of the target statement.
  • relop is a relational operator.

 

If the condition “x relop y” gets satisfied, then-

  • The control is sent directly to the location specified by label X.
  • All the statements in between are skipped.

 

If the condition “x relop y” fails, then-

  • The control is not sent to the location specified by label X.
  • The next statement appearing in the usual sequence is executed.

 

4. Unconditional Jump-

 

goto X

 

Here, X is the tag or label of the target statement.

 

On executing the statement,

  • The control is sent directly to the location specified by label X.
  • All the statements in between are skipped.

 

5. Procedure Call-

 

param x
call p
return y

 

Here, p is a function which takes x as a parameter and returns y.

 

PRACTICE PROBLEMS BASED ON THREE ADDRESS CODE-

 

To solve the problems, learn about the precedence relations and associativity of operators.

 

Problem-01:

 

Write Three Address Code for the following expression-

a = b + c + d

 

Solution-

 

Since + is left associative, the given expression is solved as ( ( b + c ) + d ).

 

 

Three Address Code for the given expression is-

(1) T1 = b + c

(2) T2 = T1 + d

(3) a = T2

 

Problem-02:

 

Write Three Address Code for the following expression-

-(a x b) + (c + d) – (a + b + c + d)

 

Solution-

 

Three Address Code for the given expression is-

(1) T1 = a x b

(2) T2 = uminus T1

(3) T3 = c + d

(4) T4 = T2 + T3

(5) T5 = a + b

(6) T6 = T3 + T5

(7) T7 = T4 – T6

 

Problem-03:

 

Write Three Address Code for the following expression-

If A < B then 1 else 0

 

Solution-

 

Three Address Code for the given expression is-

(1) If (A < B) goto (4)

(2) T1 = 0

(3) goto (5)

(4) T1 = 1

(5)

 

Problem-04:

 

Write Three Address Code for the following expression-

If A < B and C < D then t = 1 else t = 0

 

Solution-

 

Three Address Code for the given expression is-

(1) If (A < B) goto (3)

(2) goto (4)

(3) If (C < D) goto (6)

(4) t = 0

(5) goto (7)

(6) t = 1

(7)

 

Also Read- More Problems On Three Address Code

 


Directed Acyclic Graphs | DAGs | Examples

Directed Acyclic Graph-

 

Directed Acyclic Graph (DAG) is a special kind of Abstract Syntax Tree.

 

  • Each node of it contains a unique value.
  • It does not contain any cycles in it, hence called Acyclic.

 

Optimization Of Basic Blocks-

 

DAG is a very useful data structure for implementing transformations on Basic Blocks.

 

  • A DAG is constructed for optimizing the basic block.
  • A DAG is usually constructed using Three Address Code.
  • Transformations such as dead code elimination and common sub expression elimination are then applied.

 

Properties-

 

  • Reachability relation forms a partial order in DAGs.
  • Both transitive closure & transitive reduction are uniquely defined for DAGs.
  • Topological Orderings are defined for DAGs.

 

Applications-

 

DAGs are used for the following purposes-

  • To determine the expressions which have been computed more than once (called common sub-expressions).
  • To determine the names whose computation has been done outside the block but which are used inside the block.
  • To determine the statements of the block whose computed value can be made available outside the block.
  • To simplify the list of Quadruples by not executing the assignment instructions x:=y unless they are necessary and eliminating the common sub-expressions.

 

Construction of DAGs-

 

The following rules are used for the construction of DAGs-

 

Rule-01:

 

In a DAG,

  • Interior nodes always represent the operators.
  • Exterior nodes (leaf nodes) always represent the names, identifiers or constants.

 

Rule-02:

 

While constructing a DAG,

  • A check is made to find if there exists any node with the same value.
  • A new node is created only when there does not exist any node with the same value.
  • This action helps in detecting the common sub-expressions and avoiding the re-computation of the same.

 

Rule-03:

 

The assignment instructions of the form x:=y are not performed unless they are necessary.
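
These rules can be sketched as a small value-numbering routine (an illustrative assumption, not the article's code; the copy statements of Rule-03 are left out for brevity): every name points at the DAG node that currently holds its value, and a new interior node is created only when no node with the same operator and children already exists-

# Illustrative sketch; code is a list of (result, op, arg1, arg2) statements.
def build_dag(code):
    nodes = []         # each node: (op or 'leaf', left child index, right child index)
    node_of = {}       # (op, left, right) -> node index
    points_to = {}     # variable name -> index of the node holding its current value

    def node_for(name):
        if name in points_to:                 # the name already labels a DAG node
            return points_to[name]
        key = ('leaf', name, None)            # otherwise it is a leaf node (Rule-01)
        if key not in node_of:
            node_of[key] = len(nodes)
            nodes.append(key)
        return node_of[key]

    for result, op, a, b in code:
        key = (op, node_for(a), node_for(b) if b is not None else None)
        if key not in node_of:                # Rule-02: reuse an existing node if possible
            node_of[key] = len(nodes)
            nodes.append(key)
        points_to[result] = node_of[key]      # attach the result name to that node
    return nodes, points_to

# Using the three address code of Problem-01 below: a + b is built once and then reused.
nodes, labels = build_dag([('T1', '+', 'a', 'b'),
                           ('T2', '+', 'T1', 'c'),
                           ('T3', 'x', 'T1', 'T2')])
print(labels)    # T1, T2 and T3 each label one interior node; a and b are shared leaves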

 

Also Read- Code Optimization

 

PRACTICE PROBLEMS BASED ON DIRECTED ACYCLIC GRAPHS-

 

Problem-01:

 

Consider the following expression and construct a DAG for it-

         ( a + b ) x ( a + b + c )

 

Solution-

 

Three Address Code for the given expression is-

 

T1 = a + b

T2 = T1 + c

T3 = T1 x T2

 

Now, Directed Acyclic Graph is-

 

 

NOTE

 

From the constructed DAG, we observe-

  • The common sub-expression (a+b) has been expressed into a single node in the DAG.
  • The computation is carried out only once and stored in the identifier T1 and reused later.

 

This illustrates how the construction scheme of a DAG identifies the common sub-expression and helps in eliminating its re-computation later.

 

Problem-02:

 

Consider the following expression and construct a DAG for it-

( ( ( a + a ) + ( a + a ) ) + ( ( a + a ) + ( a + a ) ) )

 

Solution-

 

Directed Acyclic Graph for the given expression is-

 

 

Problem-03:

 

Consider the following block and construct a DAG for it-

 

(1) a = b x c

(2) d = b

(3) e = d x c

(4) b = e

(5) f = b + c

(6) g = f + d

 

Solution-

 

Directed Acyclic Graph for the given block is-

 

 

Problem-04:

 

Optimize the block in the Problem-03.

 

Solution-

 

Step-01:

 

Firstly, construct a DAG for the given block (already done above).

 

Step-02:

 

Now, the optimized block can be generated by traversing the DAG.

  • The common sub-expression e = d x c which is actually b x c (since d = b) is eliminated.
  • The dead code b = e is eliminated.

 

The optimized block is-

 

(1) a = b x c

(2) d = b

(3) f = a + c

(4) g = f + d

 

Problem-05:

 

Consider the following basic block-

 

B10:

S1 = 4 x I

S2 = addr(A) – 4

S3 = S2[S1]

S4 = 4 x I

S5 = addr(B) – 4

S6 = S5[S4]

S7 = S3 x S6

S8 = PROD + S7

PROD = S8

S9 = I + 1

I = S9

If I <= 20 goto L10

 

  1. Draw a directed acyclic graph and identify local common sub-expressions.
  2. After eliminating the common sub-expressions, re-write the basic block.

 

Solution-

 

Directed Acyclic Graph for the given basic block is-

 

 

In this code fragment,

  • 4 x I is a common sub-expression. Hence, we can eliminate S4 because S1 = S4.
  • We can optimize S8 = PROD + S7 and PROD = S8 as PROD = PROD + S7.
  • We can optimize S9 = I + 1 and I = S9 as I = I + 1.

 

After eliminating S4, S8 and S9, we get the following basic block-

 

B10:

S1 = 4 x I

S2 = addr(A) – 4

S3 = S2[S1]

S5 = addr(B) – 4

S6 = S5[S1]

S7 = S3 x S6

PROD = PROD + S7

I = I + 1

If I <= 20 goto L10

 
