How does Yacc turn a .y file into a .c file?
From .y to .c: Understanding Yacc's Role in Compiler Construction

Explore how Yacc (Yet Another Compiler Compiler) transforms grammar rules defined in a .y file into a C-language parser, a fundamental step in building compilers and interpreters.
Yacc, which stands for "Yet Another Compiler Compiler," is a powerful tool used in compiler construction to generate a parser. A parser's primary role is to take a sequence of tokens (produced by a lexer) and determine whether the sequence conforms to the grammatical rules of a language. This article delves into the process by which Yacc takes a grammar specification in a .y file and generates a C source file (.c) that implements the parser.
The Role of Yacc in Parsing
At its core, Yacc helps automate the creation of the syntax analysis phase of a compiler. Instead of manually writing complex parsing logic, developers define the language's grammar using a Backus-Naur Form (BNF)-like notation within a .y file. Yacc then reads this grammar and produces a C program that can parse input according to those rules. This generated C file contains a function named yyparse(), which is the entry point for the parser.
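To give a feel for the "manual parsing logic" that Yacc saves you from writing, here is a minimal sketch of a hand-written recognizer for a small arithmetic grammar. It uses recursive descent, which is not the technique Yacc uses (Yacc emits a table-driven LALR(1) parser), and it reads characters directly instead of calling a lexer. The grammar and the names recognize, parse_expr, parse_term, and parse_factor are assumptions made for illustration only.

```c
#include <ctype.h>

/* Hand-written recursive-descent recognizer (illustrative only).
 * Hypothetical grammar, with precedence encoded in the rule layering:
 *   expr   : term   (('+' | '-') term)*
 *   term   : factor (('*' | '/') factor)*
 *   factor : digits | '(' expr ')'
 */
static const char *cur; /* next unread character of the input */

static void skip_spaces(void) {
    while (isspace((unsigned char)*cur)) cur++;
}

static int parse_expr(void);

/* factor: a run of digits or a parenthesized expression */
static int parse_factor(void) {
    skip_spaces();
    if (isdigit((unsigned char)*cur)) {
        while (isdigit((unsigned char)*cur)) cur++;
        return 1;
    }
    if (*cur == '(') {
        cur++;
        if (!parse_expr()) return 0;
        skip_spaces();
        if (*cur != ')') return 0;
        cur++;
        return 1;
    }
    return 0;
}

/* term: factors joined by '*' or '/' */
static int parse_term(void) {
    if (!parse_factor()) return 0;
    for (;;) {
        skip_spaces();
        if (*cur != '*' && *cur != '/') return 1;
        cur++;
        if (!parse_factor()) return 0;
    }
}

/* expr: terms joined by '+' or '-' */
static int parse_expr(void) {
    if (!parse_term()) return 0;
    for (;;) {
        skip_spaces();
        if (*cur != '+' && *cur != '-') return 1;
        cur++;
        if (!parse_term()) return 0;
    }
}

/* Returns 1 if the whole string is a valid expression, 0 otherwise. */
int recognize(const char *input) {
    cur = input;
    if (!parse_expr()) return 0;
    skip_spaces();
    return *cur == '\0';
}
```

Calling recognize("1 + 2 * (3 - 4)") accepts the input, while recognize("1 +") rejects it. Every new operator or precedence level means another hand-written function, whereas in a Yacc grammar the same change is usually a rule plus a %left or %right declaration.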
Grammar definition (.y file) --> Yacc --> generated parser source (y.tab.c or <name>.tab.c) --> C compiler (e.g., GCC) --> executable parser
Input tokens (from the lexer) --> executable parser --> parse tree / actions
The Yacc Parser Generation Workflow
Structure of a .y File
A Yacc grammar file (.y) is divided into three main sections, separated by %% delimiters. These sections hold declarations, grammar rules, and auxiliary C code, respectively. Understanding this structure is crucial for writing effective Yacc specifications.
/* Declarations section */
%{
#include <stdio.h>
#include <stdlib.h>
extern int yylex();
extern int yyerror(const char *s);
%}
%token NUMBER ID
%left '+' '-'
%left '*' '/'
%%
/* Rules section */
program: /* empty */
       | program statement
       ;
statement: expression ';'
         | ID '=' expression ';'
         ;
expression: NUMBER
          | ID
          | expression '+' expression
          | expression '-' expression
          | expression '*' expression
          | expression '/' expression
          | '(' expression ')'
          ;
%%
/* Auxiliary C code section */
int main() {
    printf("Enter an expression:\n");
    return yyparse();
}
int yyerror(const char *s) {
    fprintf(stderr, "Error: %s\n", s);
    return 0;
}
Example of a simple Yacc (.y) grammar file
The %{ %} block in the declarations section allows you to embed C code directly into the generated .c file. This is where you typically include header files, define global variables, or declare functions like yylex() and yyerror().
Generating and Compiling the Parser
Once you have your .y file, the process of turning it into an executable parser involves two main steps: running Yacc to generate the C source code, and then compiling that C code with a standard C compiler (like GCC). You'll also need a lexer (often generated by Flex) to provide tokens to the parser.
1. Run Yacc
Execute Yacc on your grammar file. This produces a C source file. Traditional Yacc (and Bison run with -y) names it y.tab.c, while Bison by default names it after the grammar file, for example parser.tab.c for parser.y. With the -d option, it also generates a header file (y.tab.h) containing the token definitions.
2. Generate Lexer (Flex)
If you're using Flex for lexical analysis, run Flex on your .l file to generate lex.yy.c. This file contains the yylex() function that the Yacc-generated parser calls to get tokens.
3. Compile the C files
Compile both the Yacc-generated C file (y.tab.c) and the Flex-generated C file (lex.yy.c) together using a C compiler. Link them to create your final executable parser.
# Assuming you have 'grammar.y' and 'lexer.l'
# 1. Run Yacc
yacc -d grammar.y
# This generates: y.tab.c (parser source) and y.tab.h (token definitions)
# 2. Run Flex
flex lexer.l
# This generates: lex.yy.c (lexer source)
# 3. Compile and link
gcc -o myparser y.tab.c lex.yy.c -lfl
# -lfl links against the Flex library (if needed)
# 4. Run the parser
./myparser
Command-line steps to generate and compile a Yacc/Flex parser
The -d option tells Yacc to also generate a header file (y.tab.h) that defines the token numbers. The lexer source (lexer.l, which becomes lex.yy.c) should include this header so that yylex() returns the token codes the parser expects.
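To make the token contract between lexer and parser concrete, here is a minimal sketch of a hand-written yylex() that could stand in for the Flex-generated one for a grammar using NUMBER and ID tokens. The numeric values of NUMBER and ID below are assumptions; in a real build they come from y.tab.h and should not be hard-coded. The lex_set_input() helper and the lex_input buffer are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-ins for the token codes that `yacc -d` writes to y.tab.h.
 * The exact numbers are an assumption; real values are assigned by the tool. */
enum { NUMBER = 258, ID = 259 };

static const char *lex_input = ""; /* hypothetical input buffer */

/* Point the lexer at a NUL-terminated string (helper for this sketch). */
void lex_set_input(const char *text) {
    assert(text != NULL);
    lex_input = text;
}

/* Same contract the parser expects from any yylex(): return a named token
 * code, the character itself for single-character tokens, or 0 at end of
 * input. */
int yylex(void) {
    while (isspace((unsigned char)*lex_input)) lex_input++;

    if (*lex_input == '\0') return 0; /* end of input */

    if (isdigit((unsigned char)*lex_input)) {
        while (isdigit((unsigned char)*lex_input)) lex_input++;
        return NUMBER;
    }

    if (isalpha((unsigned char)*lex_input) || *lex_input == '_') {
        while (isalnum((unsigned char)*lex_input) || *lex_input == '_')
            lex_input++;
        return ID;
    }

    /* Anything else ('+', '=', ';', ...) is returned as its own code,
     * matching the quoted character literals used in the grammar rules. */
    return (unsigned char)*lex_input++;
}
```

After lex_set_input("x = 42;"), successive calls to yylex() yield ID, '=', NUMBER, ';', and then 0. If the lexer and parser disagree on these codes, for instance because the lexer was compiled against a stale header, the parser will report syntax errors on valid input.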