The General Problem of Describing Syntax: Terminology
•A
sentence
is a
string of characters over some alphabet
•A
language is a set of sentences
•A lexeme is the lowest level syntactic unit of a
language (e.g., *,
sum,
begin)
•A
token
is a
category of lexemes (e.g., identifier)
Formal Definition of Languages
•Recognizers
–A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language
•Generators
–A device that generates sentences of a language
–One can determine if the syntax of a particular sentence is syntactically correct by comparing it to the structure of the generator.
BNF and Context-Free Grammars
•Context-Free Grammars
–Developed
by Noam Chomsky in the mid-1950s
–Language
generators, meant to describe the syntax of natural languages
–Define
a class of languages called context-free languages
•Backus-Naur Form (1959)
–Invented
by John Backus to describe the syntax of Algol 58
–BNF
is equivalent to context-free grammars
BNF Rules
•An abstraction (or nonterminal symbol)
can have more than one RHS
<stmt>
®
<single_stmt>
| begin <stmt_list>
end
Describing Lists
Syntactic lists are described using recursion
<ident_list> ® ident
| ident, <ident_list>
A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
An Example Grammar
<program> ® <stmts>
<stmts> ® <stmt> | <stmt> ; <stmts>
<stmt> ® <var> = <expr>
<var> ® a | b | c | d
<expr> ® <term> + <term> | <term> - <term>
<term> ® <var> | const
Static Semantics
•Nothing to do with meaning
•Context-free grammars (CFGs) cannot
describe all of the syntax of programming languages
•Categories of constructs that are
trouble:
- Context-free, but cumbersome (e.g.,
types of operands in expressions)
- Non-context-free (e.g., variables must
be declared before they are used)
Operational Semantics
•Operational Semantics
–Describe
the meaning of a program by executing its statements on a machine, either
simulated or actual. The change in the
state of the machine (memory, registers, etc.) defines the meaning of the
statement
•To use operational semantics for
a high-level language, a virtual machine
is needed
Denotational Semantics
•Based on recursive function theory
•The most abstract semantics description
method
•Originally developed by Scott and
Strachey (1970)
•The process of building a denotational
specification for a language:
- Define a mathematical object for each language
entity
–Define
a function that maps instances of the language entities onto instances of the
corresponding mathematical objects
•The meaning of language constructs are
defined by only the values of the program's variables
Axiomatic Semantics
•Based on formal logic (predicate
calculus)
•Original purpose: formal program
verification
•Axioms or inference rules are defined for
each statement type in the language (to allow transformations of logic
expressions into more formal logic expressions)
•The logic expressions are called assertions.
•Pre-, post form: {P} statement {Q}
•An example
–a = b
+ 1 {a > 1}
–One
possible precondition: {b > 10}
–Weakest
precondition: {b
> 0}
Summary
•BNF and context-free grammars are
equivalent meta-languages
–Well-suited
for describing the syntax of programming languages
•An attribute grammar is a descriptive
formalism that can describe both the syntax and the semantics of a language
•Three primary methods of semantics
description
–Operation,
axiomatic, denotational
Reference
Robert W. Sebesta - Concept of Programming Languages (Tenth Edition), Chapter
3
Tidak ada komentar:
Posting Komentar