Kamis, 28 Desember 2017

Describing Syntax and Semantics

The General Problem of Describing Syntax: Terminology


A sentence is a string of characters over some alphabet
A language is a set of sentences
A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin)
A token is a category of lexemes (e.g., identifier)

Formal Definition of Languages



Recognizers
A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language
Generators
A device that generates sentences of a language
One can determine if the syntax of a particular sentence is syntactically correct by comparing it to the structure of the generator.


BNF and Context-Free Grammars


Context-Free Grammars
Developed by Noam Chomsky in the mid-1950s
Language generators, meant to describe the syntax of natural languages
Define a class of languages called context-free languages
Backus-Naur Form (1959)
Invented by John Backus to describe the syntax of Algol 58
BNF is equivalent to context-free grammars


BNF Rules

An abstraction (or nonterminal symbol) can have more than one RHS
     <stmt> ® <single_stmt>
             | begin <stmt_list> end

Describing Lists

Syntactic lists are described using recursion

    <ident_list® ident
                | ident, <ident_list>
A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)

     An Example Grammar

 <program> ® <stmts>
   <stmts® <stmt> | <stmt> ; <stmts>
   <stmt® <var> = <expr>
   <var® a | b | c | d
   <expr> ® <term> + <term> | <term> - <term>
   <term> ® <var> | const

Static Semantics


Nothing to do with meaning
Context-free grammars (CFGs) cannot describe all of the syntax of programming languages
Categories of constructs that are trouble:
    - Context-free, but cumbersome (e.g.,
        types of operands in expressions)
    - Non-context-free (e.g., variables must
        be declared before they are used)


Operational Semantics



Operational Semantics
Describe the meaning of a program by executing its statements on a machine, either simulated or actual.  The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement
To use operational semantics for a high-level language,  a virtual machine is needed



Denotational Semantics


Based on recursive function theory
The most abstract semantics description method
Originally developed by Scott and Strachey (1970)
The process of building a denotational specification for a language:
    - Define a mathematical object for each language
        entity
Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects
The meaning of language constructs are defined by only the values of the program's variables


Axiomatic Semantics



Based on formal logic (predicate calculus)
Original purpose: formal program verification
Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions)
The logic expressions are called assertions.
Pre-, post form:  {P} statement {Q}
An example
a = b + 1  {a > 1}
One possible precondition: {b > 10}
Weakest precondition:        {b > 0}


Summary



BNF and context-free grammars are equivalent meta-languages
Well-suited for describing the syntax of programming languages
An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language
Three primary methods of semantics description
Operation, axiomatic, denotational



Reference
Robert W. Sebesta - Concept of Programming Languages (Tenth Edition), Chapter 3







Tidak ada komentar:

Posting Komentar