Programming Language concept: Describing Syntax and Semantics

The General Problem of Describing Syntax: Terminology

•A sentence is a string of characters over some alphabet

•A language is a set of sentences

•A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin)

•A token is a category of lexemes (e.g., identifier)

Formal Definition of Languages

•Recognizers

–A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language

•Generators

–A device that generates sentences of a language

–One can determine if the syntax of a particular sentence is syntactically correct by comparing it to the structure of the generator.

BNF and Context-Free Grammars

•Context-Free Grammars

–Developed by Noam Chomsky in the mid-1950s

–Language generators, meant to describe the syntax of natural languages

–Define a class of languages called context-free languages

•Backus-Naur Form (1959)

–Invented by John Backus to describe the syntax of Algol 58

–BNF is equivalent to context-free grammars

BNF Rules

•An abstraction (or nonterminal symbol) can have more than one RHS

<stmt> ® <single_stmt>

| begin <stmt_list> end

Describing Lists

Syntactic lists are described using recursion

<ident_list> ® ident

| ident, <ident_list>

A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)

An Example Grammar

<program> ® <stmts>

<stmts> ® <stmt> | <stmt> ; <stmts>

<stmt> ® <var> = <expr>

<var> ® a | b | c | d

<expr> ® <term> + <term> | <term> - <term>

<term> ® <var> | const

Static Semantics

•Nothing to do with meaning

•Context-free grammars (CFGs) cannot describe all of the syntax of programming languages

•Categories of constructs that are trouble:

- Context-free, but cumbersome (e.g.,

types of operands in expressions)

- Non-context-free (e.g., variables must

be declared before they are used)

Operational Semantics

•Operational Semantics

–Describe the meaning of a program by executing its statements on a machine, either simulated or actual. The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement

•To use operational semantics for a high-level language, a virtual machine is needed

Denotational Semantics

•Based on recursive function theory

•The most abstract semantics description method

•Originally developed by Scott and Strachey (1970)

•The process of building a denotational specification for a language:

- Define a mathematical object for each language

entity

–Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects

•The meaning of language constructs are defined by only the values of the program's variables

Axiomatic Semantics

•Based on formal logic (predicate calculus)

•Original purpose: formal program verification

•Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions)

•The logic expressions are called assertions.

•Pre-, post form: {P} statement {Q}

•An example

–a = b + 1 {a > 1}

–One possible precondition: {b > 10}

–Weakest precondition: {b > 0}

Summary

•BNF and context-free grammars are equivalent meta-languages

–Well-suited for describing the syntax of programming languages

•An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language

•Three primary methods of semantics description

–Operation, axiomatic, denotational

Reference

Robert W. Sebesta - Concept of Programming Languages (Tenth Edition), Chapter 3

Programming Language concept

Kamis, 28 Desember 2017

Describing Syntax and Semantics