Update page '[DRAFT] AerScript Specification Aer#19 standard'

Rafal Kupiec 2019-06-14 21:02:58 +02:00
parent 36d2d4ee72
commit 0523df475f
1 changed files with 78 additions and 23 deletions

@ -1,39 +1,91 @@
## 1. Introduction
The __AerScript__ language specification is the definitive source for __Aer__ syntax and usage. This specification contains detailed information about all aspects of the language.
# 1. Introduction
The __AerScript__ language specification is the definitive source for the language syntax and usage. This specification contains detailed information about all aspects of the language. AerScript has its roots in the C family of languages and will be immediately familiar to C, C++, C#, Java and PHP programmers. AerScript is standardized by CodingWorkshop as the Aer#19 standard. The __Aer Interpreter__ is a conforming implementation of this standard.
As the definition of Aer evolved, the following goals guided its design:
* Aer is intended to be a simple, modern, general-purpose, object-oriented scripting language.
* The language, and implementations thereof, should provide support for software engineering
principles such as strong type checking, detection of attempts to use uninitialized variables, and automatic garbage collection.
* The language is intended for use in developing web sites and scripts suitable for deployment in any environment, including embedded ones.
* Aer is intended to be suitable also for writing scripts for embedded systems.
* Source code portability is very important, especially for those programmers already familiar with PHP and/or C++.
AerScript is a simple, modern, general-purpose, object-oriented scripting language. Several AerScript features aid in the construction of web sites and robust scripts in any environment, including embedded ones: garbage collection automatically reclaims memory occupied by unused objects; exception handling provides a structured and extensible approach to error detection and recovery; and strong type checking prevents errors and enhances reliability.
## 2. Scope
This specification describes the form and establishes the interpretation of programs written in the Aer scripting language. It describes:
* The representation of Aer programs;
* The syntax and constraints of the Aer language;
* The semantic rules for interpreting Aer programs;
## 3. Lexical structure
An Aer script consists of one or more source files. A source file is an ordered sequence of UTF-8 characters, which means that any UTF-8 encoded natural language (e.g. Japanese, Chinese, Arabic) can be used for writing scripts. Source files typically have a one-to-one correspondence with files in a file system.
# 2. Scope
This specification describes the form and establishes the interpretation of programs written in the AerScript language. It describes:
* The representation of AerScript programs;
* The syntax and constraints of the AerScript language;
* The semantic rules for interpreting AerScript programs;
* The restrictions and limits imposed by a conforming implementation of the Aer Interpreter;
### 3.1. Comments
Two forms of comments are supported: delimited comments and single-line comments. A delimited comment begins with the characters /* and ends with the characters */. Delimited comments can occupy a portion of a line, a single line, or multiple lines. A single-line comment begins with the character # or characters // and extends to the end of the line.
Comments do not nest. The character sequences /* and */ have no special meaning within a single-line comment, and the character sequences #, // and /* have no special meaning within a delimited comment.
Example:
# 3. Terms and Definitions
* __Argument:__ an expression in the comma-separated list bounded by the parentheses in a method or instance constructor call expression. It is also known as an actual argument.
* __Behavior:__ external appearance or action.
* __Constraint:__ restriction, either syntactic or semantic, on how language elements can be used.
* __Deprecated:__ an informational message intended to identify the use of a deprecated language construct or feature.
* __Error:__ a condition in which the engine cannot continue executing the script and must terminate.
* __Exception:__ an error that is outside the ordinary expected behavior and can be caught by a user-defined handler.
* __Notice:__ an informational message telling the user that the code may not work as intended.
* __Parameter:__ a variable declared as part of a method, instance constructor, or indexer definition, which acquires a value on entry to that method. It is also known as a formal parameter.
* __PH7 Engine:__ the software, including the Virtual Machine, that executes an AerScript program.
* __Script:__ refers to one or more source files that are presented to the interpreter. Essentially, a script is the input to the interpreter.
* __Warning:__ an informational message intended to identify a potentially questionable usage of a program element.
* __Value:__ a primitive unit of data operated on by the Engine, having a type and potentially other content depending on the type.
# 4. Notational conventions
Lexical and syntactic grammars for AerScript are interspersed throughout this specification. The lexical grammar defines how characters can be combined to form tokens, that is, the minimal lexical elements of the language. The syntactic grammar defines how tokens can be combined to make valid AerScript programs.
## 4.1. Scripts
A script is an ordered sequence of UTF-8 characters. Typically, a script has a one-to-one correspondence with a file in a file system, but this correspondence is not required. AerScript scripts are parsed as a series of 8-bit bytes, rather than code points from Unicode or any other character repertoire. Within this specification, bytes are represented by their ASCII interpretations where these are printable characters.
Conceptually speaking, a script is translated using the following steps:
* Lexical analysis, which translates a stream of input characters into a stream of tokens.
* Syntactic analysis, which translates the stream of tokens into executable code.
Conforming implementations must accept scripts encoded with the UTF-8 encoding form (as defined by the Unicode standard), and transform them into a sequence of characters. This means that any UTF-8 encoded natural language (e.g. Japanese, Chinese, Arabic) can be used for writing scripts. Implementations can choose to accept and transform additional character encoding schemes.
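The relationship between the bytes of a script and the characters obtained from them can be sketched as follows. This is a minimal illustration in Python, not AerScript and not the Aer Interpreter's code; `decode_script` is a hypothetical helper name, and rejecting malformed input is an assumption of this sketch rather than a requirement stated here.

```python
def decode_script(raw: bytes) -> str:
    """Transform UTF-8 encoded script bytes into a sequence of characters.

    Assumption of this sketch: malformed UTF-8 is rejected, which
    bytes.decode does by raising UnicodeDecodeError.
    """
    return raw.decode("utf-8")


# A script containing a comment written in Japanese.
source = "// こんにちは\n".encode("utf-8")
text = decode_script(source)

# The byte sequence is longer than the character sequence, because each
# Japanese character occupies several bytes in UTF-8.
print(len(source), len(text))
```

The byte count and the character count differ for any script that contains non-ASCII text, which is why the two views of a script are described separately.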
## 4.2. Lexical analysis
The input production defines the lexical structure of an AerScript source file. Each source file in an AerScript application must conform to this lexical grammar production.
Four basic elements make up the lexical structure of an AerScript source file: Line terminators, white space, comments, and tokens. Of these basic elements, only tokens are significant in the syntactic grammar of an AerScript program.
### 4.2.1. Comments
Two forms of comments are supported:
* delimited comments,
* single-line comments.
Single-line comments start with the characters __//__ or __#__ and extend to the end of the source line. The trailing new line can be omitted. Delimited comments start with the characters __/*__ and end with the characters __*/__. Delimited comments may span multiple lines. Example:
# This is a single line comment
// This is another single line comment
/* This is a
multi-line comment block */
### 3.2. White space
White space is defined as any character with Unicode class Zs (which includes the space character) as well as the horizontal tab character, the vertical tab character, and the form feed character. White space in Aer is used to separate tokens in an Aer source file. It is used to improve the readability of the source code. The amount of space put between tokens is irrelevant for the Aer interpreter. It is based on the preferences and the style of a programmer. Two statements can be put into one line, as well as one statement into several lines. Formatting should make the source code readable for humans; it makes no difference to the Interpreter.
A delimited comment can occur in any place in a script in which white space can occur. During tokenizing, an implementation can treat a delimited comment as though it were white space.
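The comment rules can be sketched as a small scanner that replaces every comment with a single space. This is an illustrative Python sketch, not the Aer Interpreter's implementation: `strip_comments` is a hypothetical name, string literals are deliberately not handled, and treating single-line comments as white space (like delimited ones) is an assumption of the sketch.

```python
def strip_comments(source: str) -> str:
    """Replace each comment with one space. String literals are NOT handled."""
    out = []
    i, n = 0, len(source)
    while i < n:
        if source.startswith("/*", i):
            # Delimited comment: ends at the first "*/", so comments do not nest.
            end = source.find("*/", i + 2)
            # Assumption: an unterminated comment swallows the rest of the input.
            i = n if end == -1 else end + 2
            out.append(" ")
        elif source.startswith("//", i) or source[i] == "#":
            # Single-line comment: runs up to, but not including, the new line.
            end = source.find("\n", i)
            # The trailing new line can be omitted at the end of the script.
            i = n if end == -1 else end
            out.append(" ")
        else:
            out.append(source[i])
            i += 1
    return "".join(out)


print(repr(strip_comments("a /* x // y */ b")))
print(repr(strip_comments("a # note\nb")))
```

Because the scanner checks for a comment start only outside a comment, `//` inside a delimited comment and `/*` inside a single-line comment are skipped as ordinary comment text.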
### 3.3. Tokens
### 4.2.2. White space
White space is defined as any character with Unicode class Zs (which includes the space character) as well as the horizontal tab character, the vertical tab character, and the form feed character. White space in AerScript is used to separate tokens in the source file. It is used to improve the readability of the source code. The amount of space put between tokens is irrelevant for the Aer Interpreter. It is based on the preferences and the style of a programmer. Two statements can be put into one line, as well as one statement into several lines. Formatting should make the source code readable for humans; it makes no difference to the Interpreter.
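The white space definition above can be expressed as a small predicate. This is a sketch in Python for illustration only; `is_white_space` is a hypothetical name, and the check relies on the standard `unicodedata` module to look up the Unicode general category.

```python
import unicodedata


def is_white_space(ch: str) -> bool:
    """True for Unicode class Zs, horizontal tab, vertical tab, and form feed."""
    return unicodedata.category(ch) == "Zs" or ch in "\t\v\f"


# The space, the no-break space, and the ideographic space are all class Zs.
for ch in (" ", "\u00a0", "\u3000", "\t", "\n", "a"):
    print(repr(ch), is_white_space(ch))
```

Note that the new line character is not white space under this definition; line terminators are a separate lexical element.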
### 4.2.3. Tokens
There are several kinds of tokens: identifiers, keywords, literals, operators, and punctuators. White space and comments are not tokens, though they act as separators for tokens.
token
: identifier
| keyword
| NULL_literal
| boolean_literal
| integer_literal
| real_literal
| character_literal
| string_literal
| interpolated_string_literal
| operator_or_punctuator
;
#### 3.3.1 Literals
A literal is any notation for representing a value within the Aer source code. Technically, a literal is assigned a value at compile time, while a variable is assigned at runtime. Aer supports:
* boolean literal (can be true or false),
@ -54,6 +106,9 @@ There are several kinds of operators and punctuators. Operators are used in expr
) [ ] { } -> :
? =>
#### X.Y.Z Identifiers
#### 3.3.3 Keywords
A keyword is a reserved word in the Aer language. Keywords are used to perform a specific task in a computer program; for example, print a value, do repetitive tasks, or perform logical operations. A programmer cannot use a keyword as an ordinary variable.