Lexical and syntactical analysis can be simplified to a machine that takes in some program code and then returns syntax errors, parse trees and data structures. We can think of this as a process of description transformation, where we take some source description, apply a transformation technique and end up with a target description - in effect a mapping between two equivalent languages, where the destination is machine executable.

Compilers are an important part of computer science, even if you never write a compiler, as the concepts are used regularly in interpreters, intelligent editors, source code debugging, natural language processing (e.g., for AI) and for things such as XML.

Throughout the years, progress has been made in the field of programming to bring higher and higher levels of abstraction, moving from machine code, to assembler, to high-level languages, and now to object orientation, reusable languages and virtual machines.

LSA only deals with the front-end of the compiler; next year's module, CGO, deals with the back-end. The front-end of a compiler only analyses the program, it does not produce code.

From source code, lexical analysis produces tokens, the words in a language, which are then parsed to produce a syntax tree, checking that the tokens conform to the rules of the language. Semantic analysis is then performed on the syntax tree to produce an annotated tree. In addition to this, a literal table, which contains information on the strings and constants used in the program, and a symbol table, which stores information on the identifiers occurring in the program (e.g., variable names, constant names, procedure names, etc.), are produced by the various stages of the process. An error handler also exists to catch any errors generated by any of the stages (e.g., a syntax error from a poorly formed line). The syntax tree forms an intermediate representation of the code structure, and has links to the symbol table.

From the annotated tree, intermediate code generation produces intermediate code (e.g., code suitable for a virtual machine, or pseudo-assembler), and then the final code generation stage produces target code, whilst also referring to the literal and symbol tables and using another error handler. The target code could be assembly, which is then passed to an assembler, or it could be machine code directly. Optimisation is then applied to the target code.

We can consider the front-end as a two-stage process: lexical analysis and syntactic analysis.

Lexical analysis is the extraction of individual words, or lexemes, from an input stream of symbols, and the passing of corresponding tokens back to the parser. If we consider a statement in a programming language, we need to be able to recognise the small syntactic units (tokens) and pass this information to the parser.

We also need to store the various attributes in the symbol or literal tables for later use, e.g., if we have a variable, the tokeniser would generate the token var and then associate the name of the variable with it in the symbol table - in this case, the variable name is the lexeme.

Other roles of the lexical analyser include the removal of whitespace and comments, and the handling of compiler directives (i.e., acting as a preprocessor).
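To make these roles concrete, below is a minimal tokeniser sketch, assuming an invented toy language - the token names, regular expressions and table layouts are illustrative choices, not part of the notes. It skips whitespace and comments, records identifier lexemes in a symbol table and constants in a literal table, and passes the remaining tokens on to the parser.

```python
import re

# Hypothetical token specification for a toy language (illustration only).
TOKEN_SPEC = [
    ("WHITESPACE", r"\s+"),           # skipped, never passed to the parser
    ("COMMENT",    r"#[^\n]*"),       # skipped, never passed to the parser
    ("NUMBER",     r"\d+"),           # literal: recorded in the literal table
    ("ID",         r"[A-Za-z_]\w*"),  # identifier: lexeme recorded in the symbol table
    ("ASSIGN",     r"="),
    ("PLUS",       r"\+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenise(source, symbol_table, literal_table):
    """Yield (token, lexeme) pairs, filling the tables as a side effect."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # Hook for the error handler shared by the compiler stages.
            raise SyntaxError(f"unrecognised symbol at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("WHITESPACE", "COMMENT"):
            continue                     # removal of whitespace and comments
        if kind == "ID":
            symbol_table.setdefault(lexeme, {"name": lexeme})
        elif kind == "NUMBER":
            literal_table.setdefault(lexeme, int(lexeme))
        yield kind, lexeme

symbols, literals = {}, {}
print(list(tokenise("count = count + 1  # increment", symbols, literals)))
# [('ID', 'count'), ('ASSIGN', '='), ('ID', 'count'), ('PLUS', '+'), ('NUMBER', '1')]
print(symbols)   # {'count': {'name': 'count'}}
```

In practice a lexer like this would usually be generated from the regular expressions by a tool such as lex/flex rather than written by hand, but the division of labour between tokeniser, tables and parser is the same.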
Languages are a potentially infinite set of strings (sometimes called sentences), each a sequence of symbols from a given alphabet. The tokens of each sentence are ordered according to some structure. A grammar is a set of rules that describe a language; grammars assign structure to a sentence. An automaton is an algorithm that can recognise (accept) all sentences of a language and reject those which do not belong to it.
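As a sketch of how a set of rules describes a language and assigns structure to a sentence, here is a toy grammar (invented for illustration; it is not from the notes) together with a function that derives a sentence as a structured tree:

```python
import random

# A toy grammar: each nonterminal maps to a list of alternative
# right-hand sides (productions). Invented purely for illustration.
GRAMMAR = {
    "sentence": [["noun", "verb"]],
    "noun":     [["dog"], ["cat"]],
    "verb":     [["runs"], ["sleeps"]],
}

def derive(symbol):
    """Expand a symbol into a (symbol, children) tree; terminals are leaves."""
    if symbol not in GRAMMAR:
        return symbol                            # a terminal symbol of the alphabet
    production = random.choice(GRAMMAR[symbol])  # pick one rule for this symbol
    return (symbol, [derive(part) for part in production])

print(derive("sentence"))
# e.g. ('sentence', [('noun', ['dog']), ('verb', ['sleeps'])])
```

This particular language is finite, but recursive rules (a nonterminal whose production mentions itself) are what let a finite set of rules describe a potentially infinite set of sentences.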
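And as a sketch of the automaton idea, below is a small deterministic finite automaton that accepts the identifier language assumed in the tokeniser above (a letter followed by letters or digits); the state names and transition table are again illustrative choices.

```python
def classify(symbol):
    """Map a raw input symbol onto the DFA's input alphabet."""
    if symbol.isalpha():
        return "letter"
    if symbol.isdigit():
        return "digit"
    return "other"

# TRANSITIONS[state][symbol class] -> next state; missing entries mean reject.
TRANSITIONS = {
    "start":         {"letter": "in_identifier"},
    "in_identifier": {"letter": "in_identifier", "digit": "in_identifier"},
}
ACCEPTING = {"in_identifier"}

def accepts(sentence):
    """Return True if the automaton accepts the sentence, rejecting all others."""
    state = "start"
    for symbol in sentence:
        state = TRANSITIONS.get(state, {}).get(classify(symbol))
        if state is None:
            return False              # no valid transition: reject
    return state in ACCEPTING

print(accepts("count1"))   # True  - a sentence of the identifier language
print(accepts("1count"))   # False - rejected, does not belong to the language
```

Recognisers of exactly this shape are what a lexical analyser runs under the hood when it extracts lexemes from the input stream.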