Translator

Lecture



A translator is a program or hardware device that performs program translation. [1] [2]

Program translation is the transformation of a program written in one programming language into a program in another language that is, in a certain sense, equivalent to the first. [1]

A translator usually also diagnoses errors, builds identifier dictionaries, produces a listing of the program text for printing, and so on. [1]

The language in which the input program is written is called the source language, and the program itself is called the source code. The language of the output program is called the target language, and the output program itself is called the object code.

In general, the concept of translation applies not only to programming languages but also to other languages, both formal computer languages (such as markup languages like HTML) and natural languages (Russian, English, etc.). [3] [4]

Content

  • 1 Types of translators
  • 2 Implementations
  • 3 Mixing the concepts of translation and interpretation
  • 4 Notes
  • 5 Literature

Types of translators

Translators are divided into the following types [2]:

  • Interactive. Supports the use of a programming language in time-sharing mode.
  • Syntax-directed (syntax-driven). Receives a description of the syntax and semantics of a language together with a text in that language, and translates the text according to that description.
  • Single-pass. Generates an object module in one sequential pass over the source program.
  • Multi-pass. Generates an object module in several passes over the source program.
  • Optimizing. Performs code optimization in the generated object module.
  • Test. A set of assembly-language macros that make it possible to set up various debugging procedures in programs written in assembly language.
  • Reverse. For a program in machine code, produces an equivalent program in some programming language (see: disassembler, decompiler).
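
The difference between single-pass and multi-pass translators in the list above can be illustrated with a small sketch. It assumes a hypothetical toy assembly language (the `JMP`, `PUSH` and `HALT` mnemonics and the `name:` label syntax are invented for illustration). Because a jump may refer to a label defined later in the text, this sketch reads the source twice: once to collect label addresses and once to emit instructions.

```python
def assemble_two_pass(lines):
    """Translate a toy assembly listing into (opcode, operand) pairs."""
    # Pass 1: record the address of every label (hypothetical "name:" syntax).
    labels = {}
    address = 0
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = address
        elif line:
            address += 1

    # Pass 2: emit instructions, replacing label names with their addresses.
    program = []
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        opcode, _, operand = line.partition(" ")
        operand = operand.strip()
        if operand in labels:
            program.append((opcode, labels[operand]))
        elif operand:
            program.append((opcode, int(operand)))
        else:
            program.append((opcode, None))
    return program


listing = ["JMP end", "PUSH 1", "end:", "HALT"]
program = assemble_two_pass(listing)
```

A single-pass translator would have to handle the forward reference to `end` differently, for example by leaving a placeholder and patching it once the label is seen.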

Implementations

The purpose of translation is to convert text from one language into another that the recipient of the text can understand. For translator programs, the recipient is a technical device (a processor) or an interpreter program.

The processor's language (machine code) is usually low-level. There are platforms that use a high-level language as their machine language (for example, iAPX-432 [5]), but they are exceptions to the rule because of their complexity and cost. A translator that converts programs into machine language that is accepted and executed directly by the processor is called a compiler. [6]

The compilation process usually consists of several stages: lexical, syntactic, and semantic analysis; generation of intermediate code from the results of the analysis; optimization of the intermediate code; and generation of the resulting object code, in this case machine code. In addition, a program usually depends on external infrastructure: services provided by the operating system and third-party libraries (for example, file I/O or a graphical interface), so the program's machine code must be linked to these services and library functions. Linking with static libraries is performed by the linker, also called the link editor (which can be a separate program or part of the compiler), while linking with the operating system and dynamic libraries is performed by the loader when the program starts.
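
As a rough illustration of these stages, the following minimal sketch translates integer arithmetic expressions into instructions for a hypothetical stack machine. The instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and all function names are assumptions made for the example. Semantic analysis, optimization, and linking are omitted, so this shows only lexical analysis, syntactic analysis, and code generation.

```python
import re

TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\S))")
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}  # hypothetical instruction set


def tokenize(text):
    """Lexical analysis: split the source text into (kind, value) tokens."""
    tokens = []
    for number, other in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif other in "+-*()":
            tokens.append(("op", other))
        else:
            raise SyntaxError(f"unexpected character {other!r}")
    return tokens


def parse(tokens):
    """Syntactic analysis: build a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        kind, value = advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def term():
        node = factor()
        while peek() == ("op", "*"):
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree


def generate(tree):
    """Code generation: emit postfix instructions for a stack machine."""
    if tree[0] == "num":
        return [("PUSH", tree[1])]
    op, left, right = tree
    return generate(left) + generate(right) + [(OPCODES[op],)]


def compile_expr(text):
    """Run the stages in order: tokens, then tree, then target code."""
    return generate(parse(tokenize(text)))


target_code = compile_expr("1 + 2 * 3")
```

Each stage consumes the output of the previous one, which is why an error found during lexical or syntactic analysis stops translation before any target code is produced.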

The advantage of a compiler is that the program is compiled once, and no additional conversion is required each time it is executed. Accordingly, no compiler is needed on the target machine for which the program is compiled. The disadvantage is that a separate compilation stage slows down writing and debugging and makes it inconvenient to run small, simple, or one-off programs.

If the source language is an assembly language (a low-level language close to machine language), the compiler for such a language is called an assembler.

Another implementation method is to execute a program with an interpreter, without any translation at all. The interpreter models in software a machine whose fetch-execute cycle works with commands of a high-level language rather than with machine instructions. Such software modeling creates a virtual machine that implements the language. This approach is called pure interpretation. [6] Pure interpretation is used, as a rule, for languages with a simple structure (for example, APL or Lisp). Command-line interpreters that process commands in scripts on UNIX or in batch files (.bat) on MS-DOS also usually work in pure interpretation mode.
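
A minimal sketch of this idea, assuming an invented three-command language (`set`, `add`, `print`): the loop fetches one source line at a time, decodes it, and executes it immediately, with no prior translation step.

```python
def interpret(source):
    """Execute a toy command language line by line, with no prior translation."""
    variables = {}
    output = []
    lines = source.splitlines()
    pc = 0  # index of the next source line to fetch
    while pc < len(lines):
        words = lines[pc].split()  # fetch and decode the current line
        pc += 1
        if not words:
            continue
        command, args = words[0], words[1:]
        if command == "set":       # set NAME INTEGER
            variables[args[0]] = int(args[1])
        elif command == "add":     # add NAME INTEGER
            variables[args[0]] += int(args[1])
        elif command == "print":   # print NAME
            output.append(variables[args[0]])
        else:
            # An invalid line is noticed only when execution reaches it.
            raise RuntimeError(f"line {pc}: unknown command {command!r}")
    return output


result = interpret("set x 2\nadd x 3\nprint x")
```

Because nothing is checked in advance, a line containing an unknown command raises an error only when the loop actually reaches that line.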

The advantage of a pure interpreter is that the absence of intermediate translation steps simplifies the implementation of the interpreter and makes it easier to use, including in interactive mode. The disadvantage is that the interpreter must be available on the target machine where the program is to be executed. As a rule, there is also a more or less significant loss of speed. Finally, errors in an interpreted program are detected only when an attempt is made to execute the command (or line) that contains the error, a property that can be considered both a disadvantage and an advantage.

Between compilation and pure interpretation there are compromise approaches to implementing programming languages, in which the interpreter first translates the program into an intermediate language (for example, bytecode or p-code) that is more convenient to interpret; that is, the interpreter has a built-in translator. This method is called a mixed implementation. [6] An example of a mixed language implementation is Perl. This approach combines the advantages of both the compiler and the interpreter (higher execution speed and ease of use) as well as their disadvantages (additional resources are required to translate and store the program in the intermediate language, and an interpreter must be present on the target machine to execute the program). As with a compiler, a mixed implementation requires that the source code be free of errors (lexical, syntactic, and semantic) before execution.
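
The text names Perl as its example; the sketch below instead uses CPython, which follows the same mixed scheme, and relies only on the built-in `compile` and `exec` functions. The variable names are chosen for illustration. Note that in CPython the translation step reports syntax errors, while some other errors (such as the use of an undefined name) still appear only at run time.

```python
source = "result = (1 + 2) * factor"

# Translation step: the source text is compiled to a code object holding
# bytecode. Syntax errors are reported here, before anything is executed.
code = compile(source, "<example>", "exec")

# Interpretation step: the virtual machine executes the bytecode.
namespace = {"factor": 7}
exec(code, namespace)
print(namespace["result"])


def has_syntax_error(text):
    """Return True if the translation step rejects the source text."""
    try:
        compile(text, "<example>", "exec")
    except SyntaxError:
        return True
    return False
```

The code object can be executed repeatedly without being translated again, which is where the speed gain over pure interpretation comes from.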

As computer resources have grown and heterogeneous networks (including the Internet) connecting computers of different types and architectures have expanded, a new type of interpretation has emerged, in which the source (or intermediate) code is compiled into machine code directly at run time, on the fly. Parts of the code that have already been compiled are cached, so that when they are needed again they receive control immediately, without recompilation. This approach is called dynamic compilation.
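
The caching part of this scheme can be sketched as follows. This is only an analogy: the sketch compiles Python expressions to bytecode with the built-in `compile` function rather than to machine code, and the names `compiled_cache`, `stats`, and `run_fragment` are hypothetical.

```python
compiled_cache = {}      # source text -> compiled code object
stats = {"compiles": 0}  # counts how often real compilation happened


def run_fragment(source, variables):
    """Evaluate an expression, compiling it only on first use."""
    code = compiled_cache.get(source)
    if code is None:
        # First encounter: compile on the fly and remember the result.
        code = compile(source, "<fragment>", "eval")
        compiled_cache[source] = code
        stats["compiles"] += 1
    # Later encounters reuse the cached code object without recompiling.
    return eval(code, {"__builtins__": {}}, variables)


squares_plus_one = [run_fragment("x * x + 1", {"x": x}) for x in range(3)]
```

The fragment is evaluated three times here but compiled only once; a real dynamic compiler applies the same idea to blocks of machine code.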

The advantage of dynamic compilation is that the speed of interpretation of programs becomes comparable to the speed of execution of programs in ordinary compiled languages, while the program itself is stored and distributed in a single form, independent of the target platforms. The disadvantage is greater implementation complexity and greater resource requirements than in the case of simple compilers or pure interpreters.

This method is well suited to web applications. Accordingly, dynamic compilation has appeared and is supported to some extent in implementations of Java, the .NET Framework, Perl, and Python.

Mixing the concepts of translation and interpretation

Translation and interpretation are different processes: translation converts programs from one language into another, while interpretation is responsible for executing programs. However, since the purpose of translation is usually to prepare a program for interpretation, the two processes are usually considered together. For example, programming languages are often described as "compiled" or "interpreted", depending on which prevails when the language is used: compilation or interpretation. Almost all low-level and third-generation programming languages, such as assembly language, C, or Modula-2, are compiled, while higher-level languages, such as Python or SQL, are interpreted.

On the other hand, the processes of translation and interpretation overlap: interpreters may compile (including dynamic compilation), and translators may need interpretation for metaprogramming constructs (for example, macros in assembly language, conditional compilation in C, or templates in C++).

Moreover, the same programming language can be both translated and interpreted, and in both cases the same general stages of analyzing and recognizing the constructs and directives of the source language must be present. This applies to both software and hardware implementations. For example, processors of the x86 family decode machine-language instructions before executing them, extracting from the opcodes the operand fields (registers, memory addresses, immediate values), the operand size, and so on. In Pentium processors with the NetBurst architecture, the same machine code is additionally translated into a sequence of micro-operations before being stored in the internal cache.

created: 2014-10-13
updated: 2021-12-12