2025-07-23

Map of territory

  • scanning, also called lexing or lexical analysis, converts the program's source string into tokens.
  • parsing applies the language's grammar to the tokens. It builds abstract syntax trees (ASTs), or just trees.
  • front end: the stages that turn source code into the intermediate representation. They are called the front end because they deal with the source code, the front part of the pipeline.
  • back end: takes the intermediate representation and converts it to native code for the target architecture.
  • IR (intermediate representation): the form of the code that sits between the front end and the back end.
  • optimization: replacing parts of the user's code with more efficient code that behaves the same.
    • constant folding: an expression made up only of constants is evaluated at compile time and replaced with its result. For example,
pennyArea = 3.14159 * (0.75 / 2) * (0.75 / 2);
// can be
pennyArea = 0.4417860938;
  • after optimization, the code must be converted to a form that can run on a machine. There are two ways to do that:
    • either convert it to native machine code, which runs faster but is more complex to generate.
    • or create a virtual machine (a hypothetical CPU) on which the code runs, similar to Java.
  • runtime: provides the services a program needs while it runs, for example a garbage collector. For a fully compiled language, the runtime is inserted directly into the executable. For a language that runs in an interpreter or VM, the runtime lives there.
  • shortcuts for designing compilers
    • single-pass compilers: there is no pipeline of separate compilation stages. The output code is produced during parsing itself.
    • tree-walk interpreters: traverse the AST and execute each node as it is visited. Good for designing simple languages.
    • transpiler, or source-to-source compiler: compiles the source language into another language that already has compiler tools. For example, writing a front end (transpiler) for a custom language that converts it to C, which already has a compiler toolchain.
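    • JIT (just-in-time compilation): compiles the code to native machine code at run time instead of ahead of time.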
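
The stages above can be sketched end to end for a tiny arithmetic language. This is only a rough illustration in Python, not how any real compiler is organized: the token kinds, the tuple-based AST, and the function names scan, parse, fold and evaluate are all made up for these notes. It shows scanning, parsing into a tree, constant folding on the tree, and a tree-walk interpreter.

```python
import operator
import re

# Hypothetical operator table shared by the folder and the interpreter.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def scan(source):
    """Scanning: split the source string into (kind, text) tokens.
    Characters that match no pattern are silently skipped in this sketch."""
    tokens = []
    for text in re.findall(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/()]", source):
        if text[0].isdigit():
            kind = "NUMBER"
        elif text[0].isalpha() or text[0] == "_":
            kind = "NAME"
        else:
            kind = "OP"
        tokens.append((kind, text))
    return tokens

def parse(tokens):
    """Parsing: recursive descent over the tokens, producing a tuple AST.
    Nodes are ("num", value), ("var", name) or ("bin", op, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def primary():
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", float(text))
        if kind == "NAME":
            return ("var", text)
        if text == "(":
            node = expression()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def binary(operand, ops):
        # Left-associative chain of binary operators at one precedence level.
        node = operand()
        while peek()[0] == "OP" and peek()[1] in ops:
            op = advance()[1]
            node = ("bin", op, node, operand())
        return node

    def term():
        return binary(primary, {"*", "/"})

    def expression():
        return binary(term, {"+", "-"})

    tree = expression()
    if peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing tokens")
    return tree

def fold(node):
    """Constant folding: replace operations on two constants with the result."""
    if node[0] != "bin":
        return node
    _, op, left, right = node
    left, right = fold(left), fold(right)
    divides_by_zero = op == "/" and right == ("num", 0.0)
    if left[0] == "num" and right[0] == "num" and not divides_by_zero:
        return ("num", OPS[op](left[1], right[1]))
    return ("bin", op, left, right)

def evaluate(node, env):
    """Tree-walk interpreter: visit each node and compute its value."""
    if node[0] == "num":
        return node[1]
    if node[0] == "var":
        return env[node[1]]
    _, op, left, right = node
    return OPS[op](evaluate(left, env), evaluate(right, env))
```

For example, fold(parse(scan("3.14159 * (0.75 / 2) * (0.75 / 2)"))) collapses the whole pennyArea expression into a single "num" node, while fold(parse(scan("r * (2 + 3)"))) only folds the (2 + 3) part because r is not known at compile time.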

See also

  1. https://llvm.org/
  2. https://emscripten.org/
  3. https://github.com/jashkenas/coffeescript/wiki/list-of-languages-that-compile-to-js

2025-07-28

Lox language

  • memory is managed using one of two techniques: reference counting or tracing garbage collection (or just garbage collection).
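
The difference between the two techniques can be sketched with a toy heap. This is a simplified model in Python, not how Lox or any real runtime implements it: the Obj class and the retain, add_ref, release and collect helpers are hypothetical names made up for these notes. The collector shown is a basic mark and sweep, one common form of tracing collection.

```python
class Obj:
    """A hypothetical heap object with an explicit reference count."""
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other Obj instances
        self.refcount = 0    # number of incoming references
        self.freed = False

def retain(obj):
    """A root (for example a variable) starts pointing at obj."""
    obj.refcount += 1

def add_ref(source, target):
    """source starts pointing at target."""
    source.refs.append(target)
    target.refcount += 1

def release(obj):
    """Reference counting: drop one reference. At zero, free the object
    and release everything it points to."""
    obj.refcount -= 1
    if obj.refcount == 0 and not obj.freed:
        obj.freed = True
        for child in obj.refs:
            release(child)

def collect(heap, roots):
    """Tracing collection (mark and sweep): mark everything reachable
    from the roots, then free the rest. Returns the surviving objects."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) not in marked:
            marked.add(id(obj))
            stack.extend(obj.refs)
    for obj in heap:
        if id(obj) not in marked:
            obj.freed = True
    return [obj for obj in heap if not obj.freed]

# A chain c -> d is fully reclaimed by reference counting alone.
c, d = Obj("c"), Obj("d")
retain(c)
add_ref(c, d)
release(c)

# A cycle a <-> b keeps both counts above zero after the root lets go,
# so reference counting never frees it. Tracing from the roots does.
a, b = Obj("a"), Obj("b")
retain(a)
add_ref(a, b)
add_ref(b, a)
release(a)
survivors = collect([a, b], roots=[])
```

The cycle case is the usual reason given for preferring tracing collection: reference counting frees objects promptly but cannot reclaim objects that only point at each other.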

See also