less_retarded_wiki/lisp.md
2024-09-05 21:54:17 +02:00

14 KiB
Raw Blame History

Lisp

{ I'm not a Lisper, let me know if something I claim here is wrong. ~drummyfish }

WIP

Lisp (for "list processing") constitutes a family of minimalist programming languages, such as Common Lisp, Scheme or Clojure, which all descend from an old (1960, by John McCarthy) language called LISP. Given the big number of different Lisp languages (called Lisp dialects) that have spawned from the original LISP along the way (in a fashion similar to Linux distros, just among programming languages), it's hard to make general statements about all of them, but as a high altitude overview let's state this much: Lisp languages are based on functional paradigm and lambda calculus (though not strictly enforcing functional programming, side effects of functions are still allowed as well as imperative and other paradigms), they're high level (but simple), very dynamic and flexible, dynamically typed, usually interpreted (or compiled to bytecode, though some also compile to native code), garbage collected, use a prefix notation with brackets (e.g. (+ 1 2) instead of 1 + 2) and are quite elegant and minimalist -- many hackers consider (some form of) Lisp to be the pinnacle of beauty and minimalism; in this Lisp competes mainly with Forth, another extremely minimalist language. While Forth wins on some fronts, mainly that of performance and being a better fit for typical hardware, Lisp is the more mathematically elegant language, it IS the mathematician's wet dream -- Alan Kay went as far as calling it the "Maxwells Equations of software" once he saw it could be described on a single page. I.e., unlike Forth, Lisp is normally not among the fastest languages (for example Emacs is in great part written in Lisp but has to rely on C core for performance critical parts) -- likely due to the overhead of dynamic typing and high abstraction (e.g. arbitrarily precise numbers, garbage collection etc.) -- but it has to be said Lisp can run exceptionally fast on special hardware that is made to run Lisp -- in fact there used to be such machines, called Lisp Machines, however they fell out of popularity. Lisp is infamous for employing an excessive number of brackets -- for this Lisp programmers are usually sensitive about proper code formatting. Lisp is often used as a scripting language for extensions, e.g. in Emacs or GIMP.

Lisps typically employ strict evaluation (i.e. they evaluate all function parameters whether they're needed or not, just like the "mainstream imperative languages" rather than purely functional ones) -- this is probably because purely functional paradigm isn't enforced, imperative constructs are supported (though often discouraged). This takes away a bit of elegance because now there have to exist so called special forms, i.e. constructs that look like functions (they use the same syntax) but are exception from how normal functions behave -- for example while normal functions always have all their parameters evaluated, if branch is a special form that only evaluates a certain branch, depending on condition (consider e.g. (if #t (display "yes") (display "no")); evaluating all arguments would lead to executing both branches). Strictly functional languages like Haskell don't suffer from this, however it seems like Lisp kind of masterfully break this rule to allow a bit more practicality, it just found a nice balance of imposing rules but removing limitations.

Part of the Lisp philosophy/paradigm is that program and data are represented in the same way, i.e. that there is little distinction between code and data: both are just lists or, in Lisp terms, so called S-expressions (symbolic expressions, basically the bracketed expressions in which the first symbol is understood to be the operator/function and the rest are its arguments). This allows for example dynamically constructing a program and executing it at runtime.

Lisps also have various systems of macros that allow to extend the language on-the-go (similarly to Forth), i.e. it is possible to create even new control structures etc.

Lisp is high level but minimalist, a rare beast to witness, an unusual combination to see at least nowadays, but not a bad combination -- high level languages are not necessarily evil, it's just that bloat is evil and high level usually implies bloat. However Lisp manages to stay nice -- the language is defined in very abstract ways, treating values in memory as abstract entities rather than sequences of bits, not bothering the programmer with memory management etc. -- this is subsequently usually optimized under the hood by compilers/interpreters to a more machine friendly code, however the programmer doesn't have to care, it's enough he knows the abstract behavior. Of course there are penalties for this, ignoring what's going on behind the curtains always comes for a price, but Lisp makes this choice willingly and implements it so as to gain the advantages of high level while minimizing the penalties -- for example the language is designed so that it can be implemented in a minimalist way. So basically we could call Lisp a high level language done right.

How do the Lisp dialects differ? Potentially in many different ways: some focus on pragmatism, some on minimalism, some add their own extensions and features, some modify the language itself while some just provide extra libraries, some just put more emphasis on different things (e.g. PicoLisp is more datacentric), some try to support more paradigms such as OOP, some just try to do things differently than others. For example let's examine Common Lisp versus Scheme: Common Lisp is more "bloated", has a big standard library, Scheme is more minimalist, enforces tail recursion to be optimized (and so more encourages functional programming) and mostly doesn't specify order of argument evaluation, Common Lisp evaluates arguments from left to right and supports dynamic scope while Scheme only supports lexical scope, and so on and so forth. From someone coming from the world of C -- where C is simply C -- this world may invoke a cultural shock and information overload: there are different Lisp dialects, each with different standards, each with different implementation. But ultimately there is richness, evolution and a great deal of choice in it. Some of the most notable List dialects include:

  • Common Lisp (CL): The "big", bloated (for Lisp standards), "feature rich" kind of Lisp, around since 1984; it is widely supported and well standardized by ANSI.
  • Scheme (and descendants such as Racket): Well established minimalist branch of Lisp, around since 1975.
  • Clojure: Some shitty Lisp in Java or something. It distances itself from other Lisps by dropping some essential structures such as cons cells, for which some (like Richard Stallman) argue it's not even Lisp anymore.
  • PicoLisp: Another minimalist kind of Lisp, looks cool but probably not as famous as Scheme.
  • Emacs Lisp: Used in Emacs text editor.
  • ...

Details (Scheme Lisp)

From now on we'll be talking about Scheme Lisp. This is because it's the most minimalist of the big Lisp languages, so it's of the biggest interest to us. Furthermore there are several standards of Scheme, most notable are probably the ones called Revised Report^N on the Algorithmic Language Scheme where N is integer, which is shortened to rNrs. It seems like the most classic one is r5rs from 1998, which is the one we will adopt here.

As said, basically every construct in Lisp has the form (f a1 a2 a3 ...) where f is a function (e.g. +, sin or myFunction) and a1, a2 etc. are arguments, which themselves are either of the same format or represent some atomic value, such as a number (e.g. 123) or string (e.g. "hello"). So what in C we would write as myVariable = myFunction(1,2,strLen("abc")) in Lisp we write like (set! myVariable (myFunction 1 2 (strLen "abc"))). This format applies also to control structures, macros and so on.

How to: there are many Scheme implementations, for example scm, mit-scheme or Guile. We'll choose scm here. To run a Scheme source code from a file just do: scm -f source.scm.

As for data types: Scheme (and Lisp in general) is dynamically typed, i.e. values have data types but variables don't -- it is possible to assign any value to any variable and type checking happens at run time. All variables can be seen as pointers to a general object type (which carries information about its type). Data types in scheme include:

  • boolean: Data type for binary logic, can hold one of two values: #t (true) and #f (false). In Scheme (but not necessarily elsewhere) #f is considered the only false value, everything else is true.
  • number: In Lisps numbers are more abstract that in traditional languages, they are normally unbounded and arbitrarily precise; complex numbers and rational numbers are supported (i.e. we can e.g. keep value 1/3 exactly). Furthermore every number is either exact or inexact, i.e. computing for example square root of two will return a special kind of number marked as a mere approximation.
  • character: Element of strings.
  • string: Sequence of characters.
  • pair: Very important type capable of holding exactly two ordered values, the first one denoted car, the other one cdr (the names are weird for historical reasons). This is used to make lists.
  • list: Linked list made from pairs; in a pair the first item is a value and the other one points to another pair -- next chain in the link. The last item points to a special value called empty list that terminates the list. Lists are extremely important in Lisp (they literally gave it the name).
  • vector: Collection of items, similar to a list but with better performance in random access.
  • symbol: Not to be confused with strings, they're more of tokens, for example +. This helps us be able to treat code as data and vice versa.
  • procedure: Procedures are first class citizens in Lisp, i.e. we may assign them to variables etc.

Here is a small cheatsheet:

  • comments: Start with ; and go until new line.
  • basic operators/predicates: +, -, *, /, =, <, >, <=, >=, abs, max, min, zero?, positive?, 1+ (increment), -1+, quotient, remainder, modulo, gcd, and, or, not ...
  • numbers: floor, ceiling, round, exact?, inexact?, number?, integer?, rational?, real?, complex?, numerator, rationalize, real-part, exact->inexact, ...
  • branching: (if cond thenCommand) or (if cond thenCommand elseCommand) (with else branch), is a SPECIAL form, i.e. lazily evaluates only the correct branch, returns the value of the branch. Switch is made with (cond (test1 actions1 ...) (test2 actions2 ...) ...) which evaluates tests one by one until one is found to be true, then actions are performed and the value of the last one is returned.
  • do loop (tail recursion is preferred to loops): (do ((var init step) ...) (test expr1 expr2 ...) command1 command2 ...) makes a loop that initializes local variables (var with initial value init, each iteration step is performed), then performs test; if false, then commands are executed left to right and loop repeats, otherwise expressions expr1, expr2 etc. are evaluated and the value of last one is returned.
  • sequence: (begin command1 command2 command3 ...) is used to create a block of commands that will evaluate from left to right, the value of last command (expression) will be returned.
  • variables: (define varName value) creates variable varName with initial value set. Similarly (let ((v1 a) (v2 b) ...) commands) creates a block (similar to C's {/} block) with local variables v1, v2 etc. (with initial values a, b etc.). Then (set! var value) is used to set the variable value (! indicates a side effect).
  • functions: (define (fName x y z) ...) creates function fName with parameters x, y and z; (fName a b c) calls the function. Expressions in function are evaluated from left to right, the value of last expression is returned by the function.
  • input/output: (read) reads a value from input, (display x) prints out value x.
  • other: sin, cos, log, exp, sqrt, number->string, string->number, string-append, string?, list?, character?, cons (creates pair from two values), list (creates list from all arguments), quote (special form, creates a list without evaluating it), (lambda (x y ...) commands ...) (creates anonymous function), ...

Example: here is our standardized divisor tree program written in Scheme:

;;; returns the divisor tree string
(define (divisorTreeStr x)
  (define divs (cons 0 1)) ; divisors (a, b)

  (do                      ; find two closest divisors
    ((i 2 (+ i 1)))
    ((or (> i (/ x 2)) (>= (car divs) (cdr divs))))
    (if (zero? (remainder x i))
      (set! divs (cons i (/ x i)))))

  (string-append
    "("
    (if (zero? (car divs))
      ""
      (string-append (divisorTreeStr (car divs)) " "))
    (number->string x)
    (if (zero? (car divs))
      ""
      (string-append " " (divisorTreeStr (cdr divs))))
    ")"))

(define inVal 0)

(define (mainLoop) ; main loop, read values from the user
  (define inVal 0)
  (display "Enter a number: ")
  (set! inVal (read))

  (if (not (and (integer? inVal) (>= inVal 0)))
    #f
    (begin
      (display (string-append (divisorTreeStr inVal) "\n"))
      (mainLoop))))

(mainLoop)

And here is fizzbuzz:

(define (divisible a b)
  (zero? (remainder a b)))

(define (fizzBuzz from to)
  (if (divisible from 3)
    (display "Fizz"))

  (if (divisible from 5)
    (display "Buzz"))

  (if (not (or (divisible from 3) (divisible from 5)))
    (display from))

  (if (< from to)
    (begin
      (display ", ")
      (fizzBuzz (1+ from) to))))

(fizzBuzz 1 100)
(display "\n")

TODO: the basic two-function definition of Lisp