This FAQ was written by Daniel Ehrenberg and Doug Coleman, with suggestions for questions from many in the community.
- What is Factor?
Factor is a functional, dynamically-typed, object-oriented, stack-based programming language designed by Slava Pestov. It's sort of like a combination of Forth and Lisp.
- What's a stack-based programming language?
A stack-based programming language is one which uses the stack, rather than named variables, to manage data flow. This concept is closely related to that of concatenative programming languages, most of which are stack-based.
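As a small illustration of stack-based data flow, the sketch below evaluates (2 + 3) * 4 without naming any intermediate value. It assumes a recent Factor in which + and * are found in the math vocabulary and . (print the top of the stack) is found in prettyprint.

```factor
USING: math prettyprint ;

! Push 2 and 3; + pops both and pushes their sum, 5.
! Push 4; * pops 5 and 4 and pushes their product.
2 3 + 4 * .   ! prints 20
```

Each word takes its inputs from the stack and leaves its outputs there, so the program is simply the words written one after another.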
Why Factor?
- What is Factor's specialty? Where is it best used?
Factor can be used for anything. It is not a scripting language but it is suitable for rapid development. Factor has been used for everything from web applications to game development to XML parser implementation. Factor isn't meant for extremely low-level things, like boot loaders or microcontroller programming, though.
- What is Factor's purpose?
Factor is an experiment to build a modern, useful concatenative language with strong abstraction capabilities. Though there are some pitfalls, many problems can be more cleanly expressed in a concatenative language. The Factor website has more about specific goals.
- Is Factor suitable for implementing my next program?
Probably. There are only a couple of places where it might not work. For example, you'll never be able to program your TI-83 calculator in Factor, because a Factor program has a memory footprint at least as big as its image, which means at least 500 KB for most programs, even with a minimal image. Also, Factor won't (yet) work for situations where things need to run extremely fast (though it is already faster than most scripting languages for most tasks, and it's possible to write some parts of a Factor program in C). It would be difficult to write, say, a bootloader with it. One disadvantage of Factor is that there is currently no way to use a C++ library. But for everything else, it should be usable.
- Is Factor cross-platform?
Though Factor images can only run on one specific platform, the same Factor source code can easily be run on any platform that Factor is ported to. Portability issues only occur when interfacing with native libraries, or if there is a bug.
- How is Factor different from Forth?
Forth is untyped, not garbage collected, and close to the machine. For flow control, Forth code tends to use immediate words. Variables and arrays tend to be global, and programs aren't usually written in a functional manner. Factor is very different from this. It is dynamically typed, offering a high degree of reflection. Unreferenced objects are garbage collected with a generational garbage collector. Factor code is a little bit more distant from the machine, though the C FFI allows using words like malloc and mmap. For flow control, Factor generally uses quotations, allowing flexible higher order functions; parsing words are used mostly for definitions and data literals. Variables are dynamically or statically scoped (see below), and arrays are just objects which don't need to be treated specially. There is no need to think about pointers, although in Forth, this can usually be factored out to just a few words. Factor is generally a functional programming language.
- If Factor programs are just compositions of existing words, how is Factor as powerful as other programming languages?
Factor is a Turing-complete programming language whose programs are capable of doing whatever any other programming language can do. In fact, it isn't all that complicated to translate between a stack-based language and an applicative one, as long as it's known how many arguments the words in the stack-based language take.
- Why do we need a new stack-based language?
Because the other ones aren't suitable for high-level development. Forth is great for low-level things, but its lack of a type system and garbage collection makes it difficult to debug programs, and it doesn't mesh well with functional programming. Joy made a very important theoretical contribution, but it is difficult to compile efficiently, its syntax is inextensible and it has an insufficient module system. Additionally, it is almost purely functional, making many things difficult. Factor combines the best aspects of these two systems, together with many other borrowings from various places.
What do these Factor-related words mean?
- What is a word?
A word is our (from Forth) name for a named function.
- What does concatenative mean?
There are many ways to categorize programming languages, and one way is contrasting concatenative and applicative. In an applicative language, things are evaluated by applying functions to arguments. This includes almost all programming languages in wide use, such as C, Python, ML, Haskell, and Java. In a concatenative programming language, things are evaluated by composing several functions which all operate on a single piece of data, passed from function to function. This piece of data is usually in the form of a stack. Additionally, in concatenative languages, this function composition is indicated by concatenating programs. Examples of concatenative languages include Forth, Joy, Postscript, Cat, and Factor.
- What is a quotation?
A quotation is our (from Joy) name for an anonymous function. In syntax, it is a piece of code enclosed in square brackets. A quotation is an ordinary piece of data, and it can be treated in a similar manner to an array.
- What is a combinator?
A combinator is our word for a higher-order function, or a word that takes a quotation as an argument. Examples of combinators include if and map.
- What is a vocabulary?
A vocabulary is our name for a module. At one time, modules and vocabularies were distinct concepts, but they have now been merged.
- What is a parsing word?
A 'parsing word' in Factor is akin to an 'immediate word' in Forth. Compared to Lisp, it is somewhat like a reader macro but often used like a regular macro. Defining a parsing word extends the parser, and it is used to introduce new definition syntax or datatype literals.
- What is a generic word?
A generic word, taken from the Lisp terminology of a generic function, is a word that dispatches on the class of its arguments. This means that methods can be defined on it. In Factor, words, rather than objects, handle the dispatch.
- What is a word property?
In Factor, each word (but not quotation) is associated with a hashtable of word properties (abbreviated "word props"). These word props store metadata about the word, like where it was defined and its documentation, but not core properties like its definition or name. The entire hashtable of word props is accessible with the word word-props, and a single word property is accessed with word-prop. Word properties must be used carefully, as they are more 'global' than variables.
- What license is Factor under?
Factor is free software under a BSD-style license with no advertising clause. This means that you can do whatever you want with Factor, as long as you distribute the license and the copyright notice with Factor source or binaries. You are completely free to fork Factor or any code included with its distribution for a closed-source or GNU GPL-licensed project (unless otherwise noted). To signify this, at the beginning of each code file distributed with Factor, there should be lines of the form
! Copyright (C) 2007 Manuel Lopez Garcia.
! See http://factorcode.org/license.txt for BSD license.
A BSD license was chosen rather than the GPL or LGPL because it allows for the most flexible reuse. However, as a result, we can accept no GPL or LGPL code for the Factor distribution itself.
What kind of a language is Factor?
- Why is there a distinction between words and quotations?
Words and quotations are used in different places. Factor programs are built out of words, which are just compositions of other words. Words are invoked simply by naming them. Words contain metadata about their module, source location, and other things in addition to the code. If a word is on the stack, it is called with execute. Quotations, on the other hand, are the code-data used in most combinators and are not associated with the same metadata that words are. They are invoked with call.
- Is Factor purely functional, like Haskell?
No, it allows arbitrary side effects and the standard library includes several mutable data structures and imperative I/O.
- Why not?
It makes things much simpler. Effects don't need to be sequenced explicitly, as in Haskell. (No one has yet implemented anything like monads or uniqueness types in Factor, but we would welcome that development.) A broader range of algorithms can be used when mutability is included. It is certainly possible to write Factor code in a purely functional way, and there are a lot of interesting possibilities to explore here. However, no one has yet designed a metaprogramming system in a purely functional language with the degree of flexibility that Factor allows, though it may be possible.
- Why is Factor dynamically, rather than statically, typed?
For basically the same reason: it makes things simpler and more flexible, allowing a high degree of metaprogramming. Additionally, a flexible enough type system for concatenative languages has not yet been designed. However, Factor 2.0 may include optional static typing, if a suitable type system can be found.
- Does Factor have variables?
Yes. Most commonly, Factor uses dynamically scoped variables in the namespaces vocabulary and statically scoped ones using the locals vocabulary. The latter are local to a word, while the former are used for communication between words.
- But aren't dynamically scoped variables bad?
They are, if they're used for everything by default. But usually, data is passed around using the stack. Dynamically scoped variables are used for a number of wider things where it would be awkward to pass data through the stack, such as prettyprinter settings, the state of the XML parser, and a partially assembled array.
- If I do 1 foo set at the toplevel of a file, then why does foo get in the listener give me f after loading that file?
When parsing a file and executing toplevel code, a new dynamic scope is created. If you want that value of foo to be accessible elsewhere, use 1 foo set-global or : foo 1 ; depending on whether foo's value needs to change. You may, instead, want to make a word like : init-foo 1 foo set ; to initialize that value.
- Why are the stack shufflers given names like dup, swap and drop? Why not just x-xx, xy-yx and x- or something like that?
It is actually possible to use shufflers of this form using a vocabulary called shufflers. However, it is very rarely used. The use of mnemonics is much clearer in almost all cases, as they can mentally represent the abstract data flow going on. For extreme cases of complicated stack shuffling, statically scoped variables in the locals vocabulary are available, but stack shuffling mnemonics like dup, drop and swap are far cleaner in most cases to those familiar with the language. For complicated shufflings similar to existing mnemonics, the shuffle vocabulary is also occasionally useful, mostly in dealing with foreign functions with many arguments.
- How do literals in Factor work?
Literals are a source of confusion for some beginners. Literals are pushed to the stack in the place they are used. A literal in Factor refers to one specific object in memory, and is not automatically cloned. If you modify a literal without cloning it, that modification will be global. For more information, see the help document about literals.
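The sharing behaviour described above can be seen in a minimal sketch. The word names shared-items and fresh-items are hypothetical, chosen only for this example, and the code assumes a recent Factor with push in the sequences vocabulary and clone in kernel.

```factor
USING: kernel prettyprint sequences ;
IN: scratchpad

! Every call returns the same literal vector object.
: shared-items ( -- vector ) V{ } ;

! Every call returns a new copy of the literal.
: fresh-items ( -- vector ) V{ } clone ;

! Mutating the literal is visible to every later caller.
1 shared-items push
shared-items .   ! prints V{ 1 }

! Mutating a clone leaves the literal untouched.
1 fresh-items push
fresh-items .    ! prints V{ }
```

Cloning at the point of use is usually the safer choice whenever a literal is going to be modified.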
How is Factor implemented?
- Is Factor compiled or interpreted?
Both. Factor has an optimizing compiler (written in Factor), a minimal, non-optimizing JIT compiler (written in C), and a metacircular interpreter (written in Factor). For most code, the optimizing compiler is used. However, some code won't work in the optimizing compiler (see the next question) and the non-optimizing compiler must be used. When single-stepping through code in the walker (Ctrl-4 in the UI), neither of these is suitable, so a metacircular interpreter is used for more flexibility. If, by this, you mean "Does Factor produce standalone executables," the answer is no, but it can produce standalone packages; see the later question about deployment.
- Why isn't my code compiling?
For Factor code to compile, it has to have a consistent stack effect that the compiler can discern, meaning it always takes a consistent number of things off the top of the stack and puts a consistent number back on. The Factor compiler can't infer the stack effect of calling quotations that it can't inline at compile time. Another possibility is that you left out an inline. inline is needed after every combinator definition, as a hint to the compiler. A third possibility is that there is a bug; in that case, please tell us about it!
- Why isn't Factor fully self-hosted?
Making Factor self-hosted would essentially mean rewriting the virtual machine in a new, low-level DSL within Factor, avoiding all high-level features. This would not offer any real advantages over using C, as it would not be interactively debuggable or replaceable. C is a suitable language for implementing low-level components.
- Why aren't the C components of Factor implemented in my favorite other programming language?
C has many advantages over other programming languages. For one, GCC is heavily optimizing and readily available on almost all platforms. C is also suitable because it is very low-level and close to the machine. We need a programming language which gives us full control over the heap, since the image is saved by copying the heap directly.
- Why not Factor for the Java Virtual Machine or .NET?
Originally, Factor was written in Java, and there was a compiler to JVM bytecode. But, to improve speed and the foreign function interface, Factor was rewritten in C and Factor. JFactor was abandoned in favor of CFactor. At this point, basically no Factor code would work in the old JFactor unmodified, as JFactor is missing several key features like generic words and syntax extension. Factor is currently natively compiled, not bytecode compiled. Removing the dependency on the JVM makes portability and distribution easier.
- Why doesn't Factor use LLVM?
Until recently, Factor's assembler supported some platforms that LLVM did not. This is not true any more, but as the current architecture works, a change isn't being pursued right now. We would welcome any contributed LLVM backend, however.
What features does Factor support?
- Does Factor have a concurrency model?
Yes, Factor supports explicit, cooperative coroutines. A new thread can be spawned with the word in-thread, and control is passed between threads with the yield word. The core thread vocabulary contains the most basic thread operations, and derived coroutine operations are in the coroutines vocabulary.
- Does Factor support multiple OS threads?
Currently, no. Factor's threading model works somewhat similarly to Erlang's, with the important difference that there is only one heap, and the runtime (virtual machine) always runs in a single OS thread. The VM isn't currently thread-safe, though it will be made so in the future. Certain language features, such as word properties, currently pose challenges for making Factor thread-safe. Because everything is run in a single OS thread and there is no direct efficiency gain, Factor threads are most useful for things like executing parallel I/O operations that involve waiting.
- What's some cool feature of Factor that other languages don't support?
One small feature that comes in handy is the make word, which assists in building sequences. Another cool feature is Factor's unique object system, which deserves a separate blog post to explain. A third feature is the sequence and assoc protocols, allowing numbered sequences and associative mappings to be treated generically. This isn't something that's uniquely possible in Factor, but Factor's library just happens to be very well-designed here. A very interesting library is the units library, which, due to postfix notation, looks very natural. (The calendar library also works well with postfix.) It works very well in conjunction with a library called inverse, which takes advantage of the properties of concatenativity to invert some computations. Slava described some of these cool properties in a reddit comment.
- What kind of foreign function interface does Factor have?
Factor's FFI library is called alien. It works by linking to a dynamically linked library (.dll, .so or .dylib) at runtime, freeing the user from writing, generating or otherwise messing with C code. Currently, alien only supports interfacing with C. Elie Chaftari wrote a good introduction to Factor's FFI.
- Why isn't my code using alien working?
First, you have to make sure the appropriate dynamically linked library is being loaded using the word add-library. Once that is loaded, run the word recompile-all to compile all words that haven't been compiled. This will link words using the FFI up with the DLL.
- What kinds of GUI libraries does Factor support?
Currently, Factor uses a cross-platform UI library written in Factor itself, using OpenGL and a small amount of native code on each platform. The listener uses this library. There is a Cocoa binding, which is used for the window frame and menu for Mac applications, though it could be used for other things. Similarly, for Unix, there is an X binding, and it has been used in a Factor window manager, Factory. On Windows, there is a binding to some parts of the Windows API through C, but not parts that create widgets. There aren't any bindings to wxWidgets or Gtk yet. Gtk bindings would be doable but somewhat challenging due to their heavy use of macros and complicated structs, and a SWIG binding could be helpful in implementing them. wxWidgets bindings would be impossible right now, as alien does not support C++'s name mangling.
- Why isn't Factor in the Computer Language Shootout?
We want to make Factor faster before compiling a submission. Most things are already far faster than scripting languages, but certain things, such as I/O, still need some work. The shootout benchmarks are heavy in I/O. But don't let any of this hold you back from making your own Factor submission!
- How do you put a Factor program into a package so it can be run easily?
A tool for this is in development which currently makes Mac OS X .app packages and Windows executables, bundled with an image and some .dlls. There will soon be a Unix version. To deploy a vocabulary (a package which will run the vocabulary's main word), use the code
USE: ui.tools.deploy
"vocab-name" vocab deploy-tool
- Does Factor support Unicode?
There is no single meaning to the phrase "Unicode support", but there are a few things that a modern programming language is expected to support in its library: UTF-8/UTF-16 input and output, Unicode collation, Unicode-appropriate casing operations, normalization, and UI support for Unicode input and output. Of these, Factor supports all but collation and UI support, both of which are in the works and should be finished before 1.0 comes out. All Factor strings can hold any Unicode code point, and it's planned that strings will be consistently held in Normalization Form D.
- Can a vocabulary have sub-vocabularies?
Yes, but the module structure tends to be fairly flat in practice.
- Can I have two words with the same name in different vocabs?
You probably don't want to design a set of vocabularies to have overlapping word names, but sometimes you want to use two libraries that use the same word names. In this case, there are two resolution strategies. The simplest, if you only need one of the overlapping words, is to put the vocabulary you want second in the order of the USING: declaration. If you need both, or in more complicated overlap situations, you can use a qualified library import. In this case, you should include the qualified vocabulary to enable qualified naming, and load a vocab called foo as
QUALIFIED: foo
To access the word bar in vocab foo, use the name foo:bar. When the same name is used in two vocabs, one vocab is usually put in the USING: declaration and the other loaded with QUALIFIED:.
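As a rough sketch of qualified naming with a real vocabulary, the snippet below refers to the word + through its vocabulary name. It assumes a recent Factor in which QUALIFIED: is available without loading a separate vocabulary first; on a version that still needs the qualified vocabulary mentioned above, load that vocabulary before this code.

```factor
USING: prettyprint ;
QUALIFIED: math

! math:+ names the word + from the math vocabulary explicitly,
! so it cannot be confused with a + defined in another vocabulary.
1 2 math:+ .   ! prints 3
```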
How can I download, install or contribute to Factor?
- What's an image? Why does Factor use one?
The image is the file that Factor uses to store all code and data when Factor isn't running. The Factor executable and dynamically linked library only have a small amount of knowledge about Factor--just the virtual machine, the primitives and the structure of the image. The entire library is contained in the image, and was loaded there during the bootstrap process. The image is a map of the memory after the code was loaded. Unlike in Smalltalk, Factor code is almost always distributed in files rather than in the image.
- What is bootstrapping, and why do I need a boot image for it?
In general, bootstrapping is the process of compiling a self-hosted compiler, that is, a compiler written in the programming language it compiles. Though Factor isn't entirely self-hosted, we use a bootstrapping process as many important pieces, like the compiler and parser (but not the virtual machine or primitives) are written in Factor. Years ago, there was a Factor interpreter and compiler written in Java, and that was initially used to run the Factor code we use now, creating an image. Now, we use a boot image--a kind of mini-image which has just enough knowledge to start the process to create a full image.
- How can I make a boot image?
Use the word make-image, as in "x86.32" make-image. This creates a file boot.x86.32.image in the current directory which is a full boot image. For a listing of the strings needed to specify architecture, see the help file by running \ make-image help at the listener. You can also get boot images from the Factor website, if you can't make one yourself.
- Once I have a boot image, how do I compile and bootstrap Factor?
Here is the series of commands used to compile Factor on a 32-bit x86 computer:
make
./factor -i=boot.x86.32.image #replace 'x86.32' with the appropriate architecture's string
- Should I use the last stable version of Factor, or track the current development with git?
The only advantage of git is that you can do git pull, that is, you can update your code from any git server (not just the main one) and have the contents be in the same directory as they were before. Developers who want to make contributions should use git and maintain a git server.
- How can I track development with git?
First, download git. Then, move to the folder where you want Factor to be downloaded and enter the command git clone git://factorcode.org/git/factor.git. This will create a new folder called factor with the current development version of Factor in it. From the Factor website download a current boot image, and go through the bootstrap process described above.
- How can I join Factor development?
The best way is to make a git repository of your own. Chris Double described how to do this in a blog post. Once you have a git repository, make whatever changes you feel like to the code base, and tell someone involved in development about it. If they like your changes, they can be pulled into the main Factor repository. Before pushing any of your patches to your public repository, make sure that they'll be signed with your name by including your information in the .git/config file with the following format:
[user]
name = "Manuel Lopez Garcia"
email = "manuel@lopez.mx"
- I'm poor. Anyone want to give me a server to run git on?
Daniel Ehrenberg would. Just send him an email.
- When trying to push to my repository using Cygwin, why do I see
fatal: exec failed
fatal: The remote end hung up unexpectedly
error: failed to push to 'foo@bar.com:factor.git'
Install OpenSSH with the Cygwin installer, and the problem should be fixed.
- Where should I store my working code that I don't feel like contributing to the Factor project?
If you feel like it, you can put that code in extra/, where included non-core libraries go. But if you want to be a little more organized, you can put vocabs in a different directory named work/ which is part of the vocab search path (for example, extra/foobar/foobar.factor or work/foobar/foobar.factor), or in the current working directory. You can also add new directories to the vocab search path by modifying the vocab-roots variable in the vocabs.loader vocabulary.
Troubleshooting installation
- When I try to bootstrap I get the following output:
Loading P" factor.image"
*** Data heap resized to 196104192 bytes
*** Data GC (2 minor, 10 cards)
*** Data heap resized to 630124544 bytes
*** Data GC (0 minor, 0 cards)
P" factor.image":1
^
Word not found in current vocabulary search path
no-word-name "\u000c"
You are passing the boot image name to the Factor executable incorrectly. The correct syntax is to pass the image name as an -i= parameter, e.g. ./factor -i=boot.x86.32.image.
- Which libraries do I need to get the UI working with X11 on Linux?
You need to install recent development libraries for libc, Freetype, X11, OpenGL and GLUT. On a Debian-derived Linux distribution (like Ubuntu), you can use the line
sudo apt-get install libc6-dev libfreetype6-dev libx11-dev glutg3-dev
How can I best edit and run Factor?
- How does Factor integrate with text editors?
Factor integrates with text editors in two ways: syntax highlighting and edit hooks. Syntax highlighting is available for the popular text editors Vi, Emacs, TextMate and jEdit. Edit hooks are a special feature of Factor, available for many more editors: to jump to the definition of a word named foo, simply type \ foo edit at the listener after loading the edit hooks. In the event of a syntax error, you can jump to the error site using the word :edit.
- How do I load the edit hooks for my editor?
For most editors, it's sufficient to run the line USE: editors.name, where you substitute what your editor is called for name. It's a good idea to put this in your ~/.factor-rc or save it into your image, if you like editor hooks.
- But what about Emacs?
In Emacs, you need to use the command M-x server-start before invoking the edit hook from the Factor end.
- Where do I find the syntax highlighting files?
For Emacs, Vi and TextMate, the relevant files are in the misc/ directory of the Factor distribution. For jEdit, Factor syntax highlighting and other editor shortcuts are included.
- How can I integrate my unsupported editor into Factor?
For syntax highlighting, there's nothing special in Factor; you're on your own. But for the editor hooks, there's a very simple pattern. Just make a new vocabulary under editors which implements the necessary words such that, when it is run, the edit-hook variable is set to a quotation which, given a file and line number, opens the editor at the right spot. If you write either syntax highlighting or edit hook support for an unsupported editor, we'd be happy to include it in the Factor distribution.
- How do I run Factor with the graphical interface?
On Windows and Unix, just run the main Factor executable. On Mac OS X, run Factor.app.
- How do I run Factor without the graphical interface?
To run a command-line based listener, use the command ./factor -run=listener. To just run a file written in Factor, use a command of the form ./factor -run=none filename.factor. To execute a short expression in Factor without opening the listener, use a command of the form ./factor -run=none -e='"Hello world" print'.
- When using Factor in the terminal, ./factor -run=listener, is there a way to get a command history?
rlwrap is a readline wrapper that adds readline support to terminal applications. On a Debian-derived distro, you can install it with
sudo apt-get install rlwrap
Otherwise, you can download the sources from its website. Once rlwrap is installed, you can run Factor with rlwrap ./factor -run=listener.
How can I learn Factor?
- How can I start learning Factor?
The best way to go about it is to figure out something you want to program and start trying to do it. Once you have a goal in mind, you can look at Factor's included documentation (available online at Factor's website), and ask questions on the mailing list or #concatenative at irc.freenode.net.
- Are there any good books I can read about Factor?
Factor is a very young language, and so far, no books use it. A good introduction to Forth, much of which applies in Factor, is Thinking Forth (PDF) by Leo Brodie. The best place to start to learn about the principles of modern concatenative languages is the Joy papers, by Manfred von Thun. Another good internet resource is Planet Factor, a blog aggregator for all things Factor-related. There won't be a Factor book written until after Factor 1.0 is released.
- How can I keep track of the stack in my head?
At first, it may be useful to make diagrams on paper. But eventually stack shufflers should fade away in your mind and become part of the data flow. If your stack is hard to trace, it is likely that you are thinking about too many things on the stack at once. It is highly unusual for a Factor word to accept or return more than three arguments on the stack. If you ever need to keep track of the location of more than three or four items, you should probably reorganize the function by factoring it into smaller pieces.
- How can I improve my Factor coding style?
- Most word definitions should fit in three or fewer 64-column lines.
- Any copy/pasted code should be factored out into new words.
- Use combinators to abstract control flow patterns.
- Use library words where possible.
- More general words should go at the top of a file; more specific at the bottom.
- Try to use collections instead of working with individual objects on the stack.
- Don't use the datastack as a data structure.
- Use meaningful word names. Avoid too many words named (foo) or foo*.
- A word named (foo) should only exist to help implement the word foo.
- Come to the IRC channel and we'll review your code. It's fun!
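A few of these guidelines can be illustrated with a small sketch: a short definition that works on a whole collection and leaves the control flow to a combinator. The word name sum-of-squares is hypothetical, and the code assumes a recent Factor with sq in the math vocabulary and map and sum in sequences.

```factor
USING: math prettyprint sequences ;
IN: scratchpad

! One short line: map squares every element, sum adds them up.
! No stack shuffling is needed because the sequence is the only input.
: sum-of-squares ( seq -- n ) [ sq ] map sum ;

{ 1 2 3 } sum-of-squares .   ! prints 14
```

Keeping the numbers in a sequence, rather than spreading them over the stack, is what lets the definition stay this small.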