Em is a limited hypertext markup language that is designed to be maximally readable. It is similar to Markdown, but it has a few key advantages:

  1. It is more readable.
  2. It is simpler to parse.
  3. There is only one way to do it: any given piece of HTML has at most one possible em representation.

Em values readability over expressiveness. This means that it is rather limited in terms of what HTML it can produce. Most noticeably, only a very limited form of inline links is supported (see Hyperlinks [1]).

Em also values consistency and predictability. As such, the syntax is rather strict. This makes it a bit harder to learn, but much more predictable.

Em's complete and exact syntax is defined by its implementation [2], but a general description follows below. For longer examples, see the source code for this text [3] or the test file [4].

Em is implemented in portable awk, with an rc script to bind it together. It is written primarily on and for Plan 9, but the rc code can be translated to POSIX shell more or less trivially; that work just hasn't been done yet.

Em includes two additional rc scripts:

htwrap
Creates a standalone HTML document from em output.
htindex
Adds appropriate ids to HTML headings and prints an index of them on standard output. Supports Latin-1.

Inline formatting


Font style

Italic, bold and teletype text is marked with the asterisk, the underscore and the backtick, respectively:

Example of *italic text*, _bold text_ and `teletype text`.

The marks are only valid in certain positions:

  1. At word borders
  2. At the beginning of a word after an opening parenthesis
  3. At the end of a word before any of .,:;?!)
  4. At the end of a word before a closing parenthesis followed by any of .,:;?!
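
The position rules above can be approximated with a regular expression. The following Python sketch is not em's actual awk implementation. It assumes that a "word border" means whitespace or the start or end of the text, and it accepts a closing parenthesis after a mark without checking what follows it. The names `STYLES`, `PATTERN` and `find_styled` are hypothetical.

```python
import re

# Mark characters and the font style each one stands for.
STYLES = {"*": "italic", "_": "bold", "`": "teletype"}

# Opening mark: at the start of the text, after whitespace, or after "(".
# Closing mark: at the end of the text, before whitespace, or before one of .,:;?!)
# (Simplification: a ")" after the closing mark is accepted unconditionally.)
PATTERN = re.compile(r"(?<![^\s(])([*_`])(?=\S)(.+?)(?<=\S)\1(?![^\s.,:;?!)])")


def find_styled(text):
    """Return (style, content) pairs for every marked span found in text."""
    return [(STYLES[m.group(1)], m.group(2)) for m in PATTERN.finditer(text)]


spans = find_styled("Example of *italic text*, _bold text_ and `teletype text`.")
```

Marks in other positions, such as the asterisks in `2*3*4`, are left alone by this sketch because they are not preceded by whitespace or an opening parenthesis.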

Inline references

Inline references are created with square brackets:

Example of an inline reference [1].

 [1] The quick brown fox ...

In the final output, the inline reference becomes a link to the reference item later in the document:

<p>Example of an inline reference [<a href="#ref1">1</a>].
</p>
<ol class="reflist">
<li value="1" id="ref1">The quick brown fox ...
</li>
</ol>

For more information about references, see Reference lists [5].
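
As a rough illustration of the translation shown above, the following Python sketch rewrites each inline reference into a link to the matching reference item. It is not em's actual implementation; the function name `link_inline_refs` is hypothetical, and the `#refN` target is taken from the HTML example above.

```python
import re


def link_inline_refs(text):
    """Turn each [n] into [<a href="#refn">n</a>], as in the example output."""
    return re.sub(r"\[(\d+)\]", r'[<a href="#ref\1">\1</a>]', text)


html = link_inline_refs("Example of an inline reference [1].")
```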


Hyperlinks

Em provides two types of hyperlinks: literal hyperlinks and hyperlink references.

Literal hyperlinks are wrapped in less-than and greater-than signs:

<http://example.com>

Most links are recognized, as long as they are free of spaces and contain a slash or start with a hash:

</page>
<./page>
<#section>

The less-than and greater-than signs are valid in the same places as the font style formatting marks.
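
The recognition rule for literal hyperlinks can be sketched in Python as follows. This is only an approximation under the rule as stated here (no spaces, and either a slash or a leading hash); the real parser may accept or reject more. The function name `is_literal_link` is hypothetical.

```python
def is_literal_link(token):
    """Return True if token looks like <target> under the rule described above."""
    if len(token) < 3 or not (token.startswith("<") and token.endswith(">")):
        return False
    target = token[1:-1]
    # The target must be free of whitespace.
    if any(c.isspace() for c in target):
        return False
    # It must contain a slash or start with a hash.
    return "/" in target or target.startswith("#")
```

Under this sketch, `<http://example.com>`, `</page>`, `<./page>` and `<#section>` are all accepted, while `<word>` is not.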

Hyperlink references are a special case of inline references. When an inline reference refers to a reference item that contains a literal hyperlink and nothing else, the inline reference points directly to that link rather than to the reference item.

It is available for download [1].

 [1] <./v1.tgz>

The above example translates to the following HTML:

<p>It is available for download [<a href="./v1.tgz">1</a>].
</p>
<ol class="reflist">
<li value="1" id="ref1"><a href="./v1.tgz">./v1.tgz</a>
</li>
</ol>
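
The choice of link target for an inline reference can be sketched as below. This Python fragment is an illustration, not em's actual code; `inline_ref_href` and the `references` mapping (reference number to item text) are hypothetical, and the link test repeats the simplified rule of "no spaces, and a slash or a leading hash".

```python
def inline_ref_href(number, references):
    """Return the href an inline reference [number] would point to."""
    body = references[number].strip()
    is_link = (
        body.startswith("<")
        and body.endswith(">")
        and not any(c.isspace() for c in body)
        and ("/" in body or body[1:].startswith("#"))
    )
    # A hyperlink reference links straight to its target;
    # any other reference links to the reference item itself.
    return body[1:-1] if is_link else f"#ref{number}"


direct = inline_ref_href(1, {1: "<./v1.tgz>"})
indirect = inline_ref_href(1, {1: "The quick brown fox ..."})
```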

Block-level formatting


Headings

Headings begin and end with the same number of equal signs:

= First-level heading =

== Second-level heading ==
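
A heading line can be recognized by requiring the same run of equal signs on both sides. The Python sketch below assumes, based on the examples above, that a single space separates the equal signs from the heading text; the function name `parse_heading` is hypothetical and this is not em's actual parser.

```python
import re


def parse_heading(line):
    """Return (level, title) for a heading line, or None if it is not one."""
    # \1 forces the closing run of "=" to be identical to the opening run.
    m = re.fullmatch(r"(=+) (.+) \1", line)
    if m is None:
        return None
    return len(m.group(1)), m.group(2)
```

A line such as `== Mismatch =` is rejected because the two runs differ in length.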

Lists

All lists start with a single space, followed by some marker.

Unordered lists are created with - :

 - This is an unordered list
 - With two items

Ordered lists are created with n. :

 1. This is an ordered list
 2. With an item that spans
two lines

Definition lists are created with term: :

 dinosaur: an animal

Reference lists are created with [n] :

 [1] This is a reference list
 [2] With two items
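
The four markers above can be told apart with a few patterns. The following Python sketch handles only top-level items (exactly one leading space) and guesses at details the description leaves open, such as requiring a space after the marker. The names `MARKERS` and `classify_item` are hypothetical.

```python
import re

# Patterns are tried in order; the definition pattern comes last because
# it is the most permissive.
MARKERS = [
    ("unordered", re.compile(r"- (.*)")),
    ("ordered", re.compile(r"\d+\. (.*)")),
    ("reference", re.compile(r"\[\d+\] (.*)")),
    ("definition", re.compile(r"[^:]+: (.*)")),
]


def classify_item(line):
    """Return the list type a top-level item line starts, or None."""
    if not line.startswith(" ") or line.startswith("  "):
        return None
    rest = line[1:]
    for kind, pattern in MARKERS:
        if pattern.fullmatch(rest):
            return kind
    return None
```

A continuation line such as `two lines` has no leading space and no marker, so it yields `None` here and would be treated as part of the preceding item.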

Nesting

Unordered and ordered lists can be nested. An additional space at the beginning of the line increases the item level by one:

 1. First level
  - Second level
 2. First level
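
Since each additional leading space adds one level, the nesting depth of an item is simply its number of leading spaces. A minimal Python sketch, with the hypothetical name `item_level`:

```python
def item_level(line):
    """Count leading spaces: one space is level 1, two spaces level 2, and so on."""
    return len(line) - len(line.lstrip(" "))


lines = [" 1. First level", "  - Second level", " 2. First level"]
levels = [item_level(line) for line in lines]
```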

Reference lists

A reference list is a special type of list: a footnote list whose items can be the target of inline references:

See footnote [1].

 [1] The quick brown fox ...

Note: There is a special type of reference list item called a hyperlink reference. It contains only a single link:

 [1] <http://example.com>

Hyperlink references behave just like normal references, except that inline references to them point directly to the link target rather than to the reference item.


Blockquotes

Blockquotes are, in terms of syntax and behavior, actually another type of list, started with an initial space, followed by > :

 > This is a quoted paragraph.
The paragraph continues on the next line.
 > Here begins a new quoted paragraph.

Preformatted blocks

Preformatted blocks start with a single tab:

	#include <stdio.h>
	int main(void) { puts("Hello world!"); return 0; }

Paragraphs

Paragraphs start with no space:

This is a paragraph
with two lines.
This is another paragraph.
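
Taken together, the block-level rules mean that the first character of a block's first line largely determines its type. The Python sketch below illustrates that idea only; it is not how the awk implementation is structured, it looks at a single line in isolation, and the name `block_kind` is hypothetical.

```python
def block_kind(line):
    """Guess the block type that a first line starts, from its leading character."""
    if not line.strip():
        return "blank"
    if line.startswith("\t"):
        return "preformatted"
    if line.startswith(" "):
        # Lists and blockquotes both start with a space.
        return "list"
    if line.startswith("=") and line.endswith("="):
        return "heading"
    return "paragraph"
```

Continuation lines of a list item or blockquote also start with no space, so a real parser has to take the preceding lines into account as well.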

References

  1. #hyperlinks
  2. ../tree/emparse
  3. ../tree/README
  4. ../tree/test.em
  5. #reference-lists