This post is part of a series of articles about PGF/TikZ and its new graph drawing features on which I am working as part of my graduate thesis.

In this post you will learn about the motivation for using computer-aided tools to draw graphs as well as the graph drawing features and problems of PGF/TikZ in the version currently available. At the very bottom of the article I will introduce the graph drawing features that we have worked on over the past months. They will be covered in more detail in my next post.

Motivation based on my life as a student

When I started as a computer science student, I first had to learn about all the tools out there that help with writing down one’s homework. In particular, everyone was looking for an easy way to combine text, math formulas and code for our programming assignments. The answer to all this will be obvious for most of you: LaTeX. Or, as I found out later, its main alternative, ConTeXt, which I prefer to use nowadays.

LaTeX and ConTeXt make embedding math formulas and code into text documents a breeze, at least compared to other solutions on the market such as the math editor of LibreOffice. The output format is PDF (although other formats are possible) and the resulting documents are rendered and typeset in exceptionally high quality. Sure, you can write homework down on paper yourself. Not everyone fancies professional-looking, printed documents as they often take more time to create if you are not familiar with the tools. But in our case, having invested a lot of time in preparing our homework, we also wanted the result to be visually appealing.

A special problem my fellow students and I were often confronted with was drawing graphs. Typically, we were dealing with binary or arbitrary trees, directed or undirected graphs, flow networks, state machines or automata in general. There are many image editors and vector graphics applications out there, but which one to pick for these kinds of tasks? In the beginning, I remember, we simply left empty spots in the PDFs and filled them with hand-drawn graphs later. This was OK for small graphs, but manually arranging the various elements (nodes, edges, labels attached to them etc.) soon became tiring and painful, in particular with larger graphs. Just think about drawing a Huffman code tree for a small alphabet, which was one of our earlier assignments. (Although trees are harmless as it’s relatively easy to predict how they grow.)

Graphviz and the problem with specialized graphics tools

Anyway, you can imagine how happy we were when, finally, we found out about Graphviz. It provides two things mainly: a relatively simple and succinct syntax for specifying the structure of graphs (again: nodes, edges, labels and other information) as well as a set of command line tools to convert this syntax into image files (I mostly used PNG). Graphviz supports directed graphs as well as undirected graphs and provides a handful of different tools/algorithms for drawing them. Here is an example of a simple directed graph drawn with the dot command:

Now, this is quite cool. Even better, the code needed to achieve this layout is just a few lines:

digraph mygraph {
  1 -> 2;  2 -> 3;  3 -> 4;  4 -> 2;
  5 -> 6;  6 -> 3;  6 -> 7;  6 -> 8;
  1 -> 8;  1 -> 5;
}

The major disadvantages of Graphviz (at least from my perspective) are:

  • Rather than being a general purpose graphics toolkit, Graphviz only supports drawing graphs. There are many kinds of graphics people need to create while working on homework, theses, research papers or books. As you can imagine it’s hardly desirable to learn a different syntax or tool for each of these types of graphics.
  • Graphviz only ships external tools and is not integrated into the typesetting systems many people are used to (like TeX). Thus, whenever you change your document’s font, you also need to update and regenerate all the graph drawings created by Graphviz. This might be solvable by generating PostScript files to be included in TeX documents, though I am not sure.

Which leads us back to the topic.

PGF/TikZ — A graphics package integrated into the TeX typesetting system

For TeX-based typesetting systems like LaTeX or ConTeXt there are a number of full-fledged graphics packages that come bundled with their own syntax for creating complex graphics of every kind. Their main advantage is the tight integration with the typesetting system, allowing graphics to be embedded in TeX documents in order to adopt the font and metric properties of the document itself. 

One of the most versatile of these graphics packages is PGF (for lack of a proper official website I am linking to here) which comes bundled with the TikZ graphics description language. With a bit of practice and patience, PGF and TikZ can be used for almost anything. PGF has a manual comprising 800 pages of dense information about all its features and options as well as the internals (the latter being useful for developers mostly). In PGF/TikZ, there is a solution to (almost) everything.

Graph drawing in PGF/TikZ and the problems with the current features

Many graphic objects in PGF/TikZ are represented as nodes and the visual connections between them are remarkably similar to edges as we know them from graphs. It thus comes as no surprise that TikZ can be used to describe and draw graphs. Here is a simple example:

The ConTeXt code (based on the syntax available in the current PGF stable release) for this is more verbose:

\starttikzpicture[every path/.style={>=latex},every node/.style={draw,circle}]
  \node            (a) at (0,0)  { A };
  \node            (b) at (2,0)  { B };
  \node            (c) at (2,-2) { C };
  \node[rectangle] (d) at (0,-2) { D };
  \draw[->] (a) edge (b);
  \draw[->] (a) edge (c);
  \draw[->] (b) edge (c);
  \draw[<-] (c) edge (d);
\stoptikzpicture

The syntax presented here is a lot more verbose than the compact syntax offered by Graphviz. In particular, nodes have to be defined with the \node macro and edges are defined using the \draw instruction. Obviously, with a syntax like this, graph drawings are harder both to create and to understand. Again, for small graphs the syntactic overhead might be acceptable. Creating drawings of large graphs this way, on the other hand, becomes a tiring process.

More important and critical than the syntax issues, however, is the fact that, in the publicly available version, PGF/TikZ does not arrange nodes automatically. As you can see in the above example, all nodes are positioned manually (e.g. node (b) at coordinate (2,0)). People are unlikely to have the perfect layout in mind before seeing a visual representation of the graph for the first time, so manual positioning typically requires step-by-step improvements or paper sketches to find a visually appealing graph layout. This clearly defeats the purpose of using computer tools.
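
To illustrate the point, here is a minimal sketch of the same picture using relative placement instead of absolute coordinates. It assumes the TikZ positioning library and a node distance of 1cm, neither of which is part of the example above, so treat it as one possible variation rather than a recommended approach:

```tex
\usemodule[tikz]
\usetikzlibrary[positioning]   % assumption: provides the "right=of" / "below=of" keys

\starttext
\starttikzpicture[>=latex, every node/.style={draw,circle}, node distance=1cm]
  % Only (a) sits at a fixed spot (the origin); the rest is placed relative to it.
  \node            (a)              { A };
  \node            (b) [right=of a] { B };
  \node            (c) [below=of b] { C };
  \node[rectangle] (d) [below=of a] { D };
  % The edges are the same as in the example above.
  \draw[->] (a) edge (b);
  \draw[->] (a) edge (c);
  \draw[->] (b) edge (c);
  \draw[<-] (c) edge (d);
\stoptikzpicture
\stoptext
```

The coordinates are gone, but the layout decisions are not: someone still has to decide that B goes to the right of A and C below B, which is exactly the kind of work an automatic layout is supposed to take over.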

Note that for trees there is a different syntax. The following example shows a tree drawn with TikZ:

One way to write this down would be with the verbose syntax and manual positioning as described above. A more comfortable notation supported by the current stable version of PGF/TikZ is this:

\starttikzpicture[every node/.style={draw,circle}]
  \node { a }
   child {
     node [fill=black!20] { b }
     child [->,>=latex] { node [red]  { c } }
     child [<-,label=x] { node [blue] { d } }
   } child {
     node[rectangle] { e }
   };
\stoptikzpicture

There is quite a lot you can do with this syntax without blowing the code up too much. The main problem of this node/child approach though is that the built-in tree growth function is somewhat inflexible. And of course it does not solve the problems with general graphs. 
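
One way to see this inflexibility is the spacing between siblings, which the default growth function does not adapt to the size of the subtrees. The following is a minimal sketch, not taken from the examples above: the tree shape and all distances are illustrative values, and it relies on the level distance, sibling distance and level 1/level 2 style keys of TikZ.

```tex
\usemodule[tikz]

\starttext
\starttikzpicture[
    every node/.style={draw,circle},
    level distance=12mm,
    % Sibling distances are set by hand for each level of the tree.
    level 1/.style={sibling distance=30mm},
    level 2/.style={sibling distance=15mm}]
  \node { a }
    child { node { b }
      child { node { c } }
      child { node { d } } }
    child { node { e }
      child { node { f } }
      child { node { g } } };
\stoptikzpicture
\stoptext
```

With the same sibling distance on both levels, the nodes d and f would be placed on the very same spot, because children are simply centered below their parent. Avoiding such overlaps means tuning the distances by hand for every tree.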

Luckily, there is a solution coming up.

Major graph drawing improvements in the upcoming release of PGF/TikZ

Over the past twelve months, there has been a lot of activity around graph drawing in the PGF repository. Let me briefly list some of the major new features available in the development tree, all of which will be part of the next release of PGF/TikZ:

  • TikZ now supports a new and drastically simplified syntax for defining graphs. This syntax is inspired by the Graphviz format but is more flexible as it allows nodes and edges of a graph to be combined with elements of other features of the graphics system.
  • A couple of simple node placement strategies have been added, such as circular placement (placing all nodes of a graph in a circle) or grid placement (placing nodes in an N×M grid).
  • As part of a student project and my graduate thesis, PGF has been extended by a graph drawing subsystem that allows automatic graph layout algorithms to be implemented in the Lua programming language. This is as simple as dropping a Lua script in the PGF tree that implements a single function. PGF provides a number of very useful object-oriented data structures for working with graphs.
  • A number of standard algorithms have been implemented (some are work in progress) to cover the most common scenarios. This was the main subject of my graduate thesis. After initial research I decided to implement three force-based algorithms (for distributing nodes evenly on the plane and highlighting graph features such as symmetry) as well as a modular algorithm for layered drawings of directed graphs. These algorithms are very similar to those available in Graphviz and Mathematica. The results so far look great!
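
To give a first impression of the simplified syntax from the first item, here is a minimal sketch of the four-node example from earlier in this post. It assumes the graphs TikZ library and its \graph command; names and details may differ between versions, so take it as an illustration of the idea only.

```tex
\usemodule[tikz]
\usetikzlibrary[graphs]   % assumption: the library that provides \graph

\starttext
\starttikzpicture[>=latex, every node/.style={draw,circle}]
  % Nodes are created when they are first mentioned;
  % a name that shows up again refers to the node that already exists.
  \graph {
    A -> { B, C },
    B -> C,
    D -> C
  };
\stoptikzpicture
\stoptext
```

Note that this sketch only covers the syntax. It does not specify coordinates, and it does not select any of the layout algorithms mentioned in the list either.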

These awesome new features will be covered in detail in the next post. They greatly improve the usefulness of PGF/TikZ for drawing all kinds of graphs and will hopefully make many people’s lives easier. Also, the Lua-based graph drawing subsystem provides a great framework for graph drawing researchers to experiment with new ideas. So keep your eyes open for the next post!