Saxon assumes the document is represented as a graph, with nodes representing document sections (tokens, sentences, etc.) and edges representing relationships between sections (this token occurs in this sentence, this token follows this token, etc.). Saxon rules are then defined as regular expressions detailing how to move between sections of the document. A rule has three main parts: a starting point, a regular expression describing how to move between sections of the document, and a section detailing how the document should be updated if the rule matches. For example, a simple rule to match the names of people can be defined as:

Rule:Person
((next_$_token{$_token_has_string{~Mr|Dr|Mrs|Miss}})(next_$_token{$_token_has_part_of_speech{=NNP}})+)

This defines a rule called Person which will start matching at any token in the document whose string is one of the four titles Mr, Dr, Mrs, or Miss. The last line of the rule specifies how matching should progress from this starting point: once a title token has been found, the rule matches one or more following tokens whose part-of-speech tag (NNP) identifies them as proper nouns. Because this rule does not specify any action to take when it matches, a simple annotation is added to the document recording that a Person (i.e. the rule name) consisting of the matched tokens has been found.
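The matching behaviour described above can be sketched in plain Java. This is only an illustration under stated assumptions, not how Saxon is implemented: the `Token` class, the `PersonRuleSketch` class and its `match` method are hypothetical names, the document is reduced to a flat list of tokens rather than a graph, and the sketch returns only the longest match for each starting token, which may differ from how Saxon enumerates paths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class PersonRuleSketch {
    // Hypothetical stand-in for a token node: its string and part-of-speech tag.
    // This is NOT part of the Saxon API; it only illustrates the matching logic.
    static final class Token {
        final String text;
        final String pos;

        Token(String text, String pos) {
            this.text = text;
            this.pos = pos;
        }
    }

    // Titles accepted as a starting point, mirroring {~Mr|Dr|Mrs|Miss}.
    private static final Set<String> TITLES = Set.of("Mr", "Dr", "Mrs", "Miss");

    // Returns every match as a list of token indices: a title token
    // followed by one or more tokens tagged NNP (the "+" in the rule).
    static List<List<Integer>> match(List<Token> tokens) {
        List<List<Integer>> matches = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (!TITLES.contains(tokens.get(start).text)) {
                continue; // not a valid starting point
            }
            List<Integer> path = new ArrayList<>();
            path.add(start);
            int next = start + 1;
            // Keep following the "next token" relation while the tag is NNP.
            while (next < tokens.size() && "NNP".equals(tokens.get(next).pos)) {
                path.add(next);
                next++;
            }
            if (path.size() > 1) { // at least one proper noun is required
                matches.add(path);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
            new Token("Yesterday", "NN"),
            new Token("Dr", "NNP"),
            new Token("Jane", "NNP"),
            new Token("Smith", "NNP"),
            new Token("spoke", "VBD"));
        System.out.println(match(tokens)); // prints [[1, 2, 3]]
    }
}
```

A title that is not followed by a proper noun produces no match, which corresponds to the `+` in the rule requiring at least one NNP token.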

If the annotations created when a rule matches should not have the same type as the rule name, a simple RHS (right-hand side) can be supplied. For example, if the rule should add an entity with type 'PersonWithTitle', it can be rewritten as:

Rule:Person
((next_$_token{$_token_has_string{~Mr|Dr|Mrs|Miss}})(next_$_token{$_token_has_part_of_speech{=NNP}})+)
=>
[PersonWithTitle]

The full flexibility of Saxon lies, however, in the ability to specify unrestricted Java code as the RHS of a rule. For example, let's assume we simply want to display the names of the people matched by the rule; it could then be rewritten with an expanded RHS as:

Rule:Person
((next_$_token{$_token_has_string{~Mr|Dr|Mrs|Miss}})(next_$_token{$_token_has_part_of_speech{=NNP}})+)
=>
{
	// Look up the structures needed to map a token node to its string.
	final StructureAndContent strings = stone.getStructureAndContent("$_string");
	final Structure tokenHasString = stone.getStructure("$_token_has_string");

	// Each path is one match of the rule.
	for (Path p : paths)
	{
		System.out.print("path of " + p.getNodes().length + " nodes: ");

		// Follow the $_token_has_string edge from each node and print the string found.
		for (int code : p.getNodes())
		{
			System.out.print(" " + strings.retrieve(tokenHasString.follow(code)));
		}

		System.out.println();

		addSaxonNode(stone, getName(), p);
	}
}

The RHS in this instance loops through the paths captured by the Saxon rule and, for each graph node on a path, extracts and displays the token string; displaying the strings does not alter the data in the graph.
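The lookup performed inside the inner loop (follow an edge from a token node to a string node, then retrieve that node's content) can be illustrated with ordinary Java maps. This is a minimal sketch under assumptions: `PathStringsSketch`, `stringsOnPath` and the two maps are hypothetical stand-ins for the `Structure` and `StructureAndContent` objects, and the node codes are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class PathStringsSketch {
    // Resolves each token node on a path to its string by following a
    // token -> string edge and then looking up the string node's content.
    // Both maps are hypothetical stand-ins, not Saxon's Structure types.
    static List<String> stringsOnPath(int[] path,
                                      Map<Integer, Integer> tokenHasString,
                                      Map<Integer, String> strings) {
        List<String> result = new ArrayList<>();
        for (int tokenNode : path) {
            int stringNode = tokenHasString.get(tokenNode); // plays the role of follow(code)
            result.add(strings.get(stringNode));            // plays the role of retrieve(...)
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented node codes: tokens 10..12 point at string nodes 100..102.
        Map<Integer, Integer> tokenHasString = Map.of(10, 100, 11, 101, 12, 102);
        Map<Integer, String> strings = Map.of(100, "Dr", 101, "Jane", 102, "Smith");
        int[] path = {10, 11, 12};
        System.out.println(String.join(" ", stringsOnPath(path, tokenHasString, strings)));
        // prints "Dr Jane Smith"
    }
}
```

Keeping the edge (token to string) separate from the content store mirrors the two lookups in the rule's RHS, where one object is followed and the other is queried.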