How to Develop a Customized Transformer
by Thomas Ruess
Introduction
At the heart of Cocoon are the different sitemap components. When you start writing your own components, the one you will probably end up writing most often is a transformer. Transformers receive SAX events from the previous component in the pipeline and send SAX events to the following component. They therefore allow you to directly influence and change an XML stream. Using an example, I want to show you how to write a customized transformer.
Prerequisites
What are the prerequisites to start writing your own transformer? You need a JDK that matches the Cocoon version you run. Whatever editor you feel comfortable with is fine. For this example I used Cocoon 2.1 with JDK 1.4.1. As my favourite editor is currently NetBeans, I mounted the source file distribution (cocoon2.1/src/java) and created my own package structure (de/tomdat) within this folder. I assume that you already have Cocoon up and running, so I only indicate the changes you have to make in your sitemap to integrate the transformer.
Getting started
In the following example we are going to take a piece of XHTML-formatted text enclosed by a root element (here, paragraph) and strip off the XHTML tags to get the plain text.
<paragraph>
  <h1>He</h1><br/>
  <b>has the deed half done who has made a beginning</b>
  <font color="green">Horace</font>
</paragraph>
Save this file to your Cocoon file system and add the following code snippet to your sitemap. "tagtransformer" is the name under which we are going to make the transformer component known to Cocoon.
<map:match pattern="citation">
  <map:generate src="citation.xml" />
  <map:transform type="tagtransformer" />
  <map:serialize type="xml" />
</map:match>
The transformer
As with all other components, Cocoon provides an abstract base class: AbstractTransformer, which inherits from the AbstractXMLPipe class. To keep things easy, we will derive our TagTransformer from this class. If you have ever had to deal with SAX parsers, the methods of the class will already look familiar. In the startElement event method we test for the root element and pass it on by calling the super implementation of startElement. In the endElement method we do the same. As we only pass on the root element, all other tags are omitted and effectively stripped off. The text contained between the opening and closing tags is processed by the characters event method, which appends it to a string (our "plain text") and forwards it to the next component in the pipeline. Type or copy and paste the following code into your editor and compile it.
package de.tomdat.transformation;

import java.io.IOException;
import java.util.Map;

import org.apache.avalon.framework.parameters.Parameters;
import org.apache.cocoon.ProcessingException;
import org.apache.cocoon.environment.SourceResolver;
import org.apache.cocoon.transformation.AbstractTransformer;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;

/**
 * This class strips off all tags from an incoming XML stream and
 * passes on the plain text contained within a root element.
 */
public class TagTransformer extends AbstractTransformer {

    // contains the concatenated text
    private StringBuffer content = new StringBuffer();

    public void setup(SourceResolver resolver, Map objectModel,
                      String src, Parameters par)
            throws ProcessingException, SAXException, IOException {
        // start with an empty buffer each time the transformer is set up
        content = new StringBuffer();
    }

    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes attributes)
            throws SAXException {
        // pass on only the root element paragraph; drop every other start tag
        if (localName.equals("paragraph")) {
            super.startElement(namespaceURI, "paragraph", "paragraph", attributes);
        }
    }

    public void endElement(String namespaceURI, String localName, String qName)
            throws SAXException {
        // pass on only the closing tag of the root element
        if (localName.equals("paragraph")) {
            super.endElement(namespaceURI, "paragraph", "paragraph");
        }
    }

    public void characters(char[] buffer, int start, int length)
            throws SAXException {
        // concatenate the content; appending to an uninitialized String
        // would have prefixed the text with "null"
        content.append(buffer, start, length);
        // forward the text to the next component in the pipeline
        super.characters(buffer, start, length);
    }
}
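If you want to try the tag-stripping logic before deploying anything to Cocoon, the same three event methods can be exercised with the SAX classes that ship with the JDK. The following is only a minimal sketch under that assumption: TagStripDemo, TagStripFilter and plainText are hypothetical names invented for this illustration, and XMLFilterImpl merely stands in for AbstractTransformer. It also uses StringBuilder and @Override, so it needs a newer JDK than the 1.4.1 used elsewhere in this article.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class TagStripDemo {

    // Hypothetical stand-in for TagTransformer, built on plain SAX (not a Cocoon class).
    static class TagStripFilter extends XMLFilterImpl {
        // collects the concatenated character data
        final StringBuilder content = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            // forward only the root element; all other start tags are dropped
            if ("paragraph".equals(localName)) {
                super.startElement(uri, localName, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            // forward only the closing tag of the root element
            if ("paragraph".equals(localName)) {
                super.endElement(uri, localName, qName);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // record the text and pass it on to the next handler, if there is one
            content.append(ch, start, length);
            super.characters(ch, start, length);
        }
    }

    // Parses the given XML string and returns the text collected by the filter.
    static String plainText(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // ensures localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            TagStripFilter filter = new TagStripFilter();
            filter.setParent(reader);
            filter.parse(new InputSource(new StringReader(xml)));
            return filter.content.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<paragraph><h1>He</h1><br/>"
                + "<b>has the deed half done who has made a beginning</b>"
                + "<font color=\"green\">Horace</font></paragraph>";
        System.out.println(plainText(xml));
    }
}
```

Note that whitespace between elements counts as character data too, so an indented input file will produce plain text that contains the line breaks and indentation of the source.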
Shut down Tomcat (or whatever servlet container you use) if it is running. What we have to do now is make the compiled class file available to Cocoon. To do so, copy it into the classes folder within Cocoon's WEB-INF. The folder structure has to reflect the package name, so you will have to create a folder structure like de/tomdat/transformation. In your sitemap, make the new class known to Cocoon in the components section within transformers.
<map:transformer name="tagtransformer"
src="de.tomdat.transformation.TagTransformer" />
And that is all it takes to write a transformer of your own.
What next?
You might extend this class into a veritable format changer, replacing all sorts of tags. For example, if you have an element named paragraph and would like it to be named absatz (the German translation thereof), just change the local names in startElement, and make the same change in endElement so that the closing tag matches:
if (localName.equals("paragraph")) {
super.startElement(namespaceURI, "absatz", "absatz", attributes);
}
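The renaming idea can also be tried outside Cocoon with plain SAX. The sketch below is an illustration under that assumption, not Cocoon code: RenameDemo, RenameFilter, EventPrinter and rename are hypothetical names, and XMLFilterImpl again stands in for AbstractTransformer. Unlike the tag-stripping transformer, this filter forwards every element and only swaps the name of the matching one, in both the start and the end event.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class RenameDemo {

    // Hypothetical filter: renames one element and forwards everything else unchanged.
    static class RenameFilter extends XMLFilterImpl {
        private final String from;
        private final String to;

        RenameFilter(String from, String to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if (from.equals(localName)) {
                super.startElement(uri, to, to, atts);
            } else {
                super.startElement(uri, localName, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            // the end tag must be renamed as well, or the output is unbalanced
            if (from.equals(localName)) {
                super.endElement(uri, to, to);
            } else {
                super.endElement(uri, localName, qName);
            }
        }
    }

    // Simplified downstream handler that writes the events it receives as text
    // (attributes and escaping are ignored to keep the sketch short).
    static class EventPrinter extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(localName).append('>');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(localName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    // Runs the XML through the filter and returns what the downstream handler saw.
    static String rename(String xml, String from, String to) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // ensures localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            RenameFilter filter = new RenameFilter(from, to);
            EventPrinter printer = new EventPrinter();
            filter.setParent(reader);
            filter.setContentHandler(printer);
            filter.parse(new InputSource(new StringReader(xml)));
            return printer.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rename("<paragraph>Hi<b>x</b></paragraph>", "paragraph", "absatz"));
    }
}
```

Passing the element names in as constructor arguments, rather than hard-coding them, is a design choice of this sketch; it makes it easy to extend the filter to a lookup table of several tag names.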




