Project: Shakespeare XML and XSL

Playing with XML and XSL and need some data to work with? I came across a resource of all of Shakespeare's plays in an XML format. They were apparently first marked up by Jon Bosak before the XML recommendation had even been finalized. It's a great source of test data.

I began playing with them because I've appeared in a few Shakespeare plays and I had a few questions about my parts: Exactly how many lines did I have? Which scenes do I appear in? I thought I could answer most of my questions with a simple XSL transformation on the XML data, which was conveniently available.

Here are some of the XSL files I've put together. I've uploaded three of the play XML files for you to try them out on. If you need another play, download it yourself from the link above.

See how many lines each character has...

See who is in each scene...

These samples both rely on finding a way to group XML data, which isn't always the easiest thing to do. XSLT 1.0 doesn't have a nice grouping element. One technique is to loop through all the elements of the type you want, skipping any element whose value already appeared in a preceding element of the same type. I'll call that the preceding-sibling method. There is also the Muenchian method, which uses the XSL key element to build lists of nodes that share a common value or attribute. We then select only the nodes that happen to be first in the key's list of nodes for the value or attribute they have, which gives exactly one node per group. I know my explanation isn't the greatest, but there are plenty of better resources on the topic out there. While slightly harder to wrap your head around (at least the first time you use it), the Muenchian method seems to be the preferred (and usually faster) one. I've included an example of each method for the line-counting transformation in the downloads section.
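
As a rough illustration of the Muenchian method, here is a minimal sketch of a line-counting stylesheet. It assumes the play files use SPEECH elements that contain SPEAKER and LINE children; the key name speakers-by-name is made up for this example and is not necessarily what the downloadable files use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every SPEAKER element by its text, so all SPEAKER
       nodes with the same name end up in the same list.
       (Assumed markup: SPEECH contains SPEAKER and LINE.) -->
  <xsl:key name="speakers-by-name" match="SPEAKER" use="."/>

  <xsl:template match="/">
    <!-- Keep a SPEAKER only if it is the first node in the key's
         list for its own name: one node per distinct character. -->
    <xsl:for-each select="//SPEAKER[generate-id() =
                          generate-id(key('speakers-by-name', .)[1])]">
      <xsl:sort select="."/>

      <!-- Character name. -->
      <xsl:value-of select="."/>
      <xsl:text>: </xsl:text>

      <!-- All SPEAKER nodes with this name, up to their parent
           SPEECH elements, then count the LINE children. -->
      <xsl:value-of select="count(key('speakers-by-name', .)/../LINE)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With a command-line processor such as xsltproc, this could be run as `xsltproc count-lines.xsl hamlet.xml` (both file names are placeholders). The other method would drop the key and select `//SPEAKER[not(. = preceding::SPEAKER)]` instead. Since the speakers sit in different SPEECH elements, that version needs the preceding axis rather than preceding-sibling, and it rescans the earlier part of the document for every speaker, which is why it tends to be slower.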

Downloads