This page offers some hints on how to select and mark up a document.
Probably the most important thing is to select a document you are interested in! Although it is not necessary to read the whole document or book, it is certainly an advantage if you are interested in the subject matter, or have some sympathy for the views of the author.
Check our Document Index to see which documents have already been marked up, then check the Gutenberg lists to find something that interests you. It is often a good idea to cultivate an interest in a period, an author, or a certain kind of document.
Once you have decided on a document that you would like to mark up, you need to sign it out. This gives you "ownership" of that document for a certain period of time, and prevents two people from inadvertently working on the same document.
You sign out a document simply by e-mailing Frank Boumphrey (firstname.lastname@example.org), and copying Donna Smillie (email@example.com) and the HWG Gutenberg list (firstname.lastname@example.org). Be sure to put "Gutenberg signout" in the subject line, and let us know the author, title, and Gutenberg number of the text you are going to work on. This process will be automated at some time in the future, but for now we track the sign-outs manually.
There are numerous DTDs that can be used to mark up a document. There is a separate mailing list (HWG Gutenberg DTDs) devoted to developing these DTDs. We encourage everyone to use XHTML, one of our modular DTDs, or TEI. You can in fact use any DTD you want; however, it must either be widely available to the public, or we will have to store it on this site.
Although XHTML is familiar to most readers of this page, it has several drawbacks as a markup language for historical documents. The chief problem is that XHTML is a structural language: it describes the structure of the document rather than the type of content, so it does not describe the content of the document very well. However, a document marked up in XHTML can easily be converted to a richer form, and it can also be displayed quite nicely, so if you don't want to use a 'book' DTD, by all means go ahead and mark up the document in XHTML.
It is, however, possible to give semantic meaning to an XHTML document by using a span tag and the 'class' attribute. We show you how to do this in the XHTML tutorials.
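As a rough illustration of the idea, a sentence might be marked up along the following lines. The class names and the sample text here are invented for this sketch and are not taken from the XHTML tutorials, so treat them as placeholders only.

```xml
<!-- Hypothetical class names: "place", "date" and "person" are illustrative only -->
<p class="letter-opening">
  Written at <span class="place">London</span> on
  <span class="date">3 March 1845</span> by
  <span class="person">John Smith</span>.
</p>
```

The span elements leave the displayed text unchanged, but a stylesheet or a later conversion to a richer DTD can pick out every element carrying a given class value.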
The 'Gutbook' DTD that we have posted on this site is a modular DTD. This allows a marker to add any extensions they wish to it. The current DTD is far from perfect, and it is hoped that it will evolve over the years, and that we will add other modules to it. However, for stability's sake we will not bring out a new version more often than once a year. There is a mailing list devoted to discussing the evolution of the 'Gutbook' DTD.
There are numerous other DTDs that can be used to mark up documents. However, in keeping with the Gutenberg philosophy, they must be freely available for use by the public. For example, the ISO book DTDs are copyrighted, and one must pay a fee to use them, so they are not suitable for use in the public arena. If you use your own DTD, it must be accompanied by a notice, as a comment in the body of the DTD, that it may be freely copied without cost.
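A notice of this kind could look something like the sketch below. The file name, the wording of the notice, and the element declarations are all hypothetical; they only show where such a comment might sit in a DTD.

```xml
<!--
  mybook.dtd (hypothetical file name and content model, for illustration only)
  This DTD may be freely copied and used by anyone without cost.
-->
<!ELEMENT book (title, chapter+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT chapter (para+)>
<!ELEMENT para (#PCDATA)>
```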
Once you have signed out a document, you then mark it up. Take your time, and do this as carefully as you can. Remember that we expect your document to be around for hundreds of years! We have provided a series of tutorials to help you to do this.
Before signing the document back in, make sure that it validates. Then e-mail it back to Frank Boumphrey (email@example.com), with a copy to Donna Smillie (firstname.lastname@example.org). The document will be checked by an independent person and then posted in our Document Index. We will be looking for volunteers to check, verify, and validate documents. This work is just as important as marking up a document. You will be credited in the meta information of the document.
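A document can only be validated if it declares the DTD it was written against. As a minimal sketch, assuming you chose XHTML 1.0 Strict (an assumption made for this example; use the declaration for whichever DTD you actually worked with), the skeleton of a valid document might look like this, with the title and paragraph text as placeholders:

```xml
<!-- Assumption: XHTML 1.0 Strict; substitute the DOCTYPE of the DTD you used -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- The title element is required for the document to validate -->
    <title>Title of the text (placeholder)</title>
  </head>
  <body>
    <h1>Title of the text (placeholder)</h1>
    <p>First paragraph of the text.</p>
  </body>
</html>
```

A validating parser reads the DOCTYPE declaration, fetches the DTD it names, and reports any element or attribute that the DTD does not allow at that point.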