About Standards for Transcribing Text

In our last meeting we discussed some of the challenges of transcribing text with corrections, alterations, strikeouts, ambiguous letters, etc., and I briefly mentioned some transcription projects that have dealt with similar issues. A hackathon participant, Ben Brumfield, has much more experience with this topic, so first I'll point you to some information he has compiled. His blog home page (http://manuscripttranscription.blogspot.com) currently has a transcript of his talk about the variety of formats that various projects are using. It's a worthwhile read.

If we decide to transcribe or preserve ambiguous or corrected/struck-out characters, then the Text Encoding Initiative (TEI) format might be a good start, though it would require wrapping text in XML elements written in angle brackets. A lighter approach might be to use one of the wiki-style markup formats like Markdown (http://daringfireball.net/projects/markdown/syntax) or Textile (http://txstyle.org).
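To give a feel for what the TEI route would mean in practice, here is a rough sketch in Python. It assumes a made-up transcription line marked up with the TEI element names del (struck out), add (inserted), and unclear (uncertain reading). The sample sentence, the enclosing line wrapper, the reading_text helper, and the bracket notation it prints are all my own inventions for illustration, not part of any project's guidelines, and a real TEI file would also carry the TEI namespace, which is left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription line (invented for illustration).
#   <del>     text struck out by the writer
#   <add>     text inserted by the writer
#   <unclear> a reading the transcriber is unsure of
SAMPLE = (
    "<line>Paid <del>ten</del><add>twelve</add> shillings to "
    "<unclear>Mr. Hale</unclear> for oats.</line>"
)


def reading_text(elem, show_changes=False):
    """Flatten an element to plain text.

    By default struck-out text is dropped and insertions are kept as-is.
    With show_changes=True, deletions appear as [-...-] and insertions
    as [+...+]. Uncertain readings always appear as [...?].
    """
    parts = [elem.text or ""]
    for child in elem:
        inner = reading_text(child, show_changes)
        if child.tag == "del":
            if show_changes:
                parts.append("[-" + inner + "-]")
            # otherwise the struck-out text is omitted entirely
        elif child.tag == "add":
            parts.append("[+" + inner + "+]" if show_changes else inner)
        elif child.tag == "unclear":
            parts.append("[" + inner + "?]")
        else:
            parts.append(inner)
        # Text that follows the child element, up to the next element.
        parts.append(child.tail or "")
    return "".join(parts)


line = ET.fromstring(SAMPLE)
print(reading_text(line))
print(reading_text(line, show_changes=True))
```

The point is that one marked-up source can produce both a clean reading text and a version that shows the writer's changes, which is harder to do if we only record the final reading.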

Below I've listed some projects that either establish transcription or markup standards or have published guidelines or suggestions about how to transcribe text.
TEI elements for representing primary documents (in particular, errors, corrections, alterations, ambiguity, etc.) - http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html#PHCH
FreeReg, register transcription - http://www.freereg.org.uk/howto/transcribe.htm
Transcribe Bentham Guidelines (seems to be based on TEI) - http://www.transcribe-bentham.da.ulcc.ac.uk/td/Help:Transcription_Guidelines
New York Public Library Menu transcription guidelines - http://menus.nypl.org/help
National Archives Transcription tips - http://transcribe.archives.gov/tips

Projects that might have additional approaches to transcription:
http://scripto.org
http://www.uscript.org
http://transcriptorium.eu
http://t-pen.org