Text Transcription Issues: Difference between revisions

Revision as of 17:21, 2 January 2013

About Standards for Transcribing Text

Content here begins with resources put together by Jason Best (thank you Jason) in an email sent to the AOCR wg on 19 December 2012.

In our last meeting (18 Dec 2012) we discussed some of the challenges of transcribing text with corrections, alterations, strikeouts, ambiguous letters, etc., and I [Jason Best] briefly mentioned some transcription projects that have dealt with similar issues. A hackathon participant, Ben Brumfield, has much more experience with this topic, so first I'll point you to some information he has compiled. His blog home page (http://manuscripttranscription.blogspot.com) currently has a transcript of his talk about the variety of formats that various projects are using. It is a worthwhile read.

If we decide to transcribe or preserve ambiguous, corrected, or struck-out characters, then the Text Encoding Initiative (TEI) format might be a good start, though it would require the use of XML elements in angle brackets. A more lightweight approach might be to use one of the lightweight markup formats, such as:
- Markdown (http://daringfireball.net/projects/markdown/syntax) or
- Textile (http://txstyle.org).
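
To make the trade-off between the two approaches concrete, here is a minimal Python sketch that maps a lightweight notation onto TEI elements. The lightweight conventions (`~~text~~` for struck-out text, `[text?]` for an uncertain reading) and the function name `lightweight_to_tei` are assumptions made up for illustration; they are not taken from any of the projects mentioned here. The target elements `<del>` and `<unclear>` are defined in TEI P5, but a real encoding would likely need attributes and a surrounding document structure that this sketch leaves out.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical lightweight conventions (illustrative only):
#   ~~text~~  -> struck-out text, encoded as TEI <del>
#   [text?]   -> uncertain reading, encoded as TEI <unclear>
_DEL = re.compile(r"~~(.+?)~~")
_UNCLEAR = re.compile(r"\[(.+?)\?\]")


def lightweight_to_tei(line: str) -> str:
    """Convert one line of lightweight transcription markup to TEI-style XML."""
    # Escape &, < and > first so literal characters in the transcription
    # cannot be confused with the markup added below.
    text = escape(line)
    text = _DEL.sub(r"<del>\1</del>", text)
    text = _UNCLEAR.sub(r"<unclear>\1</unclear>", text)
    return text


print(lightweight_to_tei("Collected ~~Jun~~ July 12 near [Austin?] & Waco"))
```

Transcribers would type only the short notation, and the XML could be generated afterwards, which is one way a lightweight format and TEI might coexist.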

Below I've listed some projects that either establish transcription or markup standards or have published guidelines or suggestions about how to transcribe text.
- TEI elements for representing primary documents (in particular, errors, corrections, alterations, ambiguity, etc.) - http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html#PHCH
- FreeReg, register transcription - http://www.freereg.org.uk/howto/transcribe.htm
- Transcribe Bentham Guidelines (seems to be based on TEI) - http://www.transcribe-bentham.da.ulcc.ac.uk/td/Help:Transcription_Guidelines
- New York Public Library Menu transcription guidelines - http://menus.nypl.org/help
- National Archives Transcription tips - http://transcribe.archives.gov/tips

Projects that might have additional approaches to transcription
- http://scripto.org
- http://www.uscript.org
- http://transcriptorium.eu
- http://t-pen.org

