Problems of Encoding Digital Text Documents in Arabic Using the International Standard TEI

Rachid Zghibi

doi:10.12816/0025944

Authors

Rachid Zghibi Assistant professor, Higher Institute of Documentation, University of Manouba, Tunisia

Abstract

This research examines the most important technical problems that can arise when encoding Arabic text documents with TEI, one of the most important international standards and the one most commonly used by many libraries, information centers and archives around the world to digitize their records.

The first part of the research discusses the origins and evolution of this international standard and the most important characteristics that distinguish it from other encoding systems, along with the structure and components of a TEI file. The second part discusses the problems themselves, which relate in particular to the inclusion of some Arabic graphemes and to the relations between letters in Arabic, as well as to the printing and display of bidirectional texts. This part also suggests some solutions to these problems, such as using additional characters and codes that can be integrated directly with the standard tags or written in external files. These solutions concern Arabic letters specifically: they are intended to display the letters in their correct form, or as they appear in the original texts.
