E-text

An e-text (from “electronic text”, sometimes written as etext) is a general term for any document read in electronic form, and especially a document that is mainly text. For example, a computer-based book of art with minimal text, or a set of photographs or scans of pages, would not usually be called an “e-text”. The term is usually synonymous with e-book.

An e-text may be a binary or a plain text file, viewed with any open source or proprietary software. An e-text may have markup or other formatting information, or not.

An e-text may be an electronic edition of a work originally composed or published in other media, or may be created in electronic form originally.

E-text origins

E-texts, or electronic documents, have been around since long before the Internet, the Web, and specialized e-book reading hardware. Roberto Busa began work on an electronic edition of Aquinas in the 1940s, while large-scale electronic text editing, hypertext, and systems such as Augment and FRESS appeared in the 1960s. These early systems made extensive use of formatting, markup, automatic tables of contents, hyperlinks, and other information in their texts, as well as graphics. [1]

“Just plain text”

In some communities, “e-text” is used much more narrowly, to refer to electronic documents that are, so to speak, “plain vanilla ASCII”. This means not only that the document is a plain text file, but also that it has no information beyond “the text itself”: no representation of bold or italics, or of paragraph, page, chapter, or footnote boundaries, and so on. Michael S. Hart, [2] for example, argued that this is the only text mode that is easy on both the eyes and the computer. Hart made the point that proprietary word-processor formats made texts grossly inaccessible, but that objection does not apply to standard, open data formats. The narrow sense of “e-text” is now uncommon, because the notion of “just vanilla ASCII”, attractive at first glance, has turned out to have serious difficulties:

First, this narrow type of “e-text” is limited to the letters of the English alphabet. Not even the Spanish ñ or the accented vowels used in many European languages can be represented. Asian, Slavic, Greek, and other writing systems are impossible.
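The character-set limit is easy to check directly. The following is a minimal sketch, assuming Python 3 and its built-in "ascii" codec; the sample words and the helper name `fits_in_ascii` are illustrative choices, not part of any e-text standard.

```python
# Sample words in several writing systems (illustrative choices only).
samples = {
    "English": "plain text",
    "Spanish": "año",
    "Greek": "λόγος",
    "Russian": "текст",
    "Japanese": "テキスト",
}


def fits_in_ascii(text: str) -> bool:
    """Return True if every character of `text` can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Map each language to whether its sample survives strict ASCII encoding.
results = {language: fits_in_ascii(text) for language, text in samples.items()}

for language, ok in results.items():
    print(f"{language}: {'fits in ASCII' if ok else 'cannot be represented'}")

# Forcing the text into ASCII anyway loses information: the ñ becomes "?".
lossy = "año".encode("ascii", errors="replace").decode("ascii")
print(lossy)  # a?o
```

Only the English sample passes. Forcing the others into ASCII either raises an error or silently replaces characters, so the original spelling cannot be recovered from the file.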

Second, diagrams and pictures cannot be accommodated, and many books contain at least some such material; often it is essential to the book.

Third, “e-texts” in this narrow sense have no reliable way to distinguish “the text” from other things that occur in a work. For example, page numbers, page headers, and footnotes might be omitted, or might simply appear as additional lines of text, perhaps with blank lines before and after (or not). An ornate separator line might be represented instead by a line of asterisks (or not). Headings might be detectable by capitalization if they were all caps in the original (or not). Even discovering which conventions have been used makes each book a new research or reverse-engineering project.

As a consequence, such texts cannot be reliably re-formatted. A program cannot reliably tell where footnotes, headers, or footers are, or perhaps even where paragraphs begin and end, so it cannot re-arrange the text, for example to fit a screen, or read it aloud for the visually impaired. Programs might apply heuristics to guess at the structure, but these can easily fail.
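To make the fragility of such heuristics concrete, here is a small sketch of one plausible rule: treat any line written entirely in capitals as a heading. The rule, the function name `guess_headings`, and the sample lines are all assumptions made for illustration; they are not drawn from any particular e-text collection or tool.

```python
def guess_headings(lines):
    """Heuristic: a non-empty line written entirely in upper case is a heading."""
    return [line.strip() for line in lines
            if line.strip() and line.strip().isupper()]


# Invented sample: two real headings in different styles, plus one line of
# body text that happens to be in capitals.
sample = [
    "CHAPTER I",
    "",
    "It was a dark night. The sign on the door read:",
    "",
    "NO ENTRY",
    "",
    "Chapter II",
    "",
    "The next morning was brighter.",
]

headings = guess_headings(sample)
print(headings)  # ['CHAPTER I', 'NO ENTRY']
```

The rule finds the first chapter heading. It also reports the sign “NO ENTRY”, which is body text, and it misses “Chapter II” because that heading uses mixed case. Without explicit markup, the program has no way to know which guesses are wrong.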

Fourth, and perhaps surprisingly important, a “plain-text” e-text affords no way to represent information about the work. For example, is it the first or the tenth edition? Who prepared it, and what rights do they reserve or grant to others? Is this the raw version straight off a scanner, or has it been proofread and corrected? Even when such information is included, nothing in the format indicates whether or where it is present. At best, the text of the title page might be included (or not), perhaps with centering imitated by indentation.

Fifth, texts with more complicated information cannot really be handled at all. This includes bilingual editions, critical editions with footnotes, commentary, critical apparatus, or cross-references, and even the simplest tables. This leads to endless practical problems: for example, if the computer cannot reliably detect footnotes, it cannot find a sentence that a footnote interrupts.
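The footnote problem can be sketched as follows. The sample page, the search phrase, and the `(kind, text)` tuple representation are hypothetical; the tuples stand in for any markup scheme that labels footnotes explicitly and are not a real e-text format.

```python
import re


def normalize(text):
    """Collapse all runs of whitespace, including line breaks, to single spaces."""
    return re.sub(r"\s+", " ", text).strip()


# Plain-text page: a footnote sits in the middle of a sentence, unmarked.
plain = (
    "The treaty was signed in the spring, and the\n"
    "1 See the appendix for the full text.\n"
    "delegates returned home soon afterwards.\n"
)

phrase = "and the delegates returned"
found_in_plain = phrase in normalize(plain)

# The same page with each line labeled by its role (hypothetical structure).
tagged = [
    ("body", "The treaty was signed in the spring, and the"),
    ("footnote", "1 See the appendix for the full text."),
    ("body", "delegates returned home soon afterwards."),
]
body_only = normalize(" ".join(text for kind, text in tagged if kind == "body"))
found_in_tagged = phrase in body_only

print(found_in_plain)   # False
print(found_in_tagged)  # True
```

The search fails on the plain version because the footnote line is indistinguishable from the body text around it. Once the footnote is labeled, it can be set aside and the sentence is found.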

Even raw scanner OCR output usually carries more information than this, such as the use of bold and italic type. If this information is not kept, it is expensive and time-consuming to reconstruct; more sophisticated information, such as which edition you have, may not be recoverable at all.

In actuality, even “plain text” uses some kind of “markup”, usually control characters, spaces, tabs, and the like: spaces between words, or two returns and 5 spaces for a new paragraph. Plain text differs from more formal markup in that it relies on implicit, usually undocumented conventions, which are therefore inconsistent and difficult to recognize. [3]
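A short sketch shows how two such implicit paragraph conventions can disagree about the same file. Both splitter functions and the sample text are illustrative assumptions; real plain-text files mix these and other conventions without declaring which one is in use.

```python
import re


def split_on_blank_lines(text):
    """Convention A: paragraphs are separated by one or more blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_on_indent(text, indent=5):
    """Convention B: a line starting with `indent` spaces opens a new paragraph."""
    paragraphs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines entirely under this convention
        if line.startswith(" " * indent) or not paragraphs:
            paragraphs.append(line.strip())
        else:
            # An unindented line continues the current paragraph.
            paragraphs[-1] += " " + line.strip()
    return paragraphs


# A text that marks paragraphs only by a five-space indent.
indented = (
    "     First paragraph, first line.\n"
    "continues here.\n"
    "     Second paragraph.\n"
)

by_blank = split_on_blank_lines(indented)
by_indent = split_on_indent(indented)

print(len(by_blank))   # 1
print(len(by_indent))  # 2
```

A reader that expects blank lines sees a single paragraph, while one that expects indentation sees two. Because nothing in the file states which convention applies, software has to guess.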

The narrow sense of e-text as “plain vanilla ASCII” has fallen out of favor. Nevertheless, many such texts are freely available on the Web, possibly because they are much easier to produce. For many years Project Gutenberg strongly favored this model of text, but with time it has begun to develop and distribute more capable forms such as HTML.

See also

  • Text file
  • e-book
  • Electronic paper
  • Digital library
  • Online Books Page
  • Project Gutenberg
  • Distributed Proofreaders
  • The Association of Universal Bibliophiles
  • Higher intellect project

References

  1. Reading and Writing the Electronic Book. Nicole Yankelovich, Norman Meyrowitz, and Andries van Dam. IEEE Computer 18 (10), October 1985. http://dl.acm.org/citation.cfm?id=4407
  2. Michael S. Hart
  3. Coombs, James H.; Renear, Allen H.; DeRose, Steven J. (November 1987). “Markup systems and the future of scholarly text processing”. Communications of the ACM. ACM. 30 (11): 933-947. doi:10.1145/32206.32209.