Proofreading is the foundation of Wikisource, providing the best quality texts in our library. The process involves two "namespaces" (sections of Wikisource, identified by a prefix at the start of the page title) and a special piece of software. Together, these two namespaces (Index and Page) are sometimes called the "workspace". This is where the proofreading, editing and other "back room" processes are done.
The process is based on page scans of a physical book, usually in the form of a DjVu file. This is used to make an Index page, which is a page in the "Index" namespace with the same name as the DjVu file. Each individual page in the book is a separate page in the "Page" namespace. The Index page will link to the pages and each page needs to be proofread.
The following guide will explain how to proofread a page, with pointers to other pages with more detailed information. For a guide to the Index page portion of proofreading, see Help:Beginner's guide to Index: files.
Proofreading is based around the Index page and all of the connected Page-namespace pages. The first step is to find one page to proofread. You will probably be starting from an "index" page. The index page shows a photo of the cover or first page. Below the picture is a list of all the page numbers.
When you view a page in the Page namespace, the screen will be split into two sections (fig 1). This is the default side-by-side layout that allows users to proofread the text on Wikisource (left section) against the scanned text (right section). When you click Show Preview on a page in the Page namespace, the screen will then have three sections. The text edit window and the scanned text section remain as they are, with the previewed text showing in an area above the other two sections.
To proofread a page, you should edit the text in the left section so that it matches the scan in the right section as much as possible.
You do not have to make an identical, photographic copy of the scan. Wikisource is a website, not a book, and the text is more important than the typography. You should just try to get as close as possible. Some things work in books but do not work on Wikisource. For example, if the text was originally in columns (like a newspaper), then preserving that formatting is not necessary and does not work well on Wikisource, because several pages will be added together in the main namespace when proofreading is finished. Instead, use normal paragraphs without columns, placed in the order that you would naturally read the page.
When you save the page, you should also set the page status. You should see a row of color-coded radio buttons just above the save button (fig 2). If you have just started a page with no (or not many) changes, then select the red button (for "Not proofread"). If you have completely proofread the page and corrected every error you can find, then select the yellow button (for "Proofread").
Some pages will have been proofread already by other people. You can check these and upgrade the page status. Look through the page for any remaining errors or things that need to be changed. If there are no errors, or you have fixed everything that needs to be fixed, increase the page status by one level. "Not proofread" (red) pages become "Proofread" (yellow), which become "Validated" (green). Validated pages are finished and should not need any more editing. Blank pages (gray) and Problematic pages (blue) are special cases; see below for more information.
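The status levels described above can be sketched as a small model. This is only an illustration of the progression, not how the Wikisource software stores or changes page status; the names `PageStatus`, `PROGRESSION` and `upgrade` are hypothetical.

```python
from enum import Enum


class PageStatus(Enum):
    """Page statuses and their colors, as described in this guide."""
    NO_TEXT = "gray"
    NOT_PROOFREAD = "red"
    PROBLEMATIC = "blue"
    PROOFREAD = "yellow"
    VALIDATED = "green"


# The normal progression; gray and blue are special cases outside it.
PROGRESSION = [
    PageStatus.NOT_PROOFREAD,
    PageStatus.PROOFREAD,
    PageStatus.VALIDATED,
]


def upgrade(status: PageStatus) -> PageStatus:
    """Return the status one level up, after a check found no remaining errors."""
    if status not in PROGRESSION:
        raise ValueError(f"{status.name} is a special case with no next level")
    index = PROGRESSION.index(status)
    # Validated pages are finished, so they stay at the top level.
    return PROGRESSION[min(index + 1, len(PROGRESSION) - 1)]
```

For example, `upgrade(PageStatus.NOT_PROOFREAD)` returns `PageStatus.PROOFREAD`. The sketch moves a page up by exactly one level per check, matching the rule that each check raises the status by one level.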
Headers and footers will be added automatically if they are set on the Index page; the only change you then need to make is to the page number. If they are not added, use {{rh}}. Headers and footers are not transcluded into the main namespace.
Blank pages can be left blank and set to the "No text" (gray) page status. These pages will be ignored when pages are added to the main namespace.
This includes book covers, unless illustrated. This does not include pages with an illustration, which should be proofread as normal. If the illustration is unavailable at present, see Problematic pages.
If you have a problem while proofreading a page and cannot finish it, you can set the page status to "Problematic" (blue). This will alert other people that a problem exists, which they may be able to solve.
Common problems include pages with illustrations (if no image file is available), pages with equations, pages with foreign text (especially text that does not use the Latin alphabet) and pages with special formatting. In some of these cases, special templates exist to identify the problem (see Problem templates, below). These are useful to anyone else looking at the page and they can attract the attention of people able to fix the problem.
Optical Character Recognition (OCR) is the function used by computers to read text. This is often saved within DjVu files and is extracted by the computer when a new page is started in proofreading. However, computers are not very good at reading printed text and errors (sometimes called "scanos") can be quite frequent. This table shows some common errors made by computers that will need to be found and corrected during proofreading.
For example:

OCR error      Correction
tlie           the
a11, aH, aU    all
au             an
\vas           was
mc             me

Other common things to correct

There are some templates that can be necessary when proofreading a page.
{{nop}}: Place this template on a line by itself at the end of a page, where a paragraph ends the page. If the paragraph continues onto the next page, do not use this template.
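The common OCR errors listed earlier can also be searched for with a short script. The following is a minimal sketch, assuming the page text is available as a plain Python string; the names `OCR_CORRECTIONS` and `suggest_corrections` are hypothetical and are not part of any Wikisource tool.

```python
# Corrections copied from the examples in this guide; real scans contain many more.
OCR_CORRECTIONS = {
    "tlie": "the",
    "a11": "all",
    "aH": "all",
    "aU": "all",
    "au": "an",
    "\\vas": "was",
    "mc": "me",
}


def suggest_corrections(text: str) -> list[tuple[str, str]]:
    """Return (found, suggestion) pairs for known OCR errors, in order of appearance."""
    suggestions = []
    # Split on whitespace so that tokens such as "\vas" stay intact.
    for token in text.split():
        # Drop surrounding punctuation before looking the word up.
        word = token.strip(".,;:!?\"'()")
        if word in OCR_CORRECTIONS:
            suggestions.append((word, OCR_CORRECTIONS[word]))
    return suggestions
```

The function only suggests changes and does not apply them. Some of these strings can be legitimate text (for example "au" in a French phrase), so each suggestion still has to be checked against the scan.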
Proofreading templates

These should be used if there is a problem that you cannot fix yourself. When using one of these, also set the page status to "Problematic" (blue).