LATEX Typesetting - Beginners Guide - Peter Flynn - 2003
A beginner's introduction to typesetting with LATEX
- Peter Flynn
- Silmaril Consultants
- Textual Therapy Division
- This document is Copyright © 1999–2005 by Silmaril Consultants under the terms of what is now the GNU Free Documentation License (copyleft), attached at the end of this document.
- v. 3.6 (March 2005)
Thanks to all the people who sent me corrections and suggestions for improvement or additions to earlier versions. As usual, the problem has been what to leave out, not what to include. Some of the suggestions were well-intentioned but would have turned the book into a higher-level mathematics treatise. One of my objectives was to omit all maths except for a short example, as all the other books on TEX and LATEX already cover mathematical typesetting in finer and better detail than I am capable of.
Some of the suggestions would have taken me down pathways I prefer not to tread. Large software corporations are full of well-meaning, hard-working individuals who genuinely believe that their products make life easier for users. Unfortunately, experience shows that this is often only true in the first hot flush of using a new program: in the long run the winners are those whose data is secure, accessible, and reusable; whose documents can be reformatted at any time, on any platform, without penalty, financial or otherwise.
I make no apology for recommending Unix-like systems running LATEX as the platform of choice for document-processing applications (if you have a choice), and I’m happy to welcome the Apple Macintosh to that family. Unfortunately, there are those whose circumstances at home or work require them to use something else, and I am pleased that LATEX can help them by being available on their platform as well.
I have incorporated all the remaining suggestions except where it would materially distort the objective of being a beginner’s booklet. It is very difficult for documenters to remember how they struggled to learn what is now a familiar system. So much of what we do is second nature, and much of that actually has nothing to do with the software, but more to do with the way in which you view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge of computing, please let me know so that I can correct it.
Preface Version 3.5
This edition of Formatting Information was prompted by the generous help I have received from TEX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LATEX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.
I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market (as well as other online documents) dealing with mathematical typesetting in TEX and LATEX in finer and better detail than I am capable of.
The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.
As I was finishing this edition, I was asked to review an article for The PracTEX Journal, which grew out of the Practical TEX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.
Peter Flynn is author of The HTML Handbook and Understanding SGML and XML Tools, and editor of The XML FAQ.
This book originally accompanied a 2-day course on using the LATEX typesetting system. It has been extensively revised and updated and can now be used for self-study or in the classroom. It is aimed at users of Linux, Macintosh, or Microsoft Windows but it can be used with LATEX systems on any platform, including other Unix workstations, mainframes, and even your Personal Digital Assistant (PDA).
Who needs this book?
The audience for the original training course was assumed to be computer-literate and composed of professional, business, academic, technical, or administrative computer users. The readers of the book (you) are mostly assumed to be in a similar position, but may also come from many other backgrounds, including hobbyists, students, and just people interested in quality typesetting. You are expected to have one or more of the following or similar objectives:
- producing typesetter-quality formatting;
- formatting long, complex, highly-structured, repetitive, or automatically-generated documents;
- saving time and effort by automating common tasks;
- achieving or maintaining your independence from specific makes or models of proprietary hardware, software, or file formats (portability);
- using Open Source software (free of restrictions, sometimes also free of charge).
LATEX can easily be used for once-off or short and simple documents as well, but its real strengths lie in consistency and automation.
LATEX is a very easy system to learn, and requires no specialist knowledge, although literacy and some familiarity with the publishing process are useful. It is, however, assumed that you are completely fluent and familiar with using your computer before you start. Specifically, effective use of this document requires that you already know and understand the following very thoroughly:
- how to use a good plain-text editor (not a wordprocessor like OpenOffice, WordPerfect, or Microsoft Word, and not a toy like Microsoft Notepad);
- where to find all 95 of the printable ASCII characters on your keyboard and what they mean, and how to type accents and symbols, if you use them;
- how to create, open, save, close, rename, move, and delete files and folders (directories);
- how to use a Web browser and/or File Transfer Protocol (FTP) program to download and save files from the Internet;
- how to uncompress and unwrap (unzip or detar) downloaded files.
If you don't know how to do these things yet, it's important to go and learn them first. Trying to become familiar with the fundamentals of using a computer at the same time as learning LATEX is not likely to be as effective as doing them in order.
These are not specialist skills — they are all included in the European Computer Driving Licence (ECDL) and the relevant sections of the ECDL syllabus are noted in the margin above, so they are well within the capability of anyone who uses a computer.
Objectives of this book
By the end of this book, you should be able to undertake the following tasks:
- use a plain-text editor to create and maintain your documents;
- add LATEX markup to identify your document structure and formatting requirements;
- typeset LATEX documents, correct simple formatting errors, and display or print the results;
- identify, install, and use additional packages (using CTAN for downloading where necessary);
- recognise the limitations of procedural markup systems and choose generic markup methods where appropriate.
The original course covered the following topics as separate sessions, which are represented in the book as chapters:
- Where to get and how to install LATEX (teTEX, fpTEX, or proTEXt from the TEX Collection disks);
- How to type LATEX documents: using an editor to create files (half a dozen editors for LATEX);
- Basic structures (the Document Class Declaration and its layout options; the document environment with sections and paragraphs);
- Typesetting, viewing, and printing;
- The use of packages and CTAN to adapt formatting using standard tools;
- Other document structures (lists, tables, figures, images, and verbatim text);
- Textual tools (footnotes, marginal notes, cross-references, indexes and glossaries, and bibliographic citations);
- Typographic considerations (white-space and typefaces; inline markup and font changes; extra font installation and automation);
- Programmability and automation (macros and modifying LATEX's behaviour);
- Conversion and compatibility with other systems (XML, Word, etc.).
A few changes have been made in the transition to printed and online form, but the basic structure is the same, and the document functions as a workbook for the course as well as a standalone self-teaching guide.
Where's the math?
It is important to note that the document does not cover mathematical typesetting, complex tabular material, the design of large-scale macros and document classes, or the finer points of typography or typographic design, although it does refer to these topics in passing on a few occasions.
There are several other guides, introductions, and ‘get-started’ documents on the Web and on CTAN which cover these topics and more. Among the more popular are:
- Getting Started with TEX, LATEX, and friends, where all beginners should start;
- The (Not So) Short Guide to LATEX 2ε: LATEX 2ε in 131 Minutes is a good beginner's tutorial;
- A Gentle Introduction to TEX: A Manual for Self-Study is a classic tutorial on Plain TEX;
- Using imported graphics in LATEX 2ε shows you how to do (almost) anything with graphics: side-by-side, rotated, etc.;
- Short Math Guide for LATEX gets you started with the American Mathematical Society's powerful packages;
- A comprehensive list of symbols in TEX shows over 2,500 symbols available.
This list was taken from the CTAN search page. There are also lots of books published about TEX and LATEX: the most important of these for users of this document are listed in the last paragraph in the Foreword.
Availability of LATEX
Because the TEX program (the ‘engine’ which actually does the typesetting) is separate from whichever editor you choose, TEX-based systems are available in a variety of different modes using different interfaces, depending on how you want to use them.
The normal way to run LATEX is to use a toolbar button (icon), a menu item, or a keystroke in your editor. Click on it and your document gets saved and typeset. All the other features of LATEX systems (the typeset display, spellchecker, related programs like makeindex and BIBTEX) are run the same way. This works both in a normal Graphical User Interface (GUI) as well as in text-only interfaces.
In the popular LATEX editors like Emacs, TEXshell, TEXnicCenter, WinShell, or WinEdt, a record of the typesetting process is shown in an adjoining window so that you can see the progress of pages being typeset, and any errors or warnings that may occur.[2]
However, the graphical interface is useless if you want to run LATEX unattended, as part of an automated system, perhaps in a web server or e-commerce environment, where there is no direct connection between user and program. The underlying TEX engine is in fact a Command-Line Interface (CLI) program, that is, it is used as a ‘console’ program which you run from a standard Unix or Mac terminal or shell window (or from an MS-DOS command window in Microsoft Windows systems). You type the command
latex followed by the name of your document file (see section 4.1.2 for an example).
Commands like these let you run LATEX in an automated environment like a Common Gateway Interface (CGI) script on a web server or a batch file on a document system. All the popular distributions for Unix and Windows, both free and commercial, include this interface as standard (teTEX, fpTEX, MiKTEX, proTEXt, PC-TEX, TrueTEX, etc.).
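As a rough illustration of what the latex command operates on, here is a minimal sketch of a document file. The filename mydoc.tex and the sentence inside it are hypothetical, invented for this example; on a standard installation, saving the file and typing latex mydoc in a terminal or command window should typeset it and write the result to mydoc.dvi.

```latex
% mydoc.tex: a minimal LaTeX document (hypothetical filename and content)
\documentclass{article}
\begin{document}
Hello, world. This paragraph was typeset by \LaTeX{} from the command line.
\end{document}
```

Because nothing here depends on an editor or a previewer, the same file could equally be processed by a script or batch job with nobody at the keyboard.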
LATEX usually displays your typeset results in a separate window, redisplayed automatically every time the document is reprocessed, because the typesetting is done separately from the editing. Some systems, however, can format the typesetting while you type, at the expense of some flexibility.
- Asynchronous typographic displays
- This method is called an asynchronous typographic display because the typeset window only updates after you have typed something and reprocessed it, not while you are still typing, as it would with a wordprocessor.[3]
- Synchronous typographic displays
- Some distributions of LATEX offer a synchronous typographic interface. In these, you type directly into the typographic display, as with a wordprocessor. Three popular examples are Textures, Scientific Word, and VTEX (see table below). At least one free version (LYX, see Figure 2.1 in section 2.3) offers a similar interface. With a synchronous display you get Instant Textual Gratification™, but your level of control is restricted to that of the GUI you use, which cannot provide access to everything that LATEX can do. For complete control of the formatting you may still need access to your normal source (input) file in the same way as for asynchronous implementations.
- Near-synchronous displays
- There are several other methods available free for Unix and some other systems for close-to-synchronous updates of the typeset display (including Jonathan Fine's Instant Preview and the TEX daemon), and for embedding typographic fragments from the typeset display back into the editor window (David Kastrup's preview-latex package).
Whatever method you choose, the TEX Collection CD and CTAN are not the only sources of software. The vendors listed in the table below offer excellent commercial implementations of TEX and LATEX, and if you are in a position where their enhanced support and additional features are of benefit, I urge you to support them. In most cases their companies, founders, and staff have been good friends of the TEX and LATEX communities for many years.
|Product|Platform|Company|
|PCTEX|MS-Windows|Personal TEX, Inc|
|Textures|Apple Mac|Blue Sky Research|
|Scientific Word|MS-Windows|Mackichan Software|
|VTEX|MS-Windows, Linux, OS/2|MicroPress, Inc|
[2] Recent versions of some editors hide this display by default unless errors occur in the typesetting.
[3] Among other reasons, TEX typesets whole paragraphs at a time, not line-by-line as lesser systems do, in order to get the hyphenation and justification (H&J) right (see section 2.8).
The complete source, with all ancillary files, is available online at http://www.ctan.org/tex-archive/info/beginlatex/src/ but if you want to try processing it yourself you must install Java (from Sun, IBM, or a number of others) and Saxon (from http://saxon.sourceforge.net/), in addition to LATEX.
Symbols and conventions
The following typographic notations are used:
|Notation|Meaning|
|\command|Control sequences which perform an action|
|\length|Control sequences which store a dimension (measurement in units), e.g. \parskip|
|counter|Values used for counting (whole numbers, as opposed to measuring in units), e.g. secnumdepth|
|term|Defining instance of a new term|
|environment|A LATEX formatting environment|
|package|A LATEX package (available from CTAN)|
|product|Program or product name|
|(typewriter type)|Examples of source code (stuff you type)|
|mybook or value|Mnemonic examples of things you have to supply real-life values for|
|x|A key on your keyboard|
|Ctrl–x|Two keys pressed together|
|Esc q|Two keys pressed one after another|
|Submit|On-screen button to click|
|Menu → Item|Drop-down menu with items|
Examples of longer fragments of input are shown with a border round them. Where necessary, the formatted output is shown immediately beneath. Warnings are shown with a shaded background. Exercises are shown with a double border.
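To make the difference between a length and a counter a little more concrete, here is a small sketch using the two examples named in the table, \parskip and secnumdepth. The values 6pt and 2 are arbitrary assumptions chosen only for illustration.

```latex
\documentclass{article}
% \parskip is a length: it stores a dimension, so its value carries units.
\setlength{\parskip}{6pt}
% secnumdepth is a counter: it stores a whole number, with no units,
% and its name is written without a backslash.
\setcounter{secnumdepth}{2}
\begin{document}
\section{A numbered section}
First paragraph.

Second paragraph, separated from the first by the value of \verb|\parskip|.
\end{document}
```

In the article class, a secnumdepth of 2 means that sections and subsections are numbered but deeper levels are not.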
As noted in this Introduction, this document accompanies a two-day introductory training course. It became obvious from repeated questions in class and afterwards, as well as from general queries on comp.text.tex that many people do not read the FAQs, do not use the TUG web site, do not buy the books and manuals, do not use the newsgroups and mailing lists, and do not download the free documentation. Instead, they try to get by using the training technique known as ‘sitting by Nelly’, which involves looking over a colleague's shoulder in the office, lab, library, pub, or classroom, and absorbing all his or her bad habits.
In the summer of 2001 I presented a short proposal on the marketing of LATEX to the annual conference of the TEX Users Group held at the University of Delaware, and showed an example of a draft brochure designed to persuade newcomers to try LATEX for their typesetting requirements. As a result of questions and suggestions, it was obvious that it needed to include a pointer to some documentation, and I agreed to make available a revised form of this document, expanded to be used outside the classroom, and to include those topics on which I have had most questions from users over the years.
It turned out to mean a significant reworking of a lot of the material. Some of it appears in almost every other manual and book on LATEX but it is essential to the beginner and therefore bears repetition. Some of it appears in other forms elsewhere, and is included here because it needs explaining better. And some of it appears nowhere else but this document. I took the opportunity to revise the structure of the training course in parallel with the book (expanding it from its original one day to two days), and to include a more comprehensive index. It is by no means perfect (in both senses), and I would be grateful for comments and corrections to be sent to me at the address given under the credits.
I had originally hoped that the LATEX version of the document would be processable by any freshly-installed default LATEX system, but the need to include font samples which go well beyond the default installation, and to use some packages which the new user is unlikely to have installed, means that this document itself is not really a simple piece of LATEX, however simply it may describe the process itself.
However, as the careful reader will have already noticed, the master source of the document is not maintained in LATEX but in XML. A future task is therefore to compare the packages required with those installed by default, and flag portions of the document requiring additional features so that an abbreviated version can be generated which can be guaranteed to process even with a basic LATEX installation.
If you are just starting with LATEX, at an early opportunity you should buy or borrow a copy of LATEX: A Document Preparation System which is the original author's manual. More advanced users should get The LATEX Companion or one of its successors. In the same series there are also The LATEX Graphics Companion and The LATEX Web Companion. Mathematical users might want to read Short Math Guide for LATEX.
Many people discover LATEX after years of struggling with wordprocessors and desktop publishing systems, and are amazed to find that TEX has been around for over 25 years and they hadn't heard of it. It's not a conspiracy, just ‘a well-kept secret known only to a few million people’, as one anonymous user has put it.
Perhaps a key to why it has remained so popular is that it removes the need to fiddle with the formatting while you write. Although playing around with fonts and formatting is attractive to the newcomer, it is completely counter-productive for the serious author or editor who wants to concentrate on writing — ask any journalist or professional writer.
A few years ago a new LATEX user expressed concern on the comp.text.tex newsgroup about ‘learning to write in LATEX’. Some excellent advice was posted in response to this query, which I reproduce with permission below [the bold text is my emphasis]:
No, the harder part might be writing, period. TEX/LATEX is actually easy, once you relax and stop worrying about appearance as a be-all-and-end-all. Many people have become ‘Word Processing Junkies’ and no longer ‘write’ documents, they ‘draw’ them, almost at the same level as a pre-literate 3-year old child might pretend to ‘write’ a story, but is just creating a sequence of pictures with a pad of paper and box of Crayolas — this is perfectly normal and healthy in a 3-year old child who is being creative, but is of questionable usefulness for, say, a grad student writing a Master's or PhD thesis or a business person writing a white paper, etc. For this reason, I strongly recommend not using any sort of fancy GUI ‘crutch’. Use a plain vanilla text editor and treat it like an old-fashioned typewriter. Don't waste time playing with your mouse.
Note: I am not saying that you should have no concerns about the appearance of your document, just that you should write the document (completely) first and tweak the appearance later...not [spend time on] lots of random editing in the bulk of the document itself.
(11 March 2003), comp.text.tex
Learning to write well can be hard, but authors shouldn't have to make things even harder for themselves by using manually-driven systems which break their concentration every few seconds for some footling adjustment to the appearance, simply because the software is incapable of doing it right by itself.
Don Knuth originally wrote TEX to typeset mathematics for the second edition of his master-work The Art of Computer Programming, and it remains pretty much the only typesetting program to include fully-automated mathematical formatting done the way mathematicians want it. But he also published a booklet called Mathematical Writing which shows how important it is to think about what you write, and how the computer should be able to help, not hinder.
And TEX is much more than math: it's a programmable typesetting system which can be used for almost any formatting task, and LATEX has made it usable by almost anyone. Professor Knuth generously placed the entire system in the public domain, so for many years there was no publicity of the commercial kind which would have got TEX noticed outside the technical field.
Nowadays, however, there are many companies selling TEX software or services, dozens of publishers accepting LATEX documents for publication, and hundreds of thousands of users using LATEX for millions of documents.
To count yourself as a TEX or LATEX user, visit the TEX Users Group's ‘TEX Counter’ web site (and get a nice certificate!).
There is occasionally some confusion among newcomers between the two main programs, TEX and LATEX:
TEX is a typesetting program, originally written by Prof Knuth at Stanford around 1978. It implements a macro-driven typesetters' programming language of some 300 basic operations and it has formed the core of many other desktop publishing (DTP) systems. Although it is still possible to write in the raw TEX language, you need to study it in depth, and you need to be able to write macros (subprograms) to perform even the simplest of repetitive tasks.
LATEX is a user interface for TEX, designed by Leslie Lamport at Digital Equipment Corporation (DEC) in 1985 to automate all the common tasks of document preparation. It provides a simple way for authors and typesetters to use the power of TEX without having to learn the underlying language. LATEX is the recommended system for all users except professional typographic programmers and computer scientists who want to study the internals of TEX.
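A short sketch may help show what automating the common tasks means in practice. In the hypothetical document below, LATEX numbers the sections, builds the table of contents, and fills in the cross-reference by itself. The section titles and the label name sec:detail are invented for this example.

```latex
\documentclass{article}
\begin{document}
\tableofcontents          % built automatically from the headings below

\section{Overview}
The details are in section~\ref{sec:detail}.  % number filled in by LaTeX

\section{Details}
\label{sec:detail}        % a name of our own choosing for this section
No section numbers were typed by hand anywhere in this file.
\end{document}
```

Processing the file twice lets LATEX pick up the contents entries and the cross-reference recorded during the first run.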
Both TEX and LATEX have been constantly updated since their inception. Knuth has now frozen development of the TEX engine so that users and developers can have a virtually bug-free, rock-stable platform to work with. Typographic programming development continues with the New Typesetting System (NTS), planned as a successor to TEX. The LATEX3 project has taken over development of LATEX, and the current version is LATEX 2ε, which is what we are concentrating on here. Details of all developments can be had from the TUG at http://www.tug.org
Debunking the mythology
Naturally, over all the years, a few myths have grown up around LATEX, often propagated by people who should know better. So, just to clear up any potential misunderstandings...
MYTH: ‘LATEX has only got one font’
Most LATEX systems can use any OpenType, TrueType, Adobe (PostScript) Type1 or Type3, or METAFONT font. This is more than most other known typesetting systems. LATEX's default font is Computer Modern (based on Monotype Series 8: see the table in section 8.2), not Times Roman, and some people get upset because it ‘looks different’ to Times. Typefaces differ: that's what they're for. Get used to it.
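As a hedged illustration of how little is needed to change typeface, the sketch below switches a document from Computer Modern to a Times-like face using the mathptmx package from the standard PSNFSS collection. It assumes that package is installed, as it is in most distributions, and it is only one of several ways to select Times.

```latex
\documentclass{article}
\usepackage{mathptmx}  % Times-like text and math fonts instead of Computer Modern
\begin{document}
This paragraph is set in a Times-like typeface rather than Computer Modern.
\end{document}
```

Removing the \usepackage line returns the document to the default font without any other change to the text.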
MYTH: ‘LATEX isn't WYSIWYG’
Simply not true. DVI and PDF preview is better WYSIWYG than any wordprocessor and most DTP systems. What people mean is that LATEX's typographic display is asynchronous with the edit window. This is only true for the default CLI implementations. See the first paragraph in section 3 of the Introduction for details of synchronous versions.
MYTH: ‘LATEX is obsolete’
Quite the opposite: it's under constant development, with new features being added almost weekly. Check the comp.text.tex newsgroup for messages about recent uploads to CTAN. It's arguably more up-to-date than most other systems: LATEX had the Euro (€) before anyone else, it had Inuktitut typesetting before the Inuit got their own territory in Canada, and it still produces better mathematics than anything else.
MYTH: ‘LATEX is a Unix system’
People are also heard saying: ‘LATEX is a Windows system’, ‘LATEX is a Mac system’, etc., etc. ad nauseam. TEX systems run on almost every computer in use, from some of the biggest supercomputers down to handhelds (PDAs like the Sharp Zaurus). That includes Windows and Linux PCs, Macs, and all other Unix systems. If you're using something TEX doesn't run on, it must be either incredibly new, incredibly old, or unbelievably obscure.
MYTH: ‘LATEX is ‘too difficult’’
This has been heard from physicists who can split atoms; from mathematicians who can explain why π exists; from business people who can read a balance sheet; from historians who can grasp Byzantine politics; from librarians who can understand LoC and MARC; and from linguists who can decode Linear ‘B’. It's nonsense: most people grasp LATEX in 20 minutes or so. It's not rocket science (or if it is, I know any number of unemployed rocket scientists who will teach it to you).
MYTH: ‘LATEX is ‘only for scientists and mathematicians’’
Untrue. Although it grew up in the mathematical and computer science fields, two of its biggest growth areas are in the humanities and business, especially since the rise of XML brought new demands for automated web-based typesetting.
Installing TEX and LATEX
This course is based on using one of the following distributions of TEX on the 2004 TEX Collection DVD or the 2003 TEX Live CD:
- teTEX for Linux and other Unix-like systems, including Mac OS X (Thomas Esser);
- proTEXt for Microsoft Windows (Thomas Feuerstack), based on Christian Schenk's MiKTEX;
- fpTEX for Microsoft Windows (Fabrice Popineau) from the 2003 TEX Live CD.
Many other implementations of TEX, such as Tom Kiffe's CMacTEX for the Apple Macintosh, can be downloaded from CTAN. LATEX is included with all modern distributions of TEX.
The TEX Collection CD is issued annually on behalf of most of the local TEX user groups around the world (see http://www.tug.org/lugs.html for addresses), and edited by Sebastian Rahtz, Karl Berry, Manfred Lotz, and the authors of the software mentioned above. These people give an enormous amount of their personal time and energy to building and distributing these systems, and they deserve the thanks and support of the user community for all they do.
There are many other distributions of LATEX both free and commercial, as described in this Introduction: they all process LATEX identically, but there are some differences in size, speed, packaging, and (in the case of commercial distributions) price, support, and extra software provided.
One final thing before we start: publicly-maintained software like TEX is updated faster than commercial software, so always check to see if there is a more recent version of the installation. See the item ‘Use the latest versions’ in section 1.4.3 for more details.
1.1 Editing and display
When you install LATEX you will have the opportunity to decide a) which plain-text editor[s] you want to use to create and maintain your documents; and b) which preview programs you want to use to see your typesetting. This isn't much use to you if you're unfamiliar with editors and previewers, so have a look at the table below, and maybe flip ahead to section 2.3 for a moment, where there are descriptions and screenshots.
The best bet is probably to install more than one — if you've got the disk space — or maybe all of them, because you can always delete the ones you don't like.
- There is a wide range of editors available: probably no other piece of software causes more flame-wars in Internet and other discussions than your choice of editor. It's a highly personal choice, so feel free to pick the one you like. My personal biases are probably revealed below, so feel equally free to ignore them.
- For displaying your typesetting before printing, you will need a previewer. All systems come with a DVI previewer for standard LATEX, but if you are intending to produce industry-standard PostScript or PDF (Adobe Acrobat) files you will need a previewer for those formats. GSview displays both PostScript and PDF files; xpdf and Adobe's own Acrobat Reader just display PDF files.
For brief details of some of the most popular editors used for LATEX, see section 2.3.
For licensing reasons, the GSview PostScript/PDF previewer, the Acrobat Reader PDF previewer, and the WinEdt editor could not be distributed on the 2003 CDs. In those cases you have to download and install them separately.
- GSview is available for all platforms from http://www.ghostscript.com/gsview/index.htm (on Unix and VMS systems it's also available as GhostView and gv: see http://www.cs.wisc.edu/~ghost/)
- Acrobat Reader (all platforms) can be downloaded from http://www.adobe.com/products/acrobat/readstep2.html
- WinEdt (Microsoft Windows only) comes from http://www.winedt.com
Due to extreme typesetting problems, the rest of this book is only available as a pdf with full graphics and special characters.