An American Editor

December 24, 2018

Indexes: Part 7 — Lessons Learned in Using DEXembed for the First Time

Editor’s note: This version of the post incorporates corrections made by the author to the Options and Advice sections.

Ælfwine Mischler

I recently created an embedded index in Word for a book that will be published as an ebook and in print. I chose to use DEXembed because colleagues advised that its syntax — a space between the curly brackets and the enclosed text — will work better when the text is converted to an ebook.

A quick explanation of an embedded index: For a print book, the index is written after the book has been designed, using a PDF file of the final pages and page numbers as locators. This is changing, and many publishers are now asking for embedded indexes. For an embedded index, the indexer uses something else as locators. Depending on the program used, this could be paragraph numbers, word numbers, or temporary bookmarks. After indexing, the program embeds the entries by inserting field codes that look like this: { XE “main entry:subentry” }. The index is then generated from the field codes so the page numbers are displayed. In an ebook, they may also be linked to the location in the text. If the book is designed as hardcover and paperback with different pagination, the embedded index entries will give the correct page numbers for each edition.
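As a rough illustration of the field-code text described above, the following Python sketch builds the string for one entry. The function name xe_field is hypothetical, and the result is only the visible text of the code: a real Word field is a special object inserted by Word or by a tool such as DEXembed, not a pair of typed braces. Straight quotes are used here for simplicity.

```python
def xe_field(main_entry, subentry=None):
    """Return the visible text of an XE field code for one index entry.

    Hypothetical helper for illustration only; it builds a plain string
    and does not create a real Word field.
    """
    # A colon separates the main entry from its subentry.
    text = main_entry if subentry is None else f"{main_entry}:{subentry}"
    # The spaces just inside the braces match the spacing the article mentions.
    return '{ XE "' + text + '" }'


print(xe_field("mosques", "architecture of"))
print(xe_field("mosques"))
```

The entry wording ("mosques", "architecture of") is invented for the example.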

Embedded indexes are more work for the indexers, so most of us will charge more for an embedded index.

Options in DEXembed

DEXembed (available from the Editorium) is a Word add-on that allows the indexer to use dedicated indexing software rather than Word’s clunky built-in indexing function. DEXembed can use paragraphs, words, or numbers as locators — but only one type in a given document. Paragraph number was the best choice for this project, but the author had sometimes used auto-spacing and other times had used Enter twice between paragraphs. I told him repeatedly that he had to remove the extra Enters and make the spacing between paragraphs consistent (which he did) and that he could not change the paragraphing after I had started indexing. (More on that in the second part of this article in February 2019.)

Experienced colleagues in the Digital Publications Indexing Special Interest Group (DPI SIG) say that Word does not handle ranges of locators well. It is therefore better to mark only the beginnings of entries that are less than two pages long. DEXembed offers three options for ranges: Mark them with bookmarks, mark them with beginning and end codes, or do not mark them. The documentation for DEXembed says that publishers usually prefer begin and end codes.

Before starting my index, I sent two small sample indexes to my author’s publisher — one using bookmarks and one using begin and end codes — and asked which worked better for them. They got better results with the bookmarks, which also meant one less step for me in the end. Hurray!

I Won’t Talk to You

DPI SIG members also advised me that Word and InDesign use different syntax for some things, and I had to take this into consideration while indexing. I also found that my Sky indexing software and Word do not always communicate well.

This index required a separate scripture index of Qur’an verses. In Word, you can use an f-switch, coded as \f followed by a name, to make two indexes at once: { XE “heading1” \f “subject” } and { XE “heading1” \f “quran” }. (See Seth A. Maislin’s blog for more.) However, my colleagues advised that InDesign will reject XE fields with a backslash.

A suggested solution that I followed was to use two levels of subentries, with the main entries for the two indexes. That is, I had only two main entries, for which I used bold text, and my first level of subentry was the real main entry I wanted. The sub-subentry was the real subentry I wanted. The designers can adjust the indentation and spacing to make these appear as two separate indexes.
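The restructuring can be pictured as pushing every entry down one level beneath the name of its index. Here is a minimal Python sketch of that idea; the function name and the two index names are hypothetical, and the sample entries are invented.

```python
def nest_under_index(index_name, main_entry, subentry=None):
    """Demote an entry one level so the index name becomes the main entry."""
    levels = [index_name, main_entry]
    if subentry is not None:
        levels.append(subentry)
    # Word separates heading levels with a colon.
    return ":".join(levels)


# Hypothetical names for the two "main entries" that stand in for two indexes.
print(nest_under_index("Subject Index", "prayer", "times of"))
print(nest_under_index("Index of Quran Verses", "Chapter 2"))
```

Each real main entry becomes a first-level subentry, and each real subentry becomes a sub-subentry, so no \f switch is needed.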

The chapter and verse numbers presented two other problems of their own. How to write something like 2:10? First, Word signals heading levels with a colon, so I had to use a backslash before the colon to tell Word that this was a literal colon, not a subheading signal. I admit that at that point, I had forgotten the warnings of my colleagues that InDesign would reject these entries.

As of this writing, I am waiting for the author’s comments and corrections, and the results of a small test index for the publisher: three entries using a backslash and colon, and three using a plus sign to be replaced by a colon in the generated index. If I do indeed have to remove \: from the index, I want to be sure that + is not a signal for something else in InDesign.
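The two ways of writing a reference such as 2:10 can be sketched in Python as follows. The function names are hypothetical, and the sketch says nothing about how Word or InDesign will actually treat the results, which is exactly what the test index is meant to find out.

```python
def escape_colons(text):
    # Put a backslash before each literal colon so that it is not read
    # as a heading-level separator.
    return text.replace(":", "\\:")


def to_placeholder(text, placeholder="+"):
    # Alternative: swap the colon for a placeholder character before embedding.
    return text.replace(":", placeholder)


def restore_colons(generated_text, placeholder="+"):
    # After the index is generated, turn the placeholder back into a colon.
    # Caution: this would also change any genuine plus sign in the text.
    return generated_text.replace(placeholder, ":")


print(escape_colons("2:10"))
print(to_placeholder("2:10"))
print(restore_colons("2+10"))
```

The placeholder route only works if the chosen character appears nowhere else in the index.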

A second problem in writing chapter and verse numbers was the sorting. I knew that in Sky, I had to enter one- and two-digit chapter numbers with preceding zeros so they would sort properly. Thus, Chapter 2 was entered as 002 and Chapter 16 as 016. The verse numbers following the colons, however, sorted properly in Sky without additional zeros.

Word was not happy with that, but I could only learn that at the end. I finished my index, embedded the entries, generated the index, and then found that Word had mis-sorted the verses so that, for example, 18:70 came before 18:7. I had to open Sky, add the zeros to the verses, re-embed the entries, generate the new index, and remove the extra zeros from the generated index.
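The padding workaround can be sketched in a few lines of Python. This is an illustration of the idea only, with hypothetical function names; it assumes three digits for both chapter and verse, and it uses plain string sorting rather than reproducing Word's or Sky's sort rules.

```python
def pad_reference(ref, width=3):
    """Zero-pad chapter and verse so that text sorting matches numeric order."""
    chapter, verse = ref.split(":")
    return f"{int(chapter):0{width}d}:{int(verse):0{width}d}"


def unpad_reference(ref):
    """Strip the extra zeros again for display in the generated index."""
    chapter, verse = ref.split(":")
    return f"{int(chapter)}:{int(verse)}"


refs = ["18:70", "2:10", "18:7", "16:3"]
padded = sorted(pad_reference(r) for r in refs)
print(padded)
# ['002:010', '016:003', '018:007', '018:070']
print([unpad_reference(r) for r in padded])
# ['2:10', '16:3', '18:7', '18:70']
```

Padding before sorting and stripping afterward mirrors the two manual passes described above.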

Maybe I’ll Talk a Little Bit

Another difference between Sky and Word is how they handle text to be ignored in sorting. Sky’s sorting automatically ignores prepositions at the beginning of subentries, but Word’s does not. Sky also allows the indexer to code other things to be ignored in sorting. I commonly do this with the al- that begins many Arabic names.

For the embedded index, I had to enclose items to be ignored in angle brackets, but then in Sky, they all sorted to the top because they started with symbols. I was not sure that Word would put <al->Bukhari, <al->Ghazali, <al->Tabari, etc., in the proper places in the generated index. On this, I did have success, but I had to go back to the few subentries that begin with prepositions and enclose the prepositions in angle brackets.
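The effect of the angle brackets can be imitated with a sort key that drops the bracketed text. The following Python sketch shows the general idea only; the function names are hypothetical, and it does not reproduce the actual sorting code of Word, Sky, or DEXembed.

```python
import re

# Anything between < and > is printed but ignored for sorting.
IGNORED = re.compile(r"<[^>]*>")


def sort_key(entry):
    """Sort on the entry with the bracketed text removed."""
    return IGNORED.sub("", entry).casefold()


def display_form(entry):
    """Show the entry with the brackets removed but their contents kept."""
    return entry.replace("<", "").replace(">", "")


entries = ["<al->Tabari", "Cairo", "<al->Ghazali", "<al->Bukhari"]
ordered = sorted(entries, key=sort_key)
print([display_form(e) for e in ordered])
# ['al-Bukhari', 'Cairo', 'al-Ghazali', 'al-Tabari']
```

The same key would handle a subentry that begins with a bracketed preposition.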

DEXembed uses a text file to embed the entries, and all the bold and italics are lost in the process, although their coding remains. Once the entries were embedded, I had to edit the XE fields to get the bold and italic formatting back. (See Sue Klefstad’s blog post for details.) This was not difficult with a Find and Replace using wildcards (but be sure to turn off Tracked Changes!), but it was an extra step to perform.

Advice for Embedded Indexing

It is important to communicate with the author and publisher before beginning an embedded index. Learn how the Word manuscript will be handled after indexing and how it will be published. (There is more information on the resources page of the DPI SIG website.)

Once you have written your index in your dedicated indexing software, always embed in a copy of the document. Always keep the original “clean” and do not embed in it. Sometimes Word does not embed the entries properly and you might have to try again. DEXembed does have a function to remove embedded entries, but if Word gives you run-time errors as it did to me (see the second part of this article in February 2019), you will want to try again in a clean copy so there is no chance of stray coding in the file.

My thanks to colleagues Sue Klefstad and Seth A. Maislin for their invaluable blog posts, and to other colleagues in the DPI SIG for their advice in e-mail messages.

Ælfwine Mischler is an American copyeditor and indexer in Cairo, Egypt, who has been the head copyeditor at a large Islamic website and a senior editor for an EFL textbook publisher. She often edits and indexes books on Islamic studies, Middle East studies, and Egyptology.


September 17, 2018

Book Indexes — Part 4: The Metatopic

Ælfwine Mischler

A few years ago, I was asked to index a book about a medieval ruler and the mosque and city he built. The book was primarily an architectural history, but it included substantial information about the city and about the ruler’s childhood in central Asia and its influence on the mosque’s architecture.

But I was told that the names of both the ruler and the mosque, and the name of the city, were not to appear in the index.

I interpreted this to mean that those names were not to be main entries. There were entries on the other cities in the country discussed, so I put the forbidden city as a subentry under “cities,” and I made entries for “education of X” and “rise to power of X” even though I knew that they ought to be subentries under the name of the not-to-be-named ruler.

Being very much a newbie at the time, I asked for a volunteer to peer review my index. My reviewer rightly asked why I had not put main entries for the ruler and the city. When I told her that that was what the editor and author had requested, she suggested that I make a second version of the index with those items properly indexed and give the editor the choice. I did that, but the editor replied that they had decided on the first option. I later saw that in the published version they had also removed the education and rise-to-power entries, as well as the “cities” main entry, so that the “forbidden city” was nowhere to be found in the index, although the other two cities retained their main entries.

Why? I have never understood why the client did not want those items in the index when they were so obviously part of what the book was about.

Long-time indexers say that they were taught decades ago not to index the main topic of the book — what indexers now call the metatopic. Now, though, whenever we peer-review an index, the metatopic is the first thing we look for.

It has been found that when readers use an index, they usually look first for the metatopic that is apparent from the book title or subtitle. If the book is about aardvarks and readers do not find “aardvarks” in the index, they do not conclude that the index is bad; they conclude the book is bad, with nothing about aardvarks.

Obviously, you cannot put everything as subentries under the metatopic, or you would be indexing the whole book. A joke among indexers is of a graduate student who was asked to index his professor’s book. When it came to the metatopic, he started to add page numbers — 1, 2–3, 4, 5–7 — and then threw up his arms with “It’s on every page!”

But under the metatopic or metatopics (there can be more than one), an indexer can put subentries that cannot stand alone as main entries, such as a definition or other items that readers are unlikely to look for in the index, and then add See also cross-references to guide the reader to the entries for the main discussion. Every main entry in the index should relate to the metatopic(s) in some way.

Here are some of the subentries I put under the metatopic “Egyptology” and the See also cross-references in the index of a three-volume history of Egyptology. (This was a run-in index, which is reflected in the wording, but I am displaying it here as an indented index.)

Handling the metatopic(s) is not always easy, and indexers have different ways to approach the task. The metatopic(s) may be easy to identify from the title or subtitle, or by reading the introduction and conclusion — which indexers read before beginning the index. On the other hand, in a complex scholarly book, the metatopic may not be readily apparent. An indexer may formulate the metatopic as a sentence or short paragraph before deciding on a concise phrasing suitable for an index entry.

As a reader, do you look for the metatopic when you open an index for the first time? Are you disappointed if you do not find it? Have you noticed a difference in indexing styles between older and newer books?


July 16, 2018

Book Indexes — Part 3: The ABCs of Alphabetizing

Ælfwine Mischler

The alphabetizing I learned in school so many years ago — all before PCs and the Internet, of course — was easy. Go by the first letters — Bincoln, Fincoln, Lincoln, Mincoln — and if they’re all the same, look at the second, then the third, etc. — Lankin, Lanky, Lenkin, Lincoln, Linkin. I rarely had to alphabetize anything outside of school assignments (I did not organize my spices alphabetically), but I had to understand alphabetization to find a word in a dictionary, a name in a phone book, a card in a library catalog, or a folder in a file cabinet. Hunting for an organization or business whose name was just initials or began with initials was sometimes tricky, but I soon learned that if I did not find something interspersed with other entries, I could look at the beginning of that letter.

As an indexer, I have to know the conventions of alphabetizing so I can enter terms in the software program, and like so many other things in editorial work, there are different standards to follow. There are two main systems of alphabetizing — word-by-word and letter-by-letter — with some variations within each system. If you are writing an index or hiring an indexer, you have to know which system the publisher uses. Occasionally an indexer might find, in the midst of a project, that switching to the other system would be better, but this must be cleared with the publisher.

Word by Word

In the word-by-word system, generally used in indexes in Great Britain, alphabetizing proceeds up to the first space and then starts over. According to New Hart’s Rules, 2nd ed., hyphens are treated as spaces except where the first element is a prefix, not a word on its own (p. 384). However, the Chicago Manual of Style, 17th ed., treats hyphenated compounds as one word (sec. 16.60).

Letter by Letter

Most US publishers prefer the letter-by-letter system, in which alphabetizing continues up to the first parenthesis or comma, ignoring spaces, hyphens, and other punctuation.
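The difference between the two systems can be sketched as two sort keys in Python. This is a simplified illustration with hypothetical function names, not the algorithm of any style guide or indexing program: it handles unaccented letters and digits only, and in the word-by-word key it treats hyphenated compounds as one word, which is only one of the conventions mentioned above.

```python
import re

NOT_ALNUM = re.compile(r"[^0-9a-z]")


def word_by_word_key(entry):
    """Compare one word at a time; only spaces separate words here."""
    return tuple(NOT_ALNUM.sub("", word) for word in entry.casefold().split())


def letter_by_letter_key(entry):
    """Ignore spaces and punctuation up to the first comma or parenthesis."""
    parts = re.split(r"[,(]", entry.casefold(), maxsplit=1)
    head = NOT_ALNUM.sub("", parts[0])
    rest = NOT_ALNUM.sub("", parts[1]) if len(parts) > 1 else ""
    return (head, rest)


entries = ["New York", "Newark", "New Haven", "Newton"]
print(sorted(entries, key=word_by_word_key))
# ['New Haven', 'New York', 'Newark', 'Newton']
print(sorted(entries, key=letter_by_letter_key))
# ['Newark', 'New Haven', 'Newton', 'New York']
```

In the word-by-word result, everything beginning with the word "New" files before "Newark"; in the letter-by-letter result, the space is ignored and "Newark" comes first.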

If you are writing your own index in a word processing program, it will use word-by-word sorting. Dedicated indexing software can use either system along with variations. The following table comparing these systems uses Microsoft Word and SKY Indexing Software with various settings. (The items in the table were chosen to demonstrate how the different systems handle spaces, hyphens, commas, and ampersands. Not all of them would appear in an index. The variations on Erie-Lackawanna, for example, would normally have another word, such as “Rail Road,” following them.)

 

Entries with Same First Word

In the first edition of New Hart’s Rules, names and terms beginning with the same word were ordered according to a hierarchy: people; places; subjects, concepts, and objects; titles of works. You may see this in older books, and it occasionally comes up in indexers’ discussions. However, the second edition of New Hart’s Rules recognizes that most people do not understand this hierarchy and that alphabetizing this way is more work for the indexer. The second edition (p. 385) recommends retaining the strict alphabetical order created by indexing software.

Numbers Following Names

Names and terms followed by numbers are not ordered strictly alphabetically. These could be rulers or popes, or numbered articles or laws, etc. An indexer with dedicated software can insert coding to force these to sort correctly. If you are writing your own index in a word processor, you will have to sort these manually.
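To see why strict alphabetical order fails here, compare a plain sort with one that reads the trailing number as a number. The Python sketch below is an illustration with hypothetical function names; it is not the coding that dedicated indexing software uses, and it would misread a name that merely ends in a capital letter such as X.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
TRAILING_NUMBER = re.compile(r"^(.*?)\s+([IVXLCDM]+|\d+)$")


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'IX' to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        following = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        total += -value if value < following else value
    return total


def numbered_name_key(entry):
    """Sort by name first, then by the numeric value of a trailing number."""
    match = TRAILING_NUMBER.match(entry)
    if not match:
        return (entry.casefold(), 0)
    name, number = match.groups()
    value = int(number) if number.isdigit() else roman_to_int(number)
    return (name.casefold(), value)


names = ["Henry V", "Henry IX", "Henry VIII", "Henry IV"]
print(sorted(names))
# ['Henry IV', 'Henry IX', 'Henry V', 'Henry VIII']
print(sorted(names, key=numbered_name_key))
# ['Henry IV', 'Henry V', 'Henry VIII', 'Henry IX']
```

The same key puts "Article 2" before "Article 10", which a plain text sort would reverse.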

When people of different statuses — saints, popes, rulers (perhaps of more than one country), nobles, commoners — share a name, these have to be sorted hierarchically. See New Hart’s Rules, 2nd ed., section 19.3.2, and Kate Mertes, “Classical and Medieval Names” in Indexing Names, edited by Noeline Bridge.

Numerals and Symbols at the Beginning of Entries

Entries that begin with numerals or symbols may be sorted at the top of the index, before the alphabetical sequence. This arrangement is preferred by the International and British Standard, and is also preferred when there are many such entries in a work. Alternatively, they may be interspersed in alphabetical order as if the numeral or symbol were spelled out, and they may also be double-posted if they appear at the top of the index.

However, in chemical compounds beginning with a prefix, Greek letter, or numeral, the prefix, Greek letter, or numeral is ignored in the sorting.

Greek letters prefixing chemical terms, star names, etc., are customarily spelled out, without a hyphen (New Hart’s Rules, 2nd ed., p. 389).

If you are writing your own index in a word processing program, you will have to manually sort entries with Greek letters or prefixes to be ignored, and entries beginning with numerals if you do not want them sorted at the top. Dedicated indexing programs can be coded to print but ignore items in sorting, or to sort numerals as if they were spelled out.

That’s Not All, Folks

This is just the beginning of alphabetizing issues that indexers face. While most of the actual alphabetizing is done by the software, indexers have to know many conventions regarding whether names are inverted; how particles in names are handled; how Saint, St., Ste. and Mac and Mc in surnames are alphabetized (styles vary on those); how to enter names of organizations, places, and geographical features. In addition to checking the books mentioned above, you can learn more about indexing best practices and indexing standards on the American Society for Indexing website and from the National Information Standards Organization.


June 18, 2018

Book Indexes — Part 2: No Magic Wands

Ælfwine Mischler

I took up indexing several years ago when I wanted to branch out from copyediting. I have found indexing to be more intellectually challenging and, thus, a welcome change from copyediting. I do both as a freelancer, though never on the same book, and enjoy the variety.

Most indexers describe what they do as mapping a book — and it is mapping — but I think of it as looking at the book from a different angle. Think of forest and trees. When I am copyediting, it is like creeping along the forest floor, looking at not just every tree but at every detail. (I have seen that name spelled two different ways; which is correct? Does that comma belong here? This verb does not match the subject, but what is the subject in this twisted sentence? Is there a better word for that?) But when I am indexing, it is like flying over the treetops, seeing a bigger picture. (Here is a section on topic X. Over there, the topic is raised again. And this topic here is related to X. There is a lot of information about this person. How should I break it up and organize it?)

Indexing is a creative process. It is said that no two indexers would produce the same index of a given book. I have software to help me organize what I put into an index, but I am the one who decides what to include and what words to use. Just as you do not open a word processing program and expect it to write a document for you, I do not open my indexing program and expect it to write an index for me. Many people seem to think that I plug the manuscript into some software and out pops the index. (There are some programs that claim to do just that, but indexers in my circles say they cannot rely on them to produce a good index.)

No, folks, writing an index is not that easy. I actually read the book, cover to cover. I sometimes wish I had a magic wand that could do it for me — “Indexify!” — but I have to read everything.

“So do you read a page and put in all the A words, then all the B words, then all the C words?” asked a friend.

“No, I put in the words and the software alphabetizes them.”

She still seemed a bit stumped.

“Do you read the whole book first?” asked a nephew.

“No, there is not enough time to do that. I have to index from the start.”

Working from a PDF file of a book’s second proofs (usually), I read the foreword, preface, and introduction to get an idea of the importance of the book, the topics covered, and the book’s organization. From the table of contents, I often index the chapter titles and section headings to form the basic structure of the index. Each chapter title becomes a main entry, and the section headings form subentries. I will then break out most of those subentries to form their own main entries as well. (See Part 1 of this series.)

I often have to change the chapter titles or section headings to make them suitable for index entries. If the book does not have section headings, I have the more-difficult task of skimming the text for verbal clues to a change of topic.

Then I go back over the chapters and pick up more details within each section. If the entry has a long page range, I look for some logical way to break it down into smaller ranges; that is, create subentries. Also, if a particular name or concept has many different locators, I look for some way to break them into subentries. I also look for related concepts and write see also cross-references.

What to call a given entry is not always obvious. If nothing comes to me quickly, I use tools within the software — color coding to remind myself to come back to it later, and hidden text with a few words about the topic. Often after reading a few more pages, the answer comes to me.

One of the things that makes indexing so mentally challenging is that I have to keep so many things in my head at one time. If I indexed concept Z as term Z′, I have to continue to keep an eye open for Z throughout the book and remember to call it Z′ and not something else — all the while doing this for concepts A, B, C, etc. My indexing software can help me to use Z′ and not something else, but it cannot help me to remember to pick it out from the book. If I later realize that I have missed some cases of Z, I can attempt to search for a word in the PDF file to find it, but in most cases, there is no exact word or phrase that will take me to Z. The words in an index are often not found in the book, which is another reason why automatic computer indexing cannot produce a good index.

Names often present challenges to me and other indexers. In school years ago, I learned to look for names in an index under the surname — Abraham Lincoln under Lincoln — but not all cultures invert names, and parts of names such as de, von, la, Abu, and Ibn can be problematic. Medieval names and names of nobility and royalty have their own conventions. The first book I indexed for hire contained the whole range of problems: ancient Egyptian, ancient Greek and Roman, medieval, and royal names; pre-modern and modern Arabic names (which follow different conventions); European names with particles; nobility titles (from various countries, no less!); and saints, too!

Fortunately, I had a very understanding managing editor who knew this was my first paid index and was willing to help me with the difficult names. Not all indexers are so fortunate in their clients. (For more information about the complexities of indexing names, see Indexing Names, edited by Noeline Bridge, and occasional articles in The Indexer.)

What did I have to learn in my indexing course? In addition to conventions about names, there are conventions for wording entries (for example, use plural nouns, don’t use adjectives alone, use prepositions or conjunctions at the beginning of subentries in run-in style), different ways to alphabetize (handled by the software options), and guidelines for whether to index a given item — a topic for another day. The course I took from the University of California at Berkeley Extension also required us to sample the three major indexing software programs — Macrex, Cindex, and Sky — which all do the same things but are different in their interfaces. Online courses are also available from the American Society for Indexing and the Society of Indexers.

Now I leave you so I can sail over the trees of another book.


