An American Editor

August 22, 2016

Lyonizing Word: Before Typesetting

by Jack Lyon

I need your help, Gentle Reader. I need your ideas. Back in 1996, when I started selling Microsoft Word add-ins at the Editorium, getting a Word document into QuarkXPress was tricky: Quark was prone to crashes and didn’t handle footnotes at all. To solve these problems, I created QuarkConverter and NoteStripper. A few years later, when people started switching to InDesign, I created InDesignConverter.

In the past several years, however, both QuarkXPress and InDesign have become much better at importing Word documents directly, without the need for a converter. The crashes are mostly gone, and footnotes come right on in. Nevertheless, I’m wondering what else might be done to a Word document to save time and trouble when importing into a layout program — and I’d greatly appreciate your thoughts about that. Here are some examples of the kind of thing I have in mind:

  • Add nonbreaking spaces to dates and initials.

For example, if the text includes a date like “August 17, 2016,” most typesetters want “August” and “17” to stay together; adding a nonbreaking space between the two elements does the trick. Similarly, if a name like “C. S. Lewis” shows up, it’s nice to keep the “C.” and the “S.” together. (To add a nonbreaking space in Word [Windows] 2007 and newer, hold down the CTRL and SHIFT keys as you press the spacebar. For Word [Mac], press the Option key as you press the spacebar.)
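
For anyone who would rather script this than type it, here is a minimal Python sketch of the idea, working on plain text rather than on a Word file. The function name, the month list, and the two regular expressions are illustrative assumptions and are not taken from any Editorium tool. It only handles full English month names followed by a day, and runs of single-letter initials.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# A month name, an ordinary space, then a one- or two-digit day.
# The lookahead leaves the day itself untouched and skips years ("May 2016").
DATE_RE = re.compile(rf"\b({MONTHS}) (?=\d{{1,2}}\b)")

# A single capital-letter initial with a period, an ordinary space,
# then another initial with a period.
INITIALS_RE = re.compile(r"\b([A-Z]\.) (?=[A-Z]\.)")


def add_nonbreaking_spaces(text: str) -> str:
    """Swap the ordinary space for a nonbreaking one in dates and initials."""
    text = DATE_RE.sub(rf"\1{NBSP}", text)
    text = INITIALS_RE.sub(rf"\1{NBSP}", text)
    return text


if __name__ == "__main__":
    sample = "C. S. Lewis was mentioned on August 17, 2016."
    # repr() makes the nonbreaking spaces visible as \xa0.
    print(repr(add_nonbreaking_spaces(sample)))
```

The space between the last initial and the surname is left alone here; whether that one should also be nonbreaking is a house-style decision.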

  • Remove formatting “overrides.”

Typesetters typically want to handle formatting with styles, so that changing a style attribute in InDesign automatically changes formatting throughout the document. If an author or editor has applied styles in a Word document, those styles can be imported and used in InDesign. But if an author or editor has applied direct formatting using various fonts, that formatting will be imported as “overrides” on the text, which can be a bit of a pain to clean up.

Override Options

In its Styles pane, Microsoft Word offers to “Clear All” formatting and styles from selected text.

Clear All Option

The problem is, “Clear All” really does mean “Clear All,” including not just font overrides but also such local formatting as bold and italic, which needs to remain intact. InDesign’s “Clear Overrides” feature has the same problem. Do you really want to remove italic formatting from the hundreds of journal titles in that giant manuscript you’re editing? If you’re proofreading or setting type, do you really want to put all that formatting back in again by hand? My FileCleaner add-in includes an often-overlooked feature (“standardize font formats”) that removes font overrides but leaves bold, italic, and other local formatting intact, which is exactly what’s needed.

Standardize Font Formats Option

  • Turn straight quotation marks into curly ones.

InDesign can do this—sort of. But it can’t handle things like “’Twas the night before Christmas” or “A miner, ’49er” (dreadful sorry, Clementine). FileCleaner does a much better job of dealing with this; it properly handles ’til, ’tis, ’tisn’t, ’twas, ’twasn’t, ’twould, ’twouldn’t, and ’em, as well as single quotation marks in front of numbers, all of which then come into InDesign correctly. If you have other items that should be included in this list, I’d love to know what they are.
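
To show why the word list matters, here is a rough Python sketch of one way to curl quotation marks in plain text. It is not how FileCleaner or InDesign actually does it; the function name, the rule order, and the regular expressions are my own assumptions. The word list is the one given above, and the sketch treats every apostrophe in front of a digit as an apostrophe, which will be wrong for a quoted number such as '42'.

```python
import re

LSQUO, RSQUO = "\u2018", "\u2019"  # single curly quotes
LDQUO, RDQUO = "\u201c", "\u201d"  # double curly quotes

# Words that begin with an apostrophe (the list from the paragraph above).
ELIDED_WORDS = ["til", "tis", "tisn't", "twas", "twasn't",
                "twould", "twouldn't", "em"]

_ELIDED_RE = re.compile(
    r"'(?=(?:" + "|".join(re.escape(w) for w in ELIDED_WORDS) + r")\b)",
    re.IGNORECASE,
)
_BEFORE_DIGIT_RE = re.compile(r"'(?=\d)")        # '49er
_INSIDE_WORD_RE = re.compile(r"(?<=\w)'(?=\w)")  # don't, tisn't
_OPEN_DOUBLE_RE = re.compile(r'(^|[\s(\[{])"')
_OPEN_SINGLE_RE = re.compile(r"(^|[\s(\[{" + LDQUO + r"])'")


def curl_quotes(text: str) -> str:
    """Turn straight quotation marks into curly ones."""
    # 1. Apostrophes that must never become opening quotes.
    text = _ELIDED_RE.sub(RSQUO, text)
    text = _BEFORE_DIGIT_RE.sub(RSQUO, text)
    text = _INSIDE_WORD_RE.sub(RSQUO, text)
    # 2. Double quotes: opening at the start or after whitespace or an
    #    opening bracket; anything left over is closing.
    text = _OPEN_DOUBLE_RE.sub(lambda m: m.group(1) + LDQUO, text)
    text = text.replace('"', RDQUO)
    # 3. Single quotes, by the same rule (also opening after an opening
    #    double quote).
    text = _OPEN_SINGLE_RE.sub(lambda m: m.group(1) + LSQUO, text)
    return text.replace("'", RSQUO)


if __name__ == "__main__":
    print(curl_quotes('"\'Twas the night before Christmas"'))
    print(curl_quotes('"A miner, \'49er"'))
```

New entries can be added to ELIDED_WORDS without touching the rest of the code, which is why a longer list of such words is worth collecting.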

  • Remove multiple spaces between sentences.

In the 1800s many books were set with extra space between sentences.

Sample of 1800s Typeset Page

But, frankly, the 1800s were not exactly the golden age of typesetting.

1800s Poster

Modern books include just one space between sentences. Still, many authors continue to use two, following the instructions they were given by their high-school typing teacher back in the twentieth century. And that means the double spaces need to be removed at some point. InDesign has built-in find-and-replace routines that will fix this and a few similar items.

InDesign Find & Replace
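
The double-space cleanup itself is simple enough to sketch in a few lines of Python. This is only an illustration on plain text, with a function name of my own choosing. It assumes that only runs of ordinary spaces should be collapsed, so nonbreaking spaces, tabs, and line breaks are left alone.

```python
import re

# Two or more ordinary spaces in a row.
MULTI_SPACE_RE = re.compile(r" {2,}")


def collapse_spaces(text: str) -> str:
    """Replace every run of two or more ordinary spaces with one space."""
    return MULTI_SPACE_RE.sub(" ", text)


if __name__ == "__main__":
    print(collapse_spaces("One sentence.  Another sentence.   A third."))
```

In Word's own Find and Replace, with wildcards turned on, a comparable search would be a space followed by {2,}, replaced with a single space.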

FileCleaner, however, fixes many such things. And the version that’s included with Editor’s ToolKit Plus 2014 fixes many more.

FileCleaner Options

  • Change italic and bold formatting to character styles.

Using character styles in InDesign provides much more stability and flexibility than local bold and italic formatting. It would be nice to have these styles already applied in Word before the document is imported into InDesign. My tools don’t currently do this, but they probably should.

QuarkConverter and InDesignConverter include some other useful fixes.

Quark Converter Options

InDesign Converter Options

Nevertheless, I can’t help thinking that there must be things I’ve overlooked. I’m an editor, not a typesetter, so I don’t really know all of the things that typesetters have to fix that they really shouldn’t have to deal with. (This probably includes the most common errors that proofreaders mark.) So if you do typesetting or proofreading, would you help me out? I’d really like to know what I’m missing — things that could be cleaned up in an automated way in Microsoft Word before a document is ever imported into InDesign. What problems do you routinely encounter that you wish would go away? If you’ll let me know, I’ll try to come up with an add-in designed specifically to fix such things. Your suggestions for this would be most welcome.

Of course, typesetters and proofreaders aren’t the only ones who can benefit from this kind of cleanup. It’s also valuable to editors, allowing them to focus on words, structure, and meaning rather than deal with these tiny but pervasive problems. Little things like double spaces and straight quotation marks may not seem all that bothersome, but like pebbles in your shoe, they create subliminal annoyance that really adds up, making editing much more difficult than it should be. At least that’s my experience. What do you think?

Jack Lyon (editor@editorium.com) owns and operates the Editorium, which provides macros and information to help editors and publishers do mundane tasks quickly and efficiently. He is the author of Microsoft Word for Publishing Professionals, Wildcard Cookbook for Microsoft Word, and Macro Cookbook for Microsoft Word. These books will help you learn more about macros and how to use them.

6 Comments

  1. Jack, it would save more time if the split doc function exported as .docx rather than .doc. Thank you for a great and versatile set of tools.

    Comment by Bonnie Britt — August 24, 2016 @ 4:50 pm | Reply

  2. Jack, maybe you will think this through to determine what macros Editorium can offer given the following scenario.
    Some indy designers are not interested in renting software. We own version 6 of Adobe’s Creative Suite and that’s where we plan to stay. We are happy with that, especially for typesetting print books. However, after the galleys come the revisions, which are done in InDesign. Those files are then exported for printing via an Acrobat file. And then it becomes time to convert to ePub. One way to do this is to make revisions twice (once in Word and once in InDesign). Another way is to revise in InDesign and then export (or “save as .docx”) the book file from Acrobat to Word and then fix most of the production issues discussed in the following article.
    What Agents Should Know About Ebooks Made from PDFs
    By: Ben Denckla | August 24, 2016
    http://www.digitalbookworld.com/2016/agents-know-ebooks-made-pdfs/

    CS6 designers want easier ways to fashion ePubs from Acrobat files saved to Word’s .docx without having to revise twice. For that we need perfectly styled Word .docx files. Editorium 2014 gets us part of the way there, including splitting the chapters. (The latter currently have to be handled twice in that they are .doc files that need to be saved as .docx.)
    Of Denckla’s list, the contents table is least important as that is easily done. Handling images is a different story and outside the realm of this request.
    Fixing the other items is labor-intensive. They are:
    1. Distinguishing hard vs. soft line breaks;
    2. Distinguishing hard vs. soft hyphens;
    3. Handling footnotes and endnotes;
    4. Rejecting headers & footers.

    Thanks for giving this some thought, Jack, especially #s 1 and 2.

    Comment by Bonnie Britt — August 24, 2016 @ 9:33 pm | Reply

    • Wow, Bonnie, thanks for the detailed analysis. This is incredibly helpful. I’ll definitely give this some thought, and I’ll probably have some questions for you as I do that. I hope that’s okay. 🙂

      I really hope others will weigh in on this as well.

      Comment by Jack Lyon — August 25, 2016 @ 7:19 pm | Reply

  3. Here are some of the (more or less common) words I find that start with apostrophes, other than those you mentioned:

    ‘er (for her)
    ‘im (for him)
    ‘cause (for because)
    ‘bout (for about)
    ‘round (for around)
    ‘n’ (for and)
    ‘twere and ‘tweren’t
    ‘twill and ‘twon’t
    ‘tain’t
    ‘riting and ‘rithmetic
    ‘fore (for before)

    (Naturally all turned the wrong way because I cut and pasted from Word…)

    Comment by Eliza Dee — August 31, 2016 @ 12:49 pm | Reply

