An American Editor

May 27, 2015

Lyonizing Word: We Can Do This the Easy Way, or . . .

We Can Do This the Easy Way,
or We Can Do This the Hard Way

by Jack Lyon

American Editor Rich Adin called me recently with a puzzle. He was editing a list of citations that looked like this:

Lyon J, Adin R, Poole L, Brenner E, et al: blah blah blah.

But his client wanted the citations to look like this:

Lyon J, Adin R, Poole L, et al: blah blah blah.

In other words, many of the citations included one author name too many; the client wanted a limit of three rather than four. And there were hundreds of citations. Rich really didn’t want to remove the superfluous names by hand; it would have taken hours to do, and hours are money. And so, Rich queried, “Is there a way to remove the fourth name automatically?”

There’s nearly always a way. Rich had already tried using a wildcard search, but without success. Microsoft Word kept telling him, “The Find What pattern contains a Pattern Match expression which is too complex.”

The Too-Complex Find What

I’m not sure what wildcard search Rich tried to use, but it might have looked like this:

Find what:

([A-Z][a-z]@ [A-Z], )([A-Z][a-z]@ [A-Z], )([A-Z][a-z]@ [A-Z], )([A-Z][a-z]@ [A-Z], )(et al:)

Replace with:

\1\2\3\5

That’s definitely too complex for Word to handle. Here’s what it means:

Find a capital letter ([A-Z])
followed by a lowercase letter ([a-z])
repeated one or more times (@)
followed by a space
followed by a capital letter ([A-Z])
followed by a comma
followed by a space
with all of that in parentheses to form a “group.”

All of that is repeated three more times, then followed by “et al:” in parentheses to form a group.

The “Replace with” string tells Word to replace what it finds with the contents of groups 1, 2, 3, and 5 — in other words, with the first three names followed by “et al:”.

What’s the Handle?

If Word could handle it, that should work. But Word can’t handle it, so we’ll need to simplify. So we ask ourselves, “What, besides letters, do all of the names have in common?” In other words, “What’s the handle? What can we grab onto?” Well, that’s easy — each name is followed by a comma and a space. That’s our handle!

(For more on this, please see my article “What’s Your Handle?” (2003) at the Editorium Update.)

The Find That Works

The handle means we can simplify our wildcard search string to something like this:

Find what:

([!^013]@, [!^013]@, [!^013]@, )[!^013]@, (et al:)

Replace with:

\1\2

Here’s what that means:

Find any characters except a carriage return ([!^013])
repeated one or more times (@)
followed by a comma
followed by a space
with all of that repeated three times
and enclosed in parentheses to form a “group.”
Then it’s repeated one more time, ungrouped
and followed by “et al:” in parentheses to form a group.

The “Replace with” string tells Word to replace what it finds with the contents of groups 1 and 2 — in other words, with the first three names (group 1) followed by “et al:” (group 2). The fourth name is simply ignored.
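If you would like to try the same idea outside Word, here is a rough Python analogue. Treat it as a sketch rather than a translation: Python's re syntax is not Word's wildcard syntax, so [!^013]@ is approximated as [^\r\n]+? (anything but a line break, one or more times), and the name trim_fourth_author is made up for this illustration.

```python
import re

# Approximation of the Word search: "anything but a line break, one or more
# times, then comma-space" three times inside group 1, once more ungrouped,
# then "et al:" as group 2.
FOUR_NAMES = re.compile(r"([^\r\n]+?, [^\r\n]+?, [^\r\n]+?, )[^\r\n]+?, (et al:)")


def trim_fourth_author(text: str) -> str:
    """Keep the first three names (group 1) and "et al:" (group 2)."""
    return FOUR_NAMES.sub(r"\1\2", text)


before = "Lyon J, Adin R, Poole L, Brenner E, et al: blah blah blah."
after = trim_fourth_author(before)
print(after)
```

A citation that already has only three names contains too few comma-space pairs before "et al:" to match, so it is left alone.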

To Group or Not to Group Using Parens

Rich ran the new find and replace, then replied, “Thanks, Jack, that works like a charm. Why isn’t the second ‘group’ grouped, that is, in parentheses? I thought that was necessary.”

I replied, “No, it’s not necessary. You group only the items that you want to reference (by \1, \2, etc.) in the ‘Replace with’ box. You could group the other item, in which case you would use ‘\1\3’ in the ‘Replace with’ box. But there’s no need to do so.”

Note that this method of finding the names offers another advantage. Not only will it find names that look like this:

Lyon J,

it will also find names that look like this:

Lyon JM,

or even this:

Lyon JMQ,

It will even find names like this:

Thaler-Carter Ruth,

or this:

Harrison G.B.H.,

In fact, it will find anything (except a carriage return) followed by a comma and a space.

Why the Carriage Return?

“Why,” you may be wondering, “specify anything but a carriage return? Why not specify letters instead?” Well, we could have done that, using something like this:

Find what:

([A-z ]@, [A-z ]@, [A-z ]@, )[A-z ]@, (et al:)

Replace with:

\1\2

That means:

any capital or lowercase letter or space ([A-z ])
repeated one or more times (@)
followed by a comma
followed by a space
And so on.

Such a wildcard string would find names like this:

Lyon J,

but not this:

Thaler-Carter R,
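The same limitation is easy to see in a small Python sketch. This is an approximation, not Word itself: [A-Za-z ] stands in for [A-z ], +? stands in for @, and the sample citations are invented for the test.

```python
import re

# Letters and spaces only, mirroring the narrower search above.
LETTERS_ONLY = re.compile(
    r"([A-Za-z ]+?, [A-Za-z ]+?, [A-Za-z ]+?, )[A-Za-z ]+?, (et al:)"
)

plain = "Lyon J, Adin R, Poole L, Brenner E, et al: blah."
hyphenated = "Lyon J, Thaler-Carter R, Poole L, Brenner E, et al: blah."

plain_result = LETTERS_ONLY.sub(r"\1\2", plain)
hyphen_result = LETTERS_ONLY.sub(r"\1\2", hyphenated)

print(plain_result)   # fourth name removed
print(hyphen_result)  # unchanged: the hyphen breaks the run of letters
```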

Yes, we could add a hyphen to our string, but then we start to wonder about other characters we might need to include, and then things get complicated again. And besides, it’s true that we don’t want to include carriage returns in our search, so it makes sense to exclude them. If we tried to simplify too far, we might use this:

Find what:

(*, *, *, )*, (et al:)

Replace with:

\1\2

The problem with using the asterisk wildcard (*) is that it finds any character any number of times, including tabs, spaces, carriage returns, and everything else you can think of. Sometimes that’s useful, but more often it just leads to confusion. We want to keep things simple but not too simple.
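To see how the too-simple search can go wrong, here is a hedged Python sketch. It assumes that Word's * behaves roughly like .*? with the DOTALL flag (so it may cross line breaks) and compares it with the version that excludes line breaks. The two sample citations are invented.

```python
import re

# "*" is approximated by ".*?" with DOTALL, so each piece may span lines.
TOO_SIMPLE = re.compile(r"(.*?, .*?, .*?, ).*?, (et al:)", re.DOTALL)
# The safer search: no piece is allowed to contain a line break.
NO_BREAKS = re.compile(r"([^\r\n]+?, [^\r\n]+?, [^\r\n]+?, )[^\r\n]+?, (et al:)")

two_citations = (
    "Lyon J, Adin R, Poole L, et al: first title.\n"
    "Adin R, Lyon J, Poole L, Brenner E, et al: second title."
)

star_result = TOO_SIMPLE.sub(r"\1\2", two_citations)
careful_result = NO_BREAKS.sub(r"\1\2", two_citations)

print(star_result)     # the first title and the second citation's names are swallowed
print(careful_result)  # only the fourth name on the second line is removed
```

In this sketch the too-simple pattern starts matching in the first citation, which has only three names, and keeps going across the line break until it finally reaches an "et al:" it can use.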

Why Wildcard

To return to our original problem: Rich could have removed all those extra names one at a time, by hand, which is doing it the hard way and eats into the profit line — remember that time is money. Microsoft Word includes powerful tools for doing things the easy way, so why not learn them and use them? If you’ve read this far, you’re doing that, so congratulations.

If you’d like to learn more about how to use wildcard searches, you can download my free paper “Advanced Find and Replace in Microsoft Word.” Working through the paper requires some thought and effort, but the payoff is huge.

Coming Soon

I hope you’ll watch for my forthcoming Wildcard Cookbook for Microsoft Word. I’m still trying to find more real-life examples for the book, so if you have some particularly sticky problems that might be solved using a wildcard search, I hope you’ll send them my way. Maybe I can save you some work and at the same time figure out solutions that will help others in the future. Thanks for your help!

For EditTools Users

If you are a user of EditTools, you can manually create the find and replace strings in the Wildcard Find & Replace macro and then save the macro for future use. However, to do so you need to enter the Find string slightly differently:

Find Field #1: [!^013]@, [!^013]@, [!^013]@,
Find Field #2: [!^013]@,
Find Field #3: et al:

Note that you omit the parens for grouping because EditTools automatically inserts them, which means that you break the string into its group components. (IMPORTANT: Be sure to include in Find Fields 1 and 2 the ending space, i.e., the space following the final comma, which is not visible above.)

Because EditTools treats each of the three fields as a group, your Replace string is:

Replace Field #1: \1
Replace Field #2: \3

After manually entering the information in each of the fields, click Add to WFR Dataset and save this macro for future use. Next time you need it, just click Retrieve from WFR Dataset, retrieve this string, and run it. That is one of the advantages to using EditTools’ Wildcard Find & Replace — you can write a wildcard macro once and reuse it as many times as you need without having to recreate the macro each time.

Jack Lyon (editor@editorium.com) owns and operates the Editorium, which provides macros and information to help editors and publishers do mundane tasks quickly and efficiently. He is the author of Microsoft Word for Publishing Professionals and of Macro Cookbook for Microsoft Word. Both books will help you learn more about macros and how to use them.

Looking for a Deal?

You can buy EditTools in a package with PerfectIt and Editor’s Toolkit at a special savings of $78 off the price if bought individually. To purchase the package at the special deal price, click Editor’s Toolkit Ultimate.

 

May 15, 2013

The Only Thing We Have to Fear: Wildcard Macros

Whenever I talk to colleagues about macros, it is as if a funereal pall has enclosed us. My colleagues, generally, tell me that they cannot write macros, that it is much too complicated, especially wildcard macros.

If I ask if they ever use Word’s Find & Replace, they all admit that, yes, they do. “Congratulations,” I say, “because each time you use Find & Replace, you have written a macro! You just haven’t recorded it.”

The only thing we have to fear about macros is our fear of macros.

I suppose, technically, Find & Replace is not macro writing, but truly, a macro is just a way to find some sequence and do something to that sequence — be it bold the sequence, highlight it, replace it with another sequence, delete it, whatever.

Most everyone who uses Microsoft Word has recorded a keyboard macro. Word makes doing so very easy. Again, congratulations if you have written a keyboard macro, because you are on your way to macro wizardship.

There is a key to writing macros. It is a secret that macro wizards rarely share, but I’m going to share it with you. The secret is wrapped up in a single magical word: analysis. Analysis of what you need a macro to do is the key to writing a macro. Sure, you need some arcane language (what good is wizardry without arcane language all its own?), and all of the arcane language you need to write the macros can be found in Jack Lyon’s Macro Cookbook for Microsoft Word and in Wildcards in MS Word Macros, a compilation of information on wildcards that Jack Lyon wrote for his blog years ago and that you can download for free by clicking the title link. Alternatively, you can use the Wildcard Find & Replace Macro found in EditTools to “write” the macros for you. But analysis is the real key to writing macros.

Consider this problem: You have a list of 100 references in which the styles are all over the place. Author names are often listed like this:

Arnold, J. H., K. L. Swift, and A.J.H. Archimedes.

but you need the author names to look like this:

Arnold JH, Swift KL, Archimedes AJH:

You can fix the names manually or by using macros. Manually will take nearly forever, so the better method is to use macros. Here is where analysis matters.

When I began using macros, I saw this problem and thought, “How can I write a macro to fix these author names?” I was thinking of a single macro to take care of it all. I quickly discovered that a single macro can’t do the job, but a series of macros that can be combined into a single macro could. The key was a series of macros, which meant that I needed to break the problem down into solvable (or macroable) parts.

The first part is Arnold, J. H., which I need to change to Arnold JH,. What I need to find is as follows:

  1. any mix of letters of varying length
  2. that is followed by a comma
  3. and a space
  4. a single uppercase letter
  5. followed by a period and a space
  6. a single uppercase letter
  7. followed by a period
  8. and a comma and a space

I need to replace the find list with

  1. the mix of letters found in 1
  2. the space found in 3
  3. the single uppercase letter found in 4
  4. the single uppercase letter found in 6
  5. and the comma and space found in 8

Note that what I no longer need is not included in the list of replace with items (i.e., find items 2, 5, and 7). Also note that, in analyzing what needs to be found, items that I no longer want are listed on their own lines in the find list.

If you are using Word’s Find & Replace dialog with Use Wildcards checked, you would manually enter the following Find string [paired parens represent the information on a single line in the find list, thus, ([A-z]@) represents line 1: any mix of letters of varying length]:

([A-z]@)(,)( )([A-Z]{1,1})(. )([A-Z]{1,1})(.)(, )

And the following Replace string (the backslash+number represents the corresponding find item, e.g., \1 represents line 1: any mix of letters of varying length and \8 represents line 8: a comma and a space):

\1\3\4\6\8

I can hear you groan. But it isn’t as difficult as it appears. All of the information to write the strings is available in the downloadable Wildcards in MS Word Macros document (just click on the link).

If you are using the EditTools’ Wildcard Find & Replace Macro, you click buttons to make your selection and the code is written for you. An added feature with the Wildcard Find & Replace Macro is that you can save this find and replace so that you can reuse it in the future; with Microsoft’s Find & Replace, the strings cannot be saved. However, what I used to do before I created the Wildcard Find & Replace Macro — and recommend that you do — was keep a special Word document with these strings in it so I could copy and paste when needed in the future. I set up the file like this:

1. Change Arnold, J. H., to Arnold JH,
Find: ([A-z]@)(,)( )([A-Z]{1,1})(. )([A-Z]{1,1})(.)(, )
Replace: \1\3\4\6\8

Once you have entered the strings in either Microsoft’s Find & Replace dialog or in the Wildcard Find & Replace Macro, click Replace All and all author names that fit this particular format will be altered. Then move to the next series to analyze, which is to change K. L. Swift, to Swift KL,. In this instance, what I need to find is as follows:

  1. a single uppercase letter
  2. followed by a period and a space
  3. a single uppercase letter
  4. followed by a period
  5. and a space
  6. any mix of letters of varying length
  7. that is followed by a comma
  8. and a space

I need to replace the find list with:

  1. the mix of letters found in 6
  2. the space found in 8
  3. the single uppercase letter found in 1
  4. the single uppercase letter found in 3
  5. the comma found in 7
  6. and the space found in 5

What I no longer need is not included in the list of replace with items (i.e., find items 2 and 4). Also note that, in analyzing what needs to be found, items that I no longer want are listed on their own lines in the find list.

If you are using Word’s Find & Replace dialog with Use Wildcards checked, you would manually enter the following Find string:

([A-Z]{1,1})(. )([A-Z]{1,1})(.)( )([A-z]@)(,)( )

And the following Replace string:

\6\8\1\3\7\5
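For anyone who wants to experiment with this "series of macros" idea outside Word, the two find/replace pairs can be sketched in Python. This is an analogue only, with a few assumptions: @ is written as +, literal periods are escaped as \., the redundant {1,1} is dropped, and the names RULES and run_rules are invented for the example.

```python
import re

# Each rule mirrors one find/replace pair from the text, rewritten for Python.
RULES = [
    # Arnold, J. H.,  ->  Arnold JH,
    (r"([A-Za-z]+)(,)( )([A-Z])(\. )([A-Z])(\.)(, )", r"\1\3\4\6\8"),
    # K. L. Swift,  ->  Swift KL,
    (r"([A-Z])(\. )([A-Z])(\.)( )([A-Za-z]+)(,)( )", r"\6\8\1\3\7\5"),
]


def run_rules(text: str) -> str:
    """Apply every rule in order, feeding each result into the next rule."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


names = "Arnold, J. H., K. L. Swift, and A.J.H. Archimedes."
fixed = run_rules(names)
print(fixed)
```

The third name is untouched because neither rule describes its pattern; it would need further rules of its own, which is exactly why the job takes a series of macros.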

I said that you can’t save the strings as a macro if you are using Word’s Find and Replace dialog. That is true as far as it goes, but it doesn’t go all that far. There is a way to save the strings as a true macro without using EditTools’ Wildcard Find & Replace Macro. What you do is record a simple Find and Replace macro, for example, find bush and replace it with blues, using Word’s Record Macro feature, and give it a name like WildcardAuthorCorrection1; be sure to keep a list of what that macro does (or will do once you edit it). (If you don’t know how to record a simple macro, the fastest and best way to learn is to use Jack Lyon’s Macro Cookbook for Microsoft Word. Within a few minutes you will be a master at recording simple macros and even at editing them.)

Open the newly recorded macro to edit it. Replace the .Text = "bush" entry with .Text = "[your find string]" and replace the .Replacement.Text = "blues" entry with .Replacement.Text = "[your replace string]", keeping the quotation marks around each string. Then change every item that is set to True to False, with one exception: change .MatchWildcards = False to .MatchWildcards = True.

Once you get hooked on macros, the possibilities are endless and you’ll never let go. More importantly, you will improve your editing speed, accuracy, and efficiency, which translates into a higher effective hourly rate and a more profitable editing business.

You’ve got nothing to fear — macros are conquerable!

November 7, 2012

The Business of Editing: Wildcard Macros and Money

I thought the mention of money might catch your interest :). But macros, especially wildcard macros, and money do go hand in hand. Consider the following two scenarios I recently experienced in the references of a project (same project, different chapters).

In the first scenario, there were, over two chapters, nearly 500 references that the authors had formatted like this:

Agarwal, S., Loh, Y. H., Mcloughlin, E. M., Huang, J., Park, I. H., Miller, J. D., Huo, H., Okuka, M., Dos Reis, R. M., Loewer, S., Ng, H. H., Keefe, D. L., Goldman, F. D., Klingelhutz, A. J., Liu, L. & Daley, G. Q. (2011) Telomere elongation in induced pluripotent stem cells from dyskeratosis congenita patients. Nature, 464, 292-6.

In the second scenario, the references were formatted like this:

Adhami F, G Liao, YM Morozov, et al: “Cerebral ischemia-hypoxia induces intravascular coagulation and autophagy.” Am J Pathol 2006;  169(2): 566-583.

What they need to look like is this:

Airley R, Loncaster J, Davidson S, et al. Glucose transporter glut-1 expression correlates with tumor hypoxia and predicts metastasis-free survival in advanced carcinoma of the cervix. Clin Cancer Res 2001;7(4):928-934.

The money question is: how do I get the references from where they are to where they need to be quickly and efficiently, so that I make money rather than lose it? The answer lies in wildcard macros.

For most editors this is a daunting task that needs to be tackled manually. In the first scenario, the editor will manually remove each extraneous period, manually move the year to precede the volume number, and manually correct the punctuation problems in the citation. In other words, most editors will spend a good two or three minutes — if not longer — correcting each reference entry. I, on the other hand, spent less than 30 minutes cleaning up these references and verifying the journal names.

It is not that I am a brilliant macro writer — I am not. A skilled macro writer is someone like Jack Lyon, the creator of the Editorium macros that so many of us use. Instead, what I am is a smart user of the tools that will help me accomplish what needs to be done. In this case, I am a smart user of EditTools’ Wildcard Find & Replace (WFR) macro tool.

WFR has been designed to make creating and using wildcard macros easy. You do not need to know how to write the macros; the tool will do it for you. Instead, you need to know how to tackle a problem and how to break it down into its component parts.

The first step is to find a pattern. Remember that macros are dumb and work on patterns. I began by analyzing the patterns in the author names: Agarwal, S., Loh, Y. H., Mcloughlin, E. M. I realized that, for example, each of the first names was represented by an initial followed by a period and a space except that the final initial was followed by both a period and a comma (e.g., Y. H.,). Thus each group was separated by a period-comma combination. I also noticed that some authors had a single initial and some had two initials (and I recalled from other reference lists that some authors had three initials).

Beginning with the single initial name, I used WFR to create the first macro. WFR lets me select from menus what I want (e.g., the Character menu gives me several options, including Exact Characters, Exclude Characters, lower case, UPPER CASE, Mixed Case) and based on my selection, WFR creates the entry for me (e.g., choosing UPPER CASE in the first field inserts an unlimited [A-Z]@ into the field, which WFR turns into ([A-Z]@), the correct form for a wildcard). I do not need to know how to write the entry; I need only give the correct instruction. Thus, the first thing I wanted the macro to find was the surname, which is mixed case. So from the menu of options, I chose Mixed Case and unlimited (unlimited because some surnames are short and others are long and I need to cover all of them) and WFR created ([A-Za-z]@) for me.

I continued to make my selections by filling in the fields in the WFR form so that in the end the fields were filled in for me like this (the @ indicates any number of the find criterion; the {1,1} indicates a minimum of 1 and a maximum of 1 of the find criterion; and in #3 and #7, preceding the { is a space):

Field #    Find          Replace
1          [A-Za-z]@     \1
2          ,             \3
3           {1,1}        \4
4          [A-Z]{1,1}    \6
5          .             \7
6          ,
7           {1,1}

The Replace fields are where I tell the macro what to replace the find with. Again, this can be achieved by making selections from a menu. The \4, for example, indicates that what I want is found in field #4. So the Replace information tells the macro that I want the found criteria replaced with Surname (#1), a space (#3), the initial (#4), a comma (#6), and a space (#7). WFR creates a wildcard find string that looks like this:

([A-Za-z]@)(,)( {1,1})([A-Z]{1,1})(.)(,)( {1,1})

and a replace string that looks like this:

\1\3\4\6\7

and when the macro is run, every author name that looks like

Agarwal, S.,

becomes

Agarwal S,
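The way the fields turn into a single grouped string can be imitated in a few lines of Python. This is a hypothetical sketch, not how WFR is implemented: the lists FIND_FIELDS and REPLACE_FIELDS are invented stand-ins for the form, and the patterns are adjusted for Python (+ instead of @, and \. for a literal period).

```python
import re

# Hypothetical stand-ins for the seven Find fields and five Replace fields.
FIND_FIELDS = [r"[A-Za-z]+", r",", r" {1,1}", r"[A-Z]{1,1}", r"\.", r",", r" {1,1}"]
REPLACE_FIELDS = [1, 3, 4, 6, 7]

# Wrap every Find field in parentheses so each one becomes a numbered group.
find_string = "".join("(" + field + ")" for field in FIND_FIELDS)
# Build the replacement from the group numbers: \1\3\4\6\7
replace_string = "".join("\\" + str(number) for number in REPLACE_FIELDS)

sample = "Agarwal, S., Loh, Y. H., Huang, J., Park, I. H., "
result = re.sub(find_string, replace_string, sample)
print(result)
```

Only the single-initial names change. The two-initial names do not fit this pattern and are left for a later rule.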

Clearly, this one macro is not enough to clean up all the variations. In fact, for the first scenario it took 11 macros just for the name cleanup. But this is another feature of WFR. After I create a macro, I can save it, with a lengthy description, in a file with similar macros so I can use the macro again without having to create it again. But to have to run 11 macros individually is time-consuming, so WFR will let me create a script that will run all 11 macros in whatever order I want them to run.

A script is easy to create — you just double-click on the macros you want to add to a script and then save them. The script can be added to or subtracted from at any time.

Ultimately, I created another set of four macros to deal with the author names in the second scenario. All of these macros — those for scenario 1 and those for scenario 2 — can be modified to deal with different patterns as the need arises. I will not have to keep reinventing the macros.

Another feature of WFR is that the macros are editable. If you discover that you should have included or omitted something, you do not need to recreate the entire macro; just choose to edit it.

And WFR lets you test the macro to make sure it works as you expect. (One note of caution when working with wildcard macros: It is best to turn tracking off. With tracking on, wildcard macros often produce bizarre results. Run the same macro with tracking off and everything works fine.)

It took me about 30 minutes to write all of the macros for both scenarios. Once I wrote them, however, when I came to the next chapter that needed the cleanup, the cleanup was done in less than a minute. Compare a less-than-a-minute cleanup time to the time it would take to do the cleanup manually. The wildcard macros make me money by making my work easy, quick, and efficient.

The beauty of EditTools’ Wildcard Find & Replace macro is that you do not need to be a macro guru to create these macros. You simply need to break the tasks down into steps and use WFR to create the macros for you. One important point that is worth repeating: Macros are dumb. They will do what you tell them to do even if they shouldn’t. It is still your responsibility as an editor to check the items. Macros do make mistakes.

If you haven’t tried WFR, you should. It is an easy way to delve into the world of wildcard macros. And unlike the wildcard feature of Word’s Find & Replace, WFR lets you save the macros for future use and gives you a way to run several wildcard macros sequentially without having to re-create them each time.
