An American Editor

March 27, 2017

The Business of Editing: The AAE Copyediting Roadmap VI

So far I’ve created a stylesheet and cleaned the document (see The Business of Editing: The AAE Copyediting Roadmap II), tagged the manuscript by typecoding or applying styles (see The Business of Editing: The AAE Copyediting Roadmap III), inserted bookmarks for callouts and other things I noticed while tagging the manuscript (see The Business of Editing: The AAE Copyediting Roadmap IV), and created the project- or client-specific Never Spell Word dataset and run the Never Spell Word macro (see The Business of Editing: The AAE Copyediting Roadmap V). Now it’s time to tackle the reference list.

Fixing Reference Callouts

Before I get into the reference list itself, I need to mention another macro that I run often but not on all files: Superscript Me. Nearly all of the manuscripts I work on want numbered reference callouts superscripted and without parentheses or brackets. The projects usually adhere to AMA style. Unfortunately, authors are not always cooperative and provide reference callouts in a variety of ways, including inline in parentheses or brackets, superscripted in parentheses or brackets, with spacing between the numbers, and on the wrong side of punctuation. Superscript Me, shown below, fixes many of the problems.

Superscript Me

I select the fixes I need and run the macro. Within seconds the macro is done. One note of caution: It is important to remember that macros are dumb — macros do as instructed and do not exercise any judgement. Consequently, even though Superscript Me fixes many problems, it can also create problems. My experience over the decade that I have been using this particular macro has been that the fixing is worth the errors that the macro introduces, even though they require manual correction during editing. The introduced errors are few, whereas the fixes are often hundreds.

Tip: Superscript Me is a powerful, timesaving (and therefore profitmaking) macro, but as noted above, it is dumb and just as it can do good, it can do harm — especially to reference lists. Before using Superscript Me on the manuscript, move the reference list to its own file. Doing so will ensure that Superscript Me makes no changes to the reference list, only to the main text material, saving a lot of undo work.
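To make the idea concrete, here is a minimal Python sketch of the kind of fix described above. It is not the Superscript Me macro itself: the function name and the regular expression are hypothetical, and `<sup>` tags stand in for Word's superscript formatting. The second example shows how "dumb" such a fix is, since it cannot tell a year in parentheses from a reference callout.

```python
import re

# A bracketed or parenthesized numeric callout such as " [1, 2]" or " (3-5)",
# optionally followed by a period or comma that should end up before the callout.
CALLOUT = re.compile(r"\s*[\[(](\d+(?:\s*[,\-–]\s*\d+)*)[\])]([.,]?)")

def superscript_callouts(text):
    """Hypothetical helper: move callouts after punctuation and mark them <sup>."""
    def fix(match):
        numbers = re.sub(r"\s+", "", match.group(1))  # "1, 2" -> "1,2"
        numbers = numbers.replace("-", "–")           # hyphen -> en dash in ranges
        return f"{match.group(2)}<sup>{numbers}</sup>"
    return CALLOUT.sub(fix, text)

# A genuine callout is fixed ...
print(superscript_callouts("as shown previously [1, 2]."))
# as shown previously.<sup>1,2</sup>

# ... but a year in parentheses is "fixed" too, which is an introduced error.
print(superscript_callouts("Smith (1986) reported"))
# Smith<sup>1986</sup> reported
```

Run on a reference list, a pattern like this would mangle every year or issue number in parentheses, which is the practical reason for moving the list to its own file first.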

Wildcarding the Reference List

By this point, the reference list has been generally cleaned and moved to its own file.

Tip & Caution: Wildcard macros can be a gift from heaven or a disaster from hell. I like to do what I can to ensure they are a gift and not a disaster. Consequently, I move the reference list to its own file. I know I have said this before, but wildcarding is another reason for separating the reference list from the manuscript file. Often what I want changed in a reference list, I do not want changed in the primary text; similarly, what I want changed in the primary text, I do not (usually) want changed in the reference list. But like all other macros, wildcards are dumb and cannot tell text from reference list. It can do no harm moving the reference list to its own file and working on it separately from the main text, so be cautious and move it.

Individual problems, however, have not been addressed. I scan the list to see what the problems are and whether the problems are few or many. For example, if author names are supposed to be

Smith AB, Jones EZ

but are generally punctuated like

Smith A.B., Jones E. Z.

or in some other way not conforming to the correct style, I will use wildcard macros and scripts to correct as many of these “errors” as I can. Wildcards can address all types of reference format errors, not just author-name errors. For example, a common problem that I encounter is for the cite information to be provided in this format:

18: 22-30, 1986.

or

1986 Feb 22; 18: 22-30.

when it needs to be

1986;18:22–30.

These formatting errors are fixable with wildcards and scripts.
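As an illustration only, the fixes in these examples can be sketched with Python regular expressions. Word's wildcard syntax differs from Python's, and these patterns are assumptions written for this sketch, not the wildcard strings used in EditTools.

```python
import re

# Initials written with periods and optional spaces, e.g. "A.B." or "E. Z."
INITIALS = re.compile(r"\b[A-Z]\.(?:\s?[A-Z]\.)*")

# "18: 22-30, 1986."  (volume: pages, year.)
VOL_PAGES_YEAR = re.compile(r"(\d+):\s*(\d+)-(\d+),\s*(\d{4})\.")

# "1986 Feb 22; 18: 22-30."  (year month [day]; volume: pages.)
YEAR_DATE_VOL_PAGES = re.compile(
    r"(\d{4})\s+[A-Z][a-z]{2}(?:\s+\d{1,2})?;\s*(\d+):\s*(\d+)-(\d+)\."
)

def fix_author_initials(text):
    """Strip periods and spaces from runs of initials: 'A.B.' -> 'AB'."""
    return INITIALS.sub(lambda m: re.sub(r"[.\s]", "", m.group(0)), text)

def fix_cite_info(text):
    """Rewrite both problem formats as year;volume:first–last."""
    text = VOL_PAGES_YEAR.sub(r"\g<4>;\g<1>:\g<2>–\g<3>.", text)
    return YEAR_DATE_VOL_PAGES.sub(r"\g<1>;\g<2>:\g<3>–\g<4>.", text)

print(fix_author_initials("Smith A.B., Jones E. Z."))  # Smith AB, Jones EZ
print(fix_cite_info("18: 22-30, 1986."))               # 1986;18:22–30.
print(fix_cite_info("1986 Feb 22; 18: 22-30."))        # 1986;18:22–30.
```

The initials pattern would also strip the periods from an abbreviated journal name such as "J. Med.", so a real pattern needs more context than this sketch provides.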

Scripts are like a supermacro. A script is a collection of many individual wildcard macros that have been combined into one macro — the script — and run sequentially. One of my reference scripts is shown here:

Wildcard Find & Replace Script

In the image, the active script file (#1) is identified and what it does (broadly) is described in the description field (#2). The wildcard macros that are included in the script and the order in which they will run are shown in the bottom field (#3). Included is a description of what each of the included wildcard macros will do (#4). For example, the first wildcard macro that the script will run will change Smith, C., to Smith C, and the second wildcard macro to run will change Smith, A.B., to Smith AB,.
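The notion of a script as an ordered collection of find-and-replace steps can be sketched in a few lines of Python. The list name, the function name, and the two patterns below are hypothetical stand-ins for the first two wildcard macros described above. They are Python regular expressions, not WFR wildcard strings.

```python
import re

# Each step: (description, find pattern, replacement). The steps run in list order.
AUTHOR_SCRIPT = [
    ("Smith, C., -> Smith C,",
     r"\b([A-Z][a-z]+), ([A-Z])\.,", r"\1 \2,"),
    ("Smith, A.B., -> Smith AB,",
     r"\b([A-Z][a-z]+), ([A-Z])\.([A-Z])\.,", r"\1 \2\3,"),
]

def run_script(script, text):
    """Apply every find-and-replace step to the text, one after another."""
    for _description, pattern, replacement in script:
        text = re.sub(pattern, replacement, text)
    return text

print(run_script(AUTHOR_SCRIPT, "Smith, C., Jones, A.B., Platelet adhesion."))
# Smith C, Jones AB, Platelet adhesion.
```

Keeping the description next to each pattern mirrors the description column in the script dialog and makes it easier to see, months later, what each step was meant to do.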

The wildcard macros were created using the Wildcard Find and Replace (WFR) macro shown below. In the image, the example wildcard macro (arrow) is the same as the second macro in the script above, that is, it changes Smith, A.B., to Smith AB,.

Wildcard Find & Replace

Creating the macros using WFR is easy as the macro inserts the commands in correct form for you (for more information, see the online description of WFR). Saving the individual wildcard macros, assembling them into scripts, and saving the scripts, as well as running individual wildcard macros or scripts, is easy with WFR. (For some in-depth discussion of wildcards, see these essays: The Business of Editing: Wildcarding for Dollars; The Only Thing We Have to Fear: Wildcard Macros; and The Business of Editing: Wildcard Macros and Money.)

With some projects I get lucky and only a few of the authors’ references are a formatting mess; when there are only a messy few, I fix them manually rather than run the macros.

Fixing Page Ranges

If the references are in pretty decent shape (formatwise) so that I do not need to run WFR, I will run the Page Number Format macro (shown below) to put the page range numbering in the correct format. For example, the macro will automatically change a range of 622-6 to 622–626, 622–6, or 622.

Page Number Format
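The three outcomes mentioned for 622-6 can be sketched as follows. This is a hypothetical Python illustration, not the Page Number Format macro. The function name and the style labels "full", "short", and "first" are invented for the sketch.

```python
import re

# Two page numbers joined by a hyphen or an en dash.
PAGE_RANGE = re.compile(r"\b(\d+)[-–](\d+)\b")

def format_page_range(text, style="full"):
    """Rewrite page ranges as full (622–626), short (622–6), or first page only (622)."""
    def fix(match):
        first, last = match.group(1), match.group(2)
        if len(last) < len(first):
            # Borrow the missing leading digits from the first page: 622-6 -> 626
            last = first[: len(first) - len(last)] + last
        if style == "first":
            return first
        if style == "short" and len(first) == len(last):
            # Drop the leading digits the two pages share, keeping at least one digit
            i = 0
            while i < len(last) - 1 and first[i] == last[i]:
                i += 1
            last = last[i:]
        return f"{first}–{last}"
    return PAGE_RANGE.sub(fix, text)

print(format_page_range("622-6", "full"))   # 622–626
print(format_page_range("622-6", "short"))  # 622–6
print(format_page_range("622-6", "first"))  # 622
```

Like any pattern-based fix, this one will also rewrite things that merely look like page ranges, so it is best run on the reference list alone.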

Making Incorrect Journal Names Correct

At long last it is time to run the Journals macro. As my journals datasets have grown, they have made reference editing increasingly more efficient. It takes time to build the datasets, but the Journals Manager (shown below) lets me build multiple datasets simultaneously.

The Journals Manager

As shown in the image, I can build five datasets (arrows) simultaneously. My primary dataset — AMA with Period — has 212,817 journal entries (see circled items).

Tip: Move the reference list to its own file to shorten the time it takes to run the Journals macro. The larger your journals dataset, the more time the Journals macro requires to complete a run. Each iteration of the Journals macro searches from the top of the document to the end as it looks for matches. Leaving the reference list in the manuscript means the macro has that much more to search. In a recent timing test of the Journals macro using my primary dataset and a 50-page document with 110 references without separating out the list, the macro was still running after 2 hours and was not near completion. Running the Journals macro with the same dataset and on the same reference list, but with the list in its own file, took less than 10 minutes. (Think about how long it would take you to manually verify and correct 110 journal names.)

The Journals macro searches through the reference list for journal names and compares what is in the reference list against what is in the chosen dataset. If the name in the reference list is correct, the macro highlights it in green (#5), as shown below; if it is incorrect, the macro corrects it and highlights the change in cyan (#6). All changes are done with Tracking on.

The Reference List After Running Journals Macro

The Journals macro does two things for me: First, if the incorrect variation of the journal name is in the dataset, it corrects the incorrect journal name so that I do not have to look it up and fix it myself (see #6 above). If the incorrect variation is not in the dataset, the macro makes no change. For example, if the author has written New Engand J. Med but that variation is not in the dataset, it will be left, not corrected to N Engl J Med. When I go through the reference list, I will add the variation to the dataset so it is corrected next time. Second, if the journal name is in the dataset, it highlights correct names, which means that I know at a glance that the journal name is correct and I do not have to look it up (see #5 above).
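The three outcomes (correct and highlighted green, corrected and highlighted cyan, or left alone) can be sketched as a simple lookup. This is only an illustration. The tiny dataset and the function name are hypothetical. The sketch also looks up one name at a time, whereas the macro itself, as the timing tip above notes, searches the document for each dataset entry.

```python
# Hypothetical miniature dataset: author variation -> correct form.
# Correct forms map to themselves so they can be recognized as correct.
DATASET = {
    "N Engl J Med": "N Engl J Med",
    "New England Journal of Medicine": "N Engl J Med",
    "N. Engl. J. Med.": "N Engl J Med",
}

def check_journal(name):
    """Return (name to use, highlight) for one journal name from a reference."""
    correct = DATASET.get(name)
    if correct is None:
        return name, "unhighlighted"   # not in the dataset: the editor looks it up
    if correct == name:
        return name, "green"           # already correct
    return correct, "cyan"             # known variation: corrected

print(check_journal("N Engl J Med"))       # ('N Engl J Med', 'green')
print(check_journal("N. Engl. J. Med."))   # ('N Engl J Med', 'cyan')
print(check_journal("New Engand J. Med"))  # ('New Engand J. Med', 'unhighlighted')
```

Adding the misspelled variation to the dataset is a one-line change, after which the third lookup would come back corrected and cyan.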

It is true that the names of some of the more frequently cited journals become familiar over time but there are thousands of journals and even with the frequently cited ones with which I am familiar, correcting an incorrect name takes time.

It is important to remember that time is money (profit) and that the less time I need to spend looking up journal names, the more profit I make.

After I run the Journals macro, I open the Journals Manager (see above) and I go through the reference list, doing whatever editing is required and fixing what needs fixing that my macros didn’t fix. Because of the current size of my journals datasets, there aren’t usually many journal names that are not highlighted. When I come to one that is not highlighted either green (indicating it is correct) or cyan (indicating it was incorrect but is now correct), I look up the name and abbreviation in the National Library of Medicine online catalog and other online sources. When I locate the information, I add it and the most common author variations (based on my experience editing references for more than 30 years) to the five datasets via the Journals Manager.

I take the time to add the journal and variations because once the variations have been added, I’ll not have to deal with them again. Spend a little time now, save a lot of time in the future.

In addition to editing the references for format and content, I also keep an eye out for those that need to be removed from the reference list and placed in text — the personal communication–type reference — and for those that need to be divided into multiple references. When I come across one, I “mark” it using a comment. For example, using the Insert Query macro (which is discussed in the later essay The Business of Editing: The AAE Copyediting Roadmap X), I insert the comment shown below for unpublished material:

Query for Unpublished Material

When I come to the in-text callout during the manuscript editing, I move the reference text to the manuscript, delete the callout and the reference, and renumber using the Reference # Order Check macro (which is discussed in the later essay The Business of Editing: The AAE Copyediting Roadmap VIII).

Now that the Journals macro has been run and the references edited, the next stop on my road is the search for duplicate references, which is the subject of The Business of Editing: The AAE Copyediting Roadmap VII.

Richard Adin, An American Editor

November 21, 2016

EditTools: Duplicate References — A Preview

The current version of EditTools is nearly 1 year old. Over the past months, a lot of work has gone into improvements to existing functions and in creating new functions. Shortly, a new version of EditTools will be released (it will be a free upgrade for registered users).

New in the forthcoming version is the Find Duplicate References macro, which is listed as Duplicate Refs on the References menu as shown here:

Duplicate Refs on the References Menu


The preliminaries

The macro works with both unnumbered and numbered reference lists (works better when the numbers are not autonumbers, but it does work with autonumbered lists). It also works with the reference list left in the manuscript with the text paragraphs and when the reference list has been moved temporarily to its own file (it works, like other reference-specific macros in EditTools, better when the references are moved to a separate, references-only file).

Like all macros, the Find Duplicate References macro is “dumb”; that is, it only finds identical references. The following image shows references 19 and 78 as submitted for editing.

Original References


As the image shows, although references 19 and 78 are identical references and are likely to appear identical to an editor, they will not appear identical to the Find Duplicate References macro. Items 1 and 2 show a slight difference in the author name (19: “Infant”, 78: “Infantile”). The journal names are different in that in 19 the abbreviated name is used (#3) whereas in 78 the name is spelled out (#4). Finally, as #5 and #6 show, there are a couple of differences in the cite information, namely, the order, the use of a hyphen or en-dash to indicate range, and the final page number.

Because any one of these differences would prevent the macro from pairing these references and marking them as potentially identical, it is important that the references go through a round of editing first. After editing, which for EditTools users should also include running the Journals macro, the references are likely to look like this:

The References After Editing


If you compare the same items (1 and 2, 3 and 4, 5 and 6) in the above image, you will see that they now better match. (Ignore the inserted comments for now; they are discussed below.) One more step is required before the Find Duplicate References macro can be run — you need to accept all of the changes that were made. Remember that in Word, when changes are made with Tracking on, the material marked as deleted is not yet deleted; consequently, when the macro is run, the Tracked items will interfere (as will any comments, which also need to be deleted). The best method is to (1) save the tracked version, (2) accept all the changes, (3) use EditTools’ Comment Editor to delete any comments, and (4) save this clean version to run the Find Duplicate References macro.

After accepting all changes and deleting the comments, the entries for references 19 and 78 look like this:

The References After Changes Accepted


Running the macro

When the Find Duplicate References macro is run, the following message box appears.

Find Duplicate References Message Box


To run the macro, the macro has to be told where to begin and end its search. If the references are in a separate file from the rest of the manuscript, check the box indicating that the references are in a standalone document (#5) and click Run (#6). If the references are in a file with other material, use bookmarks to mark the beginning and ending of the list as instructed at the top of the message box (#1). To make it easier, the Bookmarks macro now has buttons to insert these bookmarks:

The dupBegin and dupEnd Bookmark Insert Buttons


The Find Duplicate References macro matches a set number of characters, including spaces. The default is 120 (#4) but you can change the number to 36, 48, 60, 72, 84, 96, or 108 using the dropdown arrow shown at #4 in the Find Duplicate References message box above.

The macro does a two-pass search, one from the beginning of the reference and another from the end of the reference, which is why a list of duplicates may have repetitions.
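A rough Python sketch of a two-pass comparison follows. It assumes the passes compare the first and the last N characters of each reference and that any leading reference number is ignored. Both points are assumptions made for the sketch, as are the function name and the sample references, and none of this is the macro's actual code.

```python
import re
from collections import defaultdict

def find_possible_duplicates(references, length=120):
    """Group list positions (1-based) whose first or last `length` characters match."""
    heads, tails = defaultdict(list), defaultdict(list)
    for position, ref in enumerate(references, start=1):
        # Assumption: drop a leading "19. " style number before comparing
        body = re.sub(r"^\d+\.\s*", "", ref.strip())
        heads[body[:length]].append(position)   # pass 1: from the beginning
        tails[body[-length:]].append(position)  # pass 2: from the end
    hits = []
    for table in (heads, tails):
        hits += [tuple(positions) for positions in table.values() if len(positions) > 1]
    return hits

refs = [
    "19. Roth GJ: Developing relationships. Blood 77(1):5–19, 1991.",
    "20. Ruggeri ZM: Structure and function of von Willebrand factor. Thromb Haemost 82(2):576–584, 1999.",
    "78. Roth GJ: Developing relationships. Blood 77(1):5–19, 1991.",
]
print(find_possible_duplicates(refs, length=36))
# [(1, 3), (1, 3)]  (the pair is reported once by each pass)

refs[2] = refs[2].replace("5–19", "5-19")  # hyphen instead of en dash
print(find_possible_duplicates(refs, length=36))
# [(1, 3)]  (only the pass from the beginning still matches)
```

The second run shows why consistent editing matters: a single hyphen left in place of an en dash is enough to hide the match from one of the passes.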

The results of the search appear like this:

List of Possible Duplicate References


(They appear as tracked changes only if the macro is run with Tracking on; if Tracking is off, the results appear as normal text.) Note the title of the duplicates is “Duplicate Entries (Nondefinitive).” The reason for “Nondefinitive” is to remind you that the macro is “dumb” and there is no guarantee that the list includes all duplicates or that all listed items are duplicated. Much of the macro’s accuracy depends on the consistency of editing, including formatting.

For the examples in this essay, the Find Duplicate References macro was run on a list of 735 references and the list of possibilities shown represents those likely duplicate references the macro found. Note that references 19 and 78 were found (#19 and #78 indicate the portions of those references found duplicated by each pass of the macro); however, if, for example, in editing the page range separator in #19 was left as an en-dash in reference 19 and in reference 78 as a hyphen, the macro would not have listed the material at #19 as there would not have been a match. Similarly, if the author name in reference 19 had been left as “Infant” and in reference 78 as “Infantile”, the macro would not have listed the material at #78 as there would not have been a match.

The next step is for the editor to determine which of the listed possibilities are duplicates. This is done using Word’s Find Navigation pane, as shown here:

Verifying Duplicate References


Copy part or all of what was found (#1) into the Find field (#2). Find will display the search results (“3 matches”) (#3); clicking the Browse button (the rightmost button at #3) lists the three matches found (#4 to #6). The first entry (#4) is always the text in the duplicates list (#1), which means that, in this example, the possible duplicates are #5 and #6. Click on the text marked #5 to see the complete text of that entry. Then compare that text to the text of the reference at #6. (It is possible for the macro to find more than two possible matches for the same text, and all, some, or none may be duplicates.)

Tip: Use comments to track duplicates


When I find a duplicate, I insert a prewritten, standardized comment (using EditTools’ Insert Query) to tell the client that references x and y are duplicates and that I am deleting one and renumbering it (see image below for a sample comment). I insert the comment at each of the duplicate references, although I slightly modify the comment so that it is appropriate for the reference to which it is being attached. The comment shown below is inserted at reference 78 and its language is appropriate for that reference. It tells the client that references 19 and 78 are identical and that reference 78 has been deleted and renumbered as 19. This type of comment is added to the version (e.g., the Track Changes version) of the reference list that will be given the client. The comment is added to the appropriate references as duplication is confirmed.

The Inserted Comment


The comment, in addition to serving as a message to the client, serves as a reminder message during editing of the manuscript. Duplicate references require renumbering so as to keep reference callouts in number order. For example, it may be that reference 78 is called out after the callout for reference 10 and before that for 19. In that case, reference 78 would be moved to position 11 in the list and renumbered as 11 and the comment would be modified (easy to do using EditTools’ Comment Editor). A prewritten note (another new EditTools feature) would be inserted at point 78 in EditTools’ Reference Number Order Check and reference 19 would be marked as deleted, the inserted comment (see above) would be modified, and a note would be added to Reference Number Order Check at point 19. (See the discussion below about the report.)


When editing of the manuscript is finished, have the Reference Number Order Check macro export a renumbering report to send with the edited file to the client. A partial sample report is shown here:

Sample Partial Renumbering Report


Every report bears the creator’s identification information (#1) and file title (#2). You set the creator information once and it remains the same for every report until you change it using a manager. The file title is set each time you create a report.

As the report shows, reference 78 was deleted and all callouts numbered 78 were renumbered as 19 (#3). The prewritten, standard message (a new feature) can be inserted with a mouse click; only the numbers need to be inserted or modified. The report shows that the renumbering stopped at callout 176 (#4) and started again at 197 (#5). Number 6 shows another deletion and renumbering.

Clients like these reports because they make it easy for authors, proofreaders, and others involved in the production process to track what was done.

The Find Duplicate References macro is a handy addition to EditTools. While it is easy in very short reference lists to check for duplicate references, as the number of references grows, checking for duplicates becomes increasingly difficult and time-consuming. The Find Duplicate References macro saves a lot of time, thereby increasing an editor’s profits.

Richard Adin, An American Editor

May 6, 2015

Business of Editing: Cite Work Can Be Profitable

A recent “Tip of the Week” at Copyediting Newsletter, “Citing Work: What Do Editors Really Need to Do?” by Erin Brenner, discussed the problem of editing citations. As the article pointed out, “what you do to citations and how long that takes can greatly affect your bottom line.” Unfortunately, the article repeated and reinforced the shibboleth that editing citations is not (and perhaps cannot be) profitable.

As I am sure you have already guessed, I disagree.

The Problem

The problem with references is that too many authors put them together in a slapdash manner, ignoring any instructions that the publisher may have given about formatting. And Ms. Brenner is right that straightening out the author’s mess can be both a nightmare and unprofitable.

Let me step back for a moment. I want to remind you of what I consider a fundamental rule about profitability in editing: the Rule of Three, which I discussed 3 years ago in “The Business of Editing: The Rule of Three.” Basically, the rule is that profitability cannot be judged by a single project; profitability needs to be judged after you have done three projects for a client. Yes, I know that most freelancers look at a single project and declare profitability or unprofitability, but that doesn’t make it the correct measure. Anyway, the reason I raise this here is that it is true that for a particular project, having to edit and format citations can make a project unprofitable. But then so can editing the main text.

I have edited many projects over my 31 years where I wished there were more references and less text because the text was badly written but the references were pristine. References are not the automatic key to unprofitability.

Another part of the problem is not being clear about your role as editor when it comes to the references. Copyeditors, for example, do not (should not) “fact check” references. When I have been asked to do so, I have clarified what the client really means, because I have no way of knowing if a cite actually supports the proposition to which it is attached. If the client really does mean “fact check,” which has yet to be the case, then I decline the project; I am simply not able to devote the time needed to read the cite and determine if it supports the author’s proposition and the client is not prepared to pay me to do so.

The copyeditor’s role is to conform the format of the cites to the designated style and to ensure the cite is complete. Whether the editor is supposed to complete the cite is a matter of negotiation. In my case, I limit that responsibility to a quick look at PubMed. If the cite isn’t readily found there, a quick author query is inserted and it becomes the author’s responsibility. I use EditTools’ Insert Query macro (see “The Business of Editing: The Art of the Query”) and select a prewritten query so that a comprehensive query can be inserted within a couple of seconds. One example query is this:

AQ: (1) Please confirm that cite is correct. Unable to locate these authors with this article title on PubMed. In addition, PubMed/NLM Catalog doesn’t list a journal by this name. (2) Also, please provide the following missing information: coauthor name(s), year of publication.

It is much quicker to select a prewritten query than to write it anew each time.

One Solution

Cite work can be very profitable. As with most of editing, whether it is profitable or not often comes down to using the right tool for the job at hand.

I just finished working on a chapter (yes, a single chapter in a 130-chapter book) that is 450+ manuscript pages of which about 230 pages are citations. In fact, there are 1,827 cites for the chapter, and all the journal cites (roughly 1,800 of the references) were similar to this:

6. Jackson, S.P., W.S. Nesbitt, and E. Westein, Dynamics of platelet thrombus formation. J Thromb Haemost, 2009. 7 Suppl 1: p. 17-20.

7. Roth, G.J., Developing relationships: arterial platelet adhesion, glycoprotein Ib, and leucine-rich glycoproteins. Blood, 1991. 77(1): p. 5-19.

8. Ruggeri, Z.M., Structure and function of von Willebrand factor. Thromb Haemost, 1999. 82(2): p. 576-84.

when they needed to be like this:

6. Jackson SP, Nesbitt WS, Westein E: Dynamics of platelet thrombus formation. J Thromb Haemost 7 Suppl 1:17–20, 2009.

7. Roth GJ: Developing relationships: Arterial platelet adhesion, glycoprotein Ib, and leucine-rich glycoproteins. Blood 77(1):5–19, 1991.

8. Ruggeri ZM: Structure and function of von Willebrand factor. Thromb Haemost 82(2):576–584, 1999.

As you can see by comparing what the authors provided and what the book style was, a lot of work needed to be done to go from the before to the after. Conforming 1,800 references the way editors usually do this type of work (that is, manually, period by period) could take many hours and thus be a losing proposition. Using the right tools for the job, it could take a few hours and be a money-making proposition. I was able to conform the references in less than 4 hours and for 3.5 of those 4 hours, I was able to do other editing work while the references were being conformed.

How? By using the right tools for the job, which, in this case, was EditTools’ Wildcard Find & Replace and Journals macros, which were topics of recent essays (see “The Business of Editing: Wildcarding for Dollars” and “The Business of Editing: Journals, References, & Dollars,” respectively).

[There is an important caveat to the above: I was able to conform the references in less than 4 hours because I already had my datasets built. Over the course of time, I have encountered these problems and I have added, for example, scripts to my Wildcard dataset and journal names to my Journals dataset (which now has 78,000 entries). If I didn’t already have the scripts, or if I had fewer scripts that would address fewer problems, it would have taken me longer. But a professional editor tries to plan for the future and the key to successful use of a tool is the tool’s ability to handle current-type problems in the future.]

To clean up the author names and the cite portion (i.e., 1991. 77(1): p. 5-19) I used EditTools’ Wildcard Find & Replace Macro. Because it lets me write and save a find-and-replace string and put multiple strings together in a single “script,” with the click of a button I was able to run several dozen macros that cleaned up those items. In addition, EditTools’ Page Number Format macro let me change partial page ranges (e.g., 110-19) to full page ranges (e.g., 110-119) automatically. It took less than 15 minutes for the full reference list to be conformed and should I face a similar task next week, I already have the necessary scripts; I just need to load and run them.
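For the cite portion alone, the before-and-after shown in the sample references can be approximated with a single pattern. The sketch below is a hypothetical Python illustration and not one of the saved EditTools scripts. It moves the year to the end, drops the "p.", and expands a partial page range, and it leaves the author names untouched.

```python
import re

# ", 1999. 82(2): p. 576-84."  ->  year, volume(issue) or supplement, first page, last page
CITE = re.compile(r", (\d{4})\. (.+?): p\. (\d+)-(\d+)\.")

def conform_cite(reference):
    """Rewrite 'Journal, year. vol: p. first-last.' as 'Journal vol:first–last, year.'"""
    def fix(match):
        year, volume, first, last = match.groups()
        if len(last) < len(first):
            # Expand a partial range: 576-84 -> 576–584
            last = first[: len(first) - len(last)] + last
        return f" {volume}:{first}–{last}, {year}."
    return CITE.sub(fix, reference)

print(conform_cite("Thromb Haemost, 1999. 82(2): p. 576-84."))
# Thromb Haemost 82(2):576–584, 1999.
print(conform_cite("J Thromb Haemost, 2009. 7 Suppl 1: p. 17-20."))
# J Thromb Haemost 7 Suppl 1:17–20, 2009.
```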

What took the most time was fixing the journals. My journals dataset is currently 78,000+ entries and the Journals macro has to run through 1,827 references 78,000+ times. But what it does is fix those incorrect entries it finds in the dataset and highlights them; it also highlights (in a different color) those journal names that are correct. What that means is that I can see at a glance which cites I need to check (in this case, just a handful). And while the EditTools Journals macro is running in the background, I can continue editing other files – which means I am getting paid twice (because I charge by the page, not the hour).

Is it Profitable?

Do I earn money on this? Yes, I do. Consider this example (the numbers have no relevance to what I actually charge; they are an example only): If I charge $3 per manuscript page and the references constitute 230 pages, it means the cost to the client is $690 regardless of whether the references take me 1 hour or 50 hours. In this case, to conform the references took about 4 hours. For those 4 hours, I earned $172.50 an hour as an effective hourly rate. The reality, of course, is that I still had to look over the references and look up a few, and I actually spent 7 hours on the references altogether, which means my effective hourly rate would be $98.57 at $3 per page. (Had I charged $25 an hour, I would have earned just $25 an hour, approximately one-quarter of the per-page rate earnings, which is why I prefer a per-page rate.) As you can calculate, at a different per-page rate, the earnings would have been higher or lower.
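The arithmetic in this example is easy to rerun with other rates. A small Python sketch, using only the example figures above (the variable and function names are made up for the illustration):

```python
PAGES = 230           # manuscript pages of references in the example
RATE_PER_PAGE = 3.00  # example per-page rate in dollars, not an actual rate

def effective_hourly_rate(fee, hours):
    """Effective hourly rate is the fixed fee divided by the hours actually spent."""
    return fee / hours

fee = PAGES * RATE_PER_PAGE
print(fee)                                   # 690.0
print(round(effective_hourly_rate(fee, 4), 2))  # 172.5
print(round(effective_hourly_rate(fee, 7), 2))  # 98.57
```

Changing RATE_PER_PAGE or the hours shows how quickly the effective rate moves: the fee is fixed by the page count, so every hour saved raises the rate.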

And that doesn’t count what I earned while continuing to edit as the Journals macro ran in the background.

My point is that using the right tools and the right resources can make a difference. I do agree that if I had to fact check each reference, I would not have made any money at a per-page rate (nor at an hourly rate because no client would pay for the time it would take to fact check 1,827 references — especially when this is only one of 130 chapters), but then I wouldn’t have done the work at such a rate (or at all). Whether a task is profitable depends on many factors.

The notion that editing references cannot be profitable is no more true than is the notion that editing text is always profitable. Editing references may not be stimulating work, but with the right tools it can be profitable. The key to profitable editing is to use the right tool for the job.

Richard Adin, An American Editor


March 2, 2015

The Business of Editing: Journals, References, & Dollars

In The Business of Editing: Wildcarding for Dollars, I discussed wildcard macros and how they can increase both accuracy and profitability. Profitability is, in my business, a key motivator. Sure I want to be a recognized, excellent, highly skilled editor, an editor who turns ordinary prose into extraordinary prose, but I equally want to make a good living doing so; I want my business to be profitable.

Consequently, as I have mentioned numerous times previously, I look for ways to make editing more efficient. The path to efficiency is strewn with missteps when editors think that all editing tasks can be made more efficient; they cannot. But there are tasks that scream for efficiency. Wildcard macros are one method and work very well for the tasks for which they are suited. A second method, which deals with references, is the EditTools Journals macro.

As I relayed in previous articles, I work on very long documents that often have thousands of references. My current project runs 137 chapters, approximately 12,000 manuscript pages, with each chapter having its own list of references, ranging in length from fewer than 100 to more than 600 references. And as is true of the text of the chapters, the condition of the references varies chapter by chapter. The goal, of course, is for all of the references to be similarly styled as well as accurate.

The first image shows a sample of how journal names were provided in one chapter. The second image shows how the names need to end up.

Journals in original

How the journals need to be

The question is how do I get from before to after most efficiently? The answer is the Journals macro.

The key to the Journals macro is the Journals dataset. In my case, I need journal names to conform to the PubMed style. However, I could just as easily create a dataset for Chicago/MLA style (American Journal of Sociology), CSE (Cell Biochem Funct.), APA (Journal of Oral Communication,), AAA (Current Anthropology), or any other style. The image below shows the Journals Manager with my PubMed dataset open. The purple arrow shows a journal name as provided by an author; the blue arrow shows the correct PubMed name of the journal, that is, to what the macro will change the wrong form.

PubMed dataset in Journals Manager

The next image shows a sample APA-style dataset. The red arrow shows the abbreviated version of the journal name and the green arrow shows the full name to which it will be converted by the macro.

APA style in Journals Manager

As I stated, nearly all my work requires PubMed styling, so my PubMed dataset is by far the largest. If you look at the PubMed dataset image above, you will see that as of this writing, the dataset contains more than 64,000 journal name variations. “Variations” is the key word. Authors give journal names in all kinds of styles, so to cover the possibilities, a single journal may have two dozen entries in the dataset.

The key to creating the dataset is to make use of the Journals Manager, and to keep adding new variations and journals as you come across them: Spend a little time now to make more money every future day. The images of the Manager shown above show you the primary interface. The problem is that it would take an inordinate amount of time to add each possible variation individually. The smarter method is to use the Multiple Entries screen, as shown here:

Journals Manager Multiple Entry dialog

With the Multiple Entry dialog open, you enter a variation in the #1 field. By default, all of the trailing punctuation is selected (#2), but you could choose among them by deselecting the ones you didn’t want. For example, if the style you work in requires that a journal name be followed by a comma, you might want to deselect the comma here because this is the list of “wrong” styles and having a trailing comma would not be “wrong.” Clicking Add (#3) adds whatever you have typed in #1 to the main screen (#4) along with the selected trailing punctuation. In the example, I entered N Engl J Med once in #1, left the default selection in #2, clicked Add (#3), and had five variations added to the main field (#4) — I did not have to type N Engl J Med five times, just the once.

I then repeated the process for N. Engl. J. Med. (#4) and am prepared to repeat it for New Engl J Med. (#1). I will repeat the process for a variety of variations in an attempt to “kill” multiple possibilities at one time. When I am done, I will click OK (#5), which will take me back to the main Manager screen, shown here:

Journals Manager AFTER Multiple Entry

The main Manager screen — after using the multiple entry dialog — shows in faint lettering “Use ‘Multiple Entries’ button to adjust” in the Add Journal field (#1). This means two things: First, it tells you that there are journal variations waiting to be added to the dataset, and second, that if you want to modify the list of waiting names, either by adding or deleting, click the Multiple Entries button to bring the dialog back up for editing. If you are ready to add to the dataset, the next step is to tell the macro to what the “wrong” versions should be corrected. This is done by typing the correct form in the Always correct journal field (#2).

If your style was to add a comma after the correct form, you could enter the correct name trailed by a comma here. In the example I show, you would just add the comma after Med. But that might not be the best way to do it because you then lose the ability to use the dataset for a style that is identical but that doesn’t use the comma. There is an alternative, which we will get to. What is necessary, however, is that the correct form be entered here so the macro knows what to do. After entering the correct form (#2), click Add (#3) to add all of the variations and the correct form to the dataset.

The macro will not add duplicate entries so no need to worry about having an entry appear multiple times in the dataset. The macro automatically checks for duplicates. When you are done adding for this session, click Save & Close. (Tip: If you plan to add a lot of entries in one sitting, every so often click Save. That will save the dataset with the newest entries and let you continue to add more. Until Save or Save & Close is clicked, any entries are not permanently part of the dataset.)
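As a rough illustration of the Multiple Entries idea and the duplicate check described above, here is a minimal Python sketch. It is not EditTools code: the function names are hypothetical, and the list of trailing punctuation marks is an assumption, because the text does not say exactly which five marks are selected by default.

```python
# Hypothetical sketch of "type the name once, get every punctuation variant".
# The marks below are an assumption; the article does not list them.
TRAILING_MARKS = ["", ".", ",", ";", ":"]


def expand_variations(name, trailing_marks=TRAILING_MARKS):
    """Return the typed name once for each selected trailing mark."""
    return [name + mark for mark in trailing_marks]


def add_to_dataset(dataset, variations, correct_form):
    """Map each wrong variation to the correct form.

    Entries already present are skipped, so nothing is stored twice.
    Returns the number of entries actually added.
    """
    added = 0
    for variation in variations:
        if variation not in dataset:
            dataset[variation] = correct_form
            added += 1
    return added


dataset = {}
pending = expand_variations("New Engl J Med")
first_pass = add_to_dataset(dataset, pending, "N Engl J Med")
second_pass = add_to_dataset(dataset, pending, "N Engl J Med")
```

In this sketch the second call adds nothing, which mirrors the point that duplicates are not a concern when adding entries.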

Once you have your dataset, you are ready to unleash the Journals macro. It is always a good idea to put the reference list in a separate file before running the macro, but that can’t always be done. Separating the references into their own file helps speed the macro.

When ready to run the macro, click Journals (red arrow below) on the EditTools Tab.

EditTools Tab

Clicking Journals brings up this dialog with options:

Journals Macro Options

Here is the best place to select trailing punctuation you want added to the correct journal name. Clicking on the dropdown (#1) will give you the choice of comma, period, semicolon, colon, or the default “none.” If you choose, for example, semicolon, every time a journal name is corrected, it will be followed by a semicolon. Note, however, that if the journal name is correct already except that it doesn’t have the trailing punctuation, the punctuation will not be added. In other words, New Engl J Med will be corrected to N Engl J Med; but N Engl J Med will be left as it is. In this instance, using the other system (adding the punctuation to the correct name in the dataset) will work better.
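The behavior described here, where the chosen punctuation is appended only when a name is actually corrected, can be sketched in a few lines of Python. This is an illustration under assumptions, not the macro's code; `correct_journal` and the one-entry dataset are hypothetical.

```python
def correct_journal(name, dataset, trailing=""):
    """Return the corrected name plus the chosen trailing punctuation.

    A name that is not listed as a wrong variation is returned untouched,
    so an already-correct name does not gain the punctuation.
    """
    if name in dataset:
        return dataset[name] + trailing
    return name


# Hypothetical one-entry dataset: wrong form -> correct form.
dataset = {"New Engl J Med": "N Engl J Med"}

fixed = correct_journal("New Engl J Med", dataset, trailing=";")
untouched = correct_journal("N Engl J Med", dataset, trailing=";")
```

The second call shows the limitation: the name was already correct, so no semicolon is added. That is why storing the punctuation with the correct form in the dataset can be the better choice.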

If your manuscript has endnotes or footnotes with references, clicking #2 will instruct the macro to search those items as well. You can also tell the macro to make the journal names italic, nonitalic, or as they currently are. In this instance, the macro will only change those journal names it highlights. For example, if it doesn’t change/highlight N Engl J Med because it is not in the dataset, it will not change the text attribute of it either.

Clicking #4 lets you change the dataset file to be used by the macro and #5 starts the macro running.

The results of running the Journals macro depend on your dataset. Clearly, the larger your dataset (i.e., the more journals and variations it contains), the greater the impact the macro will have on your reference list. The following image shows the results of running the Journals macro. The Journals macro makes use of track changes and color highlighting. As the first instance (#1) shows, the incorrect journal name, Am. J. Kidney Dis. Off. J. Natl. Kidney Found., was corrected to Am J Kidney Dis and highlighted in cyan. The cyan tells me that the name is now correct. Note that the change was made with tracking on, which gives me the opportunity to reject the change. The green highlight (#2) tells me that the journal name Pharmacotherapy was correct as originally provided. And #3 tells me that this journal name variation is not found in my dataset. At this juncture, I would look up the journal in PubMed Journals, open the Journals Manager, and add the variation and other needed variations of the name to the dataset so that next time it will be found and corrected.

Results of Running the Journals Macro
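The three outcomes described above (corrected, already correct, not in the dataset) amount to a simple lookup. The Python sketch below is only an approximation of that logic: how the macro actually recognizes a correct name is not stated, so treating "appears as a correct form in the dataset" as the test is an assumption, and the "Pharmacother." variation is invented for the example.

```python
def classify(name, dataset):
    """Return (resulting_text, status, highlight) for one journal name.

    dataset maps wrong variations to correct forms. The cyan and green
    highlights follow the article; the article does not say how an
    unknown name is marked, so None is used here.
    """
    if name in dataset:
        return dataset[name], "corrected", "cyan"
    if name in set(dataset.values()):
        return name, "already correct", "green"
    return name, "not in dataset", None


# Hypothetical two-entry dataset.
dataset = {
    "Am. J. Kidney Dis. Off. J. Natl. Kidney Found.": "Am J Kidney Dis",
    "Pharmacother.": "Pharmacotherapy",
}

results = [
    classify("Am. J. Kidney Dis. Off. J. Natl. Kidney Found.", dataset),
    classify("Pharmacotherapy", dataset),
    classify("J Unknown Stud", dataset),
]
```

Names that come back as "not in dataset" are the ones worth looking up and adding, so the next run catches them.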

I know this seems like a lot of work, and it is when you are starting out to build the dataset. But as your dataset grows, so do your profits. Consider this: If the reference list you need to check is 100 entries, how long does it take you to check each one manually? I recently checked a reference list of 435 entries. The author names were done incorrectly (see The Business of Editing: Wildcarding for Dollars for examples) and the year-volume-pages portion of the references was also in the incorrect order. Most, though not all, of those errors I was able to correct in less than 10 minutes using wildcarding. That left the journal names.

Nearly every journal name was incorrectly done. With my large dataset (over 64,000 variations), it took the Journals macro 32 minutes to correct the journal names. (Nine entries were not journals and so were not in the dataset and seven incorrect journal names were not in the dataset and had to be added afterward.) I still had to go through each entry in the reference list, but to complete a review of the reference list and make any additional corrections that were needed took me an additional 2 hours and 10 minutes. In other words, I was able to completely edit a 435-entry reference list, fixing all of the formatting problems and incorrect journal names, in less than 3 hours.
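For readers who want to check the arithmetic, the figures quoted above add up as follows. This is a back-of-the-envelope sketch; the 10 minutes for wildcarding is treated as an upper bound because the text says "less than 10 minutes."

```python
# Times quoted in the text, in minutes.
wildcard_minutes = 10            # upper bound ("less than 10 minutes")
journals_macro_minutes = 32
manual_review_minutes = 2 * 60 + 10

total_minutes = wildcard_minutes + journals_macro_minutes + manual_review_minutes
hours, minutes = divmod(total_minutes, 60)

references = 435
seconds_per_reference = total_minutes * 60 / references
```

The total is 172 minutes, or 2 hours 52 minutes, which works out to a little under 24 seconds per reference.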

How quickly could you have done the same?

Combining macros is a key to efficiency. Recognizing that a problem has a macro solution and then knowing how to impose that solution can be the difference between profit and no profit. Using macros wisely can add fun and profit to the profession of editing.

Richard Adin, An American Editor

March 19, 2014

The Business of Editing: Recordkeeping II

In The Business of Editing: Recordkeeping I, I discussed the importance of keeping records to determine whether it is better for you to charge by, for example, the page or the hour. But that article gave a very limited view of why recordkeeping is important.

Businesses run on data. As freelancers, we are well aware of the reliance of corporate clients on data: the data is used to determine everything from whether a new edition of a book should be undertaken to how much should be budgeted to produce the book. Although we do not have the same issues to think about, those that we do have are equally weighty for our business.

For most freelancers, the beginning year(s) are devoted to accepting paying work of any type. When I first started, I accepted book editing, book proofreading, journal article editing, advertising, desktop publishing, and whatever other assignments came my way. And I kept detailed data on every one of those assignments.

Every couple of months I would analyze the data, but it wasn’t until I had about a year’s worth of data that I could draw conclusions. The data told me that for me:

  • advertising work didn’t pay
  • proofreading didn’t pay
  • book editing was the most lucrative work — but only if
    • it was on a per-page or project-fee basis
    • the manuscripts were of a sufficiently large size
    • the work was nonfiction
    • the work was not for academic presses
    • the work was not directly with the author
    • the work was copyediting

I also learned other things, such as what types of subject matter were best for me and that I could increase profitability by working with other editors.

Let me emphasize that the above were lessons I learned based on my experience and my data. I am not suggesting that they are true lessons for anyone else. Rather, the point is that the collection of data can help direct your business into the areas that are most lucrative for you.

Data also helps guide marketing efforts. Once I learned what was best for me, I was able to focus my marketing efforts on those services and (potential) clients. I stopped trying to be all things to everyone; instead I focused solely on those things that had the greatest potential to help me reach my goals. Once I realized that editing fiction was less lucrative for me than editing nonfiction, I eliminated my marketing efforts to fiction publishers and refocused my efforts to nonfiction publishers.

All of that is well and good, but the focusing of my efforts was not the biggest boon I got (and continue to receive) from data collection. Rather, the biggest boon is identifying those projects that were financially more successful and those that were less successful.

With that identification (which is something you cannot readily do if you charge by the hour because hourly charging makes all projects equally successful, regardless of whether that is the best or least success you can have), I was able to focus on what made one project more successful than another. I was able to glean the stumbling blocks.

One example: I discovered that projects that had hundreds of references with each chapter were a mixed bag of success. Those that were second or subsequent editions were more likely to have greater success than first editions because authors would often follow the citation formatting of the prior edition, but if it was a first edition, there often was no uniformity to the style the authors followed.

I also discovered that the two primary problems that I encountered with references were wrong journal abbreviations and wrong format of author names. The questions were (1) could these problems be solved or at least mitigated and if so, (2) what are the solutions? The solutions took some time to formulate, but having identified the problems, I could focus. The ultimate result was the creation of my Journals macro and the Wildcard Find & Replace macro. My journals database now approaches 20,000 entries (see Business of Editing: The Logistics of Large Projects for more information), which makes checking and correcting journal names easy and accurate. The Wildcard macro makes it possible to fix many of the incorrectly formatted author names. Combined, the two macros significantly reduce the time I need to spend on the references.

Of course, other problems also needed addressing, but I would not have been able to identify common problems in the absence of the data; in the absence of the data, I would have been able to identify only the problems in an individual project, which may not have recurred in other projects.

Ultimately, the more information you can parse from the projects you work on and can categorize, the more you will be able to identify common problems among your projects that you can address. The more of these that you address, the more profitable you can make your business.

There are all kinds of data worth collecting, but I have found one of the most valuable to be my churn rate; that is, how many pages an hour I can edit. That number varies by project and project complexity, but I have found it important to track. I know that I need to churn a minimum number of pages per hour (on average across a project) to meet my goals. When I see that a certain type of project consistently falls short of that minimum number, I know that I need to rethink accepting such projects.
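Tracking churn rate takes very little machinery. The Python sketch below is one possible way to do it, not a description of the records actually kept; the project figures, the per-page fee, and the minimum of 8 pages an hour are all invented for illustration.

```python
def churn_rate(pages, hours):
    """Pages edited per hour, averaged across a project."""
    return pages / hours


def effective_hourly_rate(per_page_fee, pages, hours):
    """What a per-page fee works out to per hour actually spent."""
    return per_page_fee * pages / hours


def falls_short(projects, minimum_pages_per_hour):
    """Return the names of projects whose churn rate is below the minimum."""
    return [
        name
        for name, (pages, hours) in projects.items()
        if churn_rate(pages, hours) < minimum_pages_per_hour
    ]


# Hypothetical records: project name -> (pages, hours).
projects = {
    "medical text, 2nd edition": (600, 60),
    "first edition, heavy references": (300, 50),
}

slow_projects = falls_short(projects, minimum_pages_per_hour=8)
rate = effective_hourly_rate(per_page_fee=4.0, pages=600, hours=60)
```

With these made-up numbers, the first-edition project churns 6 pages an hour and is flagged, while the other churns 10 pages an hour, which at $4.00 a page is an effective rate of $40 an hour.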

As I hope is evident, data is the lifeblood of even a freelancer’s business. The more effort you put into collecting and analyzing data regarding your work, the more likely it is that your goals will be met. This endeavor is well worth the time and effort required.

What data, if any, do you collect and analyze? How often do you review the information? Has it helped guide your business?

Richard Adin, An American Editor

July 29, 2013

Making the Decision to Move to Lightspeed

The one thing that is true about technology is that obsolescence is built-in. Whatever you buy today will, after a manufacturer-predefined time, begin to turn to dust.

As I have mentioned in other posts, I have my computers custom built locally. That allows me to choose higher-quality components from manufacturers that I prefer. One part from here, one from there. More importantly, my local computer shop warrants those components for 3 years.

For years, I replaced our computers every 18 to 24 months. The technology was changing rapidly and by the time 2 years had passed, it was like moving from the paleolithic era of computing to the future. And the more complex my macros became and the more I wanted them to do, the greater the computing power I needed and wanted. The one thing I didn’t want was to have time to twiddle my thumbs while waiting for my macros to do their tasks.

About 6 years ago that changed with the building of our current computers. I had finally hit the top of the hill. Sure, changes continued to occur and components continued to improve, but none of them would have had much of an impact on my needs. The machine I had custom built would outlive my editing career, or so I hoped.

As with all such wishes, there was something I forgot: components are designed to fail. Manufacturers don’t want things to live forever, so obsolescence is built-in.

This past week, my boot drive began to fail. I could tell because it took longer and longer to boot up my computer each morning; because where once the computer easily handled a dozen open applications simultaneously, now it struggled to handle three or four; because I would be working and suddenly everything would freeze for a few seconds.

I also began to notice that my data drive was generating errors. Reads and writes to the data drive (a separate physical drive from my boot drive) took a little bit longer; instructions weren’t being carried out quite as fast (or so it seemed) as in previous times.

Although these two hard drives were high-quality drives when purchased, time had passed them by. All traditional-type hard drives have moving parts, parts that eventually wear out. The one thing I didn’t want to happen was for the drive to start writing data to bad sectors, causing corruption, so I took the hints I was being given and called my local computer shop.

It took 3 hours of downtime, but in that time, I went from what now seems like crawling to near the speed of light. It previously took a little more than 90 seconds to boot up; now it takes less than 20 seconds. The cure was not only new drives but going to solid-state drives (SSDs).

Unfortunately, SSDs are expensive, at least double the price of traditional drives. But with that increased price comes compactness (four SSDs fit within the same space as one traditional drive), no moving parts to wear out (although these drives do eventually lose the ability to write to the disks, they, supposedly, never lose the ability to read from them), no heat generation, and no noise (no moving parts to make noise).

An advantage of custom building my computers is the ease with which these types of repairs can be made. As I have noted previously, all of my hard drives are hot swappable, which means that I can pull them out of their slot without turning off and opening the computer, and I can put a different hard drive in the slot and access it. It makes for great backup and for easy storage when I travel. It also means that my computer shop could do the repair in my office — I didn’t need to be without the computer for more than a few hours. (Most of the time I was “down” was spent cloning my old drives to the new SSDs. The physical replacement of the drives and getting Windows to recognize the new drives took only a few minutes.)

Now that I have new primary hard drives, I am thinking about updating my remaining traditional hard drives (six of them: one for storage of completed projects; one to hold my imaging backups; four in my NAS [network-attached storage] box for my daily backups) to SSDs. I am unlikely to do that upgrade soon because of the cost and the lack of real need. None of those drives get the use that my two primary drives receive.

The upgrades I will be doing in the coming few months are upgrades to my motherboard, processor, RAM, and video cards. With the new SSDs, my Journals macro that took nearly 26 minutes to run through 15,000 dataset entries on a list of 500+ references now takes closer to 11 minutes (see Business of Editing: The Logistics of Large Projects). It will be even faster once I upgrade the motherboard, processor, and RAM to ones that can take full advantage of the SSDs.

With my computer working great with just the SSD upgrade, why would I consider spending even more money to upgrade these other components? Because I will get a high return on my investment — I will make back the cost of the upgrade in just a couple of projects. Remember, I charge by the page so that the faster and more efficiently I can process data, the higher my effective hourly rate will be (see Thinking About Money: What Freelancers Need to Understand).

In other words, I am investing in my business. My pre-SSD computer configuration performed well for about 6 years. I received an excellent rate of return on that investment. Now it is time to invest for the next 6 years (or longer). When I make decisions about whether to buy new equipment/components, the biggest factors in my decision-making process are the answers to these questions: “Will it help to increase my effective hourly rate?” and “If it will, how quickly will it do so?”

If the answer to the first question is no, then I proceed no further. The only time I will buy is when I must because of, for example, a component failure. I haven’t bought a tablet for work because a tablet can neither improve my speed/efficiency nor positively affect my effective hourly rate. If the answer to the first question is yes, but the answer to the second question is a time frame that I think is too long, such as 20 or more projects or 1 year or longer, then I also do not buy. The return on investment is not sufficient to justify the investment. I need to wait for further technological improvements.

In the case of the hard drives, the decision had to be made whether to buy traditional drives or the SSDs. I decided to buy the SSDs because the answer to the first question was yes and to the second it was no more than 2 projects to recoup the price differential; in other words, it made fiscal sense to spend more now to reap long-term benefits.
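The two questions can be written down as a small decision rule. The Python sketch below is one reading of the test described above, not a formula taken from it. The dollar figures are invented, and the cutoff of 20 projects comes from the "20 or more projects" example in the text.

```python
import math


def projects_to_recoup(extra_cost, gain_per_project):
    """Whole projects needed before the extra earnings cover the cost."""
    return math.ceil(extra_cost / gain_per_project)


def worth_buying(raises_effective_rate, payback_projects, max_projects=20):
    """Apply the two questions in order.

    1. Will it increase the effective hourly rate? If not, stop.
    2. How quickly? A payback of max_projects or more is too slow.
    """
    if not raises_effective_rate:
        return False
    return payback_projects < max_projects


# Hypothetical numbers: SSDs cost $300 more than traditional drives and
# the time saved is worth about $175 per project.
payback = projects_to_recoup(extra_cost=300, gain_per_project=175)
decision = worth_buying(raises_effective_rate=True, payback_projects=payback)
```

With these invented figures the price differential is recouped in 2 projects, so the rule says to buy. A purchase that does not raise the effective hourly rate is rejected regardless of cost.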

What analysis do you do when deciding whether to buy new equipment or to upgrade your current equipment or even what type of equipment to buy?

May 1, 2013

Business of Editing: The Logistics of Large Projects

As I wrote in my previous post, Business of Editing: Taking On Too Much, I have been hired to help edit a portion of a very large project. My portion runs to 5,000 manuscript pages, which have to be edited within 6 weeks.

After having written about the ethical issues of having undertaken a project that was bigger than the original editors could handle, I thought it would be worthwhile to discuss some of the logistical problems of massive projects. Let’s begin at the beginning: This project, before editing of any chapters, ran approximately 8,000 manuscript pages. (I use approximately deliberately as this was the in-house editor’s estimate; I only know with certainty the page count for the chapters I have actually received.)

Projects of that size are the types of project that I often receive and over the years, I have developed a system for working with such massive amounts of manuscript. In fact, it was because of my receiving projects of that size that I developed EditTools. As you can imagine, with such projects consistency becomes a problem, and the stylesheet seems to grow on its own.

The first logistical problem I address is that of editors: How many editors will be needed to edit the manuscript within the time allotted by the schedule? I built my business, Freelance Editorial Services, around the idea that a team of editors can do better financially than a solo editor. Although this notion has been disputed many times over the years, I still believe it to be true, based on discussions that I have with solo colleagues. It is this team concept that enables me to undertake such large projects with confidence, knowing that I will have a sufficient number of well-qualified editors to do the work.

The second logistical problem I address is the online stylesheet and giving access to it to the editors who will be working on the project. I discussed my online stylesheet in Working Effectively Online V — Stylesheets. When several editors work collaboratively on a project, this online stylesheet enables all of the editors to see what decisions have been made, and to conform their decisions with the decisions that have been made by coeditors. Consequently, if an editor makes a new editorial decision (i.e., it has not been previously decided by an editor and inserted on the stylesheet) to use distension rather than distention, or to use coworker rather than co-worker, all of the other editors can see that decision within seconds of its being entered into the stylesheet and can conform their editing to that decision or dispute it. It also means that errors can be caught and corrected. For example, if an editor enters adriamycin, another editor can correct it to Adriamycin (it is a brand name, not a generic drug) and immediately notify all editors of the original error and correction.

In addition, my client also has access to the stylesheet. The client can view and print it, but not modify it. This serves two purposes: (a) the client can provide proofreaders with up-to-the-minute copies of the stylesheet and (b) the client can look at our editorial decisions and decide that he would prefer, for example, distention rather than distension, notify an editor of the preference, and the editor can then make the change and notify all of the coeditors, who can then make any necessary corrections in chapters not already submitted to the client.

The third logistical problem I address is the creation of a starter NSW (Never Spell Word) file for the project. The Never Spell Word module of EditTools is where known client preferences are stored. For example, if I know that the client prefers distention to distension, I enter into the NSW file the information to change instances of distension to distention. Also into this file go editorial decisions, such as marking DNA as an acronym that never needs to be spelled out and marking the acronym US (for ultrasound) as one that should always be spelled out as ultrasound. The NSW file also serves to remind editors of other editorial-decision–related information. I provide each editor with a starter NSW file and each editor will add to their NSW file as they edit.

The NSW macro is run before beginning editing a chapter. Its purpose is to promote consistency across chapters and to make it easier for an editor to visually see editorial decisions that have been made. The NSW macro includes several components. For example, my basic NSW for medical editing also includes a dataset for drugs and organisms. Its use helps speed editing by providing visual clues, such as an indication that a drug name is correct even though the spell checker is flagging it as erroneous — it becomes one less thing that I need to verify.

The fourth logistical problem I tackle is references. These projects often have lots of references. One chapter of the project that I just received, for example, runs 305 manuscript pages, of which there are 61 pages of references, a total of 652 references (most of the chapters have more than 300 references). Dealing with references can be time-consuming. My approach is to separate the references from the main chapter, putting them in their own file. This serves four purposes: (a) Microsoft, in its wisdom, has determined that if spell check determines there are more than some number of errors in a document, it will display a message that there are too many errors for Word to display and turn off spell check. Although spell check is not perfect, it is a tool that I do use when editing. I would rather it flag a correctly spelled word as misspelled, giving me an alert, than risk my missing something. Spell check is a tool, not a solution. (However, it does help that EditTools helps me create custom dictionaries so that correct words that are currently flagged as errors by spell check can easily be added to a custom dictionary and not flagged in the future.) By moving the references to their own file, I eliminate this problem of Word turning off spell check for too many errors.

(b) It provides me with an opportunity to run my Journals macro. Every time I come across a new variation of a spelling of a journal name, I add it to one of my journal datasets. My PubMed (medical) journals dataset currently has 14,675 entries. With the references in a separate file, I can run that dataset against the reference list and have my macro correct those journal names that are incorrect (assuming the information is in my dataset) and mark as correct those that are correct. What this means is that rather than having to check journal names for 652 references in a chapter, I have to do so for at most a handful. It also means that I can concentrate on the other reference errors, if any, such as missing author names. Instead of spending several hours on the references alone, I can edit the references in a much shorter amount of time. (It took 26 minutes for the Journals macro to check the 652 references against the 14,675 entries in the dataset.)

(c) The third purpose is that separating the references from the main text lets me run the Page Number Format macro. In less than a minute, I had changed the page numbers in the 652 references from 1607-10 to 1607-1610 format. How long would it take to do this manually? Having the references in their own file means I do not have to worry about the macro making unwanted changes in the main text, especially as this macro runs without tracking.
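To show what the 1607-10 to 1607-1610 conversion involves, here is a short Python sketch of the same transformation. It is not the Page Number Format macro itself, which works inside Word; the function name and the sample reference string are hypothetical.

```python
import re

# Two runs of digits joined by a hyphen, e.g. "1607-10".
PAGE_RANGE = re.compile(r"\b(\d+)-(\d+)\b")


def expand_page_ranges(text):
    """Expand abbreviated end pages so 1607-10 becomes 1607-1610."""

    def expand(match):
        start, end = match.group(1), match.group(2)
        if len(end) < len(start):
            # Borrow the missing leading digits from the start page.
            end = start[: len(start) - len(end)] + end
        return f"{start}-{end}"

    return PAGE_RANGE.sub(expand, text)


sample = "N Engl J Med 2013;45:1607-10."
expanded = expand_page_ranges(sample)
```

Note that a pattern this simple would also expand something like 2013-14 in running text. That is the kind of unwanted change that keeping the references in their own file is meant to avoid.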

(d) The fourth purpose of separating the references from the main body of the chapter is that it lets me run my Wildcard Find & Replace macro just on the references. There is no chance that I will use the macro and make unwanted changes to the main text. WFR is efficient because it lets me create a macro that works, such as one to close up the year-volume-pages cite, and save it for future reuse. WFR even lets me combine several of the macros into a single script (that also can be saved for repeat use) so that the macros run sequentially in my designated order. As an example: I have created macros to change author names from the format Author, F. H., to Author FH,. If you have to do this manually for several thousand author names, you begin to appreciate the power and usefulness of WFR and how much time it can save. (I also will use WFR on the main text when appropriate. What I avoid by separating out the references is the possibility of something happening to either the main text or the references that shouldn’t.)
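The author-name change mentioned above (Author, F. H., to Author FH,) is a classic pattern-matching job. The sketch below uses Python's regular expressions rather than Word's wildcard syntax, so it illustrates the idea and is not the WFR macro; the pattern only handles surnames followed by spaced, dotted initials, and the sample names are invented.

```python
import re

# Surname, comma, then one or more dotted initials, then a comma.
AUTHOR = re.compile(r"\b([A-Z][A-Za-z'-]+), ((?:[A-Z]\. ?)+),")


def fix_author_names(text):
    """Rewrite 'Author, F. H.,' as 'Author FH,'."""

    def rewrite(match):
        surname = match.group(1)
        initials = re.sub(r"[. ]", "", match.group(2))
        return f"{surname} {initials},"

    return AUTHOR.sub(rewrite, text)


sample = "Smith, F. H., Jones, A. B., et al."
fixed = fix_author_names(sample)
```

Real reference lists contain more formats than this one pattern covers, which is why being able to save several patterns and run them in sequence matters.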

The above steps are among those I take that make handling of large projects easier and more profitable. There are additional things that I do for each chapter, but the point is that by dealing with manuscript in a logical way, projects become manageable. In addition, by using the right tools, editing is more accurate, consistent, and faster, which leads to a happy client, more work, and increased profitability.

Do you have any thoughts on how to handle large amounts of manuscript? Do you take any special steps for preparing a manuscript for editing or while editing?
