Law Technology News
December 2000

Imaging Technology

Scanners: Every Law Firm Has One

By Storm Evans

SCANNERS ARE one of the most misunderstood and under-utilized pieces of office automation today, yet there seems to be at least one in every office. People are confused because they do not know exactly what to do with the scanner.

We scan documents for three basic reasons:

1. To produce an electronic (graphic) image that you can keep in your computer. This is almost like having a photocopy of a piece of paper in the physical file. In this use, the scanned document will never be changed. In fact, the purpose in scanning it is to keep a copy of it exactly as it is.

2. To turn a printed paper form into something you can type onto to fill in the blanks. The "forms typer" in Visioneer software and OmniForm software help you with that task. Once you scan in a form, the software creates blanks for you to fill in right on the scanned image, then you print, fax, or e-mail it.

3. To turn a copy of a document into a word processing document so that you can make changes to it, save it, and print it as you would if you had typed it into the word processing software itself. Sometimes we turn it into a word processing document even if we do not intend to edit it, just to make it available for word searches.

In the first two uses, you will not process the document using optical character recognition (OCR) software. You use OCR software only if you want to edit or search the document as text. Even a printed form is not converted into text because you want the text to remain locked in place while you enter data in boxes or blanks.

I began to work with OCR text in a law firm in 1978, with a technology that had none of the sophisticated tools that we have now, and with a scanner that invariably turned "g"s into "q"s. From that experience, I learned to look at the OCRed document as a series of components rather than as one character after another, and began to look for ways of systematically removing scanner "trash".

Scanner Tips

Here are some things to remember about the nature of scanned documents and suggestions for making the cleanup process easier.

* Make sure that your scanner software can handle multiple pages in a single scan. Nothing is more depressing than scanning a 20-page document into 20 documents and then pasting them back together to make a single document.

* Keep in mind that the scanners you see in the magazines and at the office supply stores are designed for home use. Those scanners do a great job scanning color photos, but are not as efficient at scanning black and white text. If you find that your firm is scanning lots of documents, a black-and-white Panasonic scanner with a sheet feeder ($2,000 and up) may be a good investment.

* Some of the all-in-one scanners and flat bed scanners cannot accommodate legal-size or A4 paper. The higher end scanners can, but so can the Visioneer PaperPort.

* OCR software tries to read every speck of black on a piece of paper. Make sure that the copy of the document you are scanning is a first- or second-generation photocopy. Set the scanner resolution low so that the scanner ignores some of the specks that the document picks up when it goes through a copier. Consider using liquid paper, redacting tape, or even sticky notes to cover up stray marks.

* Another, less labor-intensive trick is to review the document in the scanner software and use editing tools like the eraser and the "select and delete" capabilities to remove what you do not want to go to the OCR software.

* The OCR software is designed to help you format the document, as well as give you the characters. As a result, you will often get a document with extraneous tab settings, styles, and margin changes if the document was skewed during photocopying or during scanning, or if a font change makes the margins look different.

* Look at the document before you scan it. Maybe it would be easier to work with if you told your OCR software to ignore the margin, tab, and font changes.

* Set the OCR software to save the document to RTF (rich text format).

* It is also possible that the document you are working with is a combination of straight text and tables with numbers in them. You may find it more efficient to scan the text portion of the document and type the table portion (or, if the table portion is never going to be edited, perhaps it would be better inserted as a graphic instead of text).

* Save the document that you have just scanned to WordPerfect 5.1 instead of your usual word processor. WordPerfect 5.1 had fewer codes than the Windows word processing software has now, so the text produced by the OCR software will have fewer codes.

* Scan the document into WordPad format, open it in WordPad, and do a quick cleanup of extra hard returns and other "trash." Then copy the entire document (CTRL+A, then copy) and paste it into a blank Word or WordPerfect document. Guarantee: no extra codes if you do it this way, and you keep tabs, bold, and underscore.

* If you get the document into Microsoft Word and still find strange codes and styles, select all of the text (CTRL+A) and press CTRL+SPACEBAR to remove the codes. Then go to work formatting it.

Not Perfect

In the best of circumstances, you will probably have at least 25 errors on every double-spaced, letter-sized page. You will have to proofread it.

Once you have the document in your word processing software, use your head, and the tools in your word processor, to help you resolve problems. If you notice, for example, that the software often turns "g"s into "q"s, don't go line-by-line, character-by-character looking for them. You already know that most documents have very few "q"s in them, so use search and replace to find the "q"s and change them to "g"s. The same would be true if the software was recognizing "the" and interpreting it as "tho."
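For readers who are comfortable with a little scripting, the same search-and-replace idea can be sketched in Python. This is only an illustration, not a feature of any OCR product: the function names are made up for this example, and the rule that a "q" followed by "u" is probably genuine is an assumption you would need to check against your own documents.

```python
import re

# Hypothetical helpers for illustration only; the "qu" rule is an assumption.

def fix_q_for_g(text: str) -> str:
    """Change a lowercase 'q' to 'g' unless it is followed by 'u'."""
    return re.sub(r"q(?!u)", "g", text)

def fix_whole_word(text: str, wrong: str, right: str) -> str:
    """Replace a misread word only where it stands alone as a whole word."""
    return re.sub(rf"\b{re.escape(wrong)}\b", right, text)

sample = "Tho judqe will require tho filinq by Friday."
cleaned = fix_q_for_g(sample)
cleaned = fix_whole_word(cleaned, "tho", "the")
cleaned = fix_whole_word(cleaned, "Tho", "The")
print(cleaned)
```

Matching "tho" only as a whole word keeps the replacement from touching words such as "those" or "thorough," which is the same reason to use the whole-word option in a word processor's search and replace.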

Look for patterns in the OCR interpretation, then chase down the patterns to resolve "trash" quickly. Sometimes you will find a pattern after you have started proofreading a document. If so, put a set of characters to act as a bookmark so that you can resolve the problem that you have found and return to the point in the document where you first noticed it. I use ##. So, when I find a pattern, I put ## in the document where I want to return to it, and use the search feature to take me there when I am ready to resume proofreading.
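The bookmark trick translates directly to a script as well. The following is a minimal Python sketch, assuming the document has been saved as plain text and split into lines; the function names are hypothetical.

```python
BOOKMARK = "##"  # the marker described above; any unlikely character pair would do

def find_bookmark(lines, marker=BOOKMARK):
    """Return the 1-based number of the first line holding the marker, or None."""
    for number, line in enumerate(lines, start=1):
        if marker in line:
            return number
    return None

def clear_bookmark(lines, marker=BOOKMARK):
    """Return the lines with the marker removed once proofreading resumes."""
    return [line.replace(marker, "") for line in lines]

document = ["First paragraph.", "Second ##paragraph.", "Third paragraph."]
resume_at = find_bookmark(document)
document = clear_bookmark(document)
```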

Process the document as though it were on an assembly line. It is usually much more work and frustration to go line-by-line, character-by-character to clean up the document. So, spell check, then look for margin problems, fix the margin problems, then go back to the top of the document, look for format changes or text enhancements like bold or italics and fix them.
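The assembly-line approach can be pictured as a short list of passes, each of which fixes one kind of problem across the whole document before the next pass begins. The Python sketch below is an illustration under stated assumptions: the three passes shown (extra blank lines, stray hard returns, doubled spaces) are examples chosen for this sketch rather than the spell-check and margin steps named above, and all of the names are hypothetical.

```python
import re

def collapse_blank_lines(text: str) -> str:
    """Reduce runs of three or more hard returns to one paragraph break."""
    return re.sub(r"\n{3,}", "\n\n", text)

def join_broken_lines(text: str) -> str:
    """Turn a single hard return inside a paragraph into a space."""
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)

def squeeze_spaces(text: str) -> str:
    """Collapse runs of spaces or tabs into one space."""
    return re.sub(r"[ \t]{2,}", " ", text)

# Each pass handles one kind of "trash" for the entire document, in order.
PASSES = [collapse_blank_lines, join_broken_lines, squeeze_spaces]

def run_passes(text: str, passes=PASSES) -> str:
    for one_pass in passes:
        text = one_pass(text)
    return text

raw = "The  parties agree\nto the terms.\n\n\n\nSigned   today."
tidy = run_passes(raw)
```

Keeping each pass separate makes it easy to add, drop, or reorder steps as you learn which kinds of trash your own scanner tends to produce.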

Human Element

As with utilization of any technology, there is always a human element. Meeting the needs of the human element requires that you train people to use the technology.

Take some of the documents you would like to scan, decide which suggested techniques work for your office, and turn them into a training program complete with crib sheet so that people in your office will not be intimidated by the scanning or OCR process.

Another human element is determining which person in your office is best suited to cleaning up scanned documents. It must be someone who knows the word processing software well.

Many secretaries will tell you that it is faster to type the document over than it is to scan it and clean it up. I can certainly understand how they could feel that way. But I believe that with proper configuration of the software, proper attention to the quality of the original document, suitability of hardware and software, and with some training of the people who will be doing the cleanup, scanning is a highly efficient way of getting text into your word processing software.

Storm Evans is a practice support consultant, based in Philadelphia.

Copyright © 2000 NLP IP Company. All rights reserved.