Legacy Data and the Web: Steps to a Successful Marriage (Page 2)

| Internal Print | Printers | Customer-Directed | Printers | Vendor/Supplier | Printers |
|---|---|---|---|---|---|
| Cash reports | Line Printer | Invoices | AFP or DJDE | Purchase Order | PCL |
| Accounts Payable | Line Printer | Escrow Analysis | AFP or DJDE | Invoice | AFP |
| Accounts Receivable | Line Printer | Welcome Letters | DJDE | Checks | AFP |
| Store or Warehouse logs | Line Printer | Inquiry responses | PCL | Inventory inquiry | PCL |
| Shop floor logs | Line Printer | Checks | AFP | Credit notices | PCL |
| Build reports | Line Printer | Insurance policies | DJDE | Inspection reports | Line Printer |
| Inventory reports | PCL | User manuals | DJDE | Pre-paid orders | PCL |
| Payroll reports | Line Printer | Benefits packages | DJDE | Credit requests | PCL |
| Payroll checks | AFP | Gift certificates | PostScript | Shipping orders | PCL |
| Audit reports | Line Printer | Letters of Credit | PCL | | |
| Employee directories | AFP | Marketing material | PostScript | | |

What do all of those cryptic printer designations really mean to you as you try to get to the web? Each one has information in the data stream designed to work with a specific type of printer, which means it carries a lot of stuff your web browser will not understand at all.  Here's the quick tour:

AFP:  IBM's Advanced Function Printing, used by a large number of manufacturers to drive printers at the high, middle and low end of the print speed range. AFP is a structured file format that is generally found in large mainframe environments, but also AS/400 and some UNIX environments. Using AFP structures you can change fonts, add electronic forms, and even switch output bins at the printer.

DJDE: Xerox's Dynamic Job Descriptor Entry, another method of adding structure to data.  Using DJDEs embedded in the data you can cause electronic forms to be added, change fonts, and manipulate the printer.  There is also Xerox metacode, a lower-level language produced by many applications in the financial industry, which works with the DJDEs to produce sophisticated printing.

PCL: The Printer Control Language defined by Hewlett-Packard, and extended by many other manufacturers.  PCL strikes fear in the hearts of many of us who work in datastream transforms because there are so many variations.

PostScript: The Adobe-defined language for printers, now moving into the high-end digital printers.

Line Printer: The term we apply to those devices that look or behave like very fast typewriters. These are often found on shop floors, in back offices, or in the IT print room churning out boxes of print each day.  They may use print chains or moving balls, but what they have in common is that the data passed to them includes control characters that tell them when to advance the paper, when to overprint to create bold type, and even when to change typefaces on printers that support it. Line print can also be fed to Xerox DJDE/metacode printers, AFP printers, and most other printers as long as the proper commands precede the data. And this is where it gets interesting.  Industry estimates indicate that most printing done today is still conditioned for line printers, even though it may be heading for a more sophisticated device.
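To make those control characters concrete, here is a minimal sketch in Python. It assumes the data uses ASA carriage control, one common convention in which the first character of each record tells the printer what to do before printing. Your files may use machine carriage control or something else entirely, and the names `overlay` and `render_asa` are ours, purely for illustration.

```python
# Minimal sketch: interpret ASA carriage-control line data.
# Assumption: the first character of every record is an ASA control code:
#   ' ' advance one line, '0' two lines, '-' three lines,
#   '+' no advance (overprint), '1' skip to the top of a new page.

def overlay(base, extra):
    """Combine two strikes of the same line, keeping non-blank characters."""
    width = max(len(base), len(extra))
    base, extra = base.ljust(width), extra.ljust(width)
    return "".join(b if b != " " else e for b, e in zip(base, extra)).rstrip()


def render_asa(records):
    """Return a list of pages; each page is a list of (text, bold) rows."""
    pages = [[]]
    for record in records:
        control, text = record[:1], record[1:]
        page = pages[-1]
        if control == "1":
            if page:                      # start a new page unless this one is empty
                pages.append([])
                page = pages[-1]
            page.append((text, False))
        elif control == "+" and page:
            previous, _ = page[-1]
            # The same text struck twice is the classic line-printer "bold".
            is_bold = previous.rstrip() == text.rstrip()
            page[-1] = (overlay(previous, text), is_bold)
        else:
            blank_lines = {"0": 1, "-": 2}.get(control, 0)
            page.extend([("", False)] * blank_lines)
            page.append((text, False))
    return pages
```

Feeding it the records `"1REPORT"`, `"+REPORT"`, `" line one"` yields one page whose first row is flagged as bold, which is the kind of intent you will want to carry over to the web rather than lose.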

That's the landscape. Lots of data, lots of variations in how that data might be conditioned today.  In later columns we will drill down to the guts of the data, but this sets a good baseline.
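If you are not sure which of these formats a given print file carries, the first few bytes often give a hint. The sketch below is a rough guess, not a definitive test: it relies on common conventions (PostScript jobs usually open with `%!`, PCL jobs with an escape character, AFP structured fields with X'5A' followed by a length and an identifier starting X'D3'). DJDE records are typically recognized by a prefix string set up at each site, so the sketch does not try to detect them. The function name `guess_datastream` is hypothetical.

```python
# Rough sketch: guess a print file's datastream from its first bytes.
# These signatures are common conventions, not guarantees.

UEL = b"\x1b%-12345X"  # Universal Exit Language, often precedes PJL commands


def guess_datastream(data: bytes) -> str:
    if data.startswith(b"%!"):
        return "PostScript"
    if data.startswith(UEL):
        # A PJL wrapper can hold either PCL or PostScript.
        return "PJL-wrapped job"
    if data.startswith(b"\x1b"):
        return "PCL"
    # AFP structured field: X'5A', a 2-byte length, then an ID starting X'D3'.
    if len(data) >= 4 and data[0] == 0x5A and data[3] == 0xD3:
        return "AFP"
    return "unknown (possibly line data or DJDE)"
```

Anything that falls through to "unknown" is a candidate for a closer look by someone who knows the application.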

The next step is to look at the nature of the print: does it print one-side only in portrait orientation (print along the narrow axis of the page)? Or do you have landscape print as well? Extra-wide? Extra-narrow? Two-up? Four-up? Duplex?  Letter or A4? Monarch? If it's getting a bit scary that's understandable.  Think about how you are going to get that form that is printed upside-down on the back onto the web. Depending on how the data sits in the datastream this could be a major problem. And what about documents that have portrait and landscape print on the same page? Not many of us have those monitors that turn on an axis. A friend of ours says to remember that a screen is not a piece of paper, and that is an important thing to keep in mind!

You haven't hit all of the formatting issues yet, either. In addition to the questions of orientation there are the questions of fonts and typefaces. We will come back to this topic in detail in later columns, but here is the groundwork. The fonts that you see representing the characters on a page have some type of encoding that causes the print device to know the name of the font and to map it to a file. That is done either on the print device, in the print file, or somewhere else that is accessible to the print environment. If everyone played by the same rules and all of the world used Arial and Times we wouldn't have the challenges that face us.  The fact is that for every type of printer and print file format there are font file formats that go along with them. The attempts to move to standard font formats like ATM and TrueType provide some consistency in the PC and Network world, but back at the IT center there is no such standardization.

Let's start with those line printer devices you find all over. There are still many of these devices sold every year since they meet all kinds of needs. If you look carefully at most line print devices, they support only a very few typefaces.  Many of them look familiar to those of us old enough to remember manual typewriters and the very earliest office word processors. Font names like Prestige and Elite or GT10 are fairly common. So common, in fact, that when IBM introduced its first high-speed laser printers they made sure to provide compatible fonts for those line printer applications to make it easy for their customers to move up to laser printing. You will also find Courier, which is also found among the standard PostScript fonts. A word of caution, though.  Fonts that have the same name are not always built with the same characteristics. The basic characteristic these line print devices shared was mono-space type; each character was the same width, which made printing columns of numbers a breeze. On the true line print devices the characters were on a print chain or ball that ensured that they were identically spaced. When we try to move these applications on to high-speed laser printers or on to the web we face the lack of true mono-space fonts. That means that columns of numbers may not line up correctly, and text targeted for specific locations in a preprinted form might end up in the wrong place entirely.  Any application that relies on the mono-space nature of the fonts to space the text becomes a potential problem.
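One simple way to keep columns lined up in a browser is to wrap the line data in an HTML `<pre>` element, which preserves spacing and is normally shown in a mono-spaced font. The sketch below shows the idea; `lines_to_pre` is a name we made up, and it only preserves column alignment. The browser's font will not share the metrics of the printer's font, so text aimed at exact positions on a preprinted form still needs separate handling.

```python
import html

# Minimal sketch: carry mono-spaced line data to the web inside <pre>.
# Assumption: the lines are already decoded text with any carriage-control
# characters stripped off.

def lines_to_pre(lines):
    """Escape each line for HTML and wrap the result in a <pre> element."""
    body = "\n".join(html.escape(line.rstrip()) for line in lines)
    return f"<pre>{body}</pre>"


report = [
    "ITEM            QTY     AMOUNT",
    "Widgets & Co     12     144.00",
    "Gadgets           3      27.50",
]
print(lines_to_pre(report))
```

Escaping matters here: an ampersand or a less-than sign in the report data would otherwise be read as markup by the browser.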

Those same applications that were migrated to high-speed printers, like the old IBM 3800 printers and their emulations or the original Xerox ESS printers like the 8700 and 9700, have challenges of their own. If the migration strategy allowed for changes to the programs to take advantage of the fonts designed for the printers you will find applications calling for mono-space or proportional fonts with names like GT20, Pi and Specials or Sonoran Sans Serif in the IBM world. In the Xerox world you'll find names like UN111E or P0612B. There are thousands of applications running in print shops throughout the world using these fonts every day. None of these fonts has an easy equivalent on the web. Just one more challenge as we move forward.

If you have applications built in the past 10 years, you may find that the fonts called for include Helvetica, Century Schoolbook, Times New Roman and other familiar looking names. Take care. Once again the font name may not be the key. For any document where the formatting is tight and relies on the font metrics (how the characters are spaced, how white space is applied between characters and lines), you may be in for a rude surprise. Sometimes fonts with the same name behave differently between printers that are sitting next to each other because they were purchased from different font houses. And, since it is possible to edit fonts in both the IBM and Xerox printing environments, it's not unusual to find that characters were deleted or added to fonts.

A key piece of information is that in the IBM world, as on your PC, the font files remain on the host computer.  In the Xerox world the fonts live directly on the printer hard drive, similar to those cartridge fonts many of us lived through in earlier times. Especially in the Xerox environments, it is not unusual to have different fonts on each printer in a shop.  Eight printers, side-by-side, and no two with the same fonts available to them.  The IBM print environments are generally more centralized, but there are often a variety of font libraries and differences in individual fonts within those libraries.

So, fonts are going to be an issue as we move forward. If there is any way to collect a list of every font available on your printing devices, and a list of those actually used, you would be in great shape.  Sadly, font management software is rare and shops with that key information are few in number. If you are one of the lucky ones, congratulations!
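If you do manage to gather those two lists, even by hand, comparing them is straightforward. Here is a small sketch that assumes you have recorded the fonts available on each printer and the fonts each application calls for. The printer and application names are invented, `SIGFNT` stands in for a hypothetical signature font, and `font_report` is our own name.

```python
# Minimal sketch: compare fonts available on printers with fonts in use.
# All printer names, application names, and the sample data are made up.

def font_report(available_by_printer, used_by_application):
    printer_sets = [set(fonts) for fonts in available_by_printer.values()]
    all_available = set().union(*printer_sets)
    on_every_printer = set.intersection(*printer_sets) if printer_sets else set()
    used = set().union(*(set(f) for f in used_by_application.values()))
    return {
        "on_every_printer": sorted(on_every_printer),
        "used_but_not_on_every_printer": sorted(used - on_every_printer),
        "used_but_missing_everywhere": sorted(used - all_available),
        "available_but_never_used": sorted(all_available - used),
    }


available = {
    "printer_a": {"GT10", "Courier", "UN111E"},
    "printer_b": {"GT10", "P0612B"},
}
used = {
    "invoices": {"GT10", "P0612B"},
    "checks": {"Courier", "SIGFNT"},
}
for heading, fonts in font_report(available, used).items():
    print(heading, fonts)
```

The fonts that are used but not present everywhere are the ones most likely to surprise you, and those are the ones to examine first when planning a move to the web.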

Oh, one more word. Fonts do not always mean typefaces.  In the world of enterprise printing there are thousands of instances of graphics being encoded as fonts. Sometimes they are signatures, sometimes they are corporate or departmental logos, and sometimes they are illustrations. You should look around for these.  Many of these were done as fonts because the printers were not yet ready to deal with large bitmap graphics. Many print programmers became proficient at designing fonts and adding the programming to print applications to use those fonts. The payoff was a more sophisticated looking document that didn't slow the printer to a dead stop the way many bitmap graphics could. 

Had enough?  We haven't even gotten to graphics. The overview is that there are many formats. At the PC level you may have BMP and TIFF files, but in enterprise printing you are more likely to see PSEG (IBM AFP Page Segments) and IOCA (IBM's Image Object Content Architecture) files or Xerox LGO or IMG files. There are IBM vector graphic formats, called GOCA, and enhancements to existing bitmap formats to add spot color and full color in both Xerox and IBM environments.  And, for the most part, those file formats do not directly translate to the web formats like JPEG, GIF, or PNG. 

This is another place where you'll want to make a list of the graphic formats in use in your target applications. You might find that many of them began on PCs and have an easy path to re-creation in a web-friendly form.  Hold on to that thought and we'll get back to it in later columns as we address migration strategies.
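The graphics list can be kept just as simply. The sketch below assumes you note each graphic as an application name paired with a format name, then separates the formats a browser can generally display from those that will need conversion. The sample entries and the name `summarize_graphics` are illustrative only.

```python
from collections import Counter

# Minimal sketch: tally graphic formats and flag those needing conversion.
# Assumption: formats are recorded by name (e.g. "PSEG"), not by file extension.

WEB_READY = {"JPEG", "GIF", "PNG"}


def summarize_graphics(inventory):
    """inventory: iterable of (application, format_name) pairs."""
    counts = Counter(fmt.upper() for _, fmt in inventory)
    return {
        "web_ready": {f: n for f, n in counts.items() if f in WEB_READY},
        "needs_conversion": {f: n for f, n in counts.items() if f not in WEB_READY},
    }


inventory = [
    ("invoices", "PSEG"),
    ("invoices", "TIFF"),
    ("user manuals", "GIF"),
    ("checks", "PSEG"),
]
print(summarize_graphics(inventory))
```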

By now you may feel like Santa.  Making lists, checking them twice. But the lists will help as you start to assess what applications make good targets for moving to the web and begin planning for the migration.  Next time we'll look at the data itself to help you figure out what will move smoothly and what will present challenges.

About the Authors
P.C. (Pat) McGrew, EDPP and W.D. (Bill) McDaniel, EDPP spent 10 years building a software development company specializing in legacy data delivery, writing books and articles and speaking on issues as diverse as document design and emerging technologies. In 1998 they sold their company and set out to become evangelists for information delivery technology and integration of emerging technologies in the enterprise. Their website is www.mcgrewmcdaniel.com.
