DECimage Character Recognition Services for VMS Programmer's Reference Manual

  CONTENTS

  Title Page

  Copyright Page

  Preface

  1      Introduction to DCRS

  1.1     What Is DCRS?

  1.2     Hardware and Software Requirements for DCRS

  1.3     OCR Concepts and Process
    1.3.1      Formats of Visual Data
    1.3.2      OCR Process

  1.4     How DCRS Works
    1.4.1      DCRS Objects
      1.4.1.1      Image Frames
      1.4.1.2      Regions of Interest (ROIs)
    1.4.2      DCRS Process
    1.4.3      DCRS Services
      1.4.3.1      Page Segmentation Services
      1.4.3.2      Text Recognition Services
      1.4.3.3      Structure Access Services
      1.4.3.4      Text Export Services
      1.4.3.5      Postprocessing Services

  2      DCRS Process

  2.1     Segmenting Regions
    2.1.1      Listing Regions
    2.1.2      Copying and Deleting Regions

  2.2     Recognizing Text
    2.2.1      Feature Extraction
    2.2.2      Listing Words
    2.2.3      Types of Recognition Errors

  2.3     Exporting Text

  2.4     Deleting Structures

  3      Guidelines for Optimizing Recognition

  3.1     Checking the Quality of the Document and the Scanning Process
    3.1.1      Checking the Quality of the Document

  3.2     Checking the Quality of the Scanning Process

  3.3     Specifying a Language

  3.4     Examples of Text Processed by DCRS

  4      DCRS Routines

  IrsDeleteBuffer

  IrsDeleteRegion

  IrsDeleteStruct

  IrsExportASCII

  IrsExportDDIF

  IrsExportPS

  IrsGetRegionList

  IrsGetWordList

  IrsRecognizeText

  IrsSegmentRegion

  A   Condition Values and Error Messages

  B   Example Program in VAX C (RECOGNIZE_TEXT_C.C)

  Glossary

  FIGURES

  1-1        DCRS Relationship to DECimage Application Services for VMS

  1-2        Example of a Scanned Business Document

  1-3        DCRS Process

  1-4        DCRS Services

  2-1        Sequence of Routines in the DCRS Process

  2-2        Regions of a Segmentation Structure

  2-3        Spacing of Text

  2-4        Fonts Returned by DCRS

  2-5        Similarly-Shaped Characters

  2-6        Columnized Text

  2-7        Exported ASCII Text

  3-1        Scanned Image with Errors

  3-2        Highly-Stylized Font

  3-3        Measuring Font Size

  3-4        Enlarged Text From Low-Quality Document Before Recognition

  3-5        Text From Low-Quality Document After Recognition

  3-6        Enlarged Text From a High-Quality Document Before Recognition

  3-7        Text From High-Quality Document After Recognition

  TABLES

  1-1        Page Segmentation Services Routine

  1-2        Text Recognition Services Routine

  1-3        Structure Access Services Routines

  1-4        Text Export Services Routines

  1-5        Postprocessing Services Routines

  2-1        Returned Fonts

  3-1        Point Size and Scan Resolution Combinations

  4-1        Headings in the Routine Template

  4-2        DECimage Character Recognition Services Routines

  4-3        Export PostScript Flag

  4-4        Fields in the Region List Structure

  4-5        Font Style Values

  4-6        Fields in the Word List Structure

  4-7        Word Type Values

  4-8        Font Info Values

  4-9        Character Set Values

  4-10       Resolution Values

  4-11       Segment Region Flag