

Data Digitization

Our in-house application increases the output that each operator can produce and deliver. All of these projects require OCR from TIFF images to 99.995% text accuracy, plus tagging to SGML, HTML or XML standards.
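
To put the 99.995% figure in concrete terms, the short sketch below works out how many wrong characters a text of a given size could contain and still meet the target. It assumes accuracy is counted per character, which this page does not state; the function name is illustrative only.

```python
from decimal import Decimal


def allowed_errors(total_chars: int, accuracy_pct: str = "99.995") -> int:
    """Return the most character errors a text may contain at this accuracy.

    Assumption: accuracy is measured per character. Decimal is used so the
    small error rate is not distorted by binary floating-point rounding.
    """
    error_rate = (Decimal(100) - Decimal(accuracy_pct)) / Decimal(100)
    return int(Decimal(total_chars) * error_rate)


print(allowed_errors(100_000))  # 5
print(allowed_errors(2_000))    # 0
```

Under that assumption, a typical page of about 2,000 characters has to be error-free, and only one error in every 20,000 characters is tolerated.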

Conversion to Tagged Text Output

Standard Generalized Markup Language: SGML
A markup language used to define the structure of documents in electronic form and to manage them. SGML is widely used to manage highly interrelated documents, large documents, and high-value documents that are subject to frequent revision.


Extensible Markup Language: XML
XML lets users define new tags as needed. Its structure is hierarchical, so data can be modeled to any level of complexity, and documents can be easily validated for structural correctness. It is also media independent: the same content can be published in multiple media. We specialize in Dublin Core methodologies for XML.

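As an illustration of user-defined tags combined with Dublin Core, the sketch below builds a small XML record using the standard Dublin Core element namespace and then checks that the result parses. The wrapper element name `record` and the sample values are assumptions made for the example, and the check covers well-formedness only, not validation against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields: dict[str, str]) -> str:
    """Serialize Dublin Core fields inside a hypothetical <record> wrapper."""
    record = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(record, encoding="unicode")


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (structure only, no schema)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Sample values are invented for illustration.
xml_text = build_record(
    {"title": "Annual Report", "creator": "Example Press", "format": "text/xml"}
)
print(xml_text)
print(is_well_formed(xml_text))
```
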
Hypertext Markup Language: HTML
HTML is a good choice for documents with little or no graphics and where precise document layout is not a strong requirement. Unlike plain text, HTML allows web pages and documents to be linked to one another, giving the user access to a wider spectrum of information on the net.

Content capture services include conversion from Quark™, PageMaker™ and MS Word™ to PDF and XML; complete tagging for XML, SGML and HTML; and up to 99.995% text accuracy from scanned documents. Most database formats are supported, with complete validation, data processing and fully customizable services.


OCR/ICR Services
We have expanded our traditional data capture methods with scanning, optical character recognition (OCR), intelligent character recognition (ICR) and optical mark recognition (OMR) technologies. These automated data capture methods are ideal for high-speed processing of large volumes of identical forms, such as insurance, medical, banking and bill remittance forms. We have also extended these automated methods to transaction processing, such as credit card payment processing and cheque payment processing.
OCR is the process of using specialized software to turn images into editable text. Together, scanning and OCR let us turn paper documents into reusable formats such as MS Word files, PDF files or HTML pages. OCR can also be used to turn image-based PDF files back into editable text. Drawing on our experience with these automated methods of document processing, we provide OCR perfection services in various forms, such as cheques with OCR, full text with OCR, indexing using OCR, barcode recognition, and handwritten text with OCR.

ICR
Handwriting is a natural form of communication that people have used for centuries; and while we live in an increasingly keyboard-centric world, there are vast areas where keyboard and keypad entry simply never work as effectively as natural handwriting. ICR (Intelligent Character Recognition) enables handwritten text to be extracted and used for automatic indexing of images.

Indexing
We provide various types of index databases to suit each client's requirements. To do this, we OCR the scanned images and sort the extracted data to build the database as required.
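
The sketch below shows one way such an index could be assembled once the OCR text is available. Everything specific in it is an assumption made for illustration: the field labels (`Invoice No`, `Customer`), the file names, the sample text, and the choice of SQLite as the database. It does not describe our production tooling.

```python
import re
import sqlite3

# Hypothetical field labels expected at the start of a line in the OCR text.
FIELD_PATTERN = re.compile(r"^(Invoice No|Customer):\s*(.+)$", re.MULTILINE)


def extract_fields(ocr_text: str) -> dict[str, str]:
    """Pull the labelled fields out of one page of OCR text."""
    return {key: value.strip() for key, value in FIELD_PATTERN.findall(ocr_text)}


def build_index(ocr_pages: dict[str, str]) -> sqlite3.Connection:
    """Sort the extracted fields and load them into an in-memory index table."""
    rows = []
    for image_file, text in ocr_pages.items():
        fields = extract_fields(text)
        rows.append((fields.get("Invoice No", ""), fields.get("Customer", ""), image_file))
    rows.sort()  # order by invoice number, then customer, then file name

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_index (invoice_no TEXT, customer TEXT, image_file TEXT)"
    )
    conn.executemany("INSERT INTO doc_index VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


# Invented sample: OCR text keyed by the scanned image it came from.
SAMPLE_PAGES = {
    "scan_002.tif": "Invoice No: 1043\nCustomer: Beta Traders\n",
    "scan_001.tif": "Invoice No: 1042\nCustomer: Acme Ltd\n",
}

conn = build_index(SAMPLE_PAGES)
row = conn.execute(
    "SELECT image_file FROM doc_index WHERE invoice_no = ?", ("1042",)
).fetchone()
print(row[0])  # scan_001.tif
```

Each row links an indexed value back to its source image, so a lookup by invoice number or customer returns the scan to retrieve.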

OCR Proofing
This service involves proofreading to produce superior-quality OCR files. Files with normal text undergo online proofing, whereas text containing special characters, chemical formulae, superscripts or subscripts, which are more difficult to OCR through software, requires manual proofreading. This is done by comparing the hard copy with the soft copy. All errors that the proofreaders mark on the soft copy are then updated online.
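
The comparison step can be pictured with a small sketch. It assumes a trusted reference transcription is available as text, which is a simplification of the hard copy comparison described above, and it uses Python's standard `difflib` module to list the places where the OCR output departs from that reference. The function name and sample strings are illustrative only.

```python
from difflib import SequenceMatcher


def find_differences(reference: str, ocr_text: str) -> list[tuple[str, int, str, str]]:
    """List (operation, position in reference, reference text, OCR text)
    for every span where the OCR output differs from the reference."""
    matcher = SequenceMatcher(None, reference, ocr_text, autojunk=False)
    differences = []
    for op, ref_start, ref_end, ocr_start, ocr_end in matcher.get_opcodes():
        if op != "equal":
            differences.append(
                (op, ref_start, reference[ref_start:ref_end], ocr_text[ocr_start:ocr_end])
            )
    return differences


# A chemical formula where the letter O was misread as the digit 0.
for difference in find_differences("H2O and CO2", "H20 and C02"):
    print(difference)
# ('replace', 2, 'O', '0')
# ('replace', 9, 'O', '0')
```

Confusions such as O and 0 are typical of the special-character text that is routed to manual proofreading.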

E-Book Publishing
We convert large volumes of textual data into Adobe PDF format, which organisations require for various purposes, including preserving text matter in digital form. This service is particularly suited to publishers and libraries.


     Copyright © 2004 deusinfotech.com. All Rights Reserved.