
To install the component, click the Add or Remove Features button in the Office 2007 setup and make sure that the component is installed.

To use the Office 2007 OCR API, you have to add a reference to the Microsoft Office Document Imaging 12.0 Type Library. In Solution Explorer, select Add Reference. Then, on the COM tab, select Microsoft Office Document Imaging 12.0 Type Library.

The sample loops over the files checked in the list box and takes an image from the MODI document:

foreach ( string Name in checkedListBox1.CheckedItems)
MODI.Image image = (MODI.Image)md.Images

The method OCRImplementation converts image files (*.tif, *.jpg, *.gif, *.bmp; in this case we're using a TIFF file). The Create method of the md object receives the path of the file to be converted.
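As a rough illustration of the conversion step described above, here is a minimal sketch of what a method like OCRImplementation might look like. It assumes the project already references the Microsoft Office Document Imaging 12.0 Type Library. The method signature, the choice of English as the OCR language, and the use of the first image (index 0) are assumptions made for this example, not details taken from the sample.

```csharp
using System;

static class OcrSketch
{
    // Hypothetical signature: the article names OCRImplementation
    // but does not show its parameters or return type.
    public static string OCRImplementation(string fileName)
    {
        MODI.Document md = new MODI.Document();
        try
        {
            // Create receives the path of the image file to be converted.
            md.Create(fileName);

            // Run recognition; the language and the two image-correction
            // flags are assumptions chosen for this sketch.
            md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Assumption: the scanned file has at least one page,
            // so the first image is read.
            MODI.Image image = (MODI.Image)md.Images[0];
            return image.Layout.Text;
        }
        finally
        {
            // Release the file without saving changes.
            md.Close(false);
        }
    }
}
```

Calling OcrSketch.OCRImplementation with the path of a TIFF file would return the recognized text as a string. The code only runs on a machine where the MODI component is installed, since the API is exposed through COM.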
To make developers' work easier and to avoid integration with third-party applications, Microsoft released with Office 2007 an OCR (Optical Character Recognition) API called MODI (Microsoft Office Document Imaging). It's important to remember that the API used in this sample is exclusive to Office 2007 (Office 2003 has its own OCR API). In this article, we'll create a Windows application that uses the Office 2007 OCR API to generate OpenXML documents. In addition, we'll use the Speech Recognition API to improve the application's user experience.

Before we start, you need to have the Microsoft Office Document Imaging 12.0 Type Library installed. The Office 2007 setup doesn't install this component by default, so it has to be installed afterwards.

Speech recognition is a feature included with the .NET Framework 3.0, and it's a default feature of Windows Vista. Developers can use this API to provide a better user experience, easier access to specific information and so on.

OpenXML is the default format of Microsoft Office 2007 documents (*.pptx, *.docx, *.xlsx). It's 75 percent smaller than comparable binary documents and is based on two major technologies: ZIP and XML.
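Because the format is a ZIP package of XML parts, a small Word document can be written with the System.IO.Packaging API that ships with the .NET Framework 3.0. The following is a minimal sketch, not the article's implementation: the class name DocxSketch and the method WriteDocx are hypothetical, and the whole text is placed in a single paragraph.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework 3.0 and later
using System.Xml;

static class DocxSketch
{
    const string WordNs =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    const string MainContentType =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
    const string OfficeDocumentRel =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Hypothetical helper: writes `text` as one paragraph of a new .docx file.
    public static void WriteDocx(string path, string text)
    {
        // The package is the ZIP container.
        using (Package package = Package.Open(path, FileMode.Create))
        {
            Uri partUri = PackUriHelper.CreatePartUri(
                new Uri("word/document.xml", UriKind.Relative));
            PackagePart part = package.CreatePart(partUri, MainContentType);

            // The main document part is plain XML.
            using (Stream stream = part.GetStream(FileMode.Create, FileAccess.Write))
            using (XmlWriter xml = XmlWriter.Create(stream))
            {
                xml.WriteStartDocument();
                xml.WriteStartElement("w", "document", WordNs);
                xml.WriteStartElement("w", "body", WordNs);
                xml.WriteStartElement("w", "p", WordNs);
                xml.WriteStartElement("w", "r", WordNs);
                // WriteElementString escapes characters such as < and &.
                xml.WriteElementString("w", "t", WordNs, text);
                // Closes every element that is still open.
                xml.WriteEndDocument();
            }

            // The package-level relationship marks this part as the main document.
            package.CreateRelationship(partUri, TargetMode.Internal, OfficeDocumentRel);
        }
    }
}
```

A call such as DocxSketch.WriteDocx("result.docx", recognizedText) would produce a file that can be inspected with any ZIP tool. Line breaks in the text are not turned into separate paragraphs in this sketch, and control characters that are invalid in XML would need to be filtered out first.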
Sometimes, while developing an application, we face situations where we have a scanned document (an image) and we want to convert it to text (a Word 2007 document). Some scanners provide applications that perform this kind of conversion automatically, but most of the time the generated document is a *.pdf, an *.odt and so on. If you want to convert directly to *.docx (OpenXML) documents, you'll have to use third-party applications or develop the conversion from scratch.

OpenXML became an ISO standard (ISO/IEC 29500), and its adoption is growing day by day, driven by its performance, scalability and security.
