Great questions for analyzing what you need from an OCR perspective:
OCR Software Information
Sunday, December 27, 2009
Thursday, November 26, 2009
I have held back comment on this one for far too long. What is up with the MFP manufacturers and what I call "Panel Insanity"? When it comes to scanning and capture, they want everything tied to the panel!!
So, I can understand, you make hardware. You want it to be as "all-in-one" as it can be, and give organizations a super powerful copy-scan-fax-print machine that can do it all.
But let's get serious for a minute, and talk about the drawbacks to panel processing for scanned documents:
- Although the panels are getting larger and larger with every generation, do we really want users standing at the copier, pecking away, entering their metadata? Is that the most efficient use of their time? Efficiency is so important, and getting the most out of employee time in this economic environment is absolutely required.
- Contention. When I want a copy, I want it now darnit. You scanners need to get out of my way, and yield to the almighty paper. Line at the copier? This is a bad thing.
- Scan versus Capture? I wrote an article on the differences between the two application types on my SharePoint specific blog: Scan versus Capture? Let's move away from scanning and use some automation to collect data, rather than manually enter it at the panel.
- Even though there are panel apps, there is almost always another component behind the scenes doing OCR, reading barcodes, etc. So now we have multiple failure points within the overall capture system. Keep it simple: the fewer application components, the better.
Just some thoughts on this topic.
Wednesday, November 18, 2009
Tuesday, April 14, 2009
So, with SharePoint there is always a challenge in how to provide a scanning and capture onramp to the masses. How do we enable our MFPs to scan to our SharePoint Server? How can I scan to WSS with my copier? In Document Imaging, there are many simple ways to set up a standardized, efficient way to capture documents. One of the challenges with copiers is that they only provide basic scanning capabilities. The key here is to empower your end users with some type of automated way to get information into the end repository. Here is one of the best ways I have seen, and there is a video on how to use routing cover sheets with SharePoint:
Simple Scanning and Capture to Microsoft SharePoint
Posted by Steve at 6:32 AM
Sunday, February 15, 2009
In any Document Management or Enterprise Content Management System, there are four basic components: Hardware, Capture, Archive and Search and Retrieve. So what is the most important piece? Everyone nowadays seems to have the hardware. All of the copiers today have scanning capability, with the newer ones scanning at 70 pages per minute. The simplest archive is a series of folders on your server or workstation. And with files on the network, Windows Search, or the search capabilities within Adobe, allow you to find what you are looking for quickly (sometimes).
More advanced organizations may have a Document Management System, or utilize Microsoft SharePoint for their archive and search and retrieve functions. But what seems to be lacking in most organizations is a structured, automated way of capturing files. The argument of this blog entry is that Capture is the most important piece of any ECM or DM System.
As mentioned before, when we look at just about any office or organization today, they are scanning with a copier or desktop scanner. But inevitably, they take their paper mess and recreate it digitally. Why? No standardization in the process. Joe scans his files to his email and stores them in Outlook folders. Betty scans to My Documents on her laptop, and uses a convoluted naming scheme that only she can decipher. They take their paper problem, and create a huge problem for IT. Disparate archives now pose a disaster recovery problem, along with the issues of accessibility.
So what is the answer? Advanced Capture. Advanced Capture applications provide the ability to set structure, and harness the capabilities of all the scanning hardware within the organization. They can provide standardization and structure, along with fantastic efficiency improvements. Take for example, PSIGEN's PSI:Capture. With its Microsoft SharePoint Migration feature, and auto-import capability, you can set all your scanning copiers to scan to a processing folder. Utilizing the barcode routing capability, you can create cover sheets for each library within your SharePoint site. When you scan, the software will pick up, process, rename and folder files automatically. The end result is a standardized folder structure, standardized naming scheme, and a searchable PDF all within your SharePoint site.
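To make the cover sheet routing idea concrete, here is a minimal Python sketch of the decision a capture application makes once it has read the barcode. This is not PSI:Capture's API or configuration. The barcode values, library names, folder layout and the `route_scan` function are all assumptions made up for illustration.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical mapping from cover-sheet barcode values to library names.
LIBRARY_BY_BARCODE = {
    "LIB-INVOICES": "Invoices",
    "LIB-CONTRACTS": "Contracts",
}


def route_scan(barcode_value: str, scanned_on: date, sequence: int) -> PurePosixPath:
    """Build a standardized library/folder/filename path for one scanned document."""
    # Unknown barcodes fall back to a catch-all library (an assumed policy).
    library = LIBRARY_BY_BARCODE.get(barcode_value, "Unsorted")
    # One folder per month keeps the structure predictable.
    folder = f"{scanned_on:%Y-%m}"
    # Same naming scheme for every user and every scanner.
    filename = f"{library}_{scanned_on:%Y%m%d}_{sequence:04d}.pdf"
    return PurePosixPath(library) / folder / filename
```

Because the name and folder come from the cover sheet and the scan date rather than from whoever is standing at the copier, Joe and Betty end up with the same structure.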
The other major contributor to efficiency within Capture applications is the ability to use separation technology. I see it all the time...the office that has 20 documents to scan. They walk up to the copier, and scan them one by one, a very time-consuming process. With document separators, you can scan the entire stack and let the software split the documents, then rename and folder them. Let the technology do all the hard work!
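The splitting step itself is simple once each page has been classified. The Python sketch below assumes the separator test (barcode, patch code, blank page, whatever you use) is supplied as a function. The names `split_on_separators` and `is_separator` are hypothetical, and pages are stood in for by plain strings.

```python
from typing import Callable, List, Sequence


def split_on_separators(
    pages: Sequence[str], is_separator: Callable[[str], bool]
) -> List[List[str]]:
    """Split one scanned stack into documents, dropping the separator sheets."""
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if is_separator(page):
            # A separator closes the document in progress, if there is one.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    # The last document has no separator after it.
    if current:
        documents.append(current)
    return documents
```

For example, a stack of `["SEP", "a1", "a2", "SEP", "b1"]` with `is_separator=lambda p: p == "SEP"` comes back as two documents, `["a1", "a2"]` and `["b1"]`.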
Thursday, February 5, 2009
Every now and then, a new version of software comes along that just amazes you. In selling Document Management and Capture applications, I am constantly looking for features to help my customers make their processes more efficient. PSIGEN's PSI:Capture 3.5 is packed with some phenomenal features to help make the scanning and capture process more streamlined, and reduce the time it takes to index scanned documents.
The new version includes OCR Assisted Indexing, or what I will call OAI. OAI allows the user to "point and click" at pertinent information on the scanned document, and automatically populates the index field. For example, a scanned invoice can be quickly indexed by just pointing at the invoice number, invoice date and vendor...no typing necessary. The software takes this function a step further, and can auto-highlight specific information on a document based on a pattern of characters.
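Highlighting by character pattern can be pictured as running regular expressions over the OCR text of a page. The Python sketch below is only an illustration of that idea, not how the product implements it. The field names and both patterns (an `INV-` prefix with five digits, and an m/d/yyyy date) are assumed formats.

```python
import re

# Hypothetical patterns; a real setup would match your own document formats.
PATTERNS = {
    "invoice_number": re.compile(r"\bINV-\d{5}\b"),
    "invoice_date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def suggest_index_values(ocr_text: str) -> dict:
    """Return the first match for each index field found in the OCR text."""
    suggestions = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            suggestions[field] = match.group(0)
    return suggestions
```

The user still confirms each value, but the typing goes away.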
Feature detection is a powerful tool that allows the application to intelligently identify features on a page, like, say, a logo. This detection gives you the ability to separate documents based on certain features, so you can "tag" the first page of every document by the logo in the right corner. No more inserting separator sheets!!! Along the same lines, you can now separate based on zone OCR values, or particular words on the front pages of your documents.
The Zone OCR function in version 3.5 has been greatly enhanced. The product now includes the ability to anchor zones based on a barcode, a patch code, or the left corner of the document. This ensures repeatable zoning across scanned documents. You can now apply image processing to individual zones to remove lines and clean up the image before OCR. The image processing enhancements include line removal, autoinvert, and the ability to run them within the Quality Assurance module.
The product now includes Optical Mark Recognition. To give you an example, in playing with the software, I created a SharePoint Routing Sheet for an accounting department. On the routing sheet I have 4 boxes, one each for invoice, PO, contract and correspondence. When I scan the document, the application will read which box I have checked, and then autopopulate an index field, rename the file, and route it to the appropriate SharePoint Document Library and folder!!! Wow!!
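Why anchoring makes zoning repeatable: the zone is stored as an offset from the anchor instead of as a fixed page position, so if the page shifts in the feeder, the zone shifts with it. A minimal Python sketch follows, with a hypothetical `Zone` type and `place_zone` helper, and pixel coordinates assumed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Zone:
    left: int
    top: int
    width: int
    height: int


def place_zone(offset_from_anchor: Zone, anchor_xy: Tuple[int, int]) -> Zone:
    """Translate a zone defined relative to an anchor into page coordinates."""
    anchor_x, anchor_y = anchor_xy
    return Zone(
        left=offset_from_anchor.left + anchor_x,
        top=offset_from_anchor.top + anchor_y,
        # Only the position moves; the zone size stays the same.
        width=offset_from_anchor.width,
        height=offset_from_anchor.height,
    )
```

If the barcode is found 10 pixels further right on one scan than on another, the OCR zone lands 10 pixels further right as well. This sketch handles shifts only, not rotation or scaling.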
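At its core, mark recognition comes down to asking how dark each checkbox region is. The Python sketch below is an assumption-laden illustration, not the product's method. Each box is a small grid of 0/1 pixels (1 meaning dark), and the 0.3 threshold and the function names are made up.

```python
from typing import Dict, List, Optional


def fill_ratio(pixels: List[List[int]]) -> float:
    """Fraction of dark pixels (1s) in a checkbox region."""
    total = sum(len(row) for row in pixels)
    dark = sum(sum(row) for row in pixels)
    return dark / total if total else 0.0


def checked_box(
    boxes: Dict[str, List[List[int]]], threshold: float = 0.3
) -> Optional[str]:
    """Return the label of the most-filled box above the threshold, else None."""
    best_label, best_ratio = None, threshold
    for label, pixels in boxes.items():
        ratio = fill_ratio(pixels)
        if ratio > best_ratio:
            best_label, best_ratio = label, ratio
    return best_label
```

The returned label is what would populate the index field and pick the destination library. Returning `None` when nothing clears the threshold gives you a hook to send the sheet to manual review.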
This is just an overview of the high points, but the new version also includes the following: Separation Profiles, Device Profile import and export, PDF/A OCR output, new barcode types, and greatly improved database write/read performance.
I will be covering each of the features in a separate post with screenshots in the coming weeks. You can grab some more info from www.psigen.com or on www.scanguru.com.
Posted by Steve at 7:24 AM