Friday, August 30, 2013

Featured Webinar: Your Profit is in Danger

Your Profit is in Danger
Join us for a Webinar on September 10
Space is limited.
Reserve your Webinar seat now at:
https://www1.gotomeeting.com/register/778535032
This joint webinar with PSIGEN and OPEX will focus on how to improve document scanning efficiency through a combination of PSIGEN PSI:Capture Enterprise and OPEX hardware.  See how you can reduce prep time and save on labor, improving your margins and driving higher profits.

Title:
Your Profit is in Danger
Date:
Tuesday, September 10, 2013
Time:
10:00 AM - 11:00 AM PDT

After registering you will receive a confirmation email containing information about joining the Webinar.

System Requirements
PC-based attendees
Required: Windows® 8, 7, Vista, XP or 2003 Server
Mac®-based attendees
Required: Mac OS® X 10.6 or newer
Mobile attendees
Required: iPhone®, iPad®, Android™ phone or Android tablet

Wednesday, July 11, 2012

Mobile Capture? Really?

The Document Management industry is all about mobile capture right now. Really? Taking pictures of documents, page by page, with a tablet/smart phone camera. Some of the biggies in the industry are spending huge amounts of money promoting the cause, and building complex infrastructures and image processing to handle these types of images. There are a number of new startups, like StratusFlow, that are focusing on solving the key problem through the cloud.  Want to see a simple solution? Video below uses Microsoft SkyDrive, an iPad and PSI:Capture on the backend to read barcode photos and process the data.

 

Thursday, May 31, 2012

How do you want to find your documents?


Document Capture Drives Search
One of the first stages in planning for any scanned image repository is to ask the question: How do you want to find your documents?  Theories vary on best practices, but here are a few tips when designing a document capture implementation for any ECM system:
  1. Limit your number of fields to 5 or fewer. So many times I see document scanning customers use way too many fields during capture.  The more fields you have, the more time it takes end users to index their documents, and the more chances fields will get skipped.  Take the time to interview the end users and truly find out how they need to search for their documents.
  2. Always use a date.  Dates are the ultimate filter that can be a life saver when searching for that needle in a haystack in a scanned document repository.  Invoice date, purchase order date, contract date, etc. give you the power to narrow down your search results to a specified period and can be a huge help in audit based searches or searches for legal support.
  3. Use automation to reduce indexing time.  Document capture applications provide automation and efficiency, and can reduce end user keying requirements on documents.  Strong, accurate OCR technology, and Advanced Data Extraction (ADE) are absolutely required.
  4. Ensure your technology has a QA step.  If you are going to go to all the trouble of scanning, capturing and migrating documents to a repository, make sure you can check your work.  Misfiling a document can be a painful experience.
  5. Full text search is the insurance policy.  Always, I repeat always, convert your scanned documents to a searchable format, PDF Image with Hidden text.  This will allow for granular searches beyond your index fields/columns, and can help you in the "find a needle in the haystack" tasks.  But do not, I say, do NOT rely on full text search as your primary search method.  Full text does not let you sort by specific document focused dates, cannot let you do range based searches on specific criteria, and restricts sorting and viewing in most repositories.
Just a few tips when designing your document scanning index fields.
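
To make tips 1, 2 and 5 concrete, here is a minimal sketch of an index record with five fields and a date-range search. The field names, sample values and the find_by_date_range helper are my own assumptions for illustration, not part of any particular capture product or ECM system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexedDocument:
    # Hypothetical index fields, deliberately kept to five.
    doc_type: str
    vendor: str
    doc_number: str
    doc_date: date
    path: str


def find_by_date_range(docs, start, end, doc_type=None):
    """Return documents dated within [start, end], oldest first.

    This is the kind of range-and-sort query that index fields allow
    and that plain full text search typically does not.
    """
    hits = [
        d for d in docs
        if start <= d.doc_date <= end
        and (doc_type is None or d.doc_type == doc_type)
    ]
    return sorted(hits, key=lambda d: d.doc_date)


# Made-up sample data.
docs = [
    IndexedDocument("invoice", "Acme", "INV-1002", date(2012, 3, 2), "scans/inv-1002.pdf"),
    IndexedDocument("invoice", "Acme", "INV-1001", date(2012, 1, 15), "scans/inv-1001.pdf"),
    IndexedDocument("contract", "Globex", "C-77", date(2012, 2, 20), "scans/c-77.pdf"),
]

q1_invoices = find_by_date_range(docs, date(2012, 1, 1), date(2012, 3, 31), doc_type="invoice")
for d in q1_invoices:
    print(d.doc_date, d.doc_number)
```

Full text search would still sit alongside a query like this as the fallback, not replace it.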

Monday, April 30, 2012

Oooops. Did someone back up the paper?

If you look at the headlines over the past few years, you cannot help but notice the number of natural disasters that have occurred.  In my conferences with IT and Departmental Management, I always pose the question when discussing business continuity or disaster planning: Do you have a plan for your paper?   Just about every company has implemented some type of plan for backing up their important digital files.  Some go to the extreme with data snapshots that can be recovered from multiple locations.  But companies typically don't take the same strategy with their paper assets.  The good ole file cabinet, the protector of all things paper, will provide protection, right? Companies need to take a good hard look at their paper, and assess the business impact should disaster destroy their file room.  Backing up your paper nowadays is neither hard nor expensive when compared to the legal implications and time it would take to reproduce (if possible) contracts, customer files, sales records and the like. Any paper backup plan involves a concept I call Bridging the Gap (BTG).  BTG involves hardware and capture software to digitize and build the bridge to the digital world, and then a repository on the "other side" to house the records and make search and retrieval simple.  The repository can be as simple as a set of named network folders, or as complex as a true ECM system like MS SharePoint.  Take the initiative and back up your paper today.

Monday, October 17, 2011

Document Scanning and Capture Planning - Part 4 - Document Scanning Models


Document Scanning Models

After doing some planning on the hardware types and document scanning volumes, the next step would be to examine what type of model you need to deploy.  There are typically 3 standard models for document scanning and capture: Centralized, Decentralized and Distributed.
Each model has its own pros/cons, and below I will examine each, and dive into some detail.
Centralized
Ah, the centralized model.  Some call this old school scanning and capture, as for many years, this was the only way to get the job done, and convert your paper to digital form.  This model provides a centralized scanning center to provide mass conversion for the organization.  The operation can be run by in house personnel, be managed by a services provider in house, or be outsourced to a scanning service bureau.  It requires high volume/high speed hardware, and typically utilizes advanced capture software to allow for the utmost in automation and efficiency.  The software and hardware operators are typically highly trained, and there are usually only a few of them.  Paper and/or digital media is shipped to the centralized location and processed through a set, standardized capture workflow.
Centralized Pros
  • Easily standardized process due to a limited number of skilled/trained scan operators
  • High speed hardware/software results in minimal processing time once paper is received
  • Centralized reporting and control of overall process
  • No loading on WAN infrastructure
  • Centralized backup and restore
Centralized Cons
  • Usually a high time delay for availability of documents
  • High cost due to shipping of documents
  • High maintenance costs
  • High training costs to bring on new operators
  • Disaster recovery planning issues if centralized site is down
  • Operators are typically not knowledgeable in the documents they are indexing
Decentralized
Over time, as bandwidth and scanning hardware/software prices went down, the obvious move was to decentralize the whole scanning and capture process.  This move placed scanning in the branches, and allowed the whole document capture process to be performed by those who had working knowledge of the documents.  Smaller, desktop class hardware could be used, and most capture companies made batch scanning and upload to the centralized repository simple to accomplish.
Decentralized Pros
  • Scan operators are well versed in the documents they scan
  • Documents are available almost immediately
  • No shipping or transfer costs for documents
  • Branch control of the whole scanning process
Decentralized Cons
  • Standardization can be an issue
  • No centralized control or reporting
  • WAN Bandwidth consumption can be high
  • Licensing costs can be high depending on software utilized
Distributed
The advance of network-based scanning devices and the lowering of bandwidth pricing led to the newest model, the Distributed Model.  Distributed Scanning allows for just about anyone in the organization to walk up to a network scanning device/scanning copier/fax machine and send documents to a repository.  The devices are typically multi-faceted, and along with repository integration, can provide scan to network folder, FTP and email.  Collaborative back-end systems, like Microsoft SharePoint, lend themselves nicely to this model, as they allow anyone to participate in a Document Workspace.
Distributed Pros
  • Put scanning in the hands of everyone in the organization
  • Provides a great launching pad for collaborative solutions
  • Simple, easy to use interfaces allow for minimal training and quick adoption
  • Capture and indexing is now in the hands of the true document owner
  • One-to-many solution provides a single device to service many users
Distributed Cons
  • Lack of standardization without software addition
  • Security and document control can be major issues
  • Bandwidth from smaller branches can be a problem with larger scans
  • Lack of hardware integrations with back-end systems
So, most organizations today are combining the above models to create a Hybrid Scanning and Capture solution, and leveraging all the strengths together to minimize the weaknesses of any one model.   Another strategy is to tie scanning models to specific business processes, as most lend themselves nicely to specific scanning and capture workflows.

Hardware and Choosing Your Scanning Model


Most organizations will choose their model to leverage their existing hardware investment, but this can lead to decisions that seem good at the time and do not hold up under deeper examination.  In those cases it can make sense to realign hardware with the best model.  Take, for example, a company that instantly leans toward a distributed model, and attempts to leverage their copier fleet that is currently under lease.  If you examine the part of this guide that covers scanning hardware, copiers will not always fit the type of scanning you need to perform.  Consider a branch accounting department that is looking to scan receipts or check stubs.  Will the copier perform well with mixed original sizes?  Just a word of caution to examine the paper, workflow, and document types to get the best feel and adopt the best model.

Tuesday, August 2, 2011

Document Scanning and Capture Planning - Part 3 - Scanning Hardware

Now that I have covered Sizing and Storage in Part 1, and Document Separation in Part 2, we can start to take a look at scanning hardware.  There are several key questions you need to answer:  Can I use pre-existing hardware such as copiers or fax machines?  Do I need a dedicated scanner?  If I choose to buy a scanner, what features/characteristics are important?

Some may argue you need to decide on a scanning model before you dive into hardware (distributed, centralized, or decentralized), but I will cover this in the next section.

So let’s start with a key question:

Scanning Copier or Dedicated Scanner??

Scanning Multifunction Peripherals (MFPs/copiers) have become standard in most offices. I receive the same question all the time from prospects and customers: Can’t I just use my copier for scanning? In many cases, for a typical office, with typical documents, a copier can be an appropriate component to any scanning solution. As offices become more complex in the way they handle their documents, or they expand their scanning efforts to other departments, dedicated scanners are usually required to achieve the desired result.

Below are some interesting statistics provided by InfoTrends:

· 65% of office workers use digital copiers/MFPs
· Over 50% use the “scan” feature daily
· 71% expect scanning requirements to increase from year to year
· 72% believe it is necessary to view images before processing
· 36% will require dedicated scanners versus MFP devices
· 36% believe they will need both scanners and MFPs

So what are the benefits/drawbacks to scanning with both types of devices? Below is a summary:

Benefits of MFPs as scanners:

  • Leverage your existing investment in the MFP
  • Most copier maintenance plans do not charge for scans, so you get “free” maintenance for the scanning function (no print/copy, no click charge)
  • MFP manufacturers are really focusing on scanning capabilities: fast speeds, better quality and enhanced drivers, etc.
  • Network scanning functions:
      • Scan to email
      • Scan to Windows Folders
      • Scan to FTP
  • One-to-Many relationship: all workers can use one device.

Drawbacks of MFPs:

  • Contention – copying, scanning and printing may cause “a line at the copier”
  • Poor performance with differing paper sizes
  • Lack of color dropout (Scanning blue or black backgrounds will result in a black page)
  • Lack of image correction capabilities (auto deskew, despeckle, black border removal, streak removal, etc.)
  • Small Document Feeder sizes (50 – 100 pages)
  • On average, file sizes are 10-20% larger
  • Duplex scanning/DPI increase greatly slows down rated speed
  • Black and White scanning only on some models

Benefits of Dedicated Scanners:

  • Convenience – scan at your desk
  • Duplexing does not slow down scanner
  • Color dropout
  • Superior image quality due to enhancement features
  • Ease in handling differing paper sizes/types
  • Larger document feeder selections (up to 1000+ pages)
  • Smaller file sizes
  • Ability to preview scanned documents at scan time

Drawbacks of Dedicated Scanners:
  • One to One relationship – directly connected to PC
  • Additional Maintenance costs

Above are all the pluses and minuses, but in a nutshell, when should you use a dedicated scanner?

  • Scanning 50+ documents per day
  • Workers that are constantly scanning throughout the day
  • Mixed paper sizes, weights and colors
  • Poor quality, older documents or when image enhancement is required
  • OCR or ICR applications
  • High volume copying and printing environments
  • Large Document scanning
  • High security environments

Now that you have an idea of the pros/cons of both types of scanning devices, let’s take a look at the different features of scanning devices, and what to look for when purchasing a dedicated scanner.


Scanning Speed

Scanning speed is a main area of focus when researching scanning hardware. A scanner’s speed is usually directly proportional to its price, but you have to ask yourself one question: How long do you have to accomplish your scanning tasks? If you buy that cheapo scanner at an office products store that scans at 8 pages per minute, good luck in getting those 10 file cabinets scanned. Another note to mention is that all the manufacturers rate their scanner speeds at 200 DPI. If you need high quality images, or are performing OCR, 300 DPI will probably be necessary. This will significantly slow down your scanning speed, as will color scanning and duplex (2-sided) scanning on some models.
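
To put rough numbers on that question, here is a back-of-the-envelope sketch. The pages-per-cabinet figure, the hours of scanning per day and the speed_factor for higher DPI or duplex are all assumptions I made up for the example; plug in your own page counts and measured throughput.

```python
def scanning_days(total_pages, rated_ppm, speed_factor=1.0, hours_per_day=6.0):
    """Estimate working days needed to scan total_pages.

    rated_ppm     -- the manufacturer's rated pages per minute
    speed_factor  -- assumed fraction of rated speed actually achieved
                     (for example when scanning at 300 DPI or in color)
    hours_per_day -- assumed hours of actual feeding per working day
    """
    effective_ppm = rated_ppm * speed_factor
    minutes = total_pages / effective_ppm
    return minutes / 60 / hours_per_day


PAGES_PER_CABINET = 10_000          # assumption, not a measured figure
total_pages = 10 * PAGES_PER_CABINET

slow = scanning_days(total_pages, rated_ppm=8)    # office-store scanner
fast = scanning_days(total_pages, rated_ppm=60)   # hypothetical faster model

print(f"8 ppm:  {slow:.1f} working days")
print(f"60 ppm: {fast:.1f} working days")
```

Under these assumptions the 8 pages per minute scanner needs roughly seven weeks of working days for the ten cabinets, before any slowdown for 300 DPI, color or duplex.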

Document Feeder Capacity

The document feeder provides you the ability to load anywhere from 1-1000+ sheets into the scanner. The feeder capacity you require all depends on the volume of paperwork you are scanning, and if you are using an intelligent capture application that provides the ability to use separator sheets to split documents automatically. If you are a Law Firm that routinely scans 200 page documents, then that is a good starting point for your feeder size requirements. This allows you to load your documents, and then let the scanner do the work.

Another focus area related to the feeder is the maximum and minimum paper sizes. If you intend to scan legal size paper or insurance cards, make sure the scanner can handle them.

Daily Duty Cycle

The Duty Cycle (DC) is a rating of the scanner’s durability, and defines just how much paper you can feed through the hardware in a day. If you are scanning 3000 pages per day, you do not want to buy a small desktop scanner with a DC of 750. What happens if you exceed this number? Nothing to begin with, but as time goes on the wear and tear on the unit will begin to show in the form of jams, misfeeds, skewing, etc. This number is also tied to the replacement of consumables (rollers and pads). If you continually exceed the DC, you will more than pay for a higher level scanner in consumables over time, and your maintenance costs may go way up.
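
A quick way to sanity check a scanner against your volume is to compare daily pages to the rated duty cycle. This is only a sketch; the headroom value is my own assumption, not a manufacturer guideline.

```python
def duty_cycle_load(pages_per_day, rated_duty_cycle):
    """Return daily volume as a multiple of the rated daily duty cycle."""
    return pages_per_day / rated_duty_cycle


def is_undersized(pages_per_day, rated_duty_cycle, headroom=0.8):
    """True if daily volume exceeds the chosen share of the duty cycle.

    headroom=0.8 is an assumed safety margin, meaning we plan to use
    at most 80% of the rating on a normal day.
    """
    return pages_per_day > rated_duty_cycle * headroom


load = duty_cycle_load(3000, 750)
print(f"Load: {load:.1f}x the rated duty cycle")
print("Undersized:", is_undersized(3000, 750))
```

For the example in the text, 3000 pages per day on a scanner rated for 750 is four times the rating.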

Scanning Mode

Most scanners nowadays can scan both sides of your document, but there are still some lingering models that will only do simplex scanning. Also, if you have the requirement to scan color documents, ensure that color scanning is supported.

Warranty and Service

Not all warranties are created equal. Some scanner manufacturers provide “depot” type service where you have to ship your scanner for warranty service. Others will provide onsite warranty service for a specified period of time. Along with this, the warranty period varies anywhere from 30 days to a full year. Scanner service is a separate purchase, and in some cases, can be a shock to the purchaser. A basic service plan on a mid-range scanner can cost over $1000 per year. Get an advanced plan that provides Preventative Maintenance visits, and you could be in the $1500 - $2000 range, depending on your model. Get all the details up front, and ask about multi-year plans, as some manufacturers will provide multi-year discounts on service.

Image Processing

Definitely investigate the image processing software that comes bundled with your scanner.  This software will improve the quality of your images, remove shading, borders, etc.  Many of the manufacturers now provide third party image processing software (Kofax VRS), but several have their own built into their drivers.  Most capture software has built in image processing components as well.

So hopefully this will answer the majority of your questions on hardware.  Remember, hardware is just part of the overall capture solution.  Follow on articles will cover information on software selection and required features.

Friday, July 8, 2011

Document Capture and Scanning Planning - Part 2

Document Examination and Separation


One of the key steps in preparing for document scanning and capture is to identify how you will separate or split documents.  What is separation and how does it work?  Details below:

For those of you that are new to document management and capture, document separation is the notion of how we can determine when a document begins and ends.  With most simple scanning software, this process is easy.  You load a single document in the feeder, click scan, and when it is done, you name it and save it.  With advanced capture, you can load multiple documents into the feeder, scan them all at once, and use a separation method to split them into individual digital documents.    This is a massive time saver.  Imagine loading 20 individual documents into a scanner one at a time, scanning each individually, and then entering information about each.   Below are some key separation methods any advanced capture suite should have:

Fixed Page Count Separation – This allows you to split based on a certain page count.  So if you scan a 100 page stack of two page forms, you will have 50 separate documents in your capture interface.
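
The logic behind fixed page count separation is about as simple as it gets. A minimal sketch, with pages modeled as plain list items (a real capture application works on images, so the names here are just for illustration):

```python
def split_by_page_count(pages, pages_per_document):
    """Split a scanned stack into documents of a fixed page count.

    If the stack does not divide evenly, the last document is short.
    """
    if pages_per_document < 1:
        raise ValueError("pages_per_document must be at least 1")
    return [
        pages[i:i + pages_per_document]
        for i in range(0, len(pages), pages_per_document)
    ]


stack = [f"page-{n}" for n in range(1, 101)]   # a 100 page stack
documents = split_by_page_count(stack, 2)      # two page forms
print(len(documents), "documents")
```

A 100 page stack of two page forms comes out as 50 documents, which is also why one missing or extra page throws off everything after it.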

Barcode Separation – probably the most pervasive separation method is a barcode separator.  Place a sheet with a specific barcode pattern between each document, and you are off to the races.  To give you the most flexibility, applications should support the following enhanced barcode separation methods:

  • Separate on any barcode
  • Separate on specific barcode terms and patterns
  • Separate on barcode type
  • Separate on barcode count
  • Separate on a certain number of barcodes on a page
  • Separate when a barcode changes

You want to make sure your barcode engine supports 1D and 2D barcodes without the purchase of any expensive modules or add-ons, and it should also have a simple feature that lets you split 2D barcodes and identify separation terms.
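
Here is a rough sketch of two of the methods above: separating on a specific barcode value (a separator sheet) and separating when a barcode changes. I am assuming the barcode engine has already decoded each page into a list of values; the page dictionaries and the "SEP" value are hypothetical.

```python
def split_on_separator(pages, separator_value="SEP"):
    """Start a new document at each separator sheet and drop the sheet."""
    documents, current = [], []
    for page in pages:
        if separator_value in page["barcodes"]:
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page["name"])
    if current:
        documents.append(current)
    return documents


def split_on_barcode_change(pages):
    """Start a new document whenever the first barcode value changes.

    Pages without a barcode stay with the current document.
    """
    documents, last_value = [], None
    for page in pages:
        value = page["barcodes"][0] if page["barcodes"] else None
        if value is not None and value != last_value:
            documents.append([])
            last_value = value
        if not documents:
            documents.append([])
        documents[-1].append(page["name"])
    return documents


with_sheets = [
    {"name": "sheet", "barcodes": ["SEP"]},
    {"name": "a1", "barcodes": []},
    {"name": "a2", "barcodes": []},
    {"name": "sheet", "barcodes": ["SEP"]},
    {"name": "b1", "barcodes": []},
]
labeled = [
    {"name": "p1", "barcodes": ["A"]},
    {"name": "p2", "barcodes": []},
    {"name": "p3", "barcodes": ["A"]},
    {"name": "p4", "barcodes": ["B"]},
]
print(split_on_separator(with_sheets))
print(split_on_barcode_change(labeled))
```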

Patch Code Separation – So what the heck is a patch code?  Just an old school horizontal barcode.  Below is an example.  If you work in the medical field, most medical billing forms will have these on them, and some scanners actually support using patch codes to shift scanner settings during the scanning process.  For flexibility, choose an application that supports patch code separation.

Optical Character Recognition (OCR) Separation – OCR is the process of converting a scanned or imported image into searchable text.  OCR separation searches for a key word, term or phrase on the document, and will recognize that page as the first page in a new document.  This is a preferred method, as you don’t have to kill trees to print cover sheets, and it makes document preparation simple (no inserting separator sheets).  For example, if you are scanning contracts, and you want to split when you find an 8 digit contract number in the right hand corner, this comes in very handy.  There are several key requirements your application must meet to make sure you get high separation accuracy:


  • Scan at 200 or 300DPI and use an app that has image processing software to clean up the page.  Also, your image processing engine must allow processing of imported PDFs and TIFFs if you plan to harvest documents.  Some image correction/processing engines only work with scanners.
  • Ensure your capture application allows you to use expression matching (Regular expressions) so you have the utmost flexibility in finding separation patterns.
  • Character sets are key.  These provide the ability to tell the OCR engine the type of characters you are looking for (A-Z, 0-9, etc), so if it misidentifies a character, it auto-corrects the information.
  • Finally, top line applications also allow you to separate when OCR terms change.  So you can look for that contract number, and only split when you find a new one.
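
To show how expression matching, character sets and "separate when the term changes" fit together, here is a small sketch for the 8 digit contract number example. It assumes an OCR engine upstream has already produced the text for the right hand corner zone of each page; the look-alike character table and function names are my own illustration, not any product's behavior.

```python
import re

# Eight characters that are digits or common OCR look-alikes for digits.
CANDIDATE_RE = re.compile(r"\b[0-9OoIlSB]{8}\b")

# Assumed numeric character set correction: map look-alikes back to digits.
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"})


def read_contract_number(zone_text):
    """Return the corrected 8 digit term from the zone text, or None."""
    match = CANDIDATE_RE.search(zone_text)
    return match.group(0).translate(DIGIT_FIXES) if match else None


def split_when_term_changes(pages):
    """pages is a list of (page_name, zone_text) pairs.

    A new document starts only when a contract number is found that
    differs from the current one. Pages with no number stay put.
    """
    documents, current_term = [], None
    for name, zone_text in pages:
        term = read_contract_number(zone_text)
        if term is not None and term != current_term:
            documents.append((term, []))
            current_term = term
        if not documents:
            documents.append((None, []))
        documents[-1][1].append(name)
    return documents


pages = [
    ("p1", "Contract 20130042"),
    ("p2", ""),
    ("p3", "Contract 2O13OO42"),   # OCR misread zeros as the letter O
    ("p4", "Contract 20130099"),
]
result = split_when_term_changes(pages)
print(result)
```

Without the character set correction, page p3 would look like a new contract number and trigger a false split.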
Intelligent Character Recognition (ICR) Separation – ICR is the process of converting scanned images of hand printing to text.  This method can be utilized to split pages when certain patterns in hand printing are detected.  Note:  all of the features required to ensure accuracy for OCR separation should also be considered if you utilize this method.

Document Import and Separation – There are several separation methods that can be key to success if you need to import large volumes of documents, or you want to process documents scanned from copiers, network scanners, or fax machines.  Below are several separation methods required for any document capture from imported files:
  • New File Separation – This method of separation will look at a directory, pick up files, and maintain each new file as its own digital document.
  • Folder-based separation – This is a key method if you are importing documents and want to combine them based on the folder.  One example might be a law firm that has a folder structure of case documents on different subjects for the case and wants to combine each folder into a single PDF file.
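
The two import methods differ only in how files are grouped. A minimal sketch that works on a list of file paths (in practice the list would come from watching or walking an import directory; the paths below are made up):

```python
from pathlib import PurePosixPath


def new_file_separation(file_paths):
    """Each imported file becomes its own digital document."""
    return [[path] for path in sorted(file_paths)]


def folder_based_separation(file_paths):
    """All files in the same folder are combined into one document.

    Returns a dict that maps each folder to its files in sorted order.
    """
    documents = {}
    for path in sorted(file_paths):
        folder = str(PurePosixPath(path).parent)
        documents.setdefault(folder, []).append(path)
    return documents


files = [
    "cases/smith/motions/b.pdf",
    "cases/smith/motions/a.pdf",
    "cases/smith/exhibits/x.tif",
]
print(new_file_separation(files))
print(folder_based_separation(files))
```

In the law firm example, each folder's list would then be merged into a single PDF by the capture application.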


Blank Page Separation – I only mention this as I would always, always avoid it unless absolutely necessary, especially if you are scanning in duplex.  Most implementations of this method, unless operated under strict preparation by knowledgeable operators, become an absolute mess. (Just my humble opinion ;)  )

Separation Scripting – Finally, for those rare and special occasions, you always want a product that has a pre-built scripting interface for customizing the whole process if necessary.  Now let me be clear, not a sales rep “Yeah we can do that” (which usually means $20,000 in professional services), but a product that has simple hooks into the separation function and asks for a simple “yes or no” answer, based on some parameter or criteria, that anyone with basic scripting skills can write.  When would you use something like this?  Usually for very complex jobs where the original documents cannot be modified, but you need to put some logic in place to split documents.
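
As an illustration of what such a hook could look like, here is a sketch where the separation loop calls a small user-written function for each page and expects a yes or no back. The hook signature, the page dictionaries and the "Page 1 of" rule are all hypothetical; real products expose their own scripting interfaces.

```python
def separate(pages, should_split):
    """Build documents by asking should_split(page, current_document)
    for a yes/no answer on every page after the first."""
    documents = []
    for page in pages:
        if not documents or should_split(page, documents[-1]):
            documents.append([])
        documents[-1].append(page)
    return documents


def split_on_page_one(page, current_document):
    """Example custom rule: split when the page text starts with 'Page 1 of'."""
    return page["text"].lstrip().startswith("Page 1 of")


pages = [
    {"name": "p1", "text": "Page 1 of 2  Statement"},
    {"name": "p2", "text": "Page 2 of 2"},
    {"name": "p3", "text": "Page 1 of 1  Statement"},
]
documents = separate(pages, split_on_page_one)
print([[p["name"] for p in doc] for doc in documents])
```

The point is that the rule itself is a few lines that return True or False, with no changes needed to the original documents.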

The last separation topic I want to cover is something called triggered separation.  Let me set the stage on this one, and describe a process which is near and dear to every accounting manager’s heart, invoices.  So you have a stack of invoices, some single page, some multi-page and you are struck with a dilemma.  If I use barcode separators, and I have 100 single page invoices, do I really have to put 100 barcode separators between them all?  Separation triggers allow you to scan single page and multi-page documents all together.  So in this example, you can stack your single page invoices, and then put barcode separators only between the multi-page invoices in a second stack.  Put a trigger sheet between the two stacks (this tells the capture software to switch from single page separation to barcode-based separation), and scan the whole stack in one fell swoop.  This is a huge time saver in high volume environments, and can allow you to also build redundant separation logic, so you get the highest accuracy in separation with the least amount of document preparation.  Phewwww.  That was geeky.
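
For the truly geeky, the invoice example can be sketched as a loop that starts in single page mode and switches to barcode mode when it sees the trigger sheet. I am assuming the trigger and separator sheets have already been recognized upstream and show up as the marker values "TRIGGER" and "SEP"; both names are made up for the example.

```python
def triggered_separation(pages, trigger="TRIGGER", separator="SEP"):
    """Single page separation until the trigger sheet, then
    barcode separator sheets decide where documents start."""
    documents = []
    mode = "single"
    start_new = True
    for page in pages:
        if page == trigger:
            # Switch modes; the trigger sheet itself is discarded.
            mode = "barcode"
            start_new = True
            continue
        if mode == "barcode" and page == separator:
            # The next real page begins a new document.
            start_new = True
            continue
        if mode == "single" or start_new:
            documents.append([])
        documents[-1].append(page)
        start_new = False
    return documents


stack = [
    "inv1", "inv2",                       # single page invoices
    "TRIGGER",
    "inv3-p1", "inv3-p2",                 # multi-page invoice
    "SEP",
    "inv4-p1", "inv4-p2", "inv4-p3",      # multi-page invoice
]
result = triggered_separation(stack)
print(result)
```

Two single page invoices and two multi-page invoices come out as four documents, with one trigger sheet and one separator sheet instead of three separators.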


Do you really need all of this?  Does separation have to be that complex?  The whole goal here is to have as much as you possibly can in the tool kit to ensure you can meet all the capture needs within your organization.  I liken it to buying a base model with no accessories, and then wishing every day you had one feature or another.

So now you have examined your documents, and figured out how to efficiently scan and split.