Sunday, September 21, 2008

Should I use SharePoint as a replacement for my file servers?

Should you utilize SharePoint as a replacement for your file server?

Many SharePoint features make it a fantastic replacement for traditional file shares. The product is very strong in two key areas when it comes to files: collaboration and versioning. For those file types, like Word Documents and Spreadsheets, the product provides the ability to share documents with a group, and allow collaboration and change management through its version control functionality. The ability to share over the web makes it simple and easy for all to access files.

So is it time to move all of your file server content to SharePoint? I would say no, and restrict the system to those types of files that fit into the below categories:

-Those that need to be shared over the web
-Files that require input or collaboration from multiple parties
-Whenever version control is required
-Whenever files are subjected to compliance rules

For more information on SharePoint go to ScanGuru SharePoint Information Page.

Monday, September 1, 2008

Captaris Single Click Entry Technology

I played with some cool technology this week. I received my demo version of the Captaris Single Click Entry (SCE) technology. Captaris acquired the application when they purchased Oce Document Technologies (ODT) back in January. The application has a native integration into SharePoint, and allows you to open a TIFF scanned image, and just point and click to enter index fields/columns. So, open an invoice, click on the invoice number field, and presto, you have instant OCR field population. There are some screen shots and an associated article at:

Captaris Single Click Entry for SharePoint Article

Sunday, August 17, 2008

Fujitsu and Kofax Demo

I saw some really cool technology this week at a demo our scanning distributor arranged. It was a joint demonstration with Fujitsu and Kofax. The capture side was the soon-to-be-released Kofax Express. Kofax has built a really powerful capture and imaging application for the consumer on a limited budget. But note, this application has many of the features of a full-blown capture application: separation, barcode index field population, and release to several formats, including Microsoft SharePoint. Along with that, it will include all the image processing features of Kofax VRS 4.2. All this for a price between $495 and $2000, depending on the speed of your scanning device.

Along with the software, demonstrations of the Fujitsu fi-5900C, fi-6140 and fi-6770A were performed. The new Fujitsu line is pretty incredible. The fi-6670 and fi-6770 are replacements for the previous workhorses, the fi-5650C and fi-5750C. The scanners can also be bundled with built-in hardware VRS, as denoted by the "A" in fi-6670A and fi-6770A. So what is new in this product? First of all, the speeds are at 90 ppm now, which places these scanners in a whole new class. Secondly, they have implemented a new multifeed technology that allows the scanner to learn from scanned documents, and ignore sticky notes, or other items like checks which are pasted/taped onto a sheet of paper. This can be a huge time saver.

For more info check out the links at ScanGuru Capture and Scanner Links.

Monday, August 4, 2008

Document Management Site Search with Google

I found myself searching way too many sites when I was looking for specific information, or doing research across sites for Document Management and Scanning. I have created a few Custom Google Search Templates for those of you who would like to search across vendor sites for info on Document Imaging. There is a Scanner Manufacturer Custom Search, a Capture Software Custom Search and a Document Management Software Custom Search. For instance, if you wanted to look for HIPAA related articles across all the scanner vendor sites, you can do that. If you want to find all the scanners that support 11x17 paper, it is now possible.

Click the link below to use them:

Document Management Site Searching with Google Custom Search Applets

Let me know if you want any others, or want any other vendors added to the mix.

Thursday, June 19, 2008

What is OCR and how can it help me in my scanning project?

Ah, OCR, also known as Optical Character Recognition. Is it really necessary to use OCR software after scanning files to TIFF or PDF? What are the key benefits of OCR? How can I use OCR to create searchable or editable documents?

OCR technology has come a long way in the past few years, and the OCR engines on the market today utilize intelligence and speed to quickly and accurately convert scanned paper documents from plain old images, into searchable or editable documents. For a quick overview of OCR, ICR and OMR, click here.

When looking at OCR technologies, you need to determine your end goal: is it searchability, or a cleanly formatted, editable document? Is your goal speed, or accuracy?

There are a number of desktop applications (eCopy Desktop, Adobe, OmniPage, ReadIRIS) that can provide the ability to create searchable files, as well as Word Processing files, or even spreadsheets. These are perfect for low-volume, daily conversions.

If you are scanning a large volume of paper, and need rapid and accurate conversion, most of the Advanced Capture applications on the market can accomplish the task (Psigen PsiCapture is an example). This capture software utilizes either the Expervision or OmniPage production OCR engines, and can convert 1,000 pages to searchable PDF in 10 minutes.
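To get a feel for what that rate means for a backlog, here is a rough Python sketch. The function name and the 25,000-page backlog are made up for illustration, and the default rate is simply the 1,000 pages per 10 minutes quoted above; your own hardware, image quality and OCR engine will give different numbers.

```python
def ocr_minutes(pages, pages_per_batch=1000, minutes_per_batch=10):
    """Estimate minutes needed to convert `pages` to searchable PDF.

    The defaults reflect the quoted rate of 1,000 pages per 10 minutes;
    treat them as an assumption, not a benchmark.
    """
    if pages_per_batch <= 0:
        raise ValueError("pages_per_batch must be positive")
    return pages * minutes_per_batch / pages_per_batch


# Hypothetical backlog of 25,000 pages at the quoted rate
backlog_minutes = ocr_minutes(25000)
print(f"About {backlog_minutes / 60:.1f} hours of OCR time")
```

Plug in the rate you actually measure during a pilot scan before you commit to a schedule.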

For more info on OCR and how it can work for you, see the links below:

OCR Software Links

Scanning and Document Management Articles and Research

Saturday, April 19, 2008

How do I scan documents into Microsoft SharePoint?

SharePoint was designed for collaboration with Microsoft Office, and it does a fantastic job, especially with MOSS 2007. But what happens when there are paper documents that need to be added to a site through a digital imaging or scanning process? What options exist?

Below are some key areas to evaluate before deciding on the type of scanning application you require:

Scanning Volume
What type of page and/or document count will your users be scanning each day? Scanning and Capture applications are divided into two separate segments: basic and advanced capture. For users that are scanning just a handful of documents, say below 20 per day, basic capture applications will suffice. What constitutes a basic capture app? All of the scanner manufacturers provide basic capture applications with their hardware. These applications allow you to scan to file (Usually PDF or TIFF), and then you can just use the SharePoint interface to add the file. There are some applications out there that actually interface with the SharePoint server to provide upload capability either directly through a SharePoint menu, or through a “middleware” application that interfaces with SharePoint.  You might also need to examine SharePoint OCR.  The links below provide information on several applications:

Scanning Applications for Microsoft Sharepoint

What if you have high-volume scanning needs where you are scanning boxes of paperwork into the SharePoint Server? This type of requirement will usually require an advanced capture application, like Psigen or Kofax, that will interface with a high speed scanner, allow you to utilize document separator sheets, and provide the ability to batch upload documents and metadata. Links below:

Distributed versus Centralized Capture
How do you envision your users scanning documents? Will you place the power of scanning in the end user’s hands, or will you have a centralized scanning process where documents are sent to a centralized facility for scanning? This decision is usually tied to the volume of scanning required, and most organizations will choose a hybrid method where both options are available. Placing desktop scanners in the hands of key users, or enabling a copier with an application like eCopy can provide a simple conduit into SharePoint for the masses. The centralized capture route is great for higher volume, or when you want to ensure standardization and compliance.

Savvy or Technophobic Users?
Capture applications can be painfully complicated, or extremely simplistic. You need to gauge the learning curve for each type of technology, and ensure acceptance. Simple is better for organizations that are new to scanning and capture technology. Making the process as simple as possible will encourage users to scan and add files to the repository. eCopy is an outstanding, yet simple application to get your feet wet in capture. Below are some links to read up on eCopy:

What is eCopy?
If you have tech savvy users, and require a high horsepower application, check out PsiCapture from Psigen or any of the other advanced capture apps:

Document Capture Links

SharePoint is a powerful tool for collaboration and sharing, and any of these applications can help image paper documents into the repository.

Sunday, March 30, 2008

eCopy, SharePoint and Scanning in the Enterprise

More and more organizations are taking the leap into a centralized document repository, through the use of Microsoft SharePoint. SharePoint is an outstanding tool for collaboration, and excels when utilizing Microsoft Office documents. But what happens when you create a SharePoint site that will require the uploading of numerous scanned documents? How can you standardize file naming, metadata population and the overall document imaging process?

One of the best solutions I have found is eCopy ShareScan. eCopy is a document capture solution that can be connected to just about any Multi-Function Copier or Dedicated Scanner. The challenge with traditional scanning solutions is that they usually require dedicated scanner hardware that is connected to a PC. This provides a one to one solution, allowing only the PC user to scan documents. With eCopy, you can provide a one-to-many relationship, and share the scanning capabilities of your copier or scanner with an entire office or department. It has a simple, easy to use touch screen interface that even the most technically challenged user can learn to use. It can provide a rapidly deployable solution, with quick adoption, and low training requirements.

eCopy provides a SharePoint Connector that allows quick integration into any site, with all security intact, and the ability to require metadata fields. For larger organizations, you can configure one ScanStation, and then publish the configuration to all the others on your network. For more info on eCopy and additional tools, go to What is eCopy? or for other SharePoint Scanning Utilities, click on the following link SharePoint Scanning Utilities.

Thursday, March 27, 2008

TIFF or PDF for my document imaging project?

Ah, the age-old argument…what file format do I use for scanning and archiving? TIFF, PDF, JPG, MDI, BMP? I will focus on the two most prevalent options, TIFF and PDF.

TIFF – TIFF stands for Tagged Image File Format and has become the industry standard for Document Imaging. All of the scanner vendors, Document Management Software and Enterprise Content Management vendors support this format. TIFF was originally developed by the company Aldus as a standard for imaging. Adobe has since acquired Aldus. The TIFF format provides compression to produce manageable image sizes.

PDF – PDF stands for Portable Document Format, and was created by Adobe for simple document exchange. It has quickly been adopted by the document imaging industry because of the proliferation of the Adobe Reader. There are 3 types of PDFs: Image, Text and Image + Hidden Text. The Image + Hidden Text gives you an all in one package if you want a searchable file. The newest PDF standard is PDF/A, or the PDF Archive standard. This PDF type is designed to ensure long term archival compatibility. It also sets a standard for an all-in-one package that includes metadata, OCR text and the image itself.

So which do you choose?? I cannot tell you how often I get this question from customers. My answer is usually in the form of a question: What are you going to do with the documents once they are archived? If you distribute them outside the organization, then PDF is usually the best choice, as Adobe Reader is on over 90% of the PCs nowadays, and you will not have to deal with someone trying to open a TIFF. If your documents are going to be used for internal reference, and will be viewed through a document management system, then TIFF is perfectly fine. Another consideration will be OCR text. Does your document management system work with TIFF and PDF??

Most software today will accommodate either image format; just make sure it will meet all your needs. You can find a wide variety of software solutions at Scanning and Imaging Solutions.

Wednesday, March 19, 2008

What does paperless mean?

As I see more and more offices move towards the paperless office, I have realized it definitely pays to take the slow road. The major issue in almost any Document Management project is the human factor: people's ability to accept and use the new technology. Overload the employees with difficult and complex technology, and you may as well have thrown your investment in a bonfire. Organizations will always have different technology adoption rates, even when they are in like verticals. The best course of action for any organization is to choose a vendor you can trust, and that has a strong track record within your industry. A Document Imaging Solutions professional can assess your current situation, your ultimate destination, and then map a technology path to the promised land. Any vendor that is recommending full-blown, enterprise-wide systems from the get go is just trying to pad their pockets, and will ultimately leave you a mess you will not be able to clean up. Figure out your pain points, come up with a reasonable budget/technology that will solve the problem, and then go for it.

There are some great planning resources here:

Document Management Planning

Document Management and Security

Justifying your Document Management, Scanning or ECM Project

Wednesday, February 13, 2008

Scanning and Document Management with Microsoft SharePoint

SharePoint is being touted as the "swiss army knife" of Enterprise Content Management. I have to admit, we use this tool for just about everything under the sun: sales pipeline management, deal tracking, project management, help desk, sales material library, etc. It is a very solid application for a variety of functions, and it is so easy to create a site. You can easily download an application template, and within minutes have a site up and running. The customization is even easier, and adding columns and metadata fields is a breeze. The other day I was playing around in our demo room, and decided to try out the eCopy SharePoint connector to test scanning into my library. It was a breeze, and eCopy is fantastic, with seamless integration into the product.

I will write some follow-on articles on the subject, as I am finding more and more prospects asking about SharePoint integration and scanning into the application. I have done some basic research on scanning utilities for the app, links at:

Microsoft SharePoint Scanning Links

And for some great reading on SharePoint and Planning:

Microsoft SharePoint Information Links

More to follow.

Thursday, January 31, 2008

Document Management and Integration with Business Applications

In the beginning, most of the Document Management and ECM solutions I sold were stand alone applications. Users would scan documents (say invoices) into the repository, and would use the search client to bring up documents by index field, or perhaps do a full-text search of the OCR'ed contents.

What I am finding today, as more and more IT folks get involved in the decision making process, is that integration is King. Applications must play well, and play easy with all other business applications within the organization. What does that mean? What does integration truly mean?

I have found that it means many things, to many different people. Below is a summary:

Basic Metadata Population
Wow, that is a mouthful. Basic Metadata Population, or BMP, involves the pulling of index field information from an existing source, and allowing the user to manually pick the information for a field such as the vendor. The most common use case here is to present a popup list of information for the user to choose from. Take, for example, one of my customers that has PeopleSoft Financials. One of my engineers created a view within the PeopleSoft DB of all the vendors. When Purchasing is indexing their Purchase Orders, they see a listing of vendors directly from the financial system. This prevents rekeying of data that has already been keyed, prevents duplicate names or misspellings, and ensures standardization.

Advanced Metadata Population
Another mouthful, but AMP takes population of index fields a step further, and provides autopopulation of fields based on a database lookup. For example, you might have a Vendor Number field that is entered, and the capture application will go and look up all the information on that vendor and assign it to the document.
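Both flavors of metadata population boil down to a query against an existing data source. Here is a minimal Python sketch, assuming a made-up `vendors` table in an in-memory SQLite database; the table, column names and sample vendors are hypothetical stand-ins for whatever view your financial system exposes, and a real capture application would use its own lookup connector rather than this code.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical vendor table standing in for a financial-system view."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vendors (vendor_number TEXT PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO vendors VALUES (?, ?, ?)",
        [("V1001", "Acme Paper Co", "Dayton"), ("V1002", "Globex Toner", "Tulsa")],
    )
    return conn


def vendor_pick_list(conn):
    """Basic population: vendor names the indexer can choose from a popup list."""
    rows = conn.execute("SELECT name FROM vendors ORDER BY name")
    return [name for (name,) in rows]


def autopopulate(conn, vendor_number):
    """Advanced population: fill the other index fields from one keyed value."""
    row = conn.execute(
        "SELECT vendor_number, name, city FROM vendors WHERE vendor_number = ?",
        (vendor_number,),
    ).fetchone()
    if row is None:
        return None  # unknown vendor: leave the fields for manual entry
    return {"vendor_number": row[0], "vendor_name": row[1], "vendor_city": row[2]}


conn = build_demo_db()
print(vendor_pick_list(conn))
print(autopopulate(conn, "V1002"))
```

The pick list keeps users from retyping vendor names, and the keyed lookup fills in the rest of the fields, which is the difference between the basic and advanced approaches.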

Screen Scraping
This technology is usually used to "scrape" information off the screen from an application, and use it to populate index fields, or to perform a search. Different functions can be tied to hotkeys, or some advanced applications can have a quicklaunch bar that will perform certain operations. For example, if you are in your financial software looking at a particular vendor, you can hit a hotkey and have all the documents for that particular vendor presented.

True Integration
True Integration requires application programming interfaces that will allow two applications to talk directly to each other. For instance, you can create a button in the tool bar of your financial application that will link to a function within your document management system, or ECM system. So with one quick click, you can find all the associated documents, or scan a document to a particular vendor file with all the fields populated.

Integrations always present some challenge, and it is important to make sure you are on the same page when talking to a customer or vendor to ensure everyone is satisfied in the end.

For more info on Document Management and ECM, go to the following link:

ScanGuru Document Management/ECM Portal

Monday, January 21, 2008

Document Management and Image Processing

Image processing is an area that is often overlooked when implementing a Document Management, Document Imaging or ECM project. In some cases, it can even be the key to success or failure, depending on how you are using the images. So, what exactly is image processing? It is the use of software to enhance or improve scanned images and the underlying content. An example would be that nasty copy, of a copy, of a copy of a fax. This page would be seriously speckled, very faint and could have a black border on the edges. Image Processing software can remove the speckles, enhance the text, and remove the border, resulting in a legible, clean, small image.

Many scanners on the market include image processing functions within the scanner driver. My Canon 9080 has border removal, deskewing and color drop out included within the driver. For basic, image only applications, this may be enough (Note: Setting these options may reduce the throughput of your scanner significantly). If you are relying on clean images to provide searchability, then you usually have to go with more powerful image processing software, such as Kofax Virtual Rescan (VRS). VRS provides a broad array of image processing features, and comes in two flavors: Basic and Professional. For an overview and comparison of VRS features click on the following link VRS Basic versus Professional Features.

For additional information on image processing and all the benefits, there is additional info at the following link ScanGuru Document Management and Image Processing Article.

Sunday, January 13, 2008

Document Management and Disaster Recovery

Disaster Recovery is always at the forefront of any solid IT strategy. Companies are becoming so dependent on technology that even the simplest power outage can wreak havoc on business operations. I am always surprised at the lack of attention paper files receive when it comes to the Strategic Disaster Recovery/Business Continuity plan. Companies will spend tens of thousands of dollars on the latest backup and recovery software, offsite data storage and redundant secondary sites, but when asked "What will happen when a fire hits the corporate office and all the paper is gone?", I usually get a blank stare. This is mostly due to the separation of duties within any organization. IT Managers see the data as their responsibility, and go to any length to protect this vital resource. Paper files are almost always managed at the departmental level, by managers who are usually not aware of, or educated on, disaster recovery and Business Continuity planning.

When examining the overall process, Disaster Recovery and Business Continuity Planning are usually split into separate, but co-dependent processes. Below is a listing of each process and what they include:

Business Continuity Planning (BCP)

  • Plan and Scope organization
  • Business Impact Planning and Analysis
  • Plan Development and Implementation

Disaster Recovery Planning (DRP)

  • Planning process
  • Testing of the plan
  • Recovery procedures

So where does Document Imaging and Document Management play into this process? Paper needs to be a primary focus during the Business Impact Analysis and overall planning exercise. How important are the file cabinets? Can business carry on if all is lost? Is the paper just a redundant copy of existing data? How easily can paper records be recreated?

The whole disaster planning process is a long and arduous task, but organizations need to take into account all their assets to ensure business continuity and full operational functionality. Implementing a Document Management and Scanning solution will back up necessary paper files, making sure all required information is available after a disaster.

See the ScanGuru planning section for additional articles and information on planning:

Document Management Planning

Document Management and Data Backup

Sunday, January 6, 2008

Document Management and Data Backup

It is amazing how often data backup is left off the planning list when it comes to Document Management, Enterprise Content Management and Digital Imaging. For larger scanning projects, creating 600MB per 4-drawer file cabinet can wreak havoc on even a robust backup system.

When planning for backup, there are several areas on which to focus:

Size of Data Repository - You need to take into account the size of your repository today, and perform some projections to estimate the size as years go by. This will help decide which backup technology you will need to choose to support your backup and restore operations.

Speed of the Backup Technology - This is so important, and yet often overlooked. You can have massive backup storage capabilities, but if you only transfer 1 MB to tape every hour, your backups will never finish. When examining backup options, do some quick pencil math, and figure out how much data transfer you will require to complete the backups.

Recovery - How will you recover your data in the case of a disaster? Remember, if your server room burns down, and you were using a $5,000 tape drive to back up, you will need to have another $5,000 tape drive to recover data once you rebuild your server room. Other technologies can offer recovery capabilities without requiring a specific piece of hardware to recover your data.

Testing - Always test both backup and recovery to make sure your data store is complete. Also, perform some testing before you go live to uncover any issues you might have in other areas, such as network bottlenecks, etc.
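The pencil math for repository size and backup speed can also be sketched in a few lines of Python. This is only a rough illustration: the 600 MB per cabinet figure comes from above, while the starting size, cabinets scanned per year, drive throughput and 8-hour backup window are made-up numbers you would replace with your own.

```python
MB_PER_CABINET = 600  # figure quoted above for one 4-drawer file cabinet


def projected_size_gb(current_gb, cabinets_per_year, years):
    """Estimated repository size in GB at the end of each year (1 GB = 1000 MB)."""
    sizes = []
    size = current_gb
    for _ in range(years):
        size += cabinets_per_year * MB_PER_CABINET / 1000
        sizes.append(round(size, 1))
    return sizes


def backup_hours(data_gb, throughput_mb_per_hour):
    """Hours needed to move data_gb at a sustained throughput in MB per hour."""
    if throughput_mb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 1000 / throughput_mb_per_hour


def fits_window(data_gb, throughput_mb_per_hour, window_hours):
    """True if a full backup finishes inside the available backup window."""
    return backup_hours(data_gb, throughput_mb_per_hour) <= window_hours


# Hypothetical shop: 50 GB today, 40 cabinets scanned per year, 3-year outlook
sizes = projected_size_gb(50, 40, 3)
print(sizes)
# Can the year-3 repository be backed up in an 8-hour window at 50,000 MB/hour?
print(fits_window(sizes[-1], 50000, 8))
```

Run the numbers for the last year of your projection, not just today, so the backup technology you buy still finishes overnight after the repository has grown.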

There are many areas that are vital in planning for a Document Management System, and backup and recovery are essential.

For more specific information on backup options, I have another article on specific backup technologies Document Management Data Backup Options