Converting PDF to .tif

If you ever need to convert a PDF to a .tif, this seems to be a good way.

Here is a Ghostscript command line that should work well for letter-sized pages of a multi-page PDF file:

gswin32c.exe -o page_%03d.tif -sDEVICE=tiffg4 -r720x720 -g6120x7920 input.pdf
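
The -g value is just the page size in inches multiplied by the resolution (8.5 x 720 = 6120 and 11 x 720 = 7920). As a rough sketch, assuming you want to rebuild the same command for other resolutions or page sizes, something like this would do it. The function name and defaults are mine, not part of Ghostscript:

```python
def gs_tiff_command(pdf_path, dpi=720, width_in=8.5, height_in=11.0,
                    exe="gswin32c.exe"):
    """Build the Ghostscript argument list for a page size and resolution."""
    # -g takes the page size in device pixels: inches multiplied by dpi.
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return [
        exe,
        "-o", "page_%03d.tif",   # one numbered .tif per PDF page
        "-sDEVICE=tiffg4",       # Group 4 fax-compressed black and white TIFF
        f"-r{dpi}x{dpi}",
        f"-g{width_px}x{height_px}",
        pdf_path,
    ]


# Reproduces the letter-size, 720 dpi command line shown above.
print(" ".join(gs_tiff_command("input.pdf")))
```

The list can be handed to subprocess.run as-is if Ghostscript is on the path.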

If you want to do this under program control, you could use GhostSharp.

ECM Queue Management

Hi folks, it has been too long!

Well, enough with the chitchat. As some readers might know, we have implemented an ECM solution for our accounts payable process. I like to call this our personal ECM cloud. Most recently this has rolled out to our remote locations. So, what has this system gained us?

At first blush, many will assume that the savings to be gained with imaging/ECM are in storage costs... WRONG! If you ever implement a system to save on storage, you are doing it wrong!

One of the coolest things we are able to do is see what invoices are piling up on someone’s ‘virtual desk’. (This of course does rely on timely scanning into the system.)

Using some Qlikview ‘magic’, we are able to display the total invoices in the system by plant and analyze which are late (in the queue longer than 7 days, adjustable by a slider), shown in red. Then we can click into a location. Let’s pick on Chehalis.

Looking at this tab we can see there is one invoice that is late in the Plant Mgr. Review 1 queue, so let’s click on that one and go to the Image tab.

So there is the image of our tardy invoice. Hmm, entered into the system on the 9th. Why hasn’t this been approved yet? Let’s go talk to that manager.

In conjunction with the queue management Qlikview, we have also created some ‘minions’, i.e. little automated tasks that look at the queues and, if they get too large or too tardy, email the offending person(s).
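
To give a feel for what a minion does, here is a minimal sketch of the late-invoice check. It is not our actual code: the invoice records, field names, and message wording are all made up for illustration, and the sending of the email is left out. Only the 7 day threshold comes from the description above.

```python
from collections import defaultdict
from datetime import date


def late_invoices(invoices, today, max_days=7):
    """Return invoices that have sat in their queue longer than max_days."""
    return [inv for inv in invoices
            if (today - inv["entered"]).days > max_days]


def build_reminders(invoices, today, max_days=7):
    """Group late invoices by queue owner and draft one reminder per owner."""
    by_owner = defaultdict(list)
    for inv in late_invoices(invoices, today, max_days):
        by_owner[inv["owner"]].append(inv)
    reminders = {}
    for owner, items in by_owner.items():
        lines = [f"{i['number']} in {i['queue']} since {i['entered'].isoformat()}"
                 for i in items]
        reminders[owner] = "Invoices waiting on you:\n" + "\n".join(lines)
    return reminders


# Hypothetical queue contents; a real minion would read these from the ECM database.
sample = [
    {"number": "INV-1", "queue": "Plant Mgr. Review 1",
     "owner": "manager@example.com", "entered": date(2011, 3, 9)},
    {"number": "INV-2", "queue": "SCANNED",
     "owner": "reception@example.com", "entered": date(2011, 3, 20)},
]
for owner, body in build_reminders(sample, today=date(2011, 3, 21)).items():
    print(owner, body, sep="\n")
```

Run it on a schedule, hand each drafted message to your mail system, and you have a minion.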

Another benefit that we have gained is the ability to spread out the load of data entry among many users. Typically, the AP person would have to memorize (or look up) thousands of different vendor numbers. Now, each individual can learn their 10-25 vendor numbers and enter them in, saving the AP person from having to do it. In addition, we added a quick look-up of the vendor name and address so the manager can double check they have the right vendor number.

Since the process is digital, a manager is able to approve invoices from wherever they are able to get a VPN connection. That goes for pulling up paid invoices too, of course.

Did I mention that it scales well? Instead of just one person processing invoices, it could be 10, or 100.

At the low end of the spectrum, yes we do save on storage costs, postage, and no longer having to file the invoices away.

ECM, finding the ‘perfect’ scanner

We went through a lot of iterations trying to find a scanner that would match our AP process well. Interestingly, nobody really makes a class of scanner that fits what we wanted. We liked the simplicity of the Canon ScanFront, but we learned that with a larger scanner we could turn our invoices to landscape and get much better paper feeding. (The typical creases that come from being stuffed in an envelope tend to stick pages together.)

When you look at the higher-end scanners, all of them require a computer connected by either USB or SCSI. This makes sense when you think of the processing required to handle 60-70ppm duplex scanning. Then again, why can’t anyone just bundle an embedded XP box inside the scanner and call it good? Well, that’s basically what we ended up doing. We mounted a PC using the VESA mounts onto the back of a Canon 7550C. This gives us good paper handling (landscape!), an easy-to-use Scan-To button, and very decent software that rotates pages based on text and offers Advanced Text Enhancement.

All in all, the scanning part of our ECM projects has gone into very quiet IT ‘it just works’ mode. This is where I try to drive all of our technologies, so that our small IT group can continue moving forward with new initiatives instead of getting bogged down in the OPERATIONS side of the IT business.

My new favorite quote:  A laggard IT operation ‘monitors.’  A follower ‘manages.’  A leader ‘automates.’ Bob Laliberte

A picture of our FrankenScanner, it works pretty well!

Rapid ECM, Up & Running in two hours (AKA our personal ECM ‘cloud’)

The stack goes like this.

Storage:         EqualLogic PS6500 35TB
Processing:      IBM 3850 M2 64GB RAM
Virtualization:  VMware ESX
OS:              Server 2008 R2
Database:        SQL Server 2008
Capture:         iLinx Capture
Store:           iLinx Content Store
Access:          Terminal Services + Internet Explorer 8

Or in pictorial form:

Storage: I love our EqualLogic. They are simplicity and ease of use defined. Poetry in motion. A couple of clicks and you have a new volume of virtualized storage for our virtual ECM box. Need more space? Buy a new array and add it to the pool. Plus you get added I/O when you add an array. No wonder they won a 2011 InfoWorld best storage system award for the third year running.

Virtualization: VMware, AKA ‘the good stuff’. Really, how did we live before virtualization? So, right click and deploy the Server 2008 box with SQL 2008 from a template.

Software: Install iLinx Capture and iLinx Content Store. Both programs install quickly and easily.

*****     *****

OK, now that our base is installed, let’s go over what we want to accomplish. We have a purchase order acknowledgement that is coming in through email as a .tif. We want to store this image and be able to retrieve it based on the Purchase Order, the Vendor Name, and the Vendor Confirmation #. We also want to record the Purchase Order date and populate the ERP with it.

So off we go to configure Capture. Create a blank PO Confirmation Batch and a blank PO Confirmation Document. The Batch will be super simple: basically take any document and assemble it into a PO Confirmation Document. (Set the Default Doc Type to PO Confirmation.) That will look like this:


OK, so now we have to configure the PO Confirmation Document. We will add our four data entry fields. And to aid in getting exact metadata to populate Content Store, let’s do a lookup of the Vendor Name using the PO number.

To configure the lookup, edit the PO Number Field Name and then check the check box. This will run you through a wizard, and at the end you can paste in whatever SQL statement you desire.

Capture has its own format for using a field in a lookup (notice the carets ^ ^), and if you match an existing field using a select x AS y statement, it will populate that field with the data returned. So now if a user types in a PO number and then tabs out of the field, Capture will populate Vendor Name with the returned value! Pretty slick.
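
Here is a small simulation of that idea, so you can see the mechanics without the screenshots. Treat all of it as assumption: the table and column names are invented, the exact placeholder syntax Capture expects may differ from my ^Field Name^ guess, and this is my own stand-in logic running against an in-memory SQLite database, not how Capture does it internally.

```python
import re
import sqlite3

# Hypothetical lookup statement: ^PO Number^ stands for the value typed into
# the PO Number field, and the AS alias names the field to populate.
LOOKUP_SQL = ('SELECT vendor_name AS "Vendor Name" '
              'FROM purchase_orders WHERE po_number = ^PO Number^')


def run_lookup(conn, template, fields):
    """Bind ^Field^ placeholders as parameters and copy aliased columns
    into the fields that share their names."""
    names = re.findall(r"\^([^^]+)\^", template)
    sql = re.sub(r"\^[^^]+\^", "?", template)
    cur = conn.execute(sql, [fields[name] for name in names])
    row = cur.fetchone()
    updated = dict(fields)
    if row is not None:
        for column, value in zip((d[0] for d in cur.description), row):
            if column in updated:      # only fill fields that exist on the form
                updated[column] = value
    return updated


# Stand-in for the ERP database, with one made-up purchase order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase_orders (po_number TEXT, vendor_name TEXT)")
conn.execute("INSERT INTO purchase_orders VALUES ('PO1001', 'Acme Supply')")

form = {"PO Number": "PO1001", "Vendor Name": ""}
print(run_lookup(conn, LOOKUP_SQL, form))
```

The point is the alias: because the returned column is named the same as an existing field, the value lands in that field.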

So next, create a Queue for these confirmations to sit in and then give access to the users who are going to process them. Then configure an Export QSX to output to a folder for iLinx Content Store to import from.

So, activate our route and make sure all of the security settings are correct (giving access to the Batch, the Document and the Queue).

So now configure your ‘polling folder’. I wrote how to do that here.

OK, so now all our user has to do is drop the .tif into the polling folder. Capture will bring it in and then the user indexes it. Their screen will look something like this (we have them access it through a terminal server):


When the user hits complete it will export the image and the metadata. So now let’s configure a new Content Store repository for this image and metadata to live in.

Log onto Content Store as an admin, click Create/Edit Applications and then press the Add button. Give your Application a name, and then start creating your data fields.

Make your data types and lengths appropriate for your application. In my case I used a Picklist of ‘Value In Adage’ with a Yes/No so that after the confirmation is inserted into Adage (our ERP) I would flip it to Yes to indicate completion.

Now that our fields are set up, we need to configure the import task. Click on Options, Import Multiple Documents, and then Add. Give the job a Name, an Import Source and a Target. For the template file, select the .txt of the previously created output file (which contains the metadata) and then use the drop-downs to map from Content Store to ILINX Capture. You will also set the import, archive and export folders.

It will look like this:

So now when the Import Service runs it will bring the image and the metadata into Content Store. This is what it looks like:

Then I wrote a quick bit of VB to take the confirmation that was in content store and insert it into our ERP signifying that the PO confirmation had been received, and then mark it as done.

So there you have it, a very Rapid ECM implementation. In fact I think it was up and running quicker than it took to write this blog post.

Another post that shows similar screens to mine is this one. I wanted to show what had to happen behind the scenes to make a system like this work!

iMplementing iLinx Capture – Part 3

It’s been a while since I’ve written. We’ve been really busy here with our AP process go-live, attending the Nexus ECM conference, and just all of the regular stuff we do.

A couple of things we have picked up on, though. We really like having a ‘bigger’ scanner because it lets us turn our invoices on their side. Most of the invoices have folds running horizontally (from being stuffed into an envelope), so turning them against the grain, you might say, has really improved our document pickup. Since we tried this approach we also had to revisit our barcode sheets. A couple of lessons learned here: barcodes really like whitespace on either end (I wasn’t giving them enough and sometimes they wouldn’t read). I ended up using an 11 x 17 piece of paper so I could make the sheet stick out just a little more than a regular 8 1/2 x 11. Since I was doing them over again, I also put two identical barcodes on each sheet, one horizontal and the other vertical, to help guarantee barcode reads. After making these changes the barcode reads are pretty much 100%.

So, picking up from the previous post. The case flows through, and hits the barcode separation, then it gets dumped to the first Queue. We named this Queue SCANNED. If something went wrong with the Barcode separation it will end up in the SCANNED HAD A PROBLEM queue. From the SCANNED queue, the receptionist checks the image, checks that the correct path was set for the route and then hits complete to send it on its way.

Here is a picture of the ‘AP Batch’. If the receptionist wants to delete the batch she can set the status of the batch to DELETE. When the receptionist hits complete it goes to the Assembly, which is where the Document is passed down to its Document Type route. A couple of other interesting things to note: I created a Scanned On Field name and then used the Default Value of [Current Date].

I wasn’t able to use the default value of [Scan Date] since we are scanning externally (if we had been scanning from within iLinx Capture, the value would have been filled in). Then at the document level, if you create the same field as at the batch level, the value will transfer to the Document ‘level’.

As another aside, after an analysis of our typical documents we discovered that we had a lot of single page, single sided invoices. With our barcode separator sheets we would have to put one in between each individual invoice, or we could separate them inside of iLinx Capture. Well, this process was starting to get a little tedious so I decided to throw some code at it. What we ended up with is a few extra buttons on our scanner which I called burst magic X1, burst magic X2, etc. What happens is the scanner dumps the .tif to a folder where some code grabs the barcode image and then, based on the desired frequency, inserts it between the pages. It then passes the revised image along to the normal capture import folder, and magically the invoices are separated.

Another thing we started playing with is the text optimized scanning modes provided by the scanner software. This has helped clean up some of our hard to read invoices.
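
The heart of the burst magic step is just interleaving. Here is a minimal sketch of that part, assuming the pages of the scanned .tif have already been pulled into a list by whatever imaging library you like (the reading and writing of the multi-page .tif is left out). I am also assuming the separator goes in front of every group, including the first, so each invoice starts with its barcode sheet. The function name is made up.

```python
def burst(pages, separator, every=1):
    """Insert the separator page in front of every group of `every` pages."""
    if every < 1:
        raise ValueError("every must be at least 1")
    out = []
    for index, page in enumerate(pages):
        if index % every == 0:      # first page of a new invoice
            out.append(separator)
        out.append(page)
    return out


# X1 treats every page as its own invoice, X2 every two pages.
print(burst(["p1", "p2", "p3"], "BARCODE", every=1))
print(burst(["p1", "p2", "p3"], "BARCODE", every=2))
```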

Well, maybe next post I will get back into some more nuts and bolts.

Happy trails.

iMplementing iLinx Capture – Part 2 Batch Profile

So, to recap. Configure an input source, configure a profile, the input source categorizes an image(s) into a Batch Profile. (This is my simplistic understanding)

Once the images are in the batch profile they need to be ‘indexed’. I believe that ‘indexing’ is what assigns them to a document type.

So, right click, create new batch profile. Then you will need to place a start.

In our case the first thing we needed to do was remove any blank pages. So click the QSX button, and choose the ILINX Cleanup.

Use the line tool to draw a path from the start to the ILINX Cleanup. To configure the Cleanup you simply need to right click on it and then choose configure.

In this example the remove border is checked as well as the delete blank batch pages. The 1500 bytes setting determines what is considered a blank page.
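
To make the threshold idea concrete, here is a tiny sketch. I am assuming the 1500 bytes is compared against the size of each page's compressed image data, and that anything at or under the threshold counts as blank. That is my reading of the setting, not something I have confirmed, and the function name is invented.

```python
def drop_blank_pages(pages, blank_threshold=1500):
    """Keep only pages whose image data is larger than the threshold.

    `pages` is a list of bytes objects, one per scanned page. A blank page
    compresses down to very little, so it falls under the threshold.
    """
    return [page for page in pages if len(page) > blank_threshold]


# Hypothetical page sizes: a real invoice side and the blank back of it.
scanned = [b"x" * 42000, b"x" * 600]
print(len(drop_blank_pages(scanned)))
```

If real pages start disappearing (think of a nearly empty page with one line of text), the threshold is the knob to turn down.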

The next process is to do the barcode separation. This post covers some of the options.

In our case we only really have one ‘type’ of document, our incoming vendor invoice. Unfortunately this doc is routed all over crime and creation. So let’s add the ILINX Barcode QSX and then configure it.

So this tab does a few things. It enables document separation. It detects barcodes of the 3 of 9 flavor, it deletes what it determines are separator pages, and it uses a prefix (in our case NFFC-) and assigns that to be the AP Doc document type. By using the prefix, other barcodes that don’t have the prefix will be ignored (so it won’t try to separate on a vendor’s barcode, for example).

As I mentioned earlier, we want to use the barcode to determine the ‘route’ an invoice would take. Inside of our ‘AP Doc’ document we have a Picklist configured with various routes: for example Sales, IT, Accounting etc… so then our trusty ImageSource consultant helped us configure the Barcode Recognition so that the Route read from the barcode would be inserted into the Route picklist field of our ‘AP Doc’. So, on the Barcode Recognition tab:

This screen allows you access to both Batch fields and Document fields. To configure them you simply double click on the field value you want to set. In our case, double click Route, which brings up this configuration screen. We chose to map by barcode number, so we set a 1 for it to process the first barcode that it encounters. Then we set the Prefix Filter and checked the checkbox to remove the prefix from the barcode value (since our route is named IT and not NFFC-IT).
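
In code terms, the settings on that screen boil down to something like the sketch below. This is only my mental model, not ILINX's logic. In particular I am assuming the barcode number counts only barcodes that pass the prefix filter, and the function and parameter names are made up.

```python
def route_from_barcodes(barcodes, prefix="NFFC-", barcode_number=1,
                        strip_prefix=True):
    """Pick the Nth barcode carrying our prefix and return the route value."""
    matching = [code for code in barcodes if code.startswith(prefix)]
    if len(matching) < barcode_number:
        return None                       # nothing usable on this page
    value = matching[barcode_number - 1]  # barcode_number is 1-based
    return value[len(prefix):] if strip_prefix else value


# A vendor's own barcode is ignored; our separator sheet sets the route.
print(route_from_barcodes(["0012345678905", "NFFC-IT"]))
```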

Creating barcode sheets is pretty easy. Download yourself a free 3 of 9 font, install it, open up WordPad, and type your barcode value in all CAPS, surrounded by asterisks. For example:

*NFFC-IT*

Then select the text you typed and change the font to Code 3 of 9. If you have a barcode scanning gun you can scan it to make sure it reads. Or print it, scan it, and use the ‘Test’ button on the Barcode Recognition tab, then select your type and image to see if it can read your barcode.
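
If you end up with a lot of sheets, a little helper for preparing the text saves typos. This sketch just upper-cases the value, checks it against the standard Code 39 character set, and wraps it in the asterisk start/stop characters. The function name is mine.

```python
# The 43 characters Code 39 can encode. The asterisk is reserved as the
# start/stop character, so it may not appear inside the value itself.
CODE39_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%")


def code39_text(value):
    """Return the text to type before switching to a 3 of 9 font."""
    value = value.upper()
    bad = sorted(set(value) - CODE39_CHARS)
    if bad:
        raise ValueError(f"not valid in Code 39: {''.join(bad)}")
    return f"*{value}*"


print(code39_text("nffc-it"))
```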

So that’s about it for barcodes. (Well there are many things you can do with them, but this is one configuration) I will cover some more topics next time, until then happy ECM trails.

Implementing iLinx Capture

As I mentioned in a previous post, we have had the fun of playing with the software ‘Legos’ that make up a large portion of iLinx Capture. At the heart of the matter you have the ILINX Server. This is a nice scalable service which allows you to process your image files.

The Hierarchy goes something like this. Configure an Input Source: an input source monitors a directory and upon finding a matching file type will pass that on to the assigned Batch. The Input Sources are configured after clicking on a server group, and then a corresponding server.

In order to assign a Batch though, you need to have added the batch to your ‘Profiles’ (On the right)

To configure the import just set a Polling Folder, Archive Folder, and Exception Folder. Then use the familiar *.tif wildcard and assign these incoming images to the AP Batch.
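
If you have never set up a polling import before, the moving parts look roughly like this. It is a generic sketch of the pattern, not ILINX code: the function name is invented, and handle_batch stands in for whatever hands the image to the assigned Batch.

```python
import shutil
from pathlib import Path


def poll_once(polling, archive, exception, handle_batch, pattern="*.tif"):
    """Process each matching file once: archive successes, set failures aside."""
    results = {"archived": [], "failed": []}
    for path in sorted(Path(polling).glob(pattern)):
        try:
            handle_batch(path)              # e.g. assign the image to AP Batch
        except Exception:
            # Anything that could not be imported goes to the exception folder.
            shutil.move(str(path), str(Path(exception) / path.name))
            results["failed"].append(path.name)
        else:
            # Imported fine, so keep a copy out of the way in the archive.
            shutil.move(str(path), str(Path(archive) / path.name))
            results["archived"].append(path.name)
    return results
```

Call it in a loop with a short sleep and you have the basic behavior of an input source: files that match the wildcard leave the polling folder one way or the other, and everything else is left alone.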

Now that the formalities of getting images in have been taken care of, what next? Pull out your Legos, ’cause this is where you can go crazy 🙂 Under Batch Profiles, you can configure what will happen next to your image. In my next post I will delve deeper into how we ended up configuring this. But at a conceptual level, after experimenting with various methods of separating out our invoices, we decided to go with barcode separator sheets. From simplest to most complex, some of the options are:

1) Have a button assigned to each route, and press a button to scan each separate ‘document’. Pros: The simplest approach possible, negating the need for any type of separator sheet. Cons: This can turn into a large amount of button pressing, requires a lot of ‘babysitting time’, and does not allow for high volume tasks.

2) Have a button assigned to each route, and use separator page(s) within the batch to mark the beginning of a new ‘document’. Pros: Once again, very easy to understand and implement. Cons: This scenario could easily end up creating a large number of Scan-To buttons. It requires a medium amount of ‘babysitting’ since you have to wait between different routes. It also requires pre-sorting documents into their specific routes.

3) Have a single button for the invoices. Use barcode sheets to identify the beginning of a document and what route it should be assigned. Pros: Allows you to batch up document imaging and handle higher volumes (less babysitting, more automation). Allows you to mix and match document types. Cons: Slightly more technically difficult to implement, maybe harder for users to understand, and you could end up with a lot of different barcode sheets. You also need to trust the scanner more (it needs a robust feeding mechanism as well as excellent double feed detection).

With the goal always being a higher degree of automation, we chose to go with the barcode sheet method. Our users have actually found this method to be fairly easy to understand. We had a lot of fun creating the various barcode sheets. (We gave them unique border colors as well as easily identifiable images.) These sheets are also printed on legal paper and then trimmed to where they are still an inch or so longer than regular paper, to allow them to be pulled back out easily.

Our CIO hasn’t noticed yet, but we used his effigy (with a barcode across his forehead, of course) to indicate an IT invoice. (He is our fearless leader.)

You can imagine some of the fun we have had with some of the others! One other thing to note: I would think using a barcode prefix would be considered ‘best practice’. Otherwise you might end up splitting a document off of some random vendor barcode, which is not the desired behavior.

Well, that’s enough for today. Until next time, happy ECM trails.