As there always seems to be some confusion when it comes to training & learning in KTM, I decided to give it some thought and draw a diagram. There is one explanation available in the KTM help; however, you can only find it when searching for “Knowledge Base”. As not all knowledge that is available to KTM is inside a knowledge base (i.e. the compiled binaries), this is somewhat confusing.
Let’s start with a new project, and let’s imagine you use a group locator (it does not matter whether you use the trainable one or the invoice, amount, or order group locator). Now, the first thing you do as you train your project is use the Project Builder to train documents (the world-famous F10 dialogue, for which they could have come up with a better name). What you create when using the Project Builder is basically something I’ll refer to as “Offline Learning”. I think the name is appropriate, because that knowledge won’t be active until you re-synchronize your KTM project with Kofax Capture (unless you switch that behaviour off in the extended synchronization settings).
Offline Learning
The documents that you train in that manner will go to the training set, a folder inside your project on the file system. If you train classification, you’ll find the document under learn\layout\{class name}; if you train for extraction, it is under training\{class name}. Plus, there are Knowledge Bases: compiled binary versions of that information, stripped of the image information and thus much smaller in size.
In the Project Builder, you are also given the possibility to turn your trained information from the Training Set into a Knowledge Base (KB). However, this comes at a price: you cannot access the documents or any other information that resides inside that KB. That may be the desired behaviour if you want to provide new customers with generic knowledge that is based on many of your customers’ documents, since no one will be able to access them. If you never plan to roll out or sell the process you set up to anybody else, I’d stick with Training Folders and prefer them over KBs, as I can always modify them. There is one great Application Note from Kofax available that describes when you want to go for a KB, and when not to. You may find it under this link. Kofax has not updated it for KTM 5.5 or even 6, but it is still valid. It also describes the differences between Generic and Specific learning in great detail.
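To make the two training-set locations a bit more tangible, here is a minimal Python sketch that builds both folder paths for a given class. The project directory and the class name are made-up examples, and the helper `training_set_paths` is my own illustration, not something KTM provides; only the `learn\layout\{class name}` and `training\{class name}` parts come from what I described above.

```python
from pathlib import PureWindowsPath


def training_set_paths(project_dir: str, class_name: str) -> dict:
    """Return the per-class training-set folders inside a project directory.

    Hypothetical helper: it only joins path segments and does not
    touch the file system or any KTM component.
    """
    root = PureWindowsPath(project_dir)
    return {
        # Documents trained for classification
        "classification": root / "learn" / "layout" / class_name,
        # Documents trained for extraction
        "extraction": root / "training" / class_name,
    }


# Example project directory and class name are assumptions for illustration.
for purpose, folder in training_set_paths(r"C:\Projects\Invoices", "Invoice").items():
    print(f"{purpose}: {folder}")
```

Using `PureWindowsPath` keeps the backslash separators even if you run the sketch on a non-Windows machine.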
All the knowledge that you added (training folder plus Knowledge Base) can be applied by the KTM server, but you usually need to synchronize your KTM project first.
Online Learning
Online Learning is quite a convenient feature if you want to see improvement on the fly, without the need to go to the Project Builder and update your training folder. This approach utilizes the Knowledge Base Learning Server (KBLS), and the project must be set up to support that feature. When somebody marks a document in Validation, the KBLS will put the document into a predefined location on the file share. This feature is great for specific online learning, e.g. invoices, where a validation clerk will instantly train a new supplier (layout classification plus extraction information). One thing that is vital to know, and that not many are aware of: there is no such thing as generic online learning! For an invoice scenario, that means there will be no improvement on unknown suppliers (not sure about KTM 6.0, though). You can mark them in Validation, but they must be handled with the Project Builder later on!
All the knowledge added by the KBLS will be available to the KTM server instantaneously.
Putting everything together
Using the Project Builder, you can import documents from online learning into the project. As stated earlier, there is no generic online learning, so you must use that approach to train (or better: review) those documents. Please note: once you have imported the documents that way, they’ll be gone from the online learning folder. And if you do not re-synchronize your project with Capture, that knowledge will not be applied!
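The interplay described above can be sketched as a small toy model in Python. To be clear, this is a simplification of my own: the class `KnowledgeModel` and all of its methods are hypothetical names, nothing here calls KTM, and it assumes the default behaviour where offline knowledge only becomes active after synchronization.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeModel:
    """Toy model of where knowledge lives and when the server applies it."""
    offline_pending: set = field(default_factory=set)  # trained, not yet synchronized
    offline_active: set = field(default_factory=set)   # synchronized with Capture
    online_folder: set = field(default_factory=set)    # written by the KBLS

    def train_in_project_builder(self, doc: str) -> None:
        self.offline_pending.add(doc)

    def mark_in_validation(self, doc: str) -> None:
        # Online learning: available to the server right away
        self.online_folder.add(doc)

    def import_online_documents(self) -> None:
        # Importing removes the documents from the online learning folder
        self.offline_pending |= self.online_folder
        self.online_folder.clear()

    def synchronize(self) -> None:
        self.offline_active |= self.offline_pending
        self.offline_pending.clear()

    def active_knowledge(self) -> set:
        return self.offline_active | self.online_folder


model = KnowledgeModel()
model.mark_in_validation("new_supplier.tif")
print("new_supplier.tif" in model.active_knowledge())  # active via online learning
model.import_online_documents()
print("new_supplier.tif" in model.active_knowledge())  # imported, but not synchronized
model.synchronize()
print("new_supplier.tif" in model.active_knowledge())  # active again after sync
```

The middle step is the one that tends to surprise people: between the import and the next synchronization, the document is in neither active location.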
In addition, I hope it’s clearer now that knowledge does not have to reside in a knowledge base; rather, everything, online and offline learning alike, represents usable knowledge (marked with a grey border in the image) that KTM will apply.