Accounts Payable – Optical Character Recognition

Written by Denisa Arjoca
Updated over a month ago

The primary purpose of this document is to provide all parties with guidance on the core functionality of the COINS Accounts Payable Optical Character Recognition (AP OCR) module and on the processing of emails related to invoices/credit notes only.

General Information of Functionality

  1. Invoice format files supported are PDF, JPG, PNG, GIF and TIF

  2. Multiple invoices per document are not supported by default but can be enabled by COINS on request. This applies to PDF files only.

  3. All other file types will be ignored and/or treated as “Invoice attachments” – this can include, but is not limited to, the email’s body text.

  4. ‘Actual Invoice/Credit’ files are captured (when relating to emails only*) if five (5) or more of the conditions below are met:

  • The invoice can be associated with a single KCO. 

  • The invoice can be associated with a single supplier. 

  • An invoice number is present. 

  • An invoice date is present. 

  • The OCR engine signals a document type of invoice, or the document title is matched to “invoice”, “credit” or “adjustment”. This condition adds two (2) to the condition count. 

  • An invoice total is present.

  • An invoice tax exclusive total is present. 

5. Invoice ingestion can be done either via Scanner Locations set up within COINS (refer 1.2.4 & 1.2.5) or via Mailbox (refer 1.2.1.f, 1.2.2 & 1.2.3) from the preferred email application. This should be reviewed as part of the setup process questionnaire.

6. For emails with multiple attachments, each attachment will be processed as an invoice/credit provided it is identified as an ‘Actual Invoice/Credit.’ Attachments not identified as an ‘Actual Invoice/Credit’ will be treated as invoice attachments. If no ‘Actual Invoice/Credit’ is found, the email will be ignored and/or moved to the REJECTFOLDER if one has been specified.

7. REJECTFOLDER (via Global Standard Text > PLDFTMAILBOX) is highly recommended.

8. If the invoice details read from the OCR engine identify that the invoice pertains only to TRADE purchase orders (subcontractor orders), then the invoice will NOT be classified as an ‘Actual Invoice/Credit’*. TRADE purchase orders and invoices must be processed through the COINS Subcontract Ledger.

*‘Actual Invoice/Credit’ check does not apply to invoices that are being read from a Network Scanner Location, as in this instance each file will be treated as an independent invoice.

9. Where a MATERIAL purchase order can be found, regardless of the presence of a TRADE purchase order, the invoice will be treated as an ‘Actual Invoice/Credit.’
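The condition count in item 4 can be illustrated with a minimal Python sketch. This is not the COINS implementation: the `ExtractedInvoice` class and its field names are hypothetical, and the TRADE/MATERIAL purchase order rules in items 8 and 9 are not modelled.

```python
from dataclasses import dataclass

THRESHOLD = 5  # "five (5) or more" conditions, per item 4


@dataclass
class ExtractedInvoice:
    """Hypothetical container for what the OCR engine found in one attachment."""
    single_kco: bool = False
    single_supplier: bool = False
    has_invoice_number: bool = False
    has_invoice_date: bool = False
    type_or_title_matches: bool = False  # document type or title matched
    has_total: bool = False
    has_tax_exclusive_total: bool = False


def condition_count(doc: ExtractedInvoice) -> int:
    """Count the satisfied conditions; the type/title condition counts twice."""
    count = sum([
        doc.single_kco,
        doc.single_supplier,
        doc.has_invoice_number,
        doc.has_invoice_date,
        doc.has_total,
        doc.has_tax_exclusive_total,
    ])
    if doc.type_or_title_matches:
        count += 2
    return count


def is_actual_invoice(doc: ExtractedInvoice) -> bool:
    """True when the attachment would qualify as an 'Actual Invoice/Credit'."""
    return condition_count(doc) >= THRESHOLD
```

For example, an attachment with an invoice number, an invoice date, an invoice total and a matched document title reaches a count of five and qualifies, while an attachment with only four of the single-count conditions does not.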

Configuration/Setup of Functionality

  1. GLOBAL STANDARD TEXT Maintenance:

2. PLDFTANALYSIS – If a valid Contract or Procurement department is not extracted from the invoice, and the vendor and invoice are applicable for NON-PO, the invoice will default to the values in this token. Both the Contract and the Procurement department must be valid.

3. PLDFTATTACHRES – Minimum image resolution for an image type email attachment. Supported image types are GIF, JPEG, PNG and TIFF. Attachments that do not meet the minimum width or height specified in this Global Standard Text token will be ignored. Specified as <width>x<height>. For example, 1024x768.

4. PLDFTATTACHSIZE – Minimum file size in bytes for an email attachment. Attachments with a file size greater than the specified value will be sent to the OCR engine as a suspected invoice. This does not apply to PDF files. The usual default is 10 kB (10,240 bytes).

5. PLDFTBATCHSIZE – Controls the maximum number of invoices that will be included in a single Accounts Payable invoice batch.
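The PLDFTATTACHRES and PLDFTATTACHSIZE checks described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes that PDF files always pass the size check because the size limit does not apply to them.

```python
def parse_resolution(token: str) -> tuple[int, int]:
    """Parse a '<width>x<height>' value such as '1024x768'."""
    width, height = token.lower().split("x")
    return int(width), int(height)


def image_meets_resolution(width: int, height: int, token: str) -> bool:
    """An image is kept only if both its width and height meet the minimum."""
    min_width, min_height = parse_resolution(token)
    return width >= min_width and height >= min_height


def passes_size_check(size_bytes: int, min_size_bytes: int, is_pdf: bool = False) -> bool:
    """Attachments larger than the minimum are treated as suspected invoices."""
    if is_pdf:
        # Assumption: the size limit is simply skipped for PDF files.
        return True
    return size_bytes > min_size_bytes
```

With a PLDFTATTACHRES value of 1024x768, a 1280x720 image would be ignored because its height is below the minimum.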

  • PLDFTEMAILBLACKLIST – Comma separated list of email addresses that are known not to contain invoice file attachments. Email originating from these email addresses will be ignored.

  • PLDFTEMAILWHITELIST – Similar to the black-list, the white-list allows the specification of email sender addresses where it is known that the associated supplier’s email address will not match the sender of invoice emails. For example, a supplier using MYOB to send invoice emails may appear as “[email protected]”. It is highly likely that this email will be rejected due to no supplier having the email domain “apps.myob.com”. In this case the white-list could be set to *@apps.myob.com to allow any sender from the apps.myob.com email domain.


    NOTE: The use of CAN-DO list pattern matching should be used judiciously in the white-list token. Email scammers are known for making subtle changes to email sending addresses to fool recipients. Where possible, use specific email addresses to limit the risk of accepting a trojan email.
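As a rough illustration of the white-list pattern matching described above, the sketch below uses Python's `fnmatch` module as a stand-in for CAN-DO matching. It assumes that only the `*` wildcard is used and that matching is case-insensitive; the sender addresses in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase


def sender_allowed(sender: str, whitelist: str) -> bool:
    """Return True if the sender matches any comma separated white-list pattern."""
    sender = sender.strip().lower()
    patterns = [p.strip().lower() for p in whitelist.split(",") if p.strip()]
    return any(fnmatchcase(sender, pattern) for pattern in patterns)
```

With a white-list of `*@apps.myob.com`, a sender such as `noreply@apps.myob.com` is accepted, while a look-alike domain such as `apps.my0b.com` is not. A wildcard covering the whole domain still accepts every sender in that domain, which is why specific addresses are preferred where possible.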

  • PLDFTERRORTOKENS – Can-do list of weight matching tokens, defined in 1.3.1 Invoice Header Level Matching, that can be used to ensure certain key data points match between the data coming from OCR to that held in Coins ERP+. Where any of the listed matching tokens do not match exactly with the data held in Coins, the invoice will automatically be placed in an error batch. Currently, only the SABN, SBAC and SEMA tokens are supported. SEMA only applies to invoices receipted through a mailbox and the Supplier’s email domain is compared to the sender of the invoice email.

  • PLDFFILEFILTER – Can-do list of text strings to match against attachment file names, otherwise known as ‘File Name Filtering’. If any attachment file name matches one of the matching strings, then only attachments with matching file names will be treated as invoices. Other attachments will be classified as “Invoice Attachments” and not subjected to the OCR process.

  • PLDFTMAILBOX – Set of mailboxes that can be used in the OCR process. Multiple mailbox configurations can be stored in this data, separated by commas. Each mailbox configuration is comprised of several components. These are shown below:
    <MailboxName>=<UserUPN>[|UNREAD][|DONEFOLDER=<folder>][|REJECTFOLDER=<folder>][|APPID=<app_name>] 
    Where:

<MailboxName> 

A free text description of the mailbox, for example: Accounts Payable TEST Mailbox 

<UserUPN> 

The email address of the mailbox, for example: [email protected]

UNREAD 

If DONEFOLDER=<folder> is NOT provided, then this can be used to limit the OCR processing to only process unread email. Once processed the email message will be marked as read. 

DONEFOLDER=<folder> 

If DONEFOLDER=<folder> is provided, then the OCR process will move processed email messages into the mailbox folder specified by the <folder> token. 

The folder must exist in the defined mailbox. 

When moving email to a “done” folder, all email in the mailbox’s “Inbox” folder will be processed. Folder names are in the form: Inbox/Processed Invoices/Put them in here. 

The ‘/’ characters separate the folder names in a hierarchy. 

REJECTFOLDER=<folder> 

If REJECTFOLDER=<folder> is provided, then the OCR process will move any mail items that contain attachments where no attachment is identified as a supplier invoice. By moving the mail items, repeated runs of the OCR process will not continually send the same email attachments to the OCR engine when they will never be processed. 

The folder must exist in the defined mailbox. 

APPID=<app_name>

If APPID=<app_name> is provided, then the OCR process will use the supplied <app_name> instead of the Azure Application specified in the Standard Text token PLDFTMBOXAPP, for this mailbox only. Refer to section 1.2.3 regarding the setup of Azure Applications.
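To make the PLDFTMAILBOX format above concrete, here is a minimal parsing sketch. It is not the COINS parser: it assumes that mailbox names and folder names contain no commas or `|` characters, and the mailbox names, addresses and app name in the usage note are hypothetical.

```python
def parse_mailboxes(value: str) -> list[dict]:
    """Split a PLDFTMAILBOX value into one dictionary per mailbox configuration."""
    mailboxes = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        head, *options = entry.split("|")
        name, _, upn = head.partition("=")
        box = {
            "name": name.strip(),
            "upn": upn.strip(),
            "unread": False,
            "done_folder": None,
            "reject_folder": None,
            "app_id": None,
        }
        for option in options:
            key, _, val = option.partition("=")
            key = key.strip().upper()
            if key == "UNREAD":
                box["unread"] = True
            elif key == "DONEFOLDER":
                # '/' separates the folder names in the hierarchy.
                box["done_folder"] = val.strip().split("/")
            elif key == "REJECTFOLDER":
                box["reject_folder"] = val.strip().split("/")
            elif key == "APPID":
                box["app_id"] = val.strip()
        mailboxes.append(box)
    return mailboxes
```

For example, `AP Test=ap@example.com|DONEFOLDER=Inbox/Processed Invoices|REJECTFOLDER=Inbox/Rejected` yields one mailbox whose done folder is the hierarchy `Inbox` > `Processed Invoices`.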

  • PLDFTMATCHTOKENS – A Can-do list of the weight matching tokens, defined in 1.3.1 Invoice Header Level Matching, to use as an override for the default invoice header matching data points. By default, all data points listed in 1.3.1 Invoice Header Level Matching will be used to determine an invoice’s Company and Supplier combination. Defining this Global Standard Text token will allow the default set of data points to be overridden as required for the OCR implementation. Any matching tokens that do NOT match to the supplied Can-do list, will cause their associated data points to be removed from the Company and Supplier weightings system.

  • PLDFTMBOXAPP – The Azure Application code that contains the Microsoft Azure Application details (Tenant ID, Application (Client) ID and Secret Value). Refer to section 1.2.3, which covers the entry of Azure Entra App details.

  • PLDFTMINWEIGHT – The minimum supplier weighting that an invoice must achieve to be assigned to a Company and Supplier combination. Supplier weighting values are documented in the report logs associated with the OCR run. If the maximum Supplier weighting does not achieve the minimum weighting value set in this Global Standard Text token, then the invoice will not be allocated to a Company or Supplier and will be added to an error batch in the currently selected company. Where the supplier weighting value equals or exceeds the minimum weighting value, the invoice will be allocated to the Company and Supplier combination with the highest weighting. The more data that can be used to match an invoice to a Company and Supplier combination, the higher this figure can be set. The value used for the minimum weighting must be based on the weightings being achieved within the Coins installation.

  • PLDFTPAGELIMIT – A positive integer value that limits the number of pages that the OCR engine will interrogate when extracting data from a PDF file. This is useful for limiting how much data is read from large invoice files and can help to avoid request timeouts where invoice documents contain more than fifteen (15) pages. This applies to ALL invoice documents. Company specific token settings will only take effect where the OCR process is invoked from within the specific company.

  • PLDFTTXTTYPE – Coins batch type mapping for each type of invoice that can be created during the OCR process. Comma separated list.

  • PLDFTWEIGHTVAL – A comma separated list of matching token weighting overrides. The matching tokens defined in 1.3.1 Invoice Header Level Matching can have their default weighting values overridden by using this Global Standard Text token. For example, the default weighting values for the “Purchase Order” (SPOR) data point is 100 for a direct match and 50 for a fuzzy match. This could be overridden by adding an override like SPOR=200|75. This would increase the weighting value to 200 for a direct match and 75 for a fuzzy match. Where a data point uses only one value, then only one figure is required for the override. For example, the “Supplier ABN” data point can be overridden using SABN=75. Weighting values can be increased or decreased but must be whole numbers, zero (0) or greater.
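The PLDFTWEIGHTVAL override format can be sketched in Python as below. The parser is illustrative only and the function name is hypothetical; it follows the rules stated above (comma separated entries, `|` between the direct and fuzzy values, whole numbers of zero or greater).

```python
def parse_weight_overrides(value: str) -> dict[str, tuple[int, ...]]:
    """Parse a value such as 'SPOR=200|75,SABN=75' into token -> weightings."""
    overrides = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        token, _, raw = item.partition("=")
        # int() rejects anything that is not a whole number.
        weights = tuple(int(part) for part in raw.split("|"))
        if any(weight < 0 for weight in weights):
            raise ValueError(f"Weighting for {token} must be zero or greater")
        overrides[token.strip().upper()] = weights
    return overrides
```

Parsing `SPOR=200|75,SABN=75` gives a direct match weighting of 200 and a fuzzy match weighting of 75 for SPOR, and a single weighting of 75 for SABN.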

6. Configure Mailbox setup: the client will need to create an Active Directory / Entra App that provides Application Mail.ReadWrite access to the OCR mailboxes. By default, this access will apply to all mailboxes within the client’s domain. The mailbox access can be limited using the Azure CLI from PowerShell (see https://docs.microsoft.com/en-us/graph/auth-limit-mailbox-access). Once the App has been created, a Client Secret will need to be added to the App. The name of the secret is not important; however, we recommend using the secret name “AAD_SECRET”.

7. Once AD App has been created the customer will need to register the Azure Application with Coins. This is done using the System → System Setup → Azure Applications page within Coins ERP+.

Add a new Azure Application with a unique application name. The application name that is used in the Code field is the value that needs to be referenced by the PLDFTMBOXAPP Standard Text token or by the APPID attribute in the PLDFTMAILBOX mailbox specifications. Enter the:

  • Tenant ID

  • Application (client) ID

  • Value of the AAD_SECRET (not the secret GUID, the actual value that you only get to see once)

  • Permissions Type must be “Application”

  • Enter the date that the generated secret will expire. Providing this date will allow Coins to send notifications to system administrators regarding upcoming secret expirations.

8. Configuring Network File Location will be managed using Shared Folders & Network Scanner Locations. This should be undertaken with COINS consultant/technical lead via Document Management > Setup > Configuration.

9. Shared Folders are required to be set up to allow end users to upload invoices to the COINS server. This should be undertaken with a COINS consultant/technical lead via System > Setup > Shared Folders. Shared Folder Function access should be reviewed as part of the setup process questionnaire. SF* in the Read and Write Function denotes Shared Folder or BSPPLOCR.

10. Ingestion of invoices is recommended to be setup to run twice a day (morning and noon)

11. A Company Group called OCRPL will assist with limiting the companies that OCR (invoices) can be searched and processed for.

12. If multiple companies are using OCR please ensure each company is setup (ticked), and also ensure company is LIVE in System > Company Setup > LIVE Companies

13. AP Batch Header description is setup/maintained via System > System Setup > Field Default Maintenance. Field name is RS_CobDesc and Default should be AP Invoice Batch {date} - {moniker}.

14. Invoice Header Descriptions will default to the Purchase Order Header Description if a Purchase Order is quoted/captured. If nothing is entered in this field, then the Invoice Header Description and subsequent invoice lines will use the corresponding Invoice Batch Header Description. If the batch is an Error Batch, this will always be blank. The purchase order header description field can be made mandatory via System > System Setup > Field Access Maintenance. The field code is poh_desc.

15. For Errored Batch invoices, if Contract/Project number is quoted on the Invoice, COINS will allocate to the contract, or else allocate to the PLDFTANALYSIS.

16. Document Management > Setup > Setup Maintenance > Document Category Maintenance> P/L Invoice&ap_invoice - Supplier avm_num needs to be unticked (not mandatory).

OCR Matching Capabilities

Matching an invoice to a COINS KCO and Supplier is done by comparing the retrieved invoice data with data stored in COINS at the Vendor, Purchase Order (PO) and/or Goods Receipt Note (GRN) level.

The process uses a weighting mechanism to attempt to match the invoice data with the best possible data match in COINS. 

All weightings are aggregated, and the highest weighted company, supplier and/or purchase order will be selected.

If one supplier accumulates 150 points more than the next closest supplier, then that supplier and the associated company will be selected. If two suppliers score the same weighting, then supplier selection will fail. 

Where sectors are being matched along with the main entity details in Company Configuration, then the highest weighting will be used, either from one sector or the company configuration. The individual entity is allocated only one weighting and that may come from the sector details, not the company details. 
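The selection rules above can be sketched as a small Python function. This is a simplified illustration rather than the COINS algorithm: the supplier codes in the usage note are hypothetical, the `min_weight` argument stands in for the PLDFTMINWEIGHT token, and the 150 point lead is not modelled because the text does not state what happens when the lead is smaller.

```python
from typing import Optional


def select_supplier(weights: dict[str, int], min_weight: int = 0) -> Optional[str]:
    """Pick the supplier with the highest aggregated weighting.

    Returns None when there are no candidates, when the best weighting is
    below the minimum, or when the two highest weightings are tied.
    """
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    best_supplier, best_weight = ranked[0]
    if best_weight < min_weight:
        return None  # the invoice would go to an error batch
    if len(ranked) > 1 and ranked[1][1] == best_weight:
        return None  # equal weightings: supplier selection fails
    return best_supplier
```

For example, with weightings of 350 and 150 and a minimum of 200, the supplier scoring 350 is selected; if both suppliers score 350, no supplier is selected.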

Invoice Header Level Matching

This table is designed to provide guidance on the field names that will be matched at an invoice header level.

| Invoice Data | Matching Token | Entity Weighting | Supplier Weighting | Commentary |
| --- | --- | --- | --- | --- |
| Bill To ABN | EABN | 100 | | The COINS entity’s ABN matched to either a Sector record or a Company Configuration record. |
| Bill To Name | ENAM, ELEG | Up to 100 (ENAM); up to 100 (ELEG) | | Will be matched to the entity Name in either a sector or the entity Company Configuration record. Up to 100 points will be allocated depending on the number of matched words and the word ordering on the invoice as compared to COINS. Will also be matched to the entity legal name in Company Configuration and will add up to an additional 100 points depending on matched words and their ordering. |
| Bill To Address | EADD | Up to 75 | | Will be matched to the entity address in either the sector or the entity Company Configuration record. Up to 75 points will be allocated depending on the number of words matched in the addresses and the ordering of the words. |
| Bill To Postcode | EPCO | 25 | | Will be matched to the entity postcode in either the sector or the entity Company Configuration record. |
| Contract Code | EJOB | 100 | | If the supplier’s invoice contains a valid COINS contract code. |
| Supplier ABN | SABN | 100 | 100 | Matched to the supplier ABNs in COINS. |
| Supplier Name | SNAM | | Up to 100 | Matched to the supplier’s name in COINS. Points are allocated based on the number of matched words and the order in which they were matched. |
| Supplier Address | SADD | | Up to 75 | Matched to the supplier address in COINS. Points are allocated based on the number of matched words and the order in which they were matched. |
| Supplier Postcode | SPCO | | 25 | Matched to the supplier postcode in COINS. |
| Supplier Email | SEMA | | 30 or 50 | Matched to the supplier email in COINS. If no direct match can be made on the full email address, then an attempt will be made to match only the email domain. (50 points for a full match; 30 points for a domain match; the domain is the text after the ‘@’ in an email address.) |
| Supplier Phone | SPHO | | 50 | Matched to the supplier phone number. |
| Supplier Account | SACC | 100 | 50 | Matched to the “Supplier Code for Us” field in COINS on the supplier record. |
| Supplier Bank Account Name | SBAN | | 100 | Matched to the “Payee Name” field in COINS on the supplier record. |
| Supplier Bank Name | SBNA | | 50 | Matched to the supplier bank name field in COINS, which is generated when the BSB number is entered for electronic payment methods. |
| Supplier BSB and Account | SBAC | 100 | 100 | Matched to the supplier BSB and account number fields in COINS; both must be matched together. |
| Purchase Order | SPOR | 100 or 50 | 100 or 50 | Matched to a valid COINS material purchase order. A match weight of 50 will be assigned when the PO number matched is not identical to what was read from the OCR engine. For example: xOCR/MO001 will match to: xOCR/M0001 but the “O” is replaced by a “0” to find a match. The purchase order weighting may be applied more than once, if there are valid, unique purchase order numbers identified for individual invoice lines. |
| Delivery Docket | SGRN | 200 or 100 | 200 or 100 | Matched to a valid Coins goods receipt note record that contains the supplier delivery docket number. A match weight of 200 will be assigned if the delivery docket record in Coins has also been entered against the same purchase order as identified in purchase order matching. Otherwise a weighting of 100 is used. The delivery docket weighting may be applied more than once, if there are valid, unique delivery docket reference numbers found for individual invoice lines. |

Use the report log file – found in the Report Status screen – to assist with determining how the OCR process calculated supplier and company weighting values to select a supplier and company combination.
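As a worked illustration of one row of the header table, the Supplier Email (SEMA) weighting can be sketched as below. The function name is hypothetical, and case-insensitive comparison is an assumption that the text does not confirm.

```python
def email_weight(invoice_email: str, supplier_email: str) -> int:
    """Return 50 for a full address match, 30 for a domain-only match, else 0."""
    invoice_email = invoice_email.strip().lower()
    supplier_email = supplier_email.strip().lower()
    if "@" not in invoice_email or "@" not in supplier_email:
        return 0  # not a usable email address
    if invoice_email == supplier_email:
        return 50
    # The domain is the text after the '@'.
    if invoice_email.rpartition("@")[2] == supplier_email.rpartition("@")[2]:
        return 30
    return 0
```

Under these assumptions, `accounts@supplier.example` against `sales@supplier.example` scores 30, and identical addresses score 50.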

Invoice Line Level Matching

This table is designed to provide guidance on the field names that will be matched at an invoice line level. Combinations are also provided once these fields are captured.

| Line-Item Field | Line Weighting | Commentary |
| --- | --- | --- |
| PO (Purchase Order) Number | | Matched to a valid COINS material purchase order. A match weight of 50 will be assigned when the PO number matched is not identical to what was read from the OCR engine. For example: xOCR/MO001 will match to: xOCR/M0001 but the “O” is replaced by a “0” to find the match. |
| Line Description | 100 – 300; line descriptions will weight higher when more words are found | Weighted match, based on the number of words matched from the invoice line to a purchase order line. Additional weighting points are gained where the words are matched in a similar order. Perfect matches will return the perfect match immediately. |
| Supplier Product Code | max 100; if the same product code is found in multiple purchase order lines, then weighting is reduced | As read from the supplier invoice, adding the supplier product code to the Purchase Order line description will aid in selecting an appropriate PO line item. Refer above item for this. |
| Price Invoiced | 25 | Price on the invoice line matches price on the PO line |
| Invoice Line GRN Number | | Could be used to reduce the PO line items to allow in the search, but not yet implemented. |
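The purchase order number example above (xOCR/MO001 matching xOCR/M0001) can be sketched as follows. This is a guess at the general idea only: it handles just the letter “O” versus digit “0” confusion mentioned in the text, ignores letter case for the fuzzy comparison as an assumption, and uses the default header weightings of 100 and 50.

```python
def po_match_weight(ocr_po: str, coins_po: str, direct: int = 100, fuzzy: int = 50) -> int:
    """Return the direct weighting for an identical PO number, the fuzzy
    weighting when the numbers differ only by 'O'/'0' (or case), else 0."""
    if ocr_po == coins_po:
        return direct

    def normalise(value: str) -> str:
        # Treat the letter O and the digit 0 as the same character.
        return value.upper().replace("O", "0")

    if normalise(ocr_po) == normalise(coins_po):
        return fuzzy
    return 0
```

Here `po_match_weight("xOCR/MO001", "xOCR/M0001")` returns the fuzzy weighting of 50, while two identical PO numbers return 100.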

Invoice Line Level Combinations for 3-way Matching

After using the above data to support Purchase Order line item matching, the system will go through nine (9) rounds of checks for 3-way matching.

Combinations will follow in the below order:

  1. Invoice Line GRN Number + Invoice Line Delivery Date

  2. Invoice Line GRN Number + Invoice Delivery Date

  3. Invoice Line GRN Number + Invoice Date

  4. Invoice Header GRN Number + Invoice Line Delivery Date

  5. Invoice Header GRN Number + Invoice Delivery date

  6. Invoice Header GRN Number + Invoice Date

  7. Any GRN for PO Number + Invoice Line Delivery Date

  8. Any GRN for PO Number + Invoice Delivery Date

  9. Any GRN for PO Number + Invoice Date
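The nine rounds above are every pairing of three GRN sources with three date sources, tried in order. A short Python sketch, assuming a hypothetical `try_match` callback that reports whether a given pairing finds a match:

```python
from itertools import product
from typing import Callable, Optional

GRN_SOURCES = [
    "Invoice Line GRN Number",
    "Invoice Header GRN Number",
    "Any GRN for PO Number",
]
DATE_SOURCES = [
    "Invoice Line Delivery Date",
    "Invoice Delivery Date",
    "Invoice Date",
]

# product() varies the date source fastest, which reproduces rounds 1 to 9.
ROUNDS = list(product(GRN_SOURCES, DATE_SOURCES))


def first_matching_round(try_match: Callable[[str, str], bool]) -> Optional[int]:
    """Return the 1-based number of the first round that matches, or None."""
    for number, (grn_source, date_source) in enumerate(ROUNDS, start=1):
        if try_match(grn_source, date_source):
            return number
    return None
```

The sketch stops at the first successful round, on the assumption that earlier combinations are preferred over later ones.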

Aggregating of Lines on Invoice

COINS has inbuilt functionality to roll up invoice lines on an invoice, but only when the criteria below are met:

  • Prices match

  • Units of Quantity match

  • Product code and first forty (40) characters of description match

When using 2-way matching, the invoice lines will be aggregated to a purchase order line.

When using 3-way matching, the invoice lines will be aggregated to a purchase order line within a GRN.
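The roll-up criteria can be illustrated with the sketch below. The dictionary keys are hypothetical, and summing the quantities is an assumption about what rolling up means here; the purchase order line or GRN scoping described above is not modelled.

```python
def aggregate_lines(lines: list[dict]) -> list[dict]:
    """Roll up invoice lines that share price, unit, product code and the
    first forty characters of the description."""
    groups: dict[tuple, dict] = {}
    for line in lines:
        key = (
            line["price"],
            line["unit"],
            line["product_code"],
            line["description"][:40],
        )
        if key in groups:
            # Assumption: rolling up means adding the quantities together.
            groups[key]["quantity"] += line["quantity"]
        else:
            groups[key] = dict(line)  # copy so the input is not modified
    return list(groups.values())
```

Two lines for the same product at the same price and unit collapse into one line with the combined quantity, while a line with a different price stays separate.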

Invoice Extras Matching

From COINS v12.03 onwards, invoice extra lines are also able to be mapped by the OCR engine, for example:

The OCR engine will not learn extra charges outside of freight/delivery fees, however these additional charge amounts can be mapped in COINS ERP+ by supplier account.

Use the Supplier Invoice Extras Mapping menu option, found with the other OCR menu options in the COINS side menu, to define how these extras will be found and how they are to be processed into COINS. Invoice extra amount mappings must be defined for each supplier, where the supplier invoice format contains additional charge amounts.

Company and Supplier Account

If the supplier account code is consistent across all companies using OCR, then the Company selection can be left at “-All Companies”. If the supplier account code varies between companies using OCR, then select the company (KCO) and supplier account code that the extras mapping will apply to.

Search Terms

A comma separated string of terms to use to search for invoice extra amounts. In the above example, the search term “Consumables” can be used to find the consumables total of $90.00 and the search term “Environmental Levy” can be used to find the levy of $33.44. Search terms of “Freight,Delivery,Pickup” would search for invoice extra amounts associated with either “Freight”, “Delivery” or “Pickup”. All matched invoice extra amounts would be treated in the same way.

Text Search

If checked, the invoice sub-total text area will be searched for the entered Search Terms. If found, the value to the immediate right of the search term will be returned.

If not checked, the invoice lines will be matched to the supplied search terms and if a match found, then the invoice line will be treated as an invoice extra amount. This can be useful where a supplier might provide multiple invoice extra amounts as different invoice lines with different descriptions, however the purchase order is only entered with a single, specific “Delivery” line item for invoice costing.
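The Text Search behaviour (checked) can be sketched with a regular expression that returns the amount immediately to the right of each search term. This is an approximation only: the function name is hypothetical, and the pattern assumes amounts are written with an optional `$`, optional thousands separators and an optional decimal part.

```python
import re
from decimal import Decimal


def find_extra_amounts(subtotal_text: str, search_terms: str) -> dict[str, Decimal]:
    """Return the amount found immediately to the right of each search term."""
    found = {}
    for term in (t.strip() for t in search_terms.split(",")):
        if not term:
            continue
        pattern = re.compile(
            re.escape(term) + r"\s*:?\s*\$?\s*([0-9][0-9,]*(?:\.[0-9]+)?)",
            re.IGNORECASE,
        )
        match = pattern.search(subtotal_text)
        if match:
            found[term] = Decimal(match.group(1).replace(",", ""))
    return found
```

Using the amounts from the Search Terms description, a sub-total area containing `Consumables $90.00` and `Environmental Levy $33.44` would return 90.00 and 33.44 for the search terms `Consumables,Environmental Levy`.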

Find

The invoice extra amounts can be mapped to either:

  • Commodity extras – currently not supported, please do not use.

  • Purchase order lines – the matching purchase order must contain a specific purchase order line for the identified invoiced extra amount.

Look for

The value entered in the look for field will depend on whether the invoice extra amount is being mapped to a commodity extra or a specific purchase order line.

  • Commodity extra – enter the commodity extra code, found in the Commodity Extras lookup table, to process the invoice extra amount as.

  • Purchase order line – enter a matching string to use to find the purchase order line that the invoice extra amount should be matched with. Purchase orders raised for this supplier should contain a specific purchase order line where the description of the purchase order line can be matched to the string entered here. When the invoice is processed through OCR, invoice extras of this type will be matched to the specific purchase order line.

Setup Questionnaire/Checklist

OCR Setup Questionnaire/Checks

| Question/Check | Owner | Answer |
| --- | --- | --- |
| Client KCo’s (COINS Company Numbers) which will be set up for OCR | Client | |
| Scanner Locations – Central or Per User | Client | |

Via Network Scanner Location

Scanner location must be setup for this to work. Refer to 1.1.11.

  1. Invoices should be stored in Report Status > My Files, under the folder specified for invoice scanning.

2. Click NEXT

3. Navigate to Accounts Payable > Custom AP Menus/Extended Functionality> Invoice Processing > Import Invoices – OCR

4. Select Load invoice images from a Network Scanner Location

5. Check Scanner Location file is correct.

6. Choose the relevant process applicable to your business.

7. Click NEXT.

8. Check/Action batches/invoices loaded via Accounts Payable > Invoices > Enter Invoices.

9. To assist with validating and reviewing invoices with Matching Issues, Errors, etc., use the reports created during upload via Report Status. The file output will be called Import Invoices – OCR.

Via Mailbox

Mailbox must be setup via Global Standard Text – PLDFTMAILBOX for this to work.

  1. Navigate to Accounts Payable > Custom AP Menus/Extended Functionality> Invoice Processing > Import Invoices – OCR

  2. Select Extract invoice images from a Mailbox.

  3. Check Mailbox

  4. Choose the relevant process applicable to your business.

  5. Click NEXT

  6. Continue with Steps 8 & 9 from 2.1.

Post Process Review

When invoices go into a Matching Issues (Matching Imbalance) and/or Errored batch, users should navigate to the Report Status queue and review the Import Invoices – OCR PDF document, which provides further guidance on the issue/error so that the invoices can be actioned.

  1. Navigate to Home > Report Status or the Printer Button.

2. Select the Import Invoices – OCR hyperlink, to review the report.

3. The PDF will open in your default PDF application. The first and last pages will be a summary of what has been processed.

4. Pages between will give details on errors/issues as advised.
