AI Document Types
Umango can use artificial intelligence (AI) to analyze documents and automatically capture data based on a document type. Document Types are assigned to jobs, which use the AI training to determine what data to capture and how to capture it.
Pre-trained Document Types cover documents that are common to many businesses and industries, and are flexible in the variations and data fields they capture. They are shipped as standard in Umango and do not require any further training.
Custom Document Types can be trained on your own document samples. To use a custom document type, you must complete the training process before assigning it to your jobs.
The available document types are listed below.
Document Type |
Description |
Languages Supported |
Custom Trained |
Train and name your own document type using your own sample documents. Although custom document types require an AI training process, the results are often significantly more accurate and reliable than those from the pre-trained Structured Documents type or traditional zone-based OCR capture. Neural Trained: documents can be semi-structured; training takes 20 minutes to 1 hour; signature fields are not supported; fields can overlap. Template Trained: documents must have a consistent structure; training takes 1 to 5 minutes; signature fields are supported; fields cannot overlap. |
Neural & Template Handwritten Text: English, Chinese (simplified), French, German, Italian, Japanese, Korean, Portuguese, Spanish. Neural Trained Machine Text: Afrikaans, Albanian, Arabic, Bulgarian, Chinese (simplified), Chinese (traditional), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Macedonian, Marathi, Modern Greek, Nepali, Norwegian, Punjabi, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali (Arabic), Somali (Latin), Spanish, Swahili, Swedish, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese. Template Trained Machine Text: virtually all languages. |
Invoices/Purchase Orders |
Extract invoice ID, customer details, vendor details, ship to, bill to, total tax, subtotal, line items and more. Detailed tax information is included for India, Germany, Spain, Portugal and Canada. |
Albanian, Arabic, Bulgarian, Chinese (simplified), Chinese (traditional), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hungarian, Icelandic, Italian, Japanese, Korean, Latvian, Lithuanian, Macedonian, Malay, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian (Cyrillic), Serbian (Latin), Slovak, Slovenian, Spanish, Swedish, Thai, Turkish, Ukrainian, Vietnamese |
Receipts |
Extract the time and date of the transaction, merchant information, tax amounts, totals and more. |
Afrikaans, Akan, Albanian, Arabic, Azerbaijani, Bamanankan, Basque, Belarusian, Bhojpuri, Bosnian, Bulgarian, Catalan, Cebuano, Corsican, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Fijian, Filipino, Finnish, French, Galician, Ganda, German, Greek, Guarani, Haitian Creole, Hawaiian, Hebrew, Hindi, Hmong Daw, Hungarian, Icelandic, Igbo, Iloko, Indonesian, Irish, isiXhosa, isiZulu, Italian, Japanese, Javanese, Kazakh, Kazakh (Latin), Kinyarwanda, Kiswahili, Korean, Kurdish, Kurdish (Latin), Kyrgyz, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Maltese, Maori, Marathi, Maya, Yucatán, Mongolian, Nepali, Norwegian, Nyanja, Oromo, Pashto, Persian, Persian (Dari), Polish, Portuguese, Punjabi, Quechua, Romanian, Russian, Samoan, Sanskrit, Scottish Gaelic, Serbian (Cyrillic), Serbian (Latin), Sesotho, Sesotho sa Leboa, Shona, Slovak, Slovenian, Somali (Latin), Spanish, Sundanese, Swedish, Tahitian, Tajik, Tamil, Tatar, Tatar (Latin), Thai, Tongan, Turkish, Turkmen, Ukrainian, Upper Sorbian, Uyghur, Uyghur (Arabic), Uzbek, Uzbek (Latin), Vietnamese, Welsh, Western Frisian, Xitsonga |
Business Cards |
Extract person name, job title, address, email, company, and phone numbers from business cards. |
English, Japanese |
ID Cards |
Extract name, expiration date, machine-readable zone, and more from passports, driver licenses and ID cards. |
Worldwide: Passport Book, Passport Card; United States: Driver License, Identification Card, Residency Permit (Green Card), Social Security Card, Military ID; Europe: Driver License, Identification Card, Residency Permit; Southeast Asia: Driver License, Identification Card, Residency Permit; India: Driver License, PAN Card, Aadhaar Card; Canada: Driver License, Identification Card, Residency Permit; Australia: Driver License, Photo Card, Key-pass ID; New Zealand: Driver License, Identification Card, Residency Permit |
Contract/Agreement |
Extract the title and the signatory parties' information (including names, reference names, and addresses) from contracts. |
English |
Structured Documents |
Extract key-value pairs and tables from any consistently structured forms or documents. Important! When using this document type, process only documents that have exactly the same structure as the job's sample; otherwise, results are likely to be poor or inaccurate. |
All supported OCR languages |
* Some languages are supported only when the extended language support option is enabled in the Umango license.
Umango may still be able to read documents in languages not listed above, as long as they are based on the English character set, but accuracy may be diminished.
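The differences between the two Custom Trained modes described in the table above can be summarized as a small decision helper. The Python sketch below is illustrative only and is not part of Umango: the function name `suggest_training_mode` and its parameters are hypothetical, and the choice to default to Template Trained when neither mode is strictly required (because its stated training time is shorter) is an assumption rather than a documented recommendation.

```python
from typing import Optional


def suggest_training_mode(
    semi_structured: bool,
    needs_signature_fields: bool,
    fields_overlap: bool,
) -> Optional[str]:
    """Suggest a custom training mode from the constraints listed in the table.

    Hypothetical helper for illustration; not an Umango API.
    Returns None when the requirements conflict.
    """
    # Only Neural Trained accepts semi-structured documents and overlapping fields.
    needs_neural = semi_structured or fields_overlap
    # Only Template Trained supports signature fields.
    needs_template = needs_signature_fields

    if needs_neural and needs_template:
        # Neither mode satisfies both sets of requirements.
        return None
    if needs_neural:
        return "Neural Trained"
    # Assumption: prefer Template Trained otherwise, since it trains in 1 to 5 minutes.
    return "Template Trained"


if __name__ == "__main__":
    print(suggest_training_mode(semi_structured=True,
                                needs_signature_fields=False,
                                fields_overlap=False))
```

A `None` result indicates that the requirements conflict, for example a semi-structured document that also needs a signature field; in that case the document set or the field list would need to be reconsidered before training.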