Constants

Several Textractor features take input parameters that must be chosen from a fixed set of values. These choices are defined as Enum classes in the data package (textractor.data.constants).

class textractor.data.constants.AnalyzeExpenseFields(value)

Bases: Enum

Summary field types used for AnalyzeExpense results.

ACCOUNT_NUMBER = 'ACCOUNT_NUMBER'
ADDRESS = 'ADDRESS'
ADDRESS_BLOCK = 'ADDRESS_BLOCK'
AMOUNT_DUE = 'AMOUNT_DUE'
AMOUNT_PAID = 'AMOUNT_PAID'
CITY = 'CITY'
COUNTRY = 'COUNTRY'
CUSTOMER_NUMBER = 'CUSTOMER_NUMBER'
DELIVERY_DATE = 'DELIVERY_DATE'
DISCOUNT = 'DISCOUNT'
DUE_DATE = 'DUE_DATE'
GRATUITY = 'GRATUITY'
INVOICE_RECEIPT_DATE = 'INVOICE_RECEIPT_DATE'
INVOICE_RECEIPT_ID = 'INVOICE_RECEIPT_ID'
NAME = 'NAME'
ORDER_DATE = 'ORDER_DATE'
OTHER = 'OTHER'
PAYMENT_TERMS = 'PAYMENT_TERMS'
PO_NUMBER = 'PO_NUMBER'
PRIOR_BALANCE = 'PRIOR_BALANCE'
RECEIVER_ABN_NUMBER = 'RECEIVER_ABN_NUMBER'
RECEIVER_ADDRESS = 'RECEIVER_ADDRESS'
RECEIVER_GST_NUMBER = 'RECEIVER_GST_NUMBER'
RECEIVER_NAME = 'RECEIVER_NAME'
RECEIVER_PAN_NUMBER = 'RECEIVER_PAN_NUMBER'
RECEIVER_PHONE = 'RECEIVER_PHONE'
RECEIVER_VAT_NUMBER = 'RECEIVER_VAT_NUMBER'
SERVICE_CHARGE = 'SERVICE_CHARGE'
SHIPPING_HANDLING_CHARGE = 'SHIPPING_HANDLING_CHARGE'
STATE = 'STATE'
STREET = 'STREET'
SUBTOTAL = 'SUBTOTAL'
TAX = 'TAX'
TAX_PAYER_ID = 'TAX_PAYER_ID'
TOTAL = 'TOTAL'
VENDOR_ABN_NUMBER = 'VENDOR_ABN_NUMBER'
VENDOR_ADDRESS = 'VENDOR_ADDRESS'
VENDOR_GST_NUMBER = 'VENDOR_GST_NUMBER'
VENDOR_NAME = 'VENDOR_NAME'
VENDOR_PAN_NUMBER = 'VENDOR_PAN_NUMBER'
VENDOR_PHONE = 'VENDOR_PHONE'
VENDOR_URL = 'VENDOR_URL'
VENDOR_VAT_NUMBER = 'VENDOR_VAT_NUMBER'
ZIP_CODE = 'ZIP_CODE'
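
Because each member's value is the same string as its name, a raw type string can be turned into a member with the standard Enum call syntax. The sketch below uses a stand-in class that copies a few members from the listing above so that it runs without the library installed; with the library available, the class would be imported from textractor.data.constants instead. The helper to_expense_field and its fallback to OTHER are hypothetical and only illustrate the lookup.

```python
from enum import Enum


class AnalyzeExpenseFields(Enum):
    """Stand-in copying a few members from the listing above."""

    INVOICE_RECEIPT_ID = "INVOICE_RECEIPT_ID"
    OTHER = "OTHER"
    TOTAL = "TOTAL"
    VENDOR_NAME = "VENDOR_NAME"


def to_expense_field(raw_type: str) -> AnalyzeExpenseFields:
    """Hypothetical helper: map a raw type string to a member.

    Unknown strings fall back to OTHER instead of raising ValueError.
    """
    try:
        # Enum lookup by value: AnalyzeExpenseFields("TOTAL") -> TOTAL member
        return AnalyzeExpenseFields(raw_type)
    except ValueError:
        return AnalyzeExpenseFields.OTHER


total = to_expense_field("TOTAL")
unknown = to_expense_field("NOT_A_FIELD")
```

Falling back to OTHER is a design choice for this sketch; letting the ValueError propagate is equally reasonable when unexpected field types should be surfaced.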
class textractor.data.constants.AnalyzeExpenseFieldsGroup(value)

Bases: Enum

Group types that an AnalyzeExpense field can belong to, such as RECEIVER_BILL_TO or VENDOR_REMIT_TO.

RECEIVER = 'RECEIVER'
RECEIVER_BILL_TO = 'RECEIVER_BILL_TO'
RECEIVER_SHIP_TO = 'RECEIVER_SHIP_TO'
RECEIVER_SOLD_TO = 'RECEIVER_SOLD_TO'
VENDOR = 'VENDOR'
VENDOR_REMIT_TO = 'VENDOR_REMIT_TO'
VENDOR_SUPPLIER = 'VENDOR_SUPPLIER'
class textractor.data.constants.AnalyzeExpenseLineItemFields(value)

Bases: Enum

Field types used for line items in AnalyzeExpense results.

EXPENSE_ROW = 'EXPENSE_ROW'
ITEM = 'ITEM'
PRICE = 'PRICE'
PRODUCT_CODE = 'PRODUCT_CODE'
QUANTITY = 'QUANTITY'
UNIT_PRICE = 'UNIT_PRICE'
class textractor.data.constants.AnalyzeIDFields(value)

Bases: Enum

Enum containing all the AnalyzeID keys

ADDRESS = 'ADDRESS'
CITY_IN_ADDRESS = 'CITY_IN_ADDRESS'
CLASS = 'CLASS'
COUNTY = 'COUNTY'
DATE_OF_BIRTH = 'DATE_OF_BIRTH'
DATE_OF_ISSUE = 'DATE_OF_ISSUE'
DOCUMENT_NUMBER = 'DOCUMENT_NUMBER'
ENDORSEMENTS = 'ENDORSEMENTS'
EXPIRATION_DATE = 'EXPIRATION_DATE'
FIRST_NAME = 'FIRST_NAME'
ID_TYPE = 'ID_TYPE'
LAST_NAME = 'LAST_NAME'
MIDDLE_NAME = 'MIDDLE_NAME'
PLACE_OF_BIRTH = 'PLACE_OF_BIRTH'
RESTRICTIONS = 'RESTRICTIONS'
STATE_IN_ADDRESS = 'STATE_IN_ADDRESS'
STATE_NAME = 'STATE_NAME'
SUFFIX = 'SUFFIX'
VETERAN = 'VETERAN'
ZIP_CODE_IN_ADDRESS = 'ZIP_CODE_IN_ADDRESS'
class textractor.data.constants.CLIOverlay(value)

Bases: Enum

Overlay choices for the command-line interface.

ALL = 0
FORMS = 4
LAYOUTS = 7
LINES = 2
QUERIES = 5
SIGNATURES = 6
TABLES = 3
WORDS = 1
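
Integer-valued enums like this one are usually easier to select by name than by number. The sketch below shows name-based lookup with the standard Enum subscript syntax, using a stand-in class that copies the values listed above. The helper parse_overlays is hypothetical, and whether the library's own command-line tool parses its arguments this way is an assumption.

```python
from enum import Enum
from typing import Iterable, List


class CLIOverlay(Enum):
    """Stand-in copying the values from the listing above."""

    ALL = 0
    WORDS = 1
    LINES = 2
    TABLES = 3
    FORMS = 4
    QUERIES = 5
    SIGNATURES = 6
    LAYOUTS = 7


def parse_overlays(names: Iterable[str]) -> List[CLIOverlay]:
    """Hypothetical helper: convert user-supplied names to members.

    Lookup is by member name, case-insensitive; an unknown name raises KeyError.
    """
    return [CLIOverlay[name.strip().upper()] for name in names]


selected = parse_overlays(["tables", " Words "])
```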
class textractor.data.constants.CLIPrint(value)

Bases: Enum

Print choices for the command-line interface.

ALL = 0
EXPENSES = 5
FORMS = 3
IDS = 7
LAYOUTS = 8
QUERIES = 4
SIGNATURES = 6
TABLES = 2
TEXT = 1
class textractor.data.constants.CellTypes(value)

Bases: Enum

Special cells within the Table belong to one of these categories.

COLUMN_HEADER = 0
FLOATING_TITLE = 3
SECTION_TITLE = 1
SUMMARY_CELL = 4
class textractor.data.constants.Direction(value)

Bases: Enum

Directions available for search using DirectionalFinder

ABOVE = 0
BELOW = 1
LEFT = 3
RIGHT = 2
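
To make the four directions concrete, here is a minimal sketch of a directional test between two points. It is not the library's DirectionalFinder implementation. The Point type and the is_in_direction helper are hypothetical, and the sketch assumes image-style coordinates in which y grows downward, so "above" means a smaller y.

```python
from enum import Enum
from typing import NamedTuple


class Direction(Enum):
    """Stand-in copying the values from the listing above."""

    ABOVE = 0
    BELOW = 1
    RIGHT = 2
    LEFT = 3


class Point(NamedTuple):
    """Hypothetical 2D point; y grows downward as in image coordinates."""

    x: float
    y: float


def is_in_direction(anchor: Point, candidate: Point, direction: Direction) -> bool:
    """Return True if candidate lies strictly in the given direction from anchor."""
    if direction is Direction.ABOVE:
        return candidate.y < anchor.y
    if direction is Direction.BELOW:
        return candidate.y > anchor.y
    if direction is Direction.LEFT:
        return candidate.x < anchor.x
    if direction is Direction.RIGHT:
        return candidate.x > anchor.x
    raise ValueError(f"Unsupported direction: {direction!r}")


anchor = Point(0.5, 0.5)
label = Point(0.5, 0.2)  # same column, closer to the top of the page
```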
class textractor.data.constants.DirectionalFinderType(value)

Bases: Enum

Document entity types that can be searched for with DirectionalFinder.

KEY_VALUE_SET = 0
SELECTION_ELEMENT = 1
class textractor.data.constants.SelectionStatus(value)

Bases: Enum

The two possible values for the SelectionStatus of a SelectionElement.

NOT_SELECTED = 1
SELECTED = 0
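
Note that SELECTED has the value 0, so testing the truthiness of a member's .value gives the opposite of what one might expect. The sketch below, which uses a stand-in class copying the two values above, shows why comparing members by identity is the safer pattern.

```python
from enum import Enum


class SelectionStatus(Enum):
    """Stand-in copying the values from the listing above."""

    SELECTED = 0
    NOT_SELECTED = 1


status = SelectionStatus.SELECTED

# Recommended: compare the member itself.
is_checked = status is SelectionStatus.SELECTED

# Pitfall: the underlying value of SELECTED is 0, which is falsy.
value_is_truthy = bool(status.value)

# Members of a plain Enum are always truthy, so this does not help either.
member_is_truthy = bool(SelectionStatus.NOT_SELECTED)
```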
class textractor.data.constants.SimilarityMetric(value)

Bases: Enum

Similarity metrics for search queries on Document data

COSINE: Cosine similarity measures the similarity of two vectors by their direction or orientation, ignoring differences in their magnitude or scale.

EUCLIDEAN: Euclidean distance is the square root of the sum of the squared differences between the two vectors.

LEVENSHTEIN: The Levenshtein distance is a string metric for measuring the difference between two sequences. It is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into the other.

COSINE = 0
EUCLIDEAN = 1
LEVENSHTEIN = 2
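
The three definitions above can be written out in a few lines of plain Python. This is a textbook sketch for illustration only, not the code the library uses for its search functions; the function names are chosen here and are not part of the library.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms.

    Raises ZeroDivisionError if either vector has zero length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the sum of squared element-wise differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def levenshtein_distance(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    # previous[j] holds the distance between the processed prefix of s and t[:j].
    previous = list(range(len(t) + 1))
    for i, char_s in enumerate(s, start=1):
        current = [i]
        for j, char_t in enumerate(t, start=1):
            substitution_cost = 0 if char_s == char_t else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]
```

Cosine similarity is a similarity (larger means closer), whereas the other two are distances (smaller means closer), so results from different metrics are not directly comparable.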
class textractor.data.constants.TableFormat(value)

Bases: Enum

Various formats of printing Table data with the tabulate package.

CSV = 0
FANCY_GRID = 5
GITHUB = 3
GRID = 4
HTML = 16
JIRA = 8
LATEX = 18
LATEX_BOOKTABS = 20
LATEX_LONGTABLE = 21
LATEX_RAW = 19
MEDIAWIKI = 13
MOINMOIN = 14
ORGTBL = 7
PIPE = 6
PLAIN = 1
PRESTO = 9
PRETTY = 10
PSQL = 11
RST = 12
SIMPLE = 2
TEXTILE = 22
TSV = 23
UNSAFEHTML = 17
YOUTRACK = 15
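
Apart from CSV, the member names above match the format identifiers accepted by tabulate's tablefmt argument once lowercased (for example "github" or "latex_booktabs"). The sketch below derives such an identifier from a member name, using a stand-in class with a few of the listed values. The helper tabulate_format_name is hypothetical, and whether the library performs the conversion this way internally is an assumption.

```python
from enum import Enum


class TableFormat(Enum):
    """Stand-in copying a few values from the listing above."""

    CSV = 0
    PLAIN = 1
    GITHUB = 3
    FANCY_GRID = 5
    LATEX_BOOKTABS = 20


def tabulate_format_name(table_format: TableFormat) -> str:
    """Hypothetical helper: lowercase the member name to get a tablefmt string.

    CSV is rejected because it is not a tabulate format and would need
    separate handling.
    """
    if table_format is TableFormat.CSV:
        raise ValueError("CSV is not a tabulate format")
    return table_format.name.lower()


github_name = tabulate_format_name(TableFormat.GITHUB)
booktabs_name = tabulate_format_name(TableFormat.LATEX_BOOKTABS)
```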
class textractor.data.constants.TableTypes(value)

Bases: Enum

Types of tables recognized by Textract APIs.

SEMI_STRUCTURED = 2
STRUCTURED = 1
UNKNOWN = 0
class textractor.data.constants.TextTypes(value)

Bases: Enum

Textract classifies the TextType of every word in the document as one of these two categories.

HANDWRITING = 0
PRINTED = 1
class textractor.data.constants.TextractAPI(value)

Bases: Enum

Textract API types, used to identify which API an asynchronous job belongs to when fetching its results.

ANALYZE = 1
DETECT_TEXT = 0
EXPENSE = 2
classmethod TextractAPI_to_Textract_API(api)
classmethod Textract_API_to_TextractAPI(api: Textract_API)
class textractor.data.constants.TextractFeatures(value)

Bases: Enum

Features that can be passed as parameters to AnalyzeDocument and StartDocumentAnalysis.

FORMS = 0
LAYOUT = 4
QUERIES = 2
SIGNATURES = 3
TABLES = 1
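
The member names here coincide with the strings that the Textract AnalyzeDocument and StartDocumentAnalysis operations accept in their FeatureTypes parameter. As a rough sketch, a list of members can be turned into that list of strings via the name attribute. The class below is a stand-in copying the values above, and the helper to_feature_types is hypothetical; it is not a function provided by the library.

```python
from enum import Enum
from typing import Iterable, List


class TextractFeatures(Enum):
    """Stand-in copying the values from the listing above."""

    FORMS = 0
    TABLES = 1
    QUERIES = 2
    SIGNATURES = 3
    LAYOUT = 4


def to_feature_types(features: Iterable[TextractFeatures]) -> List[str]:
    """Hypothetical helper: member names as strings, duplicates removed, order kept."""
    seen = set()
    names = []
    for feature in features:
        if feature not in seen:
            seen.add(feature)
            names.append(feature.name)
    return names


feature_types = to_feature_types(
    [TextractFeatures.TABLES, TextractFeatures.FORMS, TextractFeatures.TABLES]
)
```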
class textractor.data.constants.TextractType(value)

Bases: Enum

Document Entity types recognized by Textract APIs.

KEY_VALUE_SET = 2
LINES = 1
SELECTION_ELEMENT = 3
TABLES = 4
TABLE_CELL = 5
WORDS = 0