Image is a matrix of numbers, let’s convert the knowledge into features.

Introduction:

One of the hottest fields in data science is image classification, in this article I want to share some techniques for transforming image to a vector of features that can be used in every classification model after then.

As a data scientist in VATbox I usually work with texts or images, in this article I’ll combine them both and we will try to solve text problem using only image.

Problem definition:

VATbox, as the name suggests, deals with VAT problems (and a lot more) one of the problems in the invoices world is that I would want to know how many invoices are in one image? To simplify the question, we will ask a binary question, do we have one invoice in the image or multiple invoices in the same image? …

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store