As one of the primary suppliers of AI and machine learning consulting on the local market, our team partnered with Beehiveor R&D Labs for a joint research project on a reading assistant system (RAS). We were inspired by fitness apps and wrist trackers, which use accelerometers, heart rate sensors, and GPS data, among other things, to help people exercise more effectively. Similarly, if we can track gaze movement, why not use that data to facilitate visual activities? We chose reading as the primary research objective because it is one of the activities humans perform daily.
People encounter certain problems while reading, and everyone resolves them in their own way: rereading difficult parts of a text, googling unknown words, writing down details to remember. What if this could be automated? For instance, a system could track a person's reading, evaluate reading speed, distinguish reading gaze movements from other activities, and annotate hard-to-read passages. These features can be extremely useful for people who have to tackle huge amounts of text every day.
Before we get into the research process and findings, let’s figure out what a Reading Assistant System (RAS) is and why it is so important. A RAS is an AI and gaze tracking-based system for various reading analysis purposes. It can be used in many settings, and its applications are well documented in education, medicine, HR, marketing, and other areas.
The system will allow the user to:
Here’s how the technology will behave in practice. The RAS program automatically tracks the movement of your gaze and matches it with the text on the screen. This makes it possible to process reading patterns in real time and store all the metadata for further analysis.
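To make the idea of matching gaze to on-screen text more concrete, here is a minimal Python sketch. It is not the project's actual implementation: it assumes that the gaze tracker yields `(timestamp_seconds, x, y)` samples in screen coordinates and that the bounding box of every rendered word is known. The names `WordBox`, `match_gaze_to_word`, and `words_per_minute` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WordBox:
    """Screen-space bounding box of one rendered word (hypothetical layout data)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        # True when the gaze point falls inside this word's rectangle.
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def match_gaze_to_word(x: float, y: float, words: List[WordBox]) -> Optional[int]:
    """Return the index of the word under the gaze point, or None if there is none."""
    for index, word in enumerate(words):
        if word.contains(x, y):
            return index
    return None


def words_per_minute(samples: List[Tuple[float, float, float]],
                     words: List[WordBox]) -> float:
    """Rough reading-speed estimate: distinct words looked at per minute.

    Assumption: each sample is (timestamp_seconds, x, y), ordered by time.
    """
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0
    seen = set()
    for _, x, y in samples:
        index = match_gaze_to_word(x, y, words)
        if index is not None:
            seen.add(index)
    return len(seen) * 60.0 / elapsed


# Usage: two words on one line, three gaze samples over one second.
words = [WordBox("Hello", 0, 0, 50, 20), WordBox("world", 60, 0, 110, 20)]
samples = [(0.0, 10, 10), (0.5, 80, 10), (1.0, 200, 200)]
speed = words_per_minute(samples, words)
```

A real system would also have to cope with tracker noise, calibration drift, and line spacing, for example by snapping a gaze point to the nearest word instead of requiring an exact hit. This sketch ignores all of that.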
Thanks to deep learning-trained neural networks, a RAS can identify with up to 96% accuracy whether the person is reading at a given moment.
The initial purpose of our research was threefold:
Our team of three carried out the research over two months as part of DataRoot University activities. The primary task was to create a program that analyzes human visual activity from gaze movement and incorporates:
We mainly used open-source Python libraries:
The algorithms that we used: