Subtitles section Play video
SERENA AMMIRATI: Looking at this book page by page
and trying to decipher, read, and transcribe whatever is
there takes an enormous amount of time.
It would require an army of paleographers.
ELENA NIEDDU: What I am excited the most about machine learning
is that it enabled us to solve problems that up to 10,
15 years ago we thought unsolvable.
ELENA NIEDDU: Before using any kind of machine learning model,
we needed to collect data first.
You have thousands of images of dogs and cats in the internet,
but there's very little images of ancient manuscripts.
We built our own custom web application for crowdsourcing.
And we involved high school students to collect the data.
I didn't know much about machine learning in general.
But I found it very easy to create a TensorFlow
environment.
When we were trying to figure out
which model worked best for us, Keras was the best solution.
The production model runs on TensorFlow layers, an estimator
interface.
We experimented with binary classification,
with fully connected networks.
And finally, we moved to convolutional neural network
and multiclass classification.
ELENA NIEDDU: When it comes to recognizing single characters,
we can get 95% average accuracy.
SERENA AMMIRATI: This will have an enormous impact.
In a short period of time, we will
have a massive quantity of historical information
available.
ELENA NIEDDU: I just think solving problems is fun.
It's a game against myself, and how good I can do.