Learning from Imperfections: Building Datasets with Probabilistic Alignment of Handwritten Text

The word images must be describable by features extracted from a Convolutional Neural Network (CNN), so that visually similar words can be identified. This similarity between word images is what drives the alignment. The algorithm finds words that recur in the text and identifies them by the shape of the word image and its position on the page. Using the features it computes for each image, it returns an alignment that links words in the transcript to word images, based on the similarity of recurring words in the text. The aligned words are added as labelled images to a data set. This labelled data set of word images grows with each page shown to the algorithm and is used to produce better alignments on future pages. The result is not only a probable alignment of words to word images on a page but also a data set of labelled words.
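As a rough illustration of the idea described above, the following Python sketch aligns transcript words to word images in reading order and then grows a labelled data set from the result. It is not the book's implementation. The function names (`cosine`, `align_page`, `grow_dataset`), the use of cosine similarity, the dynamic-programming alignment, and the `unknown_prior` value are all assumptions made for this example, and the feature vectors are assumed to come from some CNN that is not shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def align_page(transcript, image_feats, labelled, unknown_prior=0.5):
    """Align transcript words to word images, both given in reading order.

    transcript:    list of words for one page
    image_feats:   list of feature vectors, one per word image (assumed CNN output)
    labelled:      dict mapping a word to feature vectors of already labelled images
    unknown_prior: hypothetical score for words with no labelled example yet
    Returns a list of (word_index, image_index) pairs.
    """
    n, m = len(transcript), len(image_feats)

    def score(i, j):
        examples = labelled.get(transcript[i], [])
        if not examples:
            return unknown_prior
        return max(cosine(image_feats[j], e) for e in examples)

    # best[i][j]: best total score using the first i words and first j images.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (best[i - 1][j - 1] + score(i - 1, j - 1), "match"),
                (best[i - 1][j], "skip_word"),
                (best[i][j - 1], "skip_image"),
            ]
            # On ties, max() keeps the first option, so "match" is preferred.
            best[i][j], back[i][j] = max(options, key=lambda o: o[0])

    # Follow the back-pointers from the end of the page to recover the pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        move = back[i][j]
        if move == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "skip_word":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def grow_dataset(transcript, image_feats, pairs, labelled):
    """Add every aligned word image to the labelled data set, in place."""
    for word_idx, image_idx in pairs:
        labelled.setdefault(transcript[word_idx], []).append(image_feats[image_idx])
    return labelled


# Toy page: "the" already has one labelled example, "cat" has none.
labelled = {"the": [[1.0, 0.0]]}
transcript = ["the", "cat", "the"]
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
pairs = align_page(transcript, image_feats, labelled)
grow_dataset(transcript, image_feats, pairs, labelled)
```

In this toy setting the recurring word "the" anchors the alignment, and the previously unseen word "cat" gains a labelled example that could help on later pages. A real system would likely also use page position and keep only confident matches, which this sketch leaves out.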
Author
Georgia
ISBN
9783384261038
Language
English
Weight
150 grams
Publication date
1 June 2024
Publisher
tredition GmbH
Pages
94