A Robust Methodology for Creating Large Image Datasets Using a Universal Format

2021
In this paper, we propose an automated methodology for creating large image datasets by cropping images from a simple, universal format. The format consists of rectangles that can vary in size, number, and position. The methodology lets us extract thousands of images in a matter of minutes with little manual effort. The primary motivation is that large datasets are required to train the very deep, large networks that underpin Artificial Intelligence (AI) and Computer Vision. With this methodology, large datasets can be collected conveniently and without any special equipment. The format can also be used to collect diverse datasets, which can help engineers and researchers in various domains. A further benefit is that the proposed format can be used in real-time applications as well. In the present work, we used the methodology to collect a dataset of handwritten images in the Punjabi language. The technique uses contour and edge detection to locate specific shapes and to match their dimensions and locations against the described parameters. Using this technique, we were able to collect handwritten image datasets from three different forms of the Punjabi language with high accuracy.
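The matching step described above (comparing detected shapes against the expected dimensions and locations) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that candidate bounding boxes `(x, y, width, height)` have already been produced by some contour detector (for example OpenCV's `findContours` followed by `boundingRect`), and that the form is a grid of equally sized cells. The names `match_boxes`, `reading_order`, `crop`, and the parameters `tol` and `row_gap` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A bounding box as (x, y, width, height) in pixels.
# Assumption: boxes come from an upstream contour detector.
Box = Tuple[int, int, int, int]


def match_boxes(candidates: List[Box], expected_w: int, expected_h: int,
                tol: float = 0.15) -> List[Box]:
    """Keep boxes whose width and height are within `tol` (relative)
    of the expected cell size; everything else is treated as noise."""
    kept = []
    for x, y, w, h in candidates:
        width_ok = abs(w - expected_w) <= tol * expected_w
        height_ok = abs(h - expected_h) <= tol * expected_h
        if width_ok and height_ok:
            kept.append((x, y, w, h))
    return kept


def reading_order(boxes: List[Box], row_gap: int) -> List[Box]:
    """Sort boxes top-to-bottom, then left-to-right within each row.
    Boxes whose top edges differ by at most `row_gap` pixels from the
    first box of a row are assigned to that row."""
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(box[1] - rows[-1][0][1]) <= row_gap:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of a row-major image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


# Hypothetical detections: three cells, the page border, and a speck.
candidates = [(10, 70, 101, 49), (0, 0, 400, 300), (120, 12, 98, 52),
              (5, 5, 8, 8), (10, 10, 100, 50)]
cells = reading_order(match_boxes(candidates, 100, 50), row_gap=10)
```

In this sketch the size filter discards the page border and the small speck, and the ordering step returns the remaining cells row by row, so each crop can be labelled by its position in the form. The tolerance values are illustrative and would need tuning for real scans.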