{"id":212,"date":"2022-09-22T08:05:00","date_gmt":"2022-09-22T08:05:00","guid":{"rendered":"http:\/\/dnmrul.es\/tebeotecario\/?p=212"},"modified":"2022-09-22T08:09:20","modified_gmt":"2022-09-22T08:09:20","slug":"articulo-bcbid-es-el-primer-set-de-datos-de-comic-en-bengali","status":"publish","type":"post","link":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/2022\/09\/22\/articulo-bcbid-es-el-primer-set-de-datos-de-comic-en-bengali\/","title":{"rendered":"Article: BCBId is the first Bengali comic dataset"},"content":{"rendered":"<h2>BCBId: first Bangla comic dataset and its applications<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-214\" src=\"http:\/\/dnmrul.es\/tebeotecario\/wp-content\/uploads\/2022\/09\/10032_2022_412_Fig2_HTML-1-1024x426.webp\" alt=\"\" width=\"1024\" height=\"426\" srcset=\"https:\/\/dnmrul.es\/tebeotecario\/wp-content\/uploads\/2022\/09\/10032_2022_412_Fig2_HTML-1-1024x426.webp 1024w, https:\/\/dnmrul.es\/tebeotecario\/wp-content\/uploads\/2022\/09\/10032_2022_412_Fig2_HTML-1-300x125.webp 300w, https:\/\/dnmrul.es\/tebeotecario\/wp-content\/uploads\/2022\/09\/10032_2022_412_Fig2_HTML-1-768x320.webp 768w, https:\/\/dnmrul.es\/tebeotecario\/wp-content\/uploads\/2022\/09\/10032_2022_412_Fig2_HTML-1-1536x639.webp 1536w, https:\/\/dnmrul.es\/tebeotecario\/wp-content\/uploads\/2022\/09\/10032_2022_412_Fig2_HTML-1.webp 2031w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<table>\n<tbody>\n<tr>\n<th style=\"width: 25%;\">Type<\/th>\n<td>Academic journal article<\/td>\n<\/tr>\n<tr>\n<th class=\"author\">Author<\/th>\n<td>Arpita Dutta<\/td>\n<\/tr>\n<tr>\n<th class=\"author\">Author<\/th>\n<td>Samit Biswas<\/td>\n<\/tr>\n<tr>\n<th class=\"author\">Author<\/th>\n<td>Amit Kumar Das<\/td>\n<\/tr>\n<tr>\n<th>Abstract<\/th>\n<td><p>Comic document image analysis is now an active field of research in both academia and industry. 
However, comic document image processing research suffers due to its inherent complexities and the limited availability of benchmark public datasets. This paper describes the creation of the first-ever comic dataset among Indian Languages, namely Bangla Comic Book Image dataset (BCBId) (<a href=\"https:\/\/sites.google.com\/view\/banglacomicbookdataset\">https:\/\/sites.google.com\/view\/banglacomicbookdataset<\/a>), which is also made public for the benefit of the researchers. BCBId consists of 3327 images taken from 64 Bangla comic stories written by 8 writers. Bangla is the 6th most popular spoken language in the world\u2014used by 265 million people (https:\/\/en.wikipedia.org\/wiki\/Languages_of_India), and has a century-old heritage of comic strips (in newspapers) and books. BCBId has the ground truth for extracting various visual components of the comic book images, i.e., panels, characters, speech balloons, and text lines. BCBId also includes the metadata encoding of all images in XML format to describe the underlined structure, semantics, and other features of the documents to pursue research on understanding stories and dialogues. A tool is specifically designed for accurate and faster ground-truth generation. As an application of the dataset, we carry out the sentiment analysis of comic stories\u2014the first-ever attempt on comic book images. We also elaborate on a couple of applications of the BCBId in the comic research domain. Besides, we estimate the errors made by the annotators during the annotation process and describe different evaluation parameters to test the efficacy of the comic document image analysis algorithms.<\/p>\n<p>Comic image analysis is now an active field of research in both academia and industry. 
However, research on comic document image processing is held back by its inherent complexity and the limited availability of public benchmark datasets. This paper describes the creation of the first comic dataset among Indian languages, namely the Bangla Comic Book Image dataset (BCBId) (https:\/\/sites.google.com\/view\/banglacomicbookdataset), which is also made public for the benefit of researchers. BCBId consists of 3327 images taken from 64 Bengali comic stories written by 8 writers. Bengali is the sixth most widely spoken language in the world, used by 265 million people (https:\/\/en.wikipedia.org\/wiki\/Languages_of_India), and has a century-old heritage of comic strips (in newspapers) and books. BCBId provides the ground truth for extracting various visual components of the comic images, namely panels, characters, speech balloons, and text lines. BCBId also includes metadata for all images, encoded in XML format, describing the underlying structure, semantics, and other features of the documents to support research on understanding stories and dialogues. A tool was designed specifically for accurate and faster ground-truth generation. As an application of the dataset, we carry out sentiment analysis of comic stories, the first such attempt on comic book images. We also elaborate on a couple of applications of BCBId in the comic research domain. 
In addition, we estimate the errors made by the annotators during the annotation process and describe different evaluation parameters for testing the effectiveness of comic document image analysis algorithms.<\/p><\/td>\n<\/tr>\n<tr>\n<th>URL<\/th>\n<td><a href=\"https:\/\/doi.org\/10.1007\/s10032-022-00412-9\">https:\/\/doi.org\/10.1007\/s10032-022-00412-9<\/a><\/td>\n<\/tr>\n<tr>\n<th>Accessed<\/th>\n<td>22\/9\/2022 9:47:14<\/td>\n<\/tr>\n<tr>\n<th>Publication<\/th>\n<td>International Journal on Document Analysis and Recognition (IJDAR)<\/td>\n<\/tr>\n<tr>\n<th>DOI<\/th>\n<td><a href=\"http:\/\/doi.org\/10.1007\/s10032-022-00412-9\">10.1007\/s10032-022-00412-9<\/a><\/td>\n<\/tr>\n<tr>\n<th>Journal abbrev.<\/th>\n<td>IJDAR<\/td>\n<\/tr>\n<tr>\n<th>ISSN<\/th>\n<td>1433-2825<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"csl-bib-body\">\n<div><\/div>\n<div class=\"csl-entry\">Dutta, A., Biswas, S., &amp; Das, A. K. (2022). BCBId: First Bangla comic dataset and its applications. <i>International Journal on Document Analysis and Recognition (IJDAR)<\/i>. <a href=\"https:\/\/doi.org\/10.1007\/s10032-022-00412-9\">https:\/\/doi.org\/10.1007\/s10032-022-00412-9<\/a><\/div>\n<\/div>\n<div><\/div>\n<div>You can download the up-to-date reference for this and other similar articles from <a href=\"http:\/\/dnmrul.es\/spip\/spip.php?page=biblio\">tbotkBD<\/a>.<\/div>\n","protected":false},"excerpt":{"rendered":"<p>BCBId: first Bangla comic dataset and its applications Type Academic journal article Author Arpita Dutta Author Samit Biswas Author Amit Kumar Das Abstract Comic document image analysis is now an active field of research in both academia and industry. 
However, comic document image processing research suffers due to its inherent complexities and the limited&#8230;<a class=\"read-more-link button\" href=\"https:\/\/dnmrul.es\/tebeotecario\/index.php\/2022\/09\/22\/articulo-bcbid-es-el-primer-set-de-datos-de-comic-en-bengali\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[73,72,9,61,70,74,12],"class_list":["post-212","post","type-post","status-publish","format-standard","hentry","category-articulos","tag-analisis-de-imagen","tag-analisis-de-sistemas-complejos","tag-articulo","tag-automatizacion-de-comic","tag-comic-bangladesi","tag-conjuntos-de-datos-de-comic","tag-tbotkbd"],"_links":{"self":[{"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/posts\/212","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/comments?post=212"}],"version-history":[{"count":2,"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/posts\/212\/revisions"}],"predecessor-version":[{"id":216,"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/posts\/212\/revisions\/216"}],"wp:attachment":[{"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/media?parent=212"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/categories?post=212"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dnmrul.es\/tebeotecario\/index.php\/wp-json\/wp\/v2\/tags?post
=212"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}