Expertini Research Research
Computer Science PDF Available Non-peer-reviewed Preprint

Multi-class Text Classification using BERT-based Active Learning

Sumanth Prabhu, Moosa Mohamed, Hemant Misra  ·  Published 2021-04-27

Abstract

Text Classification finds interesting applications in the pickup and delivery services industry where customers require one or more items to be picked up from a location and delivered to a certain destination. Classifying these customer transactions into multiple categories helps understand the market needs for different customer segments. Each transaction is accompanied by a text description provided by the customer to describe the products being picked up and delivered which can be used to classify the transaction. BERT-based models have proven to perform well in Natural Language Understanding. However, the product descriptions provided by the customers tend to be short, incoherent and code-mixed (Hindi-English) text which demands fine-tuning of such models with manually labelled data to achieve high accuracy. Collecting this labelled data can prove to be expensive. In this paper, we explore Active Learning strategies to label transaction descriptions cost effectively while using BERT to train a transaction classification model. On TREC-6, AG's News Corpus and an internal dataset, we benchmark the performance of BERT across different Active Learning strategies in Multi-Class Text Classification.
📄 Full Paper Available as PDF
This paper is available as a downloadable PDF.
📄 Download PDF

✨ AI Plain-English Summary

Get a plain-English summary of this paper generated by AI (5 free per day).

Comments (0)

No comments yet. Be the first to comment.