Introduction: Text classification is a fundamental task in text information processing and has therefore attracted much attention. However, the way text is represented as features severely limits improvements in classification performance. With the growth of social networking, the volume of text has increased dramatically, which poses major challenges for text classification. This article covers a paper presented at the PRICAI 2016 conference that introduces a fast training method to address this problem.
Title: Fast Training of a Graph Boosting for Large-Scale Text Classification
Summary:
This paper presents a fast training method for graph classification based on a boosting algorithm, and applies it to sentiment analysis with input texts represented as graphs. The graph format is well suited to representing text that has been structured with natural language processing techniques such as parsing, named entity recognition, and semantic parsing. Many classification methods that represent text as graphs have been proposed so far. However, most of them restrict the candidate features in advance because the feature space is so large. Instead of restricting the search space in advance, the proposed method introduces two approximation methods for learning graph-based rules within boosting. Experimental results on a sentiment analysis dataset show that the method improves training speed. In addition, the graph-based classification method exploits rich structural information in the text that cannot be captured with simpler input formats, and it achieves higher accuracy as a result.
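To make the idea of boosting over graph-based rules concrete, here is a minimal Python sketch. It is not the authors' method: the paper's two approximation methods are not described in this summary, so the sketch searches every candidate rule exhaustively, which is exactly the cost a fast training method would aim to reduce. The graph construction (a simple chain of adjacent words rather than a parser's output), the restriction to single-node and single-edge subgraphs, and all function names are assumptions made for illustration.

```python
import math

def text_to_graph(tokens):
    """Build a toy graph: word labels as nodes, adjacent-word pairs as directed edges.
    Assumption: a real pipeline would take edges from a parser instead."""
    nodes = set(tokens)
    edges = set(zip(tokens, tokens[1:]))
    return nodes, edges

def subgraph_features(tokens):
    """Enumerate the smallest subgraphs only: single nodes and single edges."""
    nodes, edges = text_to_graph(tokens)
    return {("node", n) for n in nodes} | {("edge", e) for e in edges}

def rule_vote(feature, sign, feats):
    """A rule votes `sign` when its subgraph is present, and `-sign` otherwise."""
    return sign if feature in feats else -sign

def train_boosting(docs, labels, rounds=3):
    """Discrete AdaBoost over subgraph-presence rules; labels are +1 or -1."""
    feats = [subgraph_features(d) for d in docs]
    candidates = sorted(set().union(*feats), key=repr)
    weights = [1.0 / len(docs)] * len(docs)
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustive search over all candidate rules (no approximation here).
        for f in candidates:
            for sign in (1, -1):
                err = sum(w for w, fs, y in zip(weights, feats, labels)
                          if rule_vote(f, sign, fs) != y)
                if best is None or err < best[0]:
                    best = (err, f, sign)
        err, f, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((f, sign, alpha))
        # Increase the weight of misclassified documents, then renormalize.
        weights = [w * math.exp(-alpha * y * rule_vote(f, sign, fs))
                   for w, fs, y in zip(weights, feats, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, tokens):
    """Sign of the weighted vote of all learned rules."""
    fs = subgraph_features(tokens)
    score = sum(alpha * rule_vote(f, sign, fs) for f, sign, alpha in model)
    return 1 if score >= 0 else -1

if __name__ == "__main__":
    docs = [["not", "good"], ["good"], ["not", "bad"], ["bad"]]
    labels = [-1, 1, 1, -1]
    model = train_boosting(docs, labels, rounds=3)
    for d in docs:
        print(d, predict(model, d))
```

In this toy data, whether "not" is attached to "good" or to "bad" decides the label, so no single word separates the classes; rules on edges such as ("not", "good") carry that structural information, which illustrates why graph inputs can help where simpler formats do not.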
Keywords: text classification; feature engineering; graph boosting
First author:
Hiyori Yoshikawa
Researcher at Fujitsu Laboratories. Fujitsu is Japan's largest IT vendor, the world's fourth-largest IT services company, and one of the world's top five makers of servers and PCs.
Via PRICAI 2016
Lei Feng Network note: This article was compiled exclusively by Lei Feng Network. Reprinting without permission is prohibited.