Categorization of Unorganized Text Corpora for Better Domain-Specific Language Modeling
This paper describes the categorization of unorganized text data gathered from the Internet into in-domain and out-of-domain data for better domain-specific language modeling and speech recognition. An algorithm for text categorization and topic detection based on the most frequent key phrases is presented. In this scheme