Scientific report: "Implementing a Characterization of Genre for Automatic Genre Identification of Web Pages"
Marina Santini, NLTG, University of Brighton, UK (M. Santini@)
Richard Power, Computing Department, Open University, UK
Roger Evans, NLTG, University of Brighton, UK

Abstract

In this paper, we propose an implementable characterization of genre suitable for automatic genre identification of web pages. This characterization is implemented as an inferential model based on a modified version of Bayes’ theorem. Such a model can deal with genre hybridism and individualization, two important forces behind genre evolution. Results show that this approach is effective and is worth further research.

1 Introduction

The term "genre" is employed in virtually all cultural fields: literature, music, art, architecture, dance, pedagogy, hypermedia studies, computer-mediated communication, and so forth. As has often been pointed out, it is hard to pin down the concept of genre from a unified perspective (cf. Kwasnik and Crowston 2004). This lack of a unified view is also felt in the more restricted world of non-literary or non-fictional document genres, such as professional or instrumental genres, where variation due to personal style is less pronounced than in literary genres. In particular, scholars working with practical genres focus upon a specific environment. For instance, Swales (1990) develops his notion of genre in academic and research settings, Bhatia (1993) in professional settings, and so on.
In automatic genre classification studies, genres have often been seen as non-topical categories that could help reduce information overload (Meyer zu Eissen and Stein 2004; Lim et al. 2005). Despite the lack of an agreed theoretical notion, genre is a well-established term, intuitively understood in its vagueness. What humans intuitively perceive is that there are categories created within a culture, a society, or a community which are used to group documents that share some ...