Scientific report: "Temporal information processing of a new language: fast porting with minimal resources"
Francisco Costa and Antonio Branco
Universidade de Lisboa

Abstract

We describe the semi-automatic adaptation of a TimeML annotated corpus from English to Portuguese, a language for which TimeML annotated data was not available yet. In order to validate this adaptation, we use the obtained data to replicate some results in the literature that used the original English data. The fact that comparable results are obtained indicates that our approach can be used successfully to rapidly create semantically annotated resources for new languages.

1 Introduction

Temporal information processing is a topic of natural language processing boosted by recent evaluation campaigns like TERN2004,¹ TempEval-1 (Verhagen et al., 2007) and the forthcoming TempEval-2² (Pustejovsky and Verhagen, 2009). For instance, in the TempEval-1 competition, three tasks were proposed: (a) identifying the temporal relation (such as overlap, before or after) holding between events and temporal entities such as dates, times and temporal durations denoted by expressions (i.e. temporal expressions) occurring in the same sentence; (b) identifying the temporal relation holding between events expressed in a document and its creation time; (c) identifying the temporal relation between the main events expressed by two adjacent sentences.

Supervised machine learning approaches are pervasive in the tasks of temporal information processing.
Even when the best performing systems in these competitions are symbolic, there are machine learning solutions with results close to their performance. In TempEval-1, where there were statistical and rule-based systems, almost all systems achieved quite similar results. In the TERN2004 competition, aimed at identifying and normalizing temporal expressions, a symbolic system performed best, but since then machine learning solutions such as (Ahn et al., 2007) have .

¹ http://timex2.mitre.org
² http://www.timeml.org/tempeval2