University of Leeds

Marilena Di Bari's personal page

I am a fourth-year PhD student at the Centre for Translation Studies at the University of Leeds, supervised by Dr. Serge Sharoff and Dr. Martin Thomas. My project aims to combine linguistic analysis and automatic tools for sentiment and emotion analysis. The annotation schema used is explained in detail in the article "SentiML: functional annotation for multilingual sentiment analysis" (Publication). The annotated corpus, the DTD used for annotation with the MAE software, and the guidelines are available for download.

If you want a general overview, you can instead have a look at the poster that I presented at "The annual University of Leeds postgraduate research conference 2012" and "Digital Humanities in Leeds", as well as the poster that I presented at "The annual University of Leeds postgraduate research conference 2013".

My latest work includes the creation of a dependency parsing model for Italian, which is available for download at this link (licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License). The model was trained on 250,000 sentences from the Paisà corpus using version 1.7.2 of MaltParser.

The part-of-speech (POS) tagset used is Tanl and the dependency tagset is ISST-Tanl. Please use this link for the POS tagset mapping and this for the dependency mapping to other tagsets.

I have also created a dependency parsing model for Russian, which is available for download at this link (licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License; based on a work at http://cl.iitp.ru/). The training corpus for the model was SynTagRus, developed by Igor Boguslavsky, Leonid Iomdin and their colleagues. The model was created using version 1.7.2 of MaltParser, with the consent of Serge Sharoff, who created the previous model. See this link for further details.

Previous work

My MA thesis in Technical-Scientific Translation at the University of Bari, entitled "Problems of cross-cultural communication: a corpus-based analysis of humility and smirenie", was my first attempt to employ computational linguistics techniques to test a theory derived from traditional approaches to intercultural communication. It is a linguistic comparison between the English word humility and the Russian word smirenie. The work is based on the approach of Michael Stubbs, who used corpus analysis to find empirical evidence of the importance of certain cultural words in English. The starting point for my thesis was Anna Wierzbicka's hypothesis that the English word humility is not the exact equivalent of the Russian word smirenie. I analysed the two terms from the point of view of etymology and definitions, and I conducted a corpus-based analysis using the web corpora ukWaC and Russian Web Corpus, available through Sketch Engine. The thesis is available to read on Scribd.

Contacts and other details

University profile webpage

University e-mail: mlmdb _ at _ leeds.ac.uk

Interview for Gravità Zero