School of Computing

FACULTY OF ENGINEERING

 

Professor Bogdan Babych

Professor in Translation Studies, Heidelberg University and Visiting Research Fellow, University of Leeds

Member of the Natural Language Processing Group

Publications & CV

ORCID

https://orcid.org/0000-0003-1872-1677

Publications

2017

Babych, B. (2017, to appear). Extending Levenshtein edit distance with phonological features for cognate identification task.

Babych, B. (2017). Unsupervised induction of morphological lexicon for Ukrainian. In: Proc. of CAMRL2017: Workshop on Computational Approaches to Morphologically Rich Languages. Leeds. 5 July 2017

Yu Yuan, Bogdan Babych and Serge Sharoff. (2017) Reference-free System for Automated Human Translation Quality Estimation. In: Proc. of 12th Iberian Conference on Information Systems and Technologies (CISTI), 21-24 June 2017

Bogdan Babych, Fangzhong Su, Anthony Hartley, Ahmet Aker, Monica Lestari Paramita, Paul Clough, Robert Gaizauskas (2017, to appear). Cross-Language Comparability and its applications for MT. In: Using Comparable Corpora for Under-Resourced Areas of Machine Translation. An account of the results from the project ACCURAT and beyond. Springer. 44 pp.

Reinhard Rapp, Vivian Xu, Tatiana Gornostay, Olga Vodopiyanova, Andrejs Vasiļjevs, Klaus-Dirk Schmitz, Michael Zock, Serge Sharoff, Richard Forsyth, Bogdan Babych (2017, to appear). New areas of application of Comparable Corpora. In: Using Comparable Corpora for Under-Resourced Areas of Machine Translation. An account of the results from the project ACCURAT and beyond. Springer. 32 pp.

Babych, B. (2017). Deconstruction of the Russian propaganda discourse in military history: Identifying and neutralizing linguistic means of falsifying history of the Ukrainian division “Halychyna”. In: Proc. of 2nd International forum on crisis communications: “Information stream models as tactical instruments of communication content security”. Military Institute of Taras Shevchenko National University, Kyiv, 22-23 May 2017, pp. 80-89

2016

Babych, B. (2016). Graphonological Levenshtein Edit Distance: Application for automated cognate identification. Baltic Journal of Modern Computing. Vol.4 (2016), No.2 (EAMT-2016 volume), 115-128 [pdf] (preprint)

Babych, B., Sharoff, S. (2016). Rapid induction of morphological disambiguation resources from a closely related language. Fifth Workshop on Hybrid Approaches to Translation (HyTra-5) [pdf]

Babych, B. (2016) A hybrid machine translation system between English, Ukrainian, Arabic and Russian for automated terrorist activity detection. In: Proc. of XII International Conference "Military education and science: the present and the future" Military Institute of Taras Shevchenko National University, 25 November 2016, Kyiv, Ukraine.

Yuan, Y., Sharoff, S. and Babych, B. (2016). MoBiL: A hybrid feature set for Automatic Human Translation quality assessment. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Ljubljana, Slovenia.

Babych, B. (2016). Nuclear weapons of ideological warfare: Ukraine's revolution and war poetry (2013- ) as translation of the totalitarian discourse. Sociology of poetry translation conference, June 2016, University of Leeds.

Babych, B. (2016). Deconstruction of the totalitarian discourse as a factor of country’s information security: Ukrainian post-2013 literature of Revolution and war as resistance to Russia’s hybrid aggression in the cultural information space. Proc. of International forum on crisis communications, Military Institute of Taras Shevchenko National University, Kyiv, Ukraine, 9-10 June 2016, pp. 153-157

2015

Babych, B and Atwell, E (2015). Multilingual Information Extraction framework for real-time detection of terrorist propaganda threats in on-line communication. In: Proc. of XI International Conference "Military education and science: the present and the future" Military Institute of Taras Shevchenko National University, 27 November 2015, Kyiv, Ukraine. Abstract (en) [pdf]; Abstract (uk) [pdf]; Powerpoint (en) [ppt]; Powerpoint (uk) [ppt]

2014

Bogdan Babych, Jonathan Geiger, Mireia Ginestí Rosell, Kurt Eberle (2014) Deriving de/het gender classification for Dutch nouns for rule-based MT generation tasks. Submitted to EACL 2014 Third Workshop on Hybrid Approaches to Translation (HyTra) [pdf]

Bogdan Babych, Anne Buckley, and Svitlana Babych (2014). Advanced learners' errors in correcting Machine Translation output: comparative corpus-based analysis. TALC-2014. Teaching and Language Corpora conference. Lancaster. [pdf] (abstract)

Babych, B. (2014). Automated MT evaluation metrics and their limitations. Revista Tradumàtica: tecnologies de la traducció. Desembre 2014: Número 12, Traducció i qualitat. ISSN: 1578-7559, pp. 464-470. [pdf]

2013

Marta R. Costa-jussà, Rafael E. Banchs, Reinhard Rapp, Patrik Lambert, Kurt Eberle, Bogdan Babych (2013). Workshop on Hybrid Approaches to Translation: Overview and Developments. In: Proc. of 2nd HyTra Workshop, ACL 2013. [pdf]

Svitlana Babych, Kurt Eberle, Bogdan Babych (2013). Development of hybrid Machine Translation systems for under-resourced languages: Automated creation of lexical and morphological resources for MT. In: Proc. of 6th International Conference: Applied and Literary Translation and Interpreting: Theory, Methodology, Practice. Kyiv, Ukraine 5-6 April 2013.

2012

Mārcis Pinnis, Radu Ion, Dan Ştefănescu, Fangzhong Su, Inguna Skadiņa, Andrejs Vasiļjevs, Bogdan Babych (2012) ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012): Demo Session, July 8 - July 14, 2012, Jeju, Korea. [pdf]

Bogdan Babych, Anne Buckley, Richard Hughes, Svitlana Babych (2012) Machine Translation technology in advanced language teaching and translator training: a corpus-based approach to post-editing MT output. In: Proceedings of TALC 2012: Teaching and Language Corpora Conference. Warsaw, Poland, 12th - 14th July 2012. [pdf]

Bogdan Babych, Anthony Hartley, Kyo Kageura, Martin Thomas, and Masao Utiyama. (2012). Scaffolding, capturing and preserving interactions in educating for collaborative translation. In Proceedings of 2012 International Conference 'The Making of a Translator' Taiwan Normal University, Taipei, Taiwan, 28-29 April 2012

Bogdan Babych, Anthony Hartley, Kyo Kageura, Martin Thomas, and Masao Utiyama. (2012). MNH-TT: a collaborative platform for translator training. [Aslib 2012] Translating and the Computer 34, 29-30 November 2012, One Birdcage Walk, London, UK; 18pp. [pdf]; [presentation by Martin Thomas]

Fangzhong Su and Bogdan Babych (2012) Measuring Comparability of Documents in Non-Parallel Corpora for Efficient Extraction of (Semi-)Parallel Translation Equivalents. In: Proceedings of Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra) at EACL-2012. Pp. 10-19. [pdf]

Kurt Eberle, Bogdan Babych, Johanna Geiß, Mireia Ginesti-Rosell, Anthony Hartley, Reinhard Rapp, Serge Sharoff and Martin Thomas (2012) Design of a hybrid high quality machine translation system. In: Proceedings of Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra) at EACL-2012. Pp. 101-112 [pdf]

Fangzhong Su and Bogdan Babych (2012) Development and Application of a Cross-language Document Comparability Metric. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), 21-27 May 2012, Istanbul, Turkey; pp.3956-3962. [pdf]

Reinhard Rapp, Serge Sharoff, Bogdan Babych and Richard Forsyth (2012) Identifying Word Translations from Comparable Documents without a Seed Lexicon. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), 21-27 May 2012, Istanbul, Turkey; pp.460-466. [pdf]

Inguna Skadiņa, Ahmet Aker, Nikos Glaros, Fangzhong Su, Dan Tufis, Mateja Verlic, Andrejs Vasiļjevs and Bogdan Babych (2012) Collecting and Using Comparable Corpora for Statistical Machine Translation. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), 21-27 May 2012, Istanbul, Turkey; pp.438-445. [pdf]

2011

Babych, B. and Hartley, A. (2011). Meta-evaluation of comparability metrics using parallel corpora. International Journal of Computational Linguistics and Applications, Proceedings volume of CICLing-2011. [pdf] (preprint)

2010

Jo Drugan and Bogdan Babych. (2010). Shared resources, shared values? Ethical implications of sharing translation resources. JEC 2010: Second joint EM+/CNGL Workshop 'Bringing MT to the user: research on integrating MT in the translation industry', AMTA 2010, Denver, Colorado, November 4, 2010; pp.3-9. [pdf]

2009

Babych, B. and Hartley, A. (2009). Automated error analysis for multiword expressions: using BLEU-type scores for automatic discovery of potential translation errors. Linguistica Antverpiensia, New Series (8/2009): Journal of translation and interpreting studies. Special Issue on Evaluation of Translation Technology., 8, pp. 81-104. [pdf] (preprint)

Babych, B., Hartley, A. and Sharoff, S. (2009). Evaluation-guided pre-editing of source text: improving MT-tractability of light verb constructions. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.36-43 [pdf]

Sharoff, S; Babych, B and Hartley, A (2009). 'Irrefragable answers' using comparable corpora to retrieve translation equivalents. Language Resources and Evaluation 43 (1), 15-25.

2008

Babych, B., Sharoff, S., and Hartley, A. (2008). Generalising Lexical Translation Strategies for MT Using Comparable Corpora. Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08). Marrakech, Morocco. 28-30 May 2008. [pdf]

Babych, B., and Hartley, A. (2008). Sensitivity of Automated MT Evaluation Metrics on Higher Quality MT Output: BLEU vs Task-Based Evaluation Methods. Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08). Marrakech, Morocco. 28-30 May 2008. [pdf]

Babych, B., and Hartley, A. (2008). Automated MT Evaluation for Error Analysis: Automatic Discovery of Potential Translation Errors for Multiword Expressions. Proceedings of the ELRA Workshop on Evaluation: Looking into the Future of Evaluation: When automatic metrics meet task-based and performance-based approaches. Marrakech, Morocco. 27 May 2008. pp. 6-11.

Mudraya, O., Piao, S.L., Rayson, P., Sharoff, S., Babych, B. and Löfberg, L. (2008). Automatic Extraction of Translation Equivalents of Phrasal and Light Verbs in English and Russian. In Granger, S. and Meunier, F. (eds.) Phraseology : an interdisciplinary perspective. Benjamins, Amsterdam. Pp 293-309

2007

Babych, B., Sharoff, S., Hartley, A., and Mudraya, O. (2007). Assisting Translators in Indirect Lexical Transfer. Paper presented at the 45th International Conference of Association for Computational Linguistics ACL 2007, Prague, Czech Republic. [video]; [pdf]

Babych, B., Hartley, A., and Sharoff, S. (2007). Translating from under-resourced languages: comparing direct transfer against pivot translation. Paper presented at Machine Translation Summit XI, Copenhagen, Denmark. [pdf]

Babych, B., Hartley, A. (2007). Sensitivity of automated models for MT evaluation: proximity-based vs. performance-based methods. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 22pp. [pdf] of PPT presentation.

Babych B., Hartley, A and Sharoff, S. (2007). A dynamic dictionary for discovering indirect translation equivalents. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 10pp. [pdf]

2006

Sharoff S, Babych B., Hartley A. (2006). Using comparable corpora to solve problems difficult for human translators. Paper presented at the joint conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics COLING-ACL 2006, Sydney, Australia. [pdf]

Mudraya, O., Babych, B., Piao, S., Rayson, P., Wilson, A. (2006). Developing a Russian semantic tagger for automatic semantic annotation. In proceedings of Corpus Linguistics 2006, St. Petersburg, Russia, 10-14 October 2006, pp. 282-289 (in Russian) [pdf], pp. 290-297 (in English) [pdf].

Sharoff, S., Babych, B., Rayson, P., Mudraya, O. and Piao, S. (2006) ASSIST: Automated Semantic Assistance for Translators. In companion proceedings to the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2006), Trento, Italy, April 3-7, 2006, pp. 139-142. ISBN 1-932432-60-4. [pdf]

Sharoff, S., Babych, B., Hartley, A. (2006) Using collocations from comparable corpora to find translation equivalents. In Proc. of LREC2006, Genoa, May, 2006, pp. 465-470. [pdf]

2005

Babych, B., Hartley, A., and Elliott, D. (2005). Estimating the predictive power of n-gram MT evaluation metrics across language and text types. Paper presented at Machine Translation Summit X, Phuket, Thailand. [pdf]

Babych, B. (2005). Information extraction technology in machine translation: IE methods for improving and evaluating MT quality. PhD thesis, University of Leeds, Centre for Translation Studies, March 2005. 186pp. [pdf]

2004

Babych, B., and Hartley, A. (2004). Extending the BLEU MT Evaluation Method with Frequency Weightings. Paper presented at the 42nd International Conference of the Association for Computational Linguistics, ACL 2004: Barcelona, Spain. [pdf]

Babych, B. (2004). Weighted N-gram model for evaluating Machine Translation output. Paper presented at the 7th Annual Colloquium for the UK Special Interest Group for Computational Linguistics, CLUK'04 University of Birmingham, UK. [pdf]

Babych, B., Elliott D., Hartley A. (2004). Extending MT evaluation tools with translation complexity metrics. Paper presented at the 20th International Conference on Computational Linguistics COLING 2004. University of Geneva, Switzerland. [pdf]

Babych, B., Hartley A. (2004). Selecting Translation Strategies in MT using Automatic Named Entity Recognition. Paper presented at the European Association for Machine Translation (EAMT) Workshop, Malta. [pdf]

Babych, B., Hartley A. (2004). Modelling legitimate translation variation for automatic evaluation of MT quality. Paper presented at the 4th International Conference on Language Resources and Evaluation LREC 2004, Lisbon, Portugal. [pdf]

Babych, B., Elliott D., Hartley A. (2004). Calibrating resource-light automatic MT evaluation: a cheap approach to ranking MT systems by the usability of their output. Paper presented at the 4th International Conference on Language Resources and Evaluation LREC 2004, Lisbon, Portugal. [pdf]

Babych, B., Hartley A. (2004). Comparative Evaluation of Automatic Named Entity Recognition from Machine Translation Output. Paper presented at the Workshop on Named Entity Recognition for Natural Language Processing Applications. In Conjunction with the First International Joint Conference on Natural Language Processing IJCNLP-04, Sanya. [pdf]

2003

Babych, B., and Hartley, A. (2003). Improving Machine Translation quality with automatic Named Entity recognition. Paper presented at the 7th International EAMT workshop on MT and other language technology tools at the 10th Conference of the European Chapter of the Association for Computational Linguistics EACL 2003, Budapest, Hungary. [pdf]

Hartley, A., Babych, B and Elliott, D. (2003). Using corpora to evaluate Machine Translation. Paper presented at the Conference on Using corpora and databases in translation. University of Portsmouth, 14 November 2003. pp. 39-58.

Babych B., Hartley A., Atwell E., (2003). Statistical modelling of MT output corpora for Information Extraction. Paper presented at the Corpus Linguistics 2003 conference. Lancaster University, UK [pdf]

2000

Babych, B. (2000). Interpretational model of formal syntactic structures in Ukrainian. Thesis submitted in accordance with the requirements for the degree of Candidate of Sciences. Unpublished manuscript. [pdf] (in Ukrainian)

1999

Babych, B. (1999). Lexical semantics in syntactic structure of a text: formal representation and interpretation. In: Proc. of the all-Ukrainian scholarly conference "Semantics, Syntactics and Pragmatics of Speaking" 25th-27th January, 1999. Lviv, "Litopys", 1999. p. 101-108.
[pdf] (English)
[pdf] (Ukrainian)

1998

Babych, B. (1998). Systems of syntaxeme groups and their procedural semantics. Movoznavstvo (Linguistics) no. 6 (190), November-December 1998. Kyiv.
[pdf] (in English);
[pdf] (in Ukrainian)

1997

Babych, B. (1997). A method of automatic identification of sentence-level errors for automatic grammar checking. In: Abstracts of all-Ukrainian conference on grammar and spelling codification. Kyiv, 1997 [pdf] (in Ukrainian)

Babych, B. (1997). Representation and interpretation of ambiguous deep syntactic structures. Ukrainian Linguistics. Issue 21. Kyiv. Pp. 89-100. [pdf] (in Ukrainian)

Technical reports

Babych, B. (2002). Word order variation and comprehensibility of centre embedding: evidence from Ukrainian. Technical Report. TRE-CTS-Babych-2002 (Unpublished) [pdf].
Presentation at a research seminar of the Natural Language Processing group at Leeds: 11 October 2002: Comprehensibility limits on centre-embedded structures. [pdf]

Babych, B. (2001). The model of word order variation in Ukrainian declarative sentences. Technical Report TRE-Ieper-Babych-2001 (Unpublished) [pdf]

Babych, B. (2001). Language identification algorithm for disambiguating English and Ukrainian URL and e-mail tokens. TRE-Ieper-Babych-2001-b (Unpublished) [pdf]

Babych, B. (1997). Scalar implicature of logical connectives. Technical Report TRE-Cornell-Babych-1997. (Unpublished). [pdf]

Babych, B. (1995). Conceptual Syntax of Surface Syntactic Structures (on the material of military operation orders). Technical Report TRE-Kyiv-Babych-1995. (Unpublished). [pdf] (in Ukrainian)

Reviewing and feedback on literary translations

Eugenia Kononenko, A Russian Story. Glagoslav Publications, translated by Patrick Corness. London, 2013.

Otar Dovzhenko, The song of the railroad crossing barrier, translated by Patrick Corness. In Massachusetts Review, Spring, 2011, Volume 52, Issue 1

CV

Dr Bogdan BABYCH

EMPLOYMENT / RESEARCH / ACADEMIC DEGREES

2020 -- present, Heidelberg University, Institute for Translation and Interpreting
    -- Professor in Translation Studies

2010 -- present, University of Leeds, CTS
    -- Visiting Research Fellow (2020 - present)
    -- Associate Professor in Translation Studies (2014-2020)
    -- Lecturer (2010-2014)
    -- Co-ordinator and Principal Investigator: HyghTra (HyghQuality Hybrid MT System -- FP7 Marie Curie IAPP)
    -- Principal Investigator (for CTS, Leeds): ACCURAT FP7 ICT project

2009, October -- 2010, August: University of Leeds, CTS
    -- Research Fellow, TAUS, TTC, ACCURAT projects
    -- Projects: Intelligent access to shared translation resources (TAUS fellowship)
         FP7 Translation, Terminology and Comparable Corpora (TTC)
         FP7 Analysis and evaluation of Comparable Corpora
             for Under-Resourced Areas of Machine Translation (ACCURAT)

2007, October -- 2009, September: University of Leeds, CTS
    -- Leverhulme Early Career Research Fellow
    -- Project title: Translation Strategies in Comparable Corpora

2005, April -- 2007, October: University of Leeds, CTS
    -- Post-doc Research Fellow, EPSRC-funded project ASSIST

2005, April: PhD in Machine Translation, University of Leeds
    Thesis: "Information extraction technology in machine
    translation: IE methods for improving and evaluating MT quality"

2002, October -- 2005, April: University of Leeds, CTS and
    University of Sheffield, Department of Computer Science
    "White Rose" PhD studentship: for the project "Information Extraction Technology in Machine Translation"
    supervisors: Anthony Hartley (Leeds), Yorick Wilks (Sheffield)

2000 -- 2001: "Lernout and Hauspie Speech Products", Belgium
    Corporate R&D, Linguistic Engineering Department
    -- Computational Linguist at Text-to-Speech systems group

2000, February: 'Candidate of Sciences'
    in Ukrainian Linguistics, Ukrainian National Academy of Sciences
    Thesis (Ukr.): "Interpretational model of surface syntactic structures in Ukrainian"

1996 -- 2000: Ukrainian National Academy of Sciences
    Institute for Language Information Research
     Graduate student, Research Fellow

1996, June: Diploma in Ukrainian Philology
    and Computational Linguistics, Kyiv University (Ukraine)

1991 -- 1996: Kyiv University (Ukraine), Ukrainian Linguistics and Literature, Computational Linguistics

PROGRAMMING

Python, Perl, Prolog, Java, JavaScript/HTML/CGI, bash, AWK

GitHub repositories

Projects, corpuslabs, ideas, articles... on GitHub https://github.com/bogdanbabych

LANGUAGES

English, Ukrainian (native), Spanish, German, Dutch (elementary)

AWARDS / FELLOWSHIPS

2007 -- 2009: Leverhulme Early Career Research Fellowship

2002 -- 2005: "White Rose" PhD Studentship

1998 -- 1999: Scholarship of the National Academy of Sciences, Ukraine

1994 -- 1996: Taras Shevchenko Scholarship, Kyiv University, Ukraine