Professor Bogdan Babych
Professor of Translation Studies, Heidelberg University and Visiting Research Fellow, University of Leeds
Member of the Natural Language Processing Group
Publications & CV
ORCID
GitHub Page
https://bogdanbabych.github.io/
Publications
2021
Jakub Piskorski, Bogdan Babych, Zara Kancheva, Olga Kanishcheva, Maria Lebedeva, Michał Marcińczuk, Preslav Nakov, Petya Osenova, Lidia Pivovarova, Senja Pollak, Pavel Přibáň, Ivaylo Radev, Marko Robnik-Šikonja, Vasyl Starko, Josef Steinberger, Roman Yangarber (2021). Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages. In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing. European Association for Computational Linguistics
2019
Babych, Bogdan. (2019). "Unsupervised Induction of Ukrainian Morphological Paradigms for the New Lexicon: Extending Coverage for Named Entities and Neologisms using Inflection Tables and Unannotated Corpora." In Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing at ACL 2019, Florence, August 2019, pp. 1-11.
Babych, Bogdan, Fangzhong Su, Anthony Hartley, Ahmet Aker, Monica Lestari Paramita, Paul Clough, and Robert Gaizauskas. (2019). "Cross-Language Comparability and Its Applications for MT." In Using Comparable Corpora for Under-Resourced Areas of Machine Translation, pp. 13-53. Springer.
Babych, Bogdan, Yu Chen, Andreas Eisele, Sabine Hunsicker, Mārcis Pinnis, Inguna Skadiņa, Raivis Skadiņš et al. (2019). "Training, Enhancing, Evaluating and Using MT Systems with Comparable Data." In Using Comparable Corpora for Under-Resourced Areas of Machine Translation, pp. 189-254. Springer.
Rapp, Reinhard, Vivian Xu, Michael Zock, Serge Sharoff, Richard Forsyth, Bogdan Babych, Chenhui Chu, Toshiaki Nakazawa, and Sadao Kurohashi. (2019). "New Areas of Application of Comparable Corpora." In Using Comparable Corpora for Under-Resourced Areas of Machine Translation, pp. 255-290. Springer.
2018
Babych, Bogdan (2018). Development and evaluation of phonological models for cognate identification. EAMT-2018. Proceedings of the 21st Annual Conference of the European Association for Machine Translation. 28–30 May 2018. Universitat d’Alacant. Alacant, Spain URL: https://rua.ua.es/dspace/bitstream/10045/76019/1/EAMT2018-Proceedings_06.pdf
Babych, Bogdan (2018). Construction Grammar Corpus Annotation for Morphologically Rich Languages. Presentation at CAMRL2018: Workshop on Computational Approaches to Morphologically Rich Languages. Leeds, 3 July 2018.
Babych, Bogdan (2018). Unsupervised discovery of Construction Grammar representations for under-resourced languages. Presentation at LxGR2018 symposium: Corpus Approaches to Lexicogrammar. Edge Hill University, 16 June 2018 URL: https://www.edgehill.ac.uk/english/files/2018/06/LxGr2018.Babych.slides.pdf
2017
Babych, B. (2017). Unsupervised induction of morphological lexicon for Ukrainian. In: Proc. of CAMRL2017: Workshop on Computational Approaches to Morphologically Rich Languages. Leeds. 5 July 2017
Yu Yuan, Bogdan Babych and Serge Sharoff. (2017) Reference-free System for Automated Human Translation Quality Estimation. In: Proc. of 12th Iberian Conference on Information Systems and Technologies (CISTI), 21-24 June 2017
Babych, B. (2017). Deconstruction of the Russian propaganda discourse in military history: Identifying and neutralizing linguistic means of falsifying history of the Ukrainian division “Halychyna”. In: Proc. of 2nd International forum on crisis communications: “Information stream models as tactical instruments of communication content security”. Military Institute of National Taras Shevchenko University, Kyiv, 22-23 May 2017, pp.: 80-89
2016
Babych, B. (2016). Graphonological Levenshtein Edit Distance: Application for automated cognate identification. Baltic Journal of Modern Computing. Vol.4 (2016), No.2 (EAMT-2016 volume), 115-128 [pdf] (preprint)
Babych, B., Sharoff, S. (2016). Rapid induction of morphological disambiguation resources from a closely related language. Fifth Workshop on Hybrid Approaches to Translation (HyTra-5) [pdf]
Babych, B. (2016) A hybrid machine translation system between English, Ukrainian, Arabic and Russian for automated terrorist activity detection. In: Proc. of XII International Conference "Military education and science: the present and the future" Military Institute of Taras Shevchenko National University, 25 November 2016, Kyiv, Ukraine.
Yuan, Y., Sharoff, S. and Babych, B. (2016). MoBiL: A hybrid feature set for Automatic Human Translation quality assessment. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Ljubljana, Slovenia.
Babych, B. (2016). Nuclear weapons of ideological warfare: Ukraine's revolution and war poetry (2013- ) as translation of the totalitarian discourse. Sociology of poetry translation conference, June 2016, University of Leeds.
Babych, B. (2016). Deconstruction of the totalitarian discourse as a factor of country’s information security: Ukrainian post-2013 literature of Revolution and war as resistance to Russia’s hybrid aggression in the cultural information space. Proc. of International forum on crisis communications, Military Institute of National Taras Shevchenko University, Kyiv, Ukraine, 9-10 June 2016, pp.: 153-157
2015
Babych, B and Atwell, E (2015). Multilingual Information Extraction framework for real-time detection of terrorist propaganda threats in on-line communication. In: Proc. of XI International Conference "Military education and science: the present and the future" Military Institute of Taras Shevchenko National University, 27 November 2015, Kyiv, Ukraine. Abstract (en) [pdf]; Abstract (uk) [pdf]; Powerpoint (en) [ppt]; Powerpoint (uk) [ppt]
2014
Bogdan Babych, Jonathan Geiger, Mireia Ginestí Rosell, Kurt Eberle (2014) Deriving de/het gender classification for Dutch nouns for rule-based MT generation tasks. Submitted to EACL 2014 Third Workshop on Hybrid Approaches to Translation (HyTra) [pdf]
Bogdan Babych, Anne Buckley, and Svitlana Babych (2014). Advanced learners' errors in correcting Machine Translation output: comparative corpus-based analysis. TALC-2014. Teaching and Language Corpora conference. Lancaster. [pdf] (abstract)
Babych, B. (2014). Automated MT evaluation metrics and their limitations. Revista Tradumàtica: tecnologies de la traducció. Desembre 2014: Número 12, Traducció i qualitat. ISSN: 1578-7559, pp. 464-470. [pdf]
2013
Marta R. Costa-jussà, Rafael E. Banchs, Reinhard Rapp, Patrik Lambert, Kurt Eberle, Bogdan Babych (2013). Workshop on Hybrid Approaches to Translation: Overview and Developments. In: Proc. of 2nd HyTra Workshop, ACL 2013. [pdf]
Svitlana Babych, Kurt Eberle, Bogdan Babych (2013). Development of hybrid Machine Translation systems for under-resourced languages: Automated creation of lexical and morphological resources for MT. In: Proc. of 6th International Conference: Applied and Literary Translation and Interpreting: Theory, Methodology, Practice. Kyiv, Ukraine 5-6 April 2013.
2012
Mārcis Pinnis, Radu Ion, Dan Ştefănescu, Fangzhong Su, Inguna Skadiņa, Andrejs Vasiļjevs, Bogdan Babych (2012) ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012): Demo Session, July 8 - July 14, 2012, Jeju, Korea. [pdf]
Bogdan Babych, Anne Buckley, Richard Hughes, Svitlana Babych (2012) Machine Translation technology in advanced language teaching and translator training: a corpus-based approach to post-editing MT output. In: Proceedings of TALC 2012: Teaching and Language Corpora Conference. Warsaw, Poland, 12th - 14th July 2012. [pdf]
Bogdan Babych, Anthony Hartley, Kyo Kageura, Martin Thomas, and Masao Utiyama. (2012). Scaffolding, capturing and preserving interactions in educating for collaborative translation. In Proceedings of 2012 International Conference 'The Making of a Translator' Taiwan Normal University, Taipei, Taiwan, 28-29 April 2012
Bogdan Babych, Anthony Hartley, Kyo Kageura, Martin Thomas, and Masao Utiyama. (2012). MNH-TT: a collaborative platform for translator training. [Aslib 2012] Translating and the Computer 34, 29-30 November 2012, One Birdcage Walk, London, UK; 18pp. [pdf]; [presentation by Martin Thomas]
Fangzhong Su and Bogdan Babych (2012) Measuring Comparability of Documents in Non-Parallel Corpora for Efficient Extraction of (Semi-)Parallel Translation Equivalents. In: Proceedings of Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra) at EACL-2012. Pp. 10-19. [pdf]
Kurt Eberle, Bogdan Babych, Johanna Geiß, Mireia Ginesti-Rosell, Anthony Hartley, Reinhard Rapp, Serge Sharoff and Martin Thomas (2012) Design of a hybrid high quality machine translation system. In: Proceedings of Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra) at EACL-2012. Pp. 101-112 [pdf]
Fangzhong Su and Bogdan Babych (2012) Development and Application of a Cross-language Document Comparability Metric. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), 21-27 May 2012, Istanbul, Turkey; pp.3956-3962. [pdf]
Reinhard Rapp, Serge Sharoff, Bogdan Babych and Richard Forsyth (2012) Identifying Word Translations from Comparable Documents without a Seed Lexicon. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), 21-27 May 2012, Istanbul, Turkey; pp.460-466. [pdf]
Inguna Skadiņa, Ahmet Aker, Nikos Glaros, Fangzhong Su, Dan Tufis, Mateja Verlic, Andrejs Vasiļjevs and Bogdan Babych (2012) Collecting and Using Comparable Corpora for Statistical Machine Translation. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12), 21-27 May 2012, Istanbul, Turkey; pp.438-445. [pdf]
2011
Babych, B. and Hartley, A. (2011). Meta-evaluation of comparability metrics using parallel corpora. International Journal of Computational Linguistics and Applications, Proceedings volume of CICLing-2011. [pdf] (preprint)
2010
Jo Drugan and Bogdan Babych. (2010). Shared resources, shared values? Ethical implications of sharing translation resources. JEC 2010: Second joint EM+/CNGL Workshop 'Bringing MT to the user: research on integrating MT in the translation industry', AMTA 2010, Denver, Colorado, November 4, 2010; pp.3-9. [pdf]
2009
Babych, B. and Hartley, A. (2009). Automated error analysis for multiword expressions: using BLEU-type scores for automatic discovery of potential translation errors. Linguistica Antverpiensia, New Series (8/2009): Journal of translation and interpreting studies. Special Issue on Evaluation of Translation Technology., 8, pp. 81-104. [pdf] (preprint)
Babych, B., Hartley, A. and Sharoff, S. (2009). Evaluation-guided pre-editing of source text: improving MT-tractability of light verb constructions. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.36-43 [pdf]
Sharoff, S; Babych, B and Hartley, A (2009). 'Irrefragable answers' using comparable corpora to retrieve translation equivalents. Language Resources and Evaluation 43 (1), 15-25.
2008
Babych, B., Sharoff, S., and Hartley, A. (2008). Generalising Lexical Translation Strategies for MT Using Comparable Corpora. Proceedings of the Sixth International Language Resources and Evaluation (LREC'08). Marrakech, Morocco. 28-30 May 2008. [pdf]
Babych, B., and Hartley, A. (2008). Sensitivity of Automated MT Evaluation Metrics on Higher Quality MT Output: BLEU vs Task-Based Evaluation Methods. Proceedings of the Sixth International Language Resources and Evaluation (LREC'08). Marrakech, Morocco. 28-30 May 2008. [pdf]
Babych, B., and Hartley, A. (2008). Automated MT Evaluation for Error Analysis: Automatic Discovery of Potential Translation Errors for Multiword Expressions. Proceedings of the ELRA Workshop on Evaluation Looking into the Future of Evaluation: When automatic metrics meet task-based and performance-based approaches. Marrakech, Morocco. 27 May 2008. pp.6--11.
Mudraya, O., Piao, S.L., Rayson, P., Sharoff, S., Babych, B. and Löfberg, L. (2008). Automatic Extraction of Translation Equivalents of Phrasal and Light Verbs in English and Russian. In Granger, S. and Meunier, F. (eds.) Phraseology : an interdisciplinary perspective. Benjamins, Amsterdam. Pp 293-309
2007
Babych, B., Sharoff, S., Hartley, A., and Mudraya, O. (2007). Assisting Translators in Indirect Lexical Transfer. Paper presented at the 45th International Conference of Association for Computational Linguistics ACL 2007, Prague, Czech Republic. [video]; [pdf]
Babych, B., Hartley, A., and Sharoff, S. (2007). Translating from under-resourced languages: comparing direct transfer against pivot translation. Paper presented at Machine Translation Summit XI, Copenhagen, Denmark. [pdf]
Babych, B., Hartley, A. (2007). Sensitivity of automated models for MT evaluation: proximity-based vs. performance-based methods. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 22pp. [pdf] of PPT presentation.
Babych B., Hartley, A and Sharoff, S. (2007). A dynamic dictionary for discovering indirect translation equivalents. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 10pp. [pdf]
2006
Sharoff S, Babych B., Hartley A. (2006). Using comparable corpora to solve problems difficult for human translators. Paper presented at the joint conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics COLING-ACL 2006, Sydney, Australia. [pdf]
Mudraya, O., Babych, B., Piao, S., Rayson, P., Wilson, A. (2006). Developing a Russian semantic tagger for automatic semantic annotation. In proceedings of Corpus Linguistics 2006, St. Petersburg, Russia, 10-14 October 2006, pp. 282-289 (in Russian) [pdf], pp. 290-297 (in English) [pdf].
Sharoff, S., Babych, B., Rayson, P., Mudraya, O. and Piao, S. (2006) ASSIST: Automated Semantic Assistance for Translators. In companion proceedings to the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2006), Trento, Italy, April 3-7, 2006, pp. 139 - 142. ISBN 1-932432-60-4. [pdf]
Sharoff, S., Babych, B., Hartley, A. (2006) Using collocations from comparable corpora to find translation equivalents. In Proc. of LREC2006, Genoa, May, 2006, pp. 465-470. [pdf]
2005
Babych, B., Hartley, A., and Elliott, D. (2005). Estimating the predictive power of n-gram MT evaluation metrics across language and text types. Paper presented at Machine Translation Summit X, Phuket, Thailand. [pdf]
Babych, B. (2005). Information extraction technology in machine translation: IE methods for improving and evaluating MT quality. PhD thesis, University of Leeds, Centre for Translation Studies, March 2005. 186pp. [pdf]
2004
Babych, B., and Hartley, A. (2004). Extending the BLEU MT Evaluation Method with Frequency Weightings. Paper presented at the 42nd International Conference of the Association for Computational Linguistics, ACL 2004: Barcelona, Spain. [pdf]
Babych, B. (2004). Weighted N-gram model for evaluating Machine Translation output. Paper presented at the 7th Annual Colloquium for the UK Special Interest Group for Computational Linguistics, CLUK'04 University of Birmingham, UK. [pdf]
Babych, B., Elliott D., Hartley A. (2004). Extending MT evaluation tools with translation complexity metrics. Paper presented at the 20th International Conference on Computational Linguistics COLING 2004. University of Geneva, Switzerland. [pdf]
Babych, B., Hartley A. (2004). Selecting Translation Strategies in MT using Automatic Named Entity Recognition. Paper presented at the European Association for Machine Translation (EAMT) Workshop, Malta. [pdf]
Babych, B., Hartley A. (2004). Modelling legitimate translation variation for automatic evaluation of MT quality. Paper presented at the 4th International Conference on Language Resources and Evaluation LREC 2004, Lisbon, Portugal. [pdf]
Babych, B., Elliott D., Hartley A. (2004). Calibrating resource-light automatic MT evaluation: a cheap approach to ranking MT systems by the usability of their output. Paper presented at the 4th International Conference on Language Resources and Evaluation LREC 2004, Lisbon, Portugal. [pdf]
Babych, B., Hartley A. (2004). Comparative Evaluation of Automatic Named Entity Recognition from Machine Translation Output. Paper presented at the Workshop on Named Entity Recognition for Natural Language Processing Applications. In Conjunction with the First International Joint Conference on Natural Language Processing IJCNLP-04, Sanya. [pdf]
2003
Babych, B., and Hartley, A. (2003). Improving Machine Translation quality with automatic Named Entity recognition. Paper presented at the 7th International EAMT workshop on MT and other language technology tools at the 10th Conference of the European Chapter of the Association for Computational Linguistics EACL 2003, Budapest, Hungary. [pdf]
Hartley, A., Babych, B and Elliott, D. (2003). Using corpora to evaluate Machine Translation. Paper presented at the Conference on Using corpora and databases in translation. University of Portsmouth, 14 November 2003. pp. 39-58.
Babych B., Hartley A., Atwell E., (2003). Statistical modelling of MT output corpora for Information Extraction. Paper presented at the Corpus Linguistics 2003 conference. Lancaster University, UK [pdf]
2000
Babych, B. (2000). Interpretational model of formal syntactic structures in Ukrainian. Thesis submitted in accordance with the requirements for the degree of Candidate of Sciences. Unpublished manuscript. [pdf] (in Ukrainian)
1999
Babych, B. (1999). Lexical semantics in syntactic structure of a text: formal representation and interpretation. In: Proc. of the all-Ukrainian scholarly conference "Semantics, Syntactics and Pragmatics of Speaking" 25th-27th January, 1999. Lviv, "Litopys", 1999. p. 101-108.
[pdf] (English)
[pdf] (Ukrainian)
1998
Babych, B. (1998). Systems of syntaxeme groups and their procedural semantics. Movoznavstvo (Linguistics) no. 6 (190), November-December 1998. Kyiv.
[pdf] (in English);
[pdf] (in Ukrainian)
1997
Babych, B. (1997). A method of automatic identification of sentence-level errors for automatic grammar checking. In: Abstracts of all-Ukrainian conference on grammar and spelling codification. Kyiv, 1997 [pdf] (In Ukrainian)
Babych, B. (1997). Representation and interpretation of ambiguous deep syntactic structures. Ukrainian Linguistics. Issue 21. Kyiv. Pp. 89-100. [pdf] (in Ukrainian)
Technical reports
Babych, B. (2002). Word order variation and comprehensibility of centre embedding: evidence from Ukrainian. Technical Report. TRE-CTS-Babych-2002 (Unpublished) [pdf].
Presentation at a research seminar of the Natural Language Processing group at Leeds: 11 October 2002: Comprehensibility limits on centre-embedded structures. [pdf]
Babych, B. (2001). The model of word order variation in Ukrainian declarative sentences. Technical Report TRE-Ieper-Babych-2001 (Unpublished) [pdf]
Babych, B. (2001). Language identification algorithm for disambiguating English and Ukrainian URL and e-mail tokens. TRE-Ieper-Babych-2001-b (Unpublished) [pdf]
Babych, B. (1997). Scalar implicature of logical connectives. Technical Report TRE-Cornell-Babych-1997. (Unpublished). [pdf]
Babych, B. (1995). Conceptual Syntax of Surface Syntactic Structures (on the material of military operation orders). Technical Report TRE-Kyiv-Babych-1995. (Unpublished). [pdf] (in Ukrainian)
Reviewing and feedback on literary translations
Eugenia Kononenko, A Russian Story. Glagoslav Publications, translated by Patrick Corness. London, 2013.
Otar Dovzhenko, The song of the railroad crossing barrier, translated by Patrick Corness. In Massachusetts Review, Spring, 2011, Volume 52, Issue 1
CV
Dr Bogdan BABYCH
EMPLOYMENT / RESEARCH / ACADEMIC DEGREES
2020 -- present, Heidelberg University, Institute for Translation and Interpreting
-- Professor in Translation Studies
2010 -- present, University of Leeds, CTS
-- Visiting Research Fellow (2020 - present)
-- Associate Professor in Translation Studies (2014-2020)
-- Lecturer (2010-2014)
-- Co-ordinator and Principal Investigator: HyghTra (HyghQuality Hybrid MT System -- FP7 Marie Curie IAPP)
-- Principal Investigator (for CTS, Leeds): ACCURAT FP7 ICT project
2009, October -- 2010, August: University of Leeds, CTS
-- Research Fellow, TAUS, TTC, ACCURAT projects
-- Projects: Intelligent access to shared translation resources (TAUS fellowship)
FP7 Translation, Terminology and Comparable Corpora (TTC)
FP7 Analysis and evaluation of Comparable Corpora
for Under Resourced Areas of Machine Translation (ACCURAT)
2007, October -- 2009, September: University of Leeds, CTS
-- Leverhulme Early Career Research Fellow
-- Project title: Translation Strategies in Comparable Corpora
2005, April -- 2007, October: University of Leeds, CTS
-- Post-doc Research Fellow, EPSRC-funded project ASSIST
2005, April: PhD in Machine Translation, University of Leeds
Thesis: "Information extraction technology in machine
translation: IE methods for improving and evaluating MT quality"
2002, October -- 2005, April: University of Leeds, CTS and
University of Sheffield, Department of Computer Science
"White Rose" PhD studentship: for the project "Information Extraction Technology in Machine Translation"
supervisors: Anthony Hartley (Leeds), Yorick Wilks (Sheffield)
2000 -- 2001: "Lernout and Hauspie Speech Products", Belgium
Corporate R&D, Linguistic Engineering Department
-- Computational Linguist at Text-to-Speech systems group
2000, February: 'Candidate of Sciences'
in Ukrainian Linguistics, Ukrainian National Academy of Sciences
Thesis (Ukr.): "Interpretational model of surface syntactic structures in Ukrainian"
1996 -- 2000: Ukrainian National Academy of Sciences
Institute for Language Information Research
Graduate student, Research Fellow
1996, June: Diploma in Ukrainian Philology
and Computational Linguistics, Kyiv University (Ukraine)
1991 -- 1996: Kyiv University (Ukraine), Ukrainian Linguistics and Literature, Computational Linguistics
PROGRAMMING
Python, Perl, Prolog, Java, JavaScript/HTML/CGI, bash, AWK
GitHub repositories
Projects, corpuslabs, ideas, articles... on GitHub https://github.com/bogdanbabych
LANGUAGES
English, Ukrainian (native), Spanish, German, Dutch (elementary)
AWARDS / FELLOWSHIPS
2007 -- 2009: Leverhulme Early Career Research Fellowship
2002 -- 2005: "White Rose" PhD Studentship
1998 -- 1999: Scholarship of the National Academy of Sciences, Ukraine
1994 -- 1996: Taras Shevchenko Scholarship, Kyiv University, Ukraine