Cross-Lingual Learning


Deep learning is revolutionizing natural language processing for English or Chinese, but other less developed languages (such as Slovak) simply can not use these new exciting technologies simply because of the lack of data. Most of the world languages lack the necessary amount of supervised (and unsupervised) data to successfully train more advanced and complex neural network architectures. Cross-lingual learning is one solution for this problem, as it tries to transfer the knowledge learned from rich languages to others. This knowledge can be transferred on various levels (label-level, model-level, parameter-level) that differ in how hard the transfer is and what resources we need to do it. In my work I focus on creating a multilingual language independent representations for sentences or higher lexical units. Such representations encode the semantic information present in the text into short vector. This vector is in fact a projection of sentences into feature space shared across languages. Such representations can then be used to transfer knowledge that models in one language learned about this space to other languages.