- Ondrej Čičkán: Comment classification in Community Question Answering
- Jakub Gedera: Reconstruction and Normalization of Slovak Texts
- Štefan Grivalský: Natural language processing using neural networks
- Michal Hucko: Looking for interesting places in user’s records
- Rastislav Krchňavý: Aspect-Based Sentiment Analysis
- Lukáš Manduch: Active fight against spam
- Róbert Móro: Navigation Leads for Exploratory Search and Navigation in Digital Libraries
- Samuel Pecár: Ontology learning from text
- Branislav Pecher, Michal Kováčik, Jozef Mláka, Pavol Ondrejka: Development of Innovative Application in International Competition
- Matúš Pikuliak: Neural Language Models
- Michal Puškáš: Advanced search and visualization
- Márius Šajgalík: Modeling Text Semantics
- Andrej Švec: Modelling the appropriateness of text posts
- Andrej Vítek: Online support solution for educational exercises
- Filip Vozár: Sentiment analysis from text about given object
Comment classification in Community Question Answering
master study, supervised by Marián Šimko
Abstract. Community Question Answering (CQA) forums have become very popular in recent years. They are open to the public, and everyone can contribute to solving other people’s problems, so they provide a large repository of knowledge. Finding the best answer to a new question in an existing repository of questions and answers would be useful not only for CQA services, to reduce duplicate questions, but also for automatic question answering.
In our work, we focus on ranking comments under a question thread in CQA forums. We will use annotated data published for SemEval 2017 (International Workshop on Semantic Evaluation).
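The comment ranking task can be illustrated with a very simple baseline. The sketch below is not the method used in this project; it assumes a plain bag-of-words representation and ranks comments by cosine similarity to the question. The function names `tokenize`, `cosine` and `rank_comments` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_comments(question, comments):
    """Return the comments sorted by descending similarity to the question."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine(q_vec, Counter(tokenize(c))), c) for c in comments]
    # Sort on the score only, so ties keep their original thread order.
    return [c for _, c in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

A real system would likely replace the word-overlap score with learned features, but the input and output shape (one question, a list of comments, a ranked list back) stays the same.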
Reconstruction and Normalization of Slovak Texts
master study, supervised by Marián Šimko
Abstract. Many journals use sentiment analysis to detect misconduct in discussions. The problem is that informal language dominates on the Internet, which complicates the task of sentiment analysis. A typical feature of posts is that users include emoticons, which have a strong impact on sentiment analysis. Many negative posts include funny emoticons, and vice versa, which affects the accuracy of the result. People fairly confidently assume that they can correctly identify emotions in text messages. Experiments from Chatham University found that this assumption is misleading.
The aim of our work is to reconstruct and normalize the input text. We intend to determine emotions from the text and correctly replace emoticons whose meaning differs from the emotion of the text. Detecting emotion from text is a relatively new classification task. To solve this problem, we use an emotion detection model. We consider Ekman’s six emotion classes (joy, sadness, anger, disgust, fear, surprise).
Finally, we plan to compare the performance of a sentiment analyzer on posts before reconstruction and after applying our method, which replaces emoticons in a post based on the emotion of the post.
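The replacement step described above can be sketched as follows. This is a minimal illustration under several assumptions, not the project's implementation: the emotion label is assumed to come from a separate emotion detection model, and the emoticon lexicon, the mapping from Ekman's classes to polarity, and the function name `normalize_emoticons` are all hypothetical.

```python
import re

# Hypothetical polarity lexicon for a handful of emoticons (illustrative only).
EMOTICON_POLARITY = {
    ":)": "positive",
    ":D": "positive",
    ":(": "negative",
    ":'(": "negative",
}

# Assumed mapping from Ekman's emotion classes to polarity.
# Surprise is treated as neutral, so no emoticon is replaced for it.
EMOTION_POLARITY = {
    "joy": "positive",
    "sadness": "negative",
    "anger": "negative",
    "disgust": "negative",
    "fear": "negative",
    "surprise": None,
}

# Emoticon substituted when the original one conflicts with the text emotion.
REPLACEMENT = {"positive": ":)", "negative": ":("}

# Longest emoticons first, so ":'(" is matched before any shorter pattern.
EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_POLARITY, key=len, reverse=True))
)


def normalize_emoticons(text, emotion):
    """Replace emoticons whose polarity conflicts with the detected emotion."""
    target = EMOTION_POLARITY.get(emotion)
    if target is None:
        return text  # neutral or unknown emotion: leave the post unchanged

    def fix(match):
        emoticon = match.group(0)
        if EMOTICON_POLARITY[emoticon] == target:
            return emoticon
        return REPLACEMENT[target]

    return EMOTICON_RE.sub(fix, text)
```

The normalized post can then be passed to the sentiment analyzer, which allows the before and after comparison mentioned in the abstract.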
Natural language processing using neural networks
bachelor study, supervised by Márius Šajgalík
Abstract. Natural language processing is a field at the intersection of computer science, artificial intelligence, and computational linguistics that focuses on the analysis and comprehension of human (natural) language. Much research in this field aims to gather knowledge about how humans comprehend and use language. This knowledge is later used to develop tools and techniques that allow computer systems to manipulate and use natural language to fulfil concrete tasks.
In our work, we focus on the categorization of texts, more specifically on the language identification task using neural networks. We can divide this task into two basic parts: the first consists of alphabet identification and the second of the analysis of linguistic features. To solve this problem, we investigate the suitability of multiple currently popular neural network architectures.
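The first part of the task, alphabet identification, can be illustrated without a neural network. The sketch below is an assumption-laden heuristic rather than the approach studied in this project: it uses the first word of each letter's Unicode character name (for example `LATIN` or `CYRILLIC`) as a rough script label and returns the most frequent one. The function name `dominant_script` is hypothetical.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Return the most frequent script label among the letters of the text.

    The label is the first word of the Unicode character name, which is a
    rough proxy for the script. Returns None if the text has no letters.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")  # "" for unnamed code points
            if name:
                counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Knowing the alphabet narrows the set of candidate languages; telling apart languages that share an alphabet is left to the second part, the analysis of linguistic features.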
Looking for interesting places in user’s records
bachelor study, supervised by Mária Bieliková
Abstract. Nowadays, the Internet is full of opportunities to collect data from different sources. We can collect information about users’ behavior from different angles: text input, mouse events, and even eye movements. But we cannot analyze all of them.
In my bachelor’s thesis, I analyse user records from web services, especially records collected while users were answering questions. I concentrate on longer free-text answers where there is not just one possible answer. It is sometimes very hard to evaluate each of these documents manually, and it becomes impossible when we are dealing with thousands of records or more. Automating this process would be helpful.
My main goal is to help with checking these answers. I am trying to answer the question: How can answer clustering help with evaluating the content of an answer? In my work, I apply different metrics and methods used mostly in text classification. My dataset consists of students’ answers collected at our university over the last few years.
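To show how answer clustering might reduce manual checking, the sketch below groups answers greedily by Jaccard similarity of their word sets, so that an evaluator could review one representative per group. It is only an illustration under assumed choices; the similarity metric, the threshold value and the function name `cluster_answers` are hypothetical and are not taken from the project.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_answers(answers, threshold=0.5):
    """Greedily assign each answer to the first sufficiently similar cluster.

    Each answer is compared only with the first (representative) answer of
    every existing cluster; if none is similar enough, a new cluster starts.
    """
    clusters = []  # list of clusters; each cluster is a list of (answer, token set)
    for answer in answers:
        toks = tokens(answer)
        for cluster in clusters:
            if jaccard(toks, cluster[0][1]) >= threshold:
                cluster.append((answer, toks))
                break
        else:
            clusters.append([(answer, toks)])
    return [[a for a, _ in cluster] for cluster in clusters]
```

The greedy scheme depends on the order of the answers and on the chosen threshold, which is one reason to compare several metrics and methods as the abstract proposes.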