Students’ Research Works – Autumn 2012

Search and Recommendation

Information Analysis, Organization and Navigation

User Modeling, Virtual Communities and Social Networks

Domain Modeling, Semantics Discovery and Annotations


Doctoral Staff

Mária Bieliková
Professor
o collaborative/personalized/context search and navigation
o recommendation for adaptive social web
o reasoning in information space with semantics
o user/user groups and contexts modelling

Michal Barla
Postdoc
o user modeling
o implicit user feedback
o virtual communities
o collaborative surfing

Jozef Tvarožek
Postdoc
o social intelligent learning
o collaborative learning
o semantic text analysis
o natural language processing

Marián Šimko
Postdoc
o domain modelling
o semantic text analysis
o Web-based Learning 2.0
o ontologies, folksonomies


Personalized Reading Resources Organization

Roman Burger
master study, supervised by Mária Bieliková

Abstract. With the vast amount of information and resources accessible through the Web, one has to carefully filter what to read and what to ignore. However hard we try, chances are that our workspace will eventually get overwhelmed by the quantity of resources stored for later reading or revisiting.

The most common approach to storing resources found on the Web is bookmarks (or favourites). Unfortunately, keeping track of the bookmarks’ organization and structure becomes a tedious task with large collections. Users need to manage their bookmarks manually, which produces significant overhead when trying to work with the stored resources. It is not uncommon for users to abandon the management altogether due to the inconvenience.

In this project, we propose a method for faceted bookmark retrieval presented in a hierarchical folder structure. Faceted filters are based on various metadata retrieved from bookmarks and the browsing session. Filters can be chained to narrow the search as much as possible while utilizing the associative memory of users. Associative search is combined with a hierarchical folder structure which resembles the typical bookmark organization that users are familiar with. With faceted search over users’ bookmarks, no manual organization should be needed. On top of faceted search, users retain the ability to build custom folders containing specific bookmarks, for example bookmarks related to a specific project.
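
For illustration only (the bookmark fields and facets below are hypothetical, not part of the proposed method), chaining facet filters over bookmark metadata could look roughly like this:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Bookmark:
    """Hypothetical bookmark record with metadata usable as facets."""
    url: str
    title: str
    tags: set = field(default_factory=set)
    domain: str = ""
    saved_on: date = date(2012, 1, 1)


def by_tag(tag):
    return lambda b: tag in b.tags


def by_domain(domain):
    return lambda b: b.domain == domain


def saved_since(day):
    return lambda b: b.saved_on >= day


def apply_facets(bookmarks, *facets):
    """Chain facet filters: each facet narrows the previous result set."""
    for facet in facets:
        bookmarks = [b for b in bookmarks if facet(b)]
    return bookmarks


# Example: bookmarks tagged "recommenders" from a given site, saved this autumn.
# hits = apply_facets(my_bookmarks,
#                     by_tag("recommenders"),
#                     by_domain("dl.acm.org"),
#                     saved_since(date(2012, 9, 1)))
```

The retrieved hits could then be grouped into a virtual folder hierarchy (e.g., by domain and tag) instead of relying on a manually maintained structure.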

Imagine Cup 2013 – World Citizenship

Peter Demčák, Ondrej Galbavý, Miroslav Šimek, Veronika Štrbáková
bachelor study, supervised by Michal Barla

Abstract. Imagine Cup is an international competition for students organized by Microsoft. Its goal is to encourage students from all around the world to make an innovative and useful application which can potentially change our world for the better. This time the Worldwide Finals will be held in St. Petersburg, Russia.

The main theme of Imagine Cup 2013 is “Dream it. Build it. Live it.” The theme is one of several things that have changed slightly from previous years. Imagine Cup is now divided into three big competitions: World Citizenship, Innovation and Games.

Our attention is directed mostly at the World Citizenship competition, which deals with serious worldwide problems that are yet to be solved. However, we are considering the Innovation competition as well.

A very important step towards succeeding in Imagine Cup is to come up with the right project idea. This demands taking many factors into consideration. We have a few ideas that deal with the problems of people with certain handicaps. An important part of choosing and later working on an idea is to discuss it with other people, especially experts from the actual domain. With some research behind us – and a lot still in front of us – we aim to choose the right idea and do our best to bring it to life.

Crowdsourcing and Gamification as the Means of Metadata Acquisition

Peter Dulačka
master study, supervised by Jakub Šimko (team project)

Abstract. People react differently to various types of recommendations. A wrong article recommendation is more easily overlooked than a bad movie recommendation, mostly because of the time spent evaluating the given (wrong) reference. It is therefore crucial to have as much correct data as possible. Various methods have been proposed to achieve this. Automatic approaches have shown us how inaccurate computers still can be. Crowdsourcing, on the other hand, has turned out to be a very effective approach, but a lack of motivation holds the progress up.

To motivate people to do the work computers are not able to do correctly (or at all), we need to provide some kind of reward. Games with a purpose have turned out to be a good motivator, providing plain fun when properly designed. Our goal can also be achieved by using only elements of games in a serious application (e.g., Foursquare and its rewards and badge mechanism combined with social elements). In our work we focus on discovering new means of motivating players by trying out various gamification elements and the proper combination of existing ones for different types of players. Done correctly and combined with mobile technology, this could be a great source of data.

Recognizing User’s Emotion in Information System

Máté Fejes
master study, supervised by Jozef Tvarožek

Abstract. Human emotions and their outward signs are innate characteristics shared by all people, regardless of the particular person. Thanks to this, they can serve as implicit feedback from users of information systems. Facial expressions are unconscious signs of people’s psychological reactions. According to a number of studies in the area of psycho-feedback, particular facial movements are common to all people, so we can derive their cause – emotions. In the case of educational systems we can estimate the user’s opinion about the text she has read from her facial expressions, and so we are able to find out knowledge, interests, mood and other attributes that would not be possible to identify with traditional ways of gathering feedback.

In this project we deal with acquiring, representing and utilizing user emotions during work with a web-based educational system. To find out what is on the subject’s mind, we need a camera that records the user’s face. To extract emotions from video we can use a number of existing tools which are able to recognize a human face and its facial features. Facial features are salient points of the human face (e.g., the borders of the mouth or eyes). Their locations depend on the movements of facial muscles and therefore on the emotions of the user.

Our aim is to propose a method for user modeling based on emotions invoked during work in a web-based educational system. Our method is going to be based on the results of an experiment we plan to carry out in a real environment with a number of users. Within the experiment we will track the users with a webcam while they work in the selected system. By comparing the extracted emotions with users’ activities we will try to find relations between psychological state and the actions executed in the user interface. The goal of the experiment is to identify activities, or groups of activities, specific to users in a certain emotional state. This way we will be able to relate the emotions invoked by given content to traditional types of feedback.

Recommendation of Multimedia Content

Eduard Fritscher
master study, supervised by Dušan Zeleník (team project)

Abstract. Nowadays it is very important that web pages and applications not only store information but also communicate with the user in certain ways. Because of the growth of the World Wide Web, the amount of information stored in applications and pages has increased. Recommendation techniques and methods were invented to solve the problem of this information burst, but as the world changes, access to the Internet has changed as well. People connect to the World Wide Web from smart mobile devices, and this has opened a whole new era for information extraction and recommendation.

This new, unexplored territory for data extraction and recommendation offers a variety of ideas that could be implemented and tested and that could push research in information technologies forward. The mere combination of standard data extraction through communication protocols with geolocation mapping through GPS receivers could open up a whole new dimension of research.

Adaptive Feedback in Web Systems

Marek Grznár
bachelor study, supervised by Martin Labaj

Abstract. One of the main processes in adaptive web systems is the communication between the system and its users. The user evaluates the presented information, e.g., whether it was useful or helpful. Based on this information, the adaptive system provides recommendations of other content useful for the user.

The processes of feedback collection and result presentation used in the adaptive methods could also be adaptive themselves. For example, a window with a rating button and recommendations can appear just when the user finishes reading an article. Many such situations exist when we need more information from the user. For instance, when we find out that the user frequently provides incorrect answers to exercises, we notify the user that she should consider studying more explanations first and recommend such learning objects to her.
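
A minimal sketch of such an adaptive trigger; the signals and thresholds are illustrative assumptions rather than the proposed method:

```python
def should_ask_for_rating(scrolled_ratio, seconds_on_page, expected_reading_time):
    """Ask for explicit feedback only once the user appears to have finished reading."""
    return scrolled_ratio >= 0.9 and seconds_on_page >= 0.7 * expected_reading_time


def should_recommend_explanations(recent_answers, max_error_rate=0.5):
    """Suggest studying explanations when the user keeps answering exercises incorrectly."""
    if not recent_answers:
        return False
    errors = sum(1 for correct in recent_answers if not correct)
    return errors / len(recent_answers) > max_error_rate


# Example: the user scrolled through the article and spent enough time on it.
# if should_ask_for_rating(0.95, 120, expected_reading_time=150):
#     show_rating_widget_with_recommendations()
```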

Our goal is to propose and evaluate an adaptive feedback method for web systems. Our project will be employed in the ALEF adaptive learning system, where it will be able to help students in their learning process.

Extracting Keywords from Educational Content

Jozef Harinek
bachelor study, supervised by Marián Šimko

Abstract. When dealing with many documents we need an efficient way to work with them. The classic human-readable document representation (the coherent text as we read and understand it) is not suitable for computer processing. A computer cannot read the text as we do, so something easier to process is needed. One such simplified description of a document’s contents is keywords. They allow advanced manipulation with the document. Since there are usually many documents, it is impossible to add keywords to each document manually. There are many methods for extracting keywords from text (semi-)automatically, which are still being improved, and much research is in progress.

The aim of our work is to analyze the possibilities of automatic keyword extraction methods and to combine and improve some of them. Besides automatic term recognition (ATR) algorithms, we also want to use user interaction, such as notes or text highlighting, which can help us identify the relevant terms that describe the document best.
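
As an illustration, a simplified TF-IDF-style extractor whose scores are boosted by user highlights could look like the sketch below (our own simplification, not one of the ATR algorithms we plan to combine):

```python
import math
from collections import Counter


def extract_keywords(doc_tokens, corpus_token_sets, highlighted_terms, top_k=10, boost=2.0):
    """Score terms by TF-IDF and boost those the users highlighted or noted."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus_token_sets)
    scores = {}
    for term, freq in tf.items():
        df = sum(1 for tokens in corpus_token_sets if term in tokens)
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0
        score = freq * idf
        if term in highlighted_terms:
            score *= boost          # user interaction signals term relevance
        scores[term] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```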

We particularly aim at educational documents in a school project which helps with managing study texts. Since the procedure depends on the language we extract keywords from, we had to choose the one in which the work will be done. The method will be developed for the Slovak language.

Linking Data on Social Adaptive Web

Michal Holub
doctoral study, supervised by Mária Bieliková

Abstract. Although the idea of the Semantic Web is not new, most of the information published on the Web is still intended for humans rather than computers. Moreover, information about one entity can be found scattered across many sources and is often inconsistent. There is a need for data linking methods which would discover relationships between entities from different sources. Various types of relationships can be found. One type is hierarchical links (e.g. isPartOf, isA). The other type represents complex real-world relations (e.g. an earthquake causes a tsunami), which are harder to discover than hierarchical relations. Recently, approaches for connecting entities based on the way people interact with them on the Web have emerged. This is also the topic of our research.

We focus on lightweight semantics expressed in the form of triples (subject, predicate, object), which is also one of the principles of the Linked Data initiative. Our target domains are digital libraries and the knowledge of software developers in a company. Here, we create a domain model from entities in these domains using the Linked Data approach. In digital libraries this means creating a graph of authors, articles, publications and events gathered from various web portals and linking them together. In the software development domain we work with entities like source code, programmers’ notes and web-based documentation. We also use Linked Data to model programmers’ knowledge of various technologies.
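
A tiny sketch of such lightweight semantics; the entities and predicates are made up for illustration, and plain Python triples stand in for a particular RDF library:

```python
# Lightweight domain model as (subject, predicate, object) triples.
triples = [
    ("author:smith", "wrote", "paper:adaptive-web-2012"),
    ("paper:adaptive-web-2012", "presentedAt", "event:umap-2012"),
    ("author:smith", "worksWith", "author:novak"),
    ("programmer:anna", "knows", "technology:java"),
]


def objects_of(subject, predicate):
    """Follow links from an entity, e.g., all papers an author wrote."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# objects_of("author:smith", "wrote")  ->  ["paper:adaptive-web-2012"]
```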

Additionally, we track users’ activities while interacting with the entities concerned (doing research in a digital library, programming). On the Web we track mouse and keyboard actions as well as interest indicators (text selection, page bookmarking, etc.). From this interaction we derive new relationships between entities and enhance the domain model. This can be used e.g. for creating better adaptation and personalization services.

Augmenting the Web for Facilitating Learning

Róbert Horváth
master study, supervised by Marián Šimko

Abstract. Every day, users on the web go through a large amount of articles and documents while fulfilling various needs. This takes a large amount of time, and we think this time can be spent more effectively. Information technologies and text augmentation methods are able to provide the user with additional information during web browsing, which is helpful in learning processes such as learning new languages. The problem is that these texts are written in natural language, which is not understandable for computers, and therefore web augmentation is a complicated task. Finding methods which allow augmentation of selected parts of web documents is a research challenge in the field of technology enhanced learning.

In our work we aim to find a method for web augmentation during casual web browsing which helps with the learning process in the domain of foreign language learning. The method we propose substitutes words on a webpage with their foreign equivalents, so the user’s attention is needed to understand both the meaning of the article and the foreign words. The potential of this approach is supported by the agreement of experts that vocabulary acquisition occurs incidentally and that even minimal mental processing (of the presented vocabulary) can have memory effects. Our method represents user knowledge and its specifics (for example forgetting) in open information spaces. It is important to take into consideration the amount of knowledge the user already has and the goals he would like to reach. As a tool for web augmentation we are creating a web browser extension. To evaluate the proposed method, we plan to conduct an experiment with a selected group of users.
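
One possible sketch of the substitution decision, assuming a simple exponential forgetting curve; the formula, fields and thresholds are illustrative assumptions:

```python
import math
import time


def recall_probability(last_practice_ts, strength, now=None):
    """Exponentially decaying estimate of how well the user still knows a word."""
    now = now or time.time()
    days_elapsed = (now - last_practice_ts) / 86400.0
    return math.exp(-days_elapsed / max(strength, 0.1))


def words_to_substitute(page_words, vocabulary, max_new=5, recall_threshold=0.8):
    """Pick words whose foreign equivalents should replace the originals on the page."""
    candidates = []
    for word in page_words:
        entry = vocabulary.get(word)  # e.g. {"translation": ..., "last_practice": ..., "strength": ...}
        if entry and recall_probability(entry["last_practice"], entry["strength"]) < recall_threshold:
            candidates.append((word, entry["translation"]))
    return candidates[:max_new]
```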

Recommendation for Smart TV

Ondrej Kaššák
master study, supervised by Michal Kompan (team project)

Abstract. A few years ago, television stations determined the content of broadcasting themselves. It was mainly intended for the general public, because the stations wanted to achieve the highest ratings. Viewers could watch only what was currently being broadcast. Nowadays technology has advanced so that it is possible to focus on each viewer individually. For example, smart TVs offer their customers an archive of broadcast shows, movies and news. This way everyone watches what he wants and when he wants, regardless of the main broadcast. However, viewers often do not know that the archive may contain content of interest to them.

This opens up the possibility of deploying personalized recommendation. We are able to bring content to each viewer individually and offer him only relevant content that will interest him. There are several ways to choose the recommended content. The first is to recommend content similar to what he has watched in the past. The second is to recommend content which has been watched by viewers with similar interests. Currently it is common to use a combination of both approaches and take advantage of each. In the team project we will try to design a recommender system for smart TV.

Group Recommendations for Adaptive Social Web-based Applications

Michal Kompan
doctoral study, supervised by Mária Bieliková

Abstract. Personalized recommendation is a well-researched area nowadays, and while the increase of social activity on the Web is highly visible, group recommendation still needs to be explored. Several of our daily activities are not individual but group based. Aspects of group recommendation become more visible as a larger part of our lives is transformed and moved to social networks. Group recommendation can be used not only for recommending items to a group of users. Several types of groups can be observed – temporary or stable, forced or natural; moreover, in the context of the Web we can consider virtual groups as well.

This brings us to the assumption that principles of group recommendation can be used in standard, single-user recommendation tasks. Firstly, the aggregation strategies used for aggregating group preferences can be used in multi-criteria problems. Similarly, groups can be useful when a new user interacts with the system, which refers to the cold start problem. The task of group recommendation can thus be extended to standard single-user recommendation as well. The aggregation of single-user profiles into one group profile combines user preferences and, in some settings, can also introduce variety, which can be interesting from the point of view of improving recommendations.
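
For illustration, two classical aggregation strategies from the group recommendation literature, sketched over members’ predicted ratings (the data layout is a simplifying assumption):

```python
def average_strategy(group_ratings):
    """Aggregate a group profile by averaging the members' predicted ratings per item."""
    return {item: sum(r.values()) / len(r) for item, r in group_ratings.items()}


def least_misery_strategy(group_ratings):
    """Aggregate by taking the lowest rating, so no member is made miserable."""
    return {item: min(r.values()) for item, r in group_ratings.items()}


# group_ratings maps an item to each member's predicted rating:
# {"movie:skyfall": {"alice": 4.5, "bob": 2.0}, "movie:brave": {"alice": 4.0, "bob": 4.0}}
# Least misery would prefer "movie:brave"; averaging would rank the two items closer together.
```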

Web Content Prefetching

Martin Konôpka
master study, supervised by Karol Rástočný (team project)

Abstract. The importance of the Web in every area of life, especially in education, positively affects the development of countries around the world. Developing countries struggle to catch up with the rest of the world in the quality of Internet connectivity. These countries would especially benefit from the Web as an almost infinite source of information. We realize that we are not able to improve the current network infrastructure in developing countries. However, we can improve the utilization of the available slow and intermittent Internet connection with a software solution.

Our solution, called OwNet, enhances the Web surfing experience by applying caching and prefetching techniques. In addition, it also offers a set of collaborative tools to help new, inexperienced users explore the Web. We are currently deploying OwNet to schools in rural areas of Kenya.

The author has focused his work on the module for Web content prefetching. Using feedback from the users and usage data, we plan to revise the prefetching algorithms and evaluate their effectiveness in a real environment.

Analysing Temporal Dynamics in Search Intent

Tomáš Kramár
doctoral study, supervised by Mária Bieliková

Abstract. Understanding how searchers change their search goals over time is important for the purposes of personalized search. This knowledge is crucial for building a time-sensitive ranker, for search intent prediction and for a multitude of other search-related tasks. In our work, we analyse temporal search intent dynamics using the publicly available AOL search engine clickthrough log.

We hypothesize that each person has multiple personas. The term has its roots in psychology, where it denotes a social face the individual presents to the world; it reflects the role in life that the individual is playing. We believe that among the many personas an individual can have, two should stand out: a persona related to personal life and a persona related to work life. Separating these two personas and creating a separate user model for each of them has the potential to yield a user model that is focused, similarly to a lean, short-term model, and yet has enough data to allow confident adaptation.

User Modeling Using Social and Game Principles

Peter Krátky
master study, supervised by Jozef Tvarožek

Abstract. Personality has a significant impact on the way users use web applications. Many systems would benefit from adapting content or search according to personality traits. Examples of such systems are educational systems, as users with different personalities have different learning habits. Therefore a need arises for modeling users’ characteristics and traits. Explicit methods for retrieving information about the user might be obtrusive or might seem too personal. That is why we want to perform user modeling in a fun and appealing way – using computer games.

The goal of our project is to design an implicit user modeling method aimed at the user’s personality. The first part of our work deals with the issue of retrieving relevant data about the user to leverage. We will design a set of small games containing as many universal gamification elements as possible (points, time pressure, leaderboards, …). We will track the tendency of the user to be interested in specific elements as well as changes in behavior when such an element appears. In the second stage of our work we will design a method to infer the user model from the collected data. We will verify the user modeling method in an experiment at the faculty, within which we will retrieve user data both explicitly, using a questionnaire, and implicitly, using our method, and we will compare the inferred models. We believe that the modeling method could be reusable for other games containing universal gamification elements.

Activity-Based Programmer’s Knowledge Model for Personalized Search in Source Code

Eduard Kuric
doctoral study, supervised by Mária Bieliková

Abstract. The Web is a wide information space which provides an increasing range of resources in the form of text, graphics (pictures), animation, sound and video. Users use search engines to find the required information resources. In addition to resources in natural language, users (programmers) create a „spider web” of software artifacts (components) in which it is also necessary to search. Programmers often use the Web as a giant repository of source code which can be used to solve their software development tasks (problems).

To support search-driven development it is not sufficient to implement a “mere” full-text search over a base of source code (a web of software components); human factors have to be taken into account as well. To assist and help programmers in locating and understanding source code in the process of reuse, there are other challenges too, such as cognitive and social aspects.

There are crucial cognitive barriers which affect programmers in the process of concept location. In our work, we focus on these barriers with the goal of helping the programmer locate the task-relevant information that helps her realize the target development tasks. Our goal is to propose an activity-based model of a programmer’s knowledge and methods for retrieving it automatically, which will take into account factors such as the programmer’s work experience and interactions with source code, authorship duration (code stability), and the role of source code elements. The activity-based knowledge model can be used for (social) collaboration or for personalizing information. We concentrate on the identification of reputable source code. Reputation ranking can be a plausible way to rank source code results. We have proposed two methods supporting this goal, namely a method for identifying popular source code fragments and a method for automatic extraction of a programmer’s experience from the source code the programmer works with.

User Feedback in Personalized Recommendation

Martin Labaj
doctoral study, supervised by Mária Bieliková

Abstract. Practically every personalized system operates on some form of user modelling. Data collection (feedback gathered from users) is an important stage of this process. User input, both implicit and explicit, is used to infer the user’s knowledge, interests or traits, and based on this, information can be adapted for the user. Recommender systems suggesting items, pathways through items, sequences, etc. are one example of personalized systems.

In our research, we are exploring the use of implicit and explicit feedback in personalized systems, specifically in personalized recommendation. Although both types of user feedback and their combinations have already been extensively researched, open problems remain. In implicit feedback, we focus on user browsing behaviour, where parallel browsing (tabbing) is invisible to traditional web usage mining methods (as opposed to the often assumed linear browsing). We proposed a model of browsing behaviour and tabbing reconstruction based on client-side scripting and realized it in an adaptive learning system, discovering new relations between learning objects and learning how users work with resources while studying.

When collecting explicit feedback, it is important when and how to ask for it. We created an adaptive evaluation questions method, where we elicit both object ratings and personalization evaluation feedback from the user at the right moments without interrupting her work. The retrieved feedback is of higher quality and quantity than when users are asked at random moments or when we wait for the user to take the initiative and provide the rating herself. Another issue is the way in which users rate – a user should rate with personalized scales, and the collected feedback should be comparable between users with different rating styles (e.g., one user always rates low and only occasionally uses a high rating, while another always rates high) and even on different scales.
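
One straightforward way to make ratings comparable across users with different rating styles is per-user z-score normalization; the sketch below illustrates the general technique and is not necessarily the method we will adopt:

```python
from statistics import mean, pstdev


def normalize_user_ratings(ratings):
    """Rescale one user's ratings to zero mean and unit variance."""
    mu = mean(ratings.values())
    sigma = pstdev(ratings.values()) or 1.0   # guard against a constant rater
    return {item: (value - mu) / sigma for item, value in ratings.items()}


# A "low rater" (mostly 1-2 stars) and a "high rater" (mostly 4-5 stars) end up
# on the same scale, so a 4 from the first user counts as an enthusiastic rating.
```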

Personalised Recommendation of Learning Sources

Jozef Lačný
master study, supervised by Michal Kompan

Abstract. Nowadays a large amount of information is offered to users via various information systems and e-shops. Therefore the selection of this information is very important for the user. Much work has been done in the field of developing recommender systems that provide relevant information for the user – systems based on collaborative filtering, content analysis and many others.

An interesting application domain for personalized recommendation arises in the field of learning systems, where one of the challenges is to recommend appropriate learning sources to achieve the best study results and enhance learning efficiency.

In our work we aim to find a method for recommendation of learning resources in a personalized learning system for implicitly determined temporary groups. Our method will take into account not only the user’s expertise in the studied area but also the overall expertise of the group, in order to recommend appropriate learning resources. To accomplish that, we need to find a compromise between the single user’s satisfaction and the group’s satisfaction. We believe that supporting collaborative learning and communication within the group in the learning system will increase the user experience and help students achieve better results in a shorter time.

OwNet – The Offline Web

Marek Láni
master study, supervised by Karol Rástočný (Team Project)

Abstract. Last year I had the chance to take part in a competition called Imagine Cup. Our team worked on a project called OwNet, whose main idea was the offline Internet. Work on this project was also part of my bachelor thesis, where I focused mainly on functionality connected with user groups. As the offline Internet became the theme for the Team Project course this year, our team has grown bigger and we are going to develop OwNet further. At the same time, I am choosing a topic for my diploma thesis.

Researcher Modeling in Personalized Digital Library

Martin Lipták
master study, supervised by Mária Bieliková

Abstract. Researchers use digital libraries either to find solutions to particular problems concerning their current research or just to keep track of the newest trends in areas of their interest. However, the amount of information in digital libraries grows exponentially. This has two serious consequences. Firstly, many interesting works go unnoticed. Secondly, researchers spend too much time reading articles that turn out to be low-quality, unrelated to their current research or unrelated to their other interests. These kinds of problems are nowadays solved with recommendation systems or, more effectively, with personalized recommendation systems. The core of every personalized system is its user model.

Our aim is to design and implement a user model based on data from Annota. Our model will leverage the articles the user has read, the tags and folders she has used, the terms she has searched for, etc. Furthermore, user data from the Mendeley library organization service will be integrated. A personalized article recommendation service for Annota will probably be used for its evaluation. At this point we are also considering other options, such as personalized search results or a personalized article recommendation service for Mendeley. Based on the available user data and evaluation options, we will seek a suitable representation and creation process of the researcher (user) model in the domain of digital libraries.

Unified Search of Linked Data on the Web

Peter Macko
master study, supervised by Michal Holub

Abstract. Searching for information on the Web is increasingly difficult because of its enormous growth. To make matters worse, most of the data published on the Web is in an unstructured format. However, more and more structured data is being published, which is also evident from the emergence of unifying initiatives like Linked Data. Structured data enables us to make web applications allowing users to search for information more comfortably. But querying this type of data is not a trivial task.

Nowadays there are various structured data sources, but only a few search engines are able to search them while utilizing the full power of the provided semantics. The majority of search engines look for information using keywords, which may not always give users the results they desire. To utilize the full power of structured data, a special query language such as SPARQL has to be used. However, queries in this language are not easy to construct for the majority of ordinary users.

We would like to change this by creating a complex search engine which can understand pseudo-natural language queries. These queries will be transformed into the SPARQL language and executed on an ontological database. Our method handles queries from the user: we analyze the query using a dictionary and Stanford CoreNLP and try to match individual words against entities and their relations in the database. After this step, we have enough information to construct a SPARQL request against the ontological database.
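
A heavily simplified sketch of this transformation step, using only dictionary matching and omitting the Stanford CoreNLP analysis; the dictionary entries and ontology terms are hypothetical:

```python
# Hypothetical mapping from query words to ontology entities and properties.
ENTITY_DICTIONARY = {
    "articles": "?article rdf:type lib:Article",
    "authors": "?author rdf:type lib:Author",
}
RELATION_DICTIONARY = {
    "written": "?article lib:writtenBy ?author",
}


def to_sparql(query):
    """Match query words against known entities/relations and build a SPARQL query."""
    patterns = []
    for word in query.lower().split():
        if word in ENTITY_DICTIONARY:
            patterns.append(ENTITY_DICTIONARY[word])
        elif word in RELATION_DICTIONARY:
            patterns.append(RELATION_DICTIONARY[word])
    return "SELECT * WHERE { " + " . ".join(patterns) + " }"


# to_sparql("articles written by authors")
# -> SELECT * WHERE { ?article rdf:type lib:Article . ?article lib:writtenBy ?author .
#                     ?author rdf:type lib:Author }
```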

Secondly, our interface gives the user the ability to modify the constructed query. Every modification is logged and we learn from it for subsequent searches. We plan to evaluate our method in the domain of scientific articles, authors and other parts of the ACM, Springer and other digital libraries.

Querying Large Web Repositories

Matej Marcoňák
bachelor study, supervised by Karol Rástočný

Abstract. Nowadays the Internet is a huge source of knowledge, information and data. It is necessary to store and process these data efficiently for their further use. The amount of data exceeds the capabilities of a single machine or server, so it is necessary to look for other options and approaches to data processing. One solution to this problem is based on parallel data processing on multiple machines.

The amount of these data is also related to the expanding trend of the Semantic Web, which describes the content of websites and allows better collaboration and interaction between people and computers. Semantic Web data are often represented as RDF triples of subjects, predicates and objects (e.g., John, is friend of, Mathew) and organized in ontologies. Ontologies and RDF data are commonly queried with SPARQL and its extended forms, which is also referred to as the query language of the Semantic Web.

Unfortunately, the quantity of these data does not allow us to store all of them in RDF repositories efficiently. Therefore we try to store domain-specific data in NoSQL databases. However, NoSQL databases do not support querying with SPARQL, so we decided to propose a MapReduce algorithm for the evaluation of SPARQL and its advanced features. We plan to evaluate our approach on MongoDB with a lightweight ontology of information tags stored in it.
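
As a rough sketch of the idea, assuming triples are stored one per MongoDB document with fields s, p and o (an illustrative layout, not our final representation), a single SPARQL triple pattern can be translated into a MongoDB filter:

```python
from pymongo import MongoClient


def pattern_to_filter(pattern):
    """Turn a triple pattern like ('?doc', 'hasTag', '?tag') into a MongoDB filter;
    variables (starting with '?') are left unconstrained."""
    return {field: value
            for field, value in zip(("s", "p", "o"), pattern)
            if not value.startswith("?")}


def match_pattern(collection, pattern):
    """Evaluate one pattern and return variable bindings for every matching triple."""
    bindings = []
    for doc in collection.find(pattern_to_filter(pattern)):
        row = {var: doc[field]
               for field, var in zip(("s", "p", "o"), pattern)
               if var.startswith("?")}
        bindings.append(row)
    return bindings


# collection = MongoClient()["tags"]["triples"]
# match_pattern(collection, ("?doc", "hasTag", "?tag"))
```

Joining several such patterns on their shared variables, and distributing that join as map and reduce phases, is where the proposed MapReduce algorithm comes in.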

Semantic Wiki

Martin Markech
bachelor study, supervised by Jakub Šimko

Abstract. We use the Internet daily. We read text which is understandable for us humans but, on the other hand, is not understandable for machines.

With the proliferation of Web 3.0, or the Semantic Web, we are able to describe web resources with more semantics. Thanks to the Linked Data initiative we can also join these data to each other across the Web. With linked data it is easier to find related data and explore the web of data.

In our work we focus on creating an easy-to-use method for adding semantics into the wiki application of our faculty, without the end user needing to write any RDF markup. We analyze and take into account the specific needs of PeWe group members, like organizing events or linking bibliographies. We use the Sesame RDF store to store RDF triples and try to create a web service for automating bibliography linking. A non-trivial task is mapping semantics to text in the process of writing articles in a markdown editor; many well-known solutions are developed only for HTML editors. Another useful piece of functionality is annotating content: we create a place there for users to express their ideas or improvements. Because of these many new features, we use a toolbar to improve the user experience and work with the application. In the end we will obtain linked data, with the ability to create semantic search.

Imagine Cup 2014

Martin Tamajka, Matej Minárik
bachelor study

Abstract. Our role is to observe and learn as much as we can during PeWe seminars. We are planning to take part in Imagine Cup 2014. Imagine Cup is a prestigious student technology competition whose goal is to solve tough challenges with the most modern Microsoft software. We have not decided on a project topic yet; we are considering the Innovation or World Citizenship category.

Context-Aware Physical Activity Recommendation

Štefan Mitrík
master study, supervised by Mária Bieliková

Abstract. We live in an age full of information, and automatic filtering or recommendation of information can help us get the most valuable pieces. We need to involve user preferences in this process. Most current solutions rely on long-term user observation, based on which the recommendation of appropriate information is done.

Smartphones and intelligent mobile devices are a great way to determine the current situation, or context, of the user and his or her needs. The context of a smartphone includes not only location but also other interesting information such as network connectivity, lighting conditions and more. Thanks to Fitly, an all-day activity tracking Android app, we have information about the user’s transfers during the day. With this, we can calculate probabilities of future transfers and recommend interesting information that can encourage the user to do more physical activity.
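
A minimal sketch of estimating such transfer probabilities as a first-order Markov model over the observed places (the place names and data format are made up for illustration):

```python
from collections import Counter, defaultdict


def transition_probabilities(visits):
    """Estimate P(next place | current place) from an ordered list of visited places."""
    counts = defaultdict(Counter)
    for current, nxt in zip(visits, visits[1:]):
        counts[current][nxt] += 1
    return {place: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
            for place, followers in counts.items()}


# visits = ["home", "work", "gym", "home", "work", "home"]
# transition_probabilities(visits)["work"]  ->  {"gym": 0.5, "home": 0.5}
```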

Users can set their daily goal, so we can adapt our recommendations towards it. With information about their physical activity patterns and data from their calendars, we can adjust our recommendations so they are in harmony with their plans and habits. An example of such a recommendation can be a short-term physical activity or an incentive message.

Tag Cloud Navigation

Samuel Molnár
bachelor study, supervised by Mária Bieliková

Abstract. Nowadays we use many tools to organize notable web content such as websites, articles or images we stumble upon while browsing the Web. In general, these tools help us classify the content into categories, label it with relevant tags and share it with others. The effort spent on creating such a classification is rewarded in the future, when the user can quickly find desired content by choosing a category or tag. However, with a growing number of categories and tags, finding relevant content that suits the user’s needs might be time-consuming and, after all, cumbersome.

In our work we propose an enhancement of tag cloud navigation. It employs the user’s context and relations between tags associated with web content. By exploiting the context we are able to take the user’s annotations, highlighted text and other significant factors of the user’s activity into account, and use this knowledge to reorder relevant content by further categorization, such as the user’s interests and relevancy for a particular tag. By using annotations and highlighted text we are able to retrieve the specific fraction of content that the user considers important and extract even more relevant information for the particular user. Discovered keyword relations might be used to gain better knowledge of the specific domain and a more comprehensive overview of tag relations.

Our domain for navigation is digital libraries. We implement our proposal as a module of Annota, a system for web page annotation which is being developed by several PeWe group members.

Enabling Information for Social Adaptive Web

Róbert Móro
doctoral study, supervised by Mária Bieliková

Abstract. Nowadays, keyword search is a prevalent search paradigm on the Web. We use a set of keywords as a query describing our information need and get a simple list of results in return, where each result is usually represented by its title, URL and a short snippet. This approach works reasonably well for simple information retrieval tasks such as fact finding.

However, when we face a complex information seeking task, such as researching a new domain, we can feel overwhelmed by the amount of information available and even become “lost in hyperspace” when trying to navigate the Web. These complex tasks require us to employ different exploratory search tactics and techniques and to aggregate information from various heterogeneous sources. Moreover, they can span multiple sessions.

We research innovative ways of search results presentation, visualization and aggregation to help users get better acquainted with a new domain and to support its exploration. We are also interested in researching the social aspect of search, which is becoming more and more prominent on the Web. Users form virtual communities; social links also emerge when tagging useful content. It is important to properly visualize these emergent links to help users navigate the search space efficiently in order to fulfill their information needs.

Metadata Collection for Effective Organization of Personal Multimedia Repositories using Games With a Purpose

Balázs Nagy
master study, supervised by Jakub Šimko

Abstract. Nowadays an average person is overloaded with an enormous amount of digital data. Besides multimedia (music, videos, images) we can also mention emails, web pages and information on social networks, blended together in a hypertext environment. To implement effective search and navigation in this space it is necessary to have enough descriptive metadata available for these resources. These can be collected automatically or manually through crowdsourcing methods and, in particular, by games with a purpose.

In our research, we focus primarily on image metadata acquisition. One of our goals is to upgrade and extend an existing game with a purpose (GWAP) called PexAce, which collects useful annotations for photos and transforms them into tags. Due to the lack of metadata for personal photo albums, we want to focus on obtaining descriptive metadata for this kind of media. Using them, we will be able to query, order and filter these enriched photo albums much better. Our main goal is to propose an effective method for processing the obtained data using various algorithms.

Our previous experiments with PexAce in the general domain indicate that this method of obtaining metadata is effective. According to our expectations, we should also get positive results after using our method in a specific area such as personal photo albums. In fact, users may be more motivated because they are annotating their own photos. Another effect of this should be reflected in the quality of the obtained tags.

Recommendation Based on Difficulty Ratings

Matej Noga
bachelor study, supervised by Martin Labaj

Abstract. In adaptive web systems users often provide explicit feedback on displayed information in the form of ratings. Based on this, we can infer information about the rated objects (whether an exercise was appropriate or not, too easy or too difficult, …) and use it for recommendation. For example, in the case of an adaptive learning system, users can rate the difficulty and usefulness of learning objects. The user (student) in the learning system has to face a large amount of text, examples, exercises and questions, and she is often confused about where to start and how to proceed, especially at the beginning of the course.

As a result, a recommender system is important – it will help the students find exercises currently suitable for them (not too easy nor too difficult at the given time) and a sequence of explanations they should study. With known features of learning objects, we can create a recommendation for the user using her current knowledge expressed in domain terms, combined with her ratings and the ratings of earlier users who had similar knowledge.
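
For illustration only, a simple utility-style score combining the student’s knowledge with peers’ difficulty ratings might look as follows; the fields, weights and scales are our assumptions, not the method itself:

```python
def exercise_utility(student_knowledge, exercise, peer_difficulty_ratings,
                     target_challenge=0.2, w_fit=0.7, w_peers=0.3):
    """Score an exercise: prefer items slightly above the student's knowledge that
    similarly knowledgeable peers did not rate as too hard."""
    # How well the exercise difficulty matches the desired small challenge.
    gap = exercise["difficulty"] - student_knowledge[exercise["topic"]]
    fit = 1.0 - min(abs(gap - target_challenge), 1.0)

    # Average peer difficulty rating (1 = too easy ... 5 = too hard), rescaled to [0, 1].
    if peer_difficulty_ratings:
        peer_ok = 1.0 - (sum(peer_difficulty_ratings) / len(peer_difficulty_ratings) - 1) / 4
    else:
        peer_ok = 0.5   # cold start: neutral prior when nobody has rated the exercise yet

    return w_fit * fit + w_peers * peer_ok


# student_knowledge = {"recursion": 0.4}
# exercise = {"topic": "recursion", "difficulty": 0.6}
# exercise_utility(student_knowledge, exercise, peer_difficulty_ratings=[2, 3])
```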

Our goal is to design a method for recommendation of learning objects that suit a particular user’s needs. We will base it mainly on content-based and utility-based approaches. In the method, common problems such as the “new user” problem need to be addressed. Finally, we will implement and evaluate the method in the ALEF system.

Linking Data on the Web

Ondrej Proksa
master study, supervised by Michal Holub

Abstract. Last year, I researched public data. A necessary precondition of transparency is the publication of data that allows greater control by the public. Automated post-processing, structuring and matching of data from various sources has become a huge problem: public data are available on the Internet in very confusing and unstructured formats.

The main aim of my work was to download, process and clean the data, which are then paired with resources from other public sectors. The result should be a tool for downloading the data, methods for analyzing the similarity of words, and a system for creating and mapping connections between the data.

During my master’s studies I will continue to work on the topic of connecting and integrating data. My focus will be on connecting data with the help of Linked Data. In the first part I will create new datasets – in the area of public data and other domains. In the experimental part, I will search for suitable methods for data integration and compare them.

Metadata Maintenance for Large Information Spaces

Karol Rástočný
doctoral study, supervised by Mária Bieliková

Abstract. Current content processing and presentation systems create a lot of different metadata that contain valuable information, for example logs of users’ behavior or derived concepts. These metadata are closely related to their resources – the data in repositories of information spaces. But these data are not static, and all their modifications affect the validity of the metadata, so the metadata have to be maintained. Because several types of metadata exist and each type probably needs a specialized maintenance approach, we focus on information tags (descriptive metadata with semantic relations to the tagged content), and we are working on a proposal for an automatic information tag maintenance approach and an information tag representation suitable for effective maintenance.

Due to the structural similarity of information tags to annotations, we based the information tag model on the widely accepted Open Annotation model. The Open Annotation model allows complex structures and is proposed for RDF repositories, so we have lightened it and redesigned it into an object model which can be stored in a fast and scalable MongoDB repository.
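
A hypothetical shape of such a lightened information tag, stored as a single MongoDB document (the field names are illustrative and do not reproduce the actual model):

```python
from pymongo import MongoClient

# One information tag as a self-contained document: a body (the derived information)
# plus a target with an anchor into the tagged source code or text.
information_tag = {
    "type": "CodeComplexityTag",
    "body": {"metric": "cyclomatic_complexity", "value": 7},
    "target": {
        "source": "repo://project/src/Parser.java",
        "anchor": {"startLine": 120, "endLine": 158, "contextHash": "a3f9"},
    },
    "provenance": {"createdBy": "metrics-bot", "createdAt": "2012-11-05T10:21:00Z"},
    "valid": True,
}

# MongoClient()["tags"]["information_tags"].insert_one(information_tag)
```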

The problem of metadata maintenance has no sufficient solution yet. But it can be divided into two partly independent sub-problems. The first is maintenance of anchoring, which can be solved by an accurate, robust position descriptor. The second is maintenance of the bodies of metadata, which is not solved by current metadata maintenance approaches. We are working on a proposal for information tag maintenance which employs:

  • Machine learning techniques – most information tags are created by systems that use deterministic algorithms, so there is a good chance of learning dependencies between modifications in files and the necessary updates of information tags;
  • Crowd computing – we can employ users’ (explicit and implicit) feedback to make sure that an information tag is anchored to the right position and that the information tag is valid.

Web Browser as a Platform for User Modelling

Márius Šajgalík
doctoral study, supervised by Mária Bieliková

Abstract. The modern web browser as we know and use it often represents our main everyday work tool. Inside the web browser we work with various web applications, play online games, read news from all around the world and search the largest library in the world. Thus, the browser has a great strategic position, linking all these various activities of ours in a single place. This means lots of data from all these sources, which is potentially accessible for extensive data mining and user modelling.

We have taken advantage of the new HTML5 standards and of the common browser extension features implemented in all major web browsers to build a user modelling platform inside the web browser. Currently, it exposes a simple model of user interests formed by simple unigrams. These are extracted as weighted keywords from webpages visited by the user and aggregated across the whole browsing history. Interests are indexed in a specially modified radix tree overlaid with weighting marks to enable fast queries.
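
The aggregation step can be sketched as follows; a plain dictionary stands in for the modified radix tree, and the decay of older interests is our illustrative assumption:

```python
from collections import defaultdict


class InterestModel:
    """Aggregate weighted keywords from visited pages into a model of user interests."""

    def __init__(self, decay=0.99):
        self.weights = defaultdict(float)   # the real platform indexes these in a radix tree
        self.decay = decay

    def add_page(self, page_keywords):
        """page_keywords: {keyword: weight} extracted from one visited page."""
        for keyword in self.weights:        # older interests slowly fade
            self.weights[keyword] *= self.decay
        for keyword, weight in page_keywords.items():
            self.weights[keyword] += weight

    def top_interests(self, k=10):
        return sorted(self.weights, key=self.weights.get, reverse=True)[:k]
```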

We envision broadening this simple yet effective model to cover more complex user characteristics. We consider the exploitation of Bayesian networks to reflect the inevitable vagueness of user characteristics. We also analyse other techniques, like fuzzy logic and neuro-fuzzy approaches, to enhance our notion of the possibilities in user modelling.

Jakub Ševcech
master study, supervised by Mária Bieliková

Abstract. We often use various services for creating bookmarks, tags, highlights and other types of annotations while surfing the Internet or reading documents. These annotations can be considered additional information attached to the original document, extending or describing it. We usually use these annotations as marginalia to store our thoughts in the margin of the document or to organize personal collections of documents using methods such as tag clouds.

In our work we support navigation between documents by using annotations in search. We use annotations in two ways: we search using annotations and we search in annotations. The first approach uses annotations that describe a document to create a query for obtaining other documents related to the currently studied one. We consider these annotations indicators of user interest in specific parts of the document, and we search for documents related to the concepts of the studied document to which the user attached annotations.

The second use of annotations lies in searching in annotations. We use annotations as additional content attached to the original document, in a similar way as anchor texts of web links are used when indexing web content. Similarly to anchor texts, annotations summarize the document they are attached to, and they describe it not in the terms of the document’s author but in the terms of its readers.

Expertise of Players in Semantics Acquisition Games

Jakub Šimko
doctoral study, supervised by Mária Bieliková

Abstract. Games with a purpose (GWAPs) have been a part of the crowdsourcing domain for some time. These specially designed computer games transform human intelligence tasks (i.e., tasks hard to solve by a machine but easy to solve by a human) into appealing fun. They „trick“ their players into disclosing a part of their knowledge or solving a particular problem, while giving them incentives in the form of fun. Most of the tasks solved by today’s GWAPs are from the field of semantics creation and maintenance: resource tagging and linking, knowledge acquisition.

Our work has two main courses. The first is the creation of semantics acquisition GWAPs themselves. We have devised games for the acquisition of lightweight metadata relationships and image tags, and a game for filtering invalid music metadata. The second course is a general investigation of the principles governing GWAPs (including ours), particularly their creation.

Currently, we aim to explore the possibilities of improving GWAP problem-solving capabilities by exploiting the “player model”, i.e., information about a player’s expertise, skills or knowledge in various fields. The basic idea is to assign players to the tasks they are best fit for. We investigate: (1) whether such assignment increases the quality (correctness, specificity) and quantity of the metadata produced, (2) whether “player models” can be inferred directly from gameplay or external help is needed, and (3) analogically, how to deal with task metadata as the matching counterparts of the player “models”.

Models of Web Systems for Personalized Support of Collaboration

Ivan Srba
doctoral study, supervised by Mária Bieliková

Abstract. Nowadays, collaboration between users is present in many web applications. This trend means we have to face many new challenges. One of the most important is that users differ greatly in their goals and in the activities they use to achieve them. Thus their collaboration is not very successful in many cases, especially in the educational domain, where successful and effective collaboration is achieved only rarely. We analyze challenges in the educational domain in three related areas: scripting, collaborative environments and the collaboration management model.

We are going to focus on researching how personalization can positively influence and support collaboration in these three areas. Firstly, in scripting: how we can personalize the design and planning of a collaborative course, tasks and group formation. In collaborative environments: how we can adapt interfaces and visualizations according to particular users’ needs. And last but not least, in the collaboration management model: how to analyze interactions and use the results of these analyses to advise users on how to collaborate more effectively.

In addition, we are working on a study of collaborative learning based on data we collected during a long-term experiment aimed at evaluating a proposed method for creating different types of groups. The results of this study will provide us with preliminaries for further research in the educational domain.

Method for Social Programming and Code Review

Michal Tomlein
master study, supervised by Jozef Tvarožek

Abstract. Code review is an important part of quality software development. In programming courses, peer review has the potential to be an effective driving force behind the learning process.

While collaboration solutions are widely available, their use is, by their nature, generally limited to larger projects and/or requires a certain discipline to be effective. Moreover, these solutions rarely take into account the strengths and weaknesses of individuals; instead, they rely on manual assignment or selection of the reviewer. Automating the selection of the right reviewer is a non-trivial problem unaddressed by present collaboration and code review solutions.

In our work, we aim to make programming more effective through social interaction and peer reviews. We believe it is very important to reduce friction in the process. We believe that by making it possible for students and software developers to collaborate more tightly and easily, we can speed up the development process and achieve higher quality overall.

Intelligent Local Proxy Server With Distributed Cache

Matúš Tomlein
master study, supervised by Karol Rástočný (Team Project)

Abstract. Slow and intermittent Internet connection is still a serious issue in developing countries, and it is not fully resolved even in developed countries. To deal with this problem, we implemented an intelligent local proxy server application that enables access to websites stored on the user’s computer or on the local network even without an Internet connection. Stored websites include websites the user has visited and also websites that were intelligently prefetched. We build upon this concept to enable collaboration between users on local networks and also on the Internet.

We primarily focus on the design and implementation of proxy interfaces and on making use of a distributed cache of Web objects on the local network. We explored this area in our bachelor theses and, as a team of four students, competed in the worldwide finals of the Imagine Cup 2012 competition with the OwNet project.

Recommendation for Smart TV

Ján Trebuľa
master study, supervised by Michal Kompan (team project)

Abstract. Watching TV today offers many more possibilities than in the past. This is caused by greater freedom for the audience and a lot of multimedia content that needs to be personalized. As an appropriate solution we propose an interconnection between TV and the Web.

In our project we analyze the available records about users, the monitoring of multimedia content and the description of this content. Next we focus on creating a web service whose role will be content filtering (aimed especially at anonymous users) and recommendation (aimed at registered users). In the recommendation we want to focus on user groups (e.g., individuals, different groups). Further, we plan to use this service in a web and/or mobile application. These applications will be used to collect users’ preferences and also to present recommendations. This opens possibilities for making work with these applications interesting and innovative (e.g., a game).

Information Retrieval Using Short-term Context

Matúš Vacula
master study, supervised by Dušan Zeleník

Abstract. The content of the Web is continuously expanding. With the growing amount of information accessible to the end user, it becomes harder to obtain search results that are relevant to the user’s interest and intention. All modern search engines have to accommodate this. Search engine developers are trying to make searching more efficient by involving context. Although the resources are available and some methods of using context are known, search engines still do not use these possibilities to their full potential.

Information about user’s current activity could lead to achieving more accurate search results or at least less ambiguous meaning of his query. These information are relatively easy to obtain from web browser. It is common phenomenon that web users are multitasking while they are surfing the Web. That means browsing multiple websites at the time and switching between them. Even from the single view on the list of open websites we can assess what kind of activity is user performing with considerable probability and precision.

It is possible to extract information about the user’s activity by identifying keywords or the topics of the viewed websites. We are able to use these keywords to enrich the user’s search query, making the search more specific and achieving more precise results. This approach neither requires tracking the long-term history of user activity nor encloses the user in an imaginary “bubble” of preferred domains, which would prevent him from obtaining information from another domain or area.
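
A minimal sketch of such query enrichment, assuming keywords for the currently open tabs are already extracted (the extraction itself is out of scope here):

```python
from collections import Counter


def enrich_query(query, open_tab_keywords, max_extra_terms=2):
    """Append the most frequent keywords from the user's open tabs to the search query."""
    context = Counter()
    for keywords in open_tab_keywords:      # one keyword list per open tab
        context.update(k.lower() for k in keywords)
    already_used = set(query.lower().split())
    extra = [term for term, _ in context.most_common()
             if term not in already_used][:max_extra_terms]
    return query + " " + " ".join(extra) if extra else query


# enrich_query("jaguar", [["car", "engine"], ["car", "dealer"]])  ->  "jaguar car engine"
```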

Search in Source Code Including Temporal Dynamics

Juraj Vincúr
bachelor study, supervised by Eduard Kuric

Abstract. Studies on reuse have shown that only 15% of the code of most systems is unique to the specific application, which in theory means that the other 85% of the product might be reused. This ideal figure is rare, and only 40 to 60% is attainable in practice.

To fully exploit the potential of existing material in future projects, we must be able to find relevant data quickly across hundreds of thousands of source code files. Nowadays, several search engines are available for this purpose. But are they sufficient? If so, why do only a few organizations employ reuse? The main problem lies in poor average relevance, which significantly increases the effort a programmer needs to investigate the results.

In our work we will try to address and overcome this difficulty by enhancing the results based on a semantic, “behind the words” comparison combined with temporal dynamics. To increase relevance in this approach, the search algorithm will be adapted to consider past actions of users gathered via implicit feedback. To boost the search even more, we will clarify the results with a graph visualization based on the abstract syntax tree.

Personalized TV Content Recommendation

Juraj Višňovský
master study, supervised by Michal Kompan (team project)

Abstract. As people usually do not have time to watch their favourite TV shows and movies at broadcast time, digital television allows its users to watch the shows at any time and in any place. The multimedia content is streamed over the Internet on the user’s demand. A digital television’s library can offer its users a much wider variety of shows and movies.

There are many conditions influencing a TV user’s preferences. The user may prefer to watch a show of a certain genre, may want to see his favourite actor or actress, or might prefer a short show if he is in a hurry. In the task of discovering a user’s TV viewing patterns, we may also find very useful information in the user’s viewing history, his demographic information and TV show metadata. Beyond content-based factors, it might be useful to track the user’s contexts, as his viewing habits may vary depending on his current mood, occupation or interests.

The aim of our work is to track and analyze users’ TV viewing patterns. We will acquire all relevant information about the TV user’s environment and use it to build a strong and reliable user model for recommendations.

Personalized Web

Ľubomír Vnenk
bachelor study, supervised by Mária Bieliková

Abstract. I am a greenhorn here; I am just getting acquainted with the PeWe group and with personalised systems in general. I am going to choose my bachelor thesis topic in a few months, and I hope that others’ ideas will inspire me.

I also think that this group is mostly attended by the best students of our IT programme, so everyone’s work will be precise and polished and I can learn something. I can also become familiar with topics I would have had no chance to know about without this group.

Personalized Search in Source Code

Pavol Zbell
bachelor study, supervised by Mária Bieliková and Eduard Kuric

Abstract. Programmers working on or maintaining unfamiliar and complex software systems often face difficulties understanding how exactly features or whole concepts are implemented. Even a simple maintenance task can be hard to achieve, because the programmer needs to find all program elements relevant to the feature and understand how they work together before being able to make any changes to the source code. An approach to ease this process lies in feature location and impact analysis. Feature location identifies an initial location in the source code that corresponds to a specific feature or concept, and impact analysis finds all program elements related to the initially identified one and potentially affected by the upcoming change.

In our research we focus on enhancing search in source code by personalizing it. We apply existing feature location techniques to produce relevant results, which we further process by tracking how the programmer works with the IDE and the web browser. By collecting and using various implicit data, such as what the programmer copies from the web or from other source code and how much he modifies the copied parts, we can formulate queries like “What parts of the system were recently added just by plain copying from a specific web site?” – such parts may need to be reviewed by other programmers, for example. In a large and complex system we can constrain such a search query to the context of a specific feature only, thus getting filtered results more relevant to the programmer’s needs.

Reducing the Sparsity in Contextual Information

Dušan Zeleník
doctoral study, supervised by Mária Bieliková

Abstract. Our work focuses on improving the accuracy of context-aware recommender systems. Our task is not to design a new approach to recommendation using context; we focus on the problems which cause low accuracy in this type of recommendation. The problem we identified is the lack of contextual information available for recommendation purposes. Contextual information has shown itself to be a promising factor in recommender systems, but it has to face incomplete information about the user, the user’s state and the environment. That is the main drawback of pure context-based recommender systems. Nowadays this approach cannot outperform other approaches, mainly due to the high sparsity of contextual information.

We propose to improve the accuracy of context-based recommender systems by context inference. Context inference is based on an effect discovered by analyzing context as a factor influencing user needs. Analysis of news readers reveals the existence of a behavioural correlation, which is the main pillar of the proposed context inference. Our method for context inference is based on collaborative filtering and clustering of web usage (as a non-discretizing alternative to association rule mining).

This approach promises a contribution to context-aware recommender systems. We work with a dataset containing a click stream from a news portal, proxy logs enriched with context, developers’ logs and, most attractively, a movie rating dataset which contains even explicitly gathered contextual information about mood and emotions. To prove our concept we work with these datasets in offline testing.