Students’ Research Works – Spring 2012

Proceedings of Spring 2012 PeWe Workshop

Search, Navigation and Recommendation

User Modeling, Virtual Communities and Social Networks

Domain Modeling, Semantics Discovery and Annotations



Doctoral Staff

Mária Bieliková
Professor
o collaborative/personalized/context search and navigation
o recommendation for adaptive social web
o reasoning in information space with semantics
o user/user groups and contexts modelling

Michal Barla
Postdoc
o user modeling
o implicit user feedback
o virtual communities
o collaborative surfing

Jozef Tvarožek
Postdoc
o social intelligent learning
o collaborative learning
o semantic text analysis
o natural language processing

Marián Šimko
“Fresh” Postdoc
o domain modelling
o semantic text analysis
o Web-based Learning 2.0
o ontologies, folksonomies


Information Recommendation Using Context in a Specific Domain


Anton Benčič
master study, supervised by Mária Bieliková

Abstract. Adaptive and personalization methods all engage in a recommendation process. This process consists of a few decisions that have to be made in order to deliver something to the user. More often than not, adaptive and personalization methods engage in only one decision: what to deliver to the user. This is especially true for methods that work on an on-demand basis and thus deliver content when a query for it is made. These methods generally do not consider whether it is the right time to deliver the content, in what volume it should be delivered and how it will be presented, which are the three other decisions in any recommendation process.

Apart from methods working on an on-demand basis, there are also a few methods that work proactively. A proactive method decides on an action by considering various types and amounts of criteria. This can be as simple as setting the sound profile according to user-defined time windows, or as complicated as recommending the right music for a given user and the friends around him, considering their mood, for example. Our project is aimed at designing a method that is able to effectively and efficiently learn what actions should be performed in what situations and then use this model to recommend actions given a particular situation.

To accomplish this, we devised a rule-based method that uses different classes of contextual information as antecedents for the individual rules and then aggregates those rules and their certainties to compute the final recommendation score. We are now in the experimentation and refinement phase, where we first run a couple of simulations to ensure that our method's mathematical model works in its basics as it is supposed to, which allows us to proceed further to a live experiment.
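The abstract describes rules whose antecedents are classes of contextual information and whose certainties are aggregated into a final recommendation score, but it gives no formula. The following is a minimal sketch of that idea; the rule representation and the noisy-OR style aggregation are assumptions made for illustration, not the author's actual model.

```python
# Minimal sketch of certainty aggregation over context-triggered rules.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Rule:
    action: str                          # action the rule recommends
    antecedent: Callable[[Dict], bool]   # predicate over the current context
    certainty: float                     # rule confidence in [0, 1]

def recommend(rules: List[Rule], context: Dict) -> Dict[str, float]:
    """Aggregate certainties of all fired rules per action (noisy-OR style)."""
    scores: Dict[str, float] = {}
    for rule in rules:
        if rule.antecedent(context):
            prev = scores.get(rule.action, 0.0)
            scores[rule.action] = 1 - (1 - prev) * (1 - rule.certainty)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

rules = [
    Rule("silent_profile", lambda c: c["location"] == "meeting_room", 0.9),
    Rule("silent_profile", lambda c: c["calendar"] == "busy", 0.7),
    Rule("play_music", lambda c: c["activity"] == "commuting", 0.6),
]
print(recommend(rules, {"location": "meeting_room", "calendar": "busy", "activity": "none"}))
# -> roughly {'silent_profile': 0.97}; the play_music rule did not fire
```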

In Proc. of Spring 2012 PeWe Workshop, pp. 03-04

Integration and Adaptation of Motivational Factors into Software Systems


Pavol Bielik
master study, supervised by Michal Barla

Abstract. Gamification, defined as "applying the mechanics of gaming to non-game activities to change people's behavior", is often used in a variety of areas including business services, behavior promotion, content portals or even project management web applications. After widespread adoption in the second half of 2010, the term itself is unfortunately now often misused for marketing purposes. Nevertheless, the idea of using game mechanics and dynamics to drive participation and engagement, mostly by using extrinsic motivation, is certainly worth further examination. The entire field, however, needs to be examined to determine which elements work in which situations, as we do not currently know how exactly they affect our motivation, both positively and negatively, and which combination of game mechanics is suitable in a given situation.

In our work we seek to integrate and adapt these game mechanics and dynamics in the domain of health promotion, specifically to motivate people to engage in appropriate physical exercise. The solution will be implemented in the move2play system, which already provides the activity tracking, evaluation and recommendation of appropriate physical exercise required for our purposes on the Android platform.

The main areas to explore in this domain are adapting motivational factors to the individual user, obtaining relevant content sources and mining user interests. Firstly, we need to adapt motivational factors, as each user differs in what motivates him more. For example, some people like to socialize while others prefer competition. Secondly, we want to obtain relevant content sources that will be used by game mechanics and dynamics. Content sources are important because core mechanics do not change very often, and we therefore need to change the content of these mechanics to keep the user interested. Most current integrations of game mechanics use content sources created by domain experts, but we see here a possibility of obtaining a subset of such content automatically. Finally, once we obtain relevant content, it is desirable to choose for a given user those topics in which he is already interested, based, for example, on what kind of sports, movies, books or TV series he likes. In order to obtain this kind of information, we need to track user activity, both on and off the computer, and extract his interests.

In Proc. of Spring 2012 PeWe Workshop, pp. 31-32

Personalized Reading Resources Organization


Roman Burger
master study, supervised by Mária Bieliková

Abstract. With the vast amount of information and resources accessible through the web, one may start to carefully filter what to read and what to ignore. However hard we try, chances are that our workspace will eventually get overwhelmed with a mass of resources. The most common cause is that the user's resource processing speed is lower than the resource retrieval speed.

Keeping track of resource organization and structure then becomes a tedious task, making it even harder to focus on individual resources. Users need to manually manage their workspaces, which produces overhead while processing resources. It is not uncommon for them to abandon the management altogether out of convenience. It is precisely for the user's convenience that we seek means of automatic organization management of resources.

In this project, we propose a method for automatic organization of emerging resources. For the user's comfort, the method takes into account the user's feedback on how to organize resources, making the method personalized. The method works mostly with resource metadata and their context. On top of resource organization, recommendation of reading order could be employed. With automatic personalized organization and processing order recommendation, the user would need little to no effort to maintain a clean workspace.

Validation of the method will be done through a user experiment in which we will measure the amount of user corrections in the generated structure. Our hypothesis is that user intervention should decrease over time.

In Proc. of Spring 2012 PeWe Workshop, pp. 5-6

Web Surfing in Conditions of Slow and Intermittent Internet Connection – Personalized User Interface


Ľuboš Demovič
bachelor study, supervised by Michal Barla

Abstract. Despite the advancements in information and telecommunication technologies, slow and intermittent Internet connection is still a serious issue in many parts of the world and is most visible in developing countries.

At the same time, the Internet with its most popular service – the Web – has become a very important part of our everyday lives, as more and more of human activity is taking place online.

We propose a concept of a software solution called OwNet which makes the Web surfing experience less frustrating even in the case of slow and intermittent Internet connection. OwNet is based on using a local proxy server acting as an intelligent bridge between the client's browser application and the Internet, communicating with other clients and services in order to provide the best surfing experience.

The author focuses on the user interface and its personalization. In addition to good functionality, all successful applications have to have a modern design with an emphasis on clear and intuitive navigation for the users. The interface is a very important component of any innovative application.

In Proc. of Spring 2012 PeWe Workshop, pp. 7-8

Validation of Music Metadata via Game with a Purpose


Peter Dulačka
bachelor study, supervised by Jakub Šimko

Abstract. The quantity of music metadata on the Web is sufficient; music recommendation and online repository systems are proof of it. However, it has become a real challenge to keep the quality of metadata at a reasonable level. Automatic approaches are fast but inaccurate; the cost of human computation is too high. We present a game with a purpose (GWAP) called City Lights – a music metadata validation game which lowers the cost of human computation and makes validation fun.

Our goal is to get rid of wrong user-submitted metadata or metadata not usable at a global scale. The main problem of crowdsourced data is their variety and occasional subjectiveness – a song can be "good" or "awesome" for someone, but "boring" for someone else. In our game we deal with this problem by letting the crowd check annotations submitted by another crowd in the past. By not letting players add new annotations, we keep them concentrated on validation. We try to keep the game fun through many additional features – the most important one being that we play music only from the player's favourite domain.

The principle of the game is simple. The player listens to a song and is presented with a couple of sets of annotations. Each of these sets belongs to exactly one song (a different song for each set), and one of these sets belongs to the song being played to the player. The player has to decide which of the given sets belongs to the song she hears. By tracking players' actions we are able to determine whether a given set of annotations describes the song correctly – and hence validate the annotations. Even though the share of wrong annotations could be higher in the beginning, first results have already shown us that the ratio of validated to removed annotations is very good and wrong annotations are removed very quickly.
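The validation loop sketched below only illustrates the principle from the previous paragraph: each time a player matches (or fails to match) the annotation set with the song being played, the set gains or loses support. The counting scheme and the thresholds are invented for the sketch; the abstract does not state the game's actual decision rule.

```python
# Illustrative vote tallying for annotation validation (assumed scheme).
from collections import defaultdict

votes = defaultdict(lambda: {"correct": 0, "wrong": 0})

def record_choice(played_song_id: str, chosen_set_song_id: str) -> None:
    """A player heard `played_song_id` and picked the annotation set
    belonging to `chosen_set_song_id`."""
    if chosen_set_song_id == played_song_id:
        votes[played_song_id]["correct"] += 1   # the set described its song well
    else:
        votes[played_song_id]["wrong"] += 1     # the right set was not recognised

def verdict(song_id: str, min_votes: int = 5, min_ratio: float = 0.7) -> str:
    v = votes[song_id]
    total = v["correct"] + v["wrong"]
    if total < min_votes:
        return "undecided"
    return "validated" if v["correct"] / total >= min_ratio else "removed"
```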

In Proc. of Spring 2012 PeWe Workshop, pp. 57-58

Recognizing User’s Emotion in Information System


Máté Fejes
master study, supervised by Jozef Tvarožek

Abstract. Human emotions and their signs are innate to humans, regardless of the particular person. Thanks to that, they can serve as implicit feedback from users in information systems. Facial expressions are unconscious signs of reaction to affect. According to a number of studies in the area of psychofeedback, different movements of the face are common to all people, so we can derive the reasons for these reactions – emotions. In the case of educational systems we can estimate the user's opinion about a text from facial expressions, so we are able to find out knowledge, interests, mood and other attributes that would not be possible to identify with the help of traditional ways of gaining feedback.

In this project we deal with gaining, representing and utilizing user emotions while using a web-based (educational) system. To find out what is on the subject's mind, we need a camera that records the user's face. For the extraction of emotions from video we can use a number of existing tools which are able to recognize a human face and its facial features. Facial features are important points of the human face (e.g. the borders of the mouth or eyes). Their location depends on the movements of facial muscles and therefore on the emotions of the user.

Our aim is to propose a method for user modeling based on emotions invoked during work in a web-based (educational) system. Our method is going to be based on the results of an experiment we plan to carry out in a real environment with users. Within the experiment we will track the users with a webcam while they are working in the selected system. By comparing the extracted emotions with the users' activities we will try to explore relations between mental activity and the actions executed in the user interface. The goal of the experiment is to identify activities, or groups of activities, specific to users in a certain emotional state. This way we will be able to find out the emotions invoked by given content without relying on traditional types of feedback.

In Proc. of Spring 2012 PeWe Workshop, pp. 33-34

Educational Content Recommendation Based on Collaborative Filtering


Eduard Fritscher
bachelor study, supervised by Marián Šimko

Abstract. In the past, students could only study from handwritten books, and to achieve their goals, which were to pass the exams or expand their knowledge, they could only rely on the teacher to guide them through the mass of information. But times have changed. Thanks to e-learning and the Internet, students have access to all the information they could ever possibly need, but without any guidance. Some people do not need any guidance; they intuitively know which path to choose. But for those who do, recommendation systems are the perfect tool to guide them through the lectures they need to process.

There are all sorts of recommendation systems which are based on different methods. Some of them recommend based on geographical data, some learn from the user, some recommend based on previous actions; the list of techniques is endless. But every method can be traced back to two main recommendation types: content-based or collaborative. In this project we will try to create a recommender system based on a hybrid technique which uses the best aspects of the two main types, collaborative and content-based recommendation, and later implement it in the ALEF system. ALEF, the Adaptive Learning Framework, is an e-learning system created at the Slovak University of Technology, more precisely at the Faculty of Informatics and Information Technologies. Since this recommender will be implemented in an e-learning system, the main goal is for it to be able to guide the student through the courses, recommending exactly those study materials that he or she will need to successfully pass the course.

In Proc. of Spring 2012 PeWe Workshop, pp. 9-10

Discovering Relationships between Entities in Web-based Digital Libraries


Michal Holub
doctoral study, supervised by Mária Bieliková

Abstract. Information about various entities that can be found in digital libraries on the Web is mostly intended for processing by humans. Data is scattered across many web portals. There are few explicitly expressed relationships between these entities so we have difficulties creating more intelligent web applications which could e.g. enable better search possibilities. Relationships bring semantics to the data. Search engines working with the Web of Data with semantics can provide more precise results for the queries, especially when asking questions about entities.

Relationships between entities and objects are also essential for their integration and for creating mashups of things. We can use them in exploratory search when we create an overview of the target domain from web objects. There are various types of relationships which we can find between entities. The most interesting relationships are created based on the interaction of users with web objects. These relationships may not have a meaningful name but can express relatedness of the objects (e.g. authors, papers, conferences).

In our research we focus on the domain of digital libraries. We crawl various web portals and parse information from them, which we subsequently integrate into one dataset. Apart from metadata about scientific articles this dataset also contains user-generated content like tags. Next, we transform the data according to the Linked Data principles and we discover relationships between various entities. We focus both on relationships extracted using text processing, as well as on relationships extracted from behavior of users when working with a digital library. This introduces research challenges such as verification and validation of discovered relationships, their weighting and ranking. Newly discovered relationships enrich the domain model which later leads to improvement of search, personalization and recommendation processes.

In Proc. of Spring 2012 PeWe Workshop, pp. 59-60

Augmenting the Web for Facilitating Learning


Róbert Horváth
master study, supervised by Marián Šimko

Abstract. Every day, users on the web go through a large number of articles and texts with different purposes. It takes a lot of time, and we think this time can be spent more effectively. Information technologies and text augmentation methods are able to provide the user with additional information during web browsing, which is helpful in learning processes such as learning new languages. The problem is that those texts are written in natural language, which is not understandable for computers, and therefore web augmentation is a complicated task. Finding methods which allow augmentation of selected parts of web documents is a research challenge in the field of technology-enhanced learning.

The goal of our work is to find a method for web augmentation during casual web browsing which helps with the learning process in the domain of foreign language learning. The potential of this approach is supported by the agreement of experts that vocabulary acquisition occurs incidentally and that even minimal mental processing (of presented vocabulary) can have memory effects. Our method needs to represent user knowledge and its specifics (for example forgetting) in open information spaces. It is important to take into consideration the amount of knowledge the user already has and the goals he would like to reach. As a solution we consider implementing a plugin for the Adaptive proxy project or a web browser extension.

In Proc. of Spring 2012 PeWe Workshop, pp. 61-62

Predicting Friends’ Locations as Contribution to Social Context Acquiring


Tomáš Jendek
bachelor study, supervised by Dušan Zeleník

Abstract. A user is situated in a certain environment, and this environment affects his behavior. Location context represents an environment attribute which can be used in personalized or recommender systems. Nowadays people use mobile devices on a daily basis, which is a great opportunity to gain location context from the user. Gaining information from mobile devices has become relatively simple due to the fact that the devices are equipped with various sensors. Our goal is to discover whether the user is at home or at work, and also to discover places of interest where the user spends a certain amount of time. In addition to discovering location context, we focus on estimating the future position of the user's friends. Location context can contribute to determining social context, such as community integration or relations to the people around the user.

Our focus lies on the effectiveness of obtaining location data. We propose using the GSM sensor instead of the GPS sensor, as it is more energy efficient. Our solution is based on a GSM tower location database. For experiments and evaluation we implement an application for Android OS which collects tracking data on the user and predicts the user's future position. To evaluate the prediction itself, we use implicit feedback by analyzing location logs.
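As a rough illustration of such a pipeline, the sketch below keeps a first-order transition model over observed GSM cell IDs and resolves the predicted cell to coordinates through a tower database. The Markov-style predictor and the data layout are assumptions for the sketch; the abstract does not specify the prediction model.

```python
# Predict the next GSM cell from logged cell transitions (assumed approach).
from collections import Counter, defaultdict

transitions = defaultdict(Counter)   # {current_cell: Counter of next cells}

def observe(cell_log):
    """cell_log: chronological list of cell IDs seen by the phone."""
    for current, nxt in zip(cell_log, cell_log[1:]):
        transitions[current][nxt] += 1

def predict_next(current_cell, tower_db):
    """tower_db: {cell_id: (lat, lon)} from a GSM tower location database."""
    if not transitions[current_cell]:
        return None
    nxt, _ = transitions[current_cell].most_common(1)[0]
    return nxt, tower_db.get(nxt)

observe(["c1", "c2", "c3", "c2", "c3", "c4"])
print(predict_next("c2", {"c3": (48.15, 17.07), "c4": (48.16, 17.10)}))
# -> ('c3', (48.15, 17.07))
```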

In Proc. of Spring 2012 PeWe Workshop, pp. 35-36

Discovering Keyword Relations


Peter Kajan
master study, supervised by Michal Barla

Abstract. When working with keywords, there are some issues that one has to deal with. For instance, in the search process problems can be caused by: homonyms (wrong entities are found), synonyms (entities are not found) and the abstraction level (queries are too specific/general). If these relations are identified, search queries may be adapted to achieve more relevant results.

Common approaches to revealing keyword relations are based on the user's explicit actions such as tagging. We experimented with the identification of keyword relations from the user's more implicit actions such as document browsing. We analysed streams of visited documents described by keywords automatically extracted from their content. The main hypothesis is that documents visited in one "session" are related and therefore there is a reasonable probability that their keywords are also related.

The advantage of our approach is that document browsing is a common action covering more domains than tagging. The data is "cheaper" to acquire and therefore we are able to analyse more data and reveal more relations. The method consists of multiple steps. First, similarity and parent-child relations are calculated according to the above-mentioned hypothesis. Then, the relations are organized into a lightweight ontology. In the next step, the ontology is mapped to the Linked Data network, which is further used for relation enrichment. We plan to evaluate the proposed method using logs from the PeWe proxy, a proxy server used as a platform for personalization developed at FIIT STU.
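A minimal sketch of the session hypothesis described above: visits are split into sessions by an inactivity gap, keywords of documents within one session are treated as co-occurring, and a relatedness weight is derived from the co-occurrence counts. The 30-minute gap and the Jaccard-style normalisation are illustrative choices, not the method's actual parameters.

```python
# Session-based keyword co-occurrence (assumed parameters).
from collections import Counter
from itertools import combinations

SESSION_GAP = 30 * 60  # seconds of inactivity that closes a session

def sessions(visits):
    """visits: list of (timestamp, {keywords}) sorted by time."""
    current, last_t = [], None
    for t, keywords in visits:
        if last_t is not None and t - last_t > SESSION_GAP:
            yield current
            current = []
        current.append(keywords)
        last_t = t
    if current:
        yield current

def keyword_relations(visits):
    pair_counts, kw_counts = Counter(), Counter()
    for session in sessions(visits):
        kws = set().union(*session)
        kw_counts.update(kws)
        pair_counts.update(frozenset(p) for p in combinations(sorted(kws), 2))
    # relatedness = co-occurrence normalised by how often either keyword appears
    return {tuple(pair): c / (kw_counts[min(pair)] + kw_counts[max(pair)] - c)
            for pair, c in pair_counts.items()}
```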

In Proc. of Spring 2012 PeWe Workshop, pp. 63-64


Marcel Kanta
master study, supervised by Marián Šimko

Abstract. The Internet brings people information; however, there is too much of it, and people these days face a phenomenon called information overload. There are several ways to cope with this problem, the most significant being recommender systems and faceted search. Modern systems are not only personalised, they are trend-aware, and there is a trend to use location to improve the results and thus the user experience.

In our work we focus on a trend-aware user model with location-aware trends. This model can be used to improve recommendations of news to the user; however, it is a general model and can be a part of other information systems. The model is inspired by other work done in this field; we focus on the location aspect of the model. This means we extended the model with location metadata, which we then use when recommending news. The location aspect is realised in the model by partitioning user metadata into quadtree regions and then recommending aggregated results from regions and their parent regions.

We created the model, its quantitative validation process and a comparison between the model with and without location. We will use a simple recommendation algorithm, such as cosine similarity, and validate the results of recommendation with precision and recall (P@n). We will do it on a distributed map-reduce infrastructure.
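To make the partitioning idea concrete, the sketch below encodes a location as a quadtree cell key, aggregates term weights from the cell and its parent cells into a regional profile, and scores items by cosine similarity. The cell encoding, the depth and the simple scoring are assumptions made for illustration, not the model's actual design.

```python
# Quadtree cell keys, region aggregation and cosine scoring (assumed details).
import math

def quadtree_key(lat: float, lon: float, depth: int = 8) -> str:
    """Encode a point as a quadtree path of the given depth ('0'..'3' per level)."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    key = ""
    for _ in range(depth):
        lat_mid, lon_mid = (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2
        key += str((2 if lat >= lat_mid else 0) + (1 if lon >= lon_mid else 0))
        lat_lo, lat_hi = (lat_mid, lat_hi) if lat >= lat_mid else (lat_lo, lat_mid)
        lon_lo, lon_hi = (lon_mid, lon_hi) if lon >= lon_mid else (lon_lo, lon_mid)
    return key

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def region_profile(region_terms: dict, key: str) -> dict:
    """Aggregate term weights from the cell and all of its parent cells."""
    profile = {}
    for prefix_len in range(len(key), 0, -1):
        for term, w in region_terms.get(key[:prefix_len], {}).items():
            profile[term] = profile.get(term, 0.0) + w
    return profile
```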

In Proc. of Spring 2012 PeWe Workshop, pp. 37-38

Named Entity Recognition for Slovak Language


Ondrej Kaššák
bachelor study, supervised by Michal Kompan

Abstract. Nowadays we are literally overwhelmed with information; it is impossible for us to process all the information we find. Various approaches have been proposed, such as personalized recommendations based on content, or search methods using key entities from the texts. They take the named entities appearing in the text as an input. Based on them, recommendation algorithms can search and work more efficiently in comparison with other methods working only with text titles or with the most frequent words in the text.

In our research we propose a method for recognition and extraction of named entities in texts. The aim of our proposed method is to recognize entities in the text and then place them into the proper categories. We primarily focus on texts in the Slovak language, because a comprehensive tool for this language that would identify all entities classified according to MUC-6 (6th Message Understanding Conference) is missing. We also describe possibilities of application to other inflectional languages.

The proposed method consists of two parts – the initial pre-processing of the text and the recognition of the named entities. The process of entity recognition consists of identifying potential entities occurring in the processed text, determining their scope and subsequently identifying the category to which they belong. We use the Slovak and English versions of Wikipedia to identify new entities, a database for fast recognition of entities we have found before, and the Slovak National Corpus for filtering out common words with a capitalized first letter.
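The following sketch illustrates the recognition step on a small scale: runs of capitalised tokens are taken as candidates, candidates made up only of common words (as a corpus such as the Slovak National Corpus might report them) are dropped, and the rest are categorised via an external lookup such as Wikipedia. Both helper functions are hypothetical stand-ins, not the interfaces of the real resources.

```python
# Candidate extraction and filtering for NER (illustrative stand-ins).
import re

def candidate_entities(text: str):
    """Return maximal runs of capitalised words as potential named entities."""
    runs = re.findall(r"(?:\b[A-ZÁČĎÉÍĽĹŇÓŠŤÚÝŽ][\w\-]*\b[ ]?)+", text)
    return [run.strip() for run in runs]

def recognise(text: str, is_common_word, lookup_category):
    """is_common_word: corpus-based filter; lookup_category: e.g. Wikipedia lookup."""
    entities = {}
    for cand in candidate_entities(text):
        if all(is_common_word(w) for w in cand.split()):
            continue                       # capitalised only because of sentence start etc.
        category = lookup_category(cand)   # PERSON / LOCATION / ORGANIZATION / ...
        if category:
            entities[cand] = category
    return entities

print(recognise("Bratislava je hlavné mesto Slovenska",
                is_common_word=lambda w: w.lower() in {"je", "hlavné", "mesto"},
                lookup_category=lambda c: {"Bratislava": "LOCATION",
                                           "Slovenska": "LOCATION"}.get(c)))
```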

In Proc. of Spring 2012 PeWe Workshop, pp. 65-66

Building Domain Model via Game With a Purpose


Marek Kiss
bachelor study, supervised by Jakub Šimko

Abstract. The Web brings many possibilities to improve learning processes. A part of them is represented by adaptive learning systems. These systems try to determine the level of a student's current knowledge of particular concepts in order to adapt to his needs and make learning more effective for him. A prerequisite for a working adaptive system is the existence of a domain model that provides the basis for modelling both user knowledge and the semantics of domain documents. The simplest domain model can be represented as a concept relationship network. However, even the creation of such a simple model cannot be fully automated, and it is usually the work of content authors.

Building a domain model for an adaptive learning system is a task that is suitable for a game with a purpose. A game can motivate students to contribute to the creation of a lightweight term relationship network for the domain model of the same system they are using for learning. The idea is to turn this boring work into an enjoyable part of the learning process.

For this purpose, we have designed a game based on the principles of the Little Search Game. We placed our game in a specific domain, which means that we can predict the player's moves and prepare a set of potential options for him. This brings more dynamics and attraction to our game. In our game the player shoots at bubbles containing words, trying to pick words that have something in common with a given concept. The amount of points he obtains is affected by the number of occurrences of this pair of words in domain documents. We allow players to insert their own words too.
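The abstract says the score is affected by the number of co-occurrences of the chosen word and the given concept in domain documents, but it does not give the exact formula. The sketch below therefore just counts co-occurring documents and leaves the mapping to points as a pluggable function; the proportional scoring shown is only an assumption.

```python
# Co-occurrence counting for scoring a shot (assumed scoring function).
def cooccurrences(concept: str, word: str, documents) -> int:
    """Number of domain documents containing both the concept and the word."""
    return sum(1 for doc in documents
               if concept.lower() in doc.lower() and word.lower() in doc.lower())

def score_shot(concept: str, word: str, documents, to_points=lambda c: 10 * c) -> int:
    return to_points(cooccurrences(concept, word, documents))

docs = ["A binary tree is a data structure.",
        "Graphs and trees are used in search algorithms.",
        "A queue is a FIFO data structure."]
print(score_shot("tree", "structure", docs))   # 10 points for 1 co-occurring document
```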

In Proc. of Spring 2012 PeWe Workshop, pp. 67-68

Group Recommendations for Adaptive Social Web-based Applications


Michal Kompan
doctoral study, supervised by Mária Bieliková

Abstract. Personalized recommendation is a well researched area nowadays, while group recommendation still needs to be explored. Several of our daily activities are not individual but group based. Aspects of group recommendation become more visible as a larger part of our lives is transformed and moved to social networks. Group recommendation can be used not only in the sense of recommending items to a group of users. Several types of groups can be observed – temporary or stable, forced or natural; moreover, in the context of the web we can consider virtual groups as well.

This brings us to the assumption that principles of group recommendation can be used in standard single-user recommendation tasks. Firstly, the aggregation strategies used for aggregating group preferences can be used in multicriteria problems. Similarly, groups can be useful when a new user interacts with the system, which refers to the cold-start problem. The task of group recommendation can thus be extended to standard single-user recommendation as well. The aggregation of single-user profiles in order to obtain one group profile combines user preferences and, in some settings, can also introduce variety, which can be interesting from the point of view of improving recommendations.
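For illustration, here are two classic aggregation strategies from the group recommendation literature, average and least misery, sketched to show how single-user profiles can be merged into one group profile. The abstract mentions aggregation strategies in general and does not commit to these particular ones.

```python
# Average and least-misery aggregation of per-user ratings into a group profile.
def average_strategy(ratings_per_user):
    """ratings_per_user: list of {item: rating} dicts, one per group member."""
    items = set().union(*ratings_per_user)
    return {i: sum(r.get(i, 0) for r in ratings_per_user) / len(ratings_per_user)
            for i in items}

def least_misery_strategy(ratings_per_user):
    """Score each item by the least satisfied member (items rated by everyone)."""
    items = set.intersection(*(set(r) for r in ratings_per_user))
    return {i: min(r[i] for r in ratings_per_user) for i in items}

group = [{"a": 5, "b": 2, "c": 4}, {"a": 3, "b": 4, "c": 1}]
print(average_strategy(group))       # a: 4.0, b: 3.0, c: 2.5 (key order may vary)
print(least_misery_strategy(group))  # a: 3, b: 2, c: 1
```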

In Proc. of Spring 2012 PeWe Workshop, pp. 11-12

Web Surfing in Conditions of Slow and Intermittent Internet Connection – Web Content Prefetching


Martin Konôpka
bachelor study, supervised by Michal Barla

Abstract. Despite the advancements in information and telecommunication technologies, slow and intermittent Internet connection is still a serious issue in many parts of the world and is most visible in developing countries.

At the same time, the Internet with its most popular service – the Web – has become a very important part of our everyday lives, as more and more of human activity is taking place online.

We propose a concept of a software solution called OwNet which makes the Web surfing experience less frustrating even in the case of slow and intermittent Internet connection. OwNet is based on using a local proxy server acting as an intelligent bridge between the client's browser application and the Internet, communicating with other clients and services in order to provide the best surfing experience.

Proactively downloading and caching not yet requested web objects of the user's interest may result in the user's perception of having a faster Internet connection. The author experiments with existing prefetching algorithms, which are usually able to prefetch only objects previously visited in the user's browsing history. However, by enhancing the user's history with objects visited by other users using a collaborative filtering method, the prefetching algorithm may download objects which the user has not yet visited. At the same time, it is also important to intelligently adapt the prefetching mechanism to the user's behaviour, e.g., to choose when the objects should be prefetched in order to avoid unnecessary downloads.
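A minimal sketch of enriching a user's prefetch candidates with pages visited by similar users: neighbours are picked by the overlap of browsing histories and their unvisited pages are ranked as candidates. The Jaccard similarity, the top-k neighbourhood and the candidate limit are illustrative assumptions, not OwNet's actual algorithm.

```python
# Collaborative prefetch-candidate selection (assumed similarity and cut-offs).
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def prefetch_candidates(user: str, histories: dict, k: int = 3, limit: int = 10):
    """histories: {user_id: set of visited URLs}; returns URLs the user has not seen."""
    own = histories[user]
    neighbours = sorted((u for u in histories if u != user),
                        key=lambda u: jaccard(own, histories[u]), reverse=True)[:k]
    scores = {}
    for u in neighbours:
        weight = jaccard(own, histories[u])
        for url in histories[u] - own:
            scores[url] = scores.get(url, 0.0) + weight
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [url for url, _ in ranked[:limit]]
```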

In Proc. of Spring 2012 PeWe Workshop, pp. 7-8

Emotion Classification of Microblogs Based on Appraisal Theory


Peter Korenek
master study, supervised by Marián Šimko

Abstract. From time immemorial, people and companies that sell goods or provide services have desired to know what people think about their products. In this paper we present a novel method for emotion classification of text that allows recognizing opinions in microblogs.

This method differs from other approaches by using the Appraisal theory, which we utilize in linguistic, semantic and syntactic analysis of microblog texts. Our method is target oriented – we assign appraisal expressions to targets, not to whole microblogs as in other methods. This approach allows us to classify the emotions expressed in microblogs more accurately. Using this approach we determine which sentences and targets are more important in a microblog's text. We manually created patterns used for the identification of targets in microblogs. These patterns are based on POS tagging and on syntactic and semantic analysis of microblogs.

We use the Appraisal theory to describe in more detail the type of emotional relationship between the user and the entities he writes about in his microblogs. Analysis of the appraisal expressions assigned to targets provides a method to find out what the user's interests are. We also discover what attitude the user has towards his interests.

In Proc. of Spring 2012 PeWe Workshop, pp. 13-14

Analysing Temporal Dynamics in Search Intent


Tomáš Kramár
doctoral study, supervised by Mária Bieliková

Abstract. Understanding how searchers change their search goals over time is important for the purposes of personalized search. This knowledge is crucial for building a time-sensitive ranker, for search intent prediction and for a multitude of other search-related tasks. In our work, we analyse temporal search intent dynamics using the publicly available AOL search engine clickthrough log.

We hypothesize that each person has multiple personas. The term is rooted in psychology, where it denotes a social face the individual presents to the world; it reflects the role in life that the individual is playing. We believe that among the many personas an individual can have, two should stand out: the persona related to personal life and the persona related to work life. Separating these two personas and creating a separate user model for each of them has the potential to bring a user model that is focused, similarly to a lean, short-term model, and yet has enough data to allow confident adaptation.

In Proc. of Spring 2012 PeWe Workshop, pp. 15-16

User Modeling Using Social and Game Principles


Peter Krátky
master study, supervised by Jozef Tvarožek

Abstract. Personality has a significant impact on the way a user uses web applications. Personalized education systems are an example of applications which take advantage of the availability of the user's characteristics, as different personalities have different learning habits. Therefore a need arises for modeling of the user's characteristics and traits. Explicit methods for retrieving information about the user might be obtrusive or might seem too personal. That is why we want to perform user modeling in a fun and appealing way – using computer games.

The goal of our project is to design an implicit user modeling method aimed at the user's personality. The first part of our work deals with the issue of retrieving relevant user data to leverage. We will design a method for tracking the user's behavior in an (open-source) game. Tracked behavior is a set of actions performed in the game which are correlated with the major personality traits (e.g. socializing, competitiveness, teamwork). In the second stage of our work we will design a method to infer the user model. We will verify the user modeling method in an experiment at the faculty, within which we will retrieve user data both explicitly using a questionnaire and implicitly using our method, and we will compare the inferred models.

In Proc. of Spring 2012 PeWe Workshop, pp. 39-40


Eduard Kuric
doctoral study, supervised by prof. Mária Bieliková

Abstract. When programmers write new code, they are often interested in finding definitions of functions – existing, working fragments with the same or similar functionality – and reusing as much of that code as possible. Short code fragments which are returned for a programmer's query do not provide enough "background" to help them reuse the fragments, and programmers usually have to invest considerable effort to understand how to do so. Keyword-based code search tools face the problem of low precision in their results due to the fact that a single word of the programmer's query may not match the desired functionality. This is because no source code content is analyzed, or the programmer's needs are not clearly represented in the query. Understanding code and determining how to use it is a manual and time-consuming process. In general, programmers want to find initial points such as relevant functions. They want to easily understand how the functions are used and to see the sequence of function invocations in order to understand how concepts are implemented. When programmers learn about a program (source code), the control flow (the execution of function calls) needs to be followed, which means successively jumping from one function to another.

Our main goal is to enable programmers to find functions relevant to query terms and their usages. In our approach, identifying popular fragments is inspired by the PageRank algorithm, where the popularity of a function is determined by how many functions call it. We designed a complex model based on the vector space model, by which we are able to establish relevance among facts whose content contains terms that match the programmer's queries directly. The result is a sorted list of relevant functions that reflects the associations between concepts in the functions and the programmer's query.
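A minimal sketch of the popularity idea: a PageRank-style iteration over the call graph (a function gains popularity from the functions that call it), blended with a textual relevance score such as a cosine similarity from the vector space model. The damping factor, iteration count and linear blending are assumptions made for illustration.

```python
# PageRank-style popularity over a function call graph (illustrative parameters).
def call_rank(calls: dict, damping: float = 0.85, iterations: int = 20) -> dict:
    """calls: {function: [functions it calls]}; returns a popularity score per function."""
    funcs = set(calls) | {g for callees in calls.values() for g in callees}
    rank = {f: 1.0 / len(funcs) for f in funcs}
    for _ in range(iterations):
        new = {f: (1 - damping) / len(funcs) for f in funcs}
        for caller, callees in calls.items():
            if callees:
                share = damping * rank[caller] / len(callees)
                for callee in callees:
                    new[callee] += share
        rank = new
    return rank

def combined_score(relevance: float, popularity: float, alpha: float = 0.7) -> float:
    """Blend textual relevance (e.g. cosine similarity) with call-graph popularity."""
    return alpha * relevance + (1 - alpha) * popularity

print(call_rank({"main": ["parse", "render"], "parse": ["tokenize"], "render": ["tokenize"]}))
```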

In Proc. of Spring 2012 PeWe Workshop, pp. 17-18

Explicit and Implicit Feedback in Recommendation


Martin Labaj
master study, supervised by Mária Bieliková

Abstract. Today's Web systems are becoming more and more adaptive, modifying their behaviour to suit different users' needs. Recommender systems are an important part of such adaptive Web systems. They benefit both users (e.g. seeing more items the user is interested in without navigating through the vast amount of items available) and system owners (e.g. selling more items). In a technology enhanced learning (TEL) environment, it is important to aid users while learning. Several of the user tasks supported by current recommender systems are well explored (e.g. find good items, recommend a sequence), but some can be supported better (e.g. find good pathways).

In our research we explore the use of implicit and explicit feedback in recommendation. Regarding implicit feedback, we previously employed gaze position and other interest indicators in fragment recommendation on the Web. Also, in current Web browsers users can open multiple new windows and tabs and switch between them at any time. Such behaviour is often invisible to Web usage mining, as only page loads are tracked server-side. However, if we can capture such data from common users, we can track revisitations, paths through resources, etc. more accurately. Currently we are tracking and evaluating users' parallel browsing behaviour in the ALEF system.

In Proc. of Spring 2012 PeWe Workshop, pp. 19-20

Web Surfing in Conditions of Slow and Intermittent Internet Connection – Modeling User Groups


Marek Láni
bachelor study, supervised by Michal Barla

Abstract. Despite the advancements in information and telecommunication technologies, slow and intermittent Internet connection is still a serious issue in many parts of the world and is most visible in developing countries.

At the same time, the Internet with its most popular service – the Web – has become a very important part of our everyday lives, as more and more of human activity is taking place online.

We propose a concept of a software solution called OwNet which makes the Web surfing experience less frustrating even in the case of slow and intermittent Internet connection. OwNet is based on using a local proxy server acting as an intelligent bridge between the client's browser application and the Internet, communicating with other clients and services in order to provide the best surfing experience.

The author's target is the explicit creation of user groups, which will provide useful information about users' interests in the form of tags. Tags, together with page ratings, can be helpful for web site recommendation and prefetching.

In Proc. of Spring 2012 PeWe Workshop, pp. 7-8

Automated Public Data Refining


Martin Lipták
bachelor study, supervised by Ján Suchal

Abstract. Public institutions have legal obligations to share certain data on the Web. While public registers (e.g. businesses, organizations) and bulletins (public procurements) are essential for business communication, other data increase transparency of public institutions and enable public investigation (public contracts).

Despite the fact that these data are becoming publicly available on the Web, there are two problems. The first problem is the format and structure, which might not be suitable for machine processing. For example, some documents are published as scanned images with censored names and prices. This makes such documents difficult to investigate for a human expert and almost impossible to process with a computer. As another example, company liquidations are published in periodic PDF bulletins as unstructured text content, and it is difficult to reliably find out whether a company is being liquidated or the liquidation is being cancelled. Fortunately, the most common format is HTML, which is easy to parse and in most cases provides structure. The second problem is various mistypings, ambiguities and duplicates. They are common even in correctly parsed and structured data.

We address the second problem by designing a trainable duplicate detection method. Our method uses supervised machine learning to clean off (or refine) duplicates, mistypings and other ambiguities to make data querying results consistent and reliable. We have evaluated a prototype on a data set from the Slovak Companies Register. Our method is not complete yet, but the prototype evaluation has given us valuable experience and intuition for further research.
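The sketch below frames duplicate detection as supervised classification of record pairs: each pair is turned into similarity features and a classifier is trained on labelled examples. The features, the toy records and the use of scikit-learn's LogisticRegression are illustrative choices, not the project's actual setup.

```python
# Pairwise duplicate detection as supervised classification (illustrative setup).
from difflib import SequenceMatcher
from sklearn.linear_model import LogisticRegression

def features(a: dict, b: dict):
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    addr_sim = SequenceMatcher(None, a["address"].lower(), b["address"].lower()).ratio()
    same_id = 1.0 if a.get("reg_no") == b.get("reg_no") else 0.0
    return [name_sim, addr_sim, same_id]

# labelled training pairs: (record_a, record_b, is_duplicate)
train = [
    ({"name": "Alfa s.r.o.", "address": "Hlavna 1, Bratislava", "reg_no": "111"},
     {"name": "ALFA, s. r. o.", "address": "Hlavná 1, Bratislava", "reg_no": "111"}, 1),
    ({"name": "Alfa s.r.o.", "address": "Hlavna 1, Bratislava", "reg_no": "111"},
     {"name": "Beta a.s.", "address": "Dlha 5, Kosice", "reg_no": "222"}, 0),
]
X = [features(a, b) for a, b, _ in train]
y = [label for _, _, label in train]
clf = LogisticRegression().fit(X, y)

pair = ({"name": "Alfa sro", "address": "Hlavna 1 Bratislava", "reg_no": "111"},
        {"name": "Alfa s.r.o.", "address": "Hlavna 1, Bratislava", "reg_no": "111"})
print(clf.predict([features(*pair)]))  # expected to flag the pair as a duplicate
```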

In Proc. of Spring 2012 PeWe Workshop, pp. 69-70

Acquiring Web Site Metadata by Heterogeneous Information Sources Processing


Milan Lučanský
master study, supervised by Marián Šimko

Abstract. The World Wide Web is an almost unlimited source of knowledge and information, and every year the number of available web sites increases by millions. That introduces a demand for automatic processing of vast collections of web documents. We need to assign descriptive metadata to web pages to facilitate further processing, and it turns out that keywords are a suitable representation of web content. Nowadays, keywords form a basis for semantic representations, as they are utilized in the field of ontology engineering. Most popular search engines are based on the keyword search paradigm, and keywords are even used in user modeling for adaptive web-based systems to represent the context. Social services such as Delicious utilize keywords too.

For offline document collections there are various approaches to automatic term recognition (ATR) from plain text corpora. If used on web documents, they could benefit from the hidden semantics of the HTML elements used for formatting and of the style sheets used to visualize text content. Our current research aims at cascading style sheets (CSS) as an additional source for identifying potential keywords. The idea of utilizing CSS in cooperation with ATR algorithms is quite new and unexplored.

The plain text content is passed to an ATR algorithm, which extracts weighted keywords. From the web page we extract keywords formatted by selected CSS attributes, compute the CssRel coefficient and improve the ATR keyword weights. For the anchor texts pointing to the examined web page we compute LinkRel and improve the ATR keywords. Finally, we acquire text content from selected HTML elements, compute TagRel and improve the ATR keywords. We produce a new ordering of the extracted keywords, where the most relevant should appear at the top of the list with the highest weights. Our plan is to evaluate our method on a random set of web pages of different formats (blogs, news portals, wiki articles, etc.) from the World Wide Web.
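The exact way CssRel, LinkRel and TagRel modify the ATR weights is not given in the abstract; the sketch below assumes a simple multiplicative boost per source, with hypothetical weights, just to illustrate the re-ranking step.

```python
# Re-weighting ATR keywords with CSS-, link- and tag-based coefficients
# (the combination formula and weights are assumptions).
def reweight(atr_keywords: dict, css_rel: dict, link_rel: dict, tag_rel: dict,
             w_css: float = 0.3, w_link: float = 0.3, w_tag: float = 0.2) -> dict:
    """atr_keywords: {keyword: base weight}; the *_rel dicts hold per-keyword
    relevance in [0, 1] derived from CSS styling, anchor texts and HTML tags."""
    boosted = {
        kw: base * (1 + w_css * css_rel.get(kw, 0.0)
                      + w_link * link_rel.get(kw, 0.0)
                      + w_tag * tag_rel.get(kw, 0.0))
        for kw, base in atr_keywords.items()
    }
    return dict(sorted(boosted.items(), key=lambda kv: kv[1], reverse=True))

print(reweight({"semantics": 0.42, "keyword": 0.40, "page": 0.15},
               css_rel={"semantics": 0.8}, link_rel={"keyword": 0.5}, tag_rel={}))
```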

In Proc. of Spring 2012 PeWe Workshop, pp. 71-72

Unified Search of Linked Data on the Web


Peter Macko
master study, supervised by Michal Holub

Abstract. Searching for information on the Web is increasingly difficult because of its enormous growth. To make matters worse, most of the data published on the Web is in an unstructured format. However, more and more structured data is being published, which is also evident from the emergence of unifying initiatives like Linked Data. Structured data enables us to build web applications allowing users to search for information more comfortably. But querying this type of data is not a trivial task.

Nowadays, there are various structured data sources, but only a few search engines are able to search in them utilizing the full power of the provided semantics. The majority of search engines search for information using keywords, which may not always give users the results they desire. To utilize the full power of structured data, a special query language like SPARQL has to be used. However, queries in this language are not easily constructible for the majority of ordinary users.

We would like to change this by creating a complex search engine which could understand pseudo-natural language queries. These queries will be transformed into the SPARQL language and executed on an ontological database. This is not a trivial task and we are now considering two simplifications: 1. the user has a skeleton which will guide him through writing a valid query, or 2. we will use a natural language processor (e.g. Stanford CoreNLP) which can help our search engine understand what the user wants.
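A toy illustration of the query-skeleton simplification (option 1 above): the user fills a fixed "<type> by <author>" pattern and the system emits a SPARQL query. The pattern, prefixes and predicates are invented for the example and do not reflect the project's actual ontology.

```python
# Skeleton-based pseudo-natural-language to SPARQL (invented pattern and vocabulary).
import re

TEMPLATE = """PREFIX dc: <http://purl.org/dc/terms/>
SELECT ?article WHERE {{
  ?article a <http://example.org/{entity_type}> ;
           dc:creator ?author .
  FILTER regex(str(?author), "{author}", "i")
}}"""

def to_sparql(question: str) -> str:
    m = re.match(r"(?:find\s+)?(\w+?)s?\s+by\s+(.+)", question.strip(), re.I)
    if not m:
        raise ValueError("question does not match the supported skeleton")
    return TEMPLATE.format(entity_type=m.group(1).capitalize(), author=m.group(2))

print(to_sparql("articles by Tim Berners-Lee"))
```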

We evaluate our method in the domain of scientific articles, authors and other parts of ACM, Springer and other digital libraries.

In Proc. of Spring 2012 PeWe Workshop, pp. 21-22

Context-Aware Physical Activity Recommendation Through Challenges


Štefan Mitrík
master study, supervised by Mária Bieliková

Abstract. The lack of physical activity is a phenomenon of this age. It negatively affects both our physical and mental health. Diabetes, heart-related diseases and cancer are just some of the diseases which are partly caused by the lack of physical activity.

We believe that personalized challenges can motivate people to exercise more and thus improve their health and the quality of their lives. Personalized challenges are physical activities we recommend to the user. A very simple example of such a challenge could be: “Can you walk 3000 steps in 3 hours?” or “Can you get to the nearest park in under one hour?”.

We are already able to track users' physical activity throughout the day with their smartphones, so no additional hardware is needed. Also, the recommendation and visualization of the challenges take place in the phone application. The phone application allows us to exploit contextual information, such as the user's location, agenda or physical condition, in order to recommend challenges. Another thing that can make our recommendations more accurate is patterns in the user's physical activity. For example, we can discover that the user leaves her office at 4 p.m. on Tuesdays and thus assume that the ideal time for a challenge recommendation is 3:45 p.m., just before she leaves the office.

As we all know, people’s preferences differ significantly. There are some of us who prefer a large number of shorter and more focused challenges, but others might like longer and more dynamic challenges. Our recommendation engine takes that into consideration and tailors the content of the challenge according to previous implicit ratings of challenges in similar contexts.

In Proc. of Spring 2012 PeWe Workshop, pp. 23-24

Recommendation based on Implicit Feedback


Samuel Molnár
bachelor study, supervised by Mária Bieliková

Abstract. Nowadays the Web and related technologies provide a variety of ways to determine and track user activity while browsing web content. All aspects of user activity – the way the user works with website content, the time spent on a specific web page, the hyperlinks he has followed and other meaningful viewpoints – can be evaluated by specific criteria to gain personal information about the user, such as interests or favourite items in a specific domain. All of this information can be analyzed afterwards, and later on we are able to use our conclusions in order to recommend to the user interesting items or activities of our domain based on his interests and previous activity.

Our research is centered on the personalized web, particularly recommender systems. The concept of recommender systems is the main reason why we are interested in the area of the personalized web, since it does not only include knowledge from information technology, but from other fields such as psychology as well. The way a user interacts and works with the Web and its content can be analyzed by means of patterns in human behaviour discovered and documented by psychologists. By using some of these patterns, we are able to create a comprehensive and coherent analysis of the user's personality, sets of interests based on personal and work life, and other social-related information acquired from the user's activity on the web. Thus, our research on recommender systems is focused, for the most part, on implicit feedback from the user and the large number of aspects we may consider in analyzing such feedback.


Personalized Text Summarization

Róbert Móro
master study, supervised by Mária Bieliková

Abstract. One of the most serious problems of the present-day web is information overload. As we can find almost everything on the web, it has become very problematic to find relevant information. Also, the term "relevant information" is subjective, because as users of the web we differ in our interests, goals or knowledge. Automatic text summarization aims to address the information overload problem. The idea is to extract the most important information from the document, which can help readers decide whether it is relevant for them and whether they should read the whole text or not.

We propose a method of personalized text summarization using the combination of different raters, which unlike the classical (generic) automatic text summarization methods takes into account the differences in readers, their needs and characteristics. Because annotations (e.g. highlights) added by readers can indicate their interest in the particular parts of the document, we use them as another source of personalization. We aim for our approach to be as independent of the domain and also of the language of the summarized document as possible.
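A minimal sketch of the rater combination: every rater returns a score for a sentence, the final score is a weighted sum, and one rater uses the reader's highlights as the personalization signal. The two raters, their weights and the overlap heuristics are assumptions made for illustration only.

```python
# Weighted combination of raters for personalized extractive summarization
# (raters and weights are illustrative assumptions).
def term_rater(sentence: str, domain_terms: set) -> float:
    words = set(sentence.lower().split())
    return len(words & domain_terms) / len(words) if words else 0.0

def annotation_rater(sentence: str, highlighted_fragments: list) -> float:
    return 1.0 if any(frag.lower() in sentence.lower() for frag in highlighted_fragments) else 0.0

def summarize(sentences, domain_terms, highlights, weights=(0.6, 0.4), top_n=3):
    scored = [
        (weights[0] * term_rater(s, domain_terms) + weights[1] * annotation_rater(s, highlights), s)
        for s in sentences
    ]
    return [s for _, s in sorted(scored, reverse=True)[:top_n]]
```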

We evaluate our method in the domain of e-learning in ALEF (Adaptive Learning Framework), considering summarization for knowledge revision as our specific use case. For this purpose, we propose a personalized method for selecting documents for revision which considers recent changes of a reader’s knowledge and supports concepts, the knowledge of which the reader has recently gained or, on the contrary, lost. Our preliminary experimental results suggest that using the domain-relevant terms in the process of summarization leads to selecting more representative sentences capable of summarizing the document for revision. We plan more extensive evaluation of both the proposed method of personalized summarization and the method of documents’ selection for revision.

In Proc. of Spring 2012 PeWe Workshop, pp. 73-74

Metadata Collection for Effective Organization of Personal Multimedia Repositories using Games With a Purpose


Balázs Nagy
master study, supervised by Jakub Šimko

Abstract. Nowadays, an average person is overloaded with an enormous amount of digital data. Besides multimedia (music, videos, images) we can also mention emails, web pages and information on social networks, blended together in a hypertext environment. To implement effective search and navigation in this space it is necessary to have enough descriptive metadata available for each resource. These can be collected automatically or manually through crowdsourcing methods and, in particular, by games with a purpose.

In our research, we focus primarily on image metadata acquisition. Our goal is to upgrade and extend an existing game with a purpose (GWAP) called PexAce, which collects useful annotations for photos and transforms them into tags. Due to the lack of metadata for personal photo albums, we want to focus on obtaining descriptive metadata for this kind of media. Using them we will be able to query, order and filter these enriched photo albums much better.

Our previous experiments with PexAce in the general domain indicate that this method of obtaining metadata is effective. According to our expectations, we should get positive results also after applying our method in a specific area such as personal photo albums. In fact, users may be more motivated because they are annotating their own photos. This should also be reflected in the quality of the obtained tags.

In Proc. of Spring 2012 PeWe Workshop, pp. 75-76

Knowledge Tags Maintenance


Karol Rástočný
doctoral study, supervised by prof. Mária Bieliková

Abstract. Knowledge tags provide a new layer of lightweight semantics over web content, in which computer systems and web users can share their knowledge about the tagged content. To allow this type of collaboration, knowledge tags' data must be stored in a form which is understandable for computer systems and provides flexible and fast access for these systems. The next issue is the dynamic character of the Web, whose content changes over time. These changes can lead to invalidation of knowledge tags, their parts, or their anchoring in the tagged content.

Our work consists of two parts – a knowledge tags repository and automated knowledge tags maintenance. We build the knowledge tags repository on the Open Annotation model. We decided on this model because it is already accepted by a wide range of systems, and knowledge tags and annotations have common characteristics: both of them are anchored to specific parts of documents and they contain small pieces of information about these documents' parts. The knowledge tags maintenance approach automatically repairs knowledge tags after the tagged documents are updated. A repair of a knowledge tag means discarding the knowledge tag or updating its anchor and content. If knowledge tags are not repairable, we mark them as voided and we yield the decision of how to modify them, or whether they have to be completely discarded, to another system (if possible, the system which created these knowledge tags).

In Proc. of Spring 2012 PeWe Workshop, pp. 77-78

Decentralised User Modelling and Personalisation


Márius Šajgalík
master study, supervised by Michal Barla

Abstract. Most modern web applications try to gather data about users and their behaviour, interests and various other characteristics, which can later be used to create a user profile – the user's model. The purpose is to be able to adapt to the user as much as possible to make his work in the system more effective and comfortable. However, the problem of these applications is that each of them is focused on its own domain and re-creates its own user model from scratch. Thus, there is a majority of centralised solutions that are not interconnected at all. The alternative way is represented by decentralised user modelling and personalisation, which tries to interconnect the various sources that could be used. The purpose is to be able to use the existing user models and gather more information about the user from other domains as well. More information means the ability to achieve better personalisation and thus motivate the user to higher activity in the system, which in turn contributes further to making the personalisation better.

Since the web browser is nowadays the user's main tool for browsing the web, new opportunities arise for realising decentralised personalisation directly on the client device – in the web browser. In this project we analysed existing possibilities in the area of decentralised user modelling and personalisation, not just in the browser, to pick the best from the state-of-the-art approaches. We describe a new way of realisation via a distributed multi-agent collaborative personalisation platform of our own design, which is built as a browser extension. The proposed solution consists of two main parts – decentralised modelling of user interests (global and local in various domains) and decentralised collaborative personalisation of web pages.

In Proc. of Spring 2012 PeWe Workshop, pp. 41-42


Jakub Ševcech
master study, supervised by Mária Bieliková

Abstract. Currently we are facing the raise of services for sharing different kinds of content, whether in the form of links to interesting web pages, images, comments, or various multimedia information. Many applications provide tools for creation of annotations into web pages or various electronic documents. Few of these applications use created annotations to provide additional value to their users. If there is any added value, it takes effect sometime in the future, when the user want to return to once studied documents, or when the number of annotated documents reaches some critical value.

In our work, we want to offer a reward for inserting annotations in time of their creation. We are working on navigation support using annotations, specifically we use annotations as an input for the process of creation of the query to retrieve more related documents. By annotations we mean comments, highlights in the text, bookmarks or tags that users are often inserting into documents while reading them. They are actually electronic equivalents of marginalia, that we are creating while reading books or newspapers. Annotations inserted into the document, describe exactly the part of the document, that user is the most interested in. That’s why we use them to find documents, that provide additional sources of information to information mentioned in the original document.

We are creating a simple tool for attaching annotations to documents, a tool for automatic generation of a query for a search engine, and an interface that allows the user to edit the automatically generated query. The solution should not be a separate search engine; instead, it should serve as an interface for creating and editing queries for the most commonly used types of search engines.
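
To illustrate the query construction step, the following minimal Python sketch builds a query from the texts of a user's annotations by preferring terms that occur across several annotations; the stop-word list, the term weighting and the number of query terms are illustrative assumptions rather than our actual method.

from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "that"}

def build_query(annotations, max_terms=6):
    """Build a search query from annotation texts (highlights, tags, comments).

    Terms occurring in more annotations are assumed to matter more;
    the top-weighted terms form the query.
    """
    counts = Counter()
    for text in annotations:
        terms = re.findall(r"[a-zA-Z]{3,}", text.lower())
        counts.update(t for t in set(terms) if t not in STOP_WORDS)
    return " ".join(term for term, _ in counts.most_common(max_terms))

# Example: highlights and a tag taken from one annotated article
print(build_query(["adaptive hypermedia systems", "adaptive navigation support", "hypermedia"]))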

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 25-26

Games and Crowds: Authority Identification

simkoj

Jakub Šimko
doctoral study, supervised by Mária Bieliková

Abstract. Some of today’s computational tasks are still subject to human labour, because current computational paradigms are unable to deal with them (especially in terms of quality). These tasks include metadata acquisition and domain modeling, the two essential processes needed for enabling effective and adaptive hypermedia systems. Hence, a whole field of crowdsourcing-based (and, in particular, game-based) approaches has emerged to do the job.

However, this field also has its own problems. It depends on mass participation of users and, at the same time, it is usually not very effective in using this power, as it relies on redundant task solving to filter out incorrect solutions. We believe that the effectiveness of crowdsourcing approaches can be improved through authority identification, i.e. identification of contributors with more experience in a particular domain, whose solutions should be given greater weight, assuming a higher probability of correctness. Authority identification has not been sufficiently addressed yet, and within the domain of games with a purpose (GWAP) it is completely absent.

We aim to explore the possibilities of authority identification within crowd-based, collaborative and gaming metadata acquisition systems. In the GWAP domain, we experiment with tracking player abilities by measuring their scores on tasks dealing with different domain concepts (exploiting the possibilities of controlled task assignment). In general crowdsourcing, an analogous experiment is currently underway in the e-learning domain, where students identify correct and wrong answers to questions. Here, we aim to show that measuring a student's skills based on past exercises can improve crowd-based filtering when applied during solution voting procedures.
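
To illustrate how authority could enter the voting procedure, the following minimal Python sketch weights each contributor's vote by an authority score estimated from past performance; the data structures and the neutral 0.5 default weight are illustrative assumptions, not our evaluation setup.

def weighted_vote(votes, authority):
    """Decide whether to accept a candidate answer.

    `votes` maps contributor -> True/False (answer judged correct),
    `authority` maps contributor -> weight in [0, 1] estimated from
    past performance; unknown contributors get a neutral 0.5.
    """
    score = sum(authority.get(c, 0.5) * (1 if v else -1) for c, v in votes.items())
    return score > 0

votes = {"student1": True, "student2": True, "student3": False}
authority = {"student1": 0.9, "student2": 0.3, "student3": 0.8}
print(weighted_vote(votes, authority))  # True: 0.9 + 0.3 - 0.8 = 0.4 > 0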

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 81-82

Encouragement of Collaborative Learning Based on Dynamic Groups

srba

Ivan Srba
master study, supervised by Mária Bieliková

Abstract. Computer-Supported Collaborative Learning (CSCL) is an approach to learning based on the support of information and communication technologies. The main task of CSCL is to link two trends. The first is support for students' collaboration while learning in small groups. The second is the increasing potential and availability of ICT infrastructure.

We propose a method for creating different types of study groups with the aim of supporting effective collaboration. We concentrate on small groups which solve short-term, well-defined problems. The method can take many types of student characteristics as input, e.g. interests and knowledge, but also collaborative characteristics. Our method builds on the Group Technology approach. Students in the created groups can communicate and collaborate using several collaborative tools in a collaborative platform. We designed a collaborative platform called PopCorm which allows us to automatically observe dynamic aspects of the created groups, especially how students collaborate to achieve their goals. The results of this observation provide feedback to the group formation method.

Evaluation of the proposed method consists of a long-term experiment realised during the summer term as part of the course Principles of Software Engineering. We hypothesise that the groups created with the proposed method will achieve more successful collaboration than the groups created with the reference method, k-means. In addition, we will evaluate the acquired activity logs, which can provide interesting information about how students tend to use each collaborative tool.
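
For illustration only, the following Python sketch shows the kind of grouping the k-means reference method produces when each student is described by a numeric vector of characteristics; the feature values and the use of scikit-learn are illustrative assumptions, not our experimental setup.

import numpy as np
from sklearn.cluster import KMeans

# Each row: one student described by illustrative numeric characteristics
# (e.g. knowledge score, interest score, collaborativeness).
students = np.array([
    [0.8, 0.2, 0.5],
    [0.4, 0.9, 0.7],
    [0.6, 0.5, 0.3],
    [0.2, 0.7, 0.9],
    [0.9, 0.1, 0.6],
    [0.3, 0.8, 0.4],
])

n_groups = 2
labels = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(students)
for g in range(n_groups):
    print(f"group {g}:", np.where(labels == g)[0].tolist())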

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 43-44

Feedback Acquisition in ALEF

stenova

Andrea Šteňová
bachelor study, supervised by Mária Bieliková

Abstract. Explicit feedback from website visitors is very important and its significance is constantly growing. Users' interests and opinions can be determined through feedback acquisition, which makes it possible to review, improve, recommend and personalise webpage content. In web-based learning, we want to know how students interact with the system, which materials they find hard to learn or insufficiently explained, and which, on the other hand, they like.

Most of the time, students are not willing to provide feedback, or they do so only when they are very satisfied or not satisfied at all. Another problem is that we should not disturb students during their studies and should ask for feedback at the right time. Only when this condition is met can we avoid collecting misleading information.

In our method, we offer different rating forms for distinct parts of the system and provide transformation of ratings between various rating scales. We plan to evaluate the proposed method in the domain of e-learning in ALEF (Adaptive Learning Framework).
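
To illustrate what a transformation between rating scales can look like, the following minimal Python sketch maps a rating linearly from one (min, max) scale to another; the linear mapping is only one possible normalisation and is an illustrative assumption, not our complete method.

def transform_rating(value, src_scale, dst_scale):
    """Linearly map a rating from one scale to another.

    Scales are (min, max) tuples, e.g. a 1..5 star rating mapped
    to a 0..100 scale.
    """
    src_min, src_max = src_scale
    dst_min, dst_max = dst_scale
    ratio = (value - src_min) / (src_max - src_min)
    return dst_min + ratio * (dst_max - dst_min)

print(transform_rating(4, (1, 5), (0, 100)))  # -> 75.0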

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 45-46

Modeling a Tutor for E-Learning Support

svorada

Peter Svorada
master study, supervised by Jozef Tvarožek

Abstract. E-learning web systems allow students to educate themselves, for example by studying materials, solving tests or doing exercises. Research shows that students learn more when they are actively involved in the learning process and when this process is adapted to the needs of the individual student. Likewise, it is known that a student advances faster if led by someone who acts more like a friendly tutor than a leading authority.

In our research we work on a model of a tutor supporting the e-learning process in the domain of basic procedural programming, based on peer tutoring. In this model a computer tutor supports the learning process by giving advice to students via chat messages. The most distinctive ability of our tutor is to notify the participants that students are working on a solution which seems to differ from every known correct solution, and is hence probably wrong. The other ability, derived from the first, is to give students automated advice depending on the current context. For this we created a method based on code clone detection which is capable of calculating the similarity between two pieces of code, one of which is still in the process of being written.
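
As an illustration of the kind of measure code clone detection builds on, the following minimal Python sketch compares normalised token sequences of an unfinished solution and a known correct one; the tokenisation, the keyword list and the idea of warning below a similarity threshold are illustrative assumptions, not our actual method.

import re
from difflib import SequenceMatcher

def tokenize(code):
    """Split source code into a rough token stream, normalising identifiers."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)
    # Replace identifiers with a placeholder so renamed variables still match.
    keywords = {"if", "else", "for", "while", "return", "def", "int", "float"}
    return [t if t in keywords or not t[0].isalpha() else "ID" for t in tokens]

def similarity(partial_code, known_solution):
    """Similarity in [0, 1] between an unfinished solution and a known correct one."""
    return SequenceMatcher(None, tokenize(partial_code), tokenize(known_solution)).ratio()

# If the best similarity to every known correct solution falls below a threshold,
# the tutor could warn that the student may be heading towards a wrong solution.
print(similarity("def add(a, b): return a + b", "def plus(x, y): return x + y"))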

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 79-80

Method for Social Programming and Code Review

tomlein-michal

Michal Tomlein
master study, supervised by Jozef Tvarožek

Abstract. Code review is an important part of quality software development. In programming courses, peer review has the potential to be an effective driving force behind the learning process. However, due to the significant amount of time reviews take, whether in software development or a programming course, they cannot be and are not done thoroughly in practice.

While collaboration solutions are widely available, their use is, by their nature, generally limited to larger projects and/or requires a certain discipline to be effective. Moreover, these solutions rarely take into account the strengths and weaknesses of individuals; instead, they rely on manual assignment or selection of reviewers. Automating the selection of the right reviewer is a non-trivial problem unaddressed by present collaboration and code review solutions.
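
One simple heuristic for automating reviewer selection, shown here purely as an illustration and not as our proposed method, is to rank candidates by how often they have previously worked on the files being changed; the data structures below are illustrative assumptions.

from collections import Counter

def suggest_reviewers(changed_files, history, exclude):
    """Rank candidate reviewers by how often they touched the changed files.

    `history` is a list of (author, file) pairs from past commits or reviews;
    `exclude` is the author asking for the review.
    """
    scores = Counter(
        author for author, path in history
        if path in changed_files and author != exclude
    )
    return [author for author, _ in scores.most_common()]

history = [("alice", "parser.py"), ("bob", "parser.py"), ("alice", "lexer.py")]
print(suggest_reviewers({"parser.py"}, history, exclude="carol"))  # e.g. ['alice', 'bob']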

In our work, we aim to make programming more effective through social interaction and peer reviews. We believe it is very important to reduce friction in the process, from asking for a review to getting feedback and communicating with a reviewer. To do so, we intend to integrate our solution with widely used development environments. We believe that by making it possible for students and software developers to collaborate more tightly and easily, we can speed up the development process and achieve higher quality overall.

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 47-48

Web Content Caching for Web Surfing in Conditions of Slow and Intermittent Internet Connection

tomlein

Matúš Tomlein
bachelor study, supervised by Michal Barla

Abstract. Despite the advancements in information and telecommunication technologies, slow and intermittent Internet connectivity is still a serious issue in many parts of the world and is most visible in developing countries.

At the same time, the Internet with its most popular service – the Web – has become a very important part of our everyday lives, as more and more human activity takes place online.

We propose a software solution called OwNet which makes the web surfing experience less frustrating even in the case of a slow and intermittent Internet connection. OwNet is based on a local proxy server acting as an intelligent bridge between the client's browser and the Internet, communicating with other clients and services in order to provide the best surfing experience.

This work focuses on the implementation of caching algorithms and the invalidation of cached objects. To provide satisfactory performance under the expected conditions, special algorithms and techniques had to be implemented. The goal was to consume as little Internet bandwidth and computational power as possible, while providing good results in terms of cache hits and cache updates.
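
For illustration, the following minimal Python sketch shows one simple caching policy of the kind such a proxy might use: serve fresh copies within a time-to-live window, revalidate when online, and fall back to stale copies when the connection is down. The TTL-based invalidation and the class interface are illustrative assumptions, not OwNet's actual algorithms.

import time

class OfflineFriendlyCache:
    """A minimal cache sketch: serve fresh copies when possible and fall
    back to stale copies when the connection is unavailable."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self.store = {}  # url -> (content, fetched_at)

    def get(self, url, fetch, online):
        cached = self.store.get(url)
        fresh = cached is not None and time.time() - cached[1] < self.ttl
        if fresh or (cached is not None and not online):
            return cached[0]          # cache hit (possibly stale when offline)
        if online:
            content = fetch(url)      # revalidate / download
            self.store[url] = (content, time.time())
            return content
        return None                   # miss while offline: nothing to serve

cache = OfflineFriendlyCache(ttl_seconds=600)
page = cache.get("http://example.org", fetch=lambda url: "<html>...</html>", online=True)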

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 7-8

Group Recommendation Based on Voting

trebula

Ján Trebuľa
bachelor study, supervised by Michal Kompan

Abstract. Personalised recommendation is very helpful for individuals, but there are many activities which an individual performs in a group. The satisfaction of an individual in a group depends on several factors, such as group size, composition and type (established group, occasional group, random group or automatically identified group). Every recommendation, however, requires knowledge about the preferences of the individual group members. We are able to gather users' preferences based on their ratings of a set of items, the same set from which recommendations are later generated. While gathering the preferences, we have to keep in mind the weight of a particular item rating, based on the user's status in the group.

In our work we propose a voting method which explores several approaches to processing the acquired users' preferences. We verify this method by implementing a software prototype focused on movie recommendations, built as a web application accessed through a social network. The social network environment provides a sufficient set of users, who can join a group or create their own groups and invite their friends into them. After the recommendations are generated, users' satisfaction is observed and evaluated.
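
Two commonly cited aggregation strategies for group recommendation – averaging the members' ratings and the "least misery" strategy – are sketched below in Python purely for illustration; they are standard examples of how group preferences can be combined, not the specific voting method we propose.

def average_strategy(ratings):
    """Group rating per item is the average across group members."""
    return {item: sum(r.values()) / len(r) for item, r in ratings.items()}

def least_misery_strategy(ratings):
    """Group rating per item is the minimum given by any member (nobody suffers)."""
    return {item: min(r.values()) for item, r in ratings.items()}

# ratings: item -> {user: rating on a 1..5 scale}
ratings = {
    "Movie A": {"u1": 5, "u2": 2, "u3": 4},
    "Movie B": {"u1": 3, "u2": 4, "u3": 5},
}
print(average_strategy(ratings))       # Movie A ~ 3.67, Movie B = 4.0
print(least_misery_strategy(ratings))  # Movie A = 2, Movie B = 3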

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 27-28

Acquiring Metadata about Web Content Based on Microblog Analysis

uhercik

Tomáš Uherčík
master study, supervised by Marián Šimko

Abstract. The World Wide Web has become one of the most important means of sharing and searching for information. The amount of information on the Web is so huge that searching can only be done by machines. However, the information presented on the Web is intended for humans and is understandable only by humans. The Semantic Web is a vision in which this problem is solved by a layer of machine-processable metadata. Such metadata are not available as often as we would like, and the challenge is to obtain them automatically. There are many methods which can be used to acquire them from text, but in many cases the keywords we would use for annotating a text are not contained in that text at all. The social Web opens up wide possibilities of utilising user-generated data for this purpose.

Socially oriented data are data created by the activity of users, and they contain a lot of useful metadata. Web applications for social networks allow a user to share a lot of information with others, and the data created by this activity are a very valuable source of indirectly originated metadata.

We decided to use the microblog Twitter as a source of metadata. We selected the URL as the entity about which the metadata are acquired, because it can be unambiguously identified in the text of tweets. We proposed a method for keyword extraction utilising Twitter posts. In addition to ordinary extraction methods, we consider the different relevance of particular tweets depending on the author who published them.
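
To illustrate the idea of author-dependent relevance, the following minimal Python sketch scores candidate keywords for a URL by term frequency weighted by a per-author relevance weight; the weighting scheme, the default weight and the tokenisation are illustrative assumptions rather than our actual extraction method.

from collections import defaultdict
import re

def extract_keywords(tweets, author_weight, top_k=5):
    """Extract keywords for a URL from tweets mentioning it.

    `tweets` is a list of (author, text) pairs; `author_weight` maps an
    author to a relevance weight (e.g. derived from past annotation
    quality). Term scores are term frequencies weighted by the author's
    relevance; unknown authors get a neutral weight of 1.0.
    """
    scores = defaultdict(float)
    for author, text in tweets:
        weight = author_weight.get(author, 1.0)
        for term in re.findall(r"[a-zA-Z]{3,}", text.lower()):
            scores[term] += weight
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]]

tweets = [("tech_blogger", "Great article on semantic web metadata"),
          ("random_user", "lol nice")]
weights = {"tech_blogger": 3.0, "random_user": 0.5}
print(extract_keywords(tweets, weights))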

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 83-84

User Modeling in Educational Domain

uncik

Maroš Unčík
master study, supervised by Mária Bieliková

Abstract. The trend of using e-learning systems is steadily growing and the opportunities the Web provides are huge. Nowadays, e-learning systems offer more and richer content and enable communication and collaboration among users. The rise in the use of these systems causes information overload, which has led to the emergence of adaptive e-learning systems.

The performance of such personalised systems depends on an important element – the user model, which is used to minimise error rates and learning time. In our work, we propose a user model in which the collection of user data is separated from the construction of the user model itself. We consider several sources of input, which enrich the user model.

Another aspect of our work is to give users (students and teachers) direct access to the user model. The user model in such systems is often hidden, and students have no chance to see or influence what the system believes about them. In our work, we propose a method that allows students to access their user model. We have designed a method for user model visualisation which lets users give direct and explicit feedback that enriches the user model. It also brings other benefits: the visualisation helps the user to get an overview of the whole model, to see the dependencies in the user model more clearly and to adjust the sensitivity of the user model. Last but not least, our method allows the user model itself to be verified.

To verify our approach, we designed a software component in the e-learning system ALEF, which will be tested in the course Functional and Logic Programming at our faculty.

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 49-50

Association Rules Mining from Context-enriched Server Logs

visnovsky

Juraj Višňovský
bachelor study, supervised by Dušan Zeleník

Abstract. As users browse the Web, servers record millions of their actions in order to eventually offer a better service. These large amounts of Web logs should be reused; otherwise, recording them could be considered a waste of resources. This field is covered by Web usage mining, which discovers Web usage patterns. In our work we aim to demonstrate the importance of context in the field of Web usage mining.

Web usage mining is performed in several steps. First, the Web logs have to be preprocessed, which means getting rid of records that could negatively affect the mining results. We then enrich the server logs with various contexts and describe several ideas for context acquisition. Finally, once the Web logs are preprocessed, we use the FP-Growth algorithm to generate association rules from them. The logs used in our work are recorded by the Adaptive Proxy server.
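
To illustrate the mining step, the following minimal Python sketch generates frequent itemsets and association rules from a handful of context-enriched sessions using the FP-Growth implementation in the mlxtend library; the sessions, context items and thresholds are illustrative assumptions, and our actual pipeline works on Adaptive Proxy logs rather than hand-written transactions.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules

# Each transaction: one session's visited page categories, enriched with
# illustrative context items (weekday/weekend, morning/evening).
sessions = [
    ["news", "sport", "weekend", "morning"],
    ["news", "weather", "weekday", "morning"],
    ["sport", "weekend", "evening"],
    ["news", "sport", "weekend", "morning"],
]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(sessions).transform(sessions), columns=te.columns_)

frequent = fpgrowth(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])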

The aim of our work is not only to evaluate the method for association rule generation; we will also compare the precision of our method with common Web usage mining techniques that do not consider context, and thereby decide whether context is useful in the domain of Web usage mining.

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 51-52

Analysis of User Submitted Source Code in E-Learning System

zbell

Pavol Zbell
bachelor study, supervised by Mária Bieliková

Abstract. When new students of information technologies are learning to master a programming language, or to program in general, they sometimes create code that is not easy to understand. Especially when learning an object-oriented language, they unintentionally use so-called anti-patterns, they do not follow language-specific naming conventions, or they use the core libraries of the language inefficiently – or do not know their functionality at all and try to re-implement something that already exists. Since reviewing and refactoring code is key to making it better, clearer, more efficient and understandable by others, one option is to find someone to review the student's code and help correct it. However, it is not always easy to find such a person, or the time for reviewing, so we propose that some parts of the review analysis should be done by a computer, at least for short and simple exercises in an e-learning system.

In our research we try to help new students learn a programming language by automatically analysing their code, matching it against known anti-patterns and then recommending how they can improve their programming skills. The system intelligently recommends further exercises to the student based on their current progress; it also tries to recommend exercises that were solved incorrectly by other students, that the student has not yet encountered and that the student is likely to solve incorrectly as well. The recommendations might also include suggestions to contact other students who solved the exercise correctly and are willing to help, preferably online. We plan to add this functionality to the ALEF e-learning system and make it available in courses such as procedural or object-oriented programming. By doing so, we expect students to be able to create better and less error-prone code more quickly.
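
As an illustration of automated anti-pattern matching, the following minimal Python sketch uses the standard ast module to flag one common beginner anti-pattern, a bare "except:" clause; the choice of anti-pattern and the reporting format are illustrative assumptions, and the real analysis would cover many more patterns.

import ast

def find_bare_excepts(source):
    """Report 'except:' clauses without an exception type, a common
    beginner anti-pattern that silently swallows all errors."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append(f"line {node.lineno}: bare 'except:' hides errors")
    return findings

student_code = """
try:
    value = int(raw)
except:
    value = 0
"""
print(find_bare_excepts(student_code))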

to the top | to the main

Context Influencing our Behaviour

zelenik

Dušan Zeleník
doctoral study, supervised by Mária Bieliková

Abstract. The Web and mobile devices have become a very important part of our lives. They are the main sources of information, self-presentation, communication and entertainment. To make all of this available almost for free, we have silently agreed to share our personal information, which is then used by companies to adapt, personalise or recommend content for us. This builds on a model of the user and her interests. But we can move to a more advanced level and claim that these interests are influenced by the current state of the environment or of the user herself. This information about the conditions is known as contextual.

Contextual information could be applied in many different domains to improve the quality of models built on top of the user model. For instance, we present AdaptiveReminder, a tool which is able to dynamically adapt the plan for a day according to current conditions. AdaptiveReminder uses the history of user movement and past events to recognise the influence of specific contextual information on the time needed for transport. For example, it wakes the user up earlier because of traffic jams caused by rain and fog.
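
To illustrate the underlying idea, the following minimal Python sketch estimates travel time for the current context by averaging past trips recorded under the same conditions; the representation of context as a set of condition labels and the fallback value are illustrative assumptions, not AdaptiveReminder's actual implementation.

def estimate_travel_time(history, context, default_minutes=30):
    """Estimate travel time for the current context from past trips.

    `history` is a list of (context, minutes) pairs, where a context is a
    frozenset of observed conditions (e.g. {'rain', 'rush_hour'}). The
    estimate averages past trips whose context matches the current one.
    """
    matching = [minutes for past, minutes in history if past == frozenset(context)]
    return sum(matching) / len(matching) if matching else default_minutes

history = [
    (frozenset({"rain", "rush_hour"}), 55),
    (frozenset({"rain", "rush_hour"}), 50),
    (frozenset({"clear"}), 30),
]
print(estimate_travel_time(history, {"rain", "rush_hour"}))  # 52.5 -> wake the user earlier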

We also analyse the impact of contextual information on trends in news reading, in order to boost some news items and support the reader's interest. SME.sk provides us with server logs, and we are able to predict interest in a topic by analysing the reading history of SME.sk readers. For example, someone reads football news before a match to confidently place a bet; others read recipes before Christmas or Thanksgiving. These behavioural patterns are known as human rituals.

Another example of our work is code review support, which helps software developers identify bugs in code. We monitor software developers and their contextual information while they write code, and learn which contexts influence the quality of the code and the occurrence of bug reports. These rules are then applied to discover problems and lead to marking code as potentially faulty.

to the top | to the main | In Proc. of Spring 2012 PeWe Workshop, pp. 53-54