Thesis projects

The WIS group is open to students who want to do their thesis on subjects in the wider area of web information systems and information architecture.

At the WIS group we encourage students to contact Prof. Geert-Jan Houben to discuss possible topics for a thesis project (and literature survey). In such discussions, students are free to suggest their own topic, and together a concrete thesis (or literature survey) subject is then defined.

Master students who would like to specialise in WIS are encouraged to select the courses Web Science & Engineering, Information Retrieval, and Seminar Web Information Systems (possibly in the second year). In any case, students can always contact Geert-Jan Houben for advice on composing a program.

As a rough indication of possible subjects for thesis projects, below we give some subjects of projects that have run, that are running, or that are open for new students. Many of these topics can be approached in projects inside the research lab, inside industry, or in a collaboration between university and industry. Industry here includes organisations that are public or private, large or small, national or international. As an illustration, some organisations that have hosted previous student projects: Adyen, Crowdsense, Exact Software, Capgemini, Greetinq, KPMG, IBM, ICTU, IDS Scheer, ING, Innopay, Isaac, ISD, Logica, Sanoma, Tamtam, TNO, Truvo.

Here is a list of subjects as inspiration:

Enriching MOOC learner models through social web mining

In the setting of Massive Open Online Courses (MOOCs), the construction of accurate learner models is particularly important for facilitating students' learning process and improving their learning experience. Most existing learner models are built using only the data generated within the MOOC platform (e.g., edX, Coursera). Although the platform records many learner traces during the course, we have no knowledge about those learners before or after the course. The social Web can change this. Data gathered from the social Web (through sources such as Bitbucket, GitHub, Twitter, LinkedIn, etc.) can yield insights into learners' interests, motivations, prior knowledge and progression after the course. A Master thesis project will focus on a specific social Web platform and investigate what type of knowledge can be retrieved from it about learners of TU Delft MOOCs.

Learning path visualization dashboard

Massive Open Online Courses (MOOCs) offer unprecedented insights into the habits of learners and how they interact with online learning environments. This project aims to investigate how students can learn from the learning paths and engagement habits of previous (successful) students. This Master project will focus on building a "Learning Dashboard" that shows a representation of the current student's learning path and engagement, so that the student can compare it against an "ideal" learning path and engagement pattern. Informal learning environments, such as MOOCs, often leave students unsure of how to navigate the content and the platform itself, and this research would work towards keeping students on track towards their learning objectives.

Using social network sites to prevent cold-start in personalization

Personalization in applications is usually based on user or context models that represent relevant aspects of the user or context, e.g. the interests in films, film genres or actors in the case of a movie recommender application. Obtaining the data for these user models is often not an easy task, especially when the user starts using the application, the so-called cold-start. Importing and using data from other sites can help here. This project aims to investigate how information from social network sites can be used to fill user models for a given application, e.g. in the domain of movies or TV programs, by studying the mapping between user models, and by implementing a configurable software tool that extracts data from social networking sites and transforms it into a user model for the given application. [Contact: Dr. Claudia Hauff]
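
As a minimal sketch of the mapping idea described above, the following Python snippet turns a hypothetical social network profile export into a simple interest-based user model for a movie recommender. The profile layout, the field names (liked_pages, category, name) and the CATEGORY_TO_ASPECT mapping are assumptions made purely for illustration; they are not taken from any real social network API or from existing WIS software.

```python
from collections import Counter

# Hypothetical, configurable mapping from social-network page categories
# to aspects of the target application's user model (an assumption).
CATEGORY_TO_ASPECT = {
    "Movie": "films",
    "Movie genre": "genres",
    "Actor": "actors",
}


def build_user_model(profile, mapping=CATEGORY_TO_ASPECT):
    """Translate a (hypothetical) social profile into a user model.

    The result maps each aspect (e.g. "genres") to a Counter of item
    names, so repeated evidence for an item raises its weight.
    """
    model = {aspect: Counter() for aspect in mapping.values()}
    for page in profile.get("liked_pages", []):
        aspect = mapping.get(page.get("category"))
        if aspect is not None:  # skip categories the application does not use
            model[aspect][page["name"]] += 1
    return model


# Made-up example data, not real user data.
example_profile = {
    "liked_pages": [
        {"name": "Blade Runner", "category": "Movie"},
        {"name": "Science fiction", "category": "Movie genre"},
        {"name": "Harrison Ford", "category": "Actor"},
        {"name": "Some Football Club", "category": "Sports team"},
    ]
}

user_model = build_user_model(example_profile)
```

Keeping the mapping in a separate dictionary is one possible way to make the tool configurable per application: a TV-program recommender could pass a different mapping without changing the transformation code.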

Interactive dialogue-based user modeling

Adaptive applications need to know the user in order to be able to adapt to the user. There are several ways to explicitly ask for or import relevant user knowledge, but in many cases there is a lack of proper support to verify with the user whether the application's assumptions are correct and valid. Interactive dialogue-based tools can help an application obtain higher-quality user knowledge through carefully designed communication between application and end-user. For several scenarios, the design and development of such an interactive dialogue-based tool is an interesting project subject.

Model-driven adaptive hypermedia and web applications

Designing adaptive hypermedia and web applications in general can be a very complex task. Therefore, researchers in the WIS group have developed a language to specify these applications at a high abstraction level, such that the navigation structure can be specified in a clear way and the application can be generated from it. The language specifies how this structure is generated from a database that contains the content to be presented, and can take into account extra context information about the user in order to personalize the navigational structure. The assignment will consist of extending the language and its implementation such that it can be used to describe attractive and state-of-the-art adaptive web applications.

Streaming software security

The aggregation of both projects and deployment configurations on GitHub has made those projects particularly vulnerable to sensitive data leaks. Whether for ease of use or through plain negligence and mistakes, it is quite common for GitHub users to push passwords, database connection strings, cloud provider one-time passwords, environment variables, and private SSH keys to public repositories. Once this information is made public, it is impossible to retract, because projects such as GHTorrent and GitHub Archive archive it, while GitHub's real-time event stream makes it easy for adversaries to attack the exposed systems almost immediately. The aim of the proposed project is to explore this phenomenon and propose effective counter-measures. [Contact: Dr. Georgios Gousios]
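
To make the kind of leak described above concrete, here is a minimal Python sketch of a line-based scanner that flags suspicious content in pushed text. The three regular expressions and the names SECRET_PATTERNS and find_secrets are illustrative assumptions, not an existing tool or the counter-measure the project would arrive at; a realistic detector would need many more rules and a way to suppress false positives.

```python
import re

# Illustrative patterns only (assumptions for this sketch).
SECRET_PATTERNS = {
    # PEM-style header that precedes private keys, including SSH keys.
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # Something that looks like `password = "..."` or `pwd: ...`.
    "password_assignment": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*[=:]\s*['\"]?[^\s'\"]{4,}"
    ),
    # Database URL with embedded user name and password.
    "db_connection_string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:/@]+:[^\s@]+@[^\s/]+"
    ),
}


def find_secrets(text):
    """Return sorted (line_number, pattern_name) pairs for suspicious lines."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return sorted(findings)


# Made-up file content with fake credentials.
sample = "\n".join([
    "DEBUG = True",
    "DB_URL = postgres://admin:s3cret@db.example.org/prod",
    "password = 'hunter22'",
    "-----BEGIN RSA PRIVATE KEY-----",
])

findings = find_secrets(sample)
```

In a streaming setting, a function like find_secrets could be applied to the contents of each newly pushed commit as it arrives, so that a repository owner is warned before an adversary reacts to the same event.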

Streaming cascading aggregations

Cascading aggregations work by specifying a set of key metrics, a set of thresholds for those metrics, and a set of functions that can extract interesting pieces of information or combine two other functions. To react efficiently to current events, aggregation functions always work on data streams. Insights can be generated by linking metric threshold violations to aggregation functions; this creates a graph of aggregations which, when topologically sorted, can lead to the generation of summarized information. What we are interested in is a language to specify cascading aggregations and a (stream-based) processor that will generate automated data summaries that read like this (e.g. when applied to software engineering data):

"Version 1.2.1 (commit a223b) of app Foo is receiving negative feedback (sentiment ratio: 0.45%) on app store. Users are complaining about frequent crashes. Top exceptions in app crash log: NullPointerException (88%), increased 95% in version 1.2.1. Static analysis on commit a223b indicates possible uninitialised variable x in Bar.java, line 75. Commit a223b is 85% bigger than average. Code review passed with 3 comments and 2 thumbs up" [Contact: Dr. Georgios Gousios]

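The idea of a graph of aggregations that is evaluated in topological order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metric names, thresholds, aggregation functions and the run helper are all hypothetical, the input is a single snapshot of metric values rather than a real data stream, and the summary text is far simpler than the example above. The sketch uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical snapshot of key metrics over the latest stream window.
metrics = {"sentiment_ratio": 0.45, "npe_share": 0.88}

# A metric counts as violated when its predicate returns True.
thresholds = {
    "sentiment_ratio": lambda v: v < 0.60,
    "npe_share": lambda v: v > 0.50,
}

# Each aggregation is (dependencies, function over earlier results).
aggregations = {
    "feedback": (
        [],
        lambda r: "App is receiving negative feedback."
        if r["violations"]["sentiment_ratio"] else None,
    ),
    "crashes": (
        [],
        lambda r: "Top exception: NullPointerException."
        if r["violations"]["npe_share"] else None,
    ),
    # Combines the output of two other aggregation functions.
    "summary": (
        ["feedback", "crashes"],
        lambda r: " ".join(s for s in (r["feedback"], r["crashes"]) if s),
    ),
}


def run(metrics, thresholds, aggregations):
    """Evaluate all aggregations so that dependencies are computed first."""
    results = {
        "violations": {m: thresholds[m](v) for m, v in metrics.items()}
    }
    # TopologicalSorter expects a mapping from node to its predecessors.
    graph = {name: set(deps) for name, (deps, _) in aggregations.items()}
    for name in TopologicalSorter(graph).static_order():
        _, func = aggregations[name]
        results[name] = func(results)
    return results


report = run(metrics, thresholds, aggregations)["summary"]
```

A dedicated language for cascading aggregations would presumably replace the hand-written dictionaries above with declarative specifications, and a stream-based processor would re-evaluate the graph as new events arrive.
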
More information

For more ideas or inspiration, you can of course also have a look at the research interests of the Web Information Systems group members. We repeat that projects can run inside the research lab, inside industry, or in a collaboration between university and industry.

More information can be obtained from prof.dr.ir. Geert-Jan Houben.