Thesis projects

The WIS group is open for students who want to do their thesis on subjects in the wider area of web information systems and information architecture.

At the WIS group we encourage students to contact prof. Geert-Jan Houben to discuss possible topics for a thesis project (and literature survey). In such discussions students are free to suggest their own topic, after which a concrete thesis (or literature survey) topic is defined together.

Master students who would like to specialise in WIS are encouraged to select the courses Web Science & Engineering, Information Retrieval, and Seminar Web Information Systems (possibly in the second MSc year). In any case, students can always contact prof. Houben for advice on composing a program.

As a rough indication of possible subjects for thesis projects, below we list some subjects of projects that have been completed, that are running, or that are open for new students. Many of these topics can be approached in projects within the research lab, in industry, or in a collaboration between academia and industry. Industry here includes organisations that are public or private, large or small, national or international. Examples of organisations that have hosted some of our thesis students include Adyen, Crowdsense, Exact Software, Capgemini, Greetinq, KPMG, IBM, ICTU, IDS Scheer, ING, Innopay, Isaac, ISD, Logica, Sanoma, Tamtam, TNO, and Truvo.

Thesis topics at Lambda-Lab

The Lambda-Lab is part of the Web Information Systems group and is concerned with the design of large-scale learning interventions, their deployment across thousands of learners in TU Delft Massive Open Online Courses (MOOCs), and the analysis of their impact based on the log traces learners leave in MOOC environments such as edX. To get a concrete idea of what this means, take a look at the learning tracker paper we just published at LAK, the major learning analytics conference. This paper is based on a Master's thesis written by Ioana Jivet in the Lambda-Lab. It showcases the use of software engineering, data science, data visualization and some social theories to design, implement, and evaluate a successful intervention. While MOOCs are the main focus, Lambda-Lab is also concerned with technology-enhanced learning in the traditional higher education classroom.

Lambda-Lab has four research lines and accommodates MSc thesis topics in each one. Beyond the examples mentioned here, you are also welcome to propose your own ideas as long as they fit into the four research lines.

  • Beyond the MOOC platform: enriching large-scale online learning through external data sources. An example project can match edX learners to their GitHub accounts and explore to what extent learners that followed a MOOC on programming are actually applying the learnt programming concepts in practice. Relevant MSc courses to have followed for this line include Pattern Recognition, Artificial Intelligence Techniques, Web Science & Engineering and Data Visualization.
  • Within the MOOC platform: enriching large-scale online learning through MOOC environment adaptations. The learning tracker mentioned above is an example of this line; other examples include the design of chatbots to engage learners during their stay on the MOOC platform, the design of an infinite quiz engine based on deep learning technology to enable scalable gamification and the design of collaborative worktables to enable learners to learn together. Relevant MSc courses to have followed for this line include Pattern Recognition, Artificial Intelligence Techniques and Web Science & Engineering.
  • Search as learning: enriching large-scale online learning through search-based adaptations. A big part of learning is search but so far Web search is not integrated into any of the MOOC platforms - this integration can be a potential thesis topic. Relevant MSc courses to have followed for this line are Information Retrieval and/or Multimedia Search and Recommendation.
  • Data analytics in the classroom: enriching the students’ and teachers’ experience in the traditional classroom. MOOC learners are not the only ones who struggle; our TU Delft campus students can also benefit from data analytics. One example project is the design of a real-time attention tracker for large classrooms (such as Ampere in the EWI building), based on noise, movement, etc., that is displayed to students during the lecture to raise their awareness about their learning. Relevant MSc courses to have followed for this line include Pattern Recognition, Artificial Intelligence Techniques, Web Science & Engineering and Data Visualization.

Thesis projects at E-lab

The E(psilon)-lab is a new lab (formed April 2017) within the Web Information Systems group and is concerned with human interaction with artificial advice givers, and specifically with explanations to support decision making. To get a concrete idea of what this means, take a look at the introduction to the special issue on human interaction with artificial advice givers.

The E-lab takes a user-centered approach to research, and evaluates the quality of human decision making to drive both interface and algorithm design. The research is currently driven by two applied challenges: 1) How to deal with filter-bubbles and confirmation bias; and 2) How to support decision making for sequences of items (in addition to individual items). To address these challenges, the E-lab has two research lines and accommodates MSc thesis topics in both. Beyond what is mentioned here as examples, you are also welcome to use your own ideas as long as they fit into the research lines.

  • Explainable algorithms: When a system provides advice, such as an item to try or buy in a recommender system, it is not always clear how this conclusion was reached. This line investigates how advice can be explained in a way that supports users in making good decisions. This line is currently focused on how to construct recommendation sequences that are diverse, while maintaining user satisfaction and considering trade-offs between different types of preferences in domains such as music and tourism. One example project could be to develop a playlist recommender system which considers the ordering and diversity of the tracks. Another is to automatically generate travel itineraries for tourists. Relevant MSc courses to have followed for this line are Multimedia Search and Recommendation, Fundamentals of Data Analytics, and Information Retrieval.
  • Novel interfaces and interactions for explanations: Recent developments in AI have enabled better artificial advice giving that supports and even augments human capabilities. As these advice-giving systems increase in complexity, their designers have also come to realize that a standard graphical user interface (GUI) is often not sufficient to harness their power. This line investigates methods for supporting interaction with artificial advice givers (AAGs), e.g., natural language, visualization, and argumentation. This line is currently focused on interfaces for helping users understand and explore their blind-spots and discover novel and relevant content. Example projects would develop and evaluate argumentation interfaces with users, develop interactive explanation interfaces, or develop novel explanation visualizations. Relevant MSc courses to have followed for this line include, depending on the project, Artificial Intelligence Techniques, Human-Agent/Robot Teamwork, Affective Computing, and Data Visualization.
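
To make the playlist example from the first research line more tangible, the following is a minimal sketch of a greedy sequence builder that trades relevance against diversity. It is an illustration only, not the E-lab's method: the track fields (`title`, `genre`, `relevance`), the function name `build_playlist` and the `diversity_weight` parameter are all assumptions made for this example.

```python
def build_playlist(tracks, length, diversity_weight=0.5):
    """Greedily order tracks, penalising a repeat of the previous genre.

    `tracks` is a list of dicts with hypothetical keys
    'title', 'genre' and 'relevance' (higher means more relevant).
    """
    remaining = list(tracks)
    playlist = []
    while remaining and len(playlist) < length:
        previous_genre = playlist[-1]["genre"] if playlist else None

        def gain(track):
            # Relevance minus a fixed penalty when the genre repeats.
            penalty = diversity_weight if track["genre"] == previous_genre else 0.0
            return track["relevance"] - penalty

        best = max(remaining, key=gain)
        playlist.append(best)
        remaining.remove(best)
    return playlist


# Toy data, invented for illustration.
tracks = [
    {"title": "A", "genre": "rock", "relevance": 0.9},
    {"title": "B", "genre": "rock", "relevance": 0.8},
    {"title": "C", "genre": "jazz", "relevance": 0.6},
    {"title": "D", "genre": "pop", "relevance": 0.5},
]
diverse = [t["title"] for t in build_playlist(tracks, 3)]
by_relevance = [t["title"] for t in build_playlist(tracks, 3, diversity_weight=0.0)]
print(diverse, by_relevance)
```

With the penalty switched off the sketch simply ranks by relevance; with the penalty on, a less relevant track from another genre can be placed between two tracks of the same genre. A thesis project would replace this toy penalty with a diversity measure that is evaluated with users.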

Thesis topics at π-lab

At the software analytics lab, we work at the intersection of Software Engineering and Data Science. Our research is driven by problems in modern, distributed software development at scale. We apply quantitative analyses to big software data from software repositories, app stores, social media and live systems. We also build systems that allow us to scale our research and make it production ready. Our philosophy is to build tools that solve software engineering problems in an efficient way. Our research revolves around the following three lines:

  • Engineering for (software) analytics: Creating platforms for ingesting, integrating and querying software engineering data in a streaming fashion. An example project can be a plug-in to our CodeFeedr platform that, given a set of key metrics, a set of thresholds and a set of functions, extracts interesting information summaries (cascading aggregations). Relevant MSc courses: Mining Software Repositories, Big Data Processing, Cloud Computing, Information Retrieval
  • Distributed collaboration on software development: optimising code review and integration processes across millions of projects and developers. For example, you can work on prioritizing code reviews on GitHub or on automatically assessing the impact of pull requests on a code base. Relevant MSc courses: Mining Software Repositories, Information Retrieval, Pattern Recognition, Globally Distributed Software Engineering
  • Software ecosystems: analysing the fragility, security and robustness properties of package managers. An example project in this direction would be a tool that, given an NPM or POM dependency file, calculates a risk score for depending on external libraries. Relevant MSc courses: Mining Software Repositories, Software Architecture
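
As a rough illustration of the dependency risk-score idea in the last research line, the sketch below reads the `dependencies` section of an NPM `package.json` file and adds up a toy score. The penalty values and the function names `spec_penalty` and `risk_score` are assumptions invented for this example; an actual thesis project would derive its risk model from ecosystem data rather than from fixed constants.

```python
import json

# Hypothetical penalties, chosen only for illustration.
BASE_RISK = 1.0             # every external dependency carries some risk
WILDCARD_PENALTY = 1.0      # "*" or "latest": any future release is accepted
CARET_PENALTY = 0.5         # "^1.2.3" or ">=1.2.3": wide version range
TILDE_PENALTY = 0.25        # "~1.2.3": patch-level range
NON_REGISTRY_PENALTY = 1.0  # git, http(s) or local file sources


def spec_penalty(spec: str) -> float:
    """Return the extra risk assigned to one version specifier."""
    spec = spec.strip()
    if spec in ("", "*", "latest"):
        return WILDCARD_PENALTY
    if spec.startswith(("git", "http", "file:")):
        return NON_REGISTRY_PENALTY
    if spec.startswith(("^", ">")):
        return CARET_PENALTY
    if spec.startswith("~"):
        return TILDE_PENALTY
    return 0.0  # treated as an exactly pinned version


def risk_score(package_json_text: str) -> float:
    """Sum the toy risk over all entries in 'dependencies'."""
    manifest = json.loads(package_json_text)
    dependencies = manifest.get("dependencies", {})
    return sum(BASE_RISK + spec_penalty(spec) for spec in dependencies.values())


example = '{"dependencies": {"a": "1.2.3", "b": "^2.0.0", "c": "*"}}'
print(risk_score(example))
```

The sketch looks only at direct dependencies and version specifiers. Transitive dependencies, known vulnerabilities and maintainer activity are the kind of signals a real tool in this line would need to add.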

More concrete project descriptions can be found on this page.

Using social network sites to prevent cold-start in personalization

Personalization in applications is usually based on user or context models that represent relevant aspects of the user or context, e.g. the interests in films, film genres or actors in the case of a movie recommender application. Obtaining the data for these user models is often not an easy task, especially when the user starts using the application, the so-called cold-start problem. Importing and using data from other sites can help here. This project aims to investigate how information from social network sites can be used to fill user models for a given application, e.g. in the domain of movies or TV programs, by studying the mapping between user models, and by implementing a configurable software tool that allows for extracting data from social networking sites and transforming it into a user model for the given application.
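
The configurable transformation described above can be pictured with a small sketch. It assumes the social network profile has already been exported as a dictionary; the field names, the mapping and the function `to_user_model` are hypothetical and do not correspond to any particular site's API.

```python
def to_user_model(profile, mapping):
    """Copy values from a social profile into a user model.

    `mapping` is the configurable part: it maps a source field of the
    profile to a target field of the application's user model.
    """
    model = {}
    for source_field, target_field in mapping.items():
        target_values = model.setdefault(target_field, [])
        for value in profile.get(source_field, []):
            if value not in target_values:  # avoid duplicates across sources
                target_values.append(value)
    return model


# Hypothetical exported profile and mapping for a movie recommender.
profile = {
    "favorite_movies": ["Alien"],
    "watched": ["Alien", "Heat"],
    "hometown": "Delft",  # not mapped, so it is ignored
}
mapping = {"favorite_movies": "films", "watched": "films"}
user_model = to_user_model(profile, mapping)
print(user_model)
```

Changing the mapping is enough to target a different application, which is the sense in which such a tool would be configurable. How to obtain the profile and how to weight the imported values are left open here.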

Interactive dialogue-based user modeling

Adaptive applications need to know the user in order to be able to adapt to the user. There are several ways to explicitly ask for or import relevant user knowledge, but in many cases there is a lack of proper support to verify with the user whether the application's assumptions are correct and valid. Interactive dialogue-based tools can enhance an application to obtain higher quality user knowledge through a carefully designed communication between application and end-user. For several scenarios, the design and development of such an interactive dialogue-based tool is an interesting project subject.

Model-driven Adaptive Hypermedia and Web Applications

Designing adaptive hypermedia and web applications in general can be a very complex task. Therefore researchers in the WIS group have developed a language to specify these applications at a high abstraction level, such that the navigation structure can be specified in a clear way and the application can be generated from it. The language specifies how this structure is generated from a database that contains the content that is to be presented, and can take into account extra context information about the user in order to personalize the navigational structure. The assignment will consist of extending the language and its implementation such that it can be used to describe attractive and state-of-the-art adaptive web applications.