Interactive Labeling for ML-based Structural Formula Extraction

Thesis

Karlsruhe

Kollegiengebäude am Kronenplatz (building 05.20); ISSD is located in the Triangel on the 4th and 5th floors
Starting immediately (open-ended)
30–40 h per week
No salary specified
Remote work not possible
Joint Master’s thesis offered by IAR/SZS (research group of Prof. Stiefelhagen, CV:HCI) and IISM (research group of Prof. Mädche, ISSD) for both computer science and information systems students. Open for applications. Who can apply? Only students enrolled at KIT (Karlsruher Institut für Technologie) in one of the following degree programs: Wirtschaftsinformatik, Wirtschaftsingenieurwesen, Informationswirtschaft, or Technische Volkswirtschaftslehre.
Requirements

We expect the student to be familiar with web development. The system should be developed with a modern web application frontend framework (e.g. Angular, React, or Vue) and a JavaScript or Python backend.

Occupational fields

Machine Learning Engineer (m/w/d)

Data Science & Artificial Intelligence

Tech

Problem

Scientific publications, lecture slides, and other documents convey information not only in plain text but also in figures and images. This makes documents less accessible for humans and machines alike. Automated metadata extraction, full-text search, and information aggregation are all impaired. Less obvious, but potentially even more important, human accessibility suffers as well. Figures are often entirely incomprehensible to visually impaired users, and people less familiar with the domain could also benefit from support. In particular, this limits visually impaired users' access to graphical representations of structural formulas, even though these graphics are often a crucial part of lecture slides and scientific publications on the topic.

Agile Working

Regular Feedback Meetings

Goals

The goal of this Master’s thesis is to design, develop, and evaluate an interactive labeling system that supports the accessibility of figures. Here, interactive labeling refers to a human-machine cooperative approach that combines automatic and manual steps. Structural formulas from the field of chemistry are a suitable application context for this system, as they are frequently used and their notation standards are well established. We envision a semi-automated approach in which user input is supported by the machine. Well-structured tasks like these lend themselves to support by machine learning models. As a user is always involved, the model does not need to achieve near-perfect accuracy, but should rather support users with suggestions. Allowing the model to improve with new user input would be a bonus.

As a first step, we expect the student to identify the state of the art of such systems and to identify components that could be reused or adapted to this context. Afterwards, the solution should be developed. A full-fledged evaluation of the system is expected as well.

The typical workflow for the system should look like the following:
  • Import a PDF document into the system.
  • The system suggests areas in which figures containing chemical formulas could be found.
  • Correct the system's suggestions.
  • Crop out all marked areas to obtain individual figures.
  • For each figure create
    • a chemfig representation of the figure (e.g. “\chemfig{*6(=-=-=-)}”),
    • a purely visual, non-interpretive textual description of the figure (e.g. “a hexagon where three edges are double lines”),
    • and an interpretation of the figure (e.g. “Benzene”).
  • The system supports the user in creating the above representations with automatically generated suggestions. To this end, a classifier that translates images to chemfig should be trained on automatically generated training data.
  • Export an accessible EPUB v3 in which the original figure is augmented with the above data as alternative versions.
  • Export a version of the figure for use on a braille printer (Open Document Graphic format).
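
The per-figure data in the workflow above could be modeled roughly as follows. This is only a minimal sketch under stated assumptions, not the intended design: the names Box, FigureLabel, and to_html_figure are hypothetical, pixel coordinates are an assumed convention for the marked areas, and the XHTML fragment is just one conceivable way to attach the three representations to a figure in an EPUB content document.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Box:
    """Marked area on a page in pixels (assumed convention, origin top-left)."""
    left: int
    top: int
    right: int
    bottom: int

    def clamp(self, page_width: int, page_height: int) -> "Box":
        """Restrict a user-corrected box to the page bounds before cropping."""
        left = max(0, min(self.left, page_width))
        top = max(0, min(self.top, page_height))
        right = max(left, min(self.right, page_width))
        bottom = max(top, min(self.bottom, page_height))
        return Box(left, top, right, bottom)


@dataclass
class FigureLabel:
    """The three representations created for each cropped figure."""
    page: int
    box: Box
    chemfig: str          # e.g. r"\chemfig{*6(=-=-=-)}"
    description: str      # what the figure looks like
    interpretation: str   # what the figure means, e.g. "Benzene"
    confirmed: bool = False  # hypothetical flag: user accepted the suggestion


def to_html_figure(label: FigureLabel, image_href: str) -> str:
    """Render a hypothetical XHTML fragment carrying the alternative versions."""
    alt = escape(f"{label.interpretation}: {label.description}", quote=True)
    return (
        "<figure>"
        f'<img src="{escape(image_href, quote=True)}" alt="{alt}"/>'
        f"<figcaption>{escape(label.interpretation)}</figcaption>"
        f"<pre>{escape(label.chemfig)}</pre>"
        "</figure>"
    )


# Usage: a box dragged partly outside an A4 page at 72 dpi (595 x 842 px).
example = FigureLabel(
    page=3,
    box=Box(-10, 40, 300, 5000).clamp(595, 842),
    chemfig=r"\chemfig{*6(=-=-=-)}",
    description="a hexagon where three edges are double lines",
    interpretation="Benzene",
)
print(to_html_figure(example, "figures/benzene.png"))
```

Keeping the machine suggestion and the user's confirmation in one record is a design choice of this sketch; it would make it straightforward to collect confirmed labels as additional training data later.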

All levels welcome (no experience required)

Languages

German

English

Skill set

Angular

React

JavaScript

Python

About ISSD - KIT

The research group “Information Systems & Service Design” (ISSD), headed by Prof. Mädche, focuses its research, education, and innovation on designing interactive intelligent systems. The group belongs to the Institute of Information Systems and Marketing (IISM) and is embedded in the Information Systems & Engineering group. ISSD is also part of the Karlsruhe Service Research Institute (KSRI). The research group is positioned at the intersection of Information Systems (German: Wirtschaftsinformatik) and Human-Computer Interaction (HCI). Our mission is to create impactful scientific knowledge for designing interactive intelligent systems that enable humans to perform activities more efficiently, effectively, and meaningfully. We believe that delivering cutting-edge knowledge and inspiring education must go hand in hand with an ongoing dialog with the public to maximize the impact of our work in organizations and society. The group is organized in three research departments: Digital Experience & Participation, Intelligent Enterprise Systems, and Digital Service Design & Innovation. Current research topics are Human-AI Interaction, Cognitive Interaction Technologies, Physiological Computing Systems, Interactive Business Intelligence & Analytics Systems, and Interactive Systems Engineering.

Founded in 1825
500-999 employees
Sector: Education
Global Player
