Sie haben Javascript deaktiviert!
Sie haben versucht eine Funktion zu nutzen, die nur mit Javascript möglich ist. Um sämtliche Funktionalitäten unserer Internetseite zu nutzen, aktivieren Sie bitte Javascript in Ihrem Browser.

Info-Icon Diese Seite ist nicht in Deutsch verfügbar
Bildinformationen anzeigen



The CSS group does research on natural language processing, with a focus on topics related to computational argumentation and computational sociolinguistics. Recently, we also started investigating explanations and explainablity. Past research tackled text mining processes in general. Details are given in the following.

Computational Argumentation

Computational argumentation deals with the computational analysis and synthesis of natural language arguments and argumentation, usually in an empirical data-driven manner. Computational argumentation research that members of the CSS group have particularly contributed to includes the following. Many of its outcomes are or will be demonstrated in the argument search engine

Argument Generation

In our argument generation research, we focus on understanding the core of an argument by inferring its conclusion (ACL 2020), as well as extracting snippets representing the most important claim and the reason behind (SIGIR 2020).

Argumentation Quality

Starting from a literature survey, we defined a taxonomy of argumentation quality (EACL 2017), followed by an empirical comparison of theory and practice (ACL 2017). We developed computational approaches to assess specific quality dimensions, such as a machine learning regressor that uses argument mining for argumentation-related essay scoring (COLING 2016) and a PageRank adaptation for argument relevance (EACL 2017).

Argument Search

With, we introduced the first search engine for arguments on the web (ArgMining 2017). The mentioned assessment of argument relevance (EACL 2017) may be used for ranking found arguments. Moreover, we developed a computational approach to retrieve counterarguments to arguments based on their simultaneous similarity and dissimilarity (ACL 2018).

Argumentation Strategies

Based on a news editorial corpus with fine-grained evidence annotations (COLING 2016), we trained a classifier and used it to find topic-specific evidence patterns (EMNLP 2017). We modeled and empirically studied the synthesis process of authors following rhetorical strategies (COLING 2018), and we analyzed deliberative strategies in dialogical argumentation on Wikipedia talk pages (ACL 2018).

Overall Argumentation

We developed computational models of the sequential flow of review argumentation (COLING 2014) based on an annotated corpus (CICLing 2014). Mapping a text into the feature space of overall structures, flows predict its sentiment robust across domains (EMNLP 2015). Later, we generalized the flow model to other text genres and prediction tasks (ACM TOIT 2017), and we presented a new tree kernel-based approach to capture sequential and hierarchical overall argumentation at the same time (EMNLP 2017).

Other Argumentation Topics

Other recent research topics include argument mining (NAACL 2016ArgMining 2017), argument reasoning comprehension (NAACL 2018SemEval 2018), and ad-hominem fallacies (NAACL 2018).

Computational Sociolinguistics

Computational Sociolinguistics, as a subfield of computational social science, investigates research questions from the social sciences through empirical analyses of natural language data. Input data includes, but is not limited to, social media text, online activities, and socio-cultural key indicators. The focus is often on insights into social phenomena and dynamics rather than the technologies behind. The CSS group is doing computational sociolinguistics research on the following topics, partly in the context of the research program Digital Future.

Social Bias Analysis

Social bias can emerge from pre-existing beliefs about the
characteristics of group members of any social group, often leading to prejudices and discrimination. Language can be a major factor in carrying and reinforcing those biases, causing NLP models that learn from it through texts to inherit those biases. Thus, in our research, we focused on analyzing the bias in texts to understand the possible influence on downstream systems. More information can be found in our paper (ArgMining2020).
We are currently adding to this research by evaluating possible relations between social and political bias, as well as taking a closer at the methods that are widely used to detect biases in NLP systems.

Media Bias Analysis

Media plays an important role in shaping public opinion. Biased media can influence people in undesirable directions and hence should be unmasked as such. To solve this problem with NLP techniques, we developed machine learning models to analysis the media bias. Especially, we focused on (1) learning about the relation between sentence-level and article-level bias (EMNLP Findings 2020), and (2) studying at what granularity level and how sequential patterns media bias is manifested (NLP+CSS 2020).

Media Bias Flipping

In this research topic, we studied the task of “flipping” the bias of news articles: Given an article with a political bias (left or right), generate an article with the same topic but opposite bias. We created a corpus with bias-labeled articles from As a first step, we analyze the corpus and discuss intrinsic characteristics of bias. They point to the main challenges of bias flipping, which in turn lead to a specific setting in the generation process. We applied an autoencoder incorporating information from an article’s content to learn how to automatically flip the bias. More about this research can be found in our paper (INLG 2018).

Self-determined Opinion Formation

For many controversial topics in life and politics, people disagree on what is the right stance towards them, be it the need for feminism, the influence of religion, or the assassination of dictators. Stance is affected by the subjective assessment and weighting of pro and con arguments on the diverse aspects of a topic. Building stance in a self-determined manner is getting harder and harder in times of fake news and alternative facts, due to the unclear reliability of many sources and their bias in stance and covered aspects. 

We are analyzing how a self-determined opinion formation can be supported through technologies such as the argument search engine More details follow soon.

Communication in Crowdworking

Crowdworking is one phenomenon of the digitization of the society. Online platforms provide connections between requesters of tasks and workers who solve the tasks in an anonymous and distant manner. However, the quality of crowdworking is affected by communication problems in the task design, operation, and evaluation. To learn about the workers' side, we compared existing research on problems in crowdworking with complaints mined from a workers' online discussion forum (COLING 2020). The findings from this study form the basis for our research on how to improve crowdsourcing.

Explanations and Explainability

Explainability expresses the desire to make a system’s behavior intelligible and thus controllable by humans. The CSS group recently started investigating explainability-related topics, with a focus on the analysis and generation of natural language explanations. The following topics are in the focus in this regard.

Generation of User-Adapted Explanations

Groundwork for explanations that adapt to the language of the user includes the modification of stylistic bias (INLG 2018), the generation of specific information (ACL 2020), and the encoding of beliefs into generated text (EACL 2021).

Assessment of Explanation Quality

Details on this topic are following in the next months.

Modeling of Metaphorical Explanations

Details on this topic follow in the next months.

Text Mining

Text mining deals with the automatic or semi-automatic discovery of new, previously unknown information of high quality from large numbers of unstructured texts. The types of information to be inferred from the texts are usually specified beforehand, i.e., text mining tackles given tasks. Text mining research that members of the CSS group have particularly contributed to includes the following.

Ad-hoc Design of Text Analysis Pipelines

We have developed a method that creates a text analysis pipeline ad-hoc in near-zero time for a specified information need along with a quality prioritization using partial order planning and greedy-best first search (CICLing 2013). In addition, we can automatically equip any such pipeline with an assumption-based truth maintenance system that systematically ensures that each algorithm in a pipeline analyzes only potentially relevant portions of text (CIKM 2013).

Efficiency Optimization of Text Analysis Pipelines

We developed a process to optimize the run-time efficiency of any text analysis pipeline (CIKM 2011). Based on the assumption-based truth maintenance system mentioned above, we found the theoretically optimal pipeline schedule using dynamic programming (COLING 2012). It depends on the run-time and found information of each employed text analysis algorithm. These values are not known beforehand, which is why an informed best-first search scheduling approach is more preferable in practice (LNCS 9383). In case the input texts to be process are heterogenous, an adaptive scheduling is needed, which we have realized with self-supervised online learning (IJCNLP 2013).

Die Universität der Informationsgesellschaft