Cite: Kong, Q., Booth, E., Bailo, F., Johns, A., & Rizoiu, M.-A. (2022). Slipping to the extreme: A mixed method to explain how extreme opinions infiltrate online discussions. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 524–535.


Qualitative research provides methodological guidelines for observing and studying communities and cultures on online social media platforms. However, such methods demand considerable manual effort from researchers and can be overly narrow, focusing on only certain online groups. This work proposes a complete solution that leverages machine learning algorithms to accelerate the qualitative analysis of problematic online speech, focusing on the opinions that emerge from online communities. First, we employ qualitative methods of deep observation to understand problematic online speech. This initial qualitative study constructs an ontology of problematic speech, which contains social media postings annotated with their underlying opinions. The qualitative study constructs the set of opinions dynamically, at the same time as it labels the postings. Next, we use keywords to collect a large dataset from three online social media platforms (Facebook, Twitter, and YouTube). Finally, we introduce an iterative data exploration procedure to augment the dataset. Each iteration has four steps: a data sampler, which balances exploration and exploitation of unlabeled data, selects postings; the sampled data is labeled automatically; the qualitative mapping team inspects the labels manually; and the automatic opinion classifiers are retrained. We present both qualitative and quantitative results. First, we show that our human-in-the-loop method successfully augments the initial qualitatively labeled and narrowly focused dataset and constructs a more encompassing dataset. Next, we present detailed case studies of the dynamics of problematic speech in a far-right Facebook group, exemplifying its mutation from conservative to extreme. Finally, we examine the dynamics of opinion emergence and co-occurrence, and we hint at some pathways through which extreme opinions creep into mainstream online discourse.
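
The iterative data exploration procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the epsilon-greedy sampler, the word-count scoring that stands in for an opinion classifier, the `annotate` callback that stands in for the qualitative mapping team, and all function and parameter names are assumptions made for illustration only.

```python
import random
from collections import Counter


def train(labeled):
    """Toy stand-in for a classifier: count words in positive vs. negative posts."""
    pos, neg = Counter(), Counter()
    for text, label in labeled:
        (pos if label else neg).update(text.lower().split())
    return pos, neg


def score(model, text):
    """Higher score means more overlap with words seen in positive posts."""
    pos, neg = model
    return sum(pos[w] - neg[w] for w in text.lower().split())


def sample(model, unlabeled, k, epsilon, rng):
    """Assumed epsilon-greedy sampler: explore at random, else exploit top scores."""
    pool = list(unlabeled)
    ranked = sorted(pool, key=lambda t: score(model, t), reverse=True)
    batch = []
    while pool and len(batch) < k:
        if rng.random() < epsilon:
            pick = rng.choice(pool)                      # exploration
        else:
            pick = next(t for t in ranked if t in pool)  # exploitation
        pool.remove(pick)
        batch.append(pick)
    return batch


def explore(labeled, unlabeled, annotate, rounds=3, k=2, epsilon=0.3, seed=0):
    """Alternate sampling, automatic labeling, manual inspection, and retraining."""
    rng = random.Random(seed)
    labeled, unlabeled = list(labeled), list(unlabeled)
    model = train(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        for text in sample(model, unlabeled, k, epsilon, rng):
            predicted = score(model, text) > 0   # automatic labeling
            label = annotate(text, predicted)    # manual inspection (hypothetical callback)
            labeled.append((text, label))
            unlabeled.remove(text)
        model = train(labeled)                   # retraining
    return labeled, unlabeled, model


if __name__ == "__main__":
    # Toy placeholder posts, not data from the study.
    seed_set = [("extreme claim one", True), ("weather is nice", False)]
    pool = ["extreme claim two", "nice weather today", "lunch was nice"]
    result, remaining, _ = explore(seed_set, pool, lambda text, pred: "extreme" in text)
    print(len(result), "labeled;", len(remaining), "still unlabeled")
```

In this sketch, `epsilon` controls the trade-off: a higher value sends more randomly chosen postings to the annotators, which widens coverage beyond what the current classifier already recognizes, while a lower value concentrates manual effort on postings the classifier scores highly.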