Ofcom — Ahmed Younes

The Project

Context & Objectives

A pilot project for Ofcom, the UK's communications regulator, to map the disinformation narratives spread across social media platforms by a curated list of known accounts. The goal was to understand which narratives existed, how they clustered thematically, and how each account contributed to the broader disinformation ecosystem.

A layered topic modelling approach was used — similar to the Swedish Institute project — with a first pass across the full corpus followed by theme-specific models to surface granular sub-narratives.

My Role

Data Scientist

Co-designed the analytical approach and research methodology. Ran an iterative two-week sprint cycle: week one to design, run and sample the topic model, week two for domain expert review of the samples. Outputs were generated and presented at every stakeholder meeting throughout the pilot.

This included preparing stratified samples for domain expert annotation, managing the annotation workflow, cleaning annotated outputs, and running the statistical analysis on the final classified data. The data collection pipeline and account list were pre-existing, so my contribution focused on the analysis, methodology and annotation workflow stages.

Scope

Project Scope

Platforms: Facebook, Twitter, Telegram, Instagram, 4chan, YouTube
Scale: ~1M messages (stratified sample from a much larger corpus)
Account list: pre-curated list of known disinformation-adjacent accounts
Themes: 5 dominant themes selected by the client for deeper investigation (e.g. Health → COVID conspiracy, vaccine hesitancy, pharma narratives)
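
A minimal sketch of how a stratified draw like the one noted under Scale could work. The stratification variable is not stated in this write-up, so stratifying by platform is an assumption, as are the field names and the toy corpus. Each stratum contributes the same fraction of its messages, with a floor of one message so that small platforms are not lost.

```python
import random
from collections import defaultdict

def stratified_sample(messages, key, fraction, seed=0):
    """Draw roughly `fraction` of the messages from every stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    strata = defaultdict(list)
    for msg in messages:
        strata[msg[key]].append(msg)
    sample = []
    for name in sorted(strata):
        group = strata[name]
        # Keep at least one message per stratum so small strata survive.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Toy corpus: the "platform" and "text" fields are hypothetical.
corpus = (
    [{"platform": "Telegram", "text": f"t{i}"} for i in range(600)]
    + [{"platform": "4chan", "text": f"c{i}"} for i in range(300)]
    + [{"platform": "YouTube", "text": f"y{i}"} for i in range(100)]
)
sample = stratified_sample(corpus, key="platform", fraction=0.1)
```

With proportional allocation the sample keeps the platform mix of the full corpus, which matters if downstream topic shares are read as estimates for the whole dataset.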

Method

Approach & Pipeline

The full corpus was too large to process directly — a stratified ~1M message sample was drawn first. Content then passed through a two-layer pipeline:

Layer 1 — global BERTopic across the sampled corpus to surface broad narrative themes, annotated into a general thematic breakdown
Layer 2 — the client selected 5 dominant themes for deeper investigation; per-theme BERTopic models then surfaced granular sub-narratives within each (e.g. Health → COVID conspiracy, vaccination, pharmaceutical narratives)
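
The control flow of the two layers can be sketched as below. This is an illustration, not the project code: the clustering step is a dependency-free stand-in that groups documents by a single word, where a real run would plug in a topic model (for example the topic ids returned by BERTopic's `fit_transform`). The function names, the topic-to-theme mapping and the "Climate" theme are all hypothetical.

```python
from collections import defaultdict

def layered_topics(docs, fit_global, topic_to_theme, selected_themes, fit_theme):
    """Run a global model, map its topics to themes, then model each selected theme."""
    # Layer 1: one model over the whole sample.
    global_topics = fit_global(docs)
    # Annotation step: each global topic is assigned a broad theme.
    themes = [topic_to_theme.get(t) for t in global_topics]
    # Layer 2: a separate model per selected theme.
    by_theme = defaultdict(list)
    for i, theme in enumerate(themes):
        if theme in selected_themes:
            by_theme[theme].append(i)
    sub_topics = {}
    for theme, idxs in by_theme.items():
        labels = fit_theme([docs[i] for i in idxs])
        for i, label in zip(idxs, labels):
            sub_topics[i] = (theme, label)
    return themes, sub_topics

def cluster_on_word(position):
    """Toy stand-in for a topic model: one topic id per distinct word at `position`."""
    def fit(docs):
        ids = {}
        return [ids.setdefault(d.split()[position], len(ids)) for d in docs]
    return fit

docs = [
    "health covid origin claims",
    "health vaccine side effects",
    "health covid lockdown claims",
    "climate hoax claims",
]
themes, sub_topics = layered_topics(
    docs,
    fit_global=cluster_on_word(0),
    topic_to_theme={0: "Health", 1: "Climate"},  # hypothetical annotation result
    selected_themes={"Health"},
    fit_theme=cluster_on_word(1),
)
```

Fitting a fresh model inside each theme lets the second layer separate sub-narratives that a single global model would tend to merge into one broad topic.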

Each iteration ran on a two-week cycle. Domain experts reviewed stratified samples from each cluster via structured annotation workflows. Outputs were consolidated, cleaned and quality-checked before statistical analysis and stakeholder reporting.
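
One way the consolidation and cleaning step could look, as a sketch only. The row fields (`message_id`, `annotator`, `label`) and the idea of checking labels against a fixed codebook are assumptions, not details from the project. Labels are normalised, exact duplicates dropped, and anything outside the codebook is routed to manual review rather than silently corrected.

```python
def clean_annotations(rows, codebook):
    """Normalise labels, drop exact duplicates, split valid rows from rows needing review."""
    seen = set()
    valid, review = [], []
    for row in rows:
        # Collapse whitespace and lowercase so trivial variants match the codebook.
        label = " ".join(row["label"].split()).lower()
        key = (row["message_id"], row["annotator"], label)
        if key in seen:
            continue  # same annotator submitted the same label twice
        seen.add(key)
        cleaned = {**row, "label": label}
        (valid if label in codebook else review).append(cleaned)
    return valid, review

# Hypothetical codebook built from the Health sub-narratives named in the scope.
codebook = {"covid conspiracy", "vaccine hesitancy", "pharma narratives"}
rows = [
    {"message_id": 1, "annotator": "a", "label": " COVID  conspiracy "},
    {"message_id": 1, "annotator": "a", "label": "covid conspiracy"},
    {"message_id": 2, "annotator": "b", "label": "Vaccine Hesitancy"},
    {"message_id": 3, "annotator": "a", "label": "vacine hesitancy"},
]
valid, review = clean_annotations(rows, codebook)
```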

Outcomes

Results & Impact

Multilayered disinformation narrative map produced across the curated account set. Account-level thematic profiles built from the classified corpus.
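
An account-level thematic profile can be read as the share of each account's classified messages that falls under each theme. The sketch below assumes that definition; the account names, the "Climate" theme and the input format are hypothetical.

```python
from collections import Counter, defaultdict

def account_profiles(classified):
    """classified: iterable of (account, theme) pairs. Returns account -> {theme: share}."""
    counts = defaultdict(Counter)
    for account, theme in classified:
        counts[account][theme] += 1
    return {
        account: {theme: n / sum(c.values()) for theme, n in c.items()}
        for account, c in counts.items()
    }

classified = [
    ("acct_a", "Health"), ("acct_a", "Health"), ("acct_a", "Health"),
    ("acct_a", "Climate"),
    ("acct_b", "Climate"),
]
profiles = account_profiles(classified)
```

Shares rather than raw counts make accounts with very different posting volumes comparable, at the cost of hiding how much each account posts overall.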

Findings and methodology presented to Ofcom stakeholders on multiple occasions throughout the pilot. Internal deliverable — no published output. Further phases pending at time of completion.

Disinformation Narrative Mapping on Social Media

Tech Stack