data pre-processing

Linking Headlines to Article Bodies for Stance Detection: A Structured Pre-processing Workflow Using GPT-4o

Working with real-world text data often means dealing with structures that are not yet analysis-ready. In our case, the dataset included headlines and full article texts stored separately, across two different tables. The only link between them was a shared identifier field: Body ID. Before we could begin any further

No-Code Transformation of the NCBI Disease Corpus into a Structured CSV

Working with biomedical corpora often requires programming skills, specialised formats, and time-consuming preprocessing. But what if you could transform a complex annotated dataset—like the NCBI Disease Corpus—into a structured, analysis-ready CSV using nothing more than a single, well-designed prompt? In this post, we demonstrate how a no-code, GenAI-powered

Prompting Warming Stripes: A No-Code Way to Visualise Meteorological Data

Creating visualisations from raw meteorological data no longer requires programming skills. In this post, we demonstrate how researchers can generate a warming stripes diagram – a simple yet powerful visualisation of long-term temperature trends – using only a natural language prompt. We recommend using GPT-4o or GPT-4.5 for this task, as

No-Code Data Pre-processing and Descriptive Analysis with GenAI: Exploring the Nobel Prize Dataset

In this case study, we demonstrate how generative AI can be used to carry out a full data pre-processing and descriptive analysis workflow on the Nobel Prize dataset. Our objective was to prepare and explore the data in a methodologically sound manner—converting numerical types, handling missing values, verifying completeness,

Prompt-based JSON to .xlsx Conversion: Turning Interview Metadata into a Structured Excel File

Interview metadata often arrives in semi-structured formats, making it difficult to analyse or integrate into standard research workflows. This short prompt-based approach shows how GenAI can transform a JSON file containing interview-related metadata into a clean, structured Excel (.xlsx) file—in seconds, without requiring any programming knowledge. The method enables

Data Pre-processing with GenAI: Standardising Geographic Location Data in an Input CSV

Data pre-processing is a crucial but often tedious stage in research, particularly when standardising complex CSV datasets. This blog demonstrates how GenAI can systematically structure geographic location data directly from an input CSV file without coding. Our primary objective was to ensure consistent and accurate categorisation, preparing the data for

Analysing US State Temperature Trends Using GenAI: A No-Code Approach for Researchers

Time series analysis is a powerful tool for researchers across disciplines, yet its reliance on programming can pose a challenge for those without coding expertise. This blog demonstrates how AI-driven prompts can enable researchers to perform simple time series analyses without writing a single line of code, making data insights