PROMPT REVOLUTION

Image-Based Handwriting Recognition: Comparing AI Model Accuracy Across Languages

In this follow-up experiment, we tested how generative AI models handle handwritten text when provided as image input rather than PDF. Unlike in the earlier test, all models were able to process the images and return readable outputs.

Performance Comparison: 🏆 Champion as of October 2: Copilot, Gemini 2.5 Pro,

by Miklós Sebők - Rebeka Kiss • Oct 7, 2025

OCR Copilot recommendations

Testing OCR on Handwritten PDFs: Comparing Model Accuracy on English, French, and Hungarian Samples

Optical character recognition (OCR) of handwritten text remains a demanding task, particularly once the focus shifts beyond English. In this experiment, we assessed a range of generative AI models on three handwritten text samples—one each in English, French, and Hungarian—to examine cross-linguistic performance. While accuracy was consistently high

by Miklós Sebők - Rebeka Kiss • Oct 2, 2025

manus.ai presentation recommendations

Transforming a Research Paper into a Conference Presentation with Manus AI

In this experiment, we tested Manus AI’s ability to generate a full academic presentation from a research paper. The tool produced a slide deck that was functional and visually adequate, though not every slide was directly usable. The Manus AI interface supported adjustments before downloading—such as modifying layout,

by Miklós Sebők - Rebeka Kiss • Oct 1, 2025

presentation Claude Sonnet 4.5 recommendations

Testing Claude Sonnet 4.5 for Academic Slide Design: From Research Papers to Conference Outlines

We tested the new Sonnet 4.5 model on a demanding academic task: transforming a research paper into a structured 15-slide outline for a conference presentation. The prompt required not only conceptual rigour and narrative coherence but also attention to visual communication, with clear guidance on where charts, tables, and

by Miklós Sebők - Rebeka Kiss • Sep 30, 2025

sentiment analysis classification customer review

Comparing Generative AI Models on Customer Review Sentiment Labelling Tasks

In this post we present a comparative evaluation of several generative AI models, focusing on their ability to carry out a straightforward sentiment analysis task. The models were asked to classify customer reviews as either positive or negative, following a simple binary labelling instruction. While the task itself was unambiguous,

by Miklós Sebők - Rebeka Kiss • Sep 29, 2025

data collection Claude Opus 4.1 Claude 4 Sonnet

Does the Language of the Prompt Matter? Collecting Hungarian Population Data Using Claude Opus 4.1 and Sonnet 4

When using generative AI for structured data collection, the language of the prompt can make a real difference. In our test with Hungarian population statistics, both Claude Opus 4.1 and Sonnet 4 produced accurate outputs when prompted in Hungarian – but with an English prompt, Sonnet 4 generated rounded figures

by Miklós Sebők - Rebeka Kiss • Sep 27, 2025

Claude Opus 4.1 data collection recommendations

Collecting U.S. Unemployment Data Using Claude Opus 4.1

In this post we examine how Claude Opus 4.1 can be applied to data collection tasks through a prompt-based approach. The model accessed and structured monthly U.S. unemployment rate data from the Federal Reserve Economic Data (FRED) platform for the period January 2000 to December 2024. The results

by Miklós Sebők - Rebeka Kiss • Sep 25, 2025

NotebookLM literature review recommendations

Testing AI-Assisted Literature Reviews with NotebookLM

In this post we put NotebookLM’s new literature review feature to the test. The model was given six pre-selected sources on a single topic and asked to produce a structured review. The results were not bad at all: the output drew appropriately on the provided references and offered

by Miklós Sebők - Rebeka Kiss • Sep 24, 2025

classification API DeepSeek

Zero-Shot Stance Classification with DeepSeek: An API-Based Experiment

In our previous experiment, we showed how DeepSeek-V3 could classify argumentative claims in a simple prompt-based setup. In this follow-up, we take the test one step further: running the same PRO/CON stance classification task programmatically via the DeepSeek API in a Google Colab environment. This shift from manual prompting

by Miklós Sebők - Rebeka Kiss • Sep 23, 2025

manus.ai data collection recommendations

Collecting Annual U.S. Housing Price Index Data from the Census Bureau using Manus AI

In this post we explore how Manus AI can be used to collect official housing price index data from the U.S. Census Bureau. The task focused on the Single-Family Houses Sold series, requesting annual figures to be gathered and organised into a clean, structured dataset suitable for further research.

by Miklós Sebők - Rebeka Kiss • Sep 5, 2025

GPT-4o Deep Research data collection

Extracting GDP per Capita Data from the World Bank DataBank using GPT-4o Deep Research

In this short post, we demonstrate how GPT-4o with Deep Research mode can be used to retrieve and structure real-world macroeconomic data directly from authoritative sources. Using a single natural language instruction, the model successfully extracted GDP per capita (current US$) data for Hungary, Poland, Slovakia, and Czechia from the

by Miklós Sebők - Rebeka Kiss • Sep 1, 2025

searching for literature GPT-5 Grok-4

Searching for Literature with GenAI: Do the Latest Models Deliver Greater Accuracy?

In our earlier blog post, Harnessing GenAI for Searching Literature: Current Limitations and Practical Considerations, we examined the reliability of generative AI models for scholarly literature searching. To assess whether the newest releases represent any improvement, we tested them on the same narrowly defined academic topic. The results indicate modest

by Miklós Sebők - Rebeka Kiss • Aug 25, 2025