PROMPT REVOLUTION

Hands-On Prompt-Tutorials for Using GenAI in Research and Education

Testing ScienceDirect AI in LeapSpace: How Scopus-Driven Deep Research Structures Academic Insight

Generative AI tools designed specifically for academic research are beginning to reshape how scholars conduct literature reviews and map scholarly debates. In this post, we test LeapSpace (ScienceDirect AI), Elsevier’s Scopus-integrated deep research assistant, to examine how it structures evidence, organises themes, and traces citation networks within peer-reviewed literature.

by Rebeka Kiss

Does the Language of the Prompt Shape Source Selection? Testing EU AI Act Implementation Queries with Claude Opus 4.6 and GPT-5.2

When using generative AI for legal research, the language of the prompt may influence not only stylistic features of the output but also the model's underlying source selection. In this experiment, we examined how two advanced models – Claude Opus 4.6 and GPT-5.2 – responded to identical questions about…

by Rebeka Kiss

From Text to Interactive Understanding: Exploring Gemini’s Dynamic View on Transformer Models

Generative AI tools are increasingly moving beyond static text responses towards richer, interface-driven explanations. In this post, we explore Gemini's Dynamic View feature using a conceptually demanding technical question about transformer architectures. Rather than producing a conventional paragraph-style summary, Gemini generated an interactive, visually structured learning environment that allows…

by Rebeka Kiss

Setting Up Claude Code for Local File Querying: A Step-by-Step Guide

Claude Code is Anthropic's command-line agent that runs directly on your computer, enabling terminal-based interactions with local files and applications. While the name suggests technical complexity, Claude Code is surprisingly accessible: it offers the same conversational interface as chat-based models, but with the ability to execute code and…

Benchmarking AI Audio Transcription: How Leading Models Handle English and Hungarian Speech

As audio becomes an increasingly common input for AI assistants, a natural question arises: can these platforms also transcribe pre-recorded audio accurately? In this benchmark, we tested five leading generative AI models on identical English and Hungarian audio files to assess their transcription accuracy, language support, and practical…
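
For readers who want to try a similar test themselves, here is a minimal sketch of one way to submit a pre-recorded file for transcription through the OpenAI Python SDK; the file name, language hint, and model choice are illustrative placeholders, not the exact setup used in the benchmark.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Submit a local audio file for transcription (file name is a placeholder).
    with open("sample_hu.mp3", "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            language="hu",  # optional hint; omit to let the model auto-detect
        )

    print(transcript.text)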

Can GenAI Models Detect Plagiarism Reliably? A Multi-Model Benchmark on Academic Text

As large language models become increasingly sophisticated tools for academic writing support and web search, a critical question emerges: can these models also be used to identify plagiarism reliably? To test this capability, we embedded verbatim, unattributed passages from published academic articles into newly generated texts and asked multiple GenAI…
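
As a rough illustration of this setup, the sketch below sends a single test item to one model via the OpenAI API; the embedded passage, prompt wording, and model name are placeholders rather than the actual benchmark materials.

    from openai import OpenAI

    client = OpenAI()

    # Placeholder test document: generated prose with one verbatim,
    # unattributed passage spliced into the middle.
    test_text = (
        "Newly generated paragraph one...\n\n"
        "A passage copied verbatim from a published article, without citation.\n\n"
        "Newly generated paragraph two..."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{
            "role": "user",
            "content": (
                "Does the following text contain passages copied verbatim from "
                "published academic work without attribution? If so, quote the "
                "suspect passages and explain your reasoning.\n\n" + test_text
            ),
        }],
    )

    print(response.choices[0].message.content)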

Exploring Temperature Settings in Creative Writing: A Haiku Generation Case Study with OpenAI API

Every time you prompt a large language model, you're witnessing the result of thousands of sequential sampling decisions: each token is drawn from a probability distribution over the model's vocabulary. In standard chat interfaces, these distributions are sampled with fixed settings that users cannot adjust. But API…
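
To make the adjustable setting concrete, here is a minimal sketch of a temperature sweep over a single haiku prompt using the OpenAI Python SDK; the model name and temperature values are illustrative, not necessarily those used in the case study.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Re-run the same prompt at increasing temperatures. Low temperature
    # concentrates sampling on high-probability tokens; higher values
    # flatten the distribution and increase variety.
    for temperature in (0.0, 0.7, 1.5):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user",
                       "content": "Write a haiku about autumn rain."}],
            temperature=temperature,
        )
        print(f"--- temperature={temperature} ---")
        print(response.choices[0].message.content)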

Testing AI Coding Abilities: How GPT 5.2, Opus 4.5, Gemini 3 Pro, and Grok 4.1 Handle Algorithm Problems

As AI models increasingly demonstrate coding capabilities, understanding the reliability and usability of AI-written code becomes essential for researchers and practitioners. In this benchmark study, we evaluated four recent models – GPT 5.2 Thinking, Opus 4.5, Gemini 3 Pro, and Grok 4.1 Expert – on two algorithm problems…

Testing PangramLabs for AI Text Detection: Impressive but Imperfect Performance

Can AI detection tools reliably distinguish between human and machine-generated text? In our previous testing, we found that platforms like Originality.ai and ZeroGPT, as well as general-purpose GenAI models, frequently misclassified both AI-generated and human-written text. Now, PangramLabs has gained attention with claims of near-perfect accuracy, verified by third-party organizations. In this…

Understanding Model Confidence Through Log Probabilities: A Practical OpenAI API Implementation

Log probabilities (logprobs) provide a window into how confident a language model is about its predictions. In this technical implementation, we demonstrate how to access and interpret logprobs via the OpenAI API, using a series of increasingly difficult multiplication tasks. Our experiment reveals that declining confidence scores can effectively signal…
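
As a taste of that implementation, the sketch below requests per-token logprobs from the OpenAI API and converts them to probabilities; the model name and the multiplication prompt stand in for the series of tasks used in the post.

    import math

    from openai import OpenAI

    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user",
                   "content": "What is 23 * 47? Reply with the number only."}],
        logprobs=True,   # return the logprob of each sampled token
        top_logprobs=3,  # also return the 3 most likely alternatives
    )

    # Each entry carries the sampled token and its log probability;
    # exponentiating recovers the model's probability for that token.
    for item in response.choices[0].logprobs.content:
        print(f"{item.token!r}: logprob={item.logprob:.3f} "
              f"(p ≈ {math.exp(item.logprob):.1%})")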