Argania Detection from Sentinel-2 Spectral Data: DeepSeek-V3 Excels with Prompt-Based Labelling of Structured Data

How far can today’s large language models go in scientific data analysis—without bespoke coding or deep learning pipelines? In this experiment, we explore the ability of DeepSeek-V3 to perform pixel-level detection of Argania trees (i.e., binary classification for each pixel) using only tabular Sentinel-2 spectral data. By
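The teaser above describes a prompt-based, pixel-level labelling task over tabular spectral data. As a rough illustration of how such rows might be serialised into a prompt (band names and reflectance values below are hypothetical, not taken from the post), one could do something like:

```python
# A minimal sketch of turning tabular Sentinel-2 reflectance rows into a labelling prompt.
# Band selection and pixel values are placeholders; the post's actual prompt design may differ.
pixels = [
    {"id": 1, "B02": 0.041, "B03": 0.062, "B04": 0.055, "B08": 0.312, "B11": 0.198},
    {"id": 2, "B02": 0.067, "B03": 0.083, "B04": 0.091, "B08": 0.214, "B11": 0.256},
]

rows = "\n".join(
    f"pixel {p['id']}: " + ", ".join(f"{band}={p[band]}" for band in ("B02", "B03", "B04", "B08", "B11"))
    for p in pixels
)
prompt = (
    "For each pixel below, answer 'argania' or 'other' based on its Sentinel-2 band values.\n"
    + rows
)
print(prompt)  # this prompt would then be sent to DeepSeek-V3 through its chat interface
```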

by Rebeka Kiss

Linking Headlines to Article Bodies for Stance Detection: A Structured Pre-processing Workflow Using GPT-4o

Working with real-world text data often means dealing with structures that are not yet analysis-ready. In our case, the dataset included headlines and full article texts stored separately, across two different tables. The only link between them was a shared identifier field: Body ID. Before we could begin any further
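The excerpt describes joining headlines to article bodies through a shared Body ID key. As a rough sketch of that linking step (file names and column labels here are assumptions for illustration, not taken from the post), a pandas merge might look like this:

```python
import pandas as pd

# Hypothetical file names and column labels; the post's actual dataset may differ.
headlines = pd.read_csv("stances.csv")   # assumed columns: Headline, Body ID, Stance
bodies = pd.read_csv("bodies.csv")       # assumed columns: Body ID, articleBody

# Attach each headline to its full article text via the shared "Body ID" key.
merged = headlines.merge(bodies, on="Body ID", how="left")
merged.to_csv("stance_dataset_merged.csv", index=False)
```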

by Rebeka Kiss

No-Code Transformation of the NCBI Disease Corpus into a Structured CSV

Working with biomedical corpora often requires programming skills, specialised formats, and time-consuming preprocessing. But what if you could transform a complex annotated dataset—like the NCBI Disease Corpus—into a structured, analysis-ready CSV using nothing more than a single, well-designed prompt? In this post, we demonstrate how a no-code, GenAI-powered

by Rebeka Kiss

LLM Parameters Explained: A Practical, Research-Oriented Guide with Examples

Large language models (LLMs) rely on a set of parameters that directly influence how text is generated — affecting randomness, repetition, length, and coherence. Understanding these parameters is essential when working with LLMs in research, application development, or evaluation settings. While chat-based interfaces such as ChatGPT, Copilot, or Gemini typically
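To make the parameters mentioned above concrete, here is a minimal sketch of setting them in an API call using the OpenAI Python SDK; the model name and the specific values are illustrative assumptions, and suitable settings depend on the task:

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise the abstract below in two sentences. ..."}],
    temperature=0.2,        # lower values reduce randomness in sampling
    top_p=1.0,              # nucleus-sampling threshold
    frequency_penalty=0.5,  # discourages verbatim repetition
    max_tokens=200,         # caps the length of the generated answer
)
print(response.choices[0].message.content)
```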

by Rebeka Kiss

Generative AI in Academic Publishing: Tools and Strategies from Elsevier, Springer Nature and Wiley

As generative AI technologies continue to transform the research landscape, major academic publishers are beginning to integrate AI-powered tools directly into their platforms. These tools aim to support researchers at various stages of the scientific workflow – from literature discovery and summarisation to writing assistance and experimental comparison. This article provides

by Rebeka Kiss

AI Writing Tools for Researchers: Getting Started with Grammarly and DeepL

Academic writing demands clarity, precision, and often the ability to work across multiple languages. In recent years, AI-powered writing tools have become indispensable aids for researchers looking to improve their manuscripts. Two popular options are Grammarly and DeepL, each offering distinct strengths. Grammarly is known for refining English writing (catching

by Rebeka Kiss

Assessing the FutureHouse Owl Agent’s Ability to Detect Defined Concepts in Academic Research

Following our previous evaluations of the FutureHouse Platform’s research agents, this post turns to Owl, the platform’s tool for precedent and concept detection in academic literature. Owl is intended to help researchers determine whether a given concept has already been defined, thereby streamlining theoretical groundwork and avoiding redundant

by Rebeka Kiss

Comparing the FutureHouse Platform’s Falcon Agent and OpenAI’s o3 for Literature Search on Machine Coding for the Comparative Agendas Project

Having previously explored the FutureHouse Platform’s agents in tasks such as identifying tailor-made laws and generating a literature review on legislative backsliding, we now directly compare its Falcon agent and OpenAI’s o3. Our aim was to assess their performance on a focused literature search task: compiling a ranked

by Rebeka Kiss

Using Falcon for Writing a Literature Review on the FutureHouse Platform: Useful for Broad Topics, Not for Niche Concepts

The FutureHouse Platform, launched in May 2025, is a domain-specific AI environment designed to support various stages of scientific research. It provides researchers with access to four specialised agents — each tailored to a particular task in the knowledge production pipeline: concise information retrieval (Crow), deep literature synthesis (Falcon), precedent detection

by Rebeka Kiss

Human- or AI-Generated Text? What AI Detection Tools Still Can’t Tell Us About the Originality of Written Content

Can we truly distinguish between text produced by artificial intelligence and that written by a human author? As large language models become increasingly sophisticated, the boundary between machine-generated and human-crafted writing is growing ever more elusive. Although a range of detection tools claim to identify AI-generated text with high precision,

by Rebeka Kiss

Can AI Really Accelerate Scientific Discovery? A First Look at the FutureHouse Platform

As scientific research has become increasingly data-intensive and fragmented across disciplines, the limitations of traditional research workflows have become more apparent. In response to these structural challenges, FutureHouse — a nonprofit backed by Eric Schmidt — launched a platform in May 2025 featuring four specialised AI agents. Designed to support literature analysis, hypothesis development,

by Rebeka Kiss

Building a Retrieval-Augmented Generation (RAG) System for Domain-Specific Document Querying

In recent years, Retrieval-Augmented Generation (RAG) has emerged as a powerful method for enhancing large language models with structured access to external document collections. By combining dense semantic search with contextual text generation, RAG systems have proven particularly useful for tasks such as answering questions based on extensive documentation, enabling
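The excerpt describes the retrieve-then-generate pattern behind RAG. The sketch below illustrates only that pattern, not the post's actual pipeline; it assumes the sentence-transformers package, and the embedding model, corpus, and query are placeholders:

```python
# Dense retrieval followed by prompt construction for a downstream generator.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Section 3.1: The export procedure requires a signed data-transfer agreement.",
    "Section 4.2: Archived records are retained for ten years after project closure.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = encoder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents whose embeddings are closest to the query (cosine similarity)."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

query = "How long are records kept?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # `prompt` would then be passed to a language model for the generation step
```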

by Rebeka Kiss