recommendations

Benchmarking GenAI Models for Penguin Species Prediction: Grok 3, DeepSeek-V3, and Qwen2.5-Max Delivered Top Results

How well can today’s leading GenAI models classify real-world biodiversity data—without bespoke code or traditional machine learning pipelines? In this study, we benchmarked a range of large language models on the task of predicting penguin species from tabular ecological measurements, including both numerical and categorical variables. Using a

Argania Detection from Sentinel-2 Spectral Data: DeepSeek-V3 Excels with Prompt-Based Labelling of Structured Data

How far can today’s large language models go in scientific data analysis—without bespoke coding or deep learning pipelines? In this experiment, we explore the ability of DeepSeek-V3 to perform pixel-level detection of Argania trees (i.e., binary classification for each pixel) using only tabular Sentinel-2 spectral data. By

Linking Headlines to Article Bodies for Stance Detection: A Structured Pre-processing Workflow Using GPT-4o

Working with real-world text data often means dealing with structures that are not yet analysis-ready. In our case, the dataset included headlines and full article texts stored separately, across two different tables. The only link between them was a shared identifier field: Body ID. Before we could begin any further

Automating Plant Disease Detection at Scale: From Prompt Limitations to a High-Accuracy API Workflow with GPT-4o

Image-based classification is increasingly used across biology, ecology, and agriculture—from identifying animal species to detecting plant diseases. One common use case is the analysis of leaf images to distinguish between healthy and diseased plants. In this post, we compare two approaches to classifying strawberry leaves as either fresh (healthy)

Integrating OpenAI’s GPT API into RStudio with Shiny: Real-Time Code Generation from Natural Language

This post presents a practical solution for integrating OpenAI’s GPT API into RStudio using a custom Shiny interface. The tool enables real-time code generation from natural language instructions, allowing users to interact with GPT-4 directly within their R workflow—without leaving the environment or blocking the console. We demonstrate

No-Code Transformation of the NCBI Disease Corpus into a Structured CSV

Working with biomedical corpora often requires programming skills, specialised formats, and time-consuming preprocessing. But what if you could transform a complex annotated dataset—like the NCBI Disease Corpus—into a structured, analysis-ready CSV using nothing more than a single, well-designed prompt? In this post, we demonstrate how a no-code, GenAI-powered

LLM Parameters Explained: A Practical, Research-Oriented Guide with Examples

Large language models (LLMs) rely on a set of parameters that directly influence how text is generated, affecting randomness, repetition, length, and coherence. Understanding these parameters is essential when working with LLMs in research, application development, or evaluation settings. While chat-based interfaces such as ChatGPT, Copilot, or Gemini typically

Generative AI in Academic Publishing: Tools and Strategies from Elsevier, Springer Nature and Wiley

As generative AI technologies continue to transform the research landscape, major academic publishers are beginning to integrate AI-powered tools directly into their platforms. These tools aim to support researchers at various stages of the scientific workflow – from literature discovery and summarisation to writing assistance and experimental comparison. This article provides

Comparing the FutureHouse Platform’s Falcon Agent and OpenAI’s o3 for Literature Search on Machine Coding for the Comparative Agendas Project

Having previously explored the FutureHouse Platform’s agents in tasks such as identifying tailor-made laws and generating a literature review on legislative backsliding, we now directly compare its Falcon agent and OpenAI’s o3. Our aim was to assess their performance on a focused literature search task: compiling a ranked

Building a Retrieval-Augmented Generation (RAG) System for Domain-Specific Document Querying

In recent years, Retrieval-Augmented Generation (RAG) has emerged as a powerful method for enhancing large language models with structured access to external document collections. By combining dense semantic search with contextual text generation, RAG systems have proven particularly useful for tasks such as answering questions based on extensive documentation, enabling

Introducing Horizon Navigator '25: A Custom GPT by poltextLAB for Smarter Access to EU Funding Information

Navigating Horizon Europe’s 2025 Work Programme means dealing with twelve separate documents, each several hundred pages long. These include funding calls, eligibility conditions, strategic priorities, and legal annexes—making it difficult to locate critical information quickly. To address this challenge, poltextLAB developed a domain-specific Custom GPT (Horizon Navigator '