limitations

Assessing the FutureHouse Owl Agent’s Ability to Detect Defined Concepts in Academic Research

Following our previous evaluations of the FutureHouse Platform’s research agents, this post turns to Owl, the platform’s tool for precedent and concept detection in academic literature. Owl is intended to help researchers determine whether a given concept has already been defined, thereby streamlining theoretical groundwork and avoiding redundant

Comparing the FutureHouse Platform’s Falcon Agent and OpenAI’s o3 for Literature Search on Machine Coding for the Comparative Agendas Project

Having previously explored the FutureHouse Platform’s agents in tasks such as identifying tailor-made laws and generating a literature review on legislative backsliding, we now directly compare its Falcon agent and OpenAI’s o3. Our aim was to assess their performance on a focused literature search task: compiling a ranked

Using Falcon for Writing a Literature Review on the FutureHouse Platform: Useful for Broad Topics, Not for Niche Concepts

The FutureHouse Platform, launched in May 2025, is a domain-specific AI environment designed to support various stages of scientific research. It provides researchers with access to four specialised agents — each tailored to a particular task in the knowledge production pipeline: concise information retrieval (Crow), deep literature synthesis (Falcon), precedent detection

Human- or AI-Generated Text? What AI Detection Tools Still Can’t Tell Us About the Originality of Written Content

Can we truly distinguish between text produced by artificial intelligence and that written by a human author? As large language models become increasingly sophisticated, the boundary between machine-generated and human-crafted writing is growing ever more elusive. Although a range of detection tools claim to identify AI-generated text with high precision,

Exploring Custom GPTs in ChatGPT: How Useful Are They Really?

As generative AI tools become increasingly integrated into academic practice, researchers are beginning to explore the use of Custom GPTs—personalised AI variants that operate according to predefined instructions, tone, or tasks. These agents can be configured for specific roles or workflows, such as teaching support, literature exploration, or data

Current Limitations of GenAI Models in Data Visualisation: Lessons from a Model Comparison Case Study

In earlier explorations, we identified the GenAI models that appeared most promising for data visualisation tasks—models that demonstrated strong code generation capabilities, basic support for data wrangling, and compatibility with popular Python libraries such as Matplotlib and Seaborn. In this follow-up case study, we examine a different dimension: rather