Grok-4

Comparing Generative AI Models on Customer Review Sentiment Labelling Tasks

In this post we present a comparative evaluation of several generative AI models, focusing on their ability to carry out a straightforward sentiment analysis task. The models were asked to classify customer reviews as either positive or negative, following a simple binary labelling instruction. While the task itself was unambiguous,

Searching for Literature with GenAI: Do the Latest Models Deliver Greater Accuracy?

In our earlier blog post, "Harnessing GenAI for Searching Literature: Current Limitations and Practical Considerations", we examined the reliability of generative AI models for scholarly literature searching. To assess whether the newest releases represent any improvement, we tested them on the same narrowly defined academic topic. The results indicate modest