Exploring Input File Support in GenAI Models: What Works and What Doesn’t


Generative AI models have become indispensable tools across industries, but their effectiveness often hinges on the types of input files they can process. Here, we assess the capabilities of leading GenAI models—including Claude 3.7 Sonnet, GPT-4.5, and Qwen2.5-Max—examining their support for formats like PDF, JSON, and Python files, and uncovering the current limitations.

To understand the versatility of GenAI models, we evaluated their ability to handle a range of common and specialised file formats. Our analysis covered plain text (.txt), Microsoft Word documents (.docx), PDFs (.pdf), images (.png/.jpg), and structured data formats: JSON (.json), Excel spreadsheets (.xlsx), and comma-separated values (.csv). We also tested technical formats such as LaTeX (.tex) for mathematical notation and Python scripts (.py) for code, as well as PowerPoint presentations (.pptx) and the less common .dat files often used for raw data storage. This broad assessment shows how well these models accommodate diverse inputs in real-world applications.

Model comparison

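| Model             | .txt | .docx | .pdf | .png/.jpg | .json | .xlsx   | .csv    | LaTeX (.tex) | .py | .pptx   | .dat |
|-------------------|------|-------|------|-----------|-------|---------|---------|--------------|-----|---------|------|
| Claude 3.7 Sonnet | Yes  | Yes   | Yes  | Yes       | Yes   | Limited | Limited | Yes          | Yes | No      | Yes  |
| Copilot           | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | No           | No  | Limited | No   |
| DeepSeek-V3       | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | Yes          | Yes | Yes     | No   |
| Gemini 2.0 Flash  | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | No           | Yes | Yes     | No   |
| GPT-4.5           | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | Yes          | Yes | Yes     | Yes  |
| GPT-4o            | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | Yes          | Yes | Yes     | Yes  |
| Grok 3            | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | Yes          | Yes | No      | Yes  |
| Mistral Large     | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | No           | Yes | Yes     | No   |
| Qwen2.5-Max       | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | No           | No  | No      | No   |
| Qwen2.5-Plus      | Yes  | Yes   | Yes  | Yes       | Yes   | Yes     | Yes     | No           | No  | No      | No   |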
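
For readers who want to query the comparison rather than scan it by eye, the table can be written down as a small lookup structure. The sketch below is only an illustration: the values are transcribed from the table above, while the names FORMATS, ROWS, MATRIX and models_supporting are our own and hypothetical.

```python
# Column order follows the comparison table; ".tex" stands for the LaTeX column.
FORMATS = [".txt", ".docx", ".pdf", ".png/.jpg", ".json", ".xlsx",
           ".csv", ".tex", ".py", ".pptx", ".dat"]

# One row per model, transcribed from the table (as tested on 6 April 2025).
ROWS = {
    "Claude 3.7 Sonnet": "Yes Yes Yes Yes Yes Limited Limited Yes Yes No Yes",
    "Copilot":           "Yes Yes Yes Yes Yes Yes Yes No No Limited No",
    "DeepSeek-V3":       "Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No",
    "Gemini 2.0 Flash":  "Yes Yes Yes Yes Yes Yes Yes No Yes Yes No",
    "GPT-4.5":           "Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes",
    "GPT-4o":            "Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes",
    "Grok 3":            "Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes",
    "Mistral Large":     "Yes Yes Yes Yes Yes Yes Yes No Yes Yes No",
    "Qwen2.5-Max":       "Yes Yes Yes Yes Yes Yes Yes No No No No",
    "Qwen2.5-Plus":      "Yes Yes Yes Yes Yes Yes Yes No No No No",
}

# model -> {format -> "Yes" | "Limited" | "No"}
MATRIX = {model: dict(zip(FORMATS, row.split())) for model, row in ROWS.items()}


def models_supporting(formats, allow_limited=False):
    """Return the sorted names of models that handle every format in `formats`.

    With allow_limited=True, a "Limited" entry also counts as support.
    """
    accepted = {"Yes", "Limited"} if allow_limited else {"Yes"}
    return sorted(
        model
        for model, support in MATRIX.items()
        if all(support[fmt] in accepted for fmt in formats)
    )
```

For example, models_supporting([".pptx", ".dat"]) returns only GPT-4.5 and GPT-4o, which matches the table: no other model has "Yes" in both columns.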

Claude 3.7 Sonnet

Claude 3.7 Sonnet handles .txt, .docx, .pdf, .png/.jpg, .json, LaTeX (.tex), .py, and .dat files easily, demonstrating strong performance with text, image, and technical formats. It has difficulty with .csv and .xlsx files, often failing to read them reliably, although it occasionally succeeds after multiple attempts. A notable limitation is the lack of support for .pptx files.

Claude 3.7 Sonnet's performance (accessed on 6 April 2025)

Claude 3.7 Sonnet initially ran into multiple errors when attempting to process the .xlsx file, struggling with library imports and binary parsing. After several retries, it eventually extracted the content successfully. It did not manage the same with the .csv file in this session, which remained unreadable despite repeated attempts.

Claude 3.7 Sonnet's performance (accessed on 6 April 2025)
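
One possible workaround, which we did not test as part of this comparison, is to convert tabular data to plain text before uploading it, since every model in the table accepted .txt input. The sketch below uses only Python's standard csv module; the function name csv_to_text_table is hypothetical, and whether a given model reads the result more reliably than the original .csv is an assumption, not a finding.

```python
import csv
import io


def csv_to_text_table(csv_text):
    """Render CSV content as an aligned, pipe-separated plain-text table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""

    # Pad short rows so every row has the same number of cells.
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]

    # Each column is as wide as its longest cell.
    col_widths = [max(len(row[i]) for row in rows) for i in range(width)]

    lines = []
    for row in rows:
        cells = (cell.ljust(w) for cell, w in zip(row, col_widths))
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)
```

The returned string can be saved with a .txt extension or pasted directly into the prompt.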

The model had difficulty reading the PowerPoint file (.pptx): it extracted mainly binary data and XML structure rather than readable text. The actual slide text did not appear in the extracted data. Instead, the output listed internal XML structures and the relationships between slide elements.

Claude 3.7 Sonnet's performance (accessed on 6 April 2025)
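
The binary and XML output described above is what a .pptx file looks like from the inside: the format is a ZIP archive of XML parts, with slide content normally stored under ppt/slides/ and the visible text held in DrawingML a:t elements. As a rough illustration of how the slide text could be pulled out before handing it to a model, here is a minimal sketch using only the Python standard library. The function name extract_pptx_text is hypothetical, and the sketch ignores speaker notes, embedded charts, and the presentation's declared slide order.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs inside slides are stored as <a:t> elements.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"


def extract_pptx_text(source):
    """Return [(slide_number, [text runs])] for a .pptx file.

    `source` may be a file path or a binary file-like object.
    Slides are ordered by the number in their file name.
    """
    slides = []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            match = re.fullmatch(r"ppt/slides/slide(\d+)\.xml", name)
            if match is None:
                continue  # skip relationships, layouts, media, etc.
            root = ET.fromstring(archive.read(name))
            runs = [node.text for node in root.iter(A_NS + "t") if node.text]
            slides.append((int(match.group(1)), runs))
    return sorted(slides)
```

The extracted runs can then be joined into plain text and supplied as a .txt file, a format that every model in the comparison accepted.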

Copilot

Copilot could not read LaTeX (.tex) or .py files, displaying an error message stating that it could not process these file types. It was also unable to process the .dat file. When extracting content from the .pptx file, Copilot had trouble loading the file and needed multiple attempts before it processed it successfully.

Copilot's performance (accessed on 6 April 2025)

DeepSeek-V3

DeepSeek successfully handled most file formats, including .pdf, .docx, .xlsx, .pptx, images, text, and code. The one exception was the .dat file format, which it could not process.

DeepSeek's performance (accessed on 6 April 2025)

Gemini 2.0 Flash

Gemini could not process the LaTeX (.tex) and .dat file formats. It handled all other tested formats without any issues.

Gemini 2.0 Flash's performance (accessed on 6 April 2025)

Grok 3

Grok displayed an error message for the .pptx file and failed to process it. It handled all other tested formats without any problems.

Grok 3's performance (accessed on 6 April 2025)

Mistral (Le Chat)

Mistral was unable to handle the LaTeX (.tex) and .dat file formats. It processed all other tested formats without any difficulties.

Mistral's performance (accessed on 6 April 2025)

Qwen2.5-Max, Qwen2.5-Plus

The Qwen models tested, Qwen2.5-Max and Qwen2.5-Plus, were unable to process .pptx, LaTeX (.tex), and .dat input files.

The models have made significant progress, with each iteration handling a wider range of input file formats. Of the models tested, only GPT-4o and GPT-4.5 processed every format without difficulties or limitations, which demonstrates their ability to manage diverse and complex file types.

The authors used GPT-4.5 [OpenAI (2025), GPT-4.5 (accessed on 6 April 2025), Large language model (LLM), available at: https://openai.com] to generate the output.