Generative AI is not only useful for content creation – it is increasingly being explored for structured data collection tasks as well. With a simple prompt, the Gemini Deep Research function can collect the requested data from publicly available sources. However, not all models can reliably complete such tasks, and this kind of data retrieval is advisable only under strict supervision. That said, the capabilities of deep research tools are expanding rapidly.
In this case, we asked Gemini’s Deep Research function to collect annual county-level population data for Hungary, including Budapest, for the years 2010 to 2022. The output was delivered as a well-structured Google Document together with an exportable spreadsheet containing both the yearly figures and the year-on-year changes.
Prompt
Please collect official yearly population data by county in Hungary, including Budapest, from 2010 to 2022. Compile it into a downloadable Excel file with clear structure showing annual values and year-to-year changes.
Before collecting the data, Gemini first generated a research plan outlining the steps needed to complete the task. The plan included identifying official sources, extracting population figures by county from 2010 to 2022, calculating year-on-year changes, and organising the results into a structured spreadsheet.
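The year-on-year step in such a plan is simple arithmetic and can be reproduced independently of the model. The following is a minimal sketch, not the procedure Gemini itself ran: the `population` dictionary, the region names "County A" and "County B", and the helper `year_on_year` are hypothetical, and the figures are placeholders rather than official statistics.

```python
# Placeholder figures for illustration only; these are NOT official statistics.
population = {
    "County A": {2010: 1000, 2011: 990, 2012: 985},
    "County B": {2010: 500, 2011: 505, 2012: 500},
}


def year_on_year(series):
    """Return {year: (absolute_change, percent_change)} for every year after the first.

    Each year is compared with the previous year present in the series,
    so a gap in the data would be compared across the gap.
    """
    changes = {}
    years = sorted(series)
    for prev, curr in zip(years, years[1:]):
        diff = series[curr] - series[prev]
        pct = diff / series[prev] * 100
        changes[curr] = (diff, round(pct, 2))
    return changes


# Apply the calculation to every region in the table.
changes = {region: year_on_year(series) for region, series in population.items()}

for region, per_year in changes.items():
    for year, (diff, pct) in per_year.items():
        print(f"{region} {year}: {diff:+d} ({pct:+.2f}%)")
```

Recomputing the changes in this way from the collected yearly values gives a quick cross-check against the change columns in the generated spreadsheet.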
The entire data collection process, including the detailed step-by-step description, was completed within two to three minutes.
Output
The completed analysis is available here:
The collected data is also accessible in separate tables and can be exported as a Google Spreadsheet for further analysis:
One particularly useful aspect of Gemini’s output was that it explicitly listed the websites reviewed during the research process. It also indicated which sources were ultimately excluded from the final report, providing an added layer of transparency.
Transparency is further enhanced by the fact that the model’s entire “thinking” process is visible – by expanding the “Thoughts” tab, users can trace which websites were visited and what steps led to the final report.
Recommendation
Gemini’s Deep Research function proved particularly useful for collecting structured, publicly available data with speed and clarity. Its ability to plan the research process, transparently document its steps, and present the results in a ready-to-use format makes it a promising tool for similar data collection tasks. However, as noted above, this use case still requires strict supervision and critical evaluation. Not all models perform reliably when retrieving or interpreting structured data from official sources, and results should always be carefully reviewed for accuracy and completeness.
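Part of that review can be automated. The sketch below is one possible approach, assuming the exported spreadsheet has been loaded into a nested dictionary of the form `table[region][year]`. The function `review_table`, the 5% threshold, and the sample figures are all hypothetical choices for illustration. The checks only flag gaps and implausible jumps; they do not replace comparison against the official source.

```python
def review_table(table, expected_regions, first_year, last_year, max_pct_change=5.0):
    """Return a list of human-readable issues found in table[region][year]."""
    issues = []
    years = range(first_year, last_year + 1)
    for region in expected_regions:
        series = table.get(region)
        if series is None:
            issues.append(f"{region}: region missing")
            continue
        # Completeness: every expected year must have a positive value.
        for year in years:
            value = series.get(year)
            if value is None:
                issues.append(f"{region} {year}: value missing")
            elif value <= 0:
                issues.append(f"{region} {year}: non-positive value {value}")
        # Plausibility: flag unusually large changes between consecutive years.
        for year in years[1:]:
            prev, curr = series.get(year - 1), series.get(year)
            if prev and curr and prev > 0 and curr > 0:
                pct = abs(curr - prev) / prev * 100
                if pct > max_pct_change:
                    issues.append(f"{region} {year}: change of {pct:.1f}% exceeds threshold")
    # Regions present in the data but not in the expected list.
    for region in table:
        if region not in expected_regions:
            issues.append(f"{region}: unexpected region")
    return issues


# Placeholder figures for illustration only; these are NOT official statistics.
sample = {
    "County A": {2010: 1000, 2011: 990, 2012: 700},
    "County B": {2010: 500, 2012: 498},
}
issues = review_table(sample, ["County A", "County B", "County C"], 2010, 2012)
for issue in issues:
    print(issue)
```

For the task described here, the expected regions would be Hungary's counties plus Budapest and the year range 2010 to 2022. Any flagged item is a prompt for manual checking against the official figures, not proof of an error.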
The authors used Gemini’s Deep Research function [Google DeepMind (2025), Gemini – Deep Research function (accessed on 26 March 2025), Large Language Model (LLM), available at: https://gemini.google.com] to generate the output.