Clustering Sustainability-related Sentences with GenAI

Source: Freepik - vinkf

Clustering is the unsupervised process of grouping similar entities, such as sentences, into categories based on shared characteristics rather than predefined classes, which can reveal latent patterns in the data. GenAI models let researchers run this kind of analysis without programming expertise, offering a faster alternative to manual clustering when exploring complex datasets such as those in sustainability research.

The prompt contained a one-line instruction followed by 25 sentences addressing various sustainability themes, including renewable energy, sustainable agriculture, and environmental conservation. The model’s task was to group these sentences into thematic clusters based on their content, with no predefined categories supplied.

Prompt

Clustering Task Input

Perform clustering on the following sentences.

###
"Renewable energy sources like wind and solar power are essential for reducing carbon emissions."
"Companies are increasingly adopting sustainable practices to minimize their environmental impact."
"Electric vehicles are becoming more popular as a means to reduce air pollution."
"Sustainable agriculture practices help preserve biodiversity and reduce soil degradation."
"Governments around the world are implementing policies to combat climate change."
"Reducing plastic waste is crucial for protecting marine ecosystems."
"The circular economy promotes the reuse and recycling of materials to minimize waste."
"Energy efficiency measures in buildings can significantly lower greenhouse gas emissions."
"Water conservation techniques are vital in areas facing drought conditions."
"Investing in green technology can drive innovation and create new job opportunities."
"Consumers are increasingly demanding eco-friendly products and sustainable packaging."
"Deforestation has a major impact on climate change and wildlife habitats."
"Carbon offset programs can help balance out the emissions produced by human activities."
"Public transportation systems play a key role in reducing the carbon footprint of urban areas."
"Ocean conservation efforts are essential to maintaining healthy marine biodiversity."
"Urban planning that includes green spaces can improve the quality of life and air quality in cities."
"Biofuels offer an alternative to fossil fuels, reducing the reliance on non-renewable energy sources."
"Corporate social responsibility initiatives are often aligned with sustainability goals."
"Reforestation projects can help restore ecosystems and sequester carbon dioxide."
"Sustainable fashion aims to reduce the environmental impact of clothing production and consumption."
"Smart grid technology helps optimize energy distribution and reduce waste."
"Organic farming practices avoid the use of synthetic chemicals, promoting soil health."
"Climate change adaptation strategies are necessary to protect vulnerable communities."
"Green building certifications encourage the construction of energy-efficient and sustainable buildings."
"Eco-tourism promotes environmental awareness and supports conservation efforts."
###
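
For readers who keep their sentences in a file or spreadsheet, the prompt layout above (an instruction line, a `###` delimiter, one quoted sentence per line, and a closing `###`) can be assembled with a few lines of Python. This is only a minimal sketch: the function name `build_clustering_prompt` and its `delimiter` parameter are illustrative assumptions, and sending the prompt to a model is left out because that step depends on the provider.

```python
def build_clustering_prompt(sentences, delimiter="###"):
    """Assemble a prompt in the layout shown above (hypothetical helper).

    Layout: instruction line, blank line, delimiter,
    one double-quoted sentence per line, closing delimiter.
    """
    instruction = "Perform clustering on the following sentences."
    quoted = [f'"{sentence}"' for sentence in sentences]
    return "\n".join([instruction, "", delimiter, *quoted, delimiter])


# Two sentences taken from the prompt above, used here as a short demo.
example_sentences = [
    "Renewable energy sources like wind and solar power are essential for reducing carbon emissions.",
    "Reducing plastic waste is crucial for protecting marine ecosystems.",
]

prompt = build_clustering_prompt(example_sentences)
print(prompt)
```

Wrapping the sentences in delimiters, as the original prompt does, keeps the instruction visually separate from the data to be clustered.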

Output

Clustering Output

To perform clustering on the provided sentences, I’ll group them based on common themes or topics related to sustainability and environmental impact. Clustering is typically an unsupervised process where similar items are grouped together based on their content. Here, I’ll use a qualitative approach to identify clusters by analyzing the main focus of each sentence. The sentences will be grouped into thematic categories without relying on complex algorithms, as the dataset is small and the themes are relatively clear.

After reviewing the sentences, I’ve identified five distinct clusters based on recurring topics: Energy and Emissions, Sustainable Practices, Ecosystem Conservation, Waste and Resource Management, and Urban and Social Solutions. Below are the clusters with their respective sentences and a brief explanation of each group.

Cluster 1: Energy and Emissions

  • "Renewable energy sources like wind and solar power are essential for reducing carbon emissions."
  • "Electric vehicles are becoming more popular as a means to reduce air pollution."
  • "Energy efficiency measures in buildings can significantly lower greenhouse gas emissions."
  • "Biofuels offer an alternative to fossil fuels, reducing the reliance on non-renewable energy sources."
  • "Carbon offset programs can help balance out the emissions produced by human activities."
  • "Smart grid technology helps optimize energy distribution and reduce waste."

Explanation: This cluster focuses on energy-related solutions and technologies aimed at reducing greenhouse gas emissions and air pollution, including renewable energy, energy efficiency, and alternative fuels.

Cluster 2: Sustainable Practices

  • "Companies are increasingly adopting sustainable practices to minimize their environmental impact."
  • "Sustainable agriculture practices help preserve biodiversity and reduce soil degradation."
  • "Investing in green technology can drive innovation and create new job opportunities."
  • "Corporate social responsibility initiatives are often aligned with sustainability goals."
  • "Sustainable fashion aims to reduce the environmental impact of clothing production and consumption."
  • "Organic farming practices avoid the use of synthetic chemicals, promoting soil health."
  • "Green building certifications encourage the construction of energy-efficient and sustainable buildings."

Explanation: This cluster centers on sustainable practices across industries like agriculture, fashion, construction, and corporate operations, emphasizing environmentally friendly methods and innovation.

Cluster 3: Ecosystem Conservation

  • "Reducing plastic waste is crucial for protecting marine ecosystems."
  • "Deforestation has a major impact on climate change and wildlife habitats."
  • "Ocean conservation efforts are essential to maintaining healthy marine biodiversity."
  • "Reforestation projects can help restore ecosystems and sequester carbon dioxide."
  • "Water conservation techniques are vital in areas facing drought conditions."
  • "Eco-tourism promotes environmental awareness and supports conservation efforts."

Explanation: This cluster highlights efforts to protect and restore natural ecosystems, including forests, oceans, and water resources, with a focus on biodiversity and conservation.

Cluster 4: Waste and Resource Management

  • "The circular economy promotes the reuse and recycling of materials to minimize waste."
  • "Consumers are increasingly demanding eco-friendly products and sustainable packaging."

Explanation: This smaller cluster focuses on managing resources efficiently through recycling, reuse, and consumer-driven demand for sustainable products, emphasizing waste reduction.

Cluster 5: Urban and Social Solutions

  • "Governments around the world are implementing policies to combat climate change."
  • "Public transportation systems play a key role in reducing the carbon footprint of urban areas."
  • "Urban planning that includes green spaces can improve the quality of life and air quality in cities."
  • "Climate change adaptation strategies are necessary to protect vulnerable communities."

Explanation: This cluster covers societal and governmental responses to environmental challenges, including urban planning, public policy, and transportation solutions aimed at sustainability and community well-being.

Summary of Clusters

  1. Energy and Emissions (6 sentences): Focuses on energy solutions to reduce emissions and pollution.
  2. Sustainable Practices (7 sentences): Encompasses sustainable methods across various industries.
  3. Ecosystem Conservation (6 sentences): Centers on protecting natural ecosystems and biodiversity.
  4. Waste and Resource Management (2 sentences): Deals with waste reduction and resource efficiency.
  5. Urban and Social Solutions (4 sentences): Highlights societal and urban strategies for sustainability.

These clusters group the sentences into meaningful categories based on their thematic similarities. If you’d like a different clustering approach (e.g., fewer or more clusters, or a specific method like k-means with quantitative metrics), feel free to let me know!
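
The model's closing remark mentions k-means as a quantitative alternative to its qualitative grouping. For readers curious what that could look like, here is a rough, dependency-free sketch of a k-means-style loop over word-count vectors with cosine similarity. It is not the method Grok 3 used. The stopword list, the function names, and the choice of raw word counts are all assumptions made for illustration, and a real analysis would more likely use an established library and better text representations.

```python
import math
import random
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "can", "for", "in",
             "is", "like", "of", "on", "that", "the", "to", "with"}


def vectorize(sentence):
    """Turn a sentence into a bag-of-words count vector."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(value * b[token] for token, value in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans_cosine(sentences, k, iterations=10, seed=0):
    """Assign each sentence a cluster label in range(k)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    vectors = [vectorize(s) for s in sentences]
    centroids = rng.sample(vectors, k)  # start from k random sentences
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for each vector.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                total = Counter()
                for member in members:
                    total.update(member)
                centroids[c] = Counter(
                    {t: n / len(members) for t, n in total.items()})
    return labels


# Four sentences from the prompt above, used as a tiny demo.
demo_sentences = [
    "Sustainable agriculture practices help preserve biodiversity and reduce soil degradation.",
    "Organic farming practices avoid the use of synthetic chemicals, promoting soil health.",
    "Reducing plastic waste is crucial for protecting marine ecosystems.",
    "Ocean conservation efforts are essential to maintaining healthy marine biodiversity.",
]
labels = kmeans_cosine(demo_sentences, k=2)
print(labels)
```

Unlike the qualitative grouping above, this approach only sees word overlap, so it produces no cluster names or explanations and its groupings depend on the chosen `k` and the random starting points.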

We tested this prompt with Grok 3. The model grouped the 25 sustainability-related sentences into five thematic clusters (Energy and Emissions, Sustainable Practices, Ecosystem Conservation, Waste and Resource Management, and Urban and Social Solutions) using a qualitative, unsupervised approach. The categories were coherent: for example, "renewable energy" and "carbon offsets" landed in the same cluster, and biofuels were linked to emissions reductions. The clusters were uneven in size, however, with Waste and Resource Management containing only 2 sentences. Overall, the output met the prompt’s goals and required no coding skills, illustrating how GenAI can streamline exploratory text analysis for researchers across diverse domains.
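
Because a model can silently drop or repeat an item, it can be worth checking that the returned clusters cover every input sentence exactly once. The sketch below is one hedged way to do that. The helper `check_partition` is a hypothetical name, the placeholder labels `s1` to `s3` stand in for real sentences, and the size figures are the ones reported in the summary of clusters above.

```python
from collections import Counter


def check_partition(sentences, clusters):
    """Compare input sentences with cluster members (hypothetical helper).

    Returns three lists: sentences missing from every cluster,
    sentences assigned more than once, and cluster members that
    were never part of the input.
    """
    assigned = Counter(s for members in clusters.values() for s in members)
    known = set(sentences)
    missing = [s for s in sentences if assigned[s] == 0]
    duplicated = [s for s, count in assigned.items() if count > 1]
    unexpected = [s for s in assigned if s not in known]
    return missing, duplicated, unexpected


# Cluster sizes as reported in the model's summary.
reported_sizes = {
    "Energy and Emissions": 6,
    "Sustainable Practices": 7,
    "Ecosystem Conservation": 6,
    "Waste and Resource Management": 2,
    "Urban and Social Solutions": 4,
}
total_assigned = sum(reported_sizes.values())
print(total_assigned)  # 25, matching the number of input sentences

# Toy example with placeholder labels instead of full sentences.
toy_sentences = ["s1", "s2", "s3"]
toy_clusters = {"A": ["s1", "s2"], "B": ["s2"]}
missing, duplicated, unexpected = check_partition(toy_sentences, toy_clusters)
print(missing, duplicated, unexpected)  # ['s3'] ['s2'] []
```

In this run the reported sizes add up to 25, which is consistent with every input sentence having been assigned to a cluster.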

The authors used Grok 3 [xAI (2025) Grok 3 (accessed on 23 March 2025), Large language model (LLM), available at: https://x.ai/grok] to generate the output.