Arabic Mini-ClimateGPT : A Climate Change and Sustainability Tailored Arabic LLM

Climate change is one of the most significant challenges we face together as a society. Creating awareness and educating policy makers the wide-ranging impact of climate change is an essential step towards a sustainable future. Recently, Large Language Models (LLMs) like ChatGPT and Bard have shown impressive conversational abilities and excel in a wide variety of NLP tasks. While these models are close-source, recently alternative open-source LLMs such as Stanford Alpaca and Vicuna have shown promising results. However, these open-source models are not specifically tailored for climate related domain specific information and also struggle to generate meaningful responses in other languages such as, Arabic. To this end, we propose a light-weight Arabic Mini-ClimateGPT that is built on an open-source LLM and is specifically fine-tuned on a conversational-style instruction tuning curated Arabic dataset Clima500-Instruct with over 500k instructions about climate change and sustainability. Further, our model also utilizes a vector embedding based retrieval mechanism during inference. We validate our proposed model through quantitative and qualitative evaluations on climate-related queries. Our model surpasses the baseline LLM in 88.3% of cases during ChatGPT-based evaluation. Furthermore, our human expert evaluation reveals an 81.6% preference for our model's responses over multiple popular open-source models. Our open-source demos, code-base and models are available here https://github.com/mbzuai-oryx/ClimateGPT.


Introduction
Climate change poses a significant and urgent challenge in our modern era, as it has far-reaching and profound consequences for the environment, ecosystems, economies, and human well-being.Climate change has a significant impact on food security and freshwater.For example, high temperatures can lead to crop loss, water scarcity, the spread of pests and diseases, resulting in reduced agricultural production and increased food prices.Additionally, high temperatures and changes in precipitation patterns can lead to a decline in the availability of freshwater, affecting agricultural production, energy, and other industries that rely on irrigation methods.User:In this case, how can we preserve the crops?Response:Crops can be preserved through various methods, which include using sustainable agricultural practices such as efficient irrigation techniques, crop diversification, and crop rotation.These efforts can help reduce reliance on traditional crops and enhance the resilience of agricultural systems to climate change.(best viewed in zoom)   The main contributions of this work are : (i) We propose a climate specialized Arabic conversational agent named Arabic Mini-ClimateGPT, which is built upon Vicuna framework and finetuned specifically with our climate change and sustainability related instructions in Arabic language.

* Equal Contribution
(ii) We have generated over 500k conversationalstyle instruction tuning data based on the public benchmarks for climate change and sustainability.This augmentation of interactive conversational data significantly enhances the performance of LLMs and preserves its generalizability through the fine-tuning process.To the best of our knowledge, our proposed dataset Clima500-Instruct marks the first release of a substantial conversational-style Arabic instruction set dedicated to Climate change and sustainability.(iii) We integrate a vector embedding and datastore framework, which can be utilized during model inference for information retrieval without the need for additional training.(iv) We perform comprehensive evaluations of our model.Our model achieves 88.3% win rate in ChatGPT-based comparison with baseline.Furthermore, our Arabic language expert evaluation shows that our model responses were preferred in 81.6% of the cases, whereas popular open-sourced models such as Vicuna, Alpaca-Arabic and Dolly v2 were successful only in less than 8% of the cases.

Related Work
ClimateQ&A (Ekimetrics, 2023)  ClimaBench: This is a benchmark dataset (Laud et al., 2023) for Climate change text understand-ing, which is built upon the publicly available surveys released by CDP (formerly the Carbon Disclosure Project (CDPInc, 2000)) and the NAIC Climate Risk Disclosure Survey.ClimaBench consists of Clima-Insurance and Clima-CDP for text classification and Clima-QA for question answering tasks based on the CDP data.We use this Clima-QA data compiled from the CDP-Cities, CDP-States and CDP-Corporation subsets and preprocess into 485K Question Answer (QA) pairs.The proposed Arabic Clima500-Instruct dataset is built upon CCMRC and ClimaBench datasets.

Arabic Mini-ClimateGPT
Our Arabic Mini-ClimateGPT is an open-sourced conversational agent instruction-tuned for Climate change and Sustainability in Arabic.We first create a conversational style instruction-tuning dataset named Clima500-Instruct in Arabic language based on CCMRC and ClimaBench (See Sec.3.1).Our Arabic Mini-ClimateGPT is built upon the Vicuna-7B model by fine-tuning on the Clima500-Instruct.We also integrate a dedicated Inference module to incorporate region-specific documents in Arabic to extend the knowledge base of our model.(See Sec.3.2)

Arabic Clima500-Instruct Dataset
We propose a conversational style instructiontuning dataset named Clima500-Instruct for Climate change and Sustainability in Arabic language.Dataset Creation: The existing climate-specific datasets such as CCMRC and Clima-QA from ClimaBench are naive QA datasets in English that lack the interactive conversation capabilities required to train a conversational agent such as Vicuna.To this end, our proposed Clima500-Instruct dataset is a conversational-style instruction tuning dataset in Arabic language with Climate change and sustainability as the central theme.Fig. 3 shows our overall dataset creation pipeline which is summarized below: (i) We first pre-process the data and extract the Question Answer (QA) pairs from their respective sources.These QA pairs are then compiled and converted to a naive QA instruction set.(ii) We then instruct ChatGPT to act as a domain expert in Climate change and sustainability to generate conversational-style responses in English grounded in information based on the QA pairs from CCMRC and ClimaBench.  1 summarizes the statistics of our proposed Clima500-Instruct dataset in Arabic language as well as its intermediate byproduct in the English language.The train and test set splits of our Clima500-Instruct dataset are carefully prepared based on the respective splits of representative data sources mentioned in section 2.1.To further understand the composition of our proposed Clima500-Instruct Dataset, we obtain high-level categories in our dataset based on the most frequent words.Table 2 reports the categories and their corresponding instance counts.

Model Training and Inference
Our Arabic Mini-ClimateGPT is developed on top of the Vicuna-7B (Chiang et al., 2023) model.We fine-tune this Vicuna-7B model on the conversational style Clima500-Instruct in both English and Arabic languages.We train the model on 4 x A100 (80GB) utilizing gradient checkpointing and flash attention following the Vicuna fine-tuning recipe.
We observe that the Arabic ClimateGPT model requires a context length of 1024 tokens to generate high quality results.The overall framework of our model with its inference time retrieval capabilities is shown in Fig. 2.  Inference Module: Large Language models have shown remarkable capabilities for a variety of NLP tasks in recent times.However, their knowledgebase is inherently limited to their training data.Recent frameworks (Chase, 2022) have shown that the use of Vector embedding based retrieval mechanisms can be leveraged to extend the knowledge base of an LLM.To this end, we integrate a vector database using the ChromaDB (Huber et al., 2020) framework.We use the stsb-xlm-r-multilingual model from Sentence Transformers (Reimers and Gurevych, 2019) to embed the Arabic Climate documents to leverage its multilingual embedding capabilities.Climate reference documents in English are converted to vector embedding using the lightweight all-MiniLM-L6-v2 from Sentence Transformers (Reimers and Gurevych, 2019).
The inference steps of our model can be summarized as the following.When a user submits a query to the model it is first converted into the embedding space of the corresponding Sentence Transformer (Reimers and Gurevych, 2019) based on the language.We then perform a Vector similarity search of this query embedding with the existing Vector Database using ChromaDB.If the similarity search returns a high correspondence, the user query is then augmented with the retrieved additional context.If the retrieved documents are not relevant then the original query is passed to the Arabic Mini-ClimateGPT model.Our model fine-tuned on the Arabic Clima500-Instruct dataset, then generates the final output by incorporating the supplementary retrieved context (See Fig 2).

Experiments
As LLMs become more sophisticated, the existing open benchmarks may no longer suffice to adequately assess their capabilities.These benchmarks are typically designed to test chatbots on a limited set of tasks, such as question-answering or contextbased localization.To address this, we compare the model responses with ground truth responses on a held-out Clima500-Instruct Arabic test set using ChatGPT.We compare the generated responses of our model without the VectorDB module against other open-source LLMs like Vicuna (Chiang et al., 2023), Stanford Alpaca (Taori et al., 2023), Dolly v2 (Databricks, 2023) and Arabic Alpaca(Yasbok, 2023) on the test set in a fair setting.For each question within the test set, we provided ChatGPT with the following inputs: (i) the ground-truth answer text, (ii) the response texts generated by our model, and (iii) the text generated by the designated competitor model.In this context, ChatGPT's role was to assess and contrast the text outputs of our model and the competitor model, subsequently selecting the response that best semantically aligns with the ground truth.If neither of the generated texts semantically matches the ground truth, ChatGPT outputs "Neither" as the response.The row headers in Table 3 indicate the occurrences of "Our," "Com-petitor," and "Neither" being favored, alongside the corresponding win percentages.For instance, in the first row, our model is preferred in 88.33% of test samples, Vicuna is favored in 11.23%, and in 0.43% of cases, neither our model nor Vicuna is preferred.Our Arabic Mini-ClimateGPT model outperforms the recently introduced Vicuna with a win rate of 88.3% on ChatGPT evaluation.(See Tab 3).We also perform human expert based evaluation on our Arabic Clima500-Instruct test set with native Arab speakers.For Human Evaluation to be in a fair setting, we remove all model identifiers and provide responses from all 5 models to Arab native speakers.They then select the best response out of the 5 (or none) for a given question.The evaluation results on recently introduced open-source models are shown in Fig 4 .Our Arabic Mini-ClimateGPT achieves an overall win rate of 81.6% on human expert evaluations.
Table 4 presents the comparison results of naive translated baselines with our proposed model.We employ the same ChatGPT-based evaluation methodology, comparing our proposed Arabic Mini-ClimateGPT responses against these baseline models.(i) Vicuna_tr: In this baseline, we begin by translating the query from Arabic to English.Next, we generate the model response in English using the Vicuna baseline.Finally, we translate the responses back to Arabic.(ii) Cli-mateGPT_en_tr: In the second baseline, we also execute Arabic-English and English-Arabic translations.However, instead of the Vicuna baseline, we utilize ClimateGPT_en.This model has been trained using the our proposed Clima500-Instruct in English.In both scenarios, we observe that our Arabic Mini-ClimateGPT responses outperform these baselines.Our model achieves 73.98% win rate compared to Vicuna_tr with only 25.84%.In case of ClimateGPT_en_tr (30.07%), our model achieves 69.57% win rate on ChatGPT evaluations.These results demonstrate the necessity for a dedicated Arabic conversational dataset and model learning.

Conclusion
In this paper, we present Arabic Mini-ClimateGPT, a climate-specialized conversational agent in Arabic.Our framework incorporates a vector embedding based retrieval mechanism to extend the knowledge-base of our model during inference.Our proposed dataset Clima500-Instruct is an ex-     (Zhu et al., 2023;Liu et al., 2023;Thawkar et al., 2023;OpenAI, 2023;Muhammad Maaz and Khan, 2023;Rasheed et al., 2023) have demonstrated promising results in several domains.We hope that integrating a dedicated Vision encoder for Climate related modalities (e.g.temperature, wind, precipitation) with our model would be a promising future direction.

Ethics Statement
In alignment with the ACL Ethics Policy, we address the ethical dimensions of our work on Arabic Mini-ClimateGPT.We have conscientiously credited the data sources and other open source works on which Arabic Mini-ClimateGPT is built upon.We have worked diligently to reduce biases in model responses and improve the credibility of generated responses through the implementation of our VectorDB module.Our commitment to transparency is evident through the availability of open-source resources, and we value collaboration and accountability within the research community.In recognizing the broader societal impact of our research, we pledge to uphold ethical standards in the development and deployment of our model for Climate and sustainability related information dissemination.

Acknowledgement
We would like to thank our colleagues at MBZUAI for their essential contribution to the Evaluation and Dataset verification tasks, including Dr. Jean Lahoud, Abdelrahman Shaker, Salwa Al Khatib, Mohamed El Amine Boudjoghra, Aisha Fahad Ahmed Ali Alraeesi, Amna Abdelrahim Nasir Abdalla Alhosani, Hour Eisa Abdelrahim Ahmed Mohamed, Hosam Mahmoud Abdalla Ahmed Ali Elgendy, Yahia Dalbah, Mohammed Almansoori, without which this project would not be possible.
The computational resources were provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no.2022-06725, and by the Berzelius resource, provided by the Knut and Alice Wallenberg Foundation at the National Supercomputer Centre.

Figure 1 :
Figure 1: Arabic Mini-ClimateGPT conversation: User: How does climate change affect food security and freshwater?Response:Climate change has a significant impact on food security and freshwater.For example, high temperatures can lead to crop loss, water scarcity, the spread of pests and diseases, resulting in reduced agricultural production and increased food prices.Additionally, high temperatures and changes in precipitation patterns can lead to a decline in the availability of freshwater, affecting agricultural production, energy, and other industries that rely on irrigation methods.User:In this case, how can we preserve the crops?Response:Crops can be preserved through various methods, which include using sustainable agricultural practices such as efficient irrigation techniques, crop diversification, and crop rotation.These efforts can help reduce reliance on traditional crops and enhance the resilience of agricultural systems to climate change.(best viewed in zoom)

Figure 2 :
Figure2: Overview of Arabic Mini-ClimateGPT framework: When the user submits a query, it is first embedded and searched for similarity in the vector database.The query is then augmented if the retrieved information is of high correspondence and passed on to our model for generating the final output.

Figure 3 :
Figure 3: Clima500-Instruct Arabic Dataset creation pipeline: At first, CCMRC and ClimaBench dataset are pre-processed and compiled into QA pairs.These QA pairs are then converted to a conversational-style instruction set and further translated into Arabic using ChatGPT.The resulting instruction set undergoes manual verification and post-processing to yield the final Clima500-Instruct dataset.(Refer 3.1)

Figure 4 :
Figure 4: Human Evaluation: We perform evaluation on various open-source models on our Clima500test set with native Arab speakers.Win % of each model is illustrated in the graph along with their respective legends.Our model achieves 81.6% better responses compared to other models in Human evaluation.

Figure 5 :
Figure 5: Example 1 -Question What are the expected effects of climate change in the short and long term?OursThe projected impacts of climate change will be on a very large and complex scale, and could affect different levels of ecosystems and human societies.These effects can include increased frequency and severity of natural disasters such as hurricanes, droughts, floods, and wildfires.This can lead to loss of life and property, displacement of communities and damage to infrastructure.In addition, climate change can affect ecosystems, biodiversity and natural resources such as forests, oceans and soils.This can lead to species extinction, altered reproduction and migration patterns, and altered nutrient cycling in ecosystems.Moreover, climate change can have far-reaching impacts on economic and social systems, including rising sea levels, altered rainfall patterns and loss of agricultural yields.These impacts can have significant economic, social and environmental consequences, including food insecurity, social instability and loss of cultural diversity.In general, the impacts of climate change in the short and long term are complex and multifaceted, and the frequency and intensity of impacts will vary depending on the region and ecosystem involved and the adaptive and mitigation capacity of human societies.VicunaThe impact of climate change in the short and long term through changes in the basic weather and atmospheric waves such as the middle door and the east, and land changes in the areas where the weather waves change.Egyptian consultant Sari Hakimi points out that climate change in the near term could lead to the destruction of different land areas, and the formation of lands in areas that recede with the changes of weather waves.We could also see changes in the atmospheric climate in the near term, which could lead to the destruction of people's living environments related to the lands.A local community sometimes faces major challenges in weathering atmospheric and remote changes, but in quite a few ways a community can combat its challenges in a multi-way, including investing in quality ground services, courses and training.Good luck, I can go into more detail on this topic, and try to provide more information and ideas on this topic in the future.

Figure 6 :
Figure 6: Example 2 -Question What is the difference between global warming and climate change?Ours Globalwarming refers to the long-term increase in the Earth's average surface temperature as a result of human activities such as burning fossil fuels, logging, and industrial processes.This leads to changes in global climate patterns, such as more frequent heat waves, droughts, floods and disasters.On the other hand, climate change refers to long-term changes in weather patterns, temperatures, and precipitation as a result of natural changes in the terrestrial climate over time that can also lead to changes in climate patterns.Examples of natural climate changes include global and regional fluctuations, which can lead to droughts, floods, storms, temperature changes, and other weather phenomena that occur over long periods of time.Ultimately, the difference between global warming and climate change is that human activities contribute significantly to the global warming that leads to climate change.It is important to take urgent action to reduce greenhouse gas emissions and limit the negative impacts of climate change.Vicuna Global warming is one of the determinants that affect the atmospheric and chemical reactions in the living family, including atmospheric and chemical changes appropriate to affinity and responses, in addition to changes in atmospheric waves and priority.Climate change is one of the determining factors affecting the atmospheric and chemical reactions in the living family, including ion and isotropic changes, in addition to changes in atmospheric waves and priority.Global warming is a good challenge for the living family, as it is related to a large prevalence of human diseases and many unstable factors that affect the atmospheric and chemical reactions in the living family, including changes in air waves and priority.Climate change is a good challenge for the living family, as it is related to a large prevalence of human diseases and many unstable factors that affect the atmospheric and chemical reactions in the living family, including changes in weather waves and priority.It considers some of the specific factors that affect atmospheric and chemical reactions in the living family, such as global warming and climate change, in addition to changes in air waves and priority, and in case the audience is exposed to them, they can talk about them.They can keep track of the latest information about climate change and global warming and what's your type, to get help with solutions for diseases and conditions

Figure 7 :
Figure 7: Example 3 -Question How can we tackle climate change?Ours Steps can be taken to address climate change by reducing greenhouse gas emissions and promoting sustainable energy and transportation practices.Governments, businesses and individuals must collaborate to develop and implement policies and practices that promote sustainability and reduce the effects of climate change.Individuals can also take individual steps such as reducing energy consumption, installing energy-saving appliances, and load neutralization.In addition, climate change can be promoted through research and studies to understand the factors contributing to it and implement solutions to deal with it.The public can also contribute to climate change mitigation by changing individual behavior and by engaging in efforts to reduce greenhouse gas emissions and promote sustainability.Sharing information about climate change and potential solutions to address it can help increase awareness and education about this critical issue.Ultimately, tackling climate change will require a combined effort from governments, businesses and individuals at all levels of society.Vicuna Yes, everything can address climate change, and it is important to address climate change is to address the change of the natural balance of regions.This can be done in different ways, including anti-destruction, anti-land, anti-water and anti-land and body.Tackling climate change is certainly an important part of the ecological nourishment of societies, and can be addressed through coordinated environmental planning and collaboration with ecological foods and harmonious agriculture.If you know what the climate response is, you can benefit from the new standards and tools you use to address climate change to achieve sustainable development and the environment.

Figure 8 :Figure 9 :
Figure 8: Example A -User Write a python program to add two numbers Response Combine the two numbers to take advantage of online energy efficiency improvements and environmentally significant actions.2.He drove

Table 1 :
Clima500-Instruct Dataset Statistics Finally, we manually filtered out low-quality examples and post-processed the instruction set with the assistance of native Arabic experts.The following steps were taken to post-process the dataset.
(a) Eliminate undefined symbols.(b) Verify and manually translate English words in several questions and answers into Arabic.(c) Remove any instances of corrupted translation (Q or A still being in English).(d) Verify the translation for N random samples.This approach enabled us to obtain a highquality, conversationally-styled instruction tuning dataset, Clima500-Instruct, in the Arabic language.Dataset Statistics: Table

Table 2 :
Clima500-Instruct high-level categories and their instance count

Table 3 :
ChatGPT Evaluation: Our Arabic Mini-ClimateGPT compared with other open-source models, where ChatGPT was tasked to pick the best response based on the ground truth.Our model achieves nearly 90% win rate with all the competing models including the Alpaca-ar in Arabic language.

Table 4 :
Comparison with Naive baseline with translated responses.Vicuna_tr stands for baseline Vicuna with translated responses.ClimateGPT_en_tr stands for English ClimateGPT with translated responses.