Using Large Language Models for research

How long would it take you to read 10,000 scientific articles? Even at 10 a day, it would take almost three years, including weekends (!). There are vast amounts of untapped insights in agricultural and climate data, which finres is accessing and making available to our partners. Some of the latest developments in Artificial Intelligence (AI), namely Large Language Models (LLMs), make it possible to benefit from this previously produced research.

The benefits are not just speed. More and more people are using LLMs such as ChatGPT, Gemini, Perplexity and others to obtain quick and apparently correct results. LLMs are also increasingly used in scientific research. The same algorithms that answer seemingly random questions, drawing on models trained on terabytes of source material, can be honed and targeted on the body of scientific literature to synthesize large amounts of information, identify patterns and extract insights. Even someone who dedicated themselves to reading all 10,000 papers on a particular subject would not be able to remember and synthesize across all of them: LLMs can.

This is why finres has been using LLMs as part of our research and development to strengthen our work supporting agriculture to adapt to the changing climate. In this blog, we highlight two ongoing research projects using LLMs and offer four emerging lessons for other researchers seeking to benefit from these powerful AI tools.

 

Using LLMs to Identify the Most Effective Climate-Resilient Farming Practices

Shading agriculture technique used in a potato field

Our Agronomist Dr. Mathilde Duvallet is leading a project using a powerful LLM (ChatGPT-4o) to analyze the effect of different measures that farmers can use to adapt to a changing climate. These practices include new methodologies or complementary technologies that farmers can use when faced with more difficult weather conditions: for example, types of shading to protect against higher temperatures, micro-water harvesting to combat drought conditions, or different drainage systems to tackle saturation from too much rain. The effect of these different measures varies by crop, microclimatic conditions, soil parameters, as well as other variables.

Mathilde used ChatGPT-4o to screen an initial list of 10,000 studies and find the most relevant ones. More than 1,500 papers have been classified by geographic location, soil type, crop and the climate adaptation technology studied. This initial classification by the LLM allowed us to prioritize the papers by climate adaptation type, climate or soil type, before our team of researchers analysed the results and integrated them into our work.

 

Analysing Behavioural Science Research with LLMs

In another project, our Data Scientist Dr. Dan Xie used an LLM to conduct a qualitative meta-analysis of the barriers that inhibit farmers from adopting new agricultural practices. ChatGPT-4o was again used to screen and select relevant academic articles. Validation tests showed that ChatGPT-4o achieved results consistent with human researchers 94% of the time. The LLM analysed the content of the articles, and used statistical methods such as Sentence-BERT clustering to develop 12 categories of barriers, including elements like behavioral factors, economic influences, and policy support.
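As a rough illustration of what such a validation test can look like, the sketch below compares include/exclude screening decisions from a human reviewer and a model on the same sample of papers. The labels are invented for the example, and the function names are hypothetical. Cohen's kappa is added here as a common chance-corrected companion to raw agreement; it is an assumption of this sketch, not a statistic reported in the project.

```python
from collections import Counter

def percent_agreement(human, model):
    """Share of items on which the two label lists match."""
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(h == m for h, m in zip(human, model))
    return matches / len(human)

def cohens_kappa(human, model):
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = percent_agreement(human, model)
    human_counts = Counter(human)
    model_counts = Counter(model)
    # Chance agreement: product of each rater's label frequencies, summed over labels.
    expected = sum(human_counts[label] * model_counts[label]
                   for label in human_counts) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Made-up screening decisions for ten papers (illustrative only).
human_labels = ["include", "include", "exclude", "exclude", "include",
                "exclude", "include", "exclude", "exclude", "include"]
model_labels = ["include", "include", "exclude", "exclude", "include",
                "exclude", "include", "exclude", "exclude", "exclude"]

agreement = percent_agreement(human_labels, model_labels)
kappa = cohens_kappa(human_labels, model_labels)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Raw agreement is easy to communicate, while kappa guards against a high score that arises only because one label dominates the sample.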

Dan used an LLM to analyse over 670 separate studies, covering more than 14,603 farmers from across 33 countries. This meant we were able to reveal patterns across different locations, demographics, and other variables such as the type of farming. Further statistical assessments, called co-occurrence analysis, also identified interactions between barriers in different groups. Being able to analyse so much research more efficiently means we can better design applications and target information to support farmers who are adapting to the changing climate.
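For readers unfamiliar with co-occurrence analysis, a minimal sketch of the counting step follows: for each study, every pair of barrier categories reported together is tallied. The toy studies and the function name are invented for illustration, and the actual analysis may well use different tooling and further statistics on top of these counts.

```python
from collections import Counter
from itertools import combinations

def barrier_cooccurrence(studies):
    """Count how often each pair of barrier categories appears in the same study."""
    pair_counts = Counter()
    for barriers in studies:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(set(barriers)), 2):
            pair_counts[pair] += 1
    return pair_counts

# Hypothetical studies, each tagged with the barrier categories it reports.
toy_studies = [
    {"economic", "behavioral"},
    {"economic", "policy"},
    {"economic", "behavioral", "policy"},
    {"behavioral"},
]

counts = barrier_cooccurrence(toy_studies)
for pair, count in counts.most_common():
    print(pair, count)
```

Pairs with high counts are candidates for barriers that tend to appear together; comparing the counts between groups of studies (for example by region) is one way to look for interactions.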

 

Reducing “hallucinations” and other risks of AI for Agriculture

As with any research using innovative tools, there are risks. The LLM might “hallucinate”, i.e. create information that seems coherent and correct but is not drawn from any accurate pre-existing data. To mitigate this risk, parameters can be adjusted in the code that calls the model to control the randomness of its output and the amount of data it uses to find the right answer. The researcher can choose the optimal values for these parameters by testing sample results and assessing the accuracy of the information provided.
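The parameters are not named here; two common candidates in LLM sampling are "temperature" and "top-p", and the sketch below assumes those are the kind of settings meant. It uses made-up scores for four candidate tokens to show the general mechanism: a lower temperature concentrates probability on the most likely token, and a lower top-p restricts sampling to fewer candidates. It is a toy model of the idea, not the code of any particular LLM.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

toy_logits = [2.0, 1.0, 0.5, 0.1]  # invented scores for four candidate tokens
cool = softmax_with_temperature(toy_logits, 0.2)
warm = softmax_with_temperature(toy_logits, 1.5)
narrow = top_p_filter(warm, 0.5)
wide = top_p_filter(warm, 0.95)
print("top token probability:", round(cool[0], 3), "vs", round(warm[0], 3))
print("candidates kept:", len(narrow), "vs", len(wide))
```

In practice these values are usually passed as settings when calling the model rather than implemented by hand, and suitable values are found by the kind of sample testing described above.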

Another risk is that the LLM may be inconsistent, i.e. give a different answer despite being asked the same question, or ‘prompt’. This can happen when the answer is based on nuanced information that needs subjective interpretation. Validation tests can again be used to assess the probability of inconsistency before undertaking large-scale analysis, but as with judgements faced by human researchers, some of this risk will be unavoidable. 
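One simple way to quantify this kind of inconsistency is to send the same prompt several times and measure how often the answers agree with the most common one. The sketch below assumes the repeated answers have already been collected; the answers shown are invented, and `consistency_rate` is a hypothetical helper name.

```python
from collections import Counter

def consistency_rate(answers):
    """Share of repeated runs that agree with the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(answers)

# Invented answers from asking the same question about one paper five times.
repeated_answers = ["maize", "maize", "maize", "wheat", "maize"]
rate = consistency_rate(repeated_answers)
print(f"consistency: {rate:.0%}")
```

Running this over a sample of papers before a large-scale analysis gives an estimate of how often answers are unstable, and flags the questions that may need a tighter prompt or human review.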

 

What are some key lessons for other researchers when using LLMs? 

The following approaches help minimize ambiguity and maximize the quality of the output received.

  1. Be precise and strategic with prompts: Crafting clear and specific prompts is essential for obtaining accurate results. The more precise the prompt, the more effectively the LLM can focus on relevant data. For example, instead of asking a general question like “What impacts farmers’ decisions?”, use detailed instructions like, “Identify factors impacting farmers’ adoption of agricultural practices.” 
  2. Use several questions or steps to obtain the information needed: Progressively narrow the scope of inquiry rather than asking open questions that could lead the model to hallucinate. For example, when assessing which crop is studied in a paper, first ask the model whether the paper is focused on agriculture, next ask whether a crop is studied in the paper, and only then ask which crop is being studied.
  3. Implement strong verification processes: To ensure the reliability of LLM-generated data, it’s important to verify outputs against human-reviewed standards to avoid noise and bias in the LLM-generated output. Regular validation checks build trust in the findings and help maintain scientific rigor.
  4. Use LLMs for scalability, but recognize their boundaries: LLMs are invaluable for handling large datasets quickly, but they won’t be suitable for every task. While they efficiently process large volumes of information, nuanced interpretation may still require human oversight. Combining the strengths of LLM scalability with human expertise ensures more accurate and meaningful research outcomes.
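The step-by-step questioning in lesson 2 can be sketched as a small cascade that stops as soon as an answer rules the paper out. In this illustration, `ask` stands for whatever function sends a question and a paper's text to an LLM; it is a placeholder, and `fake_ask` is a keyword-based stub used only so the example runs without a model. The exact prompt wording is also an assumption.

```python
from typing import Callable, Optional

def identify_crop(paper_text: str, ask: Callable[[str, str], str]) -> Optional[str]:
    """Narrow the question in three steps; return None as soon as a step says no."""
    answer = ask("Is this paper focused on agriculture? Answer yes or no.", paper_text)
    if answer.strip().lower() != "yes":
        return None
    answer = ask("Does this paper study a specific crop? Answer yes or no.", paper_text)
    if answer.strip().lower() != "yes":
        return None
    crop = ask("Which crop is studied? Answer with the crop name only.", paper_text)
    return crop.strip().lower()

def fake_ask(question: str, text: str) -> str:
    """Keyword stub standing in for a real LLM call (illustration only)."""
    lowered = text.lower()
    if "focused on agriculture" in question:
        return "yes" if "farm" in lowered else "no"
    if "specific crop" in question:
        return "yes" if "potato" in lowered else "no"
    return "potato"

print(identify_crop("Shading trials on a potato farm", fake_ask))
print(identify_crop("Deep-sea sediment cores", fake_ask))
```

Because the final, open question is only asked once the two yes/no checks have passed, the model is never invited to name a crop for a paper that does not study one.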

 

What’s next? The Future of AI in Agricultural Research

We take the efficacy and ethical consequences of using AI in our work seriously. Rigorous testing of the methodology as well as the results ensures we understand and are confident in the work produced. At finres, we already use another form of AI, machine learning, in our calculations relating to the changing climate and its hyper-local impact on specific agricultural products. But with new LLMs being regularly developed and released, there is even more scope to uncover insights to support the future of farming.

Hybrid LLMs with multimodal abilities will allow researchers to combine text, images, and other data formats, enriching the analysis process, for example by integrating visual data from satellite imagery with textual analysis to enhance insights into agricultural practices. The integration of real-time data into LLMs, meanwhile, will allow for continuous updates, helping research findings remain timely and relevant. As this innovative area of research matures, domain-specific LLMs are being built for specific focus areas, which in turn can power applications developed to bridge the gap between the latest science and its practical implementation.

 

To find out more about how finres is using AI to develop applications to support the resilience of the agricultural sector, please get in touch or try the free version of AgHorizon. AgHorizon uses insights derived from AI-supported research to provide decision-grade, crop-specific data for the specific location you request.

 
