Evidence-First Agricultural Intelligence Research
FildraAI is built as a research programme, not just a product. We connect computer vision for crop disease, automatic speech recognition for field audio, machine learning for continuous agricultural variables, and structured knowledge systems — all validated in real field environments across crops, livestock, and fisheries. Each component is evaluated with transparent methods so researchers, agronomists, and farmers can see how the system behaves in practice.
Research Philosophy
Our Approach to Agricultural AI
We use AI and machine learning to support agronomy — not to replace agronomists. Three core principles shape how we design models, collect data, and communicate results.
Agricultural AI must earn trust through transparency, not demand it through authority.
When a model suggests "Northern Leaf Blight" or recommends a specific fungicide application rate, the agronomist in the field needs to understand why. They need to see what the model observed, compare it against their own experience, and make an informed decision. Black-box predictions have no place in agriculture where livelihoods depend on getting it right.
Our research programme is designed around this reality. Every model we deploy comes with explainability tools. Every recommendation traces back to evidence. Every knowledge base entry cites its sources. We measure success not just by accuracy metrics, but by whether farmers and agronomists can understand and act on our outputs.
"Lab metrics are useful, but on-farm performance is the real benchmark. A model that achieves 98% accuracy on curated datasets but fails when farmers capture images under real conditions is not a successful model — it is a research artefact that never made the transition to practice."
Explainability First
Predictions must be easy to inspect. All deployed image models support AI focus area maps and structured outputs showing typical symptoms, look-alike issues, and management options. Agronomists can see why a result was suggested, not just the final label.
Field-First Validation
We prioritise data gathered in real fields — mixed cropping, partial nutrient stress, complex backgrounds — and compare early predictions with end-of-season outcomes. Lab metrics provide a starting point. On-farm performance is the real benchmark.
Transparent Methods, Safe Details
We share evaluation setups, baselines, and typical failure modes. Sensitive items — deployment pipelines, internal hyperparameters, customer data — stay private. But the behaviour and limitations of our models are open for discussion.
Academic Foundations
Datasets & Research Foundations
Our models are informed by publicly available datasets and research from leading institutions across Africa, Asia, and North America — combined with our own field data. We treat these datasets as shared scientific infrastructure and follow citation and licensing conditions for every source.
We separate datasets used for research benchmarking from those allowed in production deployment, respecting the licensing conditions of each source. Full citation details, including DOIs and BibTeX entries, are maintained in our internal technical notes and shared with research partners.
Maize Disease Datasets
Nelson Mandela African Institution of Science and Technology
Arusha, Tanzania
Publishes maize disease datasets on Harvard Dataverse, with field-collected images and expert labels for Northern Leaf Blight, Gray Leaf Spot, and Common Rust — captured directly from Tanzanian farms under real agricultural conditions.
Makerere University
Kampala, Uganda
Hosts maize image collections with clear field protocols across Ugandan agro-ecologies and seasons, with rigorous annotation standards and disease classification.
Namibia University of Science and Technology
Windhoek, Namibia
Contributes maize disease datasets from semi-arid and arid systems, extending evaluation into Southern African environments whose light conditions and stress combinations differ from many global benchmarks.
KaraAgro AI
Accra, Ghana
Provides curated maize imagery from West African farms, improving geographic balance across our maize-focused models and including healthy plants alongside multiple disease classes.
PlantVillage — Penn State University
Pennsylvania, USA
One of the most widely cited plant disease image repositories. Where licensing is restrictive, we use PlantVillage primarily as a reference benchmark rather than production training data.
PlantDoc — IIT
India
Introduces in-the-wild imagery rather than controlled lab conditions. Useful for stress-testing robustness and identifying failure modes where background clutter and image quality vary from curated field images.
Rice Disease Datasets
Rice Leaf Disease and Pest Dataset
Mendeley Data, 2024
Provides annotated imagery for multiple rice disease classes and pest types, supporting training and evaluation of rice-specific visual diagnosis models across diverse growing conditions.
Rice Leaf Diseases — UCI Machine Learning Repository
UCI Repository, 2017
A foundational rice disease dataset covering bacterial blight, blast, and brown spot — three of the most economically damaging rice diseases. Provides early benchmarks for leaf-level visual classification.
Rice Disease Dataset — Kaggle
Kaggle, 2021
A community-contributed rice disease dataset with images spanning multiple disease conditions in South Asian growing contexts. Broadens the geographic and phenotypic diversity of our rice model training and evaluation data.
Proper Citation & Licensing: We cite all datasets and papers in our technical documentation and publications. Each repository is accessed under its own licence terms. Some datasets are used only for research benchmarking; production models are trained on sources whose licences are compatible with commercial deployment.
Research Tracks
How Our Research Fits Together
The programme is organised into connected tracks: computer vision for crop images, automatic speech recognition for field audio, machine learning for continuous variables, an evidence-first knowledge system, and field validation. Each track informs the others.
Understanding how these tracks connect explains why our recommendations are contextual, explainable, and grounded in evidence — rather than opaque model outputs.
Crop Disease Detection & AI Focus Areas
DenseSwin (maize), residual networks (rice), and compact CNN baselines (tomato, cassava) form the foundation of our plant health pipeline. We focus on performance under farmer-captured image conditions and on making model attention visible through AI focus area overlays.
- Hybrid CNN + attention architectures tuned for maize, rice, tomato, and cassava
- Explainability via AI focus areas plus symptom, look-alike, and management summaries
- Confidence calibration so probabilities translate into actionable advice
- Evaluation across diverse African and Asian field conditions
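The confidence calibration step above can be sketched with temperature scaling — the standard technique of dividing logits by a learned temperature so reported probabilities match observed accuracy. This is a minimal illustration, not our production calibration code; the grid search and data are purely illustrative.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens overconfident outputs."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def nll(logits, labels, T):
    """Negative log-likelihood of the true labels at temperature T."""
    return -np.mean([np.log(softmax(l, T)[y]) for l, y in zip(logits, labels)])

def fit_temperature(val_logits, val_labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature that minimises validation NLL (simple grid search)."""
    return min(grid, key=lambda T: nll(val_logits, val_labels, T))
```

After fitting T on a held-out validation set, every deployed probability is computed as `softmax(logits, T)`, so a reported "87% Northern Leaf Blight" corresponds to how often such predictions are actually correct.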
Automatic Speech Recognition for Field Use
FieldAudio is active research — not a roadmap item. We are fine-tuning ASR models for African Bantu languages (Swahili, Chichewa, Tumbuka, Nyanja), building TTS synthesis calibrated to agricultural vocabulary in these languages, and evaluating STT pipelines under real field conditions: wind, livestock noise, and low-bandwidth connections.
- ASR fine-tuning on agricultural command vocabulary across Bantu language families
- TTS synthesis optimised for how farmers actually speak field terms and crop names
- STT evaluation in real field conditions: wind, livestock noise, low-bandwidth connections
- Language coverage: Swahili (Kenya/Tanzania), Chichewa (Malawi/Zambia), and expanding
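STT evaluation in those conditions is typically reported as word error rate (WER). A minimal reference implementation — the standard word-level edit distance, not our internal tooling — looks like this:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    r, h = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over word sequences, one row at a time.
    prev = list(range(len(h) + 1))
    for i, rw in enumerate(r, start=1):
        curr = [i] + [0] * len(h)
        for j, hw in enumerate(h, start=1):
            cost = 0 if rw == hw else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution or match
        prev = curr
    return prev[-1] / max(len(r), 1)
```

Comparing WER on clean studio recordings against the same utterances re-recorded with wind and livestock noise gives a concrete measure of how field conditions degrade each language model.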
ML for Rates, Yields & Scenarios
When questions move from "what is this?" to "how much should I apply?", we use supervised regression and statistical models. These estimate continuous targets such as fertiliser rates, spray volumes, and expected yield — not single numbers, but ranges with confidence intervals.
- Agronomy-aware features: nutrient balances, growing degree days, stress indicators
- Regularised linear models and tree-based ensembles for continuous predictions
- Conservative / typical / upper confidence bands rather than single-point estimates
- Safety guardrails derived from regulation, labels, and internal policies
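One simple way to produce conservative / typical / upper bands is to fit a point model and widen its predictions by residual quantiles from a held-out calibration split. The sketch below uses ordinary least squares as a stand-in for our tree-based ensembles, and the data split and quantile levels are illustrative assumptions, not our production procedure.

```python
import numpy as np

def fit_rate_model(X, y):
    """Ordinary least squares with intercept (placeholder for richer models)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def rate_bands(coef, X_cal, y_cal, X_new, lo=0.1, hi=0.9):
    """Conservative / typical / upper estimates from calibration residuals."""
    A_cal = np.column_stack([np.ones(len(X_cal)), X_cal])
    q_lo, q_hi = np.quantile(y_cal - A_cal @ coef, [lo, hi])
    typical = np.column_stack([np.ones(len(X_new)), X_new]) @ coef
    return typical + q_lo, typical, typical + q_hi
```

The resulting triple maps directly onto the advice shown to users: apply at least the conservative rate, plan around the typical rate, and treat the upper bound as the ceiling before safety guardrails intervene.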
FieldKB & Evidence-First Search
FieldKB is our structured agronomy knowledge system. It links regulations, practice notes, field images, and model outputs into one searchable, country-aware layer — always showing its sources and avoiding one-size-fits-all answers.
- Multilingual, country-aware retrieval routing by crop × country × topic
- Text and image evidence with clear citations and licence tags
- Integration with AI focus areas so users can ask "why this diagnosis?" and see visual evidence
- RAG-style orchestration favouring transparent, document-backed answers
Field Trials & On-Farm Evaluation
Our primary field operations are based in Zambia, spanning smallholder belts and commercial hubs. We collect images, management histories, and outcomes to test how models behave in practice and keep the knowledge base grounded in real farms.
- Sites across Eastern, Central, and Southern Provinces and the Lusaka corridor
- Linked datasets: images, weather summaries, soil tests, and management logs
- Season-end reviews comparing model suggestions with realised yields and outcomes
- Data-use agreements and privacy controls for farmer and partner data
Livestock, Fisheries & New Crops
We are actively scoping research into livestock health monitoring, aquaculture and inland fisheries intelligence, and additional staple crops. These domains share the same accountability requirements as our crop work and will follow the same validation-first approach before deployment.
- Livestock health monitoring — visual and behavioural indicator research
- Fisheries — aquaculture conditions, fish health, and yield intelligence
- Additional crops — sorghum, groundnut, soybean, cassava expansion
- Same evidence-first, field-validated approach applied across all new domains
Expanding Scope
Where We Are Going
As our validation phases progress, we are beginning to scope research into adjacent agricultural domains. These expansions follow the same discipline that governs our current work: validate before deploying, and never overclaim what has not been proven in the field.
Integration
From Input to Recommendation
When a farmer or advisor submits a query — whether an image for diagnosis, a voice note describing symptoms, or a question about treatment rates — the request flows through multiple research tracks. Each track adds context, validation, and explainability before results reach the user.
Input — Image or Voice
A farmer captures a crop image or records a voice note describing symptoms. FieldAudio transcribes speech to structured queries; FieldVision processes the image through crop-specific computer vision models, generating AI focus area overlays that make model attention visible.
Knowledge Retrieval & Evidence Assembly
The diagnosis, crop, and location feed into FieldKB, which retrieves country-aware guidance, regulatory information, and management practices. Users see the sources behind every recommendation — government guidelines, peer-reviewed research, or validated field data.
Continuous Variable Estimation
When the question is "how much?" — fertiliser rates, spray volumes, expected yields — ML regression models propose rate ranges. Rather than a single number, users receive conservative, typical, and upper-bound estimates with confidence intervals.
Field Validation Feedback Loop
Fieldwork and on-farm evaluation feed back into all tracks. When we compare model predictions with actual outcomes at season end, the results update datasets, stress-test models, and refine FieldKB entries — keeping our research grounded in what actually happens in fields.
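The first three steps of this flow can be sketched as a simple composition. Everything below is illustrative: the stub functions stand in for entire research tracks, and their names, signatures, and return values are assumptions made for this sketch, not the real system interfaces.

```python
# Stubs standing in for the full research tracks (illustrative only).
def transcribe(voice):
    return f"transcribed: {voice}"                       # FieldAudio

def diagnose(image):
    return {"label": "northern_leaf_blight",             # FieldVision +
            "confidence": 0.87}                          # AI focus areas

def retrieve(diagnosis, query):
    return [{"text": "management guidance",              # FieldKB, with
             "source": "extension note"}]                # cited sources

def estimate_rates(diagnosis, query):
    return {"conservative": 40, "typical": 50, "upper": 60}  # ML regression

def handle_query(image=None, voice=None, question=None):
    """Compose the steps: input -> diagnosis -> evidence -> rate estimation."""
    query = transcribe(voice) if voice else question
    diagnosis = diagnose(image) if image is not None else None
    evidence = retrieve(diagnosis, query)
    bands = estimate_rates(diagnosis, query)
    return {"query": query, "diagnosis": diagnosis,
            "evidence": evidence, "rates": bands}
```

The fourth step — the field validation feedback loop — sits outside this request path: season-end comparisons of predictions and outcomes update the datasets and models that these stubs represent.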
Transparency
Openness with Responsibility
We aim to make our research easy to understand and scrutinise while protecting production-critical details that enable continued investment in field-driven agricultural AI.
Open Research Components
Evaluation protocols, high-level architectures, performance metrics, and typical failure modes are documented so partners can reproduce results or challenge assumptions. Model behaviour characteristics and known limitations — including where systems should not be trusted — are open for discussion.
Production & Partner Data
Full training pipelines, internal feature engineering, hyperparameter configurations, and partner-specific datasets remain confidential. This allows continued investment in long-term, field-driven AI development. Customer and farmer data is never shared beyond agreed use cases.
Farmer & Field Data Governance
Field data is collected under clear data-use agreements with explicit consent. Public pages aggregate information at province or district level — we never publish farmer-identifying details. Usage data improves our services but is never sold. Farmers retain ownership of their agricultural data.
Where We Acknowledge Uncertainty
Our tools support agronomy decisions but do not replace local expertise, labels, or regulation. Where models are uncertain — overlapping stress symptoms, out-of-distribution images, novel disease presentations — we emphasise caution and defer to human judgement.
Licensing & Provenance: FieldKB entries record source, licence, and region for every piece of evidence. Datasets with non-commercial or restrictive terms are separated from production training and used only for research or benchmarking purposes.
Model Limitations: We explicitly document where models should not be trusted — novel pathogens, unusual environmental conditions, crops outside our training distribution, and languages with insufficient ASR training data. Transparency about limitations builds more trust than claims of universal applicability.
Interested in Research Collaboration?
We collaborate with universities, research institutes, agribusinesses, and public agencies. If you have datasets to share, models to benchmark, or field trials to design — across crops, livestock, fisheries, or audio — we welcome joint work on accountable agricultural AI grounded in real fields and clear evidence.