Citizen-generated data (CGD) plays a crucial role in enhancing transparency and accountability in service delivery, especially in resource-constrained environments. By empowering citizens to report local service delivery failures, CGD offers ground-level insights that traditional, centralized data collection often overlooks. This empowers marginalised communities to voice their concerns and contributes to more inclusive and participatory governance models.

Project Name

Detecting potential service delivery intervention areas through citizen-led reporting integrating WhatsApp and Large Language Models

Objective

This project aims to leverage the widespread use of WhatsApp in South Africa to create a scalable and inclusive platform through which citizens can report service delivery challenges, thereby providing information on potential service delivery interruption hotspots. The collected data will be analysed using AI to generate insights for improved service delivery and accountability.

The appeal of this concept lies in the integration of WhatsApp, a widely used communication platform, with a sophisticated Large Language Model (LLM). This combination allows for the collection of rich, citizen-generated data in a familiar and accessible format, while also ensuring the data is interpreted and utilised effectively.

This concept creates a single platform to track potential service delivery hotspots, and thus potential areas in which to intervene at scale. It complements current citizen-driven initiatives by creating an additional data source against which existing sources can be cross-checked. Additionally, this project makes close to real-time analysis of potential service delivery interruption hotspots possible.

All citizen-generated data is de-identified before it is stored long-term, which limits the privacy risks associated with the data collection.

Chatbot

To ensure maximum transparency and explainability, we explain how our chatbot works and how we ensure that your data is managed ethically and responsibly. The first thing to understand is that we have a conversation period, which runs from when the user initiates a conversation to either a few minutes after the last activity or to a maximum conversation duration. During that time we store each message as it comes in and keep a record of personal information such as the user’s WhatsApp number, the location pin of the service delivery issue, and the time of the conversation. Once the conversation period ends, we process this data and move it to a long-term database, de-identifying the data in the process.
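A minimal sketch of how such a conversation period could be checked is shown here. The timeout values and all names (IDLE_TIMEOUT_S, MAX_DURATION_S, Conversation, conversation_has_ended) are illustrative assumptions, not the values or code used by the chatbot.

```python
from dataclasses import dataclass

# Assumed values for illustration only; the real limits are not stated.
IDLE_TIMEOUT_S = 5 * 60    # "a few minutes after the last activity"
MAX_DURATION_S = 30 * 60   # "maximum conversation duration"


@dataclass
class Conversation:
    started_at: float        # seconds, e.g. from time.time()
    last_activity_at: float  # updated whenever a message arrives


def conversation_has_ended(conv: Conversation, now: float) -> bool:
    """Return True once the conversation is idle or has run too long."""
    idle = now - conv.last_activity_at >= IDLE_TIMEOUT_S
    too_long = now - conv.started_at >= MAX_DURATION_S
    return idle or too_long
```

Once this check returns True, the short-term data for that conversation would be handed to the post-conversation processing described below.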

How the conversation works
Users send a message on WhatsApp, which is forwarded to our application through the Twilio Application Programming Interface (API). When our application receives a message, it first checks that the user has given informed consent for us to collect their data, in accordance with the Protection of Personal Information Act. It then checks that the user is using the tool for its intended purpose and not to report criminal activity or emergencies. At any point during the conversation, users can send a quit response, at which point the conversation ends and all of the data they shared with us is deleted. If the message is about a service delivery issue and the user has consented to us collecting their data, we send the message to a Large Language Model (LLM) that we have instructed to hold a conversation with them, collecting information about how long the service delivery issue has been going on, where it is happening, and what its effects are. Once the LLM is satisfied that it has collected this information, it thanks the user.
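The order of these checks can be sketched as a small routing function. This is an illustration under stated assumptions, not the chatbot's actual code: the 'YES' consent reply, the keyword list used to spot out-of-scope messages, and the names Session, Action and handle_message are all hypothetical. The 'Q' quit reply is the one described in the terms of use.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

# Hypothetical keyword list; how out-of-scope messages are really
# detected is not specified.
OUT_OF_SCOPE_WORDS = ("robbery", "emergency", "ambulance")


class Action(Enum):
    ASK_CONSENT = auto()
    CONSENT_RECORDED = auto()
    OUT_OF_SCOPE = auto()
    FORWARD_TO_LLM = auto()
    DELETE_AND_END = auto()


@dataclass
class Session:
    consented: bool = False
    messages: list = field(default_factory=list)


def handle_message(session: Session, text: str) -> Action:
    cleaned = text.strip()
    # A quit reply is honoured at any point and wipes the stored messages.
    if cleaned.upper() == "Q":
        session.messages.clear()
        session.consented = False
        return Action.DELETE_AND_END
    # Nothing is stored until the user has consented (assumed reply: "YES").
    if not session.consented:
        if cleaned.upper() == "YES":
            session.consented = True
            return Action.CONSENT_RECORDED
        return Action.ASK_CONSENT
    # Crime and emergency reports are turned away rather than processed.
    if any(word in cleaned.lower() for word in OUT_OF_SCOPE_WORDS):
        return Action.OUT_OF_SCOPE
    session.messages.append(cleaned)
    return Action.FORWARD_TO_LLM
```

In this sketch the quit check comes first so that a user can always leave, even before consenting; only messages that pass every check are stored and passed on to the LLM.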

How the post-conversation processing works
Shortly after every conversation, the data in short-term memory is processed to remove personal information and stored in the long-term database, after which the short-term copy is deleted. The de-identification process includes the following:

  1. The user’s WhatsApp number is deleted.
  2. Locations of service delivery issues are aggregated to the ward level. This reduces the identifiability of service delivery location, and thus the citizen’s location. If the ward cannot be determined, then this data field is left blank.
  3. The time of the conversation is aggregated to the week level. This reduces the ability to identify an individual or their location at a particular time.
  4. We use other AI techniques to remove personal information in the form of unique identifiers that may have been shared with us.
  5. Requests are sent to Twilio to delete any data that they have stored from each conversation. The company whose LLM we use does not retain the data we send it long-term or use it for training their models.

While these steps de-identify our collected data, it is not possible for us to guarantee that our data has been de-identified to the extent that it cannot be re-identified again, since we cannot control what information citizens choose to share.
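Steps 1 to 4 above can be sketched as a single function. This is a simplified illustration, not the actual pipeline: the record fields, the ward_lookup callable and the function name deidentify are hypothetical, and the two regular expressions are a plain pattern-matching stand-in for the AI techniques mentioned in step 4 (they only catch South African-style phone numbers and 13-digit ID numbers).

```python
import re
from datetime import datetime, timedelta

# Stand-in patterns (assumption): 13-digit ID numbers and phone numbers
# written as +27 or 0 followed by nine digits.
ID_RE = re.compile(r"(?<!\d)\d{13}(?!\d)")
PHONE_RE = re.compile(r"(?<!\d)(?:\+27|0)\d{9}(?!\d)")


def deidentify(record: dict, ward_lookup) -> dict:
    """Build the long-term record from a short-term conversation record.

    ward_lookup is a hypothetical callable (lat, lon) -> ward id or None.
    """
    ts: datetime = record["timestamp"]
    # Step 3: keep only the Monday that starts the conversation's week.
    week_start = ts.date() - timedelta(days=ts.weekday())
    # Step 4 (simplified): redact identifiers typed into the message.
    text = ID_RE.sub("[REDACTED]", record["text"])
    text = PHONE_RE.sub("[REDACTED]", text)
    return {
        # Step 1: the WhatsApp number is simply never copied across.
        # Step 2: the pin becomes a ward, or None if it cannot be found.
        "ward": ward_lookup(record["lat"], record["lon"]),
        "week_start": week_start.isoformat(),
        "text": text,
    }
```

Building a new record from an explicit list of allowed fields, rather than deleting fields from the old one, means that any field not named here is dropped by default.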

How we analyse the data
Once the data is de-identified, we can analyse and visualise it. We start with topic analysis, an AI method that matches the content of messages to different service delivery topics so that each conversation can be classified. The classified data is displayed on an interactive map, where users can see the most pressing service delivery issues in each ward, local municipality, district municipality or province over time. By comparing wards, policymakers can better determine which policies will best meet the needs of people in different areas. By following one region over time, one can see whether citizen complaints about a particular service delivery issue are decreasing or increasing. We also use LLMs to generate summaries of the citizen conversations, again filtered by region, time period, or service delivery issue.
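The kind of aggregation that sits behind such a map can be sketched as follows, assuming each conversation has already been de-identified and assigned a topic. The field names, ward identifiers and function names are hypothetical, and the topic classification itself is not shown.

```python
from collections import Counter


def count_reports(records: list) -> Counter:
    """Count conversations per (ward, week, topic)."""
    return Counter((r["ward"], r["week_start"], r["topic"]) for r in records)


def top_issue(counts: Counter, ward: str):
    """Return the most reported topic in a ward across all weeks, or None."""
    totals = Counter()
    for (w, _week, topic), n in counts.items():
        if w == ward:
            totals[topic] += n
    return totals.most_common(1)[0][0] if totals else None


# Made-up example records for illustration only.
records = [
    {"ward": "ward-A", "week_start": "2024-05-06", "topic": "water"},
    {"ward": "ward-A", "week_start": "2024-05-13", "topic": "water"},
    {"ward": "ward-A", "week_start": "2024-05-13", "topic": "electricity"},
    {"ward": "ward-B", "week_start": "2024-05-13", "topic": "roads"},
]
counts = count_reports(records)
```

Comparing the counts for one ward and topic across successive weeks gives the increasing or decreasing trend described above, and the same counts can be summed to municipality or province level.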

Providing this close to real-time analysis will:

  • Help policymakers set policy agendas that better meet the needs of citizens in different regions and get early indicators of widespread service delivery issues.
  • Empower citizens with information they can use to hold their government representatives accountable across different spheres of government.
  • Enable businesses to identify markets in need of goods and services.

Terms of use

By using this service (the chatbot), you agree to the following terms and conditions.

Objective
This tool collects data on service delivery issues in South Africa through cellphone messaging services. Your participation helps inform public service improvements and contributes to open data initiatives. Participation is not mandatory, and you can reply with ‘Q’ to stop a conversation and delete it from our database.

Responsible Party
The responsible party is the Policy Innovation Lab at Stellenbosch University. Contact details can be found below.

Data Collection
The data collected through this service includes:

  • Details about service delivery issues as reported by you.
  • The ward in which the service delivery issue can be found.
  • The week in which the conversation took place.

Use of Data
To do the research we have discussed, we must collect and store the data described from people like you. The information you give to us may be re-used by others. Other investigators from all over the world can ask to use your data in the future. To protect your privacy, we do not share your contact details, the exact location of the service delivery issue, or the exact time of the conversation. This tool does not request, intentionally store, or require any personal information such as your name, ID number, phone number, residential address, or any other information that could be used to identify you. You should remain anonymous in your interactions with this tool. If you do choose to share personal information with us, then you allow us to share that information and use it for the described purposes.

Public Availability
Since the data is made publicly available, once submitted, it may not be possible to delete or amend it. However, reasonable attempts will be made to delete or amend submitted data by request (see our contact details below). By using this tool, you consent to the data being used in this manner.

Potential risks
The potential risk of using this chatbot is that someone may access personal information that you share.

Furthermore, while we take steps to protect the data from unauthorised access, please be aware that no method of electronic transmission or storage is completely secure.

Use of the Tool
By agreeing to these terms, you agree to use the tool only for its intended purpose of collecting South Africans’ perspectives on service delivery issues. This tool does not report the service delivery issue to your local municipality, and it remains your responsibility to do so. You agree not to use the tool for reporting criminal activity or emergencies, or for your own purposes.

Legal Compliance
This tool complies with the Protection of Personal Information Act (POPIA), No. 4 of 2013. Following POPIA, no personal information is collected unless submitted by you with your consent, and measures are taken to ensure that the data remains anonymous and is used responsibly.

Disclaimer
The service is provided “as is” without any guarantees of completeness, accuracy, or timeliness. Users should not rely on this tool for reporting urgent or life-threatening issues or reporting criminal activity.

Changes to Terms
We reserve the right to update these terms and conditions as necessary. Continued use of this service indicates acceptance of any updates.

Explainability
This chatbot uses artificial intelligence. To find out more about how it works click here.

Contacts

Requests or further information can be obtained from the Policy Innovation Lab at Stellenbosch University (www.policyinnovationlab.sun.ac.za/contact/).

This research was approved by the Stellenbosch University Research Ethics Committee: Social, Behavioural and Education Research (ID 31685). If you have questions, concerns or complaints regarding your rights as a research participant, please contact Mrs Clarissa Robertson [cgraham@sun.ac.za; 021 808 9183] at the Division for Research Development.
For more details on South African privacy law, consult the Protection of Personal Information Act (POPIA), No. 4 of 2013, or the Information Regulator of South Africa’s official website here.