Globally, governments are using tools that collect citizen-generated data to better understand the needs of their constituents. The Policy Innovation Lab has been developing a WhatsApp chatbot for documenting service delivery issues in South Africa. The purpose of this tool is to improve national policy on service delivery by making citizen-generated feedback timely, accessible and useful. Based on the needs of the government, we are in the early stages of repurposing this tool to collect data from business owners and investors about the policies, laws and regulations that impede their investments in the South African market.

Earlier this year, as part of our commitment to using AI responsibly, the Policy Innovation Lab asked OpenUp to perform a legal and ethical review of the chatbot, supported by GIZ’s Data4Policy initiative. This review complemented the internal ethics and data management processes required by Stellenbosch University. It was intended not only to ensure that our chatbot complied with South Africa’s laws and regulations, but also to scrutinize our application against the highest ethical standards. You can read more about OpenUp’s approach and the key aspects they examined in their analysis. Below we reflect on some of the lessons we have learnt through this process.

Lesson 1: AI ethics reaches far beyond the technical implementation

Designing a chatbot that can hold a conversation and collect data is relatively easy nowadays. Knowing what to do with that data to ensure that the user’s data and privacy are respected is more challenging, especially when the chatbot is designed with two end-users in mind: first, the citizens who use the tool and share their knowledge with us; and second, the policymakers to whom we provide the data analysis.

Technically, the chatbot is designed with two types of data storage: short-term storage, where the user’s WhatsApp number and current conversation are held, and long-term storage, where the final conversation data is kept. To get from the initial (short-term) conversation to the final (long-term) dataset, we use AI to de-identify the data, removing personal information such as the user’s name and cell-phone number and aggregating the location of the issues so that no individual user can easily be identified. Once the data is processed, it is moved to long-term storage, the short-term storage is deleted, and deletion requests are sent to the other applications involved. The design of the chatbot and its task sequencing is built around this principle: a quick, responsive chatbot that uses short-term storage to collect data, and long-term de-identified data for analysis.
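To make this flow concrete, here is a minimal Python sketch of the de-identification step. It is illustrative only: the field names, the in-memory stores, and the simple coordinate-rounding rule standing in for location aggregation are all assumptions made for the example, and the real system uses AI to detect personal information rather than fixed fields.

```python
from dataclasses import dataclass

@dataclass
class Conversation:
    whatsapp_number: str  # personal identifier: held only in short-term storage
    name: str             # personal identifier: held only in short-term storage
    issue: str            # the service delivery issue reported
    lat: float            # precise report location
    lon: float

def deidentify(conv: Conversation) -> dict:
    """Drop personal identifiers and coarsen the location before long-term storage."""
    return {
        "issue": conv.issue,
        # Rounding to two decimal places (roughly 1 km) stands in for the real
        # aggregation step, so no individual reporter can be pinpointed.
        "area_lat": round(conv.lat, 2),
        "area_lon": round(conv.lon, 2),
    }

# Short-term storage: the live conversation, including identifying details.
short_term = [Conversation("+27820000000", "Thandi", "Water outage in ward 12",
                           -33.9321, 18.8602)]

# Move de-identified records to long-term storage, then delete the originals.
long_term = [deidentify(c) for c in short_term]
short_term.clear()  # in production, deletion requests also go to connected services

print(long_term)  # [{'issue': 'Water outage in ward 12', 'area_lat': -33.93, 'area_lon': 18.86}]
```

The key design point is that the identifying fields never leave short-term storage; only the de-identified record persists.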

Most of the Lab’s focus concerning the tool’s ethical use has been on these technical processes. As a result, OpenUp’s investigation did not find any major issues there. However, their analysis revealed two aspects of the system, extending far beyond the codebase, that we had not spent much time considering. This is a valuable reminder to occasionally step back from the computer and think about the bigger picture and the long-term view.

Lesson 2: In academia it is difficult to implement long-term digital projects

One of the issues OpenUp raised concerned a financial sustainability plan: in other words, a plan to answer the question “What happens to our tool if the funds run out or people leave?” This is a challenge within academia in general, particularly when you rely on external grants and funding that typically last only a few years. It then becomes difficult to create long-term financial plans that sustain the impact of your research and projects. Furthermore, academics themselves are often on short-term contracts, making it difficult to maintain projects over long periods. The pressure to publish also means it may not be possible to find academics to run, maintain or update existing projects that are unlikely to lead to new research outputs.

Lesson 3: There are trade-offs between user-friendliness and ethical use

OpenUp (and others) rightly pointed out that we would improve the user experience and trust in our chatbot if we could send information back to citizens about how their data is being used. However, this is not possible unless we keep their WhatsApp numbers and link them to the data they give us. In our approach, we prioritized the user’s privacy over the ability to provide personalized feedback. There are several ethical and pragmatic reasons for this choice, tied to the overall purpose of the tool: supporting policymakers. By comparison, a business chatbot whose goal is to improve customer service would have different priorities, with user engagement prioritized over data privacy and security.
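As a hypothetical illustration of this trade-off (the record layouts below are invented for the example, not our actual schema), compare what long-term storage would need to hold under each priority:

```python
# Invented record layouts illustrating the trade-off; not the chatbot's actual schema.

# What we store: fully de-identified, so feedback to the reporter is impossible.
privacy_first_record = {
    "issue": "Water outage",
    "area": "Stellenbosch",  # aggregated location only
}

# What personalized feedback would require: retaining the identifying link,
# which means the stored data is no longer anonymous.
feedback_capable_record = {
    "issue": "Water outage",
    "area": "Stellenbosch",
    "whatsapp_number": "+27820000000",  # kept so updates can be sent back
}
```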

These are just a few key lessons that stand out, and we extend our thanks to OpenUp and Data4Policy for their contributions to our project. As the Lab begins repurposing the chatbot to help the government understand where it may reduce red tape for investors, we will continue to strive towards developing responsible AI for transformative policymaking, informed by the lessons we continue to learn.

Published On: July 23, 2025 | Categories: Data Science & Public Policy, News