Like any new technology, AI brings both opportunities and risks. Following the release of his recent research, ‘Security Code Review with ChatGPT’, we spoke with Chris Anley, Chief Scientist at NCC Group, about AI, the future, and why you should proceed with caution.
AI systems are rapidly becoming so sophisticated and accessible that their pervasive “intelligence” now forms part of our daily lives. There are a few things that AI systems are very good at - finding patterns, putting things into categories, and generating data that’s like other data. This list may sound mundane, but when you put it into context, it translates to medical diagnostic systems making life-saving early diagnoses, driver assistance systems preventing accidents, universal language translation, and human-like customer service bots - it all suddenly seems extremely real and very impactful.
The opportunity
AI is already making the world better in many ways. We are only just beginning to see the possibilities, but there are clear "directions of travel" from what we've seen in the last decade or so.
- We can expect to see transformative effects in science and medicine, such as finding patterns in large amounts of data and therefore recognising danger signals much earlier, or seeing signals and connections that were previously lost in the noise.
- Similarly, in business, we will see better customer engagement, better recommendations, and more efficient business processes.
- Speech and image recognition are now almost solved problems.
- Autonomous transport isn’t far away, and at the very least, we will see significant improvements in safety from driver assistance systems.
- We are now able to automate more business processes than ever before – although it’s still necessary to have a “human in the loop”.
- We will see improvements in text generation systems like ChatGPT – larger input and output limits, “augmented” systems that pull live data from the web, and better quality outputs.
- In security, we’ll see better detection, more efficient monitoring, and automation of some audit activities.
The risk
However, along with these advantages, we need to remain aware of the potential disadvantages and dangers that come with AI.
The biggest risk is that we overestimate the capabilities of these systems. AI systems are statistical: they aren't brains, they are very large, very complex statistical engines. Automated decision-making is dangerous because people tend to think of computers as infallible, which increases the dangers from bias, assumptions and factual errors in the training data. Biases that are already present in our systems and data can be perpetuated and magnified by these systems; we need to be extremely careful not to automate our past mistakes.
Here are just a few security-related reasons why we should be cautious when it comes to AI:
Data Security
To be effective, AI systems require access to large amounts of data. In business, this typically means customer data, which may be sensitive or covered by stringent regulatory regimes. The data may need to be accessed by people within the organisation who didn’t previously need to interact with it directly (data scientists, software developers, ML engineers), and may now need to be sent to third-party systems and personnel. If models are trained on data from multiple customers, data that may previously have been “segregated” by customer is now present in a single pool. New systems and infrastructure are necessary to power AI systems, all of which introduce new attack surface.
AI systems require significant changes in the ways that organisations interact with their data; it’s important to recognise this and ensure that appropriate security regimes are in place.
Third Party Disclosure
When using a third-party system, one of the biggest security risks is that the third party may not take appropriate care of your data. This is a major issue for widespread adoption of AI techniques in the commercial world. Customer data, business-sensitive information and source code can be under heavy legal and regulatory protection, and passing this data to a third party without careful controls can be dangerous.
Deep Fakes
We’ve already seen hackers and criminals producing realistic "deep fake" videos based only on photos of individuals, and these fakes are becoming very convincing. There are reports of AI voice cloning systems being used to defraud businesses by attackers posing as executives and redirecting payments. We are now in a world where we can no longer trust the identity of the person we see on a video call, or speak to on the phone, and business processes – especially those related to payments – need to change to accommodate this new reality.
Phishing
Recent generative AI systems such as ChatGPT are excellent at creating plausible-sounding text and can generate variations on a theme very quickly and easily, without any tell-tale spelling or grammatical errors. This makes them ideal for generating variations of phishing emails and for targeted social engineering attacks.
Malware
Recent generative AI systems can also provide individuals who have little or no coding ability with a tool to adapt or fine-tune malware that others have created to make it more effective. Some coding and security skills are still required, but these tools can help point attackers in the “right” direction. What these new language models offer is a difference in ease, quantity, and speed: hackers are already capable of writing variations on phishing emails and malware, but using AI as a tool makes the process easier and quicker, and lowers the “barrier to entry” for writing malware.
Leakage of Training Data
Data that AI models are trained on can be retrieved by various attacks in which the attacker simply queries the system. This poses a problem if the system is trained on sensitive data such as email contents, financial information, or discussions of confidential topics.
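To make this concrete, here is a minimal, hypothetical sketch of the simplest form of such an attack: prompt the model with the known prefix of a record and check whether its completion reproduces the rest. Everything here (the `extraction_probe` helper, the `dummy_model`, the record itself) is invented for illustration; real extraction attacks are considerably more sophisticated, but the principle is the same.

```python
from typing import Callable

def extraction_probe(complete: Callable[[str], str], prefix: str, secret_hint: str) -> bool:
    """Prompt the model with a known prefix and check whether the
    completion reproduces material it should not be able to reproduce."""
    completion = complete(prefix)
    return secret_hint in completion

# Stand-in for a real model's query interface. This dummy has
# "memorised" exactly one record, purely for illustration.
def dummy_model(prompt: str) -> str:
    memorised = "From: finance@example.com\nAccount: 12-34-56 00112233"
    if memorised.startswith(prompt):
        return memorised[len(prompt):]
    return "(no completion)"

prefix = "From: finance@example.com\nAccount:"
if extraction_probe(dummy_model, prefix, "00112233"):
    print("Completion reproduced the sensitive suffix - training data is leaking")
```

If a deployed system fails a probe like this, mitigations include deduplicating and filtering the training data, or simply not training on the sensitive records at all.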
Fundamentals
Finally, when pursuing the latest innovations it is easy to neglect the foundations that newer systems must be built on. To that point, here is some specific guidance to keep in mind:
- Patch systems.
- Authenticate, and use multifactor authentication wherever possible.
- Have a security training programme for all staff, especially executives. It's important that everyone understands that ANYONE can be hacked; blaming the victim is counterproductive. Everyone in the organisation needs to be pointing at the problem rather than pointing at each other. This is the only way to ensure that incidents are reported immediately and can then be dealt with effectively.
- Control access.
- Log activity in your systems and use those logs to detect and respond to malicious activity.
- Make full use of web application firewalls and other security features of your cloud platform.
- Run NCC Group's open source ScoutSuite (a multi-cloud security auditing tool) against your cloud estate; a minimal invocation sketch follows this list.
- Have the security of your systems reviewed by external professionals, and give those professionals access to your source code to help with their review. While ensuring a clear focus on your areas of concern, also give them a broad scope to investigate any other security weaknesses they find in your organisation.
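As a rough illustration of the ScoutSuite bullet above, here is a minimal sketch that shells out to the `scout` entry point installed by `pip install scoutsuite`. The profile name is an assumption for illustration; check `scout --help` for the options your version supports.

```python
import subprocess

# Minimal sketch: invoke ScoutSuite's AWS audit from Python. Assumes
# ScoutSuite is installed (pip install scoutsuite), which provides the
# `scout` command-line entry point; "audit" is a placeholder AWS
# credentials profile name - adjust for your environment.
result = subprocess.run(["scout", "aws", "--profile", "audit"])

# ScoutSuite writes an HTML report locally; a non-zero exit status
# means the run should be investigated via the console output.
print(f"ScoutSuite finished with exit code {result.returncode}")
```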
Conclusions
In conclusion, AI systems are beneficial in many ways, but alongside opportunity comes risk, and you should proceed with caution when using them. To avoid security pitfalls, ML should form part of a process that includes humans for oversight and review. The "decision" can then be explained in the context of the whole process: the training data, the algorithm, the checks and balances, and the human review.
For further NCC Group research pieces relevant to this topic, see below:
- Security Code Review With ChatGPT
- Exploring Prompt Injection Attacks
- Machine Learning 102: Attacking Facial Authentication with Poisoned Data
- Machine Learning 101: The Integrity of Image Misclassification
- Five Essential Machine Learning Security Papers
- Whitepaper: Practical Attacks on Machine Learning Systems
- Machine Learning for Static Analysis of Malware
- Practical Machine Learning for Random (Filename) Detection