
Security Tips For Your AI Cloud Infrastructure

Written by Javier Garcia

As AI continues to expand, more and more companies are looking to take advantage of its powerful capabilities. However, building AI from scratch is far from trivial: algorithm complexity and data requirements, among other factors, can be overwhelming.

To meet this need, cloud providers have added AI services to their catalogs, giving birth to AI as a service (AIaaS). With this option, companies can enjoy the benefits of AI without worrying about the complex infrastructure underneath.

However, as we all know, default configurations for all kinds of services are rarely as secure as they could be. At NCC Group, based on our experience with AI services in the cloud, we have developed a testing methodology to help ensure that our clients use these services securely.

In this post, we share several tips about how to secure your AI environments in the cloud.

Follow the Principle of Least Privilege

All the standard cloud security controls we are already familiar with, such as Identity and Access Management (IAM), continue to apply in AIaaS. That means excessive permissions still play a significant role in the security of an AIaaS environment, and it should be ensured that configured services cannot access data or perform operations outside their intended scope.

Regarding IAM, companies should apply the principle of least privilege (PoLP) when assigning permissions to users and workloads.1 Only the permissions necessary to perform their intended tasks should be granted. This reduces the attack surface of the AI environment and the impact of potential compromises, and also safeguards it against human error.
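
As an illustration, the sketch below (Python with boto3; the policy name, endpoint ARN and account ID are placeholders) creates an IAM policy scoped to invoking a single SageMaker endpoint, rather than granting broad sagemaker:* permissions:

import json
import boto3

iam = boto3.client("iam")

# Hypothetical example: a policy that only allows invoking one specific
# SageMaker endpoint, instead of "sagemaker:*" on all resources.
least_privilege_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["sagemaker:InvokeEndpoint"],
            "Resource": "arn:aws:sagemaker:eu-west-1:123456789012:endpoint/my-endpoint",
        }
    ],
}

iam.create_policy(
    PolicyName="invoke-my-endpoint-only",  # placeholder name
    PolicyDocument=json.dumps(least_privilege_policy),
)

The same idea applies to workload identities in Azure and GCP: grant the narrowest role that still lets the user or workload do its job.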

Additionally, some cloud providers offer AI-related services for collaborative notebook environments. Access to notebooks is handled through permissions. Sharing permissions should also be applied following the principle of least privilege: to the right principals and at the proper scope.2 Overly permissive sharing, together with a lack of segregation between test and production environments, may lead to compromises of the AI environment, such as code execution.

Isolate your services

When cloud services are exposed to the Internet, they become potential targets for attacks. Similarly, a service with non-restrictive access to internal networks or subnets may increase the attack surface in case of a potential compromise. This also applies to AI services, which often come with suboptimal default settings.

Regarding network access, as with other services in the cloud, AI services are often configured with public access by default. Therefore, unless further restrictions are applied, AI services will be exposed to the open Internet. This is not recommended, unless it is intended behavior, since it increases the attack surface of the AI environment and may result in leakage of sensitive data to unauthorized parties if other security measures are not in place.
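
As an example, a minimal boto3 sketch (instance name, role ARN, subnet and security group IDs are placeholders) that provisions a SageMaker notebook instance inside a private subnet with direct Internet access disabled could look like this:

import boto3

sagemaker = boto3.client("sagemaker")

# Hypothetical example: keep the notebook off the open Internet by placing
# it in a private subnet and disabling direct Internet access. Outbound
# traffic then has to go through whatever the VPC allows (e.g. a NAT
# gateway or VPC endpoints).
sagemaker.create_notebook_instance(
    NotebookInstanceName="private-research-notebook",        # placeholder
    InstanceType="ml.t3.medium",
    RoleArn="arn:aws:iam::123456789012:role/NotebookRole",   # placeholder
    SubnetId="subnet-0123456789abcdef0",                     # private subnet
    SecurityGroupIds=["sg-0123456789abcdef0"],               # restrictive SG
    DirectInternetAccess="Disabled",
)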

Also, with certain providers, AI services tend to use the default virtual networks in the cloud tenant.3 This goes against security best practices, since these networks are typically configured with suboptimal firewall rules. Protection against certain attacks requires using firewalls or proxies to limit how many inferences attackers can perform per second. Therefore, ensuring that clients cannot connect directly to the AI capabilities is an essential step in securing the AI environment.
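
As a simple illustration of the kind of throttling such a proxy could apply, the sketch below (plain Python, not tied to any provider; the threshold is illustrative) rejects clients that exceed a fixed number of inference requests per second:

import time
from collections import defaultdict, deque

MAX_REQUESTS_PER_SECOND = 5  # illustrative threshold

# Sliding window of request timestamps per client identifier.
_recent_requests: dict[str, deque] = defaultdict(deque)

def allow_inference(client_id: str) -> bool:
    """Return True if this client is still under the per-second limit."""
    now = time.monotonic()
    window = _recent_requests[client_id]
    # Drop timestamps older than one second.
    while window and now - window[0] > 1.0:
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_SECOND:
        return False  # throttle: too many inferences in the last second
    window.append(now)
    return True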

Finally, for some AI services, it is possible to define templates that specify the configuration of the underlying VMs. The same settings that are usually reviewed in cloud configuration assessments also apply here.4 It is important to ensure that templates with hardened settings are used as a secure base for notebooks and training infrastructure, so that underlying components are properly protected.5 For instance, a good starting point in GCP Vertex AI would be to implement the extended security posture for secure AI,6 which helps protect Gemini and Vertex AI workloads by enforcing a series of settings.

Monitor your workloads

Monitoring and auditing are essential components of cloud security. They not only help companies detect security incidents faster, reducing reaction time, but also provide better insight into the AI environment.

Certain AI-specific attacks, such as model inversion7 or model theft,8 involve sending massive numbers of inference requests and analyzing their responses, so proper monitoring helps detect them and respond accordingly.9

Proper monitoring of the model predictions should be configured to detect the signatures of these kinds of attacks. According to OWASP, model inversion should be monitored by tracking the distribution of inputs and outputs, among other signals.10 For model theft, regular monitoring and auditing of the model's use can also help detect and prevent theft by revealing when an attacker is attempting to access or steal the model.11
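
A minimal sketch of the idea, tracking how far the distribution of recent model outputs drifts from a known baseline (class names, probabilities and the alert threshold are illustrative):

from collections import Counter
import math

def kl_divergence(observed: Counter, baseline: dict) -> float:
    """KL divergence of the observed class distribution from the baseline."""
    total = sum(observed.values()) or 1
    divergence = 0.0
    for label, base_p in baseline.items():
        obs_p = observed.get(label, 0) / total
        if obs_p > 0:
            divergence += obs_p * math.log(obs_p / base_p)
    return divergence

# Illustrative baseline distribution derived from the training data.
baseline_distribution = {"cat": 0.5, "dog": 0.5}
recent_predictions = Counter(["cat"] * 99 + ["dog"] * 1)  # suspiciously skewed

if kl_divergence(recent_predictions, baseline_distribution) > 0.5:  # illustrative threshold
    print("Alert: prediction distribution has drifted; possible probing activity")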

Monitoring and auditing can also help avoid incurring extra costs due to unexpected workloads, as in a Denial-of-Wallet (DoW) attack, where an attacker drives up usage to inflict financial damage. The main cloud providers offer budgets and budget alerts that can limit the financial impact of a DoW attack; if budget limits are enforced, the attacker may still achieve a Denial of Service (DoS) once the limit is reached, but the monetary loss will be mitigated.12
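
For AWS, a hedged boto3 sketch (account ID, amount and e-mail address are placeholders) that creates a monthly cost budget with an alert at 80% of the limit might look like this:

import boto3

budgets = boto3.client("budgets")

# Hypothetical example: cap monthly AI spend and alert at 80% of the limit.
budgets.create_budget(
    AccountId="123456789012",  # placeholder account
    Budget={
        "BudgetName": "ai-services-monthly-cap",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "security@example.com"}
            ],
        }
    ],
)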

It should also be noted that AI services, like the rest of the cloud services, can be integrated with the cloud provider's monitoring solutions, although for most services this is not configured out of the box and some extra configuration steps must be performed.13,14,15
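
As one example of that extra configuration, the sketch below (boto3; endpoint name, threshold and SNS topic are placeholders) creates a CloudWatch alarm on the invocation count of a SageMaker endpoint, so that unusual spikes are flagged:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical example: alarm when an endpoint receives an unusually high
# number of invocations in a five-minute window, which may indicate probing
# or a Denial-of-Wallet attempt.
cloudwatch.put_metric_alarm(
    AlarmName="ai-endpoint-invocation-spike",                # placeholder
    Namespace="AWS/SageMaker",
    MetricName="Invocations",
    Dimensions=[
        {"Name": "EndpointName", "Value": "my-endpoint"},    # placeholder
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=10000,                                          # illustrative threshold
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:security-alerts"],  # placeholder
)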

Protect your data

Without data, there is no AI. Protecting the data employed in AI services should be one of the highest priorities for companies. Datasets typically contain sensitive information that should be properly protected. For this reason, encrypting the dataset at rest with customer-managed keys as an additional protection layer is strongly recommended.16
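
As an illustration, a minimal boto3 sketch (bucket, object key, file and KMS key ARN are placeholders) that uploads a training dataset encrypted at rest with a customer-managed KMS key:

import boto3

s3 = boto3.client("s3")

# Hypothetical example: store the training dataset encrypted at rest with a
# customer-managed KMS key rather than the default S3-managed keys.
with open("training-data.csv", "rb") as dataset:
    s3.put_object(
        Bucket="my-ai-training-data",                 # placeholder bucket
        Key="datasets/training-data.csv",
        Body=dataset,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="arn:aws:kms:eu-west-1:123456789012:key/your-key-id",  # placeholder
    )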

Due to the nature and intended usage of AI services, it is common for them to handle sensitive data and Personally Identifiable Information (PII). From chatbots to classifiers, text-to-speech and all other services, sensitive data should be handled securely. For this purpose, PII redaction and similar features offered by the cloud provider should be enabled in order to comply with the strong regulations to which this kind of data is subject. Most AI services, such as language processing, text-to-speech, chatbots and image classifiers, offer these security settings.17,18,19
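
As an example of this kind of feature, the sketch below uses Amazon Comprehend's PII detection (referenced above) to redact detected entities before text reaches downstream AI components; the input string is illustrative:

import boto3

comprehend = boto3.client("comprehend")

def redact_pii(text: str) -> str:
    """Replace PII spans detected by Amazon Comprehend with their entity type."""
    response = comprehend.detect_pii_entities(Text=text, LanguageCode="en")
    redacted = text
    # Replace from the end of the string so earlier offsets stay valid.
    for entity in sorted(response["Entities"], key=lambda e: e["BeginOffset"], reverse=True):
        start, end = entity["BeginOffset"], entity["EndOffset"]
        redacted = redacted[:start] + f"[{entity['Type']}]" + redacted[end:]
    return redacted

print(redact_pii("My name is Jane Doe and my email is jane.doe@example.com"))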

Conclusion

In this post we have given a brief overview of some security concerns around AI in the cloud. Although AI services have opened new windows of opportunity for malicious actors, the standard cloud configuration review still plays a massive part in keeping AI services secure against unauthorized access and misuse. In making the leap to AI services, we must not forget the standard principles of security in the cloud.


1 PoLP in Azure: https://learn.microsoft.com/en-us/entra/id-governance/scenarios/least-privileged

PoLP in AWS: https://docs.aws.amazon.com/wellarchitected/latest/framework/sec_permissions_least_privileges.html

PoLP in GCP: https://cloud.google.com/iam/docs/using-iam-securely#least_privilege

2 https://docs.aws.amazon.com/sagemaker/latest/dg/security_iam_id-based-policy-examples.html#security_iam_service-with-iam-policy-best-practices

https://cloud.google.com/vertex-ai/docs/general/access-control#custom-roles

3 Secure Vertex AI Workbench instances: https://cloud.google.com/vertex-ai/docs/workbench/instances/introduction#secure_your_instance

4 GCP Colab Enterprise templates: https://cloud.google.com/colab/docs/runtimes

5 https://docs.aws.amazon.com/sagemaker/latest/dg/mkt-algo-model-internet-free.html

6 Predefined posture for secure AI, extended: https://cloud.google.com/security-command-center/docs/security-posture-extended-secure-ai-template

7 https://owasp.org/www-project-machine-learning-security-top-10/docs/ML03_2023-Model_Inversion_Attack

8 https://genai.owasp.org/llmrisk2023-24/llm10-model-theft/

9 AWS recommendations: https://docs.aws.amazon.com/wellarchitected/latest/machine-learning-lens/mlsec-11.html

GCP Vertex AI Model Monitoring: https://cloud.google.com/vertex-ai/docs/model-monitoring/overview

Monitoring Azure OpenAI: https://techcommunity.microsoft.com/blog/azure-ai-services-blog/introducing-risks--safety-monitoring-feature-in-azure-openai-service/4099218

10 https://owasp.org/www-project-machine-learning-security-top-10/docs/ML03_2023-Model_Inversion_Attack

11 https://owasp.org/www-project-machine-learning-security-top-10/docs/ML05_2023-Model_Theft

12 https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html

https://techcommunity.microsoft.com/blog/finopsblog/how-to-budget-your-azure-cloud-spend-with-microsoft-cost-management/4153963 

https://cloud.google.com/billing/docs/how-to/budgets

13 Cloud Monitoring metrics for Vertex AI: https://cloud.google.com/vertex-ai/docs/general/monitoring-metrics

14 Monitor Azure AI Services: https://learn.microsoft.com/en-us/training/modules/monitor-ai-services/

15 AWS SageMaker monitoring and incident response: https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-incident-response.html

16 Vertex AI CMEK: https://cloud.google.com/vertex-ai/docs/general/cmek

17 Redact PII in Azure AI Services: https://learn.microsoft.com/en-us/azure/ai-services/language-service/personally-identifiable-information/how-to-call

18 PII in GCP with DLP: https://cloud.google.com/architecture/de-identification-re-identification-pii-using-cloud-dlp

19 Redact PII in AWS ML: https://aws.amazon.com/blogs/machine-learning/detecting-and-redacting-pii-using-amazon-comprehend/

https://docs.aws.amazon.com/transcribe/latest/dg/pii-redaction.html