Amazon Bedrock implementation leveraging AWS Lake Formation Tag Based Access Control (LF-TBAC)
While an easy access to pretrained LLM’s revolutionized applications, they have also led us to scenarios which are difficult and complicated. The challenges that LLM’s pose in terms of a safe and secure deployments, can often become the biggest hurdle in LLM implementation.
Some of the critical points in a typical LLM deployment are -
- LLM’s do not differentiate among users; and respond to every question leveraging their training data.
- Organizations often implement LLM use cases using Retrieval Augmented Generation (RAG) technique which optimizes LLM responses by providing it access to the organization data.
- Prompt engineering furthers the RAG implementation by ensuring that the LLM responses are limited to the organization data that it has access to.
- While this “fencing” of LLM responses using RAG and Prompt engineering is useful, it also leads to a scenario where every user now has access to all the data that the LLM has access to. This means all users have access to all data – Public, Internal Restricted and Confidential; which may lead to unintended consequences.
In this blog, let us look at how we can implement an LLM use case in AWS using Amazon Bedrock and AWS Lake Formation. We will use the AWS Lake Formation Tag Based Access Control ((LF-TBAC) to ensure data access based on user entitlement.
For this example, let us use a scenario where LLM responses are fenced around the user data entitlements in terms of the data classification of - Public, Internal, Restricted and Confidential.
- Using Lake formation Tags (LF-TBAC), define tags and the values for those tags. In this example let’s define 4 tags with values – Public, Internal Restricted and Confidential.
- Assign these 4 tags to your resources – Data Lake Database, tables, columns. The tags are hierarchical and so if you tag a DB, all tables and columns in that DB by default will inherit the same tag. This means if you tag as DB as Confidential, all tables and table columns under that DB will inherit the same tag as confidential. You can override this if required, but this is the default scenario.
- Create 4 IAM Roles with similar names as the tag values – IAM-Role-Public, IAM-Role-Internal, IAM-Role-Restricted and IAM-Role-Confidential.
- Next define the policies for the tag and scale this permission model. Grant permissions to principals/IAM Roles to assign LF-Tags to resources. You can add Public tag permissions to IAM-Role-Public etc.
Your Lake Formation Tag Based Access Control is now ready to authorize access to catalog resource and S3 objects. When users access the data, LF authorizes access based on the permissions set using the tags.
AWS Lake Formation uses the Tag Based Access Control to compare the Data Tags with the User Role Tags and data is returned if the Data Tag and the Role Tag are matched. For example – if the User Role is Public, Lake formation will return only data tagged as Public.
Conclusion:
Using Lake Formation -TBAC is one way to ensure that LLM responses to users are based on the user data entitlements only. This means, a user with Public entitlement will see LLM responses with data classified as Public only, while a User with Restricted access can see LLM responses built using Restricted data.
~Narendra V Joshi
Comments
Post a Comment