Authz Libraries Are Not Fast especially for a Microservices world.

Updated on December 06, 2019 at the 16th hour

DISCLAIMER: All views are considered my own and you should not draw any conclusions on associates.

What makes me want to write this post? Ah, I've been working on my own product and scouring around for a good answer to authorization, particularly authorization libraries in Node.js. Some look really bad judging by their GitHub pages alone, and others are pretty good. All of them assume that authz rules are stored somewhere (code, memory, db), which is great. Some are flexible about where policies are stored and offer plugins. A lot tend to be rigid about role based authz (RBAC) and offer too little attribute based authz (ABAC). None address how to do fast authorization, so that the authz check is not executed again for the same request as it traverses the system. Hell, you may not even want to do any more authorization checks unless policy data has changed.

The libraries I liked tended towards attribute based authz, which means that the subject, action and resource can have attributes. Some libraries used withdrawal-over-limit checks as examples. Probably not a good example. Maybe that should be hardcoded, since the condition will never change apart from the limit threshold, is easy to write and is very much resistant to any library bugs that could occur. The examples I liked to see were the "allow if user.id == post.owner or role is admin" kind of checks. These could be extended to include privacy checks in the future, such as "allow if user in post.privacyList", where the privacy list could be a reference to a bloom filter or lazy cached set. It could even be a complex join that is required!
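To make that kind of check concrete, here is a minimal sketch in Python. It is not the API of any library I looked at; `User`, `Post`, `can_edit`, `can_view` and the `privacy_list` field are names I made up for illustration, and a plain set stands in for the bloom filter or lazy cached set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: str
    roles: frozenset = frozenset()


@dataclass(frozen=True)
class Post:
    owner: str
    # Hypothetical privacy list: a plain frozenset stands in for a
    # bloom filter or lazily cached set.
    privacy_list: frozenset = frozenset()


def can_edit(user: User, post: Post) -> bool:
    """allow if user.id == post.owner or role is admin"""
    return user.id == post.owner or "admin" in user.roles


def can_view(user: User, post: Post) -> bool:
    """allow if user in post.privacyList (owners and admins always pass)"""
    return can_edit(user, post) or user.id in post.privacy_list
```

The point is that the policy only reads attributes of the subject and the resource, so swapping the set for something smarter later does not change the rule itself.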

There is route authz, resource authz and service to service authz. I'm focusing on resource authz.

I'd imagine most web applications falter in performance with simple resource authz, which limits how and where they implement authorization checks.

Performance

Can one imagine how many times Facebook or Google's authorization systems must get hammered by requests for "Does user X have access to resource?", where the resource may have a privacy setting of a set of friends or a "shared with..." list? How crazy would it be to have to hammer the service or database for the latest information in order to satisfy policy checks? Those policy checks, I assume, are performed on eventually consistent data, so there may be a window where allow is returned when deny should have been, but eventually deny will always be returned.

How does one implement that system anyway? Imagine answering the questions "Can user1 invite user2 to group?" "Can user1 include user2 in Calendar?" "Can user1 communicate with user2?" "Is user1 blocked by user2?" "Can user1 access this resource which is restricted to a dynamically changing list?" How much of this can you cache and invalidate within a reasonable amount of time? Just eat the cost I suppose. It is safe to say that I don't know yet how to implement such a system other than to hardcode everything.
- The dynamically changing list would not scale for resources with sudden traffic. A DDoS aimed at that weak spot will knock all your services offline, especially in the case of authz as a service.
- Google has a paper (https://ai.google/research/pubs/pub48190) about their authorization system, which offers external consistency. They deal with hotspots by using ACL snapshots, a fancy word for cache.
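For the "cache and invalidate" question, this is roughly what I have in mind, as a small Python sketch. It is my own toy and not how Google's system works: `DecisionCache`, its `ttl_seconds` parameter and the `invalidate` hook are all assumptions. Decisions are reused until a TTL runs out or until someone signals that policy data changed, which is exactly the eventually consistent window mentioned above.

```python
import time
from typing import Callable, Dict, Tuple

Key = Tuple[str, str, str]  # (subject, action, resource)


class DecisionCache:
    """Caches allow/deny decisions until the TTL expires or policy data changes."""

    def __init__(
        self,
        check: Callable[[str, str, str], bool],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check          # the slow, authoritative policy check
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._version = 0            # bumped whenever policy data changes
        self._entries: Dict[Key, Tuple[bool, float, int]] = {}

    def invalidate(self) -> None:
        """Call when policy data changes; entries from older versions are ignored."""
        self._version += 1

    def is_allowed(self, subject: str, action: str, resource: str) -> bool:
        key = (subject, action, resource)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None:
            decision, expires_at, version = hit
            if version == self._version and now < expires_at:
                return decision      # possibly stale, but only within the TTL
        decision = self._check(subject, action, resource)
        self._entries[key] = (decision, now + self._ttl, self._version)
        return decision
```

A short TTL bounds how long a stale allow can survive, while `invalidate` covers the case where you do know the policy data changed. A single global version is the bluntest possible invalidation; per-resource versions would be the obvious next step.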

Graph based access control. One could express ACLs more efficiently in a relation graph. User 1 wants to edit a blog post, but can only modify it if they are part of user 2's group, which is some random group. If you had the relation stored in a graph db then it would be a simple check of "allow if User X MEMBER OF Post Group". Context data could be provided for any part of that if needed. Role based authz wouldn't be appropriate here since user.assignedRoles would get big when random users can create and delete groups. The future of authz is graph 😁, so where are the libraries? Only kidding, it is complementary to existing techniques. For instance, how do you express an exclusivity policy where only a limited number of users are allowed past a gate? I don't want to store counters in a graph DB, so this would be an attribute check.
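A rough Python sketch of the "MEMBER OF" check, with an in-memory dict pretending to be the graph db. The node names, the `MEMBER_OF` edge map and the `can_edit_post` helper are made up for illustration; a real graph database would answer the membership question with a query instead of this hand-rolled search.

```python
from collections import deque
from typing import Dict, Set

# Hypothetical relation graph: an edge "a -> b" means "a is a MEMBER OF b".
MEMBER_OF: Dict[str, Set[str]] = {
    "user:1": {"group:friends-of-user2"},
    "group:friends-of-user2": {"group:post42-editors"},
    "user:3": {"group:lurkers"},
}


def is_member_of(graph: Dict[str, Set[str]], subject: str, target: str) -> bool:
    """Breadth-first search along MEMBER OF edges, following nested groups."""
    seen = {subject}
    queue = deque([subject])
    while queue:
        node = queue.popleft()
        for parent in graph.get(node, ()):
            if parent == target:
                return True
            if parent not in seen:  # guards against membership cycles
                seen.add(parent)
                queue.append(parent)
    return False


def can_edit_post(
    graph: Dict[str, Set[str]],
    user: str,
    post_group: str,
    active_editors: int,
    max_editors: int,
) -> bool:
    """Graph check for membership, plain attribute check for the counter."""
    return is_member_of(graph, user, post_group) and active_editors < max_editors
```

The membership part lives in the graph, while the "only N users past this gate" part stays an ordinary attribute comparison, which is the complementary split described above.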

Eliminate duplicate access control checks to a resource across application boundaries. I'm not sure yet how to go about it tbh. The bottleneck will be gathering information for the subject-action-resource check in multiple places for the same request context. One idea is to store authorizations for a request in cache, so that any duplicate authz checks stemming from a request context are quickly returned. Another might be to do the appropriate access control checks at the origin only, while enforcing some way of checking at the service level that the "grants" have been made. Services could be originators of access control checks if desired. I remember one unnecessary idea at work that put service 2 in front of service 1, where service 1 talks with the database while service 2 does the access control checks and some business logic. It makes sense, but service 1's endpoints will have to be guarded anyway. With a propagated authz request context, one could create a single service whose endpoints come in different versions with varying levels of access control requirements (e.g. the request's session user must already have access to the resource, as in setUserDetailsWithSessionUser, or the request must carry some authz to perform the operation, as in setUserDetails). These ideas could be super inefficient. I dunno.
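One way the "check at the origin, verify the grant downstream" idea could look, sketched in Python with the standard `hmac` module. This is only an assumption about how it might be done: the shared secret, the `issue_grant` and `verify_grant` helpers and the newline-joined message format are all hypothetical, and a real setup would also need key rotation and an expiry on the grant.

```python
import hashlib
import hmac

# Assumption: the origin and the downstream services share this secret.
SHARED_KEY = b"demo-key-do-not-use"


def _message(request_id: str, subject: str, action: str, resource: str) -> bytes:
    # Assumes none of the fields contain a newline, otherwise two different
    # tuples could produce the same message.
    return "\n".join((request_id, subject, action, resource)).encode("utf-8")


def issue_grant(request_id: str, subject: str, action: str, resource: str,
                key: bytes = SHARED_KEY) -> str:
    """Origin: run the real authz check first, then sign the decision."""
    msg = _message(request_id, subject, action, resource)
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_grant(grant: str, request_id: str, subject: str, action: str,
                 resource: str, key: bytes = SHARED_KEY) -> bool:
    """Downstream service: recompute the signature instead of redoing the check."""
    expected = issue_grant(request_id, subject, action, resource, key)
    # Constant-time comparison on bytes to avoid leaking how much matched.
    return hmac.compare_digest(grant.encode("utf-8"), expected.encode("utf-8"))
```

The grant is bound to the request id, so it cannot be replayed for another request or another resource. Verifying it costs one hash and no lookups, which is the whole appeal compared with gathering the subject and resource data again in every service.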

Conclusion

Choice of authz strategy depends on where it will be used. IMO, simple RBAC is fine for admin applications where roles are operations (blog:post for instance rather than "editor"). Typical ABAC (with database/service lookups) will be fine for checks on random (user generated) resources. Graph based access control is where complex applications end up, since the same graph layout can be shared by multiple services and queries are better expressed.

"Start small and inefficient until you can no longer ignore the problem" is the typical advice. Hopefully by that time, technologies that were not available before can be considered. Can't say I am typical or follow typical advice since I prefer a small footprint and low cost.

What am I gonna do? Try some of the libraries I liked and keep gathering ideas for an ideal authorization library that I can port over to Rust and Java. Sometime in the near future I'll make my own authz library that encodes the patterns. Can't say I will open source it. I'd rather charge for it by providing it as a service instead. Man's gotta eat.

Notes

NPM: I think NPM should seriously consider doing some curation. Lots of libraries use whatever keyword. I searched for "keyword:abac policy" and the results include libraries that don't do ABAC but use the keyword anyway, with no mention of it in the description. Waste of space and time. I assert that libraries that do that are
