desa/access_control.md

Last active January 18, 2019 16:18

Ideas

access_control.md

Access control

There are a number of concepts that are currently loosely related to access control.

Authorizations
Sessions
User Resource Mapping

A request is currently authorized in one of two ways.

Case 1. a token (part of an authorization) is provided

We look up the authorization by the token provided. This authorization contains the full set of permissions granted to it.
We put the authorization on context, and pass it down the chain of functions.
Somewhere along that chain, we construct a permission and ask whether the authorization allows the action.

Case 2. a session key is provided

We look up the session by the key provided.
We grab the user off of the session and use the user resource mappings to resolve that user to a set of permissions.
We put the session on context, and pass it down the chain of functions.
Somewhere along that chain, we construct a permission and ask whether the session allows the action.

While this sounds relatively straightforward, it has been hard to explain the model to others, and in practice it has led to a bit of awkwardness in the implementation.

This awkwardness is the result of needing to look at the resource in order to authorize access. There are two main ways this manifests itself.

When we are a member of an organization and are given only the ID of a resource, we have to retrieve the resource to check that it belongs to the organization before we can authorize the action.
When attempting to authorize a find-many request, we have to fetch all resources and construct a permission for each one (filtering out results that aren't authorized).
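To make the find-many case concrete, here is a minimal sketch of the fetch-everything-then-filter pattern. The `Resource`, `Permission`, `Authorizer`, and `orgAuthorizer` types are hypothetical stand-ins invented for this illustration, not the real types in the codebase.

```go
package main

import "fmt"

// Resource, Permission and Authorizer are hypothetical stand-ins,
// not the real types.
type Resource struct {
	ID    string
	OrgID string
}

type Permission struct {
	Action string
	OrgID  string
}

type Authorizer interface {
	Allowed(p Permission) bool
}

// orgAuthorizer allows any action inside the listed organizations.
type orgAuthorizer struct {
	orgs map[string]bool
}

func (a orgAuthorizer) Allowed(p Permission) bool { return a.orgs[p.OrgID] }

// filterAuthorized mirrors the find-many case: every resource has already
// been fetched, and one permission is built per resource to filter the list.
func filterAuthorized(a Authorizer, all []Resource) []Resource {
	var out []Resource
	for _, r := range all {
		if a.Allowed(Permission{Action: "read", OrgID: r.OrgID}) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	all := []Resource{
		{ID: "b1", OrgID: "org1"},
		{ID: "b2", OrgID: "org2"},
		{ID: "b3", OrgID: "org1"},
	}
	a := orgAuthorizer{orgs: map[string]bool{"org1": true}}
	for _, r := range filterAuthorized(a, all) {
		fmt.Println(r.ID) // only resources owned by org1
	}
}
```

The cost is proportional to the total number of stored resources, not to the number the caller is allowed to see, which is the inefficiency described above.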

Ideally, we should have an efficient way of doing the following.

Find all resources a user, or token, is authorized to see.
Given a user, or token, and the ID of a resource, determine whether that user or token is authorized to access the resource.

Proposition

I propose that we move to a system that is based on Access Control Lists (ACL). That is, instead of having a set of permissions that we move around and use for each request, each resource in the system has an associated list that contains the ID of each entity that has access to the resource (where an entity is a user, org, or token). This way, given a user, or token, and the resource ID, we could check whether the user is authorized to perform the action without ever needing to explicitly access the resource.

In addition to the ACL, I propose we store an Inverse ACL (IACL), so that given a user, or token, we can look up all resources that it is authorized to see.

I believe this would roughly look like two indexes, similar to what user resource mappings look like today. Those two indexes would be

ACL index (resource type)/id -> [(user || org || token)/id]

This would be used to determine authorization for a resource given a resource type, a resource ID, and a user, or token, ID.

Let rt = resource type and rid = resource id.

For tokens, you simply check for the existence of the key rt/rid/tt/tid (where tt = token type and tid = token id).
For users, you simply check for the existence of the key rt/rid/ut/uid (where ut = user type and uid = user id). If that key does not exist, you scan across rt/rid/ot/* (where ot = org type) and, for each oid found, check whether ot/oid/ut/uid exists. If no such key exists, the action fails. The worst case is log linear in the number of org entries: one initial log lookup plus one log lookup for each org returned by the scan.
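The two checks above can be sketched against a sorted key set, which stands in for an ordered key-value store. The `index` type, the helper names, and the literal type segments `token`, `user`, and `org` are assumptions made for this illustration; IDs are assumed not to contain `/`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// index is a hypothetical stand-in for an ordered key-value index.
type index struct {
	keys []string // kept sorted
}

func newIndex(keys ...string) *index {
	ks := append([]string(nil), keys...)
	sort.Strings(ks)
	return &index{keys: ks}
}

// has is a single binary-search lookup.
func (ix *index) has(key string) bool {
	i := sort.SearchStrings(ix.keys, key)
	return i < len(ix.keys) && ix.keys[i] == key
}

// scan returns the suffix of every key that starts with prefix.
func (ix *index) scan(prefix string) []string {
	var out []string
	for i := sort.SearchStrings(ix.keys, prefix); i < len(ix.keys) && strings.HasPrefix(ix.keys[i], prefix); i++ {
		out = append(out, strings.TrimPrefix(ix.keys[i], prefix))
	}
	return out
}

// tokenAllowed checks for the key rt/rid/tt/tid.
func tokenAllowed(acl *index, rt, rid, tid string) bool {
	return acl.has(rt + "/" + rid + "/token/" + tid)
}

// userAllowed checks rt/rid/ut/uid first, then falls back to the orgs
// listed under rt/rid/ot/* and checks ot/oid/ut/uid for each of them.
func userAllowed(acl *index, rt, rid, uid string) bool {
	if acl.has(rt + "/" + rid + "/user/" + uid) {
		return true
	}
	for _, oid := range acl.scan(rt + "/" + rid + "/org/") {
		if acl.has("org/" + oid + "/user/" + uid) {
			return true
		}
	}
	return false
}

func main() {
	acl := newIndex(
		"bucket/b1/org/o1",   // org o1 has access to bucket b1
		"org/o1/user/u1",     // user u1 is in org o1
		"bucket/b1/token/t1", // token t1 grants access to bucket b1
	)
	fmt.Println(tokenAllowed(acl, "bucket", "b1", "t1"))
	fmt.Println(userAllowed(acl, "bucket", "b1", "u1"))
	fmt.Println(userAllowed(acl, "bucket", "b1", "u2"))
}
```

Neither function loads the bucket itself; the decision is made entirely from index keys.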

Note: it's possible that there is only one case here, the specific implementation would depend a bit on requirements.

IACL index (user || org || token)/id -> [(resource type)/id]

This would be used to find all resources of a particular type that a user, or token, is authorized to see. Given rt = resource type, the process would be as follows.

For tokens, you scan across tt/tid/rt/*. This should produce the entire list of available resources.
For users, you scan across ut/uid/ot/* and, for each oid found, scan ot/oid/rt/*. You then union those lists with the values found under ut/uid/rt/*. This will likely require a bit of deduping during the scan, but the operation should be efficient.
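A rough sketch of these two scans, again using a sorted key set in place of a real ordered store. The `index` type, the function names, and the type segments `token`, `user`, and `org` are hypothetical; the dedup step uses a plain map.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// index is a hypothetical stand-in for an ordered key-value index.
type index struct {
	keys []string // kept sorted
}

func newIndex(keys ...string) *index {
	ks := append([]string(nil), keys...)
	sort.Strings(ks)
	return &index{keys: ks}
}

// scan returns the suffix of every key that starts with prefix.
func (ix *index) scan(prefix string) []string {
	var out []string
	for i := sort.SearchStrings(ix.keys, prefix); i < len(ix.keys) && strings.HasPrefix(ix.keys[i], prefix); i++ {
		out = append(out, strings.TrimPrefix(ix.keys[i], prefix))
	}
	return out
}

// tokenResources scans tt/tid/rt/*.
func tokenResources(iacl *index, tid, rt string) []string {
	return iacl.scan("token/" + tid + "/" + rt + "/")
}

// userResources unions ut/uid/rt/* with ot/oid/rt/* for every org found
// under ut/uid/ot/*, dropping duplicates along the way.
func userResources(iacl *index, uid, rt string) []string {
	seen := map[string]bool{}
	var out []string
	add := func(ids []string) {
		for _, id := range ids {
			if !seen[id] {
				seen[id] = true
				out = append(out, id)
			}
		}
	}
	add(iacl.scan("user/" + uid + "/" + rt + "/"))
	for _, oid := range iacl.scan("user/" + uid + "/org/") {
		add(iacl.scan("org/" + oid + "/" + rt + "/"))
	}
	sort.Strings(out)
	return out
}

func main() {
	iacl := newIndex(
		"user/u1/org/o1",
		"org/o1/bucket/b1",
		"org/o1/bucket/b2",
		"user/u1/bucket/b2", // also reachable through o1, so it is deduped
		"user/u1/bucket/b3",
		"token/t1/bucket/b1",
	)
	fmt.Println(tokenResources(iacl, "t1", "bucket")) // [b1]
	fmt.Println(userResources(iacl, "u1", "bucket"))  // [b1 b2 b3]
}
```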

One thing this would change is that tokens would now have an associated write operation: the token ID is added to the list of each resource that the token grants permission to.
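As a sketch of that extra write, the following keeps both indexes as plain key sets and adds one entry to each per granted resource. `Indexes`, `ResourceRef`, and `GrantToken` are hypothetical names, and the `token` key segment is an assumption.

```go
package main

import "fmt"

// ResourceRef identifies a resource by type and ID (hypothetical type).
type ResourceRef struct {
	Type string
	ID   string
}

// Indexes holds the ACL and the inverse ACL as simple key sets.
type Indexes struct {
	ACL  map[string]struct{}
	IACL map[string]struct{}
}

func NewIndexes() *Indexes {
	return &Indexes{ACL: map[string]struct{}{}, IACL: map[string]struct{}{}}
}

// GrantToken records the token in the ACL of every resource it covers and
// records every resource in the token's inverse ACL.
func (ix *Indexes) GrantToken(tid string, resources []ResourceRef) {
	for _, r := range resources {
		ix.ACL[r.Type+"/"+r.ID+"/token/"+tid] = struct{}{}
		ix.IACL["token/"+tid+"/"+r.Type+"/"+r.ID] = struct{}{}
	}
}

func main() {
	ix := NewIndexes()
	ix.GrantToken("t1", []ResourceRef{
		{Type: "bucket", ID: "b1"},
		{Type: "dashboard", ID: "d1"},
	})
	_, ok := ix.ACL["bucket/b1/token/t1"]
	fmt.Println(ok, len(ix.ACL), len(ix.IACL)) // true 2 2
}
```

In a real store the two writes would presumably need to happen atomically so that the indexes cannot drift apart.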

Additionally, we'd need a system that could resolve names to IDs (since this design works exclusively with IDs).

Note

It should be noted that this model is functionally equivalent to a minimal role-based access control model. See the role-based access control document's section on ACLg. This is possible since we can store the types of actions that a user, group, or token is allowed to perform as a value in the ACL.
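One way to picture storing actions as the ACL value is a bitmask per key. This is only a sketch: the `Action` constants, the `ACL` map type, and the key layout are assumptions made for illustration.

```go
package main

import "fmt"

// Action is a bitmask of allowed operations (hypothetical encoding).
type Action uint8

const (
	Read Action = 1 << iota
	Write
)

// ACL maps rt/rid/et/eid keys to the actions the entity may perform.
type ACL map[string]Action

func aclKey(rt, rid, et, eid string) string {
	return rt + "/" + rid + "/" + et + "/" + eid
}

// Grant adds actions to whatever the entity already holds.
func (a ACL) Grant(rt, rid, et, eid string, actions Action) {
	a[aclKey(rt, rid, et, eid)] |= actions
}

// Allows reports whether every requested action is present in the entry.
func (a ACL) Allows(rt, rid, et, eid string, want Action) bool {
	got, ok := a[aclKey(rt, rid, et, eid)]
	return ok && got&want == want
}

func main() {
	acl := ACL{}
	acl.Grant("bucket", "b1", "user", "u1", Read)
	acl.Grant("bucket", "b1", "token", "t1", Read|Write)

	fmt.Println(acl.Allows("bucket", "b1", "user", "u1", Read))        // true
	fmt.Println(acl.Allows("bucket", "b1", "user", "u1", Write))       // false
	fmt.Println(acl.Allows("bucket", "b1", "token", "t1", Read|Write)) // true
}
```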

The benefit

The benefits would be the following

A simple authorization model that is easy to explain to others
No need to access a resource in order to know if the user is authorized to see it
Authorization time that is worst case log linear in the number of org owners a resource has (which should be fairly small, since things currently have only a single org owner)
Returning the list of authorized resources should cost a log lookup plus time linear in the number of resources
If performance becomes an issue, there is an obvious partitioning scheme we can do (by resource type)

redundency.md

Code Redundancy

There is a large amount of duplicated logic across our services. Specifically, every new resource that we add must implement essentially the same logic as every other resource. The pattern is as follows

Add top level struct and interface definition (e.g. ./resource.go).
Add conformance test for resource (e.g. testing/resource.go).
Add concrete and mock implementations of interface (e.g. {bolt,kv,inmem,etcd,etcd_cache}/resource.go).
Add authorized wrapper of interface (e.g. authorizer/resource.go).
Add http handler and interface implementation of resource (e.g. http/resource.go).

Nearly all of this work is simply a copy-and-replace types workflow that leads to the same exact logic being duplicated over and over again.

Creating a new resource should not involve reasoning about how things are written to disk; that should happen at another layer of abstraction.

Specifically, I'm imagining that we have something like

package idk

type Database struct {
  store kv.Store // or something more generic so that we could have non kv type backends.
}

func (db *Database) Register(v interface{}) error { ... }
func (db *Database) Tx() (*Tx, error) { ... }

func (tx *Tx) Put(v interface{}) error { ... }
func (tx *Tx) Find(q Query, vs interface{}) error { ... }
func (tx *Tx) Delete(q Query) error { ... }

func (tx *Tx) Commit() error { ... }
func (tx *Tx) Abort() error { ... }

Using interface{} like this might not be desirable or even possible; it's just to highlight an idea. Ideally, this would then result in the following type of implementation

package idk

func (db *Database) initializeUsers(ctx context.Context) error {
  return db.Register(influxdb.User{})
}

func (db *Database) CreateUser(ctx context.Context, u *influxdb.User) error {
  tx, err := db.Tx()
  if err != nil {
    return err
  }

  if err := tx.CreateUser(ctx,u); err != nil {
    tx.Abort()
    return err
  }

  return tx.Commit()
}

func (tx *Tx) CreateUser(ctx context.Context, u *influxdb.User) error {
  u.ID = <some id>

  return tx.Put(u)
}

func (tx *Tx) DeleteUser(ctx context.Context, id influxdb.ID) error {
  // not totally sure about what to do here; Filter is a placeholder field name
  q := Query{Resource: "User", Filter: Equal{Field: "ID", Value: id}}

  return tx.Delete(q)
}

func (tx *Tx) FindUserByID(ctx context.Context, id influxdb.ID) (*influxdb.User, error) {
  // not totally sure about what to do here; Filter is a placeholder field name
  q := Query{Resource: "User", Filter: Equal{Field: "ID", Value: id}}

  us := []*influxdb.User{}
  // pass a pointer so that Find can populate the slice
  if err := tx.Find(q, &us); err != nil {
    return nil, err
  }

  if len(us) != 1 {
    return nil, errors.New("expected exactly one user")
  }

  return us[0], nil
}

func (tx *Tx) FindUsers(ctx context.Context, filter influxdb.UserFilter) ([]*influxdb.User, error) {
  q := filterToQuery(filter)

  us := []*influxdb.User{}
  if err := tx.Find(q, &us); err != nil {
    return nil, err
  }

  return us, nil
}

This gives us the ability to describe actions in a way that is transactional at a higher level. These types of workflows are currently hidden inside of bolt or etcd (and would be hidden inside kv in a similar way). If we did the abstraction correctly, we should also be able to hide the actual storage type (as in, I believe we could conceivably use an SQL database or a kv store and the logic around resources would not change). As an additional note, this takes us closer to having resources that could be generically described (as in, we'd be closer to letting users create arbitrary resources without additional API changes).
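To illustrate what "transactional at a higher level" could mean, here is a toy in-memory sketch in which a transaction buffers its writes and applies them only on commit. It is not the proposed API: `Put` and `Delete` here take an explicit resource name and ID instead of an `interface{}` value or a `Query`, there is no `Register` or `Find`, and nothing is safe for concurrent use. All of that is simplification for the sake of a short, runnable example.

```go
package main

import (
	"errors"
	"fmt"
)

var errTxDone = errors.New("transaction already finished")

// Database is a toy in-memory store keyed by "resource/id".
type Database struct {
	data map[string]interface{}
}

func NewDatabase() *Database {
	return &Database{data: map[string]interface{}{}}
}

// Tx buffers writes until Commit; a nil pending value marks a delete.
type Tx struct {
	db      *Database
	pending map[string]interface{}
	done    bool
}

func (db *Database) Tx() (*Tx, error) {
	return &Tx{db: db, pending: map[string]interface{}{}}, nil
}

func (tx *Tx) Put(resource, id string, v interface{}) error {
	if tx.done {
		return errTxDone
	}
	tx.pending[resource+"/"+id] = v
	return nil
}

func (tx *Tx) Delete(resource, id string) error {
	if tx.done {
		return errTxDone
	}
	tx.pending[resource+"/"+id] = nil
	return nil
}

// Commit applies every buffered write to the database.
func (tx *Tx) Commit() error {
	if tx.done {
		return errTxDone
	}
	for k, v := range tx.pending {
		if v == nil {
			delete(tx.db.data, k)
		} else {
			tx.db.data[k] = v
		}
	}
	tx.done = true
	return nil
}

// Abort discards every buffered write.
func (tx *Tx) Abort() error {
	tx.pending = nil
	tx.done = true
	return nil
}

func main() {
	db := NewDatabase()

	tx, _ := db.Tx()
	tx.Put("User", "u1", "alice")
	tx.Abort()
	fmt.Println(len(db.data)) // 0: the aborted write never landed

	tx, _ = db.Tx()
	tx.Put("User", "u1", "alice")
	tx.Put("Authorization", "a1", "token for u1")
	tx.Commit()
	fmt.Println(len(db.data)) // 2: both writes landed together
}
```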

The big ideas here are

have one single concrete implementation
consolidate storage/query details into two locations (one for platform, one for idpe)
allow transactional operations to be expressed at a higher level
give us the ability to expose transactions to the client (do we even need this?)
start moving towards user defined resources

Additionally we could then have http handlers become:

package http

type UserHandler struct {
  ...
  UserHandlerBackend UserHandlerBackend // an interface value; no pointer needed
}

type UserHandlerBackend interface {
  // methods here can take in things that map more closely to the
  // reality of the http world.
  DeleteUser(ctx context.Context, req deleteUserRequest) (deleteUserResponse, error)
  ...
}

type userHandlerBackend struct {
  db *Database
  // maybe some other stuff
}

func (be *userHandlerBackend) DeleteUser(ctx context.Context, req deleteUserRequest) (deleteUserResponse, error) {
  tx, err := be.db.Tx()
  if err != nil {
    return nil, err
  }

  u, err := tx.FindUserByID(ctx, req.ID)
  if err != nil {
    tx.Abort()
    return nil, err
  }

  as, err := tx.FindAuthorizations(ctx, influxdb.AuthorizationFilter{UserID: &u.ID})
  if err != nil {
    tx.Abort()
    return nil, err
  }

  for _, a := range as {
    if err := tx.DeleteAuthorization(ctx,a.ID); err != nil {
      tx.Abort()
      return nil, err
    }
  }
  // delete the user itself once its authorizations are gone
  if err := tx.DeleteUser(ctx, u.ID); err != nil {
    tx.Abort()
    return nil, err
  }

  if err := tx.Commit(); err != nil {
    return nil, err
  }

  return res, nil
}

func (h *UserHandler) handleDeleteUser(w http.ResponseWriter, r *http.Request) {
  // decode http world

  res, err := h.UserHandlerBackend.DeleteUser(ctx,req)
  if err != nil {
    // stuff
  }

  encodeResponse(res)
}