Firebase IAM: the tale of excessive permissions

Securing Firestore objects against attacks that abuse the JavaScript SDK can be done with Firestore security rules, which you can read about in my recent post on Firestore.

If you are using the Admin SDK on the server side, you have full access to everything by default. The Firestore security rules do not apply to the Admin SDK. One thing in particular we should be aware of is that the Firebase Admin SDK gives access to management plane functionality, making it possible to change security rules, for example. This is not apparent from the Firebase console or command line tools.

(Image: firefighters in action)
Running Firebase Cloud Functions using the Admin SDK with default permissions can quickly lead to a lot of firefighting. Better get those permissions under control!

In this blog post we dig into a Firebase project through the Google Cloud console and the gcloud command line tool, and show how to improve the security of our capture-the-flag app by creating a dedicated service account and role binding for a cloud function. We also explore how to verify that a user is signed in using the Firebase Admin SDK.

A threat model for the flag checker

We have created a demo Firebase project with a simple web application at https://quizman-a9f1b.web.app/. This app has a simple CTF function: a challenge is presented, and players can check whether the flag they have found is correct. The data exchange is primarily done using the JavaScript SDK, protected by security rules. For checking the flag, however, we are using a cloud function. If this cloud function has a vulnerability that allows an attacker to take control of it, that attacker could potentially overwrite the “correct flag”, or even change the security rules protecting the JavaScript SDK access.

Here’s a list of threats and potential consequences: 

Vulnerability | Exploitation | Impact
RCE vulnerability in code | Attacker can take full control of the Firebase project environment through the Admin SDK | Can read/write to the private collection (cheat); can create other resources (costs money); can reconfigure security rules (data leaks or DoS)
Lack of brute-force protection | Attacker can try to guess flags by automating submissions | Users can cheat; costs money
Lack of authentication | An unauthenticated user can perform function calls | Costs money even though the caller is not a real player of the CTF game

We need to make sure that attackers cannot exploit vulnerabilities to cheat in the game. We also want to protect against unavailability, and against abuse that can drive up the cloud usage bill (after all, this is a personal project). We will apply a defence-in-depth approach to our cloud function: 

  1. Execution of the function requires the caller to be authenticated. The purpose of this is to limit abuse, and to make it possible to revoke access for users who abuse the app. 
  2. The Firebase function shall only have read access to Firestore, preferably only to the relevant collections. This removes the ability of an attacker with RCE to overwrite data or to manage resources in the Firebase project.
  3. For the following events we want to create logs and possibly alerts: 
    1. authenticated user verified token
    2. unauthenticated user requested token verification

Requiring the user to be authenticated

First we need to make sure that the person requesting to verify a flag is authenticated. We can use a built-in method of the Firebase admin SDK to do this. This method checks that the ID token received is properly signed, and that it is not expired. The good thing about this approach is that it avoids making a call to the authentication backend.

But what if the token has been revoked? It is possible to check whether a token is revoked using either security rules (recommended, cheap) or an extra call to the authentication backend (expensive, not recommended). Since we are not actively revoking tokens in this app, unless a user changes his/her password, we will not bother with this functionality, but if you need it, there is documentation on how to do it here: https://firebase.google.com/docs/auth/admin/manage-sessions#detect_id_token_revocation

We need to update our “check flag workflow” from this: 

  • send flag and challenge ID to cloud function
  • cloud function queries Firestore based on challenge ID and gets the “correct flag”
  • cloud function compares submitted flag with the correct flag, and returns {success: true/false} as appropriate

to this slightly more elaborate workflow:

  • send flag, challenge ID and user token to cloud function
  • cloud function verifies token ID
    • If invalid: return 403 (forbidden) // simplified to returning 200 with {success: false}
    • if valid: 
      • cloud function queries Firestore based on challenge ID and gets the “correct flag”
      • cloud function compares submitted flag with the correct flag, and returns {success: true/false} as appropriate

The following code snippet shows how to perform the validation of the user’s token: 

const idTokenResult = await admin.auth().verifyIdToken(idToken);

If the token is valid, we receive a decoded JWT back.
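
To see how this fits together, here is a minimal sketch of the flag checker as an HTTPS cloud function. The function name checkFlag, the collection name challenges and the field name flag are assumptions made for illustration; they are not necessarily the names used in the actual project.

// Minimal sketch of the flag checker (function, collection and field names are assumptions)
const functions = require("firebase-functions");
const admin = require("firebase-admin");
admin.initializeApp();

exports.checkFlag = functions.https.onRequest(async (req, res) => {
  const { idToken, challengeId, flag } = req.body;

  try {
    // Verify the signature and expiry of the ID token without calling the auth backend
    await admin.auth().verifyIdToken(idToken);
  } catch (err) {
    // Simplified: treat an invalid or missing token as a failed flag check
    return res.status(200).json({ success: false });
  }

  if (!challengeId || !flag) {
    return res.status(200).json({ success: false });
  }

  // Look up the correct flag for this challenge (requires only read access to Firestore)
  const doc = await admin.firestore().collection("challenges").doc(challengeId).get();
  const correctFlag = doc.exists ? doc.data().flag : null;

  return res.status(200).json({ success: correctFlag !== null && flag === correctFlag });
});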

Restricting permissions using IAM roles

By default, a Firebase function initialized with the Firebase Admin SDK is assigned very powerful permissions. It is automatically set up with a service account named “firebase-adminsdk-random5chars@project-id.iam.gserviceaccount.com”. The service account itself does not have rights associated with it, but it has role bindings to roles that have permissions attached to them.

If you go into the Google Cloud Console and navigate to “IAM” under your project, you can look up the roles assigned to a principal, such as your service account. For each role you automatically get an assessment of “excess permissions”: permissions available through the role bindings but not used in the project. Here’s the default configuration for the service account set up for the Admin SDK: 

By default Firebase Cloud Functions run with excessive permissions!

Our Firebase cloud function does not need access to all those permissions. By creating roles that are fit for purpose we can limit the damage an attacker can do if the function is compromised. This is just the same principle in action as when your security awareness training tells you not to run your PC as admin for daily work. 

Cloud resources have associated ready-made roles that one can bind a service account to. For Firestore objects the relevant IAM roles are listed here: https://cloud.google.com/firestore/docs/security/iam. We see that there is a viewer role, datastore.viewer, that allows read access to all Firestore resources. We will use this, but be aware that it grants read access to all Firestore data in the project, not only the intended objects. Still, it protects against deletion, overwriting of data, and creation of new resources. 

Note that it is possible to create more specific roles. We could create a role that only has permission to read Firestore entities. An IAM role cannot express exactly which Firestore collection read operations are allowed on, but if we create the role flagchecker and assign it the permission datastore.entities.get and nothing else, it is as locked down as we can make it. 
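
As a sketch, such a custom role can be created with gcloud (the role ID and title are our own choices):

gcloud iam roles create flagchecker --project=quizman-a9f1b --title="Flag checker" --permissions=datastore.entities.get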

To implement this for our cloud function, we create a new service account. This can be done in the Console by going to IAM → Service Accounts → New Service Account. We create the account and assign it the role datastore.viewer. 

Our new service account is called quizman-flag-checker.
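
The same setup can be sketched with gcloud instead of the Console; the binding can use either the predefined datastore.viewer role or the custom flagchecker role from the previous section:

# Create the service account
gcloud iam service-accounts create quizman-flag-checker --project=quizman-a9f1b --display-name="quizman-flag-checker"

# Bind a read-only Firestore role to it
gcloud projects add-iam-policy-binding quizman-a9f1b --member="serviceAccount:quizman-flag-checker@quizman-a9f1b.iam.gserviceaccount.com" --role="roles/datastore.viewer"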

Now we need to attach this service account to our Firebase function. It is not clear from the Firebase documentation how to accomplish this, but by opening the Google Cloud Console, or using the gcloud command line tool, we can attach our new service account with more restrictive permissions to the Firebase function. 

To do this, we go into the Google Cloud console, choose the right project and Compute → Cloud functions. Select the right function, and then hit the “edit” button to change the function. Here you can choose the service account you want to attach to the function. 

(Screenshot: Google Cloud console)

After changing the runtime service account, we need to deploy the function again. Now the service-to-service authentication is performed with a principal with more sensible permissions; attackers can no longer create their own resources or delete security rules. 
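
If you prefer the command line, the runtime service account can also be set as part of a gcloud deployment. A sketch, where the function name checkFlag is an assumption:

gcloud functions deploy checkFlag --project=quizman-a9f1b --service-account=quizman-flag-checker@quizman-a9f1b.iam.gserviceaccount.com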

Auditing the security configurations of a Firebase function using gcloud

Firebase is great for an easy set-up, but as we have seen it gives us overly permissive roles by default. It can therefore be a good idea to audit the IAM roles used in your project. 

Key questions to ask about the permissions of a cloud function are: 

  • What is the service account this function is authenticating as?
  • Which permissions does this cloud function have?
  • Does it have permissions that it does not need? 

In addition to auditing the configuration, we want to audit changes to the configuration, in particular changes to service accounts, roles, and role bindings. This is most easily done using the log viewer tools in the Google Cloud console. 

We’ll use the command line tool gcloud for the auditing, since this makes it possible to automate in scripts. 

Service accounts and IAM roles for a Firebase function

Using the Google Cloud command line tool gcloud we can use the command 

gcloud functions describe <functionName>

to get a lot of metadata about a function. To extract just the service account used you can pipe it into jq like this: 

gcloud functions describe <functionName> --format="json" | jq ".serviceAccountEmail"

When we have the service account, we can next check which roles are bound to the account. This query is somewhat complex due to the nested data structure for role bindings on a project (for a good description of gcloud IAM queries, see fabianlee.org): 

gcloud projects get-iam-policy <projectIdNumber> --flatten="bindings[].members" --filter="bindings.members=serviceAccount:<account-email>" --format="value(bindings.role)"

Running this gives us the following role (as expected): projects/quizman-a9f1b/roles/flagchecker.

Hence, we know this is the only role assigned to this service account. Now we finally need to list the permissions for this role. Here’s how we can do that: 

gcloud iam roles describe flagchecker --project=quizman-a9f1b --format="value(includedPermissions)"

The output (as expected) is a single permission: datastore.entities.get
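
Putting these queries together, a small audit script could look something like this sketch (the function name checkFlag is a placeholder):

#!/bin/bash
# Sketch: audit the service account, roles and permissions of a cloud function
FUNCTION="checkFlag"   # placeholder function name
PROJECT="quizman-a9f1b"

# Which service account does the function run as?
SA=$(gcloud functions describe "$FUNCTION" --project="$PROJECT" --format="value(serviceAccountEmail)")
echo "Service account: $SA"

# Which roles are bound to that service account?
ROLES=$(gcloud projects get-iam-policy "$PROJECT" \
  --flatten="bindings[].members" \
  --filter="bindings.members=serviceAccount:$SA" \
  --format="value(bindings.role)")

# Which permissions does each role grant?
for ROLE in $ROLES; do
  echo "Permissions for $ROLE:"
  if [[ "$ROLE" == projects/* ]]; then
    # Custom role: describe it by ID within the project
    gcloud iam roles describe "${ROLE##*/}" --project="$PROJECT" --format="value(includedPermissions)"
  else
    # Predefined role, e.g. roles/datastore.viewer
    gcloud iam roles describe "$ROLE" --format="value(includedPermissions)"
  fi
done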

CCSK Domain 5: Information governance

Information governance is the set of management practices we introduce to ensure that data and information comply with organizational policies, standards and strategy, including regulatory, contractual and business objectives. 

There are several aspects of cloud storage of data that have implications for information governance. 

Public cloud deployments are multi-tenant. That means that there will be other organizations also storing their information in the same datacenter, on the same hardware. The security features for account separation will thus be an important part of achieving information compliance in most cases. 

As data is shared across cloud infrastructure, so is the responsibility for securing the data. To define a working governance structure it is important to define who the data owner and the data custodian are. The difference between the two is that the former actually owns the data (and is accountable for its governance), while the latter manages the data (and is responsible for ensuring compliance in practice). 

When we host third-party data in the cloud, we are introducing a third party into the governance model. This third party is the cloud provider; the information governance now depends on the provider’s management practices and the technologies it offers. This complicates the regulatory compliance considerations we need to make and should be taken into account when designing a project’s regulatory compliance matrix. First, legal requirements may change because the cloud stores, or makes data available in, more geographical regions than would otherwise be the case. Compliance, regulations, and in particular privacy, should be carefully reviewed with regard to how governance is managed in the cloud for customer data. Further, one should ensure that customer requirements for deletion (destruction) of data can be satisfied given the technical offerings from the cloud provider. 

Moving data to the cloud provides a welcome opportunity to review and perhaps redesign information architectures. In many organizations information architectures have evolved over a long time, perhaps with little planning, and may have resulted in a fractured model where it is hard to manage compliance. 

Cloud information governance domains

Cloud computing can have an effect on multiple aspects of data governance. The following list describes the issues the CSA has identified as affected by cloud computing: 

Information classification. Often tied to storage and handling requirements, which may include limitations on access and location. Storing information in an S3 bucket will require a different method for access control than using a file share on the local network. 

Information management practices. How data is managed based on classification. This should include different cloud deployment models (or SPI tiers: SaaS, PaaS, IaaS). You need to decide what can be allowed where in the cloud, with which products and services and with which security requirements. 

Location and jurisdiction policies. You need to comply with regulations and contractual obligations with respect to data storage and data access. Make sure you understand how data is processed and stored, and the contractual instruments in place to manage regulatory compliance. One primary example here is personal data under the GDPR, and how data processing agreements with cross-border transfer clauses can be used to manage foreign jurisdictions. 

Authorizations. Cloud computing does not typically require many changes to authorizations, but the data security lifecycle will most likely be impacted. The way authorization controls are implemented may also change (e.g. the IAM practices of the cloud vendor for account-level authorization). 

Ownership. The organization owns its data, and this does not change when moving to the cloud. One should be careful to review the terms and conditions of cloud providers here, in particular for SaaS products (especially those targeting the consumer market).

Custodianship. The cloud provider may fully or partially become the custodian, depending on the deployment model. Encrypted data stored in a cloud bucket is still under custody of the cloud provider. 

Privacy. Privacy needs to be handled in accordance with relevant regulations, and the necessary contractual instruments such as data processing agreements must be put in place. 

Contractual controls. Contractual controls when moving data and workloads to the cloud will be different from the controls you employ in an on-premise infrastructure. There will often be limited room to negotiate contract clauses in public cloud environments. 

Security controls. Security controls are different in cloud environments than in on-premise environments. Main concepts are security groups and access control lists.

Data Security Lifecycle

A data security lifecycle is typically different from an information lifecycle. A data security lifecycle has six phases: 

  • Create: generation of new digital content, or modification of existing content
  • Store: committing digital data to storage, typically happens in direct sequence with creation. 
  • Use: data is viewed, processed or otherwise used in some activity that does not include modification. 
  • Share: Information is made accessible to others, such as between users, to customers, and to partners or other stakeholders. 
  • Archive: data leaves active use and enters long-term storage. This type of storage will typically have much longer retrieval times than data in active storage. 
  • Destroy: data is permanently destroyed by physical or digital means (e.g. cryptoshredding).

The data security lifecycle is a description of phases the data passes through, without regard for location or how it is accessed. The data typically goes through “mini lifecycles” in different environments as part of these phases. Understanding the physical and logical locations of data is an important part of regulatory compliance. 

In addition to where data lives and how it is transferred, it is important to keep control of entitlements; who accesses the data, and how can they access it (device, channels)? Both devices and channels may have different security properties that may need to be taken into account in a data governance plan. 

Functions, actors and controls

The next step in assessing the data security lifecycle is to review which functions can be performed with the data, by a given actor (a personal or system account), in a particular location. 

There are three primary functions: 

  • Read the data: including creating, copying, transferring.
  • Process: perform transactions or changes to the data, use it for further processing and decision making, etc. 
  • Store: hold the data (database, filestore, blob store, etc)

The different functions are applicable to different degrees in different phases. 

An actor (a person or a system/process – not a device) can perform a function in a location. A control restricts the possible actions to allowed actions. The key question is: 

What function can which actor perform in which location on a given data object?

An example of data modeling connecting actions to data security lifecycle stages.

CSA Recommendations

The CSA has created a list of recommendations for information governance in the cloud: 

  • Determine your governance requirements before planning a transition to cloud
  • Ensure information governance policies and practices extend to the cloud. This is done with both contractual and security controls. 
  • When needed, use the data security lifecycle to model data handling and controls. 
  • Do not lift and shift existing information architectures to the cloud. First, review and redesign the information architecture to support the current governance needs, and take anticipated future requirements into account. 

Deploying Django to App Engine with the Python 3.7 runtime – fails because it can’t find pip?

Update 1 June 2020: Google tells me the problem is now fixed. I haven’t verified it yet, will update if I find it is not fixed at the next crossroads. Bug tracker link: https://issuetracker.google.com/issues/131663964.

Update 30 March 2020: The problem is still here. Use pip version 19.2.3 now to make it work. Google: can you please go fix this now?

Update 30 April 2019: Problem is back. This time it tries to upgrade to pip 19.1, but the app engine instance is stuck on 19.0.3. Adding pip==19.0.3 in the requirements.txt file saves the deployment.

Update 1 April 2019: now the deploy fails with the same message as described in this post, when the PIP version is specified in the requirements.txt file. Removing the specific pip version line from the requirements file fixes this. I have not seen any change notice or similar from Google on this.

I had an interesting error that took quite some time to hunt down today. These are basically some notes on what caused it and how I tracked it down. I have an app that is deployed to Google App Engine standard. It is running Django and using the Python 3.7 runtime – and it had been working quite well for some time. Then yesterday, when I was going to deploy an update (actually just some CSS tweaks), it failed with a cryptic error message. Running “gcloud app deploy” led to the following error message:

File upload done.
Updating service [default]…failed.
ERROR: (gcloud.app.deploy) Error Response: [9] Cloud build a33ff087-0f47-4f18-8654-********* status: FAILURE.
Error ID: B212CE0B.
Error type: InternalError.
Error message: pip_install_from_wheels had stderr output:
/env/bin/python3.7: No module named pip
error: pip_install_from_wheels returned code: 1.

This is weird: this is a normal Python project using a requirements.txt file for its dependencies. The file was generated using pip freeze and should not contain anything weird (it doesn’t). Searching the wisdom of the internet reveals that nobody else seems to have this problem, and it has only occurred since yesterday. My hunch was that Google had changed something in the GAE environment that broke the build. Searching the net gives us these options:

  • The requirements.txt file has weird encoding and contains Chinese characters? That was not it.
  • You need to install some special packages to use Python 3? Also not the case, and it would have been strange for that to change over the last few days.
  • You need to manually install pip to make it work? That may be the case sometimes, but without SSH access to the instance it is not obvious how to do this.

The trick is often looking at the right logs…

So, being clueless, I turned to looking for the right logs to figure out what was going on. Not being an expert on the GAE environment led to some hunting in the web console until I found “Cloud Build”, which sounded promising. That was the right place to be – GAE uses a build process in the cloud to first build the application and then a Docker image, which is pushed to the internal Google Cloud Docker repository. Hunting through the build log for this deployment turns up this little piece of gold:

Step #1 - "builder": INFO pip_install_from_wheels took 0 seconds
Step #1 - "builder": INFO starting: pip_install_from_wheels
Step #1 - "builder": INFO pip_install_from_wheels /env/bin/python3.7 -m pip install --no-deps --prefix /tmp/tmp9Y3aD7/env /tmp/tmppuvw4s/wheel/pip-19.0.1-py2.py3-none-any.whl --disable-pip-version-check
Step #1 - "builder": INFO `pip_install_from_wheels` stdout:
Step #1 - "builder": Processing /tmp/tmppuvw4s/wheel/pip-19.0.1-py2.py3-none-any.whl
Step #1 - "builder": Installing collected packages: pip
Step #1 - "builder": Found existing installation: pip 18.1
Step #1 - "builder": Uninstalling pip-18.1:
Step #1 - "builder": Successfully uninstalled pip-18.1
Step #1 - "builder": Successfully installed pip-19.0.1
Step #1 - "builder": Failed. No module /env/python/pip

Before the weird error we see that the build uninstalls pip 18.1, installs pip 19.0.1 (a more recent version), and then cannot find pip afterwards, so the build process fails. This has not been configured by me; it is probably Google configuring automatic upgrades of some packages during the build – and here it breaks the workflow.

Fixing it

The temporary fix was simple. Adding “pip==18.1” to the requirements.txt file allowed the build process to run, and the app deployed nicely.
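
For reference, the relevant part of requirements.txt ended up looking something like this (the rest of the file just lists the app’s normal dependencies):

# Pin pip so the build step does not try to upgrade it and break itself
pip==18.1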

What did we learn from this?

  • API tools give only partial error messages, making debugging hard.
  • Automated upgrade configs are good but can cause things to break in unforeseen ways.
  • Finding the right logs is the key to fixing weird problems.