Confidential data of users and limited metadata of programs and reports accessible via GraphQL
State Resolved (Closed)
Disclosed publicly 2019-02-03T10:57:19.220Z
Reported To
Weakness Information Disclosure
Bounty $20,000

Summary by yashrs

On January 31st, 2019 at 7:16 PM PST, HackerOne confirmed that two reporters were able to query confidential data through a GraphQL endpoint. This vulnerability was introduced on December 17th, 2018 and was caused by a backend migration to a class-based implementation of GraphQL types, mutations, and connections. The class-based implementation introduced the nodes field by default on all connections. The nodes field, in contrast with edges, didn’t leverage any of the defenses HackerOne has implemented to mitigate the exposure of sensitive information.

Our investigation concluded that malicious actors did not exploit the vulnerability. No confidential data was compromised. A short-term fix was released on January 31st, 2019 at 9:46 PM, a little over 2 hours after the vulnerability was reproduced.

Timeline

| Date | Time (PST) | Action |
| --- | --- | --- |
| 2018-12-17 | 9:07 AM | Software containing bug deployed to production. |
| 2019-01-31 | 7:32 AM | Vulnerability submitted to HackerOne’s bug bounty program. |
| 2019-01-31 | 7:21 PM | HackerOne validated the report and started incident response. |
| 2019-01-31 | 8:25 PM | HackerOne identified which code change introduced the security vulnerability and started work on a patch. |
| 2019-01-31 | 9:46 PM | A patch was released mitigating the identified vulnerability. |
| 2019-01-31 | 11:46 PM | HackerOne confirmed the vulnerability was not abused by any malicious actors. |
| 2019-02-01 | 6:18 AM | The root cause of the vulnerability was identified and a long term mitigation was proposed. |
| 2019-02-01 | 5:08 PM | Long term mitigation was deployed to production. |
| 2019-02-03 | 2:34 AM | Impacted users were alerted that their information was exposed to the reporters who submitted the vulnerability. |

Root Cause

HackerOne has a number of defenses in place to reduce the risk of over-exposing data through our GraphQL layer. The first notable defense is a separate database schema that limits the set of rows a user can query based on their current role. This significantly reduces the impact if, for example, the result of Report.all were serialized and returned to the user. The second notable defense is attribute-level authorization that depends on the role of the requester. This ensures that when an object is serialized, for example a publicly disclosed report, the user cannot obtain internal metadata of the report.
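
The two defenses can be illustrated with a minimal sketch in plain Ruby. It is not HackerOne's implementation: the role names, the attribute allow-list, the `publicly_disclosed` flag, and the helper names `visible_rows` and `serialize_report` are all assumptions made for illustration.

```ruby
# Hypothetical allow-list: which attributes each role may read on a report.
# Role and attribute names are illustrative, not HackerOne's real schema.
VISIBLE_REPORT_ATTRIBUTES = {
  public: %i[title disclosed_at],
  team_member: %i[title disclosed_at reference reference_link]
}.freeze

# Row-level defense (assumed shape): only return rows the role may see.
def visible_rows(rows, role)
  return rows if role == :team_member

  rows.select { |row| row[:publicly_disclosed] }
end

# Attribute-level defense: serialize only allow-listed attributes.
# Unknown roles get an empty allow-list, so the default is to expose nothing.
def serialize_report(row, role)
  allowed = VISIBLE_REPORT_ATTRIBUTES.fetch(role, [])
  row.select { |attribute, _| allowed.include?(attribute) }
end

reports = [
  { title: "XSS in search", disclosed_at: "2019-01-01", reference: "T-1",
    reference_link: "https://tracker.example/T-1", publicly_disclosed: true },
  { title: "Private finding", disclosed_at: nil, reference: "T-2",
    reference_link: "https://tracker.example/T-2", publicly_disclosed: false }
]

public_view = visible_rows(reports, :public).map { |row| serialize_report(row, :public) }
```

In this sketch a requester with the `:public` role receives one report containing only `title` and `disclosed_at`; the private row and the internal reference fields never reach the serializer output.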

On December 17th, when the code change was put up for review, engineers noticed the addition of the nodes field. An assumption was made that the field behaved like a shortcut for edges { node }, which, in hindsight, was not the case. No manual testing was performed to make sure that the authorization model for nodes was similar to other connection types.

Why upgrade?
HackerOne’s engineering team decided to upgrade to the class-based implementation of graphql-ruby because the old .define-based implementation was lazy-loaded. This caused problems when hot reloading pieces of code in a development environment. The class-based implementation also performs better in most situations. The .define-style implementation is also deprecated by the maintainers of the gem (to be removed with GraphQL 2.0).

Why didn’t we notice?
The nodes field is a helper field for Relay, which is used by the frontend. Even though the field was introduced, HackerOne engineers hadn’t started using this in our frontend. This caused the addition to fly under the radar of other engineers. The go-to way to query data through connection types at HackerOne is to go through the edges field. Because engineers outside of the specific team who upgraded to the class-based implementation did not deem the change important enough, there was no communication to other engineering teams.

Why was it exploitable?
When a GraphQL query is deconstructed and turned into one or more SQL queries, the result is cast into an array of stale objects, and the attribute-level authorization scrubs all data the current user isn’t authorized to see. Root cause analysis showed that this code path was only followed when the nodes were queried through the edges field.

Query that followed the expected code path

query {
  users {
    edges {
      node {
        email
      }
    }
  }
}

During the GraphQL gem upgrade on December 17th, all GraphQL types, connections, and mutations were rewritten to a class-based implementation. This introduced the nodes field on every connection type in HackerOne’s GraphQL schema. Instead of casting the result to an array with stale objects, the nodes field would result in an ActiveRecord::Relation object. The attribute-level authorization instrumentation would then incorrectly assume that the result was safe to be serialized, as it assumes the parent of the GraphQL field had already been scrubbed.

Query that followed the unexpected code path

query {
  users {
    nodes {
      email
    }
  }
}
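
The difference between the two code paths can be simulated in a few lines of plain Ruby. This is a sketch under stated assumptions, not HackerOne's instrumentation: `FakeRelation` is a hypothetical stand-in for an ActiveRecord::Relation, and `scrub` and `authorize_flawed` are invented names that mimic the behavior described above, where only arrays are scrubbed.

```ruby
# Stand-in for a lazily evaluated ActiveRecord::Relation (hypothetical class).
class FakeRelation
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end
end

SENSITIVE = %i[email].freeze

# Remove attributes the requester may not see from one record.
def scrub(row)
  row.reject { |attribute, _| SENSITIVE.include?(attribute) }
end

# Flawed instrumentation: only Arrays are scrubbed. Anything else is assumed
# to have been scrubbed by a parent field and is serialized untouched.
def authorize_flawed(result)
  return result.map { |row| scrub(row) } if result.is_a?(Array)

  result.to_a
end

users = [{ username: "alice", email: "alice@example.com" }]

via_edges = authorize_flawed(users.dup)               # edges path: an Array
via_nodes = authorize_flawed(FakeRelation.new(users)) # nodes path: a relation
```

Here `via_edges` contains only the username, while `via_nodes` still carries the email address, because the relation-like object never passes through `scrub`.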

In the team’s investigation to determine whether this was exploited by malicious actors, the team concluded that the current logging level enabled them to answer two crucial questions: which GraphQL queries were executed, and what information was transferred to the people who demonstrated the security vulnerability in the first place. The answers confirmed that the vulnerability was not exploited.

Resolution and Recovery

At 7:21 PM PST, HackerOne successfully reproduced the vulnerability as described by the reporter. The responding team identified the code change that introduced the vulnerability and started working on a short-term mitigation at 8:25 PM. This mitigation was released at 9:46 PM. The short-term mitigation was to disable the nodes field from every connection type. An internal code rule was deployed to alert the incident responders in case a new connection type was added that had the nodes field enabled. At the time, the root cause of the vulnerability was still unclear.
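
A code rule of this kind could look roughly like the following sketch. The report does not describe how the rule was implemented, so everything here is an assumption: `CONNECTION_FIELDS` is a hypothetical snapshot of connection types and their field names, and `connections_exposing_nodes` is an invented helper.

```ruby
# Hypothetical snapshot of connection types and the fields they expose.
CONNECTION_FIELDS = {
  "UserConnection" => %w[edges pageInfo total_count],
  "TeamConnection" => %w[edges nodes pageInfo]
}.freeze

# Return the names of connection types that still expose a `nodes` field.
def connections_exposing_nodes(connection_fields)
  connection_fields.select { |_, fields| fields.include?("nodes") }.keys
end

offenders = connections_exposing_nodes(CONNECTION_FIELDS)

# Alert on stderr; a real rule would notify the incident responders instead.
warn "nodes field enabled on: #{offenders.join(', ')}" unless offenders.empty?
```

Running a check like this in continuous integration would flag `TeamConnection` in the sample data and stay silent once every connection type has the field disabled.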

On February 1st at 6:18 AM, the team concluded the root cause analysis of the identified vulnerability. A long-term fix was put up for discussion. This fix addressed the underlying problem of the lack of attribute-level protection for the nodes field. Going forward, any newly introduced connection type will either be sanitized through the attribute-level authorization or will stop processing the request if an unexpected object is returned.
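
The "sanitize or stop" behavior can be sketched as a fail-closed check. This is an illustration only: `UnexpectedConnectionResult`, `scrub_user`, `authorize_strict`, and the list of sensitive attributes are hypothetical names, and a bare `Object` stands in for an unscrubbed relation.

```ruby
# Raised when a connection resolves to something other than scrubbed records.
class UnexpectedConnectionResult < StandardError; end

SENSITIVE_USER_ATTRIBUTES = %i[email otp_backup_codes].freeze

# Remove attributes the requester may not see from one record.
def scrub_user(row)
  row.reject { |attribute, _| SENSITIVE_USER_ATTRIBUTES.include?(attribute) }
end

# Fail closed: scrub Arrays of records and refuse everything else.
def authorize_strict(result)
  unless result.is_a?(Array)
    raise UnexpectedConnectionResult, "refusing to serialize #{result.class}"
  end

  result.map { |row| scrub_user(row) }
end

safe = authorize_strict([{ username: "alice", email: "alice@example.com" }])

begin
  authorize_strict(Object.new) # stands in for an unscrubbed relation
  halted = false
rescue UnexpectedConnectionResult
  halted = true
end
```

The design choice is that an unknown result type is treated as an error rather than as already-safe data, so a new field that bypasses scrubbing breaks the request instead of leaking attributes.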

The minimum bounty award for a critical vulnerability on hackerone.com is currently set to $15,000. Even though this vulnerability exposed confidential information, it was limited to user information and metadata of programs and reports. None of the exposed information could have led to the compromise of confidential vulnerability information. It did, however, allow actors to query a significant amount of information. Because of that, the team decided to award the reporters with $20,000 for uncovering this vulnerability and working with us throughout the process.

Vulnerability Impact on Data

Sensitive information of multiple objects was exposed. Due to the two notable defenses as described in the Root Cause section, the scope of the information that was exposed was limited. Below is an overview of the objects and the confidential data that a user was able to access.

Connection: users
The GraphQL schema enables anyone to query the users on the platform. This is an intentional design decision. However, because every User object could be accessed, a significant amount of confidential information was accessible.

Below is an overview of all sensitive attributes that could be queried for every user on hackerone.com.

| Sensitive attribute | Note |
| --- | --- |
| account_recovery_phone_number | The last two digits of a verified account recovery phone number. |
| account_recovery_unverified_phone_number | The complete unverified account recovery phone number. |
| address | Accessible when swag was awarded for a report the authenticated user had access to, regardless of their role (e.g. publicly disclosed report). |
| calendar_token | The secret calendar token that exposes when HackerOne challenges were scheduled for the user. This does not expose customer names. |
| duplicate_users | An array of possible duplicate accounts based on platform behavior. |
| email | The email address. |
| otp_backup_codes | An array of bcrypt-hashed OTP backup codes. |
| payout_preferences | A connection of the user’s payout preferences. This does not include bank account details. |
| reports | See Report connection for the scope and attributes that were exposed. |
| unconfirmed_email | The unconfirmed email address. |

Connection: teams
The secure database schema, by default, allows any user to query public programs (teams) and public external programs. Because of the relationship between external programs and HackerOne programs, this data set includes programs that may be running a private program. This means it was possible to obtain internal triage notes and the policy of a select number of private programs the user did not have access to. The reporters queried partial program information, but they did not obtain any sensitive information that required HackerOne to reach out to any customers.

| Sensitive attribute | Note |
| --- | --- |
| average_bounty_lower_amount | The lower bound of the average bounty range. |
| average_bounty_upper_amount | The higher bound of the average bounty range. |
| base_bounty | The minimum bounty of a program. |
| bounties_total | The sum of awarded bounties in the entire lifetime of the program. |
| bug_count | The total number of resolved reports. |
| child_teams | A connection containing the hierarchy of teams. |
| first_response_time | A float containing the average time to first response. |
| goal_valid_reports | The goal of valid vulnerabilities per month the program set. |
| grace_period_remaining_in_days | The number of days the program has to recover from too many SLA failures to avoid their program being taken off HackerOne. |
| new_staleness_threshold | The internal SLA until a report is marked as an SLA miss when it hasn’t received a first response. |
| new_staleness_threshold_limit | The internal SLA until a report is marked as an SLA fail when it hasn’t received a first response. |
| policy | The program policy in raw markdown. |
| policy_html | The rendered program policy. |
| product_edition | The product edition the program uses. |
| report_submission_form_intro | The submission form introduction in raw markdown. |
| report_submission_form_intro_html | The rendered submission form introduction. |
| report_template | The default report template in raw markdown. |
| reporters | An array of user objects who have reporter access to the program. |
| resolution_time | A float containing the average time to resolution. |
| resolved_staleness_threshold | The internal SLA until a report is marked as an SLA miss when it hasn’t been resolved. |
| sla_failed_count | The number of reports failing the internal SLA. |
| structured_policy | A structured representation of the program policy. |
| structured_scopes | A connection that only disclosed an internal reference in case the user was authorized to see the structured scopes on the program page. |
| target_signal | A float representing the targeted signal of the program. |
| triage_bounty_management | A text field containing instructions for HackerOne’s triage team on how to handle bounty payments. |
| triage_enabled | A boolean field indicating whether the program uses HackerOne’s triage services. |
| triage_note | Internal triage notes in raw markdown. |
| triage_note_html | The rendered triage notes. |
| triage_time | A float containing the average time to triage. |
| triaged_staleness_threshold | The internal SLA until a report is marked as an SLA miss when it hasn’t been triaged. |
| triaged_staleness_threshold_limit | The internal SLA until a report is marked as an SLA fail when it hasn’t been triaged. |
| whitelisted_hackers | See reporters. |

Connection: reports
The reports data hasn’t been fully migrated to the secure database schema yet, which means that at the time the vulnerability was reported, only fully publicly disclosed reports and reports the user participated in were accessible. This significantly reduced the amount of report information that was exposed.

| Sensitive attribute | Note |
| --- | --- |
| anc_reasons | An array of strings containing flags why the report was submitted to the HackerOne Human-Augmented Signal queue. |
| mediation_requested_at | A date/time field when mediation was requested. |
| pre_submission_review_state | A flag representing how Human-Augmented Signal responded to the report. |
| reference | An optional internal reference. |
| reference_link | An optional link to an internal ticket. |

Even though the reporters confirmed that they did not query more information than necessary to prove the vulnerability and that they have deleted the information, HackerOne has reached out to the people whose sensitive information was downloaded by the reporters.

If your data was accessed during this incident, you have received a separate notification from HackerOne.

Preventative Measures

As part of our incident response process, we are conducting an internal review and analysis of the incident. We are taking the following actions to address the underlying causes of issues and to help prevent future occurrence:

  • Consider leveraging the graphql-ruby gem hooks for built-in authorization callbacks to catch more edge cases
  • Break the execution flow when an unexpected object is returned in the resolution of a connection field
  • Reduce the complexity of connection type resolution

Timeline

yashrs submitted a report to HackerOne.
2019-01-31T15:32:20.974Z

Summary:
The GraphQL endpoint doesn't have access controls implemented properly.

Description:
Any attacker can get personally identifiable information of users of Hackerone such as email address, backup hash codes, facebook_user_id, account_recovery_phone_number_verified_at, totp_enabled, etc.

These are just some examples of fields which are getting leaked directly from GraphQL.

This is the request sent to GraphQL:

{
  id
  users()
  {
    total_count 
    nodes
    {
      _id
      name
      username
      email
      account_recovery_phone_number
      account_recovery_unverified_phone_number
      bounties
      {
        total_amount
      }
      otp_backup_codes
      i_can_update_username
      location
      year_in_review_published_at
      anc_triager
      blacklisted_from_hacker_publish
      calendar_token
      vpn_credentials
      {
        name
      }
      account_recovery_phone_number_sent_at
      account_recovery_phone_number_verified_at
      swag
      {
        total_count
      }
      totp_enabled
      subscribed_for_team_messages
      subscribed_for_monthly_digest
      sessions
      {
        total_count
      }
      facebook_user_id
      unconfirmed_email
    }
  }
}

Sample Response:
█████████

Please fix it.

Thanks,
Yash :)

Impact

This could potentially leak many users' info

Regards,
Frans

yashrs Activities::Comment
2019-01-31T16:30:59.149Z
After further research, we also found the following:

- User email addresses also leak the private program information
- Duplicate users can give info about actual users. For example: (jobert -> ███████, Michiel -> ██████████) ██████
- Invitation preference: █████████
- T_shirt size
- edit_unclaimed_profiles(true/false)
- Lufthansa account(what is it?)
- Next username update date

Similarly, the total count on Users is ███, so we are able to extract information for any user and also for all if an attacker wants to.

Thanks,
Yash


yashrs Activities::ReportCollaboratorInvited
2019-01-31T16:32:16.231Z


milindpurswani Activities::ReportCollaboratorJoined
2019-01-31T16:32:39.853Z


milindpurswani Activities::Comment
2019-01-31T16:39:41.250Z
For instance, we are able to extract information about a hackerone staff member @still by using the feature of graphql, **after cursor**, `users(after:"MzY4MDYw")`.

**P.S We haven't saved any other information other than mentioned here.**

```
{
  id {
    id
    team {
      _id
      about
    }
    uuid
  }
  me {
    _id #388246
    id #gid://hackerone/User/388246
  }
  users(after:"MzY4MDYw") {
    total_count
    pageInfo {
      hasNextPage
      endCursor
      startCursor
    }
    nodes() {
      _id
      name
      username
      hackerone_triager
      email
      authentication_service
      created_at
      duplicate_users {
        total_count
        nodes {
          _id
          name
          username
          bio
          bounties {
            average_amount
          }
          account_recovery_phone_number
          hackerone_triager
        }
      }
      account_recovery_phone_number
      account_recovery_unverified_phone_number
      bounties {
        total_amount
      }
      otp_backup_codes
      i_can_update_username
      location
      #year_in_review_published_at
      anc_triager
      #blacklisted_from_hacker_publish
      calendar_token
      facebook_user_id
    }
  }
}
```

████


jobert Activities::Comment
2019-02-01T03:16:02.463Z
Hi @yashrs and @milind1997 - thanks for this. We're looking into this now and we'll keep you posted.


jobert Activities::BugTriaged
2019-02-01T03:22:45.872Z
Nice, we were able to reproduce the vulnerability you described. We'll jump on it right away!


yashrs Activities::Comment
2019-02-01T05:29:16.310Z
Additionally, we found out that `teams()` was also affected. So this further widens the impact and attack surface of this report.

The **triage_note** shouldn't be visible to anyone. It reveals information like test accounts for hackers, SAML credentials and other sensitive information that should be only visible to HackerOne Team.

██████████

Also, as seen in the above screenshot, other information like `max_number_of_team_mediation_requests`, `last_invitation_accepted_at_for_user`, etc. were found. There may be more to this, but we haven't investigated 100%.

Thanks,
Yash :)


jobert Activities::Comment
2019-02-01T05:31:22.693Z
Hi @yashrs and @milind1997 - thanks for continuing to look into this. We're aware that this exposes more data than you initially reported. We will follow up with the data that was possible to be queried in a post mortem. We'd kindly like to ask you to stop testing right now. Thanks for your cooperation!


yashrs Activities::Comment
2019-02-01T05:36:07.072Z
Hello @jobert, Thanks for your quick response. We were just assessing the attack surface searching for worst case scenarios. But, now that you are aware about all the risks, we will stop. Thanks -Yash :)


jobert Activities::Comment
2019-02-01T06:22:52.195Z
Hi @yashrs and @milind1997 - thanks again! We just deployed a fix for the vulnerability you discovered. Can you confirm the fix? We are continuing with our investigation to determine whether this has been abused. Thanks!


jobert Activities::ReportSeverityUpdated
2019-02-01T06:26:32.473Z


jobert Activities::ReportVulnerabilityTypesUpdated
2019-02-01T06:27:47.188Z


yashrs Activities::Comment
2019-02-01T06:33:54.363Z
I can confirm that it is fixed. I get an error from GraphQL now. That was quick :)


jobert Activities::BugResolved
2019-02-01T06:39:39.385Z
Thanks for confirming, it's much appreciated! We'll wrap up our investigation, provide a summary in this report with our root cause analysis, and award a bounty soon. Unrelated to the vulnerability itself: we noticed that you're both collaborators on this report and we want to make sure that the weights are set correctly. Can you confirm this?


yashrs Activities::Comment
2019-02-01T06:45:20.092Z
> Thanks for confirming, it's much appreciated! We'll wrap up our investigation, provide a summary in this report with our root cause analysis, and award a bounty soon.

Thanks, that is much appreciated :) I'm so excited, it's my first accepted bug on Hackerone

> Unrelated to the vulnerability itself: we noticed that you're both collaborators on this report and we want to make sure that the weights are set correctly. Can you confirm this?

Thanks for noticing that @jobert, but yes I can confirm that it's correctly set.


Activities::BountyAwarded
2019-02-02T00:18:40.284Z
Hi @yashrs and @milindpurswani - thanks again for bringing this to our attention, this was an amazing finding! We've added a post mortem at the top of the report to prepare this to be publicly disclosed. This includes how we decided on the bounty amount. We've redacted the screenshots you provided us. We look forward to receiving vulnerabilities from both of you in the future! Happy hacking!


Activities::BountyAwarded
2019-02-02T00:18:41.408Z
Hi @yashrs and @milindpurswani - thanks again for bringing this to our attention, this was an amazing finding! We've added a post mortem at the top of the report to prepare this to be publicly disclosed. This includes how we decided on the bounty amount. We've redacted the screenshots you provided us. We look forward to receiving vulnerabilities from both of you in the future! Happy hacking!


michiel Activities::ReportTitleUpdated
2019-02-02T00:35:42.197Z


jobert Activities::ReportTitleUpdated
2019-02-02T00:48:10.686Z


milindpurswani Activities::Comment
2019-02-02T01:42:49.410Z
We are glad we could help make Hackerone more secure.


yashrs Activities::Comment
2019-02-02T01:44:54.538Z
Thank you so much @jobert and @hackerone team for fixing this so quickly and awarding the bounty :D Do you think we are eligible for some swag? Would love to have one!


Activities::SwagAwarded
2019-02-02T01:46:47.485Z
Of course! Happy to send you some swag for such a great find. :-)


milindpurswani Activities::Comment
2019-02-02T01:52:23.287Z
Hello team, Two researchers collaborated, so do you think that the other researcher is also eligible for some swag?


yashrs Activities::Comment
2019-02-02T03:28:02.477Z
@jobert @security Slightly related to this vuln: The user himself is able to read the otp_backup_codes hashes. I know this doesn't cause any harm in general but just wanted to confirm if it's intended before this report is disclosed

```
{
  me {
    _id #388246
    id #gid://hackerone/User/388246
    otp_backup_codes
    username
  }
}
```

Resp: {F416558}

Thanks,
Yash :)


yashrs Activities::Comment
2019-02-02T03:31:13.830Z
Also, just curious: What is the difference between edges[node] and nodes.. why are there two fields which do the same thing?


jobert Activities::Comment
2019-02-02T18:24:43.950Z
> Two researchers collaborated, so do you think that the other researcher is also eligible for some swag?

Yes, we'll make sure to send both of you swag.

> The user himself is able to read the otp_backup_codes hashes. I know this doesn't cause any harm in general but just wanted to confirm if it's intended before this report is disclosed

Thanks for asking! It is currently intentional, but when we worked on this incident we noticed that this could be implemented in a different way. We'll likely remove it from the schema in some time.

> What is the difference between edges[node] and nodes.. why are there two fields which do the same thing?

Great question! From what I could see [in the commit history of the gem](https://github.com/rmosolgo/graphql-ruby/commit/cc6ce94032f49b8d732a4a134d7ea484e86d9d05#diff-9d2aa305e6bee5b203288ca963d6d4d4), it is simply a shorthand for `edges { node }`. It wasn't supposed to be added by default though, and so for compatibility the maintainer later accepted [a pull request](https://github.com/rmosolgo/graphql-ruby/pull/1693) to make it configurable.


reed Activities::AgreedOnGoingPublic
2019-02-03T10:36:47.001Z


milindpurswani Activities::Comment
2019-02-03T10:39:07.064Z
Hello @reed, please redact the last screenshot posted by @yashrs. Then we can disclose it. Thanks -Milind


reed Activities::Comment
2019-02-03T10:44:52.310Z
@milindpurswani done! Please accept disclosure. :-)


yashrs Activities::AgreedOnGoingPublic
2019-02-03T10:57:19.143Z
Here we go!! {F417233}


yashrs Activities::ReportBecamePublic
2019-02-03T10:57:19.243Z