Keeping your conversations private in the age of supervised machine learning and government snooping

Most of us would like to keep our conversations with other people private, even when we are not discussing anything secret. That the person behind you on the bus can hear you discussing last night’s football game with a friend is perhaps not something that would make you feel uneasy, but what if employees, or outsourced consultants, from a big tech firm are listening in? Or what if government agencies are recording your conversations and using data mining techniques to flag them for analyst review whenever you mention the wrong keywords? That would certainly be unpleasant to most of us. The problem is, this is no longer science fiction.

You are being watched.

Tech firms listening in

Tech firms are using machine learning to create good consumer products – like voice messaging that allows direct translation, or digital assistants that need to understand what you are asking of them. The problem is that such technologies cannot learn entirely by themselves, so your conversations are being recorded. And listened to.

Microsoft: https://www.vice.com/en_us/article/xweqbq/microsoft-contractors-listen-to-skype-calls

Google: https://www.theverge.com/2019/7/11/20690020/google-assistant-home-human-contractors-listening-recordings-vrt-nws

Amazon: https://www.bloomberg.com/news/articles/2019-04-10/is-anyone-listening-to-you-on-alexa-a-global-team-reviews-audio

Apple: https://www.theguardian.com/technology/2019/jul/26/apple-contractors-regularly-hear-confidential-details-on-siri-recordings

All of these systems are listened in on in order to improve speech recognition, which is hard for machines; they need some human help. The problem is that users have generally not been aware that their conversations, or bedroom activities, may be listened to by contractors in some undisclosed location. It certainly doesn’t feel great.

That is probably not a big security problem for most people: it is unlikely that contractors can specifically target you as a person and listen in on everything you do. Technically, however, it could be possible. What if adversaries could bribe their way to listening in on the devices of decision makers? We already know that tech workers, especially contractors and those at the lower end of the pay scale, can be talked into taking a bribe (an AT&T employee installed malware on company servers to allow unauthorized unlocking of phones (wired.com), and Amazon has investigated employees suspected of leaking data for bribe payments). If you can bribe employees to game phone-unlocking systems, you can probably manipulate them into subverting the machine learning QA systems too. Because of this, if you are a target of high-resource adversaries, you should probably be skeptical about digital assistants and careful about what you say around them.

Governments are snooping too

We kind of knew it already, but not the extent of it. Then Snowden happened – confirming that governments run massive surveillance programs that capture the communications of everyone and make them searchable. The NSA was heavily criticized for its invasive practices in the US, but that did not stop such programs from being developed further, or the rest of the world from following. Governments have the powers to collect massive amounts of data and analyze it. Here’s a good summary of the current US state of phone record collection from Reuters: https://www.reuters.com/article/us-usa-cyber-surveillance/spy-agency-nsa-triples-collection-of-u-s-phone-records-official-report-idUSKBN1I52FR.

The rest of the world is likely not far behind, and governments are passing laws to make collection lawful. The stated intent is the protection of democracy, freedom of speech, and the evergreen “stopping terrorists”. The only problem is that mass surveillance seems to be relatively ineffective at stopping terrorist attacks, it has been found to have a chilling effect on freedom of speech and participation in democracy, and it even stops people from seeking information online because they feel somebody is watching them. Jonathan Shaw wrote an interesting piece on this in Harvard Magazine in 2017: https://harvardmagazine.com/2017/01/the-watchers.

When surveillance makes people think “I feel uneasy researching this topic – what if I end up on some kind of watchlist?” before informing themselves, what happens to the way we engage, discuss and vote? Surveillance has some very obvious downsides for us all.

If an unspoken fear of being watched is stopping us from thinking the thoughts we otherwise would have had, this is a partial victory for extremists and for the enemies of democracy, and a loss for the planet as a whole. Putting further bounds on thought and exploration will also likely have a negative effect on creativity and on our ability to find new solutions to big societal problems such as climate change, poverty, and even religious extremism and political conflict – the latter being the reason we seem to accept such massive surveillance programs in the first place.

But isn’t GDPR fixing all this?

The GDPR is certainly a good thing for privacy, but it has not fixed the problem, at least not yet, even though it applies to the big tech firms and the adtech industry. As documented in this post from Cybehave.no, privacy statements are still too long, too complex, and too hidden for people to care. We all just click “OK” and remain subject to the same advertising-driven surveillance as before.

The other issue is that the GDPR does not apply to data collection for national security purposes. For that sort of collection, the surveillance state is still growing, with more advanced programs, more collection, and more sharing between intelligence partners. In 2018 we got the Australian addition, the rather unpleasant “Assistance and Access” Act, which allows for government-mandated backdoors in software, and now the US wants to backdoor encrypted communications (again).

Blocking the watchers

It is not very difficult to block the watchers – at least the advertisers, criminals and non-targeted collection (if a government agency really wants to spy on you as an individual, it will probably succeed). Here’s a quick list of things you can do to feel slightly less watched online:

  • Use an ad blocker to keep tracking cookies and beacons at bay. uBlock Origin is good.
  • Use a VPN service to keep your web traffic away from your ISP and telephone company. Make sure you look closely at the practices of your VPN provider before choosing one.
  • Use end-to-end encrypted messaging instead of regular phone conversations and text messages. Signal is a good choice, at least until the US actually introduces backdoor laws (hopefully that doesn’t happen).
  • Use encrypted email, or encrypt the message you are sending. Protonmail is a Swiss webmail alternative with encryption built in when you send email to other Protonmail users. It also allows you to encrypt messages to other email services with a password (see the sketch below for a do-it-yourself approach).
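If you want to encrypt a message yourself, independent of any provider, GnuPG can do it from the command line. A minimal sketch, assuming gpg is installed and message.txt is a file you want to protect (the filename is just an example):

gpg --symmetric --cipher-algo AES256 message.txt
# writes message.txt.gpg, encrypted with a passphrase you choose
gpg --decrypt message.txt.gpg
# the recipient enters the same passphrase, shared over a separate channel

For regular correspondence, public-key mode (gpg --encrypt --recipient KEYID message.txt, where KEYID identifies the recipient’s key) avoids having to share a passphrase at all.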

If you follow these practices, it will generally be much harder to snoop on you.

Do you consider security when buying a SaaS subscription?

tl;dr: SaaS apps often have poor security. Before deciding to use one, do a quick security review: read the privacy statement, ask for security documentation, and test authentication practices, crypto and console.log information leaks before deciding whether to trust the app. This post gives you a handy checklist to breeze through your SaaS pre-trial security review.

Over the last year I’ve been involved in buying SaaS access to lots of services in a corporate setting. My stake in the evaluation has been security. Of course, we can’t run active attacks to evaluate software we are considering buying, but we can read privacy statements, look for strange things in the HTML, and test the login process if the vendor offers a free or trial account (most do). Here are the challenges I’ve generally found:

  • Big-name players know what they are doing, but they have so many setup options that you need to be careful what you select, especially if you have specific compliance needs.
  • Smaller firms (including startups) don’t offer a lot of documentation, and quite often, when you talk to them, they don’t really have much in place in terms of security management (they may still make great software; it is just harder to trust it in an enterprise environment)
  • Authentication blunders are very common. From HR support systems to payroll to IT management tools, we’ve found a lot of bad practices:
    • Ineffective password policies (five-digit, numbers-only passwords, anyone?)
    • A theoretical password policy in place, but with validation performed only client-side (meaning it is easy to trick the server into setting a very weak password, or even no password in some cases, should you so desire)
    • Lack of cookie security for session cookies (missing HttpOnly and Secure flags, allowing cookie theft through XSS or man-in-the-middle attacks)
    • Poor password reset processes. In one case I found that the supposedly random string used for a password reset link was just a base64 encoding of my username… how hard is that to abuse? (See the sketch after this list.)
  • When you ask about development practices and security testing, many firms will lack such processes but try to explain that they implicitly have them because their developers are so smart (even the firms with the authentication blunders mentioned above).
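About that base64 reset link: base64 is an encoding, not encryption, and anyone can reverse it with one shell command. A minimal sketch (the email address is a made-up example):

echo -n 'jane.doe@example.com' | base64
# prints amFuZS5kb2VAZXhhbXBsZS5jb20=
echo 'amFuZS5kb2VAZXhhbXBsZS5jb20=' | base64 --decode
# prints jane.doe@example.com

An attacker who knows a victim’s username or email can thus construct a valid “random” reset link without ever having received one.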

Obviously, at some point, security will be so bad that you have to say “No, we cannot buy this shiny thing, because it is rotten on the inside”. This is not a popular decision in some cases, because the person requesting the service probably has a good reason for requesting it.

Don’t put your data and business at risk by using a SaaS product you can’t trust. By doing a simple due diligence job you can at least identify the services you definitely should be avoiding!

To evaluate SaaS security without spending too much time on it, I’ve come up with a process that works pretty well. So, here’s a simple way to sort the terrible from the average!

  1. Read the privacy statement and the terms and conditions. Not the whole thing, that is what lawyers are for (if you have some), but scan it and look for anything security related. Usually they will try to explain how they protect your personal data. If it is a clear and understandable explanation, they usually know what they are doing. If not, they usually don’t.
  2. Look for security or compliance documentation, or request it from their sales/support team. Specific questions to consider are:
    1. Do they offer encryption at rest (if this is reasonable to expect)?
    2. Do they explain how they store passwords?
    3. Do they use trustworthy data centers, and do they have some form of compliance proof of good practice for those data centers (SOC 2 reports, ISO certificates, etc.)?
    4. Do they guarantee limited access for insiders?
    5. Do they explain what cryptographic controls they use for TLS, signatures and at-rest encryption – that is, how they perform key management and exchange, which ciphers they allow, and the key strengths they require?
    6. Do they say anything about vulnerability management and security testing?
    7. Do they say anything about incident handling?
  3. Perform some super-lightweight testing. Don’t break the law, but do your own due diligence on the app:
    1. Create a free account if possible, and test if the authentication practices seem sound:
      • Test the “I forgot my password” functionality to see if the password reset link has an expiry time, and if it is an unguessable link
      • Try changing your password to something really bad, like “password” or “dog”. Try replaying POST requests if stopped by client-side validation (this can be done directly from the dev tools in Firefox; no hacker tool necessary)
      • Try logging in many times with the wrong password to see what happens (warning, lockout… or, most likely, nothing)
    2. Check their website certificate practices by running sslscan, or by using the online version at ssllabs.com. If they support really weak ciphers, it is an indicator that they are not on the ball. If they still support SSL (pre-TLS 1.0), it is probably a completely outdated service.
    3. Check for client-side libraries with known vulnerabilities. If they are using ancient versions of jQuery and other client-side libraries, they are likely at risk of being hacked. This goes for WordPress sites as well, including their plugins.
    4. Check for information leaks by opening the browser console (Ctrl+Shift+I or F12 in most browsers). If the console logs a lot of internal information coming from the backend, they probably don’t have the best security practices (see the terminal sketch below).
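A couple of the checks above take only seconds from a terminal. A minimal sketch, where app.example.com stands in for the service under review and the endpoint and parameter names are hypothetical – only ever test against your own trial account:

# TLS check: weak ciphers or pre-TLS 1.0 support are red flags
sslscan app.example.com

# Cookie check: Set-Cookie headers should include HttpOnly and Secure
curl -sI https://app.example.com/login | grep -i 'set-cookie'

# Replay a password change with a weak password to see whether
# validation is enforced server-side and not just in the browser
curl -X POST 'https://app.example.com/account/password' \
  -H 'Cookie: session=YOUR-TRIAL-SESSION-COOKIE' \
  -d 'new_password=dog'

If the server happily accepts “dog” as a password, you have learned something the sales deck would not have told you.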

So, should you dump a service if any of these tests “fail”? Probably not. Most sites have some weak points, and sometimes that is a tradeoff made for a reason that may be perfectly legitimate. But if a lot of these bad signs are in the air, and you would be running a critical process or storing your company secrets in the app, you will be better off finding another service to suit your needs – or at least you will be able to sleep better at night.

 

Why you should be reading privacy statements before using a web site

If you are like most people, you don’t read privacy statements. They are boring, often generic, and seem to be created to protect businesses from lawsuits rather than to inform customers about how their privacy is protected. Still, when you know what to look for to make up your mind about “is it OK to use this product?”, such statements are helpful.

If somebody were wiretapping all of your phone calls, you wouldn’t be happy. Why, then, should you accept surveillance when using other services? Most people do, and while they may have a feeling that their browsing is “monitored”, they may not have full insight into what people can do with the data they collect, or how much data they actually get access to.

Even so, there is much to be learned from looking at a privacy statement. If you are like most people, you are not afraid of sharing things on the internet, but you still don’t want the platforms you use to abuse the information you share. In addition, you would like to know what you are sharing. It is obvious that you are sharing a photo when you include it in a Facebook status update – but is it obvious that you are sharing your phone number and location when you are using a browser add-on? The former we are generally OK with (it is our decision to share); the latter not so much – we are typically tricked into sharing such information without even knowing that we do it.

Here’s an example of a privacy statement that is both good and bad: http://hola.org/legal/privacy. It is good in that it is relatively explicit about what information it collects (basically everything they can get their hands on), and it is bad because they collect everything they can get their hands on. Hola is a “VPN” service that markets itself as a security and privacy product. Yet their website does not use SSL, their so-called VPN service does not use encryption but is really an unencrypted proxy network, they accept users who register with “password” as their password, and so on… so much for security. So, here’s a bit of what they collect (taken from their privacy policy):

  • So-called anonymous information: approximate geo-location, hardware specs, browser type and version, date of software installation (of their add-on, I presume), the date you last used their services, operating system type and version, OS language, registry entries (really?), URL requests, and time stamps.
  • Personal information: IP address, name, email, screen names, payment info, and “other information we may ask for”. In addition, you can sign up with your Facebook profile, from which they will collect your username, email, profile picture, birthday, gender and preferences. When anonymous information is linked to personal information, it is treated as personal information. (OK…?)
  • Other information: information that is publicly available as a result of using the service (their so-called VPN network) may be accessed by other users as a cache on your device. This is basically your browsing history.

Would you use this service after seeing all of this? They collect as much as they can about you, and they have pretty lax security. The next section of their privacy statement that should be of interest is “Information we share”. What they call anonymous information they share with whomever they please – meaning marketing people. They may also share it for “research purposes”. Note that the anonymous information is probably enough to fingerprint exactly who you are, and to track you around the web afterwards using tracking cookies. This is pretty bad.

They also state when they share “personal information”. It includes the usual reason, legal obligations (such as subpoenas and court orders). In addition, they may share it to detect, prevent or address fraud, security issues, violations of policy or other technical issues (basically, this can be whatever you like it to be), to enforce the privacy policy or any other agreements between the user and them, and, finally, the best reason of all: to protect against harm to the rights, property or safety of the company, its partners, users or the public. So basically, they collect as much as they want about you and share it with whomever they like, for whatever reasons they like. Would anyone use such a service? According to their web page, they have 125 million users.

125 million users accept that their personal data is being harvested, analysed and shared at will by a company that provides “VPN” with no encryption and that accepts the use of “password” as password when signing up for their service.

So, here’s the take-away:

  • Read the privacy policy, looking specifically for:
    • What they collect
    • How they collect it
    • What they use the information for
    • With whom they share the information
    • How they secure the information
  • Think about what this means for the things that are important to your privacy. Do you accept that they do the stuff they do?
  • What is the worst-case use of that information if the service provider is hacked? Identity theft? Incriminating cases for blackmail? Political profiling? Credibility building for phishing or other scams? The more information they gather, the worse the potential impact.
  • Finally, never trust someone claiming to sell a security product that obviously does not follow good security practice. No SSL? Accepting weak passwords? Take your business elsewhere; it is not worth the risk.

Avoid keeping sensitive info in a code repo – how to remove files from git version history

One of the vulnerabilities that is really easy to exploit is when people leave super-sensitive information in source code – and you get your hands on that source code. In early prototyping, a lot of people will hardcode passwords and certificate keys in their code and remove them later when moving to production code. Sometimes they are not even removed from production. But even when you do remove them, this sensitive information can linger in your version history. What if your app is an open source app where you share the code on GitHub? You probably don’t want to share your passwords…

Don’t let bad guys get the key to your databases and other valuable files by searching old versions of your code in the repository.

Getting this sensitive info out of your repository is not as easy as deleting the file from the repo and adding it to the .gitignore file – because this does not touch your version history. What you need to do is this:

  • Merge any remote changes into your local repo, to make sure you don’t remove the work of your team if they have committed after your own last merge/commit
  • Remove your sensitive files from the version history in your local repo using the filter-branch command:

git filter-branch --force --index-filter \
'git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA' \
--prune-empty --tag-name-filter cat -- --all

Although the command above looks somewhat scary, it is not that hard to dig out – you can find it in the GitHub doc files. When that’s done, there are only a few more things to do:

  • Add the files in question to your .gitignore file
  • Force-push to the remote repo: git push origin --force --all (see the quick check below before you do this)
  • Tell all your collaborators to clone the repo as a fresh start, to avoid them merging the sensitive files back in
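Before the force push, it is worth verifying that the file really is gone from the history of every branch. A quick check, using the same placeholder path as above:

git log --all --oneline -- PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA
# no output means no commit on any branch still touches the file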

Also, if you have actually pushed sensitive info to a remote repository – particularly if it is an open source, publicly available one – make sure you change all passwords and certificates that were included: this info should be considered compromised.



What does the GDPR (General Data Protection Regulation) mean for your company’s privacy protection and cybersecurity?

The EU is ramping up its focus on privacy with a new regulation that will be implemented in local legislation in the EEA from 2018. The changes are huge for some countries, and in particular the sanctions the new law makes available to authorities should be cause for concern for businesses that have not adapted. Shockingly, a Norwegian survey shows that 1 in 3 business leaders have not even heard of the new legislation, and that 80% of the respondents have not made any effort to learn about the new requirements and their implications for their business (read the DN article, in Norwegian, here: http://www.dn.no/nyheter/2017/02/18/1149/Teknologi/norske-ledere-uvitende-om-ny-personvernlov). The Norwegian Data Protection Authority calls this “shocking”, and says all businesses will face new requirements and that it is the duty of business leaders to orient themselves about this and act to comply with the new rules.

The new EU General Data Protection Regulation (GDPR) will become law in most European countries from 2018. Make sure you have the right controls in place before it does. This even applies to non-European businesses offering services in Europe.

Here’s a short form of key requirements in the new regulation:

  • All businesses must have a human-readable privacy policy: many privacy and data protection policies today are written in legal jargon and made hard to understand on purpose. The new regulation requires businesses to state their policies and describe how personal data is protected in language that is comprehensible to the user group they are working with, including children if children are part of the company’s target user group.
  • You need to do a risk assessment for privacy and data protection of personal data. The risk assessment should consider the risk to the owner of the data, not only the business. If the potential consequences of a data breach are high for the data owner, the authorities should be involved in discussions on how to mitigate the risk.
  • All new solutions need to build privacy protection into the design. The highest level of data protection in a software’s settings must be used as the default, meaning you can only collect a minimum of data by default, unless the user actively changes the settings to allow you to collect more. This will have large implications for many cloud providers that collect a lot of data by default. See, for example, how Google Maps collects location data and tracks the user’s location: https://safecontrols.blog/2017/02/18/physically-tracking-people-using-their-cloud-service-accounts/
  • All services run by authorities, and most services run by private companies, will require the organization to assign a data protection officer responsible for compliance with the GDPR and for communicating with the authorities. This applies to all businesses that handle personal data at a certain scale and frequency in their operations – meaning, in practice, that most businesses must have a data protection officer. It is permissible to hire a third party for this role instead of having an employee fill the position.
  • The new regulation also applies to non-European businesses that offer services to Europe.
  • The new rules also apply to data processing service providers, and subcontractors. That means that cloud providers must also follow these rules, even if the service is used by their customer, who must also comply.
  • There will be new rules about communicating data breaches – both to the data protection authorities and to the data subjects harmed. All breaches that have implications for individuals must be reported to the data protection authorities within 72 hours of the breach.
  • The data subjects hold the keys to your use of their data. If you store data about a person and this person orders you to delete their personal data, you must do so. You are also required to let the person transfer personal data to another service provider in a commonly used file format if so requested.

The new regulation also gives the authorities the ability to impose very large fines: up to 20 million euros or up to 4% of global annual turnover, whichever is greater. This is, however, a maximum, and not likely to be the normal sanction. A warning letter would be the start, then audits from the data protection authorities. Fines can be issued, but will most likely be in line with the common practice for corporate fines in the country in question.

Implications for cybersecurity

The GDPR focuses on privacy and the mechanisms necessary to avoid abuse of personal data. The regulation also requires you to be vigilant about cybersecurity in order to avoid data breaches. In particular, Recital 39 states (see the full text here: http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=OJ:L:2016:119:FULL):

“Personal data should be processed in a manner that ensures appropriate security and confidentiality of the personal data, including for preventing unauthorised access to or use of personal data and the equipment used for the processing.”

This means that you should implement reasonable controls to ensure the confidentiality, integrity and availability of these data and the processing facilities (software, networks, hardware, and also the people involved in processing the data). It would be a very good idea to implement at least a reasonable information security management system, following good practices such as those described in ISO 27001. If you want a roadmap to an ISO 27001 compliant management system, see this post summarizing the key aspects: https://safecontrols.blog/2017/02/12/getting-started-with-information-management-systems-based-on-iso-27001/.

You may also be interested in the 88-page slide deck with an overview of cybersecurity basics: it is a free download if you sign up as a Safecontrols Insider.