Article 43

 

Broadband Privacy

Tuesday, November 12, 2019

Project Nightingale

image: google

Google’s “Project Nightingale” analyzes medical records to create “Patient Search” for health providers

By Abner Li
9to5Google
Nov 2019

Beyond the ACQUISITION OF FITBIT earlier this month, Google's health ambitions are multi-faceted and extend into services for hospitals and health providers. One such effort, Project Nightingale, was detailed today, along with the end product: Patient Search.

The Wall Street Journal today REPORTED on Project Nightingale, with Forbes providing more details on the effort, including screenshots.  Ascension - one of the country’s largest healthcare systems - is moving its patient records to Google Cloud. This complete health history includes lab results, doctor diagnoses, and hospitalization records.

In turn, Google is analyzing and compiling that data into a Patient Search tool that allows doctors and other health professionals to conveniently see all patient data on an overview page.

The page includes notes about patient medical issues, test results and medications, including information from scanned documents, according to presentations viewed by Forbes.

The interface is quite straightforward and not too different from hospitals that offer results directly to patients today.

Internally, the project is being developed within Google Cloud, and 150 Googlers reportedly have access to the data. This includes Google Brain, the company's internal AI research division. The WSJ describes another tool in development that uses machine learning to suggest possible patient treatment changes to doctors.

Google in this case is using the data, in part, to design new software, underpinned by advanced artificial intelligence and machine learning, that zeroes in on individual patients to suggest changes to their care.

That appears to be further off in the distance compared to Patient Search, which is already deployed to Ascension facilities in Florida and Texas, with more locations planned this year. Google is apparently not charging Ascension for the work and could offer the tool to other health systems in the future.

When asked for comment, Google said Project Nightingale abides by all federal laws and that privacy protections are in place. Experts that spoke to the WSJ believe that this initiative is allowed under the Health Insurance Portability and Accountability Act (HIPAA).

SOURCE

Posted by Elvis on 11/12/19 •
Section Privacy And Rights • Section Broadband Privacy

Wednesday, June 12, 2019

AC Phone Home

snooping on your pc

I got a new HONEYWELL THERMOSTAT for the air conditioner that has internet connectivity for remote access, and pulls a weather report.

Like everything IoT - it INSISTS ON A MIDDLEMAN (pretty much anyone, judging by their EULA) possibly peeking at the things connected to my network, and who knows WHAT ELSE:

The Internet has been around for about 20 years now, and its security is far from perfect. Hacker groups still ruthlessly take advantage of these flaws, despite the billions spent on tech security. The IoT, on the other hand, is primitive. And so is its security.

Once everything we do, say, think, and eat is tracked, the big data that's available about each of us is immensely valuable. When companies know our lives inside and out, they can use that data to make us buy even more stuff. Once they control your data, they control you.

Why can’t I just VPN into the house and connect to it that way?

Because then they can’t SNOOP.

Their EULA SAYS:

We may use your Contact Information to market Honeywell and third-party products and services to you via various methods

We also use third parties to help with certain aspects of our operations, which may require disclosure of your Consumer Information to them.

Honeywell uses industry standard web ANALYTICS to track web visits: Google Analytics and Adobe Analytics.

GOOGLE and Adobe may also TRANSFER this INFORMATION to third parties where required to do so by law, or where such third parties process the information on Google’s or Adobe’s behalf.

You acknowledge and agree that Honeywell and its affiliates, service providers, suppliers, and dealers are permitted at any time and without prior notice to remotely push software

collection and use of certain information as described in this Privacy Statement, including the transfer of this information to the United States and/or other countries for storage

Wonderful.

I connected it to the LAN without asking it to get the weather - or signing up for anything at HONEYWELL’S SITE.

As fast as I could turn my head to peek at the firewall - it was chatting on the internet, and crapping out with some SSL error:

‘SSL_PROTO_REJECT: 48: 192.168.0.226:61492 -> 199.62.84.151:443’
‘SSL_PROTO_REJECT: 48: 192.168.0.226:65035 -> 199.62.84.152:443’
‘SSL_PROTO_REJECT: 48: 192.168.0.226:55666 -> 199.62.84.153:443’

Maybe the website has a problem:

# curl -sslv2 199.62.84.151:443
* About to connect() to 199.62.84.151 port 443 (#0)
* Trying 199.62.84.151… connected
* Connected to 199.62.84.151 (199.62.84.151) port 443 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.27.1 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 199.62.84.151:443
> Accept: */*
>
* Closing connection #0
* Failure when receiving data from the peer

# curl -sslv3 199.62.84.151:443
* About to connect() to 199.62.84.151 port 443 (#0)
* Trying 199.62.84.151… connected
* Connected to 199.62.84.151 (199.62.84.151) port 443 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.27.1 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 199.62.84.151:443
> Accept: */*
>
* Closing connection #0
* Failure when receiving data from the peer

# curl -tlsv1 199.62.84.151:443
curl: (56) Failure when receiving data from the peer

# curl -tlsv1.0 199.62.84.151:443
curl: (56) Failure when receiving data from the peer

# curl -tlsv1.1 199.62.84.151:443
curl: (56) Failure when receiving data from the peer

# curl -tlsv1.2 199.62.84.151:443
curl: (56) Failure when receiving data from the peer

# curl 199.62.84.151:80
curl: (56) Failure when receiving data from the peer
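
Out of curiosity, the same probe can be scripted. Here's a minimal Python 3 sketch (standard library only) that attempts a handshake at each TLS version against the address from the firewall log; whether the older versions can even be offered depends on how the local OpenSSL was built:

#!/usr/bin/env python3
# Probe which TLS versions the server will complete a handshake for.
import socket
import ssl

HOST, PORT = "199.62.84.151", 443  # address from the firewall log above

for name, version in [
    ("TLSv1.0", ssl.TLSVersion.TLSv1),
    ("TLSv1.1", ssl.TLSVersion.TLSv1_1),
    ("TLSv1.2", ssl.TLSVersion.TLSv1_2),
]:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # probing a bare IP, so skip name checks
    ctx.verify_mode = ssl.CERT_NONE
    try:
        ctx.minimum_version = ctx.maximum_version = version
        with socket.create_connection((HOST, PORT), timeout=5) as sock:
            with ctx.wrap_socket(sock) as tls:
                print(f"{name}: handshake OK ({tls.version()})")
    except (ssl.SSLError, ValueError, OSError) as exc:
        print(f"{name}: failed ({exc})")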

Then I pulled the plug.  Even if Honeywell's website is broken - I still fear this thermostat will find a way to download software, and maybe START SPYING ON MY HOME NETWORK:

The US intelligence chief has acknowledged for the first time that agencies might use a new generation of smart household devices to increase their surveillance capabilities.

Maybe someday I'll firewall off HONEYWELL'S NETBLOCKS, connect it again, and see where it goes.
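
If I do, a check like this could decide what to drop - a sketch using Python's ipaddress module. The 199.62.84.0/24 block is only a guess from the log above, not a confirmed Honeywell allocation:

#!/usr/bin/env python3
# Decide whether a destination address falls inside a suspect netblock.
from ipaddress import ip_address, ip_network

# Guessed from the firewall log above; real Honeywell allocations would
# need a whois lookup before trusting this list.
SUSPECT_BLOCKS = [ip_network("199.62.84.0/24")]

def should_block(dest):
    addr = ip_address(dest)
    return any(addr in block for block in SUSPECT_BLOCKS)

for dest in ("199.62.84.151", "8.8.8.8"):
    print(dest, "-> block" if should_block(dest) else "-> allow")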

For now - I’m too AFRAID:

When the cybersecurity industry warns about the nightmare of hackers causing blackouts, the scenario they describe typically entails an elite team of hackers breaking into the inner sanctum of a power utility to start flipping switches. But one group of researchers has imagined how an entire power grid could be taken down by hacking a less centralized and protected class of targets: home air conditioners and water heaters.

---

Think that’s bad?  Check this out

Dont Toss That Bulb, It Knows Your Password

By Tom Nardi
Hackaday
January 28, 2019

Whether it was here on Hackaday or elsewhere on the Internet, you've surely heard more than a few cautionary tales about the "Internet of Things" by now. As it turns out, giving every gadget you own access to your personal information and Internet connection can lead to unintended consequences. Who knew, right? But if you need yet another example of why trusting your home appliances with your secrets is potentially a bad idea, [Limited Results] is here to make sure you spend the next few hours doubting your recent tech purchases.

In a series of POSTS on the [Limited Results] blog, low-cost smart bulbs are cracked open and investigated to see what kind of knowledge they've managed to collect about their owners. Not only was it discovered that bulbs manufactured by Xiaomi, LIFX, and Tuya stored the WiFi SSID and encryption key in plain-text, but that recovering said information from the bulbs was actually quite simple. So next time one of those cheapo smart bulbs starts flickering, you might want to take a hammer to it before tossing it in the trash can; you never know where it, and the knowledge it has of your network, might end up.

Regardless of the manufacturer of the bulb, the process to get one of these devices on your network is more or less the same. An application on your smartphone connects to the bulb and provides it with the network SSID and encryption key. The bulb then disconnects from the phone and reconnects to your home network with the new information. It's a process that at this point we're all probably familiar with, and there's nothing inherently wrong with it.

The trouble comes when the bulb needs to store the connection information it was provided. Rather than obfuscating it in some way, the SSID and encryption key are simply stored in plain-text on the bulb's WiFi module. Recovering that information is just a process of finding the correct traces on the bulb's PCB (often there are test points which make this very easy), and dumping the chip's contents to the computer for analysis.

It's not uncommon for smart bulbs like these to use the ESP8266 or ESP32, and [Limited Results] found that to be the case here. With the wealth of information and software available for these very popular WiFi modules, dumping the firmware binary was no problem. Once the binary was in hand, a little snooping around with a hex editor was all it took to identify the network login information. The firmware dumps also contained information such as the unique hardware IDs used by the "cloud" platforms the bulbs connect to, and in at least one case, the root certificate and RSA private key were found.
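
Getting to that point takes surprisingly little tooling. The flash on these modules is typically read with the stock esptool utility, and once you have a dump, even a crude strings-style scan will surface anything stored in plain text. A minimal sketch, assuming a dump file named bulb-dump.bin (the filename and minimum run length are arbitrary choices):

#!/usr/bin/env python3
# Pull printable ASCII runs out of a firmware dump, the same idea as
# running `strings` and eyeballing the output for an SSID and key.
# Dump the flash first with something like:
#   esptool.py read_flash 0 0x400000 bulb-dump.bin
import re
import sys

MIN_LEN = 6  # arbitrary; shorter runs are mostly noise

path = sys.argv[1] if len(sys.argv) > 1 else "bulb-dump.bin"
with open(path, "rb") as f:
    blob = f.read()

for match in re.finditer(rb"[\x20-\x7e]{%d,}" % MIN_LEN, blob):
    print(f"{match.start():#010x}  {match.group().decode('ascii')}")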

On the plus side, being able to buy cheap smart devices that are running easily hackable modules like the ESP makes it easier for us to create custom firmware for them. Hopefully the community can come up with slightly less suspect software, but really just keeping the things from connecting to anything outside the local network would be a step in the right direction.

(Some days later)

[Limited Results] had hinted to us that he had previously disclosed some vulnerabilities to the bulb's maker, but that until they fixed them, he didn't want to make them public. They're fixed now, and it appears that the bulbs were sending everything over the network unencrypted: your data, OTA firmware upgrades, everything.  They're using TLS now, so good job [Limited Results]! If you're running an old version of their lightbulbs, you might have a look.

On WiFi credentials, we were told: “In the case where sensitive information in the flash memory wasn’t encrypted, the new version will include encrypted storage processing, and the customer will be able to select this version of the security chips, which can effectively avoid future security problems.” Argue about what that actually means in the comments.

SOURCE

Posted by Elvis on 06/12/19 •
Section Privacy And Rights • Section Broadband Privacy

Wednesday, October 17, 2018

The Clouded Cloud

image: amazon honor system

AmazonAtlas

Wikileaks
October 11, 2018

Today, WikiLeaks publishes a "Highly Confidential" internal document from the cloud computing provider Amazon. The document, from late 2015, lists the addresses and some operational details of over one hundred data centers spread across fifteen cities in nine countries. To accompany this document, WikiLeaks also created a map showing where Amazon's data centers are LOCATED.

Amazon, which is the largest cloud provider, is notoriously secretive about the precise locations of its data centers. While a few are publicly tied to Amazon, this is the exception rather than the norm. More often, Amazon operates out of data centers owned by other companies with little indication that Amazon itself is based there too, or runs its own data centers under less-identifiable subsidiaries such as VaData, Inc. In some cases, Amazon uses pseudonyms to obscure its presence. For example, at its IAD77 data center, the document states that Amazon is known as "Vandala Industries" on badges and all correspondence with the building manager.

Amazon is the leading cloud provider for the United States intelligence community. In 2013, Amazon entered into a $600 million contract with the CIA to build a cloud for use by intelligence agencies working with information classified as Top Secret. Then, in 2017, Amazon announced the AWS Secret Region, which allows storage of data classified up to the Secret level by a broader range of agencies and companies. Amazon also operates a special GovCloud region for US Government agencies hosting unclassified information.

Currently, Amazon is one of the leading contenders for an up to $10 billion contract to build a private cloud for the Department of Defense. Amazon is one of the only companies with the certifications required to host classified data in the cloud. The Defense Department is looking for a single provider and other companies, including Oracle and IBM, have complained that the requirements unfairly favor Amazon. Bids on this contract are due tomorrow.

While one of the benefits of the cloud is the potential to increase reliability through geographic distribution of computing resources, cloud infrastructure is remarkably centralised in terms of legal control. Just a few companies and their subsidiaries run the majority of cloud computing infrastructure around the world. Of these, Amazon is the largest by far, with recent market research showing that Amazon accounts for 34% of the cloud infrastructure services market.

Until now, this cloud infrastructure controlled by Amazon was largely hidden, with only the general geographic regions of the data centers publicised. While Amazon's cloud is comprised of physical locations, indications of the existence of these places are primarily buried in government records or made visible only when cloud infrastructure fails due to natural disasters or other problems in the physical world.

In the process of dispelling the mystery around the locations of Amazon's data centers, WikiLeaks also turned this document into a puzzle game, the Quest of Random Clues. The goal of this game was to encourage people to research these data centers in a fun and intriguing way, while highlighting related issues such as contracts with the intelligence community, Amazon's complex corporate structures, and the physicality of the cloud.

SOURCE

Posted by Elvis on 10/17/18 •
Section Privacy And Rights • Section Broadband Privacy

Thursday, August 09, 2018

Legalized Hacking

snooping on your pc

If I were to PORT SCAN any IP - I could be IN BIG TROUBLE.

Shouldn’t BIG BAD BANKS - and everyone else - be bound by the same rules?

Check this out:

Halifax Bank scans the machines of surfers that land on its login page whether or not they are customers

---

Bank on it: It’s either legal to port-scan someone without consent or it’s not, fumes researcher
One rule for banks, another for us, says white hat

By John Leyden
The Register
August 7, 2018

Security researcher Paul Moore has made clear his objection to this practice - in which the British bank is not alone - even though it is done for good reasons. The researcher claimed that performing port scans on visitors without permission is a violation of the UK's COMPUTER MISUSE ACT (CMA).

Halifax has disputed this, arguing that the port scans help it pick up evidence of malware infections on customers’ systems. The scans are legal, Halifax told Moore in response to a complaint he made on the topic last month.

When you visit the Halifax login page, even before you've logged in, JavaScript on the site, running in the browser, attempts to scan for open ports on your local computer to see if remote desktop or VNC services are running, and looks for some general remote access trojans (RATs) - backdoors, in other words. Crooks are known to abuse these remote services to snoop on victims' banking sessions.
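
The check runs as JavaScript inside the page, but the underlying test is simply a TCP connection attempt against the visitor's own machine. Here's the same probe sketched in Python for clarity; the ports match the services named above, while the real ThreatMetrix script presumably relies on browser mechanisms such as WebSocket connection timing rather than raw sockets:

#!/usr/bin/env python3
# Can we open a TCP connection to the visitor's own machine on ports
# used by remote-access services? Connect only; send and receive nothing.
import socket

PROBE_PORTS = {3389: "RDP (remote desktop)", 5900: "VNC"}

for port, service in PROBE_PORTS.items():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        is_open = s.connect_ex(("127.0.0.1", port)) == 0
        print(f"port {port} ({service}): {'open' if is_open else 'closed'}")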

Moore said he wouldn't have an issue if Halifax carried out the security checks on people's computers after they had logged on. It's the lack of consent and the scanning of any visitor that bothers him. "If they ran the script after you've logged in… they'd end up with the same end result, but they wouldn't be scanning visitors, only customers," Moore said.

According to Moore, when he called Halifax to complain, a representative told him: “We have to port scan your machine for security reasons.”

Having failed to either persuade Halifax Bank to change its practices or Action Fraud to act (thus far [1]), Moore last week launched a fundraising effort to privately prosecute Halifax Bank for allegedly breaching the Computer Misuse Act. This crowdfunding effort on GoFundMe aims to gather £15,000 (so far just £50 has been raised).

Halifax Bank's "unauthorised" port scans are a clear violation of the CMA - and amount to an action that security researchers are frequently criticised and/or convicted for, Moore argued. The CISO and part-time security researcher hopes his efforts in this matter might result in a clarification of the law.

“Ultimately, we can’t have it both ways,” Moore told El Reg. “It’s either legal to port scan someone without consent, or with consent but no malicious intent, or it’s illegal and Halifax need to change their deployment to only check customers, not visitors.”

The whole effort might smack of tilting at windmills, but Moore said he was acting on a point of principle.

“If security researchers operate in a similar fashion, we almost always run into the CMA, even if their intent isn’t malicious. The CMA should be applied fairly to both parties.”

Moore announced his findings, his crowdfunded litigation push and the reasons behind it on Twitter, sparking a lively debate. Security researchers are split on whether the effort is worthwhile.

The arguments for and against

The scanning happens on the customer login page and not the main Halifax Bank site, others were quick to point out. Moore acknowledged this but said it was beside the point.

Infosec pro Lee Burgess disagreed: “If they had added to the non-customer page then the issue would be different. They are only checking for open ports, nothing else, so [I] cannot really see the issue.”

Surely there needs to be intent to cause harm or recklessness for any criminal violation, neither of which is present in the case of Halifax, argued another.

UK security pro Kevin Beaumont added: “I’d question if [it was] truly illegal if [there was] not malicious intent. Half the infosec services would be illegal (Shodan, Censys etc). IRC networks check on connect, Xbox does, PlayStation does etc.”

Moore responded that two solicitors he’d spoken to agreed Halifax’s practice appeared to contravene the CMA. An IT solicitor contact of The Register, who said he’d rather not be quoted on the topic, agreed with this position. Halifax’s lawyers undoubtedly disagree.

Moore concluded: “Halifax explicitly says they’ll run software to detect malware… but that’s if you’re a customer. Halifax currently scan everyone, as soon as you land on their site.”

Enter the ThreatMetrix

Halifax Bank is part of Lloyds Banking Group, and a reference customer for ThreatMetrix, the firm whose technology is used to carry out the port scanning, via client-side JavaScript.

The scripts run within the visitor’s browser, and are required to check if a machine is infected with malware. They test for this by trying to connect to a local port, but this is illegal without consent, according to Moore.

“Whilst their intentions are clear and understandable, the simple act of scanning and actively trying to connect to several ports, without consent, is a clear violation of the CMA,” Moore argued.

Beaumont countered: “It only connects to the port, it doesn’t send or receive any data (you can see from the code, it just checks if port is listening).”

Moore responded that even passively listening would break the CMA. “That’s sufficient to breach CMA. If I port-sweep Halifax to see what’s listening, I’d be breaching CMA too,” he said.

The same ThreatMetrix tech is used by multiple UK high street banks, according to Beaumont. “If one is forced to change, they all will,” Moore replied.

Moore went on to say that this testing - however well-intentioned - might have undesirable consequences.

“Halifax/Lloyds Banking Group are not trying to gain remote access to your device; they are merely testing to see if such a connection is possible and if the port responds. There is no immediate threat to your security or money,” he explained.

“The results of their unauthorised scan are sent back to Halifax and processed in a manner which is unclear. If you happen to allow remote desktop connections or VNC, someone (other than you) will be notified as such. If those applications have vulnerabilities of which you are unaware, you are potentially at greater risk.”

Moore said his arguably quixotic actions may have beneficial effects. "Either Halifax [is] forced to correct it and pays researchers from the proceeds, or the CMA is revised to clarify that if [its] true intent isn't malicious, [it's] safe to continue," he said.

We have asked ThreatMetrix for comment.

Updated at 1200 UTC to add

Halifax Bank has been in touch to say: "Keeping our customers safe is of paramount importance to the Group and we have a range of robust processes in place to protect online banking customers."

Bootnote

[1] Action Fraud is the UK's cyber security reporting centre. Moore has reported the issue to it. AF's response left Moore pessimistic about finding any relief from that quarter.

SOURCE

Posted by Elvis on 08/09/18 •
Section Privacy And Rights • Section Broadband Privacy

Monday, January 15, 2018

Session Replay Scripts

image: snoopy pc

This is the first post in our “No Boundaries” series, in which we reveal how third-party scripts on websites have been extracting personal information in increasingly intrusive ways. [0]

By Steven Englehardt, Gunes Acar, and Arvind Narayanan
Freedom To Tinker
November 15, 2017

Update: we've released our data - the list of sites with session-replay scripts, and the sites where we've confirmed recording by third parties.

You may know that most websites have third-party analytics scripts that record which pages you visit and the searches you make.  But lately, more and more sites use “session replay” scripts. These scripts record your keystrokes, mouse movements, and scrolling behavior, along with the entire contents of the pages you visit, and send them to third-party servers. Unlike typical analytics services that provide aggregate statistics, these scripts are intended for the recording and playback of individual browsing sessions, as if someone is looking over your shoulder.

The stated purpose of this data collection includes gathering insights into how users interact with websites and discovering broken or confusing pages. However, the extent of data collected by these services far exceeds user expectations [1]; text typed into forms is collected before the user submits the form, and precise mouse movements are saved, all without any visual indication to the user. This data can't reasonably be expected to be kept anonymous. In fact, some companies allow publishers to explicitly link recordings to a user's real identity.

For this study we analyzed seven of the top session replay companies (based on their relative popularity in our measurements [2]). The services studied are Yandex, FullStory, Hotjar, UserReplay, Smartlook, Clicktale, and SessionCam. We found these services in use on 482 of the Alexa top 50,000 sites.

See HERE or HERE.

What can go wrong? In short, a lot.

Collection of page content by third-party replay scripts may cause sensitive information such as medical conditions, credit card details and other personal information displayed on a page to leak to the third-party as part of the recording. This may expose users to identity theft, online scams, and other unwanted behavior. The same is true for the collection of user inputs during checkout and registration processes.

The replay services offer a combination of manual and automatic redaction tools that allow publishers to exclude sensitive information from recordings. However, in order for leaks to be avoided, publishers would need to diligently check and scrub all pages which display or accept user information. For dynamically generated sites, this process would involve inspecting the underlying web application's server-side code. Further, this process would need to be repeated every time a site is updated or the web application that powers the site is changed.

A thorough redaction process is actually a requirement for several of the recording services, which explicitly forbid the collection of user data. This negates the core premise of these session replay scripts, which market themselves as plug and play. For example, Hotjar's homepage advertises: "Set up Hotjar with one script in a matter of seconds," and Smartlook's sign-up procedure features their script tag next to a timer with the tagline "every minute you lose is a lot of video."

To better understand the effectiveness of these redaction practices, we set up test pages and installed replay scripts from six of the seven companies [3]. From the results of these tests, as well as an analysis of a number of live sites, we highlight four types of vulnerabilities below:

1. Passwords are included in session recordings. All of the services studied attempt to prevent password leaks by automatically excluding password input fields from recordings. However, mobile-friendly login boxes that use text inputs to store unmasked passwords are not redacted by this rule, unless the publisher manually adds redaction tags to exclude them. We found at least one website where the password entered into a registration form leaked to SessionCam, even if the form is never submitted.

2. Sensitive user inputs are redacted in a partial and imperfect way. As users interact with a site they will provide sensitive data during account creation, while making a purchase, or while searching the site. Session recording scripts can use keystroke or input element loggers to collect this data.

All of the companies studied offer some mitigation through automated redaction, but the coverage offered varies greatly by provider. UserReplay and SessionCam replace all user input with masking text of equivalent length, while FullStory, Hotjar, and Smartlook exclude specific input fields by type. We summarize the redaction of other fields in the table below.

image: replay redaction summary table

Automated redaction is imperfect; fields are redacted by input element type or heuristics, which may not always match the implementation used by publishers. For example, FullStory redacts credit card fields with the `autocomplete` attribute set to `cc-number`, but will collect any credit card numbers included in forms without this attribute.

image: replay

To supplement automated redaction, several of the session recording companies, including Smartlook, Yandex, FullStory, SessionCam, and Hotjar, allow sites to further specify input elements to be excluded from the recording. To effectively deploy these mitigations a publisher will need to actively audit every input element to determine if it contains personal data. This is complicated, error prone and costly, especially as a site or the underlying web application code changes over time. For instance, the financial service site fidelity.com has several redaction rules for Clicktale that involve nested tables and child elements referenced by their index. In the next section we further explore these challenges.

A safer approach would be to mask or redact all inputs by default, as is done by UserReplay and SessionCam, and allow whitelisting of known-safe values. Even fully masked inputs provide imperfect protection. For example, the masking used by UserReplay and Smartlook leaks the length of the user's password.
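
A toy model makes both failure modes concrete: redaction keyed on markup (the input type or autocomplete attribute) silently misses fields a publisher marked up differently, while length-preserving masking still reveals how long a password is. The field definitions below are invented for illustration; real replay scripts operate on live DOM input elements:

#!/usr/bin/env python3
# Toy model of the two redaction strategies discussed above. The fields
# are invented for illustration; real replay scripts walk live DOM inputs.

FIELDS = [
    {"name": "pwd1", "type": "password", "autocomplete": "", "value": "hunter2"},
    # a "mobile-friendly" login box: same secret, but in a text input
    {"name": "pwd2", "type": "text", "autocomplete": "", "value": "hunter2"},
    {"name": "card1", "type": "text", "autocomplete": "cc-number", "value": "4111111111111111"},
    # same card number, but the publisher omitted the autocomplete hint
    {"name": "card2", "type": "text", "autocomplete": "", "value": "4111111111111111"},
]

def heuristic_redact(field):
    # Redact only when the markup announces sensitivity: misses pwd2 and card2.
    if field["type"] == "password" or field["autocomplete"] == "cc-number":
        return "[redacted]"
    return field["value"]

def masking_redact(field):
    # Mask everything by default, preserving length: still leaks the length.
    return "*" * len(field["value"])

for f in FIELDS:
    print(f"{f['name']:6} heuristic={heuristic_redact(f):18} masked={masking_redact(f)}")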

3. Manual redaction of personally identifying information displayed on a page is a fundamentally insecure model. In addition to collecting user inputs, the session recording companies also collect rendered page content. Unlike user input recording, none of the companies appear to provide automated redaction of displayed content by default; all displayed content in our tests ended up leaking.

Instead, session recording companies expect sites to manually label all personally identifying information included in a rendered page. Sensitive user data has a number of avenues to end up in recordings, and small leaks over several pages can lead to a large accumulation of personal data in a single session recording.

For recordings to be completely free of personal information, a site's web application developers would need to work with the site's marketing and analytics teams to iteratively scrub personally identifying information from recordings as it's discovered. Any change to the site design, such as a change in the class attribute of an element containing sensitive information or a decision to load private data into a different type of element, requires a review of the redaction rules.

As a case study, we examine the pharmacy section of Walgreens.com, which embeds FullStory. Walgreens makes extensive use of manual redaction for both displayed and input data. Despite this, we find that sensitive information, including medical conditions and prescriptions, is leaked to FullStory alongside the names of users.

We do not present the above examples to point fingers at a certain website. Instead, we aim to show that the redaction process can fail even for a large publisher with a strong legal incentive to protect user data. We observed similar personal information leaks on other websites, including on the checkout pages of Lenovo [5]. Sites with fewer resources or less expertise are even more likely to fail.

4. Recording services may fail to protect user data. Recording services increase the exposure to data breaches, as personal data will inevitably end up in recordings. These services must handle recording data with the same security practices with which a publisher would be expected to handle user data.

We provide a specific example of how recording services can fail to do so. Once a session recording is complete, publishers can review it using a dashboard provided by the recording service. The publisher dashboards for Yandex, Hotjar, and Smartlook all deliver playbacks within an HTTP page, even for recordings which take place on HTTPS pages. This allows an active man-in-the-middle to inject a script into the playback page and extract all of the recording data. Worse yet, Yandex and Hotjar deliver the publisher page content over HTTP - data that was previously protected by HTTPS is now vulnerable to passive network surveillance.

The vulnerabilities we highlight above are inherent to full-page session recording. That's not to say the specific examples can't be fixed; indeed, the publishers we examined can patch their leaks of user data and passwords. The recording services can all use HTTPS during playbacks. But as long as the security of user data relies on publishers fully redacting their sites, these underlying vulnerabilities will continue to exist.

Does tracking protection help?

Two commonly used ad-blocking lists, EasyList and EasyPrivacy, do not block FullStory, Smartlook, or UserReplay scripts. EasyPrivacy has filter rules that block Yandex, Hotjar, ClickTale and SessionCam.

At least one of the five companies we studied (UserReplay) allows publishers to disable data collection from users who have Do Not Track (DNT) set in their browsers. We scanned the configuration settings of the Alexa top 1 million publishers using UserReplay on their homepages, and found that none of them chose to honor the DNT signal.

Improving user experience is a critical task for publishers. However, it shouldn't come at the expense of user privacy.

End notes:

[0] We use the term exfiltrate in this series to refer to the third-party data collection that we study. The term "leakage" is sometimes used, but we eschew it, because it suggests an accidental collection resulting from a bug. Rather, our research suggests that while not necessarily malicious, the collection of sensitive personal data by the third parties that we study is inherent in their operation and is well known to most if not all of these entities. Further, there is an element of furtiveness; these data flows are not public knowledge and neither publishers nor third parties are transparent about them.

[1] A recent analysis of the company Navistone, completed by Hill and Mattu for Gizmodo, explores how data collection prior to form submission exceeds user expectations. In this study, we show how analytics companies collect far more user data with minimal disclosure to the user. In fact, some services suggest the first-party sites simply include a disclaimer in their site's privacy policy or terms of service.

[2] We used OpenWPM to crawl the Alexa top 50,000 sites, visiting the homepage and 5 additional internal pages on each site. We use a two-step approach to detect analytics services which collect page content.

First, we inject a unique value into the HTML of the page and search for evidence of that value being sent to a third party in the page traffic. To detect values that may be encoded or hashed we use a detection methodology similar to previous work on email tracking. After filtering out leak recipients, we isolate pages on which at least one third party receives a large amount of data during the visit, but for which we do not detect a unique ID. On these sites, we perform a follow-up crawl which injects a 200KB chunk of data into the page and check if we observe a corresponding bump in the size of the data sent to the third party.

We found 482 sites on which either the unique marker was leaked to a collection endpoint from one of the services or on which we observed a data collection increase roughly equivalent to the compressed length of the injected chunk. We believe this value is a lower bound since many of the recording services offer the ability to sample page visits, which is compounded by our two-step methodology.
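
Step one of this methodology can be sketched compactly: look for the injected marker, verbatim or under common encodings, in the body of each outbound third-party request. The marker value and the set of encodings below are illustrative; the study drew on a broader set from the prior email-tracking work:

#!/usr/bin/env python3
# Sketch of step one: inject a unique value into the page, then check each
# captured request body for that value, plain or under common encodings.
import base64
import hashlib
import urllib.parse

MARKER = "OpenWPM-7f3a9c"  # hypothetical injected value

def candidate_forms(marker):
    yield marker                                        # verbatim
    yield urllib.parse.quote(marker)                    # URL-encoded
    yield base64.b64encode(marker.encode()).decode()    # base64
    yield hashlib.md5(marker.encode()).hexdigest()      # hashed
    yield hashlib.sha1(marker.encode()).hexdigest()

def leaks_marker(request_body):
    return any(form in request_body for form in candidate_forms(MARKER))

# Toy captured request: a third party receiving an MD5 of the injected value.
captured = "POST /collect uid=" + hashlib.md5(MARKER.encode()).hexdigest()
print("marker leaked:", leaks_marker(captured))  # -> True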

[3] One company (Clicktale) was excluded because we were unable to make the practical arrangements to analyze the script's functionality at scale.

[4] FullStory's terms and conditions explicitly classify "health or medical information, or any other information covered by HIPAA" as sensitive data and ask customers not to provide any Sensitive Data to FullStory.

[5] Lenovo.com is another example of a site which leaks user data in session recordings.

[6] We used the default scripts available to new accounts for 5 of the 6 providers. For UserReplay, we used a script taken from a live site and verified that the configuration options match the most common options found on the web.

SOURCE

---

Website operators are in the dark about privacy violations by third-party scripts

By Arvind Narayanan
Freedom To Tinker
January 12, 2018

Recently we revealed that "session replay" scripts on websites record everything you do, like someone looking over your shoulder, and send it to third-party servers. This en-masse data exfiltration inevitably scoops up sensitive, personal information in real time, as you type it. We released the data behind our findings, including a list of 8,000 sites on which we observed session-replay scripts recording user data.

As one case study of these 8,000 sites, we found health conditions and prescription data being exfiltrated from walgreens.com. These are considered Protected Health Information under HIPAA. The number of affected sites is immense; contacting all of them and quantifying the severity of the privacy problems is beyond our means. We encourage you to check out our data release and hold your favorite websites accountable.

Student data exfiltration on Gradescope

As one example, a pair of researchers at UC San Diego read our study and then noticed that Gradescope, a website they used for grading assignments, embeds FullStory, one of the session replay scripts we analyzed. We investigated, and sure enough, we found that student names and emails, student grades, and instructor comments on students were being sent to FullStory's servers. This is considered Student Data under FERPA (US educational privacy law). Ironically, Princeton's own Information Security course was also affected. We notified Gradescope of our findings, and they removed FullStory from their website within a few hours.

You might wonder how the companies' privacy policies square with our finding. As best as we can tell, Gradescope's Terms of Service actually permit this data exfiltration [1], which is a telling comment about the ineffectiveness of Terms of Service as a way of regulating privacy.

FullStory's Terms are a different matter, and include a clause stating: "Customer agrees that it will not provide any Sensitive Data to FullStory." We argued previously that this repudiation of responsibility by session-replay scripts puts website operators in an impossible position, because preventing data leaks might require re-engineering the site substantially, negating the core value proposition of these services, which is drag-and-drop deployment. Interestingly, Gradescope's CEO told us that they were not aware of this requirement in FullStory's Terms, that the clause had not existed when they first signed up for FullStory, and that they (Gradescope) had not been notified when the Terms changed. [2]

Web publishers kept in the dark

Of the four websites we highlighted in our previous post and this one (Bonobos, Walgreens, Lenovo, and Gradescope), three have removed the third-party scripts in question (all except Lenovo). As far as we can tell, no publisher (website operator) was aware of the exfiltration of sensitive data on their own sites until our study. Further, as mentioned above, Gradescope was unaware of key provisions in FullStory's Terms of Service. This is a pattern we've noticed over and over again in our six years of doing web privacy research.

Worse, in many cases the publisher has no direct relationship with the offending third-party script. In Part 2 of our study we examined two third-party scripts which exploit a vulnerability in browsers' built-in password managers to exfiltrate user identities. One web developer was unable to determine how the script was loaded and asked us for help. We pointed out that their site loaded an ad network (media-clic.com), which in turn loaded themoneytizer.com, which finally loaded the offending script from Audience Insights. These chains of redirects are ubiquitous on the web, and might involve half a dozen third parties. On some websites the majority of third parties have no direct relationship with the publisher.

Most of the advertising and analytics industry is premised on keeping not just users but also website operators in the dark about privacy violations. Indeed, the effort required by website operators to fully audit third parties would negate much of the benefit of offloading tasks to them. The ad tech industry creates a tremendous negative externality in terms of the privacy cost to users.

Can we turn the tables?

The silver lining is that if we can explain to web developers what third parties are doing on their sites, and empower them to take control, that might be one of the most effective ways to improve web privacy. But any such endeavor should keep in mind that web publishers everywhere are on tight budgets and may not have much privacy expertise.

To make things concrete, here’s a proposal for how to achieve this kind of impact:

Create a 1-pager summarizing the bare minimum that website operators need to know about web security, privacy, and third parties, with pointers to more information.

Create a tailored privacy report for each website based on data that is already publicly available through various sources including our own data releases.

Build open-source tools for website operators to scan their own sites [3]; a minimal starting point is sketched just after this list. Ideally, the tool should make recommendations for privacy-protecting changes based on the known behavior of third parties.

Reach out to website operators to provide information and help make changes. This step doesn’t scale, but is crucial.
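
As a gesture toward the third item, a minimal sketch of such a scanner: fetch a page and list every host its script tags pull code from. This only sees static HTML; catching scripts injected at runtime, through the redirect chains described earlier, would require a headless browser. The example URL is a placeholder:

#!/usr/bin/env python3
# List the hosts that <script src=...> tags on one page pull code from,
# flagging hosts other than the site's own as third parties.
import sys
import urllib.parse
import urllib.request
from html.parser import HTMLParser

class ScriptSrcParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.srcs = []
    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)

url = sys.argv[1] if len(sys.argv) > 1 else "https://example.com/"
first_party = urllib.parse.urlparse(url).hostname
html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")

parser = ScriptSrcParser()
parser.feed(html)
for src in parser.srcs:
    host = urllib.parse.urlparse(urllib.parse.urljoin(url, src)).hostname
    marker = "third-party" if host and host != first_party else "first-party"
    print(f"{marker:12} {host}  {src}")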

If you're interested in working with us on this, we'd love to hear from you!

Endnotes

We are grateful to UCSD researchers Dimitar Bounov and Sorin Lerner for bringing the vulnerabilities on Gradescope.com to our attention.

[1] Gradescope’s terms of use state: “By submitting Student Data to Gradescope, you consent to allow Gradescope to provide access to Student Data to its employees and to certain third party service providers which have a legitimate need to access such information in connection with their responsibilities in providing the Service.”

[2] The Wayback Machine does not archive FullStory's Terms page far enough back in time for us to independently verify Gradescope's statement, nor does FullStory appear in ToSBack, the EFF's terms-of-service tracker.

[3] Privacyscore.org is one example of a nascent attempt at such a tool.

SOURCE

Posted by Elvis on 01/15/18 •
Section Privacy And Rights • Section Broadband Privacy