This is maybe my biggest pet peeve. These companies are not listening to you in any meaningful way.
You can trivially confirm this by hooking up your home network to Wireshark and filtering packets.
Other reasons:
They can get all of this information elsewhere: searches, ad pixels, location capturing etc.
Processing audio data is basically impossible on-device in a useful way, and the network infrastructure to support mass transcription in the cloud would cost on the order of billions.
It would be a massive endeavor to cover up the millions of hours of audio data that would need to be analyzed by the lowest paid and most unhappy workers in the industry (content labelers and moderators).
Now I’m sure this is some marketer’s wet dream, but the logistical and PR nightmare this would create dissuades all but the dumbest ad agencies. This is mostly just terrible tech journalism.
Not that I disagree with your conclusion, because there’s an even simpler way to check if an app is listening: iOS and Android will tell you the mic is being used… Anyway, we do have always-on NNs listening for keywords (“Siri”, “Hey Google”, “Alexa”), so while I agree that full-ass voice transcription like Whisper will run like dogshit on your phone, they can certainly run a much, much lighter model to pick up a handful of keywords.
To Camdat’s point, a general transcription is definitely not low power even if you have some kind of gating on when it transcribes. Obviously Apple and Google and Samsung and whoever makes the phone can turn on the mic without you knowing, otherwise how would their voice assistant work, but Apple probably isn’t letting Facebook have access to the mic without throwing something up on the status bar.
Sure this is definitely true. I should clarify that single-word NNs do run on-device all the time, but those require specialized models that are trained only on those keywords. Once those models trigger they need to send everything else to the cloud.
I agree. If I was going to do something like this for advertising though I wouldn’t really care too much about what people were saying so instead I’d just listen for some limited set of keywords (maybe for some of my top paying advertisers) and serve ads for keywords that hit recently. Keep it all on device until an ad actually needs to be served.
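A rough sketch of that idea in Python, purely to show how little state it would need. It assumes some on-device keyword spotter already emits detected words with timestamps; the `KeywordHitTracker` class, its method names, and the one-hour window are all hypothetical, not anything a real ad SDK is known to ship.

```python
from collections import deque


class KeywordHitTracker:
    """Keeps recent hits for a small, fixed keyword list (hypothetical design)."""

    def __init__(self, keywords, window_seconds=3600.0):
        self.keywords = {k.lower() for k in keywords}
        self.window_seconds = window_seconds
        self.hits = deque()  # (timestamp, keyword), oldest first

    def observe(self, word, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        w = word.lower()
        if w in self.keywords:
            self.hits.append((timestamp, w))

    def recent_keywords(self, now):
        # Drop hits that fell out of the window, then report what is left.
        while self.hits and now - self.hits[0][0] > self.window_seconds:
            self.hits.popleft()
        return sorted({k for _, k in self.hits})
```

Nothing leaves the device here; `recent_keywords` would only be consulted at the moment an ad needs to be picked.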
I think people greatly underestimate (or misunderstand) the pervasiveness of ad tracking pixels.
Basically any website that has ads or tries to sell you something has a tracking pixel. These pixels create profiles of devices and track almost everything you do while interacting with those sites.
These pixels don’t require any actual “information” about you, they’re only interested in what you (via the device you’re browsing on) will buy. They also don’t use cookies anymore, it’s usually a combination of user agent, IP address, and coarse location. As you said, companies will generally share these profiles.
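For illustration, here is a minimal sketch of how a cookieless device profile key could be derived from those three signals. The function name, the field separator, and the choice of SHA-256 truncated to 16 hex characters are assumptions for the example; real trackers use their own (usually richer) signal sets.

```python
import hashlib


def device_fingerprint(user_agent, ip_address, coarse_location):
    """Derive a stable profile key from signals a pixel request exposes (sketch)."""
    raw = "|".join([
        user_agent.strip(),
        ip_address.strip(),
        coarse_location.strip().lower(),
    ])
    # Same inputs always hash to the same key, so no cookie is needed.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

Two requests with the same user agent, IP, and rough location map to the same key, which is all that is needed to stitch page visits into one profile.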
Eh, I dunno. I remember making exactly those points 20 years ago, but I think it’s pretty feasible now. There are open source NNs that look like they can do this locally on mediocre phones. And if the output is garbage quality, that’s ok, it just has to be good enough to sell some ads. I think it’s largely feasible, although I’m sure it’s inflated by startups looking to impress clients and investors.
Feel free to Wireshark your smart devices and confirm what I’ve said yourself. The most efficient way to do this is the pixels that already exist on almost every site.
On-device NNs use insane amounts of processing, even on “high-end” phones. You would notice if there was an always-on NN running on your device; this is also something you can try for yourself.
And what exactly am I looking for in Wireshark? A few KB of encrypted text data occasionally sent to who-knows-where? Mixed in among a flood of other tracking bullshit and general wasteful bloat? Yeah lemme go check real quick.
Computationally, we’ve had low-quality speech to text on home PCs for like 30 years, and we’ve had OK-quality NN implementations for like 15 years. Yes it would be a bit wasteful, but a trimmed-down NN could easily hide among the general bloat of modern software.
Yes it would be kind of a clunky and impractical way to collect data compared to other methods, but it’s definitely plausible that an adtech startup could hack together a semi-functional version of this and then slap it in a slide deck. It would let them say “AI” more times during their pitch.
You can filter by device. Leave your suspect device connected to your network for a few days, filter by destination and review. Also keep an eye on CPU usage.
If your devices have a ton of random outgoing network requests you’re already being tracked in a myriad of other ways and need to lock your shit down.
I’ve done this before, there’s not as much network bloat as you might think.
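The filter-by-device, group-by-destination review described above can be sketched in a few lines of Python. This assumes you exported the capture from Wireshark as CSV with the default `Source`, `Destination`, and `Length` columns; the function name and the example addresses are made up for illustration.

```python
import csv
import io
from collections import Counter


def bytes_by_destination(csv_file, device_ip):
    """Total bytes sent by one device to each destination, largest first."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        if row["Source"] == device_ip:  # keep only traffic from the suspect device
            totals[row["Destination"]] += int(row["Length"])
    return totals.most_common()


# Tiny in-memory sample standing in for a real export.
sample = io.StringIO(
    "No.,Time,Source,Destination,Protocol,Length,Info\n"
    "1,0.0,192.168.1.50,203.0.113.7,TLSv1.3,1200,Application Data\n"
    "2,0.5,192.168.1.50,198.51.100.9,TLSv1.3,300,Application Data\n"
    "3,1.0,192.168.1.20,203.0.113.7,TCP,60,ACK\n"
    "4,1.5,192.168.1.50,203.0.113.7,TLSv1.3,800,Application Data\n"
)
report = bytes_by_destination(sample, "192.168.1.50")
```

This only tells you where the bytes went and how many, not what was in them, so it narrows down which destinations deserve a closer look rather than proving anything by itself.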
it sounds like you have enough knowledge to know it’s almost impossible for an individual to assert it absolutely 100% isn’t happening.
imo if you make an honest effort to break the technical problem down you will arrive at a different conclusion - or at the very least not be nearly so bold as to allow this to be an emotional peeve.
consider forgetting the propaganda the media has subjected you to, and most importantly forget whether you do or don’t want it to be true. approach the problem from a purely technical perspective while considering these companies can hire hundreds of very smart people from a variety of subdisciplines. recall these companies have virtually bottomless greed and almost exactly 0 morals.
The Internet and smartphones are not mystical devices.
Whether they’re mystical or not is an entirely different conversation ;p
This is something you can independently confirm yourself very easily…
you are vastly understating how non-trivial this task is. or you are allowing your emotional desires to cloud your technical analysis.
teams of experts put in months at a time to assess only a fraction of the required scope. these experts are putting in so much time while admitting they couldn’t achieve full coverage despite having financial backing & well trained teams. it’s reasonably unlikely so many experts would dedicate so much time & resources if it’s such an easy thing to independently confirm.
if Camdat and ganymede were sitting with one of their nontechnical friends, and their friend says “hey my stock smart device which i only use with facebook and a few things seemed like it eavesdropped on my voice about <common product/brand>”. and they swear they didn’t reveal it via some other channel etc. blah blah we’ve all heard it many times.
if you, Camdat listed all the reasons why the same phenomena can likely be attributed to a variety of other surveillance and correlation methods, some of which are arguably at least as scary. i would likely agree with every single thing you said.
imo it’s wiser to leave it at that, rather than making the assertion it’s absolutely not happening, or getting frustrated with them for even wondering.
It’s actually really successful. I’ve had some conversations with people and right after, something on their Instagram feed would show something we just talked about. Most recently I made a joke to my friend about my name, next thing on his feed was a meme using my name.
Uuuh point 2 is just completely false. Google’s STT works live and offline on device and so could any other STT model. This invalidates point 3 because only text data needs to be sent and analyzed.
Point 1 is pointless because more data = more better in the eyes of companies that make money selling data. Someone will always be willing to pay for it.
The only valid point is the legal issues but I’m getting less and less convinced that they even care about that at this point.
Whatsapp is sending your audio to the cloud to handle transcription. This is not an accurate test because it is not an on-device process.
Not to mention cross site trackers owned by Google and Facebook.
The Internet and smartphones are not mystical devices. This is something you can independently confirm yourself very easily.
I have the knowledge necessary to say this 100% does not occur on devices that I own.
Your posts in this thread have been very helpful! thank you!
Bad headline then? Huh