STOP the Google Home "Jesus" posts!
I just searched "Google Home" on Google and filtered to articles from the past 24 hours. Almost the ENTIRE first page is articles about this idiotic Jesus thing.
Here is the problem: it isn't even a thing anymore! That behavior your article is harping on Google Home for? It doesn't do that anymore. CEASE IT NOW.
And it was never a big deal to begin with. And it has been adequately explained.
People have these harebrained theories that someone intentionally targeted or "coded" the speakers not to respond to this. This was total BS from the beginning. We are seeing the birth of mainstream "religious privilege". North America is, insofar as religion is concerned, predominantly a region of believers in Christ. North America is also Google's most profitable market. They would not intentionally "code" a device to discriminate AGAINST such a large group.
If you A) think they would, and B) think it would even matter all that much... you need to get your shit together. A group that faces virtually no discrimination based on religion in North America has absolutely no sane reason to assume an American company is trying to exclude them. It is irrational and insane. Period. And, as I stated, even if it had turned out not to be benign... it wouldn't be a fraction of the discrimination people of other religions in your country experience on a daily basis. But that isn't really what this article is about.
Of course, this wasn't an attack on faith, and it wasn't even something that was specifically coded. If it were, the headlines would be about the responsible person being promptly fired, not an explanation and steps to mitigate future problems.
Part of the problem is that the people reporting on these things... wait for it... don't actually know how this technology works!
One site actually accused Google of "coding" it to do this. Another woman proclaimed we needed a separation of "church and technology", as if Google Home were somehow an agent of another religion. This is... well... some sort of psychosis. I'd probably mislabel it, so I won't try.
Let's start with a quick rundown of how a Google Home or an Amazon Echo works. You speak a command in a predefined format like "Turn on the X light", "Who is Y?", or "What is X in Y?". When you say the wake word ("Hey Google", "Alexa", etc.) followed by one of these commands, the device takes what you said and sends it to the cloud. In the cloud, it makes an intelligent guess about what the words were. Then, based on what it thinks it heard, it decides which skill it should pass the specific information (if any) to.
From there, a typically rather generic action takes over, parses the inputs and returns a response, which may be an action, a prompt, or a combination of such things.
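To make that flow a little more concrete, here is a minimal Python sketch of how such a pipeline might be structured. Every wake word, pattern, and skill name below is a made-up stand-in for illustration; this is the idea, not Google's or Amazon's actual code.

```python
# Illustrative sketch only: wake words, patterns, and skill names are all
# hypothetical stand-ins, not Google's or Amazon's real implementation.
import re

WAKE_WORDS = ("hey google", "alexa")

# A toy "skill" registry: a command pattern and the skill that handles it.
SKILLS = [
    (re.compile(r"turn on the (?P<device>.+) light"), "lights_skill"),
    (re.compile(r"who is (?P<name>.+)"), "people_skill"),
    (re.compile(r"what is (?P<term>.+)"), "definitions_skill"),
]

def handle_utterance(transcript):
    """Strip the wake word, then route the remaining command to a skill."""
    text = transcript.lower().strip().rstrip("?")
    for wake in WAKE_WORDS:
        if text.startswith(wake):
            command = text[len(wake):].lstrip(", ")
            break
    else:
        return None  # no wake word, so the device ignores it entirely

    for pattern, skill in SKILLS:
        match = pattern.fullmatch(command)
        if match:
            # The matched slots (e.g. name="jesus christ") go to the skill.
            return "dispatch to {} with {}".format(skill, match.groupdict())
    return "Sorry, I don't know how to help with that yet."

print(handle_utterance("Hey Google, who is Jesus Christ?"))
# -> dispatch to people_skill with {'name': 'jesus christ'}
```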
Now, I don't want to say it is impossible that Google hard-coded a completely separate action to handle JUST this exact case, or hard-coded the response into the action. But if it did, it would only have done so for a specific reason. And that reason would probably be respect rather than an attack.
So, when you say "Hey Google, who is Jesus Christ?", what happens (almost certainly) is this: the "Hey Google" part wakes up the device. It then sends "who is Jesus Christ" as the command to the server. The server detects an action/skill intended to parse a command like "Who is {name}", decides that this command matches that format, and then calls that action, passing in a variable called "name" with the value "Jesus Christ".
The action will then use whatever logic it was built upon to form a response.
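If you want to picture the receiving end, here is an equally hypothetical sketch of such an action: it takes the extracted "name" slot, asks some knowledge backend for a summary, and falls back to a canned apology when nothing comes back. The tiny in-memory "knowledge base" and the function names are mine, not Google's.

```python
# Hypothetical "Who is {name}" action. The tiny in-memory "knowledge base"
# stands in for whatever search / knowledge-graph backend the real service uses.

FAKE_KNOWLEDGE_BASE = {
    "ada lovelace": "Ada Lovelace was a 19th-century mathematician often "
                    "credited as the first computer programmer.",
}

def lookup_summary(name):
    """Stand-in for a real knowledge lookup; returns None when nothing is found."""
    return FAKE_KNOWLEDGE_BASE.get(name.lower())

def people_skill(slots):
    """Generic action: take the extracted slots and build a spoken response."""
    summary = lookup_summary(slots["name"])
    if summary is None:
        # Generic fallback -- note there is no per-person hard coding anywhere.
        return "Sorry, I'm not sure how to help with that yet."
    return summary

print(people_skill({"name": "Ada Lovelace"}))
print(people_skill({"name": "Jesus Christ"}))  # falls through to the fallback
```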
So, to the meat of the argument. I've admitted that Google COULD have hard-coded it to handle this, but I don't believe they did. Why do I believe that? I'm glad you asked.
Let's start with Google's response to the "incident". Roughly stated, they said that if they detect results which could be prone to tampering, they refuse to respond.
But if I were to paraphrase Google's response, what they are basically saying is this: when it runs these sorts of queries, it assigns the results a value. We'll call it "quality". In truth, there could be a host of variables. But in the end, there is some "standard" the results must meet. If the quality of the results it gets back from the query doesn't meet that standard, it issues a standardized response like "I don't know" or "I can't do that yet".
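As a toy illustration of that gate (the scores, the threshold, and the layout are all invented for the example; Google has not published its actual scoring), the logic would look something like this:

```python
# Toy version of the "quality gate": answer only if the best candidate clears
# a threshold. The scores and the 0.9 threshold are invented for illustration.

QUALITY_THRESHOLD = 0.9

def answer_or_decline(candidates):
    """candidates: list of (answer_text, quality_score) pairs from the backend."""
    if not candidates:
        return "I don't know."
    best_answer, best_score = max(candidates, key=lambda c: c[1])
    if best_score < QUALITY_THRESHOLD:
        # Not confident enough: fall back rather than risk a bad answer.
        return "Sorry, I can't answer that yet."
    return best_answer

print(answer_or_decline([("A 19th-century mathematician.", 0.97)]))
print(answer_or_decline([("Answer A", 0.55), ("Answer B", 0.52)]))
```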
Why would I paraphrase and interpret it this way? Machine learning. There are simply too many permutations of things people can ask. Google isn't going to bother coding answers to every question. Instead, for these generalized sorts of questions, Google almost certainly uses machine learning. They have a cloud-based machine learning platform, after all, and this is a more or less perfect application of it. But machine learning itself isn't perfect. It needs to be trained with LOTS of data (I mean staggeringly large quantities for generalized queries like this), and then it still needs to rely on lots of data even to arrive at an answer.
If Google is using machine learning for this (and you can bet your first born, or whatever is valuable to you, that they are), then part of that process is ranking the results and assigning a "confidence", which is our earlier notion of "quality". When answering generalized queries with machine learning, though, knowing that responses are sensitive to bad data and that a bad response can SERIOUSLY hurt PR, that threshold needs to be VERY high. Google will only allow its machine learning to actually return a result to the user if it believes the answer is one of extremely high quality.
In other words, without ANY intervention from Google, the action associated with these things would basically refuse to respond if it found too much diversity in the answers. There are a lot of reasons this might happen. When asking who a person is, if there is not much information to begin with, it may consider all of the information unreliable. If there are multiple, relatively equally famous people with the same name, there might be a lot of competing results. Or, and this is where Jesus comes in, there might be a lot of competing descriptions of the same person.
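Here is a deliberately simplified sketch of that last idea: collect candidate answers and only speak if one of them clearly dominates. The similarity measure (exact string matching) and the threshold are crude stand-ins for whatever the real system actually measures.

```python
# Crude "answer diversity" check: if candidate answers from different sources
# disagree too much, decline to respond. Exact string matching and the 70%
# share requirement are stand-ins invented for this example.
from collections import Counter

def dominant_answer(candidates, min_share=0.7):
    """Return the most common candidate only if it clearly dominates."""
    if not candidates:
        return None
    answer, count = Counter(candidates).most_common(1)[0]
    if count / len(candidates) < min_share:
        return None  # too much disagreement, so stay quiet
    return answer

# Mostly consistent sources: an answer comes back.
print(dominant_answer(["a 19th-century mathematician"] * 8 + ["a poet"] * 2))
# Sources split three ways: the service refuses rather than pick a side.
print(dominant_answer(["son of God"] * 4 + ["historical preacher"] * 3 + ["prophet"] * 3))
```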
Jesus is a bit of a problem for deep learning, I would imagine. You see, firstly, we have a figure who actually existed as a real person. At least, we're fairly sure that there was a Jesus of Nazareth, and that the biblical accounts of his death are fairly accurate. So, we likely have descriptions of Jesus out there which describe him ONLY in a NON-RELIGIOUS context. But then we likely have sources which describe him in a PURELY religious context. And then there will be sites which describe him using a mix of the two.
Right away, if the skill detected these inconsistencies in any great quantity, it would likely kill the response right there. And it is easy to see why this is the right move. If a Christian asks and gets back a response like "Jesus Christ, also known as Jesus of Nazareth, was a man who is purported to have lived over 2000 years ago and was killed for religious crimes in Rome", they might be a tad offended. If a Jew or Muslim asks and gets back "Jesus Christ is the son of God and your Lord and Savior", they would almost definitely be offended. If a person asking purely as a matter of historical inquiry got the latter response, they might not be offended, but they would definitely question the value of the service.
The algorithm just takes in data, compares it against a model, decides if there is a good enough answer and then spits it out if there is. It has no idea what the relevance of the question is or, if there are multiple answers, which is the "correct" answer to give in different situations. It doesn't even know what the current "situation" is.
And Jesus isn't JUST a religious figure and an actual person. He also has different descriptions in different religious contexts. Some accept him as a prophet, but not as the son of God. Some see him as a false prophet. And some say very nasty things about him indeed. He is also a target for Satanists. He is brought up a LOT in satire and other contexts. And, in all of these things, he features much more prominently than other religious figures do in English-language, North American sources.
Ultimately, it also matters what data was used to train the model and what data was fed in.
But, if you think my answer sounds flaky, or you don't actually understand machine learning, here is a much simpler approach. Ask your Google Home device a few thousand questions of formats you expect it to understand. Ask it who an assortment of people are of varying degrees of fame and in varying fields. Ask it to define an army of words from the standard to the silly.
You'll note a few things as you go along... it is able to answer A LOT of questions. The list of questions it can answer is "odd". It will be able to answer some weird things you wouldn't expect, and fall flat on its face on some items you'll be perplexed to find it can't answer.
If you want to take this to the next level, track your questions and their responses in a spreadsheet and repeat the exercise once a month. You'll notice over time that some responses change.
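If you'd rather not maintain the spreadsheet by hand, here is a small sketch of that experiment in Python. The ask_device() function is a placeholder; record the spoken answers however you like (type them in, or wire it up to the Assistant SDK if you have that configured). The question list and file name are arbitrary.

```python
# Sketch of the "track it in a spreadsheet" experiment. ask_device() is a
# placeholder for however you capture the speaker's spoken reply; the
# questions and the CSV file name are arbitrary examples.
import csv
from datetime import date

QUESTIONS = [
    "Who is Marie Curie?",
    "What is a quokka?",
    "Who is Jesus Christ?",
]

def ask_device(question):
    """Placeholder: record the device's spoken answer however you like."""
    return input("{}\nDevice said: ".format(question))

def log_run(path="assistant_answers.csv"):
    """Append today's answers so runs from different months can be compared."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for q in QUESTIONS:
            writer.writerow([date.today().isoformat(), q, ask_device(q)])

if __name__ == "__main__":
    log_run()
```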
Whether you repeat the questions or not, though, the takeaway is simple. Between the odd hits and misses and the sheer quantity of things it can answer, it is obvious that no matter how many questions you come up with, you could never hope to exhaust what the device can successfully answer. This is your first clue that the individual questions and responses are NOT "CODED". Yes, there are some Easter eggs and other scripted things that the device does.
But general knowledge questions rely on Google Search and TensorFlow (or perhaps some other machine learning platform internal to Google). There are simply too many questions to code responses to. And even if they could, they still wouldn't. The more command structures the service is trained to recognize, the slower recognition becomes. In other words, it is quicker for the service to match "Who is Jesus" to "Who is {name}" than it is to compare it to a list of possible name combinations, not to mention all other possible commands. And, again, even if it were hard-coded in the skill, adding all of the exceptions costs computation time.
Other religious figures, likely, just had a higher ratio of more consistent data, allowing the service to arrive at an answer with a high enough quality to disseminate when asked.
This is a really long article. But the end of it is... THIS IS NOT NEWS. GOOGLE IS NOT TARGETING JESUS. And it has "fixed" the problem.
I want to get back to news which isn't a week old, factually inaccurate and displaying clear elitist fears. I'm actually starting to miss the news about Google Home devices breaking people's networks or how Google was finally deploying a fix. At least it was, you know, ACTUALLY A PROBLEM, even if that too was reported to death.