AI Assistants in devices with screens?

Amazon did it with the Echo Show and it looks like Google has decided to come to the party too. Only, I really wish they hadn't.

The thing is, while I don't really have a specific issue with displays or touch controls on these digital assistants, it comes back to the problem of coupling too many things together. The end result is an added UI paradigm which is intrinsically tied to the assistant, and hardware which is more expensive than it needs to be.

They've also gone out of their way to make sure that the device "isn't a tablet". Except... why not? How is this NOT the perfect use for an old tablet? It should be an app or a stripped-down version of Android which developers can build upon.

Again, I'll make my argument: these devices need to be a collection of one or more disparate elements of the smart ecosystem. Not whole devices. I mean yes, at some level you need to sell something which represents a "whole" device. But that device should be built from smaller interoperable building blocks:
  • Microphone
  • Speaker
  • Simple Display (clock like)
  • Rich Display (tablet like)
  • Touch Display
  • Button
  • etc...
There are certainly viable cases for JUST a microphone. The speakers, even on the Echo Dot and Google Home Mini, are loud enough for many spaces, especially if that space isn't one you listen to music in very often. Having a super cheap device which integrates with a network and is JUST a microphone lets a user improve the listening coverage and accuracy of their smart home without paying extra to get it bundled with a speaker.

Similarly, if I want to kit out a single room with multiple speakers to get an immersive listening experience I really only need 1 microphone. And I should be able to use 1 microphone to interact with any speaker in the house, even if that interaction might be a little cumbersome. 

These smart displays should be the same. They shouldn't come only as complete packages which NEED a microphone, speaker and display. If I can logically group my smart devices, I should be able to have, say, 3 speakers, 2 displays (one touch and one not) and one microphone, all as separate devices in a group called "kitchen". When I say "Hey Google, find me a recipe for chicken noodle soup", it should broadcast audio for the response across the speakers in the group associated with the microphone and display results on the screens in that group. If I then go to the touch screen and select a specific recipe, it should update on the other display as well and read off instructions on all speakers in the group again.
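To make the kitchen example concrete, here is a rough sketch of what I mean by routing a response by capability instead of by device. None of this is a real Google or Amazon API. The names (Capability, Device, Group, respond) are all made up for illustration, and a real system would obviously need networking, discovery and so on.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Capability(Flag):
    """Hypothetical building blocks a device can expose."""
    MICROPHONE = auto()
    SPEAKER = auto()
    DISPLAY = auto()
    TOUCH = auto()


@dataclass
class Device:
    name: str
    capabilities: Capability
    log: list = field(default_factory=list)  # stands in for real output

    def handle(self, kind, payload):
        # A real device would play audio or render a screen here.
        self.log.append((kind, payload))


@dataclass
class Group:
    name: str
    devices: list

    def with_capability(self, cap):
        return [d for d in self.devices if cap in d.capabilities]

    def respond(self, speech, visual):
        # Audio goes to every speaker in the group, visuals to every display.
        for d in self.with_capability(Capability.SPEAKER):
            d.handle("speak", speech)
        for d in self.with_capability(Capability.DISPLAY):
            d.handle("show", visual)


kitchen = Group("kitchen", [
    Device("ceiling-mic", Capability.MICROPHONE),
    Device("speaker-1", Capability.SPEAKER),
    Device("speaker-2", Capability.SPEAKER),
    Device("speaker-3", Capability.SPEAKER),
    Device("fridge-screen", Capability.DISPLAY),
    Device("counter-touchscreen", Capability.DISPLAY | Capability.TOUCH),
])

# The microphone heard the request; the whole group answers.
kitchen.respond(speech="Here are some chicken noodle soup recipes.",
                visual="recipe results")
```

The microphone gets nothing back because it has nothing to play or show, and that is fine. Selecting a recipe on the touch screen would just be another respond() call against the same group.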

So, it isn't really that I have a problem with these devices they are coming out with. There is definitely a place for them. It is that I see no move to compartmentalize and mesh the individual capabilities.

Here is a scenario for me. Let's say I want to build a smart office with a digital whiteboard. I want to be able to buy a 42" smart display with touch capabilities to mount on my wall. Then a few (say 2) cheap smart speakers for surround sound and a cheap microphone array since my office isn't that large and I'd probably spend most of my time at my desk anyway.

I should be able to treat those 4 items as a single, giant device equivalent to the smart displays they are rolling out now. And, if I want better audio quality, I should be able to go out and buy a better set of supported speakers and either add them to, or replace the ones currently in, that group. Similarly, if I decide that the microphone I bought isn't getting the coverage I wanted, I can pick up another or replace it with a better one, or whatever.
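Sketched out, the office scenario might look something like this. Again, this is purely hypothetical: the class names, the capability strings and the is_smart_display check are my own invention, not anything these vendors actually expose.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Device:
    name: str
    capabilities: frozenset  # e.g. frozenset({"speaker"})


@dataclass
class Group:
    name: str
    devices: dict = field(default_factory=dict)

    def add(self, device):
        self.devices[device.name] = device

    def replace(self, old_name, new_device):
        # Swap one building block without touching the rest of the group.
        del self.devices[old_name]
        self.add(new_device)

    def capabilities(self):
        # The group is only the sum of its parts.
        caps = set()
        for d in self.devices.values():
            caps |= d.capabilities
        return caps

    def is_smart_display(self):
        # Assumption: "smart display" means microphone + speaker + display.
        return {"microphone", "speaker", "display"} <= self.capabilities()


office = Group("office")
office.add(Device("wall-display-42in", frozenset({"display", "touch"})))
office.add(Device("cheap-speaker-left", frozenset({"speaker"})))
office.add(Device("cheap-speaker-right", frozenset({"speaker"})))
office.add(Device("mic-array", frozenset({"microphone"})))

# Later on: upgrade one speaker, and the group is still a "smart display".
office.replace("cheap-speaker-left", Device("nice-speaker-left", frozenset({"speaker"})))
```

The point is that the group, not any single box, is what counts as the "device".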

After all, the primary goal of the microphone is to pick up audio to control the assistant. While some sort of user feedback is generally preferable, it isn't required. And while a smart speaker is perhaps better when tied in with the assistant, it should be fine without a microphone and be able to act more or less like a Chromecast Audio. Same with a display surface, with or without audio. It should be able to act more or less like a Chromecast device.

I think that last bit about Chromecast is what drives me the battiest. THEY ALREADY HAVE SEPARATE PROTOCOLS/APIS FOR EXACTLY THIS. They should be making those embeddable. If they are beefed up a bit so that they can network and extend capabilities in that way... AWESOME! If they rely solely on those same protocols and APIs... still awesome, because then 3rd party devs would have the same access.

Anyway, rant done.
