Frustration...

So, I spent a bit of time getting the WP7 Developer SDK up and running on Windows 8 and had a simple idea for a quick app. Not something I planned to publish or anything. Just something to toy with some of the APIs.

I decided on a quick app to record, manipulate and save audio from the phone and then set that audio as a ring tone.

The first hurdle was the aforementioned issue with the SDK on Windows 8: the XNA files from the original 7.1 SDK don't install properly on Windows 8 and require a workaround, and then there was a 7.1.1 update which unofficially gets everything else working on Windows 8. I'll stay on the fence here. Windows 8 is a beta product, but at the same time, MS devs are supposedly using this OS internally. So I land somewhere in the middle on how I feel about a relatively new SDK not running on this system.

Next was the recording. It could not be harder to find common-sense documentation on this topic. Combine this with some posts I read about pieces of the API changing a bit over time and you have some real fun on your hands. Firstly, the MS example doesn't cover your buffer filling more than once, and since the max buffer size equates to about 1 second of audio, the example is absolutely useless. Then, no matter how much effort you go to, you still end up with buffers of unexpected lengths when joining multiple buffers together... how, I don't know. But eventually I got all of those sorted out... only to find I was regularly losing the last buffer when replaying the stream... why? Because the MS tutorial uses the largest buffer possible, and if you stop before the buffer has been handed over... you lose that data. Yes, there are some workarounds. But it boils down to... if you always take what is in the buffer at the end you often have duplicate data, and if you leave it you often lose data.

I could add a flag, keep the listeners attached and continue processing data until the buffer clears after I stop, but then I have to make stuff async, which for this basic demo... was not something I wanted to do. The other option... shrink the buffer and drop the last bit, if any. The buffer is so small now that you don't really notice.
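For anyone hitting the same wall, here is a rough sketch of the accumulation logic described above. It is written in Python rather than the phone's C# purely to show the bookkeeping, and the `Recorder` class, its method names and the `bytes_read` argument are all made up for illustration. It assumes the microphone API reports how many bytes it actually wrote into the buffer on each callback; copying only that many bytes is one possible way to avoid odd lengths when joining chunks, not a confirmed diagnosis of the problem above.

```python
class Recorder:
    """Hypothetical collector for PCM chunks delivered by a microphone callback."""

    def __init__(self):
        self._chunks = []
        self.recording = False

    def start(self):
        # Begin a fresh recording and discard anything left from a previous one.
        self._chunks = []
        self.recording = True

    def on_buffer_ready(self, buffer, bytes_read):
        # Copy only the bytes the device reported writing, not the whole
        # (reused) buffer, so stale data never ends up in the recording.
        if self.recording:
            self._chunks.append(bytes(buffer[:bytes_read]))

    def stop(self, tail=b""):
        # 'tail' is whatever partial data could still be read at stop time.
        # It is appended exactly once, so it is neither lost nor duplicated.
        self.recording = False
        if tail:
            self._chunks.append(bytes(tail))
        return b"".join(self._chunks)
```

With a small buffer duration the tail is short, which matches the trade-off above: dropping it costs very little audio.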

Next is manipulation. What I get is a raw byte stream. I could probably invest a ton of time figuring out how to manipulate it. I was hoping someone had gone before me and posted examples, but all anyone ever did was make use of the built-in functions which adjust volume, pitch and pan. Only the pitch is really relevant, and that is a really boring manipulation feature.
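As a starting point for going beyond volume, pitch and pan, the sketch below treats the raw stream as 16-bit signed little-endian mono samples. That format is an assumption, not something confirmed above, and the function names are invented for illustration. Once the bytes are decoded into integers, effects such as reversing the clip or adding a simple echo become ordinary list operations.

```python
import struct


def bytes_to_samples(raw):
    # Decode 16-bit signed little-endian PCM; a trailing odd byte is ignored.
    count = len(raw) // 2
    return list(struct.unpack("<%dh" % count, raw[:count * 2]))


def samples_to_bytes(samples):
    # Clamp to the 16-bit range before packing so loud mixes clip, not crash.
    clipped = [max(-32768, min(32767, int(s))) for s in samples]
    return struct.pack("<%dh" % len(clipped), *clipped)


def reverse(raw):
    # Play the clip backwards by reversing sample order (not byte order).
    return samples_to_bytes(bytes_to_samples(raw)[::-1])


def echo(raw, delay_samples, decay=0.5):
    # Mix each sample with a quieter copy of the sample 'delay_samples' earlier.
    samples = bytes_to_samples(raw)
    out = list(samples)
    for i in range(delay_samples, len(samples)):
        out[i] = samples[i] + decay * samples[i - delay_samples]
    return samples_to_bytes(out)
```

For example, at a 16 kHz sample rate `echo(raw, 4000)` would give a quarter-second echo.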

Saving... well, all you get from the microphone is raw data, which isn't all that useful. Sure, I can play it back within my own application. But nothing else can. So I found an article explaining how to write the WAV headers into the stream. Wrote that. Then found out that for ringtones you can only use WMA files. F*CK! And guess what? No simple headers for that one. Also, the microphone records at the wrong bitrate. So dead in the water.
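The WAV step is at least simple. Below is a minimal sketch of the standard 44-byte header for uncompressed PCM, again in Python just to show the byte layout. The defaults (16 kHz, mono, 16-bit) are assumptions about what the microphone delivers, and `wav_header` and `wrap_pcm` are hypothetical helper names. This does nothing for the WMA problem; it only makes the raw data playable elsewhere.

```python
import struct


def wav_header(data_len, sample_rate=16000, channels=1, bits=16):
    # Sizes derived from the format: bytes per frame and bytes per second.
    block_align = channels * bits // 8
    byte_rate = sample_rate * block_align
    return (
        b"RIFF" + struct.pack("<I", 36 + data_len) + b"WAVE"
        # "fmt " chunk: 16 bytes long, format tag 1 means uncompressed PCM.
        + b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels,
                                sample_rate, byte_rate, block_align, bits)
        # "data" chunk: its length is the number of raw PCM bytes that follow.
        + b"data" + struct.pack("<I", data_len)
    )


def wrap_pcm(raw, **fmt):
    # Prefix raw PCM bytes with a matching header to get a complete WAV file.
    return wav_header(len(raw), **fmt) + raw
```

Writing the result of `wrap_pcm(raw)` to a `.wav` file should be enough for most players, provided the assumed format matches the recording.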

Honestly, it was a good experience. Frustrating? Sure. I probably won't work on any more apps using audio unless the scope is more limited. At the point my app is at now, I could add sliders or another UI control for controlling the pitch and be pretty much on par with about 2000 other voice modifiers on the marketplace. Live and learn, I guess. And I do enjoy this self-destructive style of development. So I'll probably find equal frustration in my next project.

Next project: make use of the camera in some way. Pray in advance that the camera takes pictures in some sort of useful format.
