Wednesday, November 27, 2013

Enhanced Dictation in MacOS X 10.9 as an STT Engine for Extant Audio Files

The pressure is on to take affirmative action to make screencasts and other online video more accessible. Of course, this includes eTextbooks that contain video. One important aspect of that challenge is to make video more accessible to persons who are deaf or have difficulty hearing. For video content creators, this means providing a transcript or, better, providing subtitles to that video so that dialogue may be viewed in the same context as the video. This is fast becoming de rigueur.
The problem is that many videos are created without a script that is followed closely by the speakers in that video. Indeed, many important videos are created in ad hoc fashion (interviews, panel discussions, conference presentations and the like) where scripts would be totally inappropriate.
Creating text from speech has become essential to meeting these expectations, especially where all one has to work with is the speech in the audio track of a video. Speech to text (STT) is a bit more difficult than text to speech (TTS) which has been in use much longer.
MacOS X recently introduced Dictation (speech-to-text) as a feature usable in any application that takes text as input. This is quite an advance over having to purchase a two hundred dollar application to accomplish the same end. However, the first iteration of this system required an internet connection so that speech could be uploaded to Apple's servers where it would be turned into text. This created delays and made Dictation difficult to use for substantial bodies of text. Dictation was given a significant boost in MacOS X 10.9 (Mavericks) with the introduction of Enhanced Dictation, which enables offline use and continuous dictation with live feedback. Enhanced Dictation is NOT enabled by default (see link above for details on how to enable it).
Still, this is a system that assumes a live speaker. There is no obviously easy way to route speech from a recorded file through Apple's Dictation system to produce usable text. That's what this post is all about. You can, in fact, route the speech in an audio file through Apple's speech-to-text subsystem and render very usable text output. It isn't intuitive or Apple-easy but it is something that anyone can accomplish with a bit of determination. Here's how.
The application at the center of this process is Audio HiJack Pro by Rogue Amoeba ($32 USD). There are two things to set up with this app. The first is to identify the source of the audio. It could be any app that emits audio but I used QuickTime Player X. Thus, I set that app as the audio source as follows:

ss.01


This will capture the audio from anything that this app plays. My sample audio is from NPR and contains a dramatic reading by noted actor Sam Waterston. It looks like this in QuickTime Player X:


ss.02

This configuration will grab all the audio from QuickTime Player X as it plays the "NPR Gettysburg Address" audio file. Next, we use Audio HiJack Pro to send that audio to Soundflower (free). To do that, we go to the Effects tab and choose Auxiliary Device Output from the 4FX menu.

ss.03


The Auxiliary Device Output plug-in enables us to choose the previously installed Soundflower as the recipient of the HiJacked audio as follows:

ss.04

Once installed, Soundflower becomes an input/output option in your Sound preference pane and everywhere else audio sources and destinations can be specified. In other words, it becomes an integral part of your sound system in MacOS X.

Finally, we set the Dictation input to be Soundflower as follows:

ss.05

At this point, any audio played by QuickTime Player X will be routed to Soundflower and will thus become available to any application that accepts text input and has a Start Dictation menu item. In Pages, that looks like:


ss.06

The following screencast illustrates this process from start to finish:




A very special "Thank You" to Chris Barajas at Rogue Amoeba who patiently worked me through the intricacies of the Auxiliary Device Output plug-in for Audio HiJack Pro.

Tuesday, September 3, 2013

Using SVG images in iBooks Author

The primary advantage of Scalable Vector Graphics (SVG) files is that a very small file can be scaled up to yield large images without the aliasing (jaggies) that appears when a bitmapped graphic is scaled up. SVG files are resolution independent, usually non-photographic and commonly carry the suffix *.svg. There are lots of free SVG files available on the Internet and there are many applications for creating SVG files such as the free, open source Inkscape. SVG files can now be viewed via drag & drop onto web browsers such as Firefox, Chrome or Safari or via the free MacOS X app, Gappin. For an excellent primer on vector graphics, see this Wikipedia article.

So the advantage of vector graphics to iBooks Author is that these images can be scaled up without making those images fuzzy with jagged edges. A single U.S. state map vector image, for example, could be used to show full screen versions of each state without any visual loss. Complex diagrams such as those produced by OmniGraffle can be included at any level of detail.
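To make the scaling point concrete, here is a minimal Python sketch. It is only an illustration: the shapes, the 100x100 viewBox and the function name are my own assumptions, not taken from any file mentioned in this post. It writes the same tiny drawing at two very different display sizes and shows that only the width and height attributes change.

```python
# Hypothetical drawing, defined once on a 100x100 coordinate grid.
SHAPES = (
    '<circle cx="50" cy="50" r="40" fill="none" stroke="black" stroke-width="4"/>'
    '<path d="M20 50 L50 20 L80 50 Z" fill="steelblue"/>'
)


def svg_document(display_width: int, display_height: int) -> str:
    """Return an SVG document that shows SHAPES at the requested display size.

    The viewBox stays fixed, so the renderer rescales the geometry itself;
    no pixel data is stored in the file.
    """
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{display_width}" height="{display_height}" '
        f'viewBox="0 0 100 100">{SHAPES}</svg>'
    )


if __name__ == "__main__":
    small = svg_document(100, 100)
    large = svg_document(4000, 4000)
    # The two documents differ only in their width/height attributes.
    print(len(small), len(large))
```

A bitmapped image 40 times larger in each dimension would need roughly 1,600 times as many pixels; the vector description stays essentially the same size.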

However, it is not possible to use SVG images directly in iBooks Author. If you attempt to drag and drop an SVG file onto an iBooks Author project, nothing will happen. You'll get no error messages or feedback of any kind. Similarly, apps in the iWork suite (Pages, Keynote and Numbers) will also refuse to accept SVG files. This post is about how to work around that limitation.

The iBooks Author application has its own Text, Shapes and Graphs menus with which a number of vector graphics can be created. Another option is to use the vector graphics created by Keynote, Numbers and Pages. These can be copied and pasted directly into an iBooks Author project. Graphics created in iBooks Author or any of the iWork suite applications are vector graphics in PDF file containers, not SVG file containers. PDF files can contain text, bitmapped graphics and vector graphics. We are most interested in the last of these, vector PDFs. The OmniGraffle application is a considerably more sophisticated graphics toolset and is capable of exporting both SVG vector drawings and PDF vector images. The latter are compatible with the iWork suite and iBooks Author.

That's useful, but there is an Internet full of already drawn SVG images that are in the public domain or CC licensed. It would be a shame not to have access to that vast library of free vector images. The workaround for this problem is to use this online conversion service to convert SVG to PDF and then drag and drop that PDF directly into an iBooks Author project or into one of the iWork apps or OmniGraffle for further manipulation.

Download an *.ibooks file here that shows how vector graphics created in iBooks Author compare with vector graphics converted from SVG files. The following screencast uses that same multi-touch eBook.


Sunday, July 28, 2013

Using iAd Producer with iBooks Author (example: creating an external video player widget)

The iAd Producer application from Apple has grown considerably since its inception. Originally, it was a highly specialized application that created advertisements for mobile devices from Apple. Those iAds were composed of sophisticated HTML, CSS and JavaScript.

Since then, it has been expanded to create iTunes LPs for music albums sold in the iTunes Store and iTunes Extras for video sold in the iTunes Store. These, too, rely upon HTML, CSS and JavaScript web technologies. Most recently, iAd Producer has added iBooks Author HTML widgets to its repertoire. Thus, the following screencast tutorial shows how easy it is to use iAd Producer to create a high quality HTML widget for iBooks Author without writing a single line of code.

This example focuses on creating an HTML widget that plays a video hosted on an external server. This keeps the size of your *.ibooks file down making for quicker downloads and avoiding becoming a burden to iPads already nearly filled to capacity with other books and media.



Try viewing this video in full screen to catch all of the details. Note that Firefox doesn't like MPEG-4 and will refuse to play this. Try any other modern web browser (Safari, Chrome, etc.).

Download the example book to an iPad to get an even better view of how this looks and feels in the hands of your audience.

Download iAd Producer (free developer registration required)

Tuesday, June 18, 2013

Danger: Do Not Duplicate an *.iba File - Use Templates Instead


The setting. I wrote four eTextbooks (of six planned) for a private iTunes U course that I am teaching. I used names such as Unit.01 Getting Started, Unit.03 Capture and so on.  These were uploaded to iTunes U via the iTunes U Course Manager. 

The bad move. Once I got the design of Unit.01 decided upon, I duplicated the *.iba file for Unit.01 and edited it to make the shell for Unit.02 and so on. What I should have done was to create a template and work from that. Bad move.

The reason. After a bit of back and forth with some very diligent folks at Apple involving both the iTunes U and iBooks Author teams, we discovered that my bad move caused the *.iba files to have the same internal ID and that caused all of the resulting *.ibooks files exported from them to inherit that identical internal ID.  

The cure. So I opened each *.iba file and made a template from it, closing it and adding the prefix "old_" to its file name in the Finder, just to have a fallback in case something went wrong. After that, I created a new IBA project file from the just-created template, saved that under the old name (no prefix) and exported a new *.ibooks file, replacing the old, defective, identity-challenged *.ibooks file. Rinse, repeat with each of the other eTextbooks.

Testing has shown that all of these eTextbooks have regained their individual identities. Opening Unit.04 always results in opening Unit.04. Success.

The moral. Make templates, don't dupe an *.iba file.

Friday, February 8, 2013

HTML Widgets Made Easy: External Video Player Example

The iBookstore limits the size of *.ibooks files created with iBooks Author (henceforth, IBA) to 2 GB and recommends that you keep the size of your iBook file under one gigabyte if possible, both to avoid taking too much space on your readers' iPads and to spare your readers long download times. Although including video that is internal to your IBA project is a simple drag and drop operation using the Media Widget in IBA, that kind of video will very quickly increase the size of your iBook and may place an unwelcome burden on some of your readers.

The alternative is to include external video in your IBA project using a custom made HTML Widget. The big advantage is that a one megabyte HTML Widget can play a 70 megabyte video in your iBook. The downside, of course, is that the reader must have an active internet connection and the availability of the video must be maintained. If a video used in your iBook should become unavailable, you can provide your readers with a free upgrade correcting that issue via the iBookstore's versioning feature.

Unfortunately, many people are persuaded not to use this approach because it involves writing HTML code but this post will offer you a way around that obstacle. If you can use a text editor, you can modify an HTML widget template that plays an external video that you select for your iBook. Here's how:

As you'll learn from this Apple support document, an HTML widget is nothing more than a collection of text files enclosed in a folder with the suffix ".wdgt" added to it. On the Mac, adding that suffix to the folder name changes the appearance of the folder into a widget icon. The minimum HTML widget contains three files: a Default.png file, an index.html file and an info.plist file. I have prepared an HTML widget that you can use as both an example and a template. It is a ZIP archive containing a complete working widget that you can add to a test IBA project. Once you have it in an IBA project you may use the Preview function to see how it works in the iBooks.app on an iPad. Download that HTML widget here and then double-click on it to extract it.
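For readers who would rather script this than assemble it by hand, here is a minimal Python sketch of the three-file folder layout just described. It is not taken from the downloadable widget. The function names, the com.example identifier, the plain gray placeholder image and the property list keys (MainHTML, Width, Height and the CFBundle entries) are all assumptions modeled on ordinary Dashboard-style widgets, and the sketch writes the property list as Info.plist with a capital I. Check these details against the Apple support document before relying on them.

```python
import plistlib
import struct
import tempfile
import zlib
from pathlib import Path


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def placeholder_png(width: int, height: int) -> bytes:
    """Return a solid gray, 8-bit grayscale PNG to stand in for Default.png."""
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    row = b"\x00" + b"\x80" * width  # filter byte, then one gray pixel per column
    pixels = zlib.compress(row * height)
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", header)
        + png_chunk(b"IDAT", pixels)
        + png_chunk(b"IEND", b"")
    )


def build_widget(parent: Path, name: str, width: int, height: int) -> Path:
    """Create <name>.wdgt under parent containing the three minimum files."""
    widget = parent / f"{name}.wdgt"
    widget.mkdir(parents=True, exist_ok=True)

    (widget / "Default.png").write_bytes(placeholder_png(width, height))
    (widget / "index.html").write_text(
        "<!DOCTYPE html>\n<html><body></body></html>\n", encoding="utf-8"
    )

    # Assumed keys; verify them against Apple's widget documentation.
    info = {
        "CFBundleIdentifier": f"com.example.{name}",
        "CFBundleName": name,
        "CFBundleDisplayName": name,
        "CFBundleVersion": "1.0",
        "MainHTML": "index.html",
        "Width": width,
        "Height": height,
    }
    (widget / "Info.plist").write_bytes(plistlib.dumps(info))
    return widget


if __name__ == "__main__":
    out = build_widget(Path(tempfile.mkdtemp()), "ExternalVideo", 640, 360)
    print(sorted(p.name for p in out.iterdir()))
```

The placeholder image only keeps the folder complete; in practice you would replace Default.png with a frame from your own video at the same dimensions.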

This example widget plays a video called "Open Access Explained" that is hosted on a server that I have access to. In this tutorial, I will show you how to open the widget and modify it so that it will play another video, one that is on a server that I do not have access to, a video that you choose. All I'll have to do to accomplish this feat is to open the widget, change the Default.png file and edit the text of the index.html and info.plist files so that they reference a different video. It's just that easy.

Of course, that video must be playable on an iPad, so no Flash. These tech specs provide all the necessary details. The great thing about video on the iPad is that the HTML 5 video tag works without having to create multiple fallback versions of your video (*.mp4, *.ogg and *.webm) as one would have to do on a web site. As long as it's using the MPEG-4 H.264 video and AAC audio CODECs, it can be in a MOV, MP4 or M4V container. More simply, if the video plays on your iPad, it will play in this widget.
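As a rough illustration of how little markup a single-source video needs, here is a small Python sketch that generates a video tag and wraps it in a bare index.html page. The function names and the page skeleton are hypothetical and are not copied from the example widget. The only HTML features it relies on are the standard src, width, height, autoplay and controls attributes of the video element.

```python
from html import escape


def video_tag(url: str, width: int, height: int,
              autoplay: bool = True, controls: bool = True) -> str:
    """Return an HTML 5 video element with one source and no fallbacks."""
    attrs = [
        f'src="{escape(url, quote=True)}"',
        f'width="{width}"',
        f'height="{height}"',
    ]
    # Boolean attributes: their presence alone switches the feature on.
    if autoplay:
        attrs.append("autoplay")
    if controls:
        attrs.append("controls")
    return "<video " + " ".join(attrs) + "></video>"


def index_html(url: str, width: int, height: int, **options: bool) -> str:
    """Wrap the video element in a minimal page (hypothetical skeleton)."""
    return (
        "<!DOCTYPE html>\n"
        '<html>\n<head><meta charset="utf-8"></head>\n'
        f"<body>{video_tag(url, width, height, **options)}</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Placeholder URL; substitute the address of your own hosted video.
    print(index_html("http://example.com/my-video.mov", 640, 360))
```

Passing autoplay=False or controls=False simply leaves that attribute out of the tag.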

The video that I'll be modifying the widget to use in this tutorial is:
http://movies.apple.com/media/us/mac/ibooks-author/2012/tours/apple-ibooks-tour-ipad_ibooks_author-cc-us-20120314_r848-9cie.mov

Because I'll be changing the widget to play another movie with different dimensions, I'll need to create a new Default.png file and change all of the references from the old video to the new video. I'll be using the BBEdit text editor but any plain text editor such as TextEdit will do just as well. Here's a screencast showing how this is done. Click HERE or click the image below.




Caveats: Some video services such as Vimeo and YouTube go to great lengths to tie their hosted video to their own web sites so that they can generate data about you and get paid for exposing advertising to you. Thus, it is just a little more difficult to use these videos in an iBook but it can be done. I may take that topic up in the next post.

NOTES:

• About the "autoplay" attribute. You'll notice that I use the optional "autoplay" attribute in the HTML 5 "video" tag. All HTML widgets take over the entire screen when invoked. Under iOS 6.x (tested), tapping this external video player widget will bring the poster (Default.png) image to full size atop a white background that also displays the widget's title and caption as well as a close button. A standard iOS "play" icon will be superimposed on the center. The video will begin to play automatically without the reader having to tap this play button. How long it takes before playback starts depends upon the size of the video and the speed of the network. The operating system tries to estimate when it can play the external video without interruption. The reader can always tap the play button before playback starts on its own. Simply delete the "autoplay" attribute if you'd rather not have this feature operative.

• About the "controls" attribute. I also use the optional "controls" attribute. This provides the reader with a standard video controller with which they can control playback of the video such as audio volume and two "full screen" options as well as a scrubber for moving the play head to arbitrary points along the time line. Simply delete the "controls" attribute if you'd rather not have this feature be available to your readers. The following image shows these various controls and their effects.




Resources:

Safari HTML 5 Audio and Video Guide

iBooks Author: About HTML widget creation