User:Tone/Wildfire 1.0

From The Dreadnought Project

Wildfire Prototypes

Bill and Nick had created the broad outlines of what Wildfire would become before any prototyping was undertaken. The core principles evident in their slides and presentations included:

  • voice dialing
  • foreground/background modes for the interactive agent (i.e., so that she could remain on the line after the user placed a call and re-appear)
  • objects such as messages, sales brochures, and the like that could be iterated over, with operations (e.g., "Delete" or "Play") applied to the selected item

These would provide a clear enough framework for immediate work. The initial ideas "seemed right".

Hardware

Bill had done enough homework to learn about special telephony trunk lines and the line interface cards that could connect a computer to them. Through dedicated digital busses and other dedicated cards, the computer could hear and talk to users over the telephone network, perform operations similar to a switchboard, and recognize commands.

For a quick prototype sufficient to illustrate the idea to potential investors, the computer would need only one card: a VPC-100 speech recognizer card made by Voice Processing Corporation of Cambridge. The nature of this card and its brethren would do much to define what Wildfire could be for the next five years, so it bears a little discussion. VPC recognizers were all hardware-based. The VPC-100 had a single channel of capacity and was really made for demonstrations more than anything. It provided discrete recognition: the caller spoke one word or phrase at a time, which the card matched against a small fixed vocabulary.
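The discrete-recognition model shaped everything built on top of the card, so it helps to picture what the host-side interface amounts to. The sketch below is purely illustrative: `load_vocabulary` and `recognize` are hypothetical names for the kind of calls such a board exposed, not VPC's actual API.

```python
# Hypothetical host-side view of a discrete, hardware-based recognizer such
# as the VPC-100: load a small fixed vocabulary, then block until the card
# reports which entry (if any) it heard. All names are illustrative.

VOCABULARY = ["Play", "Delete", "Next message", "Send message"]

def await_command(card, timeout_s=10.0):
    """Return the recognized phrase, or None on timeout or no match."""
    card.load_vocabulary(VOCABULARY)   # recognition is against this list only
    index = card.recognize(timeout_s)  # one discrete utterance, not dictation
    return VOCABULARY[index] if index is not None else None
```

The point of the shape is the constraint: the system can only ever hear one of a handful of pre-loaded phrases at a time, which is why early Wildfire interactions were command-and-response rather than free conversation.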

Equipped with a little daughterboard card, it could be connected directly to a modular RJ-11 phone connector identical to any simple landline.

The four founders leased around 3,000 square feet at 245 (265?) Winter Street in Waltham. Working at a Steelcase desk with his own 386 computer brought from home and a copy of Microsoft's C Compiler, Tony plugged in the VPC-100 card and thumbed through the VPC documentation to build a sample application. When called, it answered, said "Hello", and could perform other rudiments, such as reading back a series of digits spoken by the user. With the source code available, it was easy enough to see how to have the card do speech recognition and handle answer/hang-up operations. Tony got to work creating a prototype from these magical rudiments.

First Demo System

Before long (a day? two?), a single-user system was ready for demonstration. A user could call the machine, and an "assistant" speaking in Tony's voice would ask who was calling. Each user identified himself by name or by speaking some digits associated with his account. Then the robotic Tony might say, "You have two messages. The first is from <user name>". The user could say "Play" and hear it, and then the system would be ready for another command, such as "Delete". This series of commands and responses became known, somewhat drily, as a "session". Other commands were more complex, in that they caused a "dialog" to ensue before the command was completed and a new command could be employed. The first such command was "Send message", which might go like so:

User: Send message

Assistant: to who?

User: Bill Warner

Assistant: <Bill Warner> ... go ahead (beep)

User: Hi, Bill -- just sending you a message to test the system (a press of # or silence would cause the message recording to terminate)

Assistant: Message sent. Please say a command.

The system as first used had only a few commands: perhaps "Play", "Next message", "Previous message", "Delete", and "Send message". The team had to take turns using the machine, because when it picked up the phone and spoke to one user, the single phone line and single channel of recognition were dedicated to that "session". When that user hung up, the next user could try calling in to his account. If new messages arrived while a user was speaking to the device, he could only discover them by hanging up and calling in again. These were simplifying "baby steps", as Bill had already envisioned a more conversational "session" during which users could receive messages as live phone calls.
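The session structure described above can be sketched as a simple command loop, with "Send message" illustrating how a "dialog" nests inside it. This is a minimal sketch assuming callback functions (`recognize`, `speak`, `record`) that stand in for the recognizer card and audio playback; none of these names come from the actual Wildfire code.

```python
# A minimal sketch of one caller "session". The recognizer returns one
# discrete phrase at a time, as the VPC-100 did; recognize, speak, and
# record are hypothetical callbacks, not the actual Wildfire interfaces.

def run_session(messages, recognize, speak, record):
    """Announce waiting messages, then loop on single-phrase commands."""
    speak(f"You have {len(messages)} messages.")
    current = 0
    while True:
        command = recognize()              # blocks until one phrase is heard
        if command == "Play":
            speak(messages[current]["body"])
        elif command == "Next message":
            current = min(current + 1, len(messages) - 1)
        elif command == "Previous message":
            current = max(current - 1, 0)
        elif command == "Delete":
            del messages[current]
            current = max(0, min(current, len(messages) - 1))
            speak("Deleted.")
        elif command == "Send message":
            # A command that opens a "dialog" before control returns.
            speak("To who?")
            recipient = recognize()
            speak(f"{recipient} ... go ahead")
            body = record()                # recording ends on '#' or silence
            # ...the recorded message would be queued for delivery here...
            speak("Message sent. Please say a command.")
        elif command == "Hang up":
            break
```

Because the whole loop runs against one line and one recognition channel, the session owns the machine until the caller hangs up, which is exactly the single-user limitation the team worked around by taking turns.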

It was exciting to have something working, but Bill wanted to demonstrate voice dialing and the background operation of the assistant. This was thorny: the team had only the one copper land line and a single line interface, while in the long run, placing a call through the system was meant to use a second phone line, brought into service when the user who had called in wanted to call out. The system would act much like an old human switchboard operator: place an outgoing call on a second line, then "patch" the first line to the second. The system, then, was a telephone switchboard.

But Bill remembered a quirky "calling feature" available to phone line customers called "three-way calling". This allowed a person who had received a call (here, the demo machine) to press the switch-hook briefly (called "flashing the switch-hook") and get a dialtone. If the system dialed a number at that point, an outbound call would be placed. Once that call was underway, a second switch-hook "flash" would merge it with the first, placing the demo assistant, the calling user, and the called party together on one three-way conference call. Tony coded it up, and sure enough, it worked!
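The flash-dial-flash sequence is simple enough to sketch. Here `LineInterface`, `flash`, and `dial` are hypothetical stand-ins for the daughterboard's hook-switch and dialing controls, and the one-second pause is an assumed dialtone delay, not a documented timing.

```python
# Illustrative sketch of the three-way-calling trick described above.
# The line object and its methods (flash, dial) are hypothetical stand-ins
# for the line interface's hook-switch and DTMF dialing controls.

import time

def place_background_call(line, number):
    """Conference an outbound party onto the caller's existing call using
    the phone company's three-way calling feature."""
    line.flash()        # brief switch-hook press: the original caller is
                        # held and the central office returns a dialtone
    time.sleep(1.0)     # assumed wait for dialtone before dialing
    line.dial(number)   # place the outbound call on the same copper pair
    # ...once the called party answers...
    line.flash()        # second flash merges both calls into a three-way
                        # conference: assistant, caller, and callee together
```

The appeal of the trick is that the central office does all the switching; the demo machine never needs a second line or any audio-patching hardware of its own.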

The demo system gained a few more commands, but the single-channel nature of the device meant it was never going to become much more than it already was. Still, it showed that the team could produce a workable system, and PowerPoint, charisma, and Bill's bankable goodwill as a proven tech entrepreneur could show a worthy backer that a rich vision was within reach.