Friday, March 14, 2008

Joe Weizenbaum, Father of Eliza and A.I.


This morning I stumbled upon the obituary of the man who can be credited, I think, as the father of Artificial Intelligence. Joe Weizenbaum was 85, and died in Germany, a country he fled from in the Nazi era.

In the 1960s he developed the computer program Eliza, the very first chatbot, a personality who responds to what you type, just like my cartoon horse Whinsey does. It's mind-boggling to me that this was developed in 1966. When I first discovered Eliza in the late '90s I was critical of her because she seemed sort of stiff, not realizing the history there. Eliza played the role of a psychotherapist to the person hooked up with her. I picked up this sample dialog from informationweek.com:

"Men are all alike," the patient states.
"In what way?" asks the computer-therapist.
"They're always bugging us about something."
"Can you think of a specific example?"
"Well, my boyfriend made me come here."
"Your boyfriend made you come here?"
"He says I'm depressed much of the time."
"I'm sorry to hear you are depressed."
"It's true."

Weizenbaum ultimately turned against the whole idea of A.I.; it seems he regarded it as trickery. In my own experience writing a JavaScript chatbot, it is indeed entirely trickery, but at the same time I feel that the vast power of Google is creating its own hive-like brain.

11 comments:

Jesse said...

Hey there Sally, I'm sorry to hear that Whinsey's grandfather passed away.

I was also reading about him and his "man behind the curtain" rant a few days ago while researching some other chatbots, ALICE, Program D, etc.

I thought perhaps I could put one of these together, but the corpus of knowledge seems too rigid. At 40,000 patterns, that's just too much input to enter by hand in order to get a coherent persona to come out.

But you keep pecking away at your side, and I will at mine, one of these days we'll have appliances that think. :)

Anonymous said...

This history is fascinating. Thanks for posting. That dialog you posted is interesting: it validates in the formal way some therapists might! Sally, I think it's very clever that you took the example you saw and ran with it to create such an original, fun Whinsey! I'm impressed!

Linda Davick said...

Jesse: Please do keep pecking away, you and Sally, and one of these days we'll have appliances that thimk, if nothing else.

Sally said...

Jesse, I'm really surprised Google hasn't come out with a chatbot yet. I could see it lightly sprinkled with ad links. One pattern I had working for a while was to grab the longest word in a response, assume that was the important word of the conversation, then go to Google or other sources (thesaurus.com, for example), grab a synonym, and throw that back in the response, with some um, like, uh, sprinkled in. Because Whinsey's a blonde character it would work well.

I haven't checked the logs in a long time, so that may not be working anymore.
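
For readers who want to see the longest-word trick spelled out, here is a minimal sketch. It is not Whinsey's actual code: the `SYNONYMS` table is a made-up stand-in for the thesaurus lookup, and the function names and reply wording are invented for illustration.

```javascript
// Hypothetical stand-in for a thesaurus lookup; a real bot would fetch these.
const SYNONYMS = new Map([
  ["conversation", "chat"],
  ["fascinating", "intriguing"],
]);
const FILLERS = ["um", "like", "uh"];

// Return the longest word in the text (the first one wins on ties).
function longestWord(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  return words.reduce((best, w) => (w.length > best.length ? w : best), "");
}

// Pick a random filler; passed in as a default so callers can override it.
function randomFiller() {
  return FILLERS[Math.floor(Math.random() * FILLERS.length)];
}

// Build a reply around a synonym of the longest word, with a filler dropped in.
function reply(text, pickFiller = randomFiller) {
  const key = longestWord(text);
  const word = SYNONYMS.get(key) || key; // fall back to the word itself
  if (!word) return "Huh?";
  return `So, ${pickFiller()}, tell me more about ${word}.`;
}

console.log(reply("Chatbots are fascinating", () => "um"));
```

Passing the filler picker as a parameter is only there so the output can be pinned down when testing; left alone, it picks a filler at random.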

Sal, that example was better than most of Eliza's responses, but then she was from a long time ago. It was mostly echo talk.

Linda, glad to know you're checking in and thimking about things. We're thimking of you too.

Namowal (Jennifer Bourne) said...

Sally,
I had a hunch Whinsey might have been Google dipping when stuff I mentioned showed up on her television set.
Chatbots are fascinating, and I'm impressed that you're code savvy enough to make one from scratch. I have a Personality Forge bot, but the forge has done the codework; all I did was fill in words and what the bot might say when he heard them.
How difficult was it to learn code?

Anonymous said...

what was that comment about blondes....

Sally said...

Namowal, I'm not going to say it's all easy, but for someone like you it would be a possible project that you might enjoy.

When I first built Whinsey I could barely make sense of a Flash timeline. It was not at all like animation exposure sheets, because to start with, it was horizontal not vertical. (actually makes more sense that way.) However, I'd already spent much time using JavaScript to manipulate images and simulate animation.

My bot has always used regular expressions, a particular part of the JavaScript language that intensely tracks down what it needs from text. It can look totally unworldly at first, then becomes fun.

quest[168]=/\bhuh|comprendo|wha($|\b|\W)?\b/i

That's a regular expression.
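
To see what that pattern does, it can be run against a few inputs. The sample sentences below are made up for illustration; the pattern itself is copied unchanged. It has three alternatives: "huh" at the start of a word, "comprendo" anywhere, or "wha" followed by a word boundary, which is why a bare "Wha?" matches but "What" does not.

```javascript
// The pattern from the comment above, unchanged.
const pattern = /\bhuh|comprendo|wha($|\b|\W)?\b/i;

// Sample inputs are invented for this sketch.
const samples = ["Huh?", "No comprendo", "Wha?", "What time is it?", "Hello there"];

// Print each input alongside whether the pattern matched it.
for (const s of samples) {
  console.log(JSON.stringify(s), pattern.test(s));
}
```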

Because Flash was unable to process regular expressions until its latest incarnation, I had to put all the JavaScript in an external file which my Flash file talked to. Lots of inherent problems there.

Then in addition I added various PHP scripts during a brief "feelin' smart" period which worked for about as long as the feelin' smart lasted.

What they did was grab info from external live sources. Like "What's New?" might bring in a paraphrase of the quoted CNN headlines, maybe with an umm in the middle. But as soon as the layout of any of those sources changed, the scripts stopped working, because they couldn't find the scripted start and stop code elements where the info would be.
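
A rough sketch of that start-and-stop approach, and of how it breaks, might look like this. It is written in JavaScript rather than the PHP the original scripts used, and the function name and page snippets are assumptions made up for the example, not any real site's markup.

```javascript
// Pull out the text between a start marker and a stop marker,
// or return null if either marker is missing.
function between(page, startMarker, stopMarker) {
  const start = page.indexOf(startMarker);
  if (start === -1) return null;
  const from = start + startMarker.length;
  const stop = page.indexOf(stopMarker, from);
  if (stop === -1) return null;
  return page.slice(from, stop).trim();
}

// Made-up page snippets standing in for a news site before and after a redesign.
const oldLayout = '<div class="headline"> Local horse learns to chat </div>';
const newLayout = '<h2 id="top-story">Local horse learns to chat</h2>';

console.log(between(oldLayout, '<div class="headline">', "</div>")); // the headline text
console.log(between(newLayout, '<div class="headline">', "</div>")); // null: the markers are gone
```

The headline is still on the redesigned page, but the script only knows how to find it by the old markers, so it comes back empty-handed.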

The basic structure is a series of numbered simple regular expressions which are ticked through until a match is found. Then a corresponding numbered answer is put together. In addition each answer got an emotional code so Whinsey would play out accordingly.
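
That structure can be sketched in a few lines. This is an illustration under assumptions, not Whinsey's real tables: the `quest` numbering echoes the `quest[168]` example above, while the `answer` array, the `mood` field, the fallback reply, and every entry other than pattern 168 are invented here.

```javascript
// Parallel numbered tables: quest[n] is a pattern, answer[n] its reply and emotional code.
const quest = [];
const answer = [];

quest[1] = /\b(hi|hello|hey)\b/i;
answer[1] = { text: "Well hello there!", mood: "happy" };

quest[2] = /\bsylvia plath\b/i;
answer[2] = { text: "She was, um, a poet.", mood: "thoughtful" };

quest[168] = /\bhuh|comprendo|wha($|\b|\W)?\b/i;
answer[168] = { text: "Sorry, I got confused too.", mood: "puzzled" };

// Used when no pattern matches at all.
const fallback = { text: "Tell me more.", mood: "neutral" };

// Tick through the patterns in numeric order and stop at the first match.
function respond(input) {
  for (let n = 0; n < quest.length; n++) {
    if (quest[n] && quest[n].test(input)) {
      return { id: n, ...answer[n] };
    }
  }
  return { id: -1, ...fallback };
}

console.log(respond("Who was Sylvia Plath?"));
console.log(respond("Huh?"));
```

Because the first match wins, the order of the numbers matters: specific patterns need lower numbers than catch-all ones.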

In addition I wrote a PHP script to log the answers, but I always dread looking at them, and I never recorded IP addresses or dates.

I used to test her with "Who was Sylvia Plath?" and for a while there she could answer that.

Now Flash accommodates regular expressions, and I just looked at an example I put together that does work. Unfortunately the new Flash code system is ultra ultra fussy, and it makes it less fun to play around with. It sort of requires that you have the entire project in mind and in place before you begin, whereas in old sloppy Flash script you could cobble it together as you went along.

I may still try to present Whinsey in Flash only. But for cell phones, the scripting language is Flash 5 or before.

There is a way to make a learning bot, and that's where you'd want to take this I think. There are also Flash cookies that can store info so you'd come back and get a personal greeting.

I've never gotten any offers for jobs based on Whinsey, nor made any money from her.

And I'm not as code obsessed as I used to be. It can add to sleep troubles. Whinsey's brain is at least 4000 lines of code if you include all the PHP stuff.

Longer answer than anyone wanted to hear!

Jesse said...

The process you describe, Sally, where you get data from a web page and then the page changes format so your pattern matcher breaks, is usually called "screen scraping" (some people file it under data mining), and yep, it is a frustrating task all the way.

The best way to proceed is with an API. Some services, like Google's search for example, have an Application Programming Interface made just for people who write scripts.

They use a protocol that is designed not to change enough to break anyone, and they return data in formats like XML (or JSON or YAML) that are easier to parse in JavaScript or in Flash than trying to outsmart presentational HTML.
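
As a small illustration of why a structured format is easier to work with, here is a sketch that reads a JSON reply. The response text and its field names (`results`, `title`, `url`) are made up for the example and are not any real service's schema.

```javascript
// A made-up API response; the field names are assumptions for this sketch.
const body =
  '{"results":[{"title":"Joseph Weizenbaum","url":"https://example.com/a"},' +
  '{"title":"ELIZA","url":"https://example.com/b"}]}';

// JSON.parse turns the text into ordinary objects, so there are no
// start and stop markers to hunt for in the page layout.
function titles(jsonText) {
  const data = JSON.parse(jsonText);
  return (data.results || []).map((r) => r.title);
}

console.log(titles(body));
```

The service can add new fields or reorder them without breaking this code, which is the stability a scraper never gets.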

Now some services do not provide APIs because they are afraid of viewers getting data but not getting ads. Needless to say, those folks will slowly go out of business, so don't worry your head. ;)

I think some tricks like that should help make the project fun again. :)

Sally said...

Jesse, thanks very much for this information. It got me started recoding Whinsey in ActionScript 3. I made quite a bit of progress with it yesterday, but I'd like to ask you more about data mining and Google. I looked at their APIs but got a bit lost.

Jesse said...

That's great news! I've woken the sleeping coder. :)

In case this conversation gets too long and you'd like to move it out of the blog comments, you can also email me at jesset@gmail.com.

So first, I'll outline the general structure of how such an API commonly works. I have some links to some helpful resources as well. If you're stuck, I'm interested to know which step is holding up the show.

* First off, Google wants you to use an API key for your API work. You use their site to sign up for a key, and they generate one for you (dozens of letters and numbers long). This acts a bit like a username/password for you, and it only works from one domain name for their JavaScript or ActionScript based interface. They use it to cap how many API requests you generate to less than a quadrillion. :) The key can be obtained here.

* Next step is to write up some JavaScript or ActionScript code that uses the API. Google has examples for how to set up the environment: you include JavaScript files from their domain, which create classes and objects you can call out to in your code.

* When you use this interface (when you call their functions and use the objects they have generated) you often need to form your questions in XML, and get your answers back in XML. JavaScript and ActionScript have armies of functions designed to help break down XML so you can pull out just the parts that you want.

* Specifically for the ActionScript environment, I googled a bit and found an example where someone had packaged together some code to make using the Google API in ActionScript a lot simpler.

And lastly, I am led to understand that the best way to foster learning in an Eliza/Alice based system is to log the conversations as you have been doing. Also, you'll want to flag talking points where Whinsey had to heavily rely upon wildcards or make long-shot matches, as those are all signs of innuendo going right over her head. Those discussions are templates for new patterns to create.

So give me a heads-up if there seems to be a certain step in the process where you are getting snagged up, and I'll be glad to provide more detail in the necessary area.
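
One possible way to do that flagging, sketched under assumptions: the log format, the `coverage` measure, and the 0.2 threshold below are all invented for illustration. "Long shot" is approximated here as a pattern that accounted for only a small share of what the visitor typed, which is a stand-in heuristic rather than how Eliza or ALICE actually score matches.

```javascript
// Hypothetical log entries: what the visitor typed, which pattern fired
// (-1 means the fallback reply), and the exact text the pattern matched.
const log = [
  { input: "Hello Whinsey", patternId: 1, matched: "Hello" },
  { input: "My cousin says horses can't really type", patternId: -1, matched: "" },
  { input: "I guess that's what she said, huh", patternId: 168, matched: "huh" },
];

// Share of the input that the pattern actually accounted for (0 to 1).
function coverage(entry) {
  return entry.input.length === 0 ? 0 : entry.matched.length / entry.input.length;
}

// Flag fallback replies and matches whose coverage is below the threshold.
function needsReview(entries, threshold = 0.2) {
  return entries.filter((e) => e.patternId === -1 || coverage(e) < threshold);
}

for (const e of needsReview(log)) {
  console.log("Review:", e.input);
}
```

The flagged lines are the ones worth reading by hand when deciding which new patterns to write.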
