Latest Stable Version: None
Latest Unstable Version: 0.8.5
Your generous donations are appreciated!
PyAIML is an interpreter for AIML, the Artificial Intelligence Markup Language, implemented as a 100% pure standard Python package. It was developed as an extension to Howie, an AIML chatterbot project I've been working on (shameless plug!). Howie was originally built on top of J-Alice, a C++ AIML interpreter, but when J-Alice became too difficult to compile I decided that it made more sense to switch to a native Python interpreter. I soon discovered that AIML hadn't been (publicly) ported to Python, and...well, now it has!
I have three main design goals for this project. The first is to continue to rely on nothing but standard Python. Dependence on external libraries is a pain; this is one of the main reasons I switched away from J-Alice.
The second goal is 100% compliance with the AIML 1.0.1 standard: no less, but also no more. Interpreters that support extra non-standard tags only serve to encourage non-portable AIML. Personally, I believe there are some great non-standard AIML tags in common use (for instance, the <secure> tag is a great shortcut, and I'm quite fond of the optional "mode" attribute in <system> tags). I'd love to see these features added to the AIML standard. But until they are, PyAIML won't support them.
The final goal is to avoid feature creep. PyAIML's focus is (and will remain) bare-bones AIML interpreting. That means no support for obscure communication protocols, no advanced non-standard features, and no user interface to speak of. All of these features are easy enough to implement in Python that I'm willing to leave them as an exercise to the bot developers.
Actually, that reminds me: it's worth mentioning that PyAIML is intended for use by developers only. If you're looking to download a fully-functional AIML chatterbot program, PyAIML isn't for you (but may I once again recommend Howie?). If you're looking to write a fully-functional AIML chatterbot program, please read on!
All downloads can be found at the SourceForge project page. In addition to source & binary distributions of the PyAIML package, I've also posted the Standard AIML set, so you can have some working AIML files to play with.
For those who just can't wait, the absolute latest source code is always available through Git. I have a strict policy about not checking in (known) buggy or incomplete code, so you can be reasonably sure that what you're getting is pretty stable. To grab a snapshot of the very latest sources, run the following commands:
git clone git://pyaiml.git.sourceforge.net/gitroot/pyaiml/pyaiml
Here's some basic sample code that demonstrates how to use the PyAIML package.
First of all, import the package:

import aiml
Next, you need to instantiate a Kernel object. The Kernel is the only class you should ever have to use in your dealings with PyAIML.
k = aiml.Kernel()
The Kernel is now ready to respond to user input -- but it doesn't have anything to say. So, the next step is to load some AIML files. This is done through the Kernel's learn() method:

k.learn("std-startup.xml")
In this example, we've loaded the startup file from the Standard AIML set. It defines a single AIML pattern, "LOAD AIML B", which in turn causes the rest of the set to load. To trigger this process, we pass "load aiml b" as input to the Kernel:
k.respond("load aiml b")
The respond() method returns a string containing the Kernel's response to the input, but in this case we ignore the response. We're now ready to start responding intelligently to user input. The following line sets up the input/output loop:
while True: print k.respond(raw_input("> "))
That's all you need to know to get started!
If your bot's AIML files aren't changing much, you can significantly reduce the startup time by using the Kernel's loadBrain() and saveBrain() methods. These functions let you dump the contents of your bot's "brain" to a file on disk. The next time your bot runs, rather than re-parsing all the AIML files, it can just load its brain from the last session and pick up right where it left off!
First, you need to create the brain dump: create a Kernel object and load up the standard AIML set. Remember that just loading "std-startup.xml" isn't enough -- the bulk of the AIML files aren't loaded until you pass "load aiml b" to the Kernel's respond() method!
import aiml

k = aiml.Kernel()
k.learn("std-startup.xml")
k.respond("load aiml b")
Now that the brain is populated with knowledge, use the saveBrain() method to dump the brain to disk:

k.saveBrain("standard.brn")
To reload the brain, just use the loadBrain() method. We'll create a new, empty Kernel object so we can be sure we're starting from a clean slate.
k2 = aiml.Kernel()
k2.loadBrain("standard.brn")
That's all there is to it! You should favor loading brains over re-parsing AIML files whenever possible, since loading a brain is roughly three times as fast!
NOTE: if you load a bot's brain from disk, all the existing contents of the brain will be overwritten! However, after the brain is loaded, you can continue to learn() new AIML files just as before, and their contents will be added to the brain.
The Kernel provides a bootstrap() method that you can use to simplify your bot's initialization process. The bootstrap() method takes the following optional arguments: brainFile, learnFiles and commands. The actual process that bootstrap() uses to initialize the Kernel object is as follows: first, if brainFile was specified, that brain file is loaded via loadBrain(); next, each file in learnFiles (either a single filename or a list of them) is passed to learn(); finally, each string in commands is passed to respond().
Using the bootstrap() method, the initialization process can be reduced to the following idiom:
import aiml
import os.path

k = aiml.Kernel()
if os.path.isfile("standard.brn"):
    k.bootstrap(brainFile = "standard.brn")
else:
    k.bootstrap(learnFiles = "std-startup.xml", commands = "load aiml b")
    k.saveBrain("standard.brn")
There are two more common initialization tasks to take care of: setting the verbosity level, and setting your bot's name. The verbosity setting determines whether warnings and non-critical error messages are printed to the text console at runtime. These messages can be useful when tracking down problems, but other times you don't want to deal with them. To set the verbosity mode, use the Kernel's verbose() method. Kernel.verbose(True) enables verbose mode, and Kernel.verbose(False) disables it. Verbose mode is enabled by default.
Your bot has a name associated with it. To set your bot's name, use the Kernel's setBotPredicate("name","newName") method. The name you provide must be a single word! The bot's name is "Nameless" by default. You can query the bot's current name with Kernel.getBotPredicate("name").
Sometimes, you want your bot to carry on multiple conversations at once. While you could just pass the input of two or more users to Kernel.respond(), your bot (and your users) might get confused as a result. The bot could call people by the wrong names, refer to earlier conversation topics that never happened, and so on.
You could solve this problem by creating a separate Kernel object for each conversation, but that results in a lot of wasted memory. Each individual Kernel would have to load and store the exact same AIML sets, and those things aren't small.
To prevent these problems, PyAIML supports the concept of multiple sessions. Each session represents a single conversation with one or more users, and each one is stored independently from all the others (thus avoiding conversational cross-pollination). And you only need to load the AIML set once!
To make use of multiple sessions, you use the optional sessionID parameter (supported by the respond(), getPredicate() and setPredicate() Kernel methods). The session ID can be anything: a string containing the user's name, an IRC channel name, an IP address...so long as it uniquely identifies the conversation. So, if you're having a conversation with two users, Alice and Bob, and you receive some new input from Bob, you'd make the following call:
response = Kernel.respond(input, "Bob")
That's it -- PyAIML takes care of the rest.
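To make the routing concrete, here's a small sketch of a helper that uses each user's name as the session ID. The route() function and the "username: message" input format are my own invention for illustration; only the respond(input, sessionID) call comes from PyAIML.

```python
def route(kernel, line):
    # Expect input of the form "username: message".
    user, _, message = line.partition(":")
    # The user's name doubles as the session ID, so each
    # conversation keeps its own predicate state.
    return kernel.respond(message.strip(), user.strip())
```

An IRC or instant-messaging front-end would call route() once per incoming line, and PyAIML would keep each user's conversation separate.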
As implemented, all session data is cleared whenever a Kernel object is destroyed (such as when your bot exits). This means that all the data the Kernel has learned about all the people it's talked with is lost for good! Some people find that prospect rather unappealing, so PyAIML provides some features which make implementing persistent sessions much easier (see below for the rationale for why PyAIML doesn't just implement persistent sessions itself).
The key function to remember is Kernel.getSessionData(sessionID). This function takes a session ID as its argument, and returns a dictionary containing all the predicates currently defined in the specified session, as well as their current values. You can then write the data to disk however you'd like (I recommend the standard marshal module, which is faster than pickle and cPickle and just as easy to use). The following snippet saves the "Bob" session to the file "Bob.ses" on disk:
session = Kernel.getSessionData("Bob")
sessionFile = open("Bob.ses", "wb")
marshal.dump(session, sessionFile)
sessionFile.close()
Later, you can restore a session by loading the dictionary from the file on disk, and then repeatedly calling Kernel.setPredicate() on each of its key/value pairs. The following code demonstrates how to restore the state of the "Bob" session that was previously saved to disk:
sessionFile = open("Bob.ses", "rb")
session = marshal.load(sessionFile)
sessionFile.close()
for pred, value in session.items():
    Kernel.setPredicate(pred, value, "Bob")
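The save and restore steps above can be wrapped into a pair of small helper functions. This is just a sketch -- the function names are my own -- but the getSessionData() and setPredicate() calls are the same ones used above.

```python
import marshal

def save_session(kernel, sessionID, filename):
    # Dump one session's predicate dictionary to disk.
    session = kernel.getSessionData(sessionID)
    sessionFile = open(filename, "wb")
    marshal.dump(session, sessionFile)
    sessionFile.close()

def load_session(kernel, sessionID, filename):
    # Read the dictionary back from disk and re-set each
    # predicate in the given session.
    sessionFile = open(filename, "rb")
    session = marshal.load(sessionFile)
    sessionFile.close()
    for pred, value in session.items():
        kernel.setPredicate(pred, value, sessionID)
```

Your bot can then call save_session() for each active session on shutdown, and load_session() on startup.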
If the sessionID parameter to Kernel.getSessionData(sessionID) is omitted, the function returns a much larger dictionary containing ALL the session dictionaries for every session currently in existence. The keys of this large dictionary are the names of the sessions, and the values are the individual session dictionaries themselves. If you wanted to quickly save all session data, this is a good way to do so.
NOTE: Originally, the module contained fully-implemented support for persistent sessions, which automatically dumped the session data to files on disk after every response. However, I eventually decided that there was a design trade-off (decreased performance vs. frequency of session file updates) inherent to the implementation, and it would be impossible to please everyone. Hence, the current "roll-your-own" implementation philosophy.
Here are the ones that I'm aware of...
PyAIML is provided under the FreeBSD license.
Copyright 2003-2010 Cort Stratton. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FREEBSD PROJECT OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
(c) 2003 Cort Stratton (firstname.lastname@example.org), with the following exceptions: