Embeddable Scripting Engines

Embedding a Scripting Language in your Program or System

You might want to consider embedding a scripting language in your program. Why? There are many good reasons to do so. Here are just a few:

  1. It allows your users to automate the functions of your program.
  2. Users can add additional processing to existing functions of your program (e.g. pre-conditions specific to their environment.)
  3. Your users will find ways to integrate your program with other programs. 
  4. Your users will be able to share these automation scripts with each other, forming a community.
  5. Future-proofing - It is an acknowledgement that you cannot possibly anticipate all the uses of your program or system.

All of these factors will contribute to increasing the usefulness and therefore the value of your program.

But do you know what your options are? You have two. The first is to write your own scripting language. This is generally not a good idea (unless you're doing it for fun or education). Your best option is to use an existing scripting language. Here's why:

  1. Existing languages are generally well documented.
  2. You can leverage existing scripting communities and code.
  3. Your users can find others who can help them (friends, colleagues, family, etc.) because you chose a mainstream language.
  4. The API for embedding is already designed.
  5. You have less debugging to do, because any bugs are probably in your code, not in the scripting language engine.

The scripting engine (or interpreter) you choose should be implemented in the same language (or on the same runtime, in the case of the JVM or Microsoft .NET) as your host program. So, if the host program is in C, use a scripting language implemented in C. There are of course some exceptional cases, but this is the rule.
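To make the "same language as the host" rule concrete, here is a minimal sketch of a host written in Python that embeds Python itself as its scripting language. Everything in it (the `Host` class, `append`, `on_pre_save`, the sample script) is a made-up illustration, not taken from any real product. It shows a user script automating the host and adding a pre-condition to an existing function.

```python
class Host:
    """Hypothetical host application: a tiny line-based document editor."""

    def __init__(self):
        self.lines = []
        self.pre_save_hooks = []  # user-supplied pre-conditions

    # --- API exposed to user scripts ---
    def append(self, text):
        self.lines.append(text)

    def on_pre_save(self, func):
        """Register a pre-condition; scripts can use this as a decorator."""
        self.pre_save_hooks.append(func)
        return func

    # --- host functionality ---
    def save(self):
        """Pretend to save; refuse if any user pre-condition fails."""
        return all(hook(self.lines) for hook in self.pre_save_hooks)

    def run_script(self, source):
        # The script sees only the names placed in this namespace, plus the
        # Python builtins. This is NOT a security sandbox.
        namespace = {"append": self.append, "on_pre_save": self.on_pre_save}
        exec(source, namespace)


USER_SCRIPT = """
append("hello")
append("world")

@on_pre_save
def no_empty_lines(lines):
    return all(line.strip() for line in lines)
"""

host = Host()
host.run_script(USER_SCRIPT)
print(host.lines)   # ['hello', 'world']
print(host.save())  # True
host.append("")
print(host.save())  # False: the user's pre-condition rejects the empty line
```

Because host and script share one runtime, the script can call host functions directly and hand back plain functions as hooks, with no marshalling layer in between.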

JVM (Java) programmers have an additional option through the Apache Bean Scripting Framework (BSF). Effectively, this framework makes it possible to access your application objects through a standard API. If you are programming in a JVM language, you should take a look and see if it works for you.

In the Microsoft world, there are the .NET API and COM (which is not used as much any more). CORBA and SWIG provide ways to integrate external programs.

Reasons why you should not use a general-purpose scripting language:

  1. Some can be too hard for beginners to learn.
  2. Security concerns (all that power can be a problem sometimes).
  3. Potential architectural mismatch.
  4. Memory management mismatch.
  5. Data representation mismatch.
  6. Customer support costs can increase if applications are modified in unexpected ways.

The following are all language implementations that you can embed into your programs, allowing end users to access a full programming language to extend and enhance your program. Effectively, it helps turn your system into a platform.

Implementation                     Dialect of              Implementation/Host Language
AngelScript                        AngelScript             C++
AppleScript                        AppleScript             Objective-C
Bean Scripting Framework (BSF)     XML                     Java
BeanShell                          BeanShell (Java-like)   Java
DMDScript                          JavaScript              C++ and D
Falcon                             Falcon                  C++
Groovy                             Groovy (Java-like)      Java
Guile (GNU)                        Scheme                  C
IronPython                         Python                  .NET
IronRuby                           Ruby                    .NET
JavaScript (Google V8)             JavaScript              C++
JavaScript (Mozilla Rhino)         JavaScript              Java
JavaScript (Mozilla SpiderMonkey)  JavaScript              C/C++
JRuby                              Ruby                    Java
JudoScript                         Judo (90% JavaScript)   Java
Jython                             Python                  Java
KornShell                          Unix Shell              C
Lua                                Lua                     C
NeoLua                             Lua                     Microsoft .NET
Open Object REXX                   REXX                    C
z/VM REXX                          REXX                    Assembly Language, PL/S ?
Perl                               Perl                    C
Pike                               Pike                    C
Python                             Python                  C
Ruby                               Ruby                    C
S-Lang                             S-Lang (C-like)         C
SLEEP                              SLEEP (Perl-like)       Java
Visual Basic for Applications      Visual Basic            C
Open Office Basic                  Visual Basic            C

Another option to consider, if you don't want a full programming language, is a limited expression language. In Java and .NET, this is a popular option. Expression languages allow your users to write expressions, such as math expressions or even object graph manipulation. They also limit your application's exposure to the security problems that come with embedding too powerful a language, one that could be abused by malicious users or cause accidental problems in the hands of inexperienced users.

Implementation                     Dialect of              Implementation Language
Jelly                              XML                     Java
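To show what "limited" means in practice, here is a rough sketch of a tiny arithmetic expression language built on Python's standard `ast` module. The function name `evaluate` and the set of allowed operators are choices made for this example only. The idea is a whitelist: the evaluator walks the parsed expression and rejects anything it does not explicitly know, so function calls, attribute access, and imports never run.

```python
import ast
import operator

# Whitelisted operators; anything not listed here is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expression, variables=None):
    """Evaluate a math expression over numbers and host-supplied variables."""
    variables = variables or {}

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


print(evaluate("price * qty + 2", {"price": 3, "qty": 4}))  # 14

try:
    evaluate("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", exc)
```

The trade-off is the one described above: users get formulas, not programs, and in exchange the host decides exactly which operations exist.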


Examples of scripted applications are helpful to read because you can learn a lot about how to use these various languages.
There are thousands of applications that include embedded languages. Here are just a few applications with built-in scripting languages that you probably already know.
  • Visual Basic for Applications is used to script MS Word, Excel, Visio, and Outlook.
  • Open Office has an open source scripting framework that might be reusable.
  • GNU Guile is used to script GnuCash and LilyPond.
  • GNU Emacs has EmacsLisp.
  • Mainframes running VM/CMS have XEDIT which is scripted in REXX.
  • Your web browser, which is scripted in JavaScript (with CSS for styling).
  • Dynamix's Starsiege: Tribes video game was scripted in TribeScript.
  • Id Software's DOOM3 has a scripting language.
  • IBM WebSphere Application Server uses Jython and JACL.
  • Crytek CryEngine (used in many games) uses Lua.
  • AutoLISP (a derivative of XLISP) is used inside AutoCAD.
  • The Fossil source control system uses a limited version of Tcl called TH1.

Modes/models of scripting
  • Unix pipes and filters
  • Same application (embedded): the mainline is C, but macros are written in script
  • Scripting, augmented with C code (think Foreign Function Interface or FFI)
  • Same operating system (local API, COM, D-Bus, etc.)
  • Network based API (CORBA, Web Services, REST, etc.)
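
The first mode in the list, pipes and filters, needs no embedding at all: the host simply pipes data through a separate script process. As a minimal sketch, here the "user script" is a one-line Python filter started with the current interpreter. The names `FILTER_SCRIPT` and `run_filter` are made up for this example.

```python
import subprocess
import sys

# Hypothetical user-written filter: reads stdin, writes upper-cased stdout.
FILTER_SCRIPT = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)


def run_filter(text):
    """Pipe text through the filter script running in a separate process."""
    completed = subprocess.run(
        [sys.executable, "-c", FILTER_SCRIPT],
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    return completed.stdout


output = run_filter("alpha\nbeta\n")
print(output, end="")
```

The script cannot touch the host's memory, which makes this the most isolated mode, at the cost of passing everything as text.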

Synchronicity Considerations:

Most embedded scripting is single-threaded and synchronous. That is, a performance issue in the script directly affects the host program. Also, the host program does not execute while the script is executing.

Multi-threading is one aspect that can be a problem. Event-driven (async) code is a kind of solution, but it is not particularly easy for most casual programmers to deal with.

If the scripting engine runs in its own thread, the host program is not blocked while a script executes.
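
As a rough sketch of that arrangement, the hypothetical `ScriptWorker` below runs scripts on a background thread and hands results back through a queue, so the host's main loop keeps ticking while a slow script runs. The class, the `sleep` function exposed to scripts, and the convention that a script leaves its result in a variable called `answer` are all assumptions made for this example.

```python
import queue
import threading
import time


class ScriptWorker:
    """Runs submitted scripts one at a time on a background thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def submit(self, source):
        """Queue a script; returns a queue that will receive (ok, value)."""
        result = queue.Queue(maxsize=1)
        self._jobs.put((source, result))
        return result

    def _loop(self):
        while True:
            source, result = self._jobs.get()
            namespace = {"sleep": time.sleep}  # host API visible to scripts
            try:
                exec(source, namespace)
                result.put((True, namespace.get("answer")))
            except Exception as exc:  # a script error must not kill the worker
                result.put((False, exc))


worker = ScriptWorker()
pending = worker.submit("sleep(0.2)\nanswer = 6 * 7")

# The host keeps doing its own work while the script runs.
ticks = 0
while pending.empty():
    ticks += 1
    time.sleep(0.01)

ok, value = pending.get()
print(ok, value)
```

Two caveats: any host data that scripts can reach now needs locking, and in CPython a CPU-bound script still competes with the host for the interpreter lock, so this helps most when scripts spend their time waiting.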

Data Structure Considerations:

Tcl always got a lot of grief for not allowing embedded nulls in strings, but that is the limitation of C's null-terminated strings. Pascal-style strings (length prefix + data) are superior, but that's not how the C libraries work.
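
The difference is easy to see from Python, using the standard `ctypes` module to get a C-style view of a buffer. The helper names `pack_lp` and `unpack_lp`, and the 4-byte big-endian length prefix, are arbitrary choices for this sketch.

```python
import ctypes
import struct

payload = b"ab\x00cd"  # five bytes with an embedded NUL

# C-style view: reading stops at the first NUL byte.
buf = ctypes.create_string_buffer(payload)
c_view = buf.value  # what strlen()-based C code would see


# Length-prefixed (Pascal-style) encoding: 4-byte length, then the data.
def pack_lp(data):
    return struct.pack(">I", len(data)) + data


def unpack_lp(blob):
    (length,) = struct.unpack_from(">I", blob)
    return blob[4:4 + length]


roundtrip = unpack_lp(pack_lp(payload))

print(c_view)     # b'ab'
print(roundtrip)  # b'ab\x00cd'
```

The null-terminated view silently loses everything after the first zero byte, while the length-prefixed form round-trips the data intact.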

Lua allows for opaque "application data" which can be passed around (Tcl has this too). This matters if you don't want the garbage collection system which manages the scripting language's memory to stomp all over your C program's memory.
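
The real mechanisms live in the C APIs (userdata in Lua, ClientData in Tcl), but the idea can be sketched in Python as an analogy: the script only ever holds an opaque token, and the host alone owns and frees the object behind it. `HandleTable`, `Image`, and the `new_image` / `image_area` functions are hypothetical names invented for this example.

```python
import itertools


class HandleTable:
    """Maps opaque integer handles to host-owned objects."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def wrap(self, obj):
        handle = next(self._ids)
        self._objects[handle] = obj
        return handle

    def unwrap(self, handle):
        return self._objects[handle]  # KeyError for stale or forged handles

    def release(self, handle):
        del self._objects[handle]  # the host decides when the object dies


class Image:
    """Hypothetical host-owned resource."""

    def __init__(self, width, height):
        self.width, self.height = width, height


handles = HandleTable()


# Host API exposed to scripts: scripts see handles, never Image objects.
def api_new_image(width, height):
    return handles.wrap(Image(width, height))


def api_image_area(handle):
    img = handles.unwrap(handle)
    return img.width * img.height


SCRIPT = """
img = new_image(4, 3)
area = image_area(img)
"""

namespace = {"new_image": api_new_image, "image_area": api_image_area}
exec(SCRIPT, namespace)
print(type(namespace["img"]).__name__, namespace["area"])  # int 12
```

Since the script's value is just a number, the scripting language's memory manager has nothing of the host's to collect, and the host's object lifetime stays under the host's control.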

