This chapter describes how functions are linked to the logic tier. It gives an overview on the language bindings available for Wolframe.
For defining database transactions, Wolframe introduces a language called TDL (Transaction Definition Language). TDL embeds the language of the underlying database (SQL) in a language that defines how sets of input and output elements are addressed.
This chapter also describes how data types are defined that can be used in data definition languages (DDL) for form descriptions. Forms and their definition will be introduced in a different chapter.
After reading this chapter you should be able to write functions of the Wolframe logic tier on your own.
Be aware that you have to configure a programming language of the logic tier in Wolframe before using it. Each chapter introducing a programming language will have a section that describes how the server configuration of Wolframe has to be extended for its availability.
For the description of transactions, Wolframe provides the transaction definition language (TDL) introduced here. Wolframe transactions in TDL are defined as functions in a transactional context, meaning that whatever is executed in a transaction function belongs to a database transaction. The transaction commit is executed implicitly on function completion. Errors, a denied authorization, or a failing audit operation lead to an abort of the database transaction.
A TDL transaction function takes a structure as input and returns a structure as output. The Wolframe database interface defines a transaction as object where the input is passed to as a structure and the output is fetched from it as a structure.
TDL is a language to describe the building of transaction input and the building of the result structure from the database output. It defines a transaction as a sequence of instructions on multiple data. An instruction is either described as a single embedded database command in the language of the underlying database or a TDL subroutine call working on multiple data.
Working on multiple data means that the instruction is executed for every item of its input set. This set can consist of the set of results of a previous instruction or a selection of the input of the transaction function. A "for each" selector defines the input set as part of the command.
Each instruction result can be declared as being part of the transaction result structure. The language has no flow control based on state variables other than input and results of previous commands and is therefore not a general purpose programming language. But because of this, the processing and termination of the program is absolutely predictable.
As possibility to convert the input data before passing it to the database, the transaction definition language defines a preprocessing section where globally defined Wolframe functions can be called for the selected input. To build an output structure that cannot be modeled with a language without control structures and recursion, TDL provides the possibility to define a function as filter for postprocessing of the result of the transaction function. This way it is for example possible to return a tree structure as TDL function result.
TDL is, like most SQL databases, case insensitive. For clarity and better readability, TDL keywords are written in uppercase here, and we recommend using uppercase for TDL keywords in general.
TDL is compiled to a code for a virtual machine. Setting the log level to DATA will print the symbolic representation of the code as log output. The internals of the virtual machine will be discussed in a different chapter of this book.
Each TDL program source referenced has to be declared in the Processor section of the configuration with program <sourcefile>.
A TDL program consists of subroutine declarations and exported transaction function declarations. Subroutines have the same structure as transaction function blocks but without pre- and postprocessing and authorization method declarations.
A subroutine declaration starts with the keyword SUBROUTINE followed by the subroutine name and optionally some parameter names in parentheses ('(' ')') separated by commas. The declared subroutine name identifies the function in the scope of this source file after this subroutine declaration. The name is not exported and the subroutine is not available for other TDL modules. With includes, described later, code can be reused across files.
The body of the function contains the following parts:
DATABASE <database name list>
This optional definition restricts the definition and availability of the function to a set of databases. The databases are listed by name, separated by commas (','). The names are the database IDs defined in your server configuration or database names as specified in the module. If the database declaration is omitted, the transaction function is available for any database. This declaration allows you to run your application with configurations using different databases but sharing a common code base.
BEGIN <...instructions...> END
The main processing block starts with BEGIN and ends with END. It contains all the commands executed when calling this subroutine from another subroutine or a transaction function.
The following pseudocode example shows the parts of a subroutine declaration:
SUBROUTINE <name> ( <parameter name list> )
DATABASE <list of database names>
BEGIN
    ...<instructions>...
END
The DATABASE declaration is optional.
A transaction function declaration starts with the keyword TRANSACTION followed by the name of the transaction function. This name identifies the function globally. The body of the function contains the following parts:
AUTHORIZE ( <auth-function>, <auth-resource> )
This optional definition deals with authorization and access rights. If the authorization function fails, the transaction function is not executed and returns with an error. The <auth-function> references a form function implementing the authorization check. The <auth-resource> is passed as a parameter with the name 'resource' to the function.
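As a sketch under assumptions (the authorization function name checkOwnership and the resource name Customer are hypothetical), an AUTHORIZE declaration could look like this:

```
TRANSACTION deleteCustomer
AUTHORIZE ( checkOwnership, Customer )
BEGIN
    DO DELETE FROM Customer WHERE id = $(id);
END
```

Here checkOwnership would be a globally defined form function that receives 'Customer' as the value of its 'resource' parameter and decides whether the caller may execute the transaction.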
DATABASE <database name list>
This optional definition restricts the definition and availability of the function to a set of databases. The databases are listed by name, separated by commas (','). The names are the database IDs defined in your server configuration. If the database declaration is omitted, the transaction function is available for any database. This declaration allows you to run your application with configurations using different databases but sharing a common code base.
RESULT FILTER <post-filter-name>
This optional declaration defines a function applied as a post filter to the transaction function. The idea is that you might want to return a result structure that cannot be built by TDL, for example a recursive structure like a tree. The result filter function is called with the structure printed by the main processing block (BEGIN .. END), and the result of the filter function is returned to the caller instead.
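A minimal sketch, assuming a globally defined filter function buildTree (a hypothetical name) that turns a flat node list into a tree:

```
TRANSACTION selectCategoryTree
RESULT FILTER buildTree
BEGIN
    INTO node DO SELECT ID,parentID,name FROM Category;
END
```

The flat list of node elements printed by the main block is passed to buildTree, and its (possibly recursive) output becomes the transaction function result.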
PREPROC <...instructions...> ENDPROC
This optional block contains instructions on the transaction function input. The results of these preprocessing instructions are put into the input structure, so that they can be referenced in the main code definition block of the transaction. Any global normalization or form function can be called in the preprocessing block to enrich or transform the input to process.
BEGIN <...instructions...> END
The main processing block starts with BEGIN and ends with END. It contains all the database instructions needed for completing this transaction.
AUDIT [CRITICAL] <funcname...> WITH BEGIN <...instructions...> END
This optional block specifies a function that is executed at the end of a transaction. The input of the function is the structure built from the output of the instructions block. If CRITICAL is specified, the transaction fails (rollback) if the audit function fails. Otherwise the error of the audit function is only logged and the transaction is completed (commit). You can specify several audit functions. The variables in the instructions block refer to the scope of the main processing block, so you can reference everything that is referencable after the last instruction of the main processing block.
AUDIT [CRITICAL] <funcname...> ( <...parameter...> )
If the input structure of the audit function is just one parameter list, this alternative syntax for an audit function declaration can be used. You simply specify the audit function call after the AUDIT or, optionally, after the CRITICAL keyword.
The following pseudo code snippet shows the explained building blocks in transaction functions together:
TRANSACTION <name>
AUTHORIZE ( <auth-function>, <auth-resource> )
DATABASE <list of database names>
RESULT FILTER <post-filter-name>
PREPROC
    ...<preprocessing instructions>...
ENDPROC
BEGIN
    ...<instructions>...
END
AUDIT CRITICAL <funcname> ( ...<parameter>... )
The lines with AUTHORIZE, DATABASE and RESULT FILTER are optional, as is the preprocessing block PREPROC..ENDPROC. A simpler transaction function looks like the following:
TRANSACTION <name>
BEGIN
    ...<instructions>...
END
Main processing instructions defined in the main execution block of a subroutine or transaction function consist of three parts in the following order, terminated by a semicolon ';' (the order of the INTO and FOREACH expressions can be switched):
INTO <result substructure name>
This optional directive specifies whether and where the results of the database commands should be put as part of the function output. In subroutines this substructure is relative to the current substructure addressed in the caller's context. For example, a subroutine with an "INTO myres" directive in a block of an "INTO output" directive will write its result into a substructure with the path "output/myres".
FOREACH <selector>
This optional directive defines the set of elements on which the instruction is executed one by one. Specifying a set of two elements will cause the function to be called twice. An empty set as selection will cause the instruction to be ignored. Without this quantifier, the database command or subroutine call of the instruction will always be executed once.
The argument of the FOREACH expression is either a reference to the result of a previous instruction or a path selecting a set of input elements.
Results of previous instructions are referenced either with the keyword RESULT referring to the result set of the previous command or with a variable naming a result set declared with this name before.
Input elements are selected by path relative to the path currently selected, starting from the input root element when entering a transaction function. The current path selected and the base element of any relative path calculated in this scope changes when a subroutine is called in a FOREACH selection context. For example calling a subroutine in a 'FOREACH person' context will cause relative paths in this subroutine to be sub elements of 'person'.
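To illustrate with a sketch (the names are hypothetical): when the subroutine below is called in a 'FOREACH /person' context, its relative paths first and last resolve under the currently selected person element:

```
SUBROUTINE insertPersonName
BEGIN
    DO INSERT INTO Name (first,last) VALUES ($(first), $(last));
END

TRANSACTION insertPersonNames
BEGIN
    FOREACH /person DO insertPersonName();
END
```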
DO <command>
Commands in an instruction are either embedded database commands or subroutine calls. Command arguments are either constants, relative paths from the selector path in the FOREACH selection, or references to elements in the result of a previous command. If an argument is a relative path from the selector context, its reference has to be unique in the context of the element selected by the selector. If an argument references a previous command result, it must either be unique or dependent on the FOREACH argument. Results that are sets with more than one element can only be referenced if they are bound to the FOREACH quantifier.
The following example illustrates how the FOREACH, INTO and DO expressions in the main processing block work together:
TRANSACTION insertCustomerAddresses
BEGIN
    DO SELECT id FROM Customer WHERE name = $(customer/name);
    FOREACH /customer/address
    DO INSERT INTO Address (id,address) VALUES ($RESULT.id, $(address));
END
Preprocessing instructions defined in the PREPROC execution block of a transaction function consist, similarly to the instructions in the main execution block, of three parts in the following order, terminated by a semicolon ';' (the order of the INTO and FOREACH expressions can be switched and has no meaning, e.g. FOREACH..INTO == INTO..FOREACH):
INTO <result substructure name>
This optional directive specifies whether and where the results of the preprocessing commands should be put as part of the input to be processed by the main processing instructions. The relative paths of the destination structure are calculated relative to a FOREACH selection element.
FOREACH <selector>
This optional directive defines the set of elements on which the instruction is executed one by one. The preprocessing command is executed once for each element in the selected set and it will not be executed at all if the selected set is empty.
DO <command>
Commands in an instruction are function calls to globally defined form functions or normalization functions. Command arguments are constants or relative paths from the selector path in the FOREACH selection. They are uniquely referencing elements in the context of a selected element.
The following example illustrates how the FOREACH, INTO and DO expressions in the preprocessing block work together:
TRANSACTION insertPersonTerms
PREPROC
    FOREACH //address/* INTO normalized DO normalizeStructureElements(.);
    FOREACH //id INTO normalized DO normalizeNumber(.);
ENDPROC
BEGIN
    DO UNIQUE SELECT id FROM Person WHERE name = $(person/name);
    FOREACH //normalized DO INSERT INTO SearchTerm (id, value) VALUES ($RESULT.id, $(.));
END
An element of the input or a set of input elements can be selected by a path. A path is a sequence of one of the following elements separated by slashes:
Identifier
An identifier uniquely selects a sub element of the current position in the tree.
*
An asterisk selects any sub element of the current position in the tree.
..
Two dots in a row select the parent element of the current position in the tree.
.
A single dot selects the current element in the tree. This operator can also be useful as part of a path to force the expression to be interpreted as a path when it could also be interpreted as a keyword of the TDL language (for example ./RESULT).
A slash at the beginning of a path selects the root element of the transaction function input tree. Two subsequent slashes express that the following node is (transitively) any descendant of the current node in the tree.
Paths can appear as argument of a FOREACH selector where they specify the set of elements on which the attached command is executed on. Or they can appear as reference to an argument in a command expression where they specify uniquely one element that is passed as argument to the command when it is executed.
When used in embedded database statements, selector paths are referenced with $(<path expression>). When used as database function or subroutine call arguments, path expressions can be used plain, without the '$' and '(' ')' markers. These markers are just used to identify substitution entities.
The following list shows different ways of addressing an element by path:
/
Root element
/organization
Root element with name "organization"
/organization/address/city
Element "city" of root "organization" descendant "address"
.//id
Any descendant element with name "id" of the current element
//person/id
Child with name "id" of any descendant "person" of the root element
//id
Any descendant element with name "id" of the root element
/address/*
Any direct descendant of the root element "address"
.
Currently selected element
This example shows the usage of path expressions in the preprocessing and the main processing part of a transaction function:
TRANSACTION selectPerson
PREPROC
    FOREACH /person/name INTO normalized DO normalizeName( . );
    FOREACH /person INTO citycode DO getCityCode( city );
ENDPROC
BEGIN
    FOREACH person
    DO INSERT INTO Person (Name,NormalizedName,CityCode) VALUES ($(name),$(name/normalized),$(citycode));
END
Database results of the previous instruction are referenced with '$RESULT.' followed by the column identifier or column number. Column numbers always start from 1, independently of the database! So be aware that even if the database counts columns from 0, you have to use 1 for the first column.
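For example, assuming the previous SELECT returns a single column named id, the following two instructions reference the same value, once by column name and once by the 1-based column number:

```
DO SELECT name FROM Company WHERE id = $RESULT.id;
DO SELECT name FROM Company WHERE id = $RESULT.1;
```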
As already explained before, database result sets of cardinality bigger than one cannot be addressed if not bound to a FOREACH selection. In statements potentially addressing more than one result element, you have to add a FOREACH RESULT quantifier.
For addressing results of instructions preceding the previous instruction, you have to name them (see next section). The name of the result can then be used as FOREACH argument to select the elements of a set to be used as base for the command arguments of the instruction. Without binding instruction commands with a FOREACH quantifier, the named results of an instruction can be referenced as $<name>.<columnref>, for example $person.id for the column with name 'id' of the result named 'person'.
The 'RESULT.' prefix in references to the previous instruction result is a default and can be omitted in instructions that are not explicitly bound to any other result than the last one. So the following two instructions are equivalent:
DO SELECT name FROM Company WHERE id = $RESULT.id
DO SELECT name FROM Company WHERE id = $id
and so are the following two instructions:
FOREACH RESULT DO SELECT name FROM Company WHERE id = $RESULT.id
FOREACH RESULT DO SELECT name FROM Company WHERE id = $id
The result name prefix of any named result can also be omitted if the instruction is bound to a FOREACH selector naming the result. So the following two statements in the context of an existing database result named "ATTRIBUTES" are equivalent:
FOREACH ATTRIBUTES DO SELECT name FROM Company WHERE id = $ATTRIBUTES.id
FOREACH ATTRIBUTES DO SELECT name FROM Company WHERE id = $id
Database results can be held and made referenceable by name with the declaration KEEP AS <resultname> immediately following the instruction with the result to be referenced. The identifier <resultname> references the result in a variable reference or a FOREACH selector expression.
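A sketch under assumptions (the table and result names are hypothetical, and the exact placement of the KEEP AS declaration follows the description above), keeping a result so it can still be referenced after a later instruction:

```
DO UNIQUE SELECT id FROM Person WHERE name = $(person/name)
KEEP AS person;
DO UNIQUE SELECT id FROM Company WHERE name = $(company/name);
INTO employment DO INSERT INTO Employment (personID,companyID) VALUES ($person.id, $RESULT.id);
```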
Subroutine parameters are addressed like results but with the prefix PARAM. instead of RESULT. or a named result prefix. "PARAM." is reserved for parameters. The first instruction without FOREACH quantifier can reference the parameters by name without a prefix.
SUBROUTINE selectDevice( id )
BEGIN
    INTO device DO SELECT name FROM DevIdMap WHERE id = $PARAM.id;
END

TRANSACTION selectDevices
BEGIN
    DO selectDevice( id );
END
Database commands returning results can have constraints to catch certain errors that would otherwise not be recognized at all, or be recognized too late. For example, a command having a result of a previous command as argument would not be executed if the result of the previous command is empty. Nevertheless, the overall transaction would succeed, because no database error occurs during execution of the commands defined for the transaction.
Constraints on database results are expressed as keywords following the DO keyword of an instruction in the main processing section. If a constraint on database results is violated, the whole transaction fails and a rollback occurs.
The following list explains the result constraints available:
NONEMPTY
Declares that the database result for each element of the input must not be empty.
UNIQUE
Declares that the database result for each element of the input must be unique, if it exists. Result sets with more than one element are refused, but empty sets are accepted. If you want to declare that each result has to exist as well, you have to use the double constraint "UNIQUE NONEMPTY" or "NONEMPTY UNIQUE".
This example illustrates how to add result constraints to database commands returning results:
TRANSACTION selectCustomerAddress
BEGIN
    DO NONEMPTY UNIQUE SELECT id FROM Customer WHERE name = $(customer/name);
    INTO address DO NONEMPTY SELECT street,city,country FROM Address WHERE id = $id;
END
Sometimes internal error messages are confusing and not helpful to a user without deeper knowledge of the database internals. For a set of error types it is possible to add a message to be shown to the user if an error of a certain class happens. The instruction ON ERROR <errorclass> HINT <message>; following a database instruction catches the errors of class <errorclass> and adds the string <message> to the error message shown to the user.
We can have several subsequent ON ERROR definitions in a row if various error classes are to be caught.
The following example shows the usage of HINTs in error cases. It catches errors that are constraint violations (error class CONSTRAINT) and extends the error message with a hint that will be shown to the client as part of the error message:
TRANSACTION insertCustomer
BEGIN
    DO INSERT INTO Customer (name) VALUES ($(name));
    ON ERROR CONSTRAINT HINT "Customers must have a unique name.";
END
On the client side the following message will be shown:
unique constraint violation in transaction 'insertCustomer' -- Customers must have a unique name.
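Several ON ERROR clauses can also be chained after one instruction. In the following sketch the error class CONSTRAINT is taken from the example above, while the error class NOTNULL and both hint texts are assumptions for illustration and depend on the error classes your database module actually reports:

```
DO INSERT INTO Customer (name,email) VALUES ($(name),$(email));
ON ERROR CONSTRAINT HINT "Customers must have a unique name.";
ON ERROR NOTNULL HINT "A customer email address is mandatory.";
```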
We already learned how to define substructures of the transaction function result with the RESULT INTO directive of a TRANSACTION. But we can also define a scope in the result structure for sub blocks. A sub-block in the result is declared with
INTO <resulttag> BEGIN ...<instruction list>... END
All the results of the instruction list that get into the final result will be attached to the substructure with name <resulttag>. The nesting of result blocks can be arbitrary and the path of the elements in the result follows the scope of the sub-blocks.
The result of a transaction normally consists of database command results that are mapped into the result with the attached INTO directive. For printing variable values or constant values, some SQL databases let you use a select of constants without specifying a table. Unfortunately, selecting constants might not be supported by your database of choice. Besides that, explicit printing is much more readable. The statement INTO <resulttag> PRINT <value>; prints a value, which can be a constant, a variable, or an input or result reference, into the substructure named <resulttag>. The following artificial example illustrates this.
TRANSACTION doPrintX
BEGIN
    INTO person
    BEGIN
        INTO name PRINT 'jussi';
        INTO id PRINT '1';
    END
END
TDL allows the support of different transaction databases with one code base, for example one for testing and demonstration and one for the productive system. We can tag transactions, subroutines or whole TDL sources as being valid for one database or a list of databases with the command DATABASE followed by a comma separated list of database names as declared in the configuration. The following example declares the transaction function 'getCustomer' to be valid only for the databases DB1 and DBtest.
TRANSACTION getCustomer
DATABASE DB1,DBtest
BEGIN
    INTO customer DO SELECT * FROM CustomerData WHERE ID=$(id);
END
The following example does the same but declares the valid databases for the whole TDL file. In this case the database declaration has to appear as first declaration in the file.
DATABASE DB1,DBtest

TRANSACTION getCustomer
BEGIN
    INTO customer DO SELECT * FROM CustomerData WHERE ID=$(id);
END
To reuse code in different contexts, for example for doing the same procedure on different tables, subroutine templates can be defined in TDL. Subroutine templates become useful when we want to make items instantiable that are not allowed to depend on variable arguments. Most SQL implementations, for example, forbid tables to depend on variable arguments. To reuse code on different tables, you can define subroutine templates with the involved table names as template arguments. The following example defines a transaction using the subroutine template insertIntoTree on a table passed as template argument. The subroutine template arguments substitute the identifiers in embedded database statements with the passed identifier. Only whole identifiers are substituted, not substrings of identifiers, and string contents are left untouched.
TEMPLATE <TreeTable>
SUBROUTINE insertIntoTree( parentID )
BEGIN
    DO NONEMPTY UNIQUE SELECT rgt FROM TreeTable WHERE ID = $PARAM.parentID;
    DO UPDATE TreeTable SET rgt = rgt + 2 WHERE rgt >= $1;
    DO UPDATE TreeTable SET lft = lft + 2 WHERE lft > $1;
    DO INSERT INTO TreeTable (parentID, lft, rgt) VALUES ( $PARAM.parentID, $1, $1+1);
    DO NONEMPTY UNIQUE SELECT ID AS "ID" from TreeTable WHERE lft = $1;
END

TRANSACTION addTag
BEGIN
    DO insertIntoTree<TagTable>( $(parentID) )
    DO UPDATE TagTable SET name=$(name),description=$(description) WHERE ID=$RESULT.id;
END
TDL has the possibility to include files for reusing subroutines or subroutine templates in different modules. The keyword INCLUDE followed by the relative path of the TDL file without the extension .tdl includes the declarations of that file. The declarations in the included file are treated as if they had been made in the including file instead.
The following example shows the use of include. We assume that the subroutine template insertIntoTree of the example before is defined in a separate include file treeOperations.tdl located in the same folder as the TDL program.
INCLUDE treeOperations

TRANSACTION addTag
BEGIN
    DO insertIntoTree<TagTable>( $(parentID) )
    DO UPDATE TagTable SET name=$(name),description=$(description) WHERE ID=$RESULT.id;
END
TDL defines hooks to add function calls for auditing transactions. An audit call is a form function call with a structure built from the transaction input and some database results. An auditing function call can be marked as critical, so that the final commit depends not only on the transaction success but also on the success of the auditing function call. The following two examples show equivalent audit calls: one with the function call syntax for calls with a flat structure (only atomic parameters) as parameter, and one with the parameter built from the result structure of an executed BEGIN..END block. The latter can be used for audit function calls with a more complex parameter structure.
TRANSACTION doInsertUser
BEGIN
    DO INSERT INTO Users (name) values ($(name));
    DO SELECT id FROM Users WHERE name = $(name);
END
AUDIT CRITICAL auditUserInsert( $RESULT.id, $(name) )
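The equivalent declaration with a BEGIN..END block building the audit parameter structure could look like the following sketch; building the structure with INTO ... PRINT is an assumption based on the result block instructions described earlier:

```
TRANSACTION doInsertUser
BEGIN
    DO INSERT INTO Users (name) values ($(name));
    DO SELECT id FROM Users WHERE name = $(name);
END
AUDIT CRITICAL auditUserInsert WITH
BEGIN
    INTO id PRINT $RESULT.id;
    INTO name PRINT $(name);
END
```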
You can write functions for the logic tier of Wolframe in languages based on .NET (http://www.microsoft.com/net), for example C# and VB.NET. Because .NET based libraries can only be called by Wolframe as a compiled and not as an interpreted language, you have to build a .NET assembly out of a group of function implementations before using it. There are further restrictions on a .NET implementation. We will discuss all of them, so that you should be able to write and configure .NET assemblies for use in Wolframe on your own after reading this chapter.
For enabling .NET you have to declare the loading of the module 'mod_command_dotnet' in the main section of the server configuration file.
Module mod_command_dotnet
For the configuration of the .NET assemblies to be loaded, see section 'Configure .NET Modules'.
In .NET the building blocks for functions called by Wolframe are classes and method calls. The way of defining callable items for Wolframe is restricted either due to the current state of the Wolframe COM/.NET interoperability implementation or due to general or version dependent restrictions of .NET objects exposed via COM/.NET interop. We list here the restrictions:
The methods exported as functions for Wolframe must not be defined in a nested class. They should be defined in a top level class without namespace. This is a restriction imposed by the current development state of Wolframe.
The class must be derived from an interface in which all exported methods are declared.
The methods must not be static because COM/.NET interop, as far as we know, cannot cope with static method calls. Even if a method's nature is static, it has to be defined as an ordinary method call.
Functions callable from Wolframe take an arbitrary number of arguments as input and return a structure (struct) as output. The named input parameters referencing atomic elements or complex structures form the input structure of the Wolframe function. A Wolframe function called with a structure containing the elements "A" and "B" is implemented in .NET as a function taking two arguments with the names "A" and "B". Both "A" and "B" can represent either atomic elements or arbitrarily complex structures. .NET functions that need to call global Wolframe functions, for example to perform database transactions, need to declare a ProcProvider interface from the Wolframe namespace as an additional parameter. We will describe the ProcProvider interface in a separate section of this chapter.
The following simple example without provider context is declared without marshalling and introspection tags. It can therefore not be called by Wolframe. We explain later how to make it callable. The example just illustrates the structure of the exported object with its interface (example C#):
using System;
using System.Runtime.InteropServices;

public struct Address
{
    public string street;
    public string country;
};

public interface FunctionInterface
{
    Address GetAddress( string street, string country);
}

public class Functions : FunctionInterface
{
    public Address GetAddress( string street, string country)
    {
        Address rt = new Address();
        rt.street = street;
        rt.country = country;
        return rt;
    }
}
Wolframe itself is not a .NET application. Therefore it has to call .NET functions via COM/.NET interop interface of a hosted CLR (Common Language Runtime). To make functions written in .NET callable by Wolframe, the following steps have to be performed:
First the assemblies with the functions exported to Wolframe have to be built COM visible.
To make the .NET functions called from Wolframe COM visible, you have to tick the switch "Make assembly COM visible" under "Properties/Assembly Information". Furthermore, every object and method that is part of the exported API (including objects used as parameters) has to be tagged in the source as COM visible with [ComVisible(true)].
Each object that is part of the exported API has to be tagged with a global unique identifier (Guid) in order to be addressable. Modules with .NET functions have to be globally registered, and the objects need to be identified by the Guid because that is the only way to make the record info structure visible for Wolframe. The record info structure is needed to serialize/deserialize .NET objects from another interpreter context that is not registered for .NET. There are many ways to create a Guid; an object is tagged like this: [Guid("390E047F-36FD-4F23-8CE8-3A4C24B33AD3")].
For marshalling function calls correctly, Wolframe needs tags for every parameter and member of a sub structure of a parameter of methods exported as functions. The following table lists the supported types and their marshalling tags:
Table 5.2. Marshalling Tags
.NET Type | Marshalling Tag |
---|---|
I2 | [MarshalAs(UnmanagedType.I2)] |
I4 | [MarshalAs(UnmanagedType.I4)] |
I8 | [MarshalAs(UnmanagedType.I8)] |
UI2 | [MarshalAs(UnmanagedType.UI2)] |
UI4 | [MarshalAs(UnmanagedType.UI4)] |
UI8 | [MarshalAs(UnmanagedType.UI8)] |
R4 | [MarshalAs(UnmanagedType.R4)] |
R8 | [MarshalAs(UnmanagedType.R8)] |
BOOL | [MarshalAs(UnmanagedType.BOOL)] |
string | [MarshalAs(UnmanagedType.BStr)] |
RECORD | no tag needed |
array of structures | [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_RECORD)] |
array of strings | [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_BSTR)] |
array of XX (XX=I2,I4,I8,..) | [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_XX)] |
Decimal floating point and numeric types (DECIMAL) are not yet supported, but will soon be available.
The following C# module definition repeats the example introduced above with the correct tagging for COM visibility and introspection:
```csharp
using System;
using System.Runtime.InteropServices;

[ComVisible(true)]
[Guid("390E047F-36FD-4F23-8CE8-3A4C24B33AD3")]
public struct Address
{
    [MarshalAs(UnmanagedType.BStr)] public string street;
    [MarshalAs(UnmanagedType.BStr)] public string country;
};

[ComVisible(true)]
public interface FunctionInterface
{
    [ComVisible(true)]
    Address GetAddress(
        [MarshalAs(UnmanagedType.BStr)] string street,
        [MarshalAs(UnmanagedType.BStr)] string country);
}

[ComVisible(true)]
[ClassInterface(ClassInterfaceType.None)]
public class Functions : FunctionInterface
{
    public Address GetAddress(
        [MarshalAs(UnmanagedType.BStr)] string street,
        [MarshalAs(UnmanagedType.BStr)] string country)
    {
        Address rt = new Address();
        rt.street = street;
        rt.country = country;
        return rt;
    }
}
```
To make the API introspectable by Wolframe, we have to create a TLB (type library) file from the assembly (DLL) after the build. The type library has to be recreated every time the module interface (API) changes. The type library is created with the program tlbexp. All created type library (.tlb) files that will be loaded with the same runtime environment have to be copied into the same directory. They will be referenced for introspection in the configuration. The configuration of .NET will be explained later.
The type library created with tlbexp also has to be registered. For this you call the program regtlibv12 with your type library file (.tlb file) as argument. The type library registration has to be repeated whenever the module interface (API) changes.
Wolframe does not accept local assemblies. In order to be addressable over the type library interface, assemblies need to be put into the global assembly cache (GAC). Unfortunately this has to be repeated every time the assembly binary changes; there is no way around it. For the registration in the GAC we call the program gacutil /if <assemblypath> with the assembly path <assemblypath> as argument. The command gacutil has to be called from an administrator command line. Before calling gacutil, assemblies have to be strongly signed. We refer to the MSDN documentation for how to sign an application.
We also have to register the types declared in the assembly to enable Wolframe to create these types. An example could be a provider function returning a structure that is called from a Wolframe .NET function. The structure returned there has to be built in an unmanaged context; in order to be valid in the managed context, the type has to be registered. For the registration of the types in an assembly we call the program regasm <assemblypath> with the assembly path <assemblypath> as argument. The command regasm has to be called from an administrator command line.
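Taken together, the deployment steps described above form the following command sequence, to be run from an administrator command prompt after the assembly has been strongly signed. The assembly name Functions.dll and the type library path are examples only; adapt them to your project:

```
:: create the type library and register it (repeat whenever the API changes)
tlbexp Functions.dll /out:programs\typelibrary\Functions.tlb
regtlibv12 programs\typelibrary\Functions.tlb

:: install the strongly signed assembly into the GAC
:: (repeat whenever the assembly binary changes)
gacutil /if Functions.dll

:: register the types declared in the assembly
regasm Functions.dll
```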
Wolframe functions in .NET calling globally defined Wolframe functions need to declare the processor provider interface as an additional parameter. The processor provider interface is defined as follows (example C#):
```csharp
namespace Wolframe
{
    public interface ProcProvider
    {
        object call(
            [In] string funcname,
            [In] object argument,
            [In] Guid resulttype);
    }
}
```
To use it we have to include the reference to the assembly WolframeProcessorProvider.DLL. The interface defined there has a method call taking three arguments: the name of the function to call, the object to pass as argument, and the Guid of the object type to return. The returned object will be created with the help of the registered Guid and can be cast to the type with this Guid.
The following example shows the usage of a Wolframe.ProcProvider call. The method GetUserAddress is declared as a Wolframe function requiring the processor provider context as an additional argument and taking one object of type User as argument, named usr. The example function implementation redirects the call to the global Wolframe function named GetAddress, returning an object of type Address (example C#):
```csharp
public Address GetUserAddress( Wolframe.ProcProvider provider, User usr )
{
    Address rt = (Address)provider.call( "GetAddress", usr, typeof(Address).GUID );
    return rt;
}
```
The objects involved in this example need no further tagging, because the provider context and plain structures (struct) need no additional marshalling tags.
.NET modules are grouped together in a configuration block that specifies the configuration of the Microsoft Common Language Runtime (CLR) used for .NET interop calls. The configuration block has the header runtimeEnv dotNET and configures the version of the runtime loaded (clrversion) and the path where the type library (.tlb) files can be found (typelibpath). With the assembly definitions you declare the registered assemblies to load.
```
RuntimeEnv dotNet
{
    clrversion "v4.0.30319"
    typelibpath programs/typelibrary
    assembly "Functions, Version=1.0.0.30, Culture=neutral, PublicKeyToken=1c1d731dc6e1cbe1, processorArchitecture=MSIL"
    assembly "Utilities, Version=1.0.0.27, Culture=neutral, PublicKeyToken=1c1f723cc51212ef, processorArchitecture=MSIL"
}
```
Table 5.3. Attributes of assembly declarations
Name | Description |
---|---|
<no identifier> | The first element of the assembly definition does not have an attribute identifier. The value is the name of the assembly (and also of the type library) |
Version | 4 element (Major.Minor.Build.Revision) version number of the assembly. This value is defined in the assembly info file of the assembly project. |
Culture | For Wolframe applications until now always "neutral". Server-side functionality in Wolframe is not yet culture dependent. |
PublicKeyToken | Public key token values for signed assemblies. See next section how to set it. |
processorArchitecture | Not explained here. On ordinary Windows .NET platforms it usually has the value "MSIL". Read the MSDN documentation to dig deeper. |
We already found out that Wolframe .NET modules have to be strongly signed. Each strongly signed assembly has such a public key token that has to be used as attribute when referencing the assembly.
We can get the PublicKeyToken of the assembly by calling sn -T <assemblypath> from the command line (cmd), with <assemblypath> being the path of the assembly. The printed value is the public key token to insert as the attribute value of PublicKeyToken in the Wolframe configuration for each .NET assembly.
Languages of .NET called via the CLR are strongly typed. This means that the input and output of a function are already validated to be of a strictly defined structure, so an additional validation by passing the input through a form might not be needed. Validation with .NET data structures is weaker than, for example, XML validation with forms defined in a schema language, but only where distinguishing XML attributes from content elements is an issue. See the documentation of the standard command handler for how validation can be skipped with the attribute SKIP.
You can write functions for the logic tier of Wolframe in the Python programming language (http://www.python.org).
The implementation of Python calls is not yet available. But Wolframe will provide Python functions soon.
You can write functions for the logic tier of Wolframe with Lua. Lua is a scripting language designed, implemented, and maintained at PUC-Rio in Brazil by Roberto Ierusalimschy, Waldemar Celes and Luiz Henrique de Figueiredo (see http://www.lua.org/authors.html). A description of Lua is not provided here. For an introduction into programming with Lua see http://www.lua.org. The official manual, which is also available as a book, is very good. Wolframe introduces some Lua interfaces to access input and output and to execute functions.
For enabling Lua you have to declare the loading of the module 'mod_command_lua' in the main section of the server configuration file.
```
Module mod_command_lua
```
Each Lua script referenced has to be declared in the Processor section of the configuration with program <sourcefile>. The script is recognized as a Lua script by the file extension ".lua"; files without this extension cannot be loaded as Lua scripts.
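Putting the two declarations together, the relevant configuration fragments might look as follows. This is a sketch only: the script name script.lua is a placeholder, and the exact surrounding configuration depends on your setup:

```
LoadModules
{
    Module mod_command_lua
}
Processor
{
    program script.lua
}
```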
For Lua we do not have to declare anything in addition to the Lua script. If you configure a Lua script as program, all global functions declared in this script become global form functions. To avoid name conflicts, you should declare private functions of the script as local.
Wolframe lets you access objects of the global context through
a library called provider
offering the following functions:
Table 5.4. Methods of the 'provider' library
Name | Parameter | Returns |
---|---|---|
form | Name of the form | An instance of the form |
type | Type name and initializer list | A constructor function to create a value instance of this type |
formfunction | Name of the function | Form function defined in a Wolframe program or module |
document | Content string of the document to process | Returns an object of type "document" that allows the processing of the contents passed as argument. See description of type "document" |
authorize | 1) authorization function 2) (optional) authorization resource | Calls the specified authorization function and returns true on success (authorized) and false on failure (authorization denied or error) |
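The following sketch shows how these provider methods are typically used together from a script. It assumes a configured Wolframe environment; the function, form and resource names are hypothetical:

```lua
-- check authorization first; "dbaccess" and "customer" are hypothetical names
if not provider.authorize( "dbaccess", "customer" ) then
    error( "access denied" )
end

-- fetch a form and a form function by name (names are hypothetical)
local frm  = provider.form( "employee" )
local func = provider.formfunction( "insertEmployee" )
```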
Wolframe lets us extend the type system, consisting of the Lua basic data types, with our own types. We can create atomic data types defined in a module or in a DDL datatype definition program (.wnmp file). For this you call the type method of the provider with the type name as first argument plus the type initializer argument list as additional parameters. The function returns a constructor function that can be called with the initialization value as argument to get a value instance of this type. The name of the type can refer to one of the following:
Table 5.5. List of Atomic Data Types
Class | Initializer Arguments | Description |
---|---|---|
Custom data type | Custom Type Parameters | A custom data type defined in a module with arithmetic operators and methods |
Normalization function | Dimension parameters | A type defined as normalization function in a module |
DDL data type | (no arguments) | A normalizer defined as sequence of normalization functions in a .wnmp source file |
Data type 'bignumber' | (no arguments) | Arbitrary precision number type |
Data type 'datetime' | (no arguments) | Data type representing date and time down to a granularity of microseconds |
The data type 'datetime' is used as interface for date time values.
Table 5.6. Methods of 'datetime'
Method Name | Arguments | Description |
---|---|---|
<constructor> | year, month, day, hour, minute, second, millisecond, microsecond | Creates a date and time value with a granularity down to microseconds |
<constructor> | year, month, day, hour, minute, second, millisecond | Creates a date and time value with a granularity down to milliseconds |
<constructor> | year, month, day, hour, minute, second | Creates a date and time value |
<constructor> | year, month, day | Creates a date value |
year | (no arguments) | Return the value of the year |
month | (no arguments) | Return the value of the month (1..12) |
day | (no arguments) | Return the value of the day in the month (1..31) |
hour | (no arguments) | Return the value of the hour in the day (0..23) |
minute | (no arguments) | Return the value of the minute (0..59) |
second | (no arguments) | Return the value of the second (0..63 : 59 + leap seconds) |
millisecond | (no arguments) | Return the value of the millisecond (0..1023) |
microsecond | (no arguments) | Return the value of the microsecond (0..1023) |
__tostring | (no arguments) | Return the date as string in the format YYYYMMDD, YYYYMMDDhhmmss, YYYYMMDDhhmmssll or YYYYMMDDhhmmssllcc, depending on constructor used to create the date and time value. |
The data type 'bignumber' is used to reference fixed point BCD numbers with a precision of 32767 digits between -1E32767 and +1E32767.
Table 5.7. Methods of 'bignumber'
Method name | Arguments | Description |
---|---|---|
<constructor> | number value as string | Creates a bignumber from its string representation |
<constructor> | number value | Creates a bignumber from a lua number value (double precision floating point number) |
precision | (no arguments) | Return the number of significant digits in the number |
scale | (no arguments) | Return the number of fractional digits (may be negative, may be bigger than precision) |
digits | (no arguments) | Return the significant digits in the number |
tonumber | (no arguments) | Return the number as a Lua number value (double precision floating point number) with possible loss of accuracy |
__tostring | (no arguments) | Return the big number value as string (not normalized). |
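As a sketch (assuming a configured Wolframe environment), the two user data types could be obtained and used like this:

```lua
-- get constructor functions for the user data types
local datetime = provider.type( "datetime" )
local bignum   = provider.type( "bignumber" )

-- construct values (see the method tables above for the argument forms)
local d = datetime( 2014, 3, 31, 12, 30, 5 )  -- year, month, day, hour, minute, second
local y = d:year()
local s = tostring( d )     -- YYYYMMDDhhmmss format, per the __tostring description

local n = bignum( "12345.6789" )  -- from string representation
local p = n:precision()           -- number of significant digits
```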
Lua provides an interface to the iterators internally used to couple objects and functions. They are accessible as iterator function closures in Lua. They look similar to Lua iterators but are not: you should not mix them with the standard Lua iterators even though the semantics are similar. Filter interface iterators do not return nodes of the tree as subtree objects but only the node data, in the order of a pre-order traversal. You can recursively iterate on the tree and build the object during traversal if you want. The returned elements of the filter interface iterators are tuples with the following meaning:
Table 5.8. Filter interface iterator elements
Tuple First Element | Tuple Second Element | Description |
---|---|---|
NIL/false | string/number | Open (tag is second element) |
NIL/false | NIL/false | Close |
Any non NIL/false | string/number | Attribute assignment (value is first, tag is second element) |
string/number | NIL/false | Content value (value is first element) |
Wolframe lets you access filter interface
iterators through a library called iterator
offering the following functions:
Table 5.9. Methods of the 'iterator' library
Name | Parameter | Returns |
---|---|---|
scope | serialization iterator (*) | An iterator restricted on the subnodes of the last visited node (**) |
(*) See section "serialization iterator"
(**) If iterator.scope is called, all elements of the returned iterator have to be visited in order to continue the iteration with the original iterator on which iterator.scope was called.
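Based on the tuple protocol of table 5.8, a recursive traversal that rebuilds a Lua table from a filter interface iterator could be sketched as follows. This is a sketch only: it assumes a Wolframe execution context that provides such an iterator, and it assumes the iterator yields false (not nil) as the empty placeholder so that the generic for loop does not terminate early:

```lua
-- rebuild a table from a filter interface iterator (pre-order traversal)
local function build( itr )
    local rt = {}
    for value, tag in itr do
        if tag then
            if value then
                rt[ tag ] = value  -- attribute assignment (value, tag)
            else
                -- open element (false, tag): recurse into the subtree;
                -- the scoped iterator must be consumed completely (see **)
                rt[ tag ] = build( iterator.scope( itr ) )
            end
        elseif value then
            rt[ #rt + 1 ] = value  -- content value (value, false)
        else
            break                  -- close (false, false)
        end
    end
    return rt
end
```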
Besides the provider library Wolframe defines the following objects global in the script execution context:
Name | Description |
---|---|
logger | object with methods for logging or debugging |
The provider function provider.form() returns an empty instance of a form. It takes the name of the form as a string argument. If you for example have a form configured called "employee" and you want to create an employee object from a Lua table, you call:
```lua
bcf = provider.form( "employee" )
bcf:fill( {surname='Hans', name='Muster', company='Wolframe'} )
```
The first line creates the data form object. The second line fills the data into the data form object.
The form method fill takes a second, optional parameter. Passing "strict" as second parameter enforces a strict validation of the input against the form, meaning that attributes are checked to be attributes (when using XML serialization) and non-optional elements are checked to be initialized. Passing "complete" as second parameter forces non-optional elements to be checked for initialization but does not distinguish between attributes and content values. "relaxed" is the default and checks only the existence of filled-in values in the form.
Given the following validation form in simple form DDL syntax (see chapter "Forms"):
```
FORM Employee
    -root employee
{
    ID       !@int    ; Internal customer id (mandatory)
    name     !string  ; Name of the customer (mandatory)
    company  string   ; Company he is working for (optional)
}
```
the call of fill
in the following piece of code will raise an error because some elements
of the form ('ID' and 'name') are missing in the input:
```lua
bc = provider.form( "employee" ):fill( {company='Wolframe'}, "strict" )
```
To access the data in a form there are two form methods available: get() returns a filter interface iterator on the form data, and value() returns the form data as a Lua data structure (a Lua table or atomic value).
For calling transactions or built-in functions loaded as modules, the Lua layer defines the concept of form functions. The provider function provider.formfunction with the name of the function as argument returns a Lua function. This function takes a table or a filter interface iterator as argument and returns a data form structure. The data in the returned form structure can be accessed with get(), which returns a filter interface iterator on the content, and value(), which returns a Lua table or atomic value.
If you for example have a transaction called "insertEmployee", defined in a transaction description program file declared in the configuration, and you want to call it with the 'employee' object defined above as input, you do:
```lua
f = provider.formfunction( "insertEmployee" )
res = f( {surname='Hans', name='Muster', company='Wolframe'} )
t = res:value()
output:print( t[ "id" ] )
```
The first line creates the function called "insertEmployee" as Lua function. The second calls the transaction, the third creates a Lua table out of the result and the fourth selects and prints the "id" element in the table.
This is a list of all objects and functions declared by Wolframe:
Table 5.10. Data forms declared by DDL
Method Name | Arguments | Returns | Description |
---|---|---|---|
get | (no arguments) | filter interface iterator (*) | Returns a filter interface iterator on the form elements |
value | (no arguments) | Lua table | Returns the contents of the data form as Lua table or atomic value |
__tostring | (no arguments) | string | String representation of form for debugging |
name | (no arguments) | string | Returns the global name of the form. |
fill | Lua table or filter interface iterator (*), optional validation mode (**) | the filled form (for concatenation) | Validates input and fills the input data into the form. |
(*) See section "filter interface iterator"
(**) "strict" (full validation), "complete" (only check for all non optional elements initialization) or "relaxed" (no validation except matching of input to elements)
Table 5.11. Data forms returned by functions
Method Name | Returns | Description |
---|---|---|
get | filter interface iterator (*) | Returns a filter interface iterator on the form elements |
value | Lua table or atomic value | Returns the contents of the data form as Lua table or atomic value |
__tostring | string | String representation of form for debugging |
(*) See section "filter interface iterator"
Table 5.12. Document
Method Name | Arguments | Description |
---|---|---|
docformat | - | Returns the format of the document {'XML','JSON',etc..} |
as | filter and/or document type table | Attaches a filter to the document to be used for processing |
doctype | - | Returns the document type of the content. For retrieving the document type you first have to define a filter. |
metadata | - | Returns the meta data structure of the content. For retrieving the document meta data you first have to define a filter. |
value | - | Returns the contents of the document as Lua table or atomic value. The method 'table' does the same but is considered to be deprecated. |
table | - | Deprecated. Does return the same as the method 'value' |
form | - | Returns the contents of the document as a filled form instance |
get | - | Returns a filter interface iterator (*) on the form elements |
(*) See section "filter interface iterator"
Table 5.13. Logger functions
Method Name | Arguments | Description |
---|---|---|
logger.printc | arbitrary list of arguments | Print arguments to standard console output |
logger.print | loglevel (string) plus arbitrary list of arguments | log argument list with defined log level |
Table 5.14. Global functions
Function Name | Arguments | Description |
---|---|---|
provider.form | name of form (string) | Returns an empty data form object of the given type |
provider.formfunction | name of function (string) | Returns a lua function to execute the Wolframe function specified by name |
provider.type | name of data type (string) | Returns a constructor function for the data type given by name. The name specifies either a custom data type or a normalization function as used in forms or one of the additional userdata types 'datetime' or 'bignumber'. |
provider.document | Content string of the document to process | Returns an object of type "document" that allows the processing of the contents passed as argument. See description of type "document" |
(*) See section "filter interface iterator"
(**) The filter interface iterator of a defined scope must be consumed completely before consuming anything of the parent iterator. Otherwise it may lead to unexpected results because they share some part of the iterator state.
You can write functions for the logic tier of Wolframe with C++. Because native C++ is by nature a compiled and not an interpreted language, you have to build a module out of your function implementation.
For native C++ you need a C++ build system with compiler and linker or an integrated development environment for C++.
Form functions declared in C++ have two arguments: the output structure to fill, passed by reference as the first argument, and the input structure, passed as the second. The input structure must not be modified by the callee; in C++ this means it is passed as a const reference. The function returns an int that is 0 on success, with any other value indicating an error code. The function may also throw a runtime error exception in case of an error.
The following example shows a function declaration. The declaration is not complete because the input and output structures need to be declared with some additional attributes needed for introspection; we will explain this in the following section. The function takes a structure as input and writes the result into an output structure. In this example input and output type are the same, but this is not required; it is just the same here for simplicity.
The elements of the function declaration are put into a structure with four elements. The typedef for the InputType and OutputType structures is required because the input and output types should be recognizable without complicated type introspection templates (template based introspection might cause spurious and hard to understand error messages when building the module). The function name returns the name of the function that identifies it in the Wolframe global scope. The static function exec with this signature refers to the function implementation.
```cpp
// ... PUT THE INCLUDES FOR THE "Customer" STRUCTURE DECLARATION HERE !

struct ProcessCustomer
{
    typedef Customer InputType;
    typedef Customer OutputType;

    static const char* name()  {return "process_customer";}
    static int exec( const proc::ProcessorProvider* provider,
                     OutputType& res, const InputType& param);
};
```
For defining input and output parameter structures in C++ you have to define the structure and its serialization description. The serialization description is a static function getStructDescription without arguments, returning a const structure that describes which element names to bind to which structure members.
The following example shows a form function parameter structure defined in C++.
Declares the structure and the serialization description of the structure. Structures may contain structures with their own serialization description.
```cpp
#include "serialize/struct/structDescriptionBase.hpp"
#include <string>

namespace _Wolframe {
namespace example {

struct Customer
{
    int ID;                      // Internal customer id
    std::string name;            // Name of the customer
    std::string canonical_Name;  // Customer name in canonical form
    std::string country;         // Country
    std::string locality;        // Locality

    static const serialize::StructDescriptionBase* getStructDescription();
};

}}//namespace
```
This declares 'ID' as attribute and name, canonical_Name, country and locality as tags. The '--' operator marks the end of the attributes section and the start of the content section.
```cpp
#include "serialize/struct/structDescription.hpp"

using namespace _Wolframe;

namespace {
struct CustomerDescription :public serialize::StructDescription<Customer>
{
    CustomerDescription()
    {
        (*this)
            ("ID", &Customer::ID)
            --
            ("name", &Customer::name)
            ("canonical_Name", &Customer::canonical_Name)
            ("country", &Customer::country)
            ("locality", &Customer::locality)
        ;
    }
};
} // anonymous namespace

const serialize::StructDescriptionBase* Customer::getStructDescription()
{
    static CustomerDescription rt;
    return &rt;
}
```
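The chaining syntax of the serialization description works because operator() registers an element and returns a reference to the description object itself, while the '--' operator flips an "attributes done" flag. The following self-contained toy model, independent of Wolframe (the names ToyDescription, ToyCustomer and buildExample are invented here), illustrates the mechanism:

```cpp
#include <string>
#include <vector>

// Toy model of the chaining interface used by serialize::StructDescription.
struct ToyDescription
{
    std::vector<std::string> attributes;  // elements declared before '--'
    std::vector<std::string> content;     // elements declared after '--'
    bool inContent;

    ToyDescription() :inContent(false) {}

    // operator() registers one element and returns *this, enabling chaining
    template <typename Member>
    ToyDescription& operator()( const std::string& name, Member )
    {
        if (inContent) content.push_back( name );
        else attributes.push_back( name );
        return *this;
    }

    // postfix '--' switches from the attributes section to the content section
    ToyDescription& operator--( int )
    {
        inContent = true;
        return *this;
    }
};

struct ToyCustomer { int ID; std::string name; };

// build a description with the same chaining syntax as in the text above
ToyDescription buildExample()
{
    ToyDescription d;
    (d)
        ("ID", &ToyCustomer::ID)
        --
        ("name", &ToyCustomer::name)
    ;
    return d;
}
```

Here ("ID", ...) lands in the attributes list, while ("name", ...), coming after the '--', lands in the content list, mirroring how the real StructDescription distinguishes XML attributes from content elements.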
Now we have all pieces together to build a loadable Wolframe module with our example C++ function. The following example shows what you have to declare in the main module source file.
The module declaration needs to include appDevel.hpp and of course all headers with the function and data structure declarations needed. The module starts with the header macro WF_MODULE_BEGIN, taking an identifier and a short description of the module. What follows are the function declarations, declared with the macro WF_FORM_FUNCTION. This macro has the following arguments in this order:
Name | Description |
---|---|
NAME | identifier of the function |
FUNCTION | implementation of the function |
OUTPUT | output structure of the function |
INPUT | input structure of the function |
The declaration list is closed with the parameterless footer macro WF_MODULE_END. The following example shows a module declaration:
```cpp
#include "appDevel.hpp"

// ... PUT THE INCLUDES FOR THE "ProcessCustomer" FUNCTION DECLARATION HERE !
#include "customersFunction.hpp"

using namespace _Wolframe;

WF_MODULE_BEGIN( "ProcessCustomerFunction", "process customer function")
 WF_FORM_FUNCTION("process_customer",ProcessCustomer::exec,Customer,Customer)
WF_MODULE_END
```
For building the module we have to include all modules introduced here and link it against the Wolframe serialization library (wolframe_serialize) and the Wolframe core library (wolframe).
The built module can be loaded like the other modules by declaring it in the LoadModules section of the Wolframe configuration. Simply list it there with module <yourModuleName>, with <yourModuleName> being the name of or path to your module.
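For the example module built above, the declaration might look like this; the module name mod_process_customer is hypothetical and depends on how you named your build output:

```
LoadModules
{
    module mod_process_customer
}
```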
C++ is a strongly typed language. This means that the input and output of a function are already validated to be of a strictly defined structure, so an additional validation by passing the input through a form might not be needed. The constructs used to describe Wolframe structures in native C++ are even capable of describing attributes as used in XML (see section 'Input/Output Data Structures' above). See the documentation of the standard command handler for how validation can be skipped with the attribute SKIP.
Copyright © 2014 - Project Wolframe - All Rights Reserved