This chapter describes how functions are linked to the logic tier. It gives an overview on the language bindings available for Wolframe.
For defining database transactions, Wolframe introduces a language called TDL (Transaction Definition Language). TDL embeds the language of the underlying database (SQL) in a language that defines how sets of input and output elements are addressed.
This chapter also describes how data types are defined that can be used in data definition languages (DDL) for form descriptions. Forms and their definition will be introduced in a different chapter.
After reading this chapter you should be able to write functions of the Wolframe logic tier on your own.
Be aware that you have to configure a programming language of the logic tier in Wolframe before using it. Each chapter introducing a programming language will have a section that describes how the server configuration of Wolframe has to be extended for its availability.
For the description of transactions, Wolframe provides the transaction definition language (TDL) introduced here. Wolframe transactions in TDL are defined as functions in a transactional context, meaning that whatever is executed in a transaction function belongs to a database transaction. The transaction commit is executed implicitly on function completion. Errors, a denied authorization, or a failing audit operation lead to an abort of the database transaction.
A TDL transaction function takes a structure as input and returns a structure as output. The Wolframe database interface defines a transaction as object where the input is passed to as a structure and the output is fetched from it as a structure.
TDL is a language to describe the building of transaction input and the building of the result structure from the database output. It defines a transaction as a sequence of instructions on multiple data. An instruction is either described as a single embedded database command in the language of the underlying database or a TDL subroutine call working on multiple data.
Working on multiple data means that the instruction is executed for every item of its input set. This set can consist of the set of results of a previous instruction or a selection of the input of the transaction function. A "for each" selector defines the input set as part of the command.
Each instruction result can be declared as being part of the transaction result structure. The language has no flow control based on state variables other than input and results of previous commands and is therefore not a general purpose programming language. But because of this, the processing and termination of the program is absolutely predictable.
As possibility to convert the input data before passing it to the database, the transaction definition language defines a preprocessing section where globally defined Wolframe functions can be called for the selected input. To build an output structure that cannot be modeled with a language without control structures and recursion, TDL provides the possibility to define a function as filter for postprocessing of the result of the transaction function. This way it is for example possible to return a tree structure as TDL function result.
TDL is, like most SQL databases, case insensitive. For clarity and better readability, TDL keywords are written in uppercase here, and we recommend using uppercase for TDL keywords in general.
TDL is compiled to a code for a virtual machine. Setting the log level to DATA will print the symbolic representation of the code as log output. The internals of the virtual machine will be discussed in a different chapter of this book.
Each TDL program source referenced has to be declared in the Processor section of the configuration with program <sourcefile>.
A TDL program consists of subroutine declarations and exported transaction function declarations. Subroutines have the same structure as transaction function blocks but without pre- and postprocessing and authorization method declarations.
A subroutine declaration starts with the keyword SUBROUTINE followed by the subroutine name and optionally some parameter names in parentheses ('(' ')') separated by commas. The declared subroutine name identifies the function in the scope of this source file after this subroutine declaration. The name is not exported and the subroutine is not available for other TDL modules. With includes, described later, code can be reused across files.
The body of the function contains the following parts:
DATABASE <database name list>
This optional definition restricts the definition and availability of the function to a set of databases. The databases are listed by name, separated by commas (','). The names are the database IDs defined in your server configuration or database names as specified in the module. If the database declaration is omitted, the transaction function is available for any database. This declaration allows you to run your application with configurations using different databases but sharing a common code base.
BEGIN <...instructions...> END
The main processing block starts with BEGIN and ends with END. It contains all the commands executed when calling this subroutine from another subroutine or a transaction function.
The following pseudocode example shows the parts of a subroutine declaration:
SUBROUTINE <name> ( <parameter name list> )
DATABASE <list of database names>
BEGIN
    ...<instructions>...
END
The DATABASE declaration is optional.
A transaction function declaration starts with the keyword TRANSACTION followed by the name of the transaction function. This name identifies the function globally. The body of the function contains the following parts:
AUTHORIZE ( <auth-function>, <auth-resource> )
This optional definition deals with authorization and access rights. If the authorization function fails, the transaction function is not executed and returns with an error. The <auth-function> references a form function implementing the authorization check. The <auth-resource> is passed as a parameter with the name 'resource' to the function.
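As a sketch under assumptions (the authorization function name checkOwnership and the resource name Customer are hypothetical), an AUTHORIZE declaration could look like this:

```
TRANSACTION deleteCustomer
AUTHORIZE ( checkOwnership, Customer )
BEGIN
    DO DELETE FROM Customer WHERE id = $(id);
END
```

Here checkOwnership would be a globally defined form function that receives 'Customer' as the value of its 'resource' parameter and decides whether the caller may execute the transaction.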
DATABASE <database name list>
This optional definition restricts the definition and availability of the function to a set of databases. The databases are listed by name, separated by commas (','). The names are the database IDs defined in your server configuration. If the database declaration is omitted, the transaction function is available for any database. This declaration allows you to run your application with configurations using different databases but sharing a common code base.
RESULT FILTER <post-filter-name>
This optional declaration defines a function applied as a post filter to the transaction function. The idea is that you might want to return a result structure that cannot be built by TDL, for example a recursive structure like a tree. The result filter function is called with the structure printed by the main processing block (BEGIN .. END), and the result of the filter function is returned to the caller instead.
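A minimal sketch, assuming a globally defined filter function buildTree (a hypothetical name) that turns a flat node list into a tree:

```
TRANSACTION selectCategoryTree
RESULT FILTER buildTree
BEGIN
    INTO node DO SELECT ID,parentID,name FROM Category;
END
```

The flat list of node elements printed by the main block is passed to buildTree, and its (possibly recursive) output becomes the transaction function result.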
PREPROC <...instructions...> ENDPROC
This optional block contains instructions on the transaction function input. The results of these preprocessing instructions are put into the input structure, so that they can be referenced in the main code definition block of the transaction. Any global normalization or form function can be called in the preprocessing block to enrich or transform the input to process.
BEGIN <...instructions...> END
The main processing block starts with BEGIN and ends with END. It contains all the database instructions needed for completing this transaction.
AUDIT [CRITICAL] <funcname...> WITH BEGIN <...instructions...> END
This optional block specifies a function that is executed at the end of a transaction. The input of the function is the structure built from the output of the instructions block. If CRITICAL is specified, the transaction fails (rollback) if the audit function fails. Otherwise the error of the audit function is only logged and the transaction is completed (commit). You can specify several audit functions. The variables in the instructions block refer to the scope of the main processing block, so you can reference everything that is referencable after the last instruction of the main processing block.
AUDIT [CRITICAL] <funcname...> ( <...parameter...> )
If the input structure of the audit function is just one parameter list, this alternative syntax for an audit function declaration can be used. You simply specify the audit function call after the AUDIT or, optionally, after the CRITICAL keyword.
The following pseudo code snippet shows the explained building blocks in transaction functions together:
TRANSACTION <name>
AUTHORIZE ( <auth-function>, <auth-resource> )
DATABASE <list of database names>
RESULT FILTER <post-filter-name>
PREPROC
    ...<preprocessing instructions>...
ENDPROC
BEGIN
    ...<instructions>...
END
AUDIT CRITICAL <funcname> ( ...<parameter>... )
The lines with AUTHORIZE, DATABASE and RESULT FILTER are optional, as is the preprocessing block PREPROC..ENDPROC. A simpler transaction function looks like the following:
TRANSACTION <name>
BEGIN
    ...<instructions>...
END
Main processing instructions defined in the main execution block of a subroutine or transaction function consist of three parts in the following order, terminated by a semicolon ';' (the order of the INTO and FOREACH expressions can be switched):
INTO <result substructure name>
This optional directive specifies whether and where the results of the database commands should be put as part of the function output. In subroutines this substructure is relative to the current substructure addressed in the caller's context. For example, a subroutine with an "INTO myres" directive in a block of an "INTO output" directive will write its result into a substructure with the path "output/myres".
FOREACH <selector>
This optional directive defines the set of elements on which the instruction is executed one by one. Specifying a set of two elements will cause the function to be called twice. An empty set as selection will cause the instruction to be ignored. Without this quantifier, the database command or subroutine call of the instruction will always be executed once.
The argument of the FOREACH expression is either a reference to the result of a previous instruction or a path selecting a set of input elements.
Results of previous instructions are referenced either with the keyword RESULT referring to the result set of the previous command or with a variable naming a result set declared with this name before.
Input elements are selected by path relative to the path currently selected, starting from the input root element when entering a transaction function. The current path selected and the base element of any relative path calculated in this scope changes when a subroutine is called in a FOREACH selection context. For example calling a subroutine in a 'FOREACH person' context will cause relative paths in this subroutine to be sub elements of 'person'.
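To illustrate with a sketch (the names are hypothetical): when the subroutine below is called in a 'FOREACH /person' context, its relative paths first and last resolve under the currently selected person element:

```
SUBROUTINE insertPersonName
BEGIN
    DO INSERT INTO Name (first,last) VALUES ($(first), $(last));
END

TRANSACTION insertPersonNames
BEGIN
    FOREACH /person DO insertPersonName();
END
```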
DO <command>
Commands in an instruction are either embedded database commands or subroutine calls. Command arguments are either constants, relative paths from the selector path in the FOREACH selection, or references to elements in the result of a previous command. If an argument is a relative path from the selector context, its reference has to be unique in the context of the element selected by the selector. If an argument references a previous command result, it must either be unique or dependent on the FOREACH argument. Results that are sets with more than one element can only be referenced if they are bound to the FOREACH quantifier.
The following example illustrates how the FOREACH, INTO and DO expressions in the main processing block work together:
TRANSACTION insertCustomerAddresses
BEGIN
    DO SELECT id FROM Customer WHERE name = $(customer/name);
    FOREACH /customer/address
    DO INSERT INTO Address (id,address) VALUES ($RESULT.id, $(address));
END
Preprocessing instructions defined in the PREPROC execution block of a transaction function consist, similarly to the instructions in the main execution block, of three parts in the following order, terminated by a semicolon ';' (the order of the INTO and FOREACH expressions can be switched and has no meaning, e.g. FOREACH..INTO == INTO..FOREACH):
INTO <result substructure name>
This optional directive specifies whether and where the results of the preprocessing commands should be put as part of the input to be processed by the main processing instructions. The relative paths of the destination structure are calculated relative to a FOREACH selection element.
FOREACH <selector>
This optional directive defines the set of elements on which the instruction is executed one by one. The preprocessing command is executed once for each element in the selected set and it will not be executed at all if the selected set is empty.
DO <command>
Commands in an instruction are function calls to globally defined form functions or normalization functions. Command arguments are constants or relative paths from the selector path in the FOREACH selection. They are uniquely referencing elements in the context of a selected element.
The following example illustrates how the FOREACH, INTO and DO expressions in the preprocessing block work together:
TRANSACTION insertPersonTerms
PREPROC
    FOREACH //address/* INTO normalized DO normalizeStructureElements(.);
    FOREACH //id INTO normalized DO normalizeNumber(.);
ENDPROC
BEGIN
    DO UNIQUE SELECT id FROM Person WHERE name = $(person/name);
    FOREACH //normalized DO INSERT INTO SearchTerm (id, value) VALUES ($RESULT.id, $(.));
END
An element of the input or a set of input elements can be selected by a path. A path is a sequence of one of the following elements separated by slashes:
Identifier
An identifier uniquely selects a sub element of the current position in the tree.
*
An asterisk selects any sub element of the current position in the tree.
..
Two dots in a row select the parent element of the current position in the tree.
.
A single dot selects the current element in the tree. This operator can also be useful as part of a path to force the expression to be interpreted as a path when it could also be interpreted as a keyword of the TDL language (for example ./RESULT).
A slash at the beginning of a path selects the root element of the transaction function input tree. Two subsequent slashes express that the following node is (transitively) any descendant of the current node in the tree.
Paths can appear as argument of a FOREACH selector where they specify the set of elements on which the attached command is executed on. Or they can appear as reference to an argument in a command expression where they specify uniquely one element that is passed as argument to the command when it is executed.
When used in embedded database statements, selector paths are referenced with $(<path expression>). When used as database function or subroutine call arguments, path expressions can be used plain, without the '$' and '(' ')' markers. These markers are just used to identify substitution entities.
The following list shows different ways of addressing an element by path:
/
Root element
/organization
Root element with name "organization"
/organization/address/city
Element "city" of root "organization" descendant "address"
.//id
Any descendant element with name "id" of the current element
//person/id
Child with name "id" of any descendant "person" of the root element
//id
Any descendant element with name "id" of the root element
/address/*
Any direct descendant of the root element "address"
.
Currently selected element
This example shows the usage of path expressions in the preprocessing and the main processing part of a transaction function:
TRANSACTION selectPerson
PREPROC
    FOREACH /person/name INTO normalized DO normalizeName( . );
    FOREACH /person INTO citycode DO getCityCode( city );
ENDPROC
BEGIN
    FOREACH person
    DO INSERT INTO Person (Name,NormalizedName,CityCode) VALUES ($(name),$(name/normalized),$(citycode));
END
Database results of the previous instruction are referenced with '$RESULT.' followed by the column identifier or column number. Column numbers always start from 1, independently of the database! So be aware that even if the database counts columns from 0, you have to use 1 for the first column.
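For example, assuming the previous SELECT returns a single column named id, the following two instructions reference the same value, once by column name and once by the 1-based column number:

```
DO SELECT name FROM Company WHERE id = $RESULT.id;
DO SELECT name FROM Company WHERE id = $RESULT.1;
```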
As already explained before, database result sets of cardinality bigger than one cannot be addressed if not bound to a FOREACH selection. In statements potentially addressing more than one result element, you have to add a FOREACH RESULT quantifier.
For addressing results of instructions preceding the previous instruction, you have to name them (see next section). The name of the result can then be used as FOREACH argument to select the elements of a set to be used as base for the command arguments of the instruction. Without binding instruction commands with a FOREACH quantifier, the named results of an instruction can be referenced as $<name>.<columnref>, for example $person.id for the column with name 'id' of the result named 'person'.
The 'RESULT.' prefix in references to the previous instruction result is a default and can be omitted in instructions that are not explicitly bound to any other result than the last one. So the following two instructions are equivalent:
DO SELECT name FROM Company WHERE id = $RESULT.id
DO SELECT name FROM Company WHERE id = $id
and so are the following two instructions:
FOREACH RESULT DO SELECT name FROM Company WHERE id = $RESULT.id
FOREACH RESULT DO SELECT name FROM Company WHERE id = $id
The result name prefix of any named result can also be omitted if the instruction is bound to a FOREACH selector naming the result. So the following two statements in the context of an existing database result named "ATTRIBUTES" are equivalent:
FOREACH ATTRIBUTES DO SELECT name FROM Company WHERE id = $ATTRIBUTES.id
FOREACH ATTRIBUTES DO SELECT name FROM Company WHERE id = $id
Database results can be held and made referenceable by name with the declaration KEEP AS <resultname> immediately following the instruction with the result to be referenced. The identifier <resultname> references the result in a variable reference or a FOREACH selector expression.
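A sketch under assumptions (the table and result names are hypothetical, and the exact placement of the KEEP AS declaration follows the description above), keeping a result so it can still be referenced after a later instruction:

```
DO UNIQUE SELECT id FROM Person WHERE name = $(person/name)
KEEP AS person;
DO UNIQUE SELECT id FROM Company WHERE name = $(company/name);
INTO employment DO INSERT INTO Employment (personID,companyID) VALUES ($person.id, $RESULT.id);
```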
Subroutine parameters are addressed like results but with the prefix PARAM. instead of RESULT. or a named result prefix. "PARAM." is reserved for parameters. The first instruction without FOREACH quantifier can reference the parameters by name without a prefix.
SUBROUTINE selectDevice( id )
BEGIN
    INTO device DO SELECT name FROM DevIdMap WHERE id = $PARAM.id;
END

TRANSACTION selectDevices
BEGIN
    DO selectDevice( id );
END
Database commands returning results can have constraints to catch certain errors that would otherwise not be recognized at all, or be recognized too late. For example, a command having a result of a previous command as argument would not be executed if the result of the previous command is empty. Nevertheless, the overall transaction would succeed, because no database error occurs during execution of the commands defined for the transaction.
Constraints on database results are expressed as keywords following the DO keyword of an instruction in the main processing section. If a constraint on database results is violated, the whole transaction fails and a rollback occurs.
The following list explains the result constraints available:
NONEMPTY
Declares that the database result for each element of the input must not be empty.
UNIQUE
Declares that the database result for each element of the input must be unique, if it exists. Result sets with more than one element are refused, but empty sets are accepted. If you want to declare that each result has to exist as well, you have to use the double constraint "UNIQUE NONEMPTY" or "NONEMPTY UNIQUE".
This example illustrates how to add result constraints to database commands returning results:
TRANSACTION selectCustomerAddress
BEGIN
    DO NONEMPTY UNIQUE SELECT id FROM Customer WHERE name = $(customer/name);
    INTO address DO NONEMPTY SELECT street,city,country FROM Address WHERE id = $id;
END
Sometimes internal error messages are confusing and not helpful to a user without deeper knowledge of the database internals. For a set of error types it is possible to add a message to be shown to the user if an error of a certain class happens. The instruction ON ERROR <errorclass> HINT <message>; following a database instruction catches the errors of class <errorclass> and adds the string <message> to the error message shown to the user.
We can have several subsequent ON ERROR definitions in a row if various error classes are to be caught.
The following example shows the usage of HINTs in error cases. It catches errors that are constraint violations (error class CONSTRAINT) and extends the error message with a hint that will be shown to the client as part of the error message:
TRANSACTION insertCustomer
BEGIN
    DO INSERT INTO Customer (name) VALUES ($(name));
    ON ERROR CONSTRAINT HINT "Customers must have a unique name.";
END
On the client side the following message will be shown:
unique constraint violation in transaction 'insertCustomer' -- Customers must have a unique name.
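Several ON ERROR clauses can also be chained after one instruction. In the following sketch the error class CONSTRAINT is taken from the example above, while the error class NOTNULL and both hint texts are assumptions for illustration and depend on the error classes your database module actually reports:

```
DO INSERT INTO Customer (name,email) VALUES ($(name),$(email));
ON ERROR CONSTRAINT HINT "Customers must have a unique name.";
ON ERROR NOTNULL HINT "A customer email address is mandatory.";
```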
We already learned how to define substructures of the transaction function result with the RESULT INTO directive of a TRANSACTION. But we can also define a scope in the result structure for sub blocks. A sub-block in the result is declared with
INTO <resulttag> BEGIN ...<instruction list>... END
All the results of the instruction list that get into the final result will be attached to the substructure with name <resulttag>. The nesting of result blocks can be arbitrary and the path of the elements in the result follows the scope of the sub-blocks.
The result of a transaction normally consists of database command results that are mapped into the result with the attached INTO directive. For printing variable values or constant values, some SQL databases let you use a select of constants without specifying a table. Unfortunately, selecting constants might not be supported by your database of choice. Besides that, explicit printing is much more readable. The statement INTO <resulttag> PRINT <value>; prints a value, which can be a constant, a variable, or an input or result reference, into the substructure named <resulttag>. The following artificial example illustrates this.
TRANSACTION doPrintX
BEGIN
    INTO person
    BEGIN
        INTO name PRINT 'jussi';
        INTO id PRINT '1';
    END
END
TDL allows the support of different transaction databases with one code base, for example one for testing and demonstration and one for the productive system. We can tag transactions, subroutines or whole TDL sources as being valid for one database or a list of databases with the command DATABASE followed by a comma separated list of database names as declared in the configuration. The following example declares the transaction function 'getCustomer' to be valid only for the databases DB1 and DBtest.
TRANSACTION getCustomer
DATABASE DB1,DBtest
BEGIN
    INTO customer DO SELECT * FROM CustomerData WHERE ID=$(id);
END
The following example does the same but declares the valid databases for the whole TDL file. In this case the database declaration has to appear as first declaration in the file.
DATABASE DB1,DBtest

TRANSACTION getCustomer
BEGIN
    INTO customer DO SELECT * FROM CustomerData WHERE ID=$(id);
END
To reuse code in different contexts, for example for doing the same procedure on different tables, subroutine templates can be defined in TDL. Subroutine templates become useful when we want to make items instantiable that are not allowed to depend on variable arguments. Most SQL implementations, for example, forbid tables to depend on variable arguments. To reuse code on different tables, you can define subroutine templates with the involved table names as template arguments. The following example defines a transaction using the subroutine template insertIntoTree on a table passed as template argument. The subroutine template arguments substitute the identifiers in embedded database statements with the passed identifier. Only whole identifiers are substituted, not substrings of identifiers, and string contents are left untouched.
TEMPLATE <TreeTable>
SUBROUTINE insertIntoTree( parentID )
BEGIN
    DO NONEMPTY UNIQUE SELECT rgt FROM TreeTable WHERE ID = $PARAM.parentID;
    DO UPDATE TreeTable SET rgt = rgt + 2 WHERE rgt >= $1;
    DO UPDATE TreeTable SET lft = lft + 2 WHERE lft > $1;
    DO INSERT INTO TreeTable (parentID, lft, rgt) VALUES ( $PARAM.parentID, $1, $1+1);
    DO NONEMPTY UNIQUE SELECT ID AS "ID" from TreeTable WHERE lft = $1;
END

TRANSACTION addTag
BEGIN
    DO insertIntoTree<TagTable>( $(parentID) )
    DO UPDATE TagTable SET name=$(name),description=$(description) WHERE ID=$RESULT.id;
END
TDL has the possibility to include files for reusing subroutines or subroutine templates in different modules. The keyword INCLUDE followed by the relative path of the TDL file without the extension .tdl includes the declarations of that file. The declarations in the included file are treated as if they had been made in the including file instead.
The following example shows the use of include. We assume that the subroutine template insertIntoTree of the example before is defined in a separate include file treeOperations.tdl located in the same folder as the TDL program.
INCLUDE treeOperations

TRANSACTION addTag
BEGIN
    DO insertIntoTree<TagTable>( $(parentID) )
    DO UPDATE TagTable SET name=$(name),description=$(description) WHERE ID=$RESULT.id;
END
TDL defines hooks to add function calls for auditing transactions. An audit call is a form function call with a structure built from the transaction input and some database results. An auditing function call can be marked as critical, so that the final commit depends not only on the transaction success but also on the success of the auditing function call. The following two examples show equivalent audit calls: one with the function call syntax for calls with a flat structure (only atomic parameters) as parameter, and one with the parameter built from the result structure of an executed BEGIN..END block. The latter can be used for audit function calls with a more complex parameter structure.
TRANSACTION doInsertUser
BEGIN
    DO INSERT INTO Users (name) values ($(name));
    DO SELECT id FROM Users WHERE name = $(name);
END
AUDIT CRITICAL auditUserInsert( $RESULT.id, $(name) )
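The equivalent declaration with a BEGIN..END block building the audit parameter structure could look like the following sketch; building the structure with INTO ... PRINT is an assumption based on the result block instructions described earlier:

```
TRANSACTION doInsertUser
BEGIN
    DO INSERT INTO Users (name) values ($(name));
    DO SELECT id FROM Users WHERE name = $(name);
END
AUDIT CRITICAL auditUserInsert WITH
BEGIN
    INTO id PRINT $RESULT.id;
    INTO name PRINT $(name);
END
```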
You can write functions for the logic tier of Wolframe in languages based on .NET (http://www.microsoft.com/net), for example C# and VB.NET. Because .NET based libraries can only be called by Wolframe as a compiled and not as an interpreted language, you have to build a .NET assembly out of a group of function implementations before using it. There are further restrictions on a .NET implementation. We will discuss all of them, so that you should be able to write and configure .NET assemblies for use in Wolframe on your own after reading this chapter.
For enabling .NET you have to declare the loading of the module 'mod_command_dotnet' in the main section of the server configuration file.
Module mod_command_dotnet
For the configuration of the .NET assemblies to be loaded, see section 'Configure .NET Modules'.
In .NET the building blocks for functions called by Wolframe are classes and method calls. The way of defining callable items for Wolframe is restricted either due to the current state of the Wolframe COM/.NET interoperability implementation or due to general or version dependent restrictions of .NET objects exposed via COM/.NET interop. We list here the restrictions:
The methods exported as functions for Wolframe must not be defined in a nested class. They should be defined in a top level class without namespace. This is a restriction imposed by the current development state of Wolframe.
The class must be derived from an interface in which all exported methods are declared.
The methods must not be static because COM/.NET interop, as far as we know, cannot cope with static method calls. Even if a method's nature is static, it has to be defined as an ordinary method call.
Functions callable from Wolframe take an arbitrary number of arguments as input and return a structure (struct) as output. The named input parameters referencing atomic elements or complex structures form the input structure of the Wolframe function. A Wolframe function called with a structure containing the elements "A" and "B" is implemented in .NET as a function taking two arguments with the names "A" and "B". Both "A" and "B" can represent either atomic elements or arbitrarily complex structures. .NET functions that need to call global Wolframe functions, for example to perform database transactions, need to declare a ProcProvider interface from the Wolframe namespace as an additional parameter. We will describe the ProcProvider interface in a separate section of this chapter.
The following simple example without provider context is declared without marshalling and introspection tags. It can therefore not be called by Wolframe. We explain later how to make it callable. The example just illustrates the structure of the exported object with its interface (example C#):
using System;
using System.Runtime.InteropServices;

public struct Address
{
    public string street;
    public string country;
};

public interface FunctionInterface
{
    Address GetAddress( string street, string country);
}

public class Functions : FunctionInterface
{
    public Address GetAddress( string street, string country)
    {
        Address rt = new Address();
        rt.street = street;
        rt.country = country;
        return rt;
    }
}
Wolframe itself is not a .NET application. Therefore it has to call .NET functions via COM/.NET interop interface of a hosted CLR (Common Language Runtime). To make functions written in .NET callable by Wolframe, the following steps have to be performed:
First the assemblies with the functions exported to Wolframe have to be built COM visible.
To make the .NET functions called from Wolframe COM visible, you have to tick the switch "Make assembly COM visible" under "Properties/Assembly Information". Furthermore, every object and method that is part of the exported API (including objects used as parameters) has to be tagged in the source as COM visible with [ComVisible(true)].
Each object that is part of the exported API has to be tagged with a global unique identifier (Guid) in order to be addressable. Modules with .NET functions have to be globally registered, and the objects need to be identified by the Guid because that is the only way to make the record info structure visible for Wolframe. The record info structure is needed to serialize/deserialize .NET objects from another interpreter context that is not registered for .NET. There are many ways to create a Guid; an object is tagged like this: [Guid("390E047F-36FD-4F23-8CE8-3A4C24B33AD3")].
For marshalling function calls correctly, Wolframe needs tags for every parameter and member of a sub structure of a parameter of methods exported as functions. The following table lists the supported types and their marshalling tags:
Table 5.2. Marshalling Tags
.NET Type | Marshalling Tag |
---|---|
I2 | [MarshalAs(UnmanagedType.I2)] |
I4 | [MarshalAs(UnmanagedType.I4)] |
I8 | [MarshalAs(UnmanagedType.I8)] |
UI2 | [MarshalAs(UnmanagedType.UI2)] |
UI4 | [MarshalAs(UnmanagedType.UI4)] |
UI8 | [MarshalAs(UnmanagedType.UI8)] |
R4 | [MarshalAs(UnmanagedType.R4)] |
R8 | [MarshalAs(UnmanagedType.R8)] |
BOOL | [MarshalAs(UnmanagedType.BOOL)] |
string | [MarshalAs(UnmanagedType.BStr)] |
RECORD | no tag needed |
array of structures | [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_RECORD)] |
array of strings | [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_BSTR)] |
array of XX (XX=I2,I4,I8,..) | [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_XX)] |
Decimal floating point and numeric types (DECIMAL) are not yet supported, but will soon be available.
The following C# module definition repeats the example introduced above with the correct tagging for COM visibility and introspection:
```csharp
using System;
using System.Runtime.InteropServices;

[ComVisible(true)]
[Guid("390E047F-36FD-4F23-8CE8-3A4C24B33AD3")]
public struct Address
{
    [MarshalAs(UnmanagedType.BStr)] public string street;
    [MarshalAs(UnmanagedType.BStr)] public string country;
};

[ComVisible(true)]
public interface FunctionInterface
{
    [ComVisible(true)]
    Address GetAddress(
        [MarshalAs(UnmanagedType.BStr)] string street,
        [MarshalAs(UnmanagedType.BStr)] string country);
}

[ComVisible(true)]
[ClassInterface(ClassInterfaceType.None)]
public class Functions : FunctionInterface
{
    public Address GetAddress(
        [MarshalAs(UnmanagedType.BStr)] string street,
        [MarshalAs(UnmanagedType.BStr)] string country)
    {
        Address rt = new Address();
        rt.street = street;
        rt.country = country;
        return rt;
    }
}
```
To make the API introspectable by Wolframe, we have to create a TLB (type library) file from the assembly (DLL) after the build. The type library has to be recreated every time the module interface (API) changes. The type library is created with the program tlbexp. All created type library (.tlb) files that will be loaded with the same runtime environment have to be copied into the same directory. They will be referenced for introspection in the configuration. The configuration of .NET will be explained later.
The type library created with tlbexp also has to be registered. For this you call the program regtlibv12 with your type library file (.tlb file) as argument. The type library registration has to be repeated whenever the module interface (API) changes.
Wolframe does not accept local assemblies. In order to be addressable over the type library interface, assemblies need to be put into the global assembly cache (GAC). Unfortunately this has to be repeated every time the assembly binary changes; there is no way around it. For the registration in the GAC we call the program gacutil /if <assemblypath> with the assembly path <assemblypath> as argument. The command gacutil has to be called from an administrator command line. Before calling gacutil, assemblies have to be strongly signed. We refer to the MSDN documentation for how to sign an application.
We also have to register the types declared in the assembly to enable Wolframe to create these types. An example could be a provider function returning a structure that is called from a Wolframe .NET function. The structure returned there has to be built in an unmanaged context; in order to be valid in the managed context, the type has to be registered. For the registration of the types in an assembly we call the program regasm <assemblypath> with the assembly path <assemblypath> as argument. The command regasm has to be called from an administrator command line.
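Taken together, the deployment steps described above form the following command sequence, to be run from an administrator command prompt after the assembly has been strongly signed. The assembly name Functions.dll and the type library path are examples only; adapt them to your project:

```
:: create the type library and register it (repeat whenever the API changes)
tlbexp Functions.dll /out:programs\typelibrary\Functions.tlb
regtlibv12 programs\typelibrary\Functions.tlb

:: install the strongly signed assembly into the GAC
:: (repeat whenever the assembly binary changes)
gacutil /if Functions.dll

:: register the types declared in the assembly
regasm Functions.dll
```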
Wolframe functions in .NET calling globally defined Wolframe functions need to declare the processor provider interface as an additional parameter. The processor provider interface is defined as follows (example C#):
```csharp
namespace Wolframe
{
    public interface ProcProvider
    {
        object call(
            [In] string funcname,
            [In] object argument,
            [In] Guid resulttype);
    }
}
```
To use it we have to include the reference to the assembly WolframeProcessorProvider.DLL. The interface defined there has a method call taking three arguments: the name of the function to call, the object to pass as argument, and the Guid of the object type to return. The returned object will be created with the help of the registered Guid and can be cast to the type with this Guid.
The following example shows the usage of a Wolframe.ProcProvider call. The method GetUserAddress is declared as a Wolframe function requiring the processor provider context as an additional argument and taking one object of type User as argument, named usr. The example function implementation redirects the call to the global Wolframe function named GetAddress, returning an object of type Address (example C#):
```csharp
public Address GetUserAddress( Wolframe.ProcProvider provider, User usr )
{
    Address rt = (Address)provider.call( "GetAddress", usr, typeof(Address).GUID );
    return rt;
}
```
The objects involved in this example need no further tagging, because the provider context and plain structures (struct) need no additional marshalling tags.
.NET modules are grouped together in a configuration block that specifies the configuration of the Microsoft Common Language Runtime (CLR) used for .NET interop calls. The configuration block has the header runtimeEnv dotNET and configures the version of the runtime loaded (clrversion) and the path where the type library (.tlb) files can be found (typelibpath). With the assembly definitions you declare the registered assemblies to load.
```
RuntimeEnv dotNet
{
    clrversion "v4.0.30319"
    typelibpath programs/typelibrary
    assembly "Functions, Version=1.0.0.30, Culture=neutral, PublicKeyToken=1c1d731dc6e1cbe1, processorArchitecture=MSIL"
    assembly "Utilities, Version=1.0.0.27, Culture=neutral, PublicKeyToken=1c1f723cc51212ef, processorArchitecture=MSIL"
}
```
Table 5.3. Attributes of assembly declarations
Name | Description |
---|---|
<no identifier> | The first element of the assembly definition does not have an attribute identifier. The value is the name of the assembly (and also of the type library) |
Version | 4 element (Major.Minor.Build.Revision) version number of the assembly. This value is defined in the assembly info file of the assembly project. |
Culture | For Wolframe applications until now always "neutral". Server-side functionality in Wolframe is not yet culture dependent. |
PublicKeyToken | Public key token values for signed assemblies. See next section how to set it. |
processorArchitecture | Not explained here. On ordinary Windows .NET platforms it usually has the value "MSIL". Read the MSDN documentation to dig deeper. |
We already found out that Wolframe .NET modules have to be strongly signed. Each strongly signed assembly has such a public key token that has to be used as attribute when referencing the assembly.
We can get the PublicKeyToken of the assembly by calling sn -T <assemblypath> from the command line (cmd), with <assemblypath> being the path of the assembly. The printed value is the public key token to insert as the attribute value of PublicKeyToken in the Wolframe configuration for each .NET assembly.
Languages of .NET called via the CLR are strongly typed. This means that the input and output of a function are already validated to be of a strictly defined structure, so an additional validation by passing the input through a form might not be needed. Validation with .NET data structures is weaker than, for example, XML validation with forms defined in a schema language, but only where distinguishing XML attributes from content elements is an issue. See the documentation of the standard command handler for how validation can be skipped with the attribute SKIP.
You can write functions for the logic tier of Wolframe in the Python programming language (http://www.python.org).
The implementation of Python calls is not yet available. But Wolframe will provide Python functions soon.
You can write functions for the logic tier of Wolframe with Lua. Lua is a scripting language designed, implemented, and maintained at PUC-Rio in Brazil by Roberto Ierusalimschy, Waldemar Celes and Luiz Henrique de Figueiredo (see http://www.lua.org/authors.html). A description of Lua is not provided here. For an introduction into programming with Lua see http://www.lua.org. The official manual, which is also available as a book, is very good. Wolframe introduces some Lua interfaces to access input and output and to execute functions.
For enabling Lua you have to declare the loading of the module 'mod_command_lua' in the main section of the server configuration file.
```
Module mod_command_lua
```
Each Lua script referenced has to be declared in the Processor section of the configuration with program <sourcefile>. The script is recognized as a Lua script by the file extension ".lua"; files without this extension cannot be loaded as Lua scripts.
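Putting the two declarations together, the relevant configuration fragments might look as follows. This is a sketch only: the script name script.lua is a placeholder, and the exact surrounding configuration depends on your setup:

```
LoadModules
{
    Module mod_command_lua
}
Processor
{
    program script.lua
}
```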
For Lua we do not have to declare anything in addition to the Lua script. If you configure a Lua script as program, all global functions declared in this script become global form functions. To avoid name conflicts, you should declare private functions of the script as local.
Wolframe lets you access objects of the global context through
a library called provider
offering the following functions:
Table 5.4. Methods of the 'provider' library
Name | Parameter | Returns |
---|---|---|
form | Name of the form | An instance of the form |
type | Type name and initializer list | A constructor function to create a value instance of this type |
formfunction | Name of the function | Form function defined in a Wolframe program or module |
document | Content string of the document to process | Returns an object of type "document" that allows the processing of the contents passed as argument. See description of type "document" |
authorize | 1) authorization function 2) (optional) authorization resource | Calls the specified authorization function and returns true on success (authorized) and false on failure (authorization denied or error) |
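The following sketch shows how these provider methods are typically used together from a script. It assumes a configured Wolframe environment; the function, form and resource names are hypothetical:

```lua
-- check authorization first; "dbaccess" and "customer" are hypothetical names
if not provider.authorize( "dbaccess", "customer" ) then
    error( "access denied" )
end

-- fetch a form and a form function by name (names are hypothetical)
local frm  = provider.form( "employee" )
local func = provider.formfunction( "insertEmployee" )
```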
Wolframe lets us extend the type system, consisting of the Lua basic data types, with our own types. We can create atomic data types defined in a module or in a DDL datatype definition program (.wnmp file). For this you call the type method of the provider with the type name as first argument plus the type initializer argument list as additional parameters. The function returns a constructor function that can be called with the initialization value as argument to get a value instance of this type. The name of the type can refer to one of the following:
Table 5.5. List of Atomic Data Types
Class | Initializer Arguments | Description |
---|---|---|
Custom data type | Custom Type Parameters | A custom data type defined in a module with arithmetic operators and methods |
Normalization function | Dimension parameters | A type defined as normalization function in a module |
DDL data type | (no arguments) | A normalizer defined as sequence of normalization functions in a .wnmp source file |
Data type 'bignumber' | (no arguments) | Arbitrary precision number type |
Data type 'datetime' | (no arguments) | Data type representing date and time down to a granularity of microseconds |
The data type 'datetime' is used as interface for date time values.
Table 5.6. Methods of 'datetime'
Method Name | Arguments | Description |
---|---|---|
<constructor> | year, month, day, hour, minute, second, millisecond, microsecond | Creates a date and time value with a granularity down to microseconds |
<constructor> | year, month, day, hour, minute, second, millisecond | Creates a date and time value with a granularity down to milliseconds |
<constructor> | year, month, day, hour, minute, second | Creates a date and time value |
<constructor> | year, month, day | Creates a date value |
year | (no arguments) | Return the value of the year |
month | (no arguments) | Return the value of the month (1..12) |
day | (no arguments) | Return the value of the day in the month (1..31) |
hour | (no arguments) | Return the value of the hour in the day (0..23) |
minute | (no arguments) | Return the value of the minute (0..59) |
second | (no arguments) | Return the value of the second (0..63 : 59 + leap seconds) |
millisecond | (no arguments) | Return the value of the millisecond (0..1023) |
microsecond | (no arguments) | Return the value of the microsecond (0..1023) |
__tostring | (no arguments) | Return the date as string in the format YYYYMMDD, YYYYMMDDhhmmss, YYYYMMDDhhmmssll or YYYYMMDDhhmmssllcc, depending on constructor used to create the date and time value. |
The data type 'bignumber' is used to reference fixed point BCD numbers with a precision of 32767 digits between -1E32767 and +1E32767.
Table 5.7. Methods of 'bignumber'
Method name | Arguments | Description |
---|---|---|
<constructor> | number value as string | Creates a bignumber from its string representation |
<constructor> | number value | Creates a bignumber from a lua number value (double precision floating point number) |
precision | (no arguments) | Return the number of significant digits in the number |
scale | (no arguments) | Return the number of fractional digits (may be negative, may be bigger than precision) |
digits | (no arguments) | Return the significant digits in the number |
tonumber | (no arguments) | Return the number as a Lua number value (double precision floating point number) with possible loss of accuracy |
__tostring | (no arguments) | Return the big number value as string (not normalized). |
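As a sketch (assuming a configured Wolframe environment), the two user data types could be obtained and used like this:

```lua
-- get constructor functions for the user data types
local datetime = provider.type( "datetime" )
local bignum   = provider.type( "bignumber" )

-- construct values (see the method tables above for the argument forms)
local d = datetime( 2014, 3, 31, 12, 30, 5 )  -- year, month, day, hour, minute, second
local y = d:year()
local s = tostring( d )     -- YYYYMMDDhhmmss format, per the __tostring description

local n = bignum( "12345.6789" )  -- from string representation
local p = n:precision()           -- number of significant digits
```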
Lua provides an interface to the iterators internally used to couple objects and functions. They are accessible as iterator function closures in Lua. They look similar to Lua iterators but are not: you should not mix them with the standard Lua iterators even though the semantics are similar. Filter interface iterators do not return nodes of the tree as subtree objects but only the node data, in the order of a pre-order traversal. You can recursively iterate on the tree and build the object during traversal if you want. The returned elements of the filter interface iterators are tuples with the following meaning:
Table 5.8. Filter interface iterator elements
Tuple First Element | Tuple Second Element | Description |
---|---|---|
NIL/false | string/number | Open (tag is second element) |
NIL/false | NIL/false | Close |
Any non NIL/false | string/number | Attribute assignment (value is first, tag is second element) |
string/number | NIL/false | Content value (value is first element) |
Wolframe lets you access filter interface
iterators through a library called iterator
offering the following functions:
Table 5.9. Methods of the 'iterator' library
Name | Parameter | Returns |
---|---|---|
scope | serialization iterator (*) | An iterator restricted on the subnodes of the last visited node (**) |
(*) See section "serialization iterator"
(**) If iterator.scope is called, all elements of the returned iterator have to be visited in order to continue the iteration with the original iterator on which iterator.scope was called.
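Based on the tuple protocol of table 5.8, a recursive traversal that rebuilds a Lua table from a filter interface iterator could be sketched as follows. This is a sketch only: it assumes a Wolframe execution context that provides such an iterator, and it assumes the iterator yields false (not nil) as the empty placeholder so that the generic for loop does not terminate early:

```lua
-- rebuild a table from a filter interface iterator (pre-order traversal)
local function build( itr )
    local rt = {}
    for value, tag in itr do
        if tag then
            if value then
                rt[ tag ] = value  -- attribute assignment (value, tag)
            else
                -- open element (false, tag): recurse into the subtree;
                -- the scoped iterator must be consumed completely (see **)
                rt[ tag ] = build( iterator.scope( itr ) )
            end
        elseif value then
            rt[ #rt + 1 ] = value  -- content value (value, false)
        else
            break                  -- close (false, false)
        end
    end
    return rt
end
```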
Besides the provider library Wolframe defines the following objects global in the script execution context:
Name | Description |
---|---|
logger | object with methods for logging or debugging |
The provider function provider.form() returns an empty instance of a form. It takes the name of the form as a string argument. If you for example have a form configured called "employee" and you want to create an employee object from a Lua table, you call:
```lua
bcf = provider.form( "employee" )
bcf:fill( {surname='Hans', name='Muster', company='Wolframe'} )
```
The first line creates the data form object. The second line fills the data into the data form object.
The form method fill takes a second, optional parameter. Passing "strict" as second parameter enforces a strict validation of the input against the form, meaning that attributes are checked to be attributes (when using XML serialization) and non-optional elements are checked to be initialized. Passing "complete" as second parameter forces non-optional elements to be checked for initialization but does not distinguish between attributes and content values. "relaxed" is the default and checks only the existence of filled-in values in the form.
Given the following validation form in simple form DDL syntax (see chapter "Forms"):
```
FORM Employee
    -root employee
{
    ID       !@int    ; Internal customer id (mandatory)
    name     !string  ; Name of the customer (mandatory)
    company  string   ; Company he is working for (optional)
}
```
the call of fill
in the following piece of code will raise an error because some elements
of the form ('ID' and 'name') are missing in the input:
```lua
bc = provider.form( "employee" ):fill( {company='Wolframe'}, "strict" )
```
To access the data in a form there are two form methods available: get() returns a filter interface iterator on the form data, and value() returns the form data as a Lua data structure (a Lua table or atomic value).
For calling transactions or built-in functions loaded as modules, the Lua layer defines the concept of form functions. The provider function provider.formfunction with the name of the function as argument returns a Lua function. This function takes a table or a filter interface iterator as argument and returns a data form structure. The data in the returned form structure can be accessed with get(), which returns a filter interface iterator on the content, and value(), which returns a Lua table or atomic value.
If you for example have a transaction called "insertEmployee", defined in a transaction description program file declared in the configuration, and you want to call it with the 'employee' object defined above as input, you do:
```lua
f = provider.formfunction( "insertEmployee" )
res = f( {surname='Hans', name='Muster', company='Wolframe'} )
t = res:value()
output:print( t[ "id" ] )
```
The first line creates the function called "insertEmployee" as Lua function. The second calls the transaction, the third creates a Lua table out of the result and the fourth selects and prints the "id" element in the table.
This is a list of all objects and functions declared by Wolframe:
Table 5.10. Data forms declared by DDL
Method Name | Arguments | Returns | Description |
---|---|---|---|
get | (no arguments) | filter interface iterator (*) | Returns a filter interface iterator on the form elements |
value | (no arguments) | Lua table | Returns the contents of the data form as Lua table or atomic value |
__tostring | (no arguments) | string | String representation of form for debugging |
name | (no arguments) | string | Returns the global name of the form. |
fill | Lua table or filter interface iterator (*), optional validation mode (**) | the filled form (for concatenation) | Validates input and fills the input data into the form. |
(*) See section "filter interface iterator"
(**) "strict" (full validation), "complete" (only check for all non optional elements initialization) or "relaxed" (no validation except matching of input to elements)
Table 5.11. Data forms returned by functions
Method Name | Returns | Description |
---|---|---|
get | filter interface iterator (*) | Returns a filter interface iterator on the form elements |
value | Lua table or atomic value | Returns the contents of the data form as Lua table or atomic value |
__tostring | string | String representation of form for debugging |
(*) See section "filter interface iterator"
Table 5.12. Document
Method Name | Arguments | Description |
---|---|---|
docformat | - | Returns the format of the document {'XML','JSON',etc..} |
as | filter and/or document type table | Attaches a filter to the document to be used for processing |
doctype | - | Returns the document type of the content. For retrieving the document type you first have to define a filter. |
metadata | - | Returns the meta data structure of the content. For retrieving the document meta data you first have to define a filter. |
value | - | Returns the contents of the document as Lua table or atomic value. The method 'table' does the same but is considered to be deprecated. |
table | - | Deprecated. Does return the same as the method 'value' |
form | - | Returns the contents of the document as a filled form instance |
get | - | Returns a filter interface iterator (*) on the form elements |
(*) See section "filter interface iterator"
Table 5.13. Logger functions
Method Name | Arguments | Description |
---|---|---|
logger.printc | arbitrary list of arguments | Print arguments to standard console output |
logger.print | loglevel (string) plus arbitrary list of arguments | log argument list with defined log level |
Table 5.14. Global functions
Function Name | Arguments | Description |
---|---|---|
provider.form | name of form (string) | Returns an empty data form object of the given type |
provider.formfunction | name of function (string) | Returns a lua function to execute the Wolframe function specified by name |
provider.type | name of data type (string) | Returns a constructor function for the data type given by name. The name specifies either a custom data type or a normalization function as used in forms or one of the additional userdata types 'datetime' or 'bignumber'. |
provider.document | Content string of the document to process | Returns an object of type "document" that allows the processing of the contents passed as argument. See description of type "document" |
(*) See section "filter interface iterator"
(**) The filter interface iterator of a defined scope must be consumed completely before consuming anything of the parent iterator. Otherwise it may lead to unexpected results because they share some part of the iterator state.
You can write functions for the logic tier of Wolframe with C++. Because native C++ is by nature a compiled and not an interpreted language, you have to build a module out of your function implementation.
For native C++ you need a C++ build system with compiler and linker or an integrated development environment for C++.
Form functions declared in C++ have two arguments: the output structure to fill, passed by reference as the first argument, and the input structure, passed as the second. The input structure must not be modified by the callee; in C++ this means it is passed as a const reference. The function returns an int that is 0 on success, with any other value indicating an error code. The function may also throw a runtime error exception in case of an error.
The following example shows a function declaration. The declaration is not complete because the input and output structures need to be declared with some additional attributes needed for introspection; we will explain this in the following section. The function takes a structure as input and writes the result into an output structure. In this example input and output type are the same, but this is not required; it is just the same here for simplicity.
The elements of the function declaration are put into a structure with four elements. The typedef for the InputType and OutputType structures is required because the input and output types should be recognizable without complicated type introspection templates (template based introspection might cause spurious and hard to understand error messages when building the module). The function name returns the name of the function that identifies it in the Wolframe global scope. The static function exec with this signature refers to the function implementation.
```cpp
// ... PUT THE INCLUDES FOR THE "Customer" STRUCTURE DECLARATION HERE !

struct ProcessCustomer
{
    typedef Customer InputType;
    typedef Customer OutputType;

    static const char* name()  {return "process_customer";}
    static int exec( const proc::ProcessorProvider* provider,
                     OutputType& res, const InputType& param);
};
```
For defining input and output parameter structures in C++ you have to define the structure and its serialization description. The serialization description is a static function getStructDescription without arguments, returning a const structure that describes which element names to bind to which structure members.
The following example shows a form function parameter structure defined in C++.
Declares the structure and the serialization description of the structure. Structures may contain structures with their own serialization description.
```cpp
#include "serialize/struct/structDescriptionBase.hpp"
#include <string>

namespace _Wolframe {
namespace example {

struct Customer
{
    int ID;                      // Internal customer id
    std::string name;            // Name of the customer
    std::string canonical_Name;  // Customer name in canonical form
    std::string country;         // Country
    std::string locality;        // Locality

    static const serialize::StructDescriptionBase* getStructDescription();
};

}}//namespace
```
This declares 'ID' as attribute and name, canonical_Name, country and locality as tags. The '--' operator marks the end of the attributes section and the start of the content section.
```cpp
#include "serialize/struct/structDescription.hpp"

using namespace _Wolframe;

namespace {
struct CustomerDescription :public serialize::StructDescription<Customer>
{
    CustomerDescription()
    {
        (*this)
            ("ID", &Customer::ID)
            --
            ("name", &Customer::name)
            ("canonical_Name", &Customer::canonical_Name)
            ("country", &Customer::country)
            ("locality", &Customer::locality)
        ;
    }
};
} // anonymous namespace

const serialize::StructDescriptionBase* Customer::getStructDescription()
{
    static CustomerDescription rt;
    return &rt;
}
```
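The chaining syntax of the serialization description works because operator() registers an element and returns a reference to the description object itself, while the '--' operator flips an "attributes done" flag. The following self-contained toy model, independent of Wolframe (the names ToyDescription, ToyCustomer and buildExample are invented here), illustrates the mechanism:

```cpp
#include <string>
#include <vector>

// Toy model of the chaining interface used by serialize::StructDescription.
struct ToyDescription
{
    std::vector<std::string> attributes;  // elements declared before '--'
    std::vector<std::string> content;     // elements declared after '--'
    bool inContent;

    ToyDescription() :inContent(false) {}

    // operator() registers one element and returns *this, enabling chaining
    template <typename Member>
    ToyDescription& operator()( const std::string& name, Member )
    {
        if (inContent) content.push_back( name );
        else attributes.push_back( name );
        return *this;
    }

    // postfix '--' switches from the attributes section to the content section
    ToyDescription& operator--( int )
    {
        inContent = true;
        return *this;
    }
};

struct ToyCustomer { int ID; std::string name; };

// build a description with the same chaining syntax as in the text above
ToyDescription buildExample()
{
    ToyDescription d;
    (d)
        ("ID", &ToyCustomer::ID)
        --
        ("name", &ToyCustomer::name)
    ;
    return d;
}
```

Here ("ID", ...) lands in the attributes list, while ("name", ...), coming after the '--', lands in the content list, mirroring how the real StructDescription distinguishes XML attributes from content elements.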
Now we have all pieces together to build a loadable Wolframe module with our example C++ function. The following example shows what you have to declare in the main module source file.
The module declaration needs to include appDevel.hpp and of course all headers with the function and data structure declarations needed. The module starts with the header macro WF_MODULE_BEGIN, taking an identifier and a short description of the module. What follows are the function declarations, declared with the macro WF_FORM_FUNCTION. This macro has the following arguments in this order:
Name | Description |
---|---|
NAME | identifier of the function |
FUNCTION | implementation of the function |
OUTPUT | output structure of the function |
INPUT | input structure of the function |
The declaration list is closed with the parameterless footer macro WF_MODULE_END. The following example shows a module declaration:
```cpp
#include "appDevel.hpp"

// ... PUT THE INCLUDES FOR THE "ProcessCustomer" FUNCTION DECLARATION HERE !
#include "customersFunction.hpp"

using namespace _Wolframe;

WF_MODULE_BEGIN( "ProcessCustomerFunction", "process customer function")
 WF_FORM_FUNCTION("process_customer",ProcessCustomer::exec,Customer,Customer)
WF_MODULE_END
```
For building the module we have to include all modules introduced here and link it against the Wolframe serialization library (wolframe_serialize) and the Wolframe core library (wolframe).
The built module can be loaded like the other modules by declaring it in the LoadModules section of the Wolframe configuration. Simply list it there with module <yourModuleName>, with <yourModuleName> being the name of or path to your module.
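For the example module built above, the declaration might look like this; the module name mod_process_customer is hypothetical and depends on how you named your build output:

```
LoadModules
{
    module mod_process_customer
}
```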
C++ is a strongly typed language. This means that the input and output of a function are already validated to be of a strictly defined structure, so an additional validation by passing the input through a form might not be needed. The constructs used to describe Wolframe structures in native C++ are even capable of describing attributes as used in XML (see section 'Input/Output Data Structures' above). See the documentation of the standard command handler for how validation can be skipped with the attribute SKIP.
Copyright © 2014 - Project Wolframe - All Rights Reserved