This chapter provides the following information for Caché 2007.1:
New and Enhanced Features for Caché 2007.1
The following features have been added to Caché for the 2007.1 release:
Call In / Call Out Enhancements
The following call in / call out enhancements are included in this project:
Applications can now call into Caché as a DLL / shared library, rather than as an executable, eliminating the need for static linking. This enhancement will be available for all platforms except OpenVMS.
Call in is now multithreaded on some platforms. This enables multithreaded applications to call into Caché, with multiple threads effectively executing simultaneously but independently within a single Caché process. Initially, this capability will be available on the following platforms: Windows (32-bit, 64-bit Intel Extended Memory, 64-bit AMD, not Itanium), Linux (32-bit, 64-bit Intel Extended Memory, 64-bit AMD, not Itanium), and Solaris (64-bit SPARC and 64-bit AMD). Other platforms may be added in the future.
For all other platforms, Caché is now thread safe. It can be called safely from multithreaded applications, with Caché automatically providing any necessary synchronization. While one application thread is executing within a Caché process, other threads are blocked from executing within Caché, although they may be executing other (non-Caché) application code.
New Error Handling Syntax
This project adds new Caché error handling syntax similar to that in Java, C++ and VB. It utilizes three commands: TRY, to identify the scope of an error handler; CATCH, to identify the code to execute on an error; and THROW, to explicitly signal an error.
For those familiar with other language implementations, FINALLY is not supported.
With this project, the maximum length of local and global strings is extended to 3.6 million characters. Storage of long strings is only supported for databases using 8KB blocks.
Enabling support for long strings is done via the Management Portal at [Home] > [Configuration] > [Advanced Settings]
. The category is Miscellaneous
and the setting name is EnableLongStrings
LDAP User Management
This enhancement enables user authentication and roles to be managed outside of Caché using LDAP.
The enhancement enables user authentication to be carried out using site-specific Caché code.
This enhancement adds extended SQL row-level security capabilities to Caché. Essentially, a column in the table contains a list of roles for each row. When a query is run, the user must have at least one of the row’s roles in order to see that row. Typically, the list of roles is calculated (based on other data in the table) and a method is automatically called to perform this calculation whenever a row is inserted or updated. This security column can be indexed to provide row-based security with minimum performance impact.
Changes to the state of journaling (both system-wide and process-specific) are now recorded in the audit log.
In previous releases, the Caché SQL Gateway required an ODBC driver for each foreign database on each Caché platform. While this is not an issue for Windows, where good ODBC drivers are available for all databases supported by the Gateway, it has been an ongoing problem for other platforms, particularly UNIX®.
To address this, with this release we will begin using JDBC, rather than ODBC, for Gateway connections. Because all of the databases supported by the Gateway have platform-independent JDBC drivers, this will significantly reduce the testing and deployment of the Gateway and Gateway-based applications.
SQL Gateway is supported on all Caché platforms except OpenVMS. For Caché 2007.1, we will support both the ODBC-based Gateway (on the same platforms as 5.2) and the JDBC-based Gateway. In later releases, for all platforms other than Windows, only the JDBC-based Gateway will be supported. On Windows systems, both the ODBC-based and JDBC-based Gateway will continue to be supported.
This enhancement provides an object binding to the Objective-C language on Mac systems.
New Distribution Mechanism for C++ Binding
With this release we are changing the way that the C++ binding is packaged and delivered, in order to reduce our dependence on specific C++ compilers and libraries.
Zen is an extensible framework for rapid development of Web applications. Pages are defined via high-level XML-based definitions using a rich set of components. Zen includes a strong, easy-to-use security model; built-in support for multi-lingual applications; server- and client-side event handling; and very good extensibility.
This release includes performance enhancements for SQL queries with multiple aggregates or with a combination of aggregates and GROUP BY.
Left outer joins can now be based on non-equality conditions, e.g. greater than or %startswith.
Text search enhancements include %CONTAINSTERM, use of indices for processing %SIMILARITY, and multi-term substitutions in the thesaurus facility.
This release includes enhancements in T-SQL support, including compilation and runtime performance improvements, support for statement-level triggers, exposure of Sybase-compatible system tables, improved error handling, and some additional T-SQL syntax support.
Routine Performance Enhancements
COS routine management has been extensively revised in 2007.1. As a result, the way routines are managed by Caché is now much more efficient. The amount of overhead spent invoking a new routine may be up to an order of magnitude less in this release. Furthermore, in prior releases, the number of routine buffers could become a source of contention in systems with dozens of cpus; this is nowfar less likely.
Among the changes is the addition of a new per-process cache identifying recently used routines. The cache speeds up the process of locating routines which are used frequently within the process. In addition, the cache changes the way routine buffers are managed by the system.
In prior releases, only the actual routine buffers being executed were protected against change. All others could be modified. A consequence of this was that prior routines on the call stack could be changed (for example, by recompilation). To illustrate the issue, suppose a routine, A, was executing and then called routine, B. When B started running, A was available to be changed. If A was re-compiled while B was executing, the buffer holding A became obsolete and was released. When B attempted to return to A, Caché would notice this fact and report an <EDITED> error.
With the per-process cache in place, the <EDITED> error no longer occurs. This is because all routines on the call stack of any process have their current versions locked. So the associated buffers do not become obsolete. B always returns to the proper version of its caller, A.
To carry the examination further, if B returns to A and, while A is executing, B is recompiled. The next call from A to B will execute the re-compiled version of B. Similarly, if A is re-compiled while B is executing and B then calls A, the new version of A will be invoked. When the new A returns to B and B subsequently returns, it will return to the prior version of A that was preserved in the routine buffer.
The initial size of the per-process cache (called the "routine vector") is set at system startup. The default is 256. This number can be changed via the function, $ZU(96,29,<value>)
. The cache can be cleared of all routines not in use with $ZU(96,31)
. Note that routines are cached by namespace; changing namespaces clears the cache. Changing the routine search path via $zu(39,...)
also clears the cache.
Because the per-process cache causes system routine buffers to be held until they are no longer in use, it is possible for a large collection of processes to consume all available buffers. When this happens, Caché will force the reduction of each of the per-process caches by some amount, freeing unused buffers. This reduction may recur until sufficient buffers are available for sustained operation.
cstat -s<MgrDirectory> -R2048
where <MgrDirectory> is the pathname of the Caché manager directory will display the routine buffer usage for running jobs in the rbuf#
column. If the number of routines in use for most jobs exceeds 30, then the administrator should increase the shared heap size
by 512 bytes times the number of expected simultaneously running jobs.
By default each process may cache up to 256 routines. As the system starts running out of available buffers, it will adjust the number of cached routines allowed per process down by 10% every quarter-second. When the number of cached routines per process is 30% of the originally configured number, Caché will log a message to the console, indicating the system may no longer be running efficiently.
When the number reaches 20% of the original setting, Caché will log a warning; and at the 10% threshold will log a severe
message to the console that indicates the system may hang.
The system will hang when it runs out of routine buffers and cannot free any for use.
The routines held in the cache by dead processes keep the buffers they hold in use until the dead process is removed or the system shuts down.
With this release, a number of internal improvements have been made to the performance and reliability of synchronous journal writes. In addition, system management improvements have been made to enable journal files to be retained on shadow systems for a limited period of time.
The Light C++ Binding is a high-performance object binding that operates in process, i.e. the client application executes in the same process as Caché. With this binding, objects are fetched directly from the database to the client, without the requirement to maintain an open object in memory within Caché. The Light C++ Binding is available for the same platforms as multithreaded call in
With this release, support is added for the CSP Gateway on OpenVMS systems (Alpha and Itanium) running the Apache Web server.
Support for GB18030 Character Set
This release adds support for GB18030, the official character set of the People’s Republic of China in the Simplified Chinese locale (chsw). Since the internal representation used by Caché is Unicode, only those code points in GB18030 that map to Unicode are supported. This includes all the 1, 2 and 4-byte sequences that map to characters in the Basic Multilingual Plane (characters U+0000 to U+FFFF; approximately 64K code points), plus all the 4-byte sequences that map to the additional Unicode planes (U+10000 to U+10FFFF; 1M code points).
This latter range is represented in Caché in UTF-16 as surrogate pairs. This is the same representation used by Microsoft since Windows 2000.
There are approximately 500K code points in GB18030 that don't map to Unicode and are currently unassigned. These code points are not supported by Caché.
With this release, the value of $USERNAME can be used to identify users for licensing purposes. This enables more accurate counting in situations where the IP address cannot be used to reliably identify distinct users.
Large numeric literals (>1E146) are now automatically stored as double values.
A process started with the JOB command can now be terminated by its parent, regardless of security settings.
Visual Studio 2005
This release makes use of Microsoft’s Visual Studio 2005 for Windows client components. This change affects customers who make programmatic access (e.g. using call in) to Caché executables.
We will support the MultiNet implementation of TCP/IP on OpenVMS Alpha systems (but not on OpenVMS Itanium). We may discover and document limitations with this software.
The toolbar has a View WebPage button that allows the developer to see a preview of the page
The toolbar now has Forward and Backward buttons
The options dialog has been revised
The StudioAssist feature now works with selected XML documents, notably Zen
The following platforms are added:
OpenVMS 8.3 for Alpha and Integrity
SUSE Linux Enterprise Server 10
Caché 2007.1 Upgrade Checklist
The purpose of this section is to highlight those features of Caché 2007.1 that, because of their difference in this version, affect the administration, operation, or development activities of existing 5.2 systems.
General upgrade issues are mentioned in the chapter General Upgrade Information
in the Caché Release Notes and Upgrade Checklist
. Those customers upgrading their applications from releases earlier than 5.2 are strongly urged to read the upgrade checklist for earlier versions first. This document addresses only the differences between 2007.1 and 5.2.
This section contains information of interest to those who are familiar with administering prior versions of Caché and wish to learn what is new or different in this area for version 2007.1. The items listed here are brief descriptions. In most cases, more complete descriptions are available elsewhere in the documentation.
Starting with this release, InterSystems is changing the distribution medium for Caché from CDROM to DVD.
You must have a CD reader capable of reading DVDs in order to install this version of Caché.
System Management Portal Changes
In Caché 2007.1, InterSystems has modified and improved the version of the System Management Portal available with Caché 5.2. A summary of the more important changes follows.
EnableLongStrings Configuration Parameter
Caché allocates a fixed amount of space to hold the results of string operations, the string stack. If a string expression exceeds the amount of space allocated, a <STRINGSTACK>
error results. For existing applications, this should not be a problem. However, since this release increases the maximum size of Caché strings
, applications that attempt to use this long strings may experience problems if this is not enabled.
When turned on, the configuration parameter, EnableLongStrings, located at [Home] > [Configuration] > [Advanced Settings]
under the Miscellaneous category, increases the size of the string stack by approximately 50 times to accommodate operations involving larger strings.
This switch is a system-wide setting. A Caché job checks this switch when it starts to see whether it should use long strings or not, and allocates a large (support long strings) or normal (do not support long strings) string stack accordingly. Although a program may change the switch setting using $zu(69, 69)
, it will not affect the current job. Only subsequent jobs will use the changed setting. Once a job allocates a string stack, it cannot change the size later in the same job.
Enabling this configuration parameter may affect Caché performance. Memory is often a precious resource and allocating it to the stack comes at the expense of other uses. Retuning of the system may be needed to maintain performance.
If this parameter is not enabled, then the default maximum length for read operations will remain at 32,767 characters.
Changes in Default Port Choices
This version of Caché chooses the default ports for installation differently from previous releases. The new algorithm is:
If port 1972 is unused, choose it. Otherwise, choose the first available port number equal to or greater than 56773.
Choose the first available port number equal to or greater than 57772.
With this change, Caché now conforms to RFC793.
Increased Maximum String Length
In this release, the maximum length of a Caché string has been increased from 32,767 to 3,641,144 characters. This change is transparent to existing applications. However, those applications that are modified to handle larger strings should consider enabling the EnableLongStrings
This limit applies only to databases with 8KB blocks. For older databases with a 2KB block size, the maximum remains unchanged: 32767 characters.
The maximum string size limit applies only to the size of the resulting string. Other size-related limits remain as before. For example, the length of a line of code given to the compiler is still 4096 characters. The maximum size of routines, determined by a configuration parameter, is 32K or 64K. So the length and number of string literals remains little changed.
Improvements in Routine Handling
Terminal Uses Authentication Setting
In this version of Caché if authentication is enabled (for example, by installing Caché with Normal security), Terminal sessions will start with an authentication prompt such as requesting the OS userid and password.
This class, %SYS.LDAP
, is provided to assist management of systems using the Lightweight Directory Access Protocol (LDAP) by providing an ObjectScript interface to an LDAP database. Callers can use the methods provided to authenticate themselves to the database. Once authenticated, they can add and delete items from the LDAP database, and search for information managed by LDAP.
Changes to Journaling API
An API, %SYS.JournalFile
, has been implemented for purging all journal files except those required for transaction rollbacks or crash recovery:
A crash recovery refers to the journal restore performed as part of Cache startup or cluster failover recovery.
Post-backup journal files are not necessarily preserved. Therefore, if you want to be able to restore databases from backups and subsequent journal files, you should configure the journal purging parameter based on backups and use the regular purging API (PURGE^JRNUTIL).
There is also an API for purging journal files based on criteria different from what is in the Cache configuration:
##class(%SYS.Journal.File).Purge(NDaysOld As %Integer,
NBackupsOld As %Integer) returns %Status
This method purges old journal files based on criteria given taking care not to purge files required for transaction rollbacks or crash recovery. The parameters are:
NDaysOld journal files must be at least this number of days old to be purged
NBackupsOld journal files must be at least this numberof backups old to be purged
If both parameters are specified, only one criterion has to be met (inclusive OR) to qualify a journal file for purging subject to the restriction about rollback and crash recovery.
Generic Memory Heap Configuration API Added
A new API, %SYSTEM.Config.SharedMemoryHeap
, has been added to assist in gathering information about how Caché uses the heap. It also provides the ability to get available generic memory heap and recommended gmheap parameter for configuration. For example, the DisplayUsage
method displays all memory used by each of the system components and the amount of available heap memory:
method returns the recommended amount of heap memory that should be configured in this instance for smooth operation:
This does not include the memory reserved for shadowing when enabled which is (2 * #CPUs + 1)MB, for example, A 4CPU system will require an additional 9MB.
IntegrityList Method Removed
This section holds items of interest to users of specific platforms.
With this release, the default installation directory for Windows has been changed to C:\InterSystems\Cache
For compatibility reasons, the installer no longer allows Caché to be installed into the Program Files
Use Large Page Size for Shared Memory Windows Server SP1
Caché for Windows has been enhanced to make use of Windows large pages for the shared memory section. Large pages are a hardware feature which Windows makes accessible to software applications when allocating shared memory sections starting with Server 2003, SP1. The actual size of the page depends on the processor used. The recognized processors and their large pages sizes are:
Caché will attempt to use large pages whenever the feature is available from Windows and the process starting Caché can acquire the SeLockMemoryPrivilege resource. Performance is enhanced on systems which allocate large buffer pools because they are locked into physical memory and they make more efficient use of the hardware translation buffers. Since large pages are locked into physical memory care must be taken not to request too large a buffer pool; this may not leave enough physical memory for the number of users the machine is intended to support. This condition, not leaving enough memory, could result in worse performance than if the buffer pool had been used normal sized pages.
On Windows this calculation (how much physical memory to leave for users) must take into account the Page Table Entries which Windows creates as part of its memory management. Page Table Entries (PTEs) consume 4 bytes each on a 32-bit system and 8 bytes on 64-bit system. Windows allocates one page table entry for every 4KB of memory. This means that on a 64-bit system dividing the size of the shared memory section by 500 gives approximately the # of bytes Windows allocates for PTEs to map the shared memory section. This memory is private to each process; each process needs to allocate this chunk of memory.
The management of these PTEs can itself become a performance issue since these PTEs are pageable. Care must be taken to ensure that physical memory is available for these PTEs. Problems here may not show up at first as the PTEs can be paged out until the shared memory they reference is accessed so a system may appear to be functioning properly but become dramatically worse as time goes on and demand increases.
On most platforms when Caché starts up and fails to allocate the requested shared memory for global and routine buffers, it retries with a series of requests decreasing the amount of space requested with each iteration. This continues until it either succeeds or reaches the limit on the number of retries. If the latter condition occurs, Caché logs a message and startup aborts.
On Windows the approach occurs in two phases. Caché will first try using large pages, assuming large pages are supported. The startup process attempts to acquire the necessary privilege and run through the request loop. If it succeeds, then it proceeds to use large pages for memory managing memory. If it fails attempting to use large pages, the loop is restarted from the original memory request without using large pages. If the 2nd iteration through the loop fails, Caché fails to start.
Remote Database Pathnames
In prior versions, upgrading would change the drive letter for remote databases to uppercase. This is no longer done. Those sites with Windows ECP servers should examine the pathnames for remote databases in [Home] > [Configuration] > [Remote Databases]
. Those whose drive letter is in uppercase should be changed to lowercase.
In versions of Caché prior to 5.1, daemons always ran as root, and Caché used that privilege to set the UNIX® parameter RLIMIT_FSIZE to unlimited. However, since 5.1, the Caché daemons may run under a different userid. Since only root may set its own limits, the limits cannot be set for a non-root owner.
In addition, when running as a non-root user, Caché failed to set the current (soft) limit to the hard limit. In this case, even if the RLIMIT_FSIZE hard limit was unlimited, if Caché was started from a login user whose soft limit was set to some lower value, the Caché daemons ran with this lower value, and could encounter serious I/O errors attempting to write to CACHE.DAT or CACHE.WIJ.
Cache now checks that the RLIMIT_FSIZE hard limit is unlimited, and halts the cache startup (or installation) with an error message if this is not the case. Second, it sets the soft limit to the hard limit in the daemons.
CDL Support Dropped for Mac OS X
In this version of Caché, CDL support has been removed from the Mac platform. Please use XML for transport of applications across platforms.
Client applications attempting to connect to server systems running this version of Caché must be running on Caché version 5.0.13 or later. Attempts to connect from earlier versions will fail.
Dreamweaver No Longer Supported
The release of Caché no longer supports any version of Dreamweaver as a method for generating CSP pages.
Python Binding Version Requirements
For Caché version 2007.1 on Windows, customers wishing to use the Python language binding should use ActiveState Python 2.4 with Microsoft Visual Studio 7. InterSystems has tested the product specifically with ActiveState Python 18.104.22.168.
This section contains information of interest to those who have designed, developed and maintained applications running on prior versions of Caché.
The items listed here are brief descriptions. In most cases, more complete descriptions are available elsewhere in the documentation.
There have been a number of changes and improvements to the system management portal since 5.2 was released. These are detailed in the administrator section
Changes To Pattern Matching
This release makes an incompatible release to the way patterns are matched. If an application makes use of patterns with certain characteristics, it must be recompiled in this version. Otherwise, it will produce incorrect results. Those characteristics are:
the repetition count is 12, for example, X?12N
the first number of a repetition range is 12, such as X?12.14N
the difference between the two numbers of a repetition range is 12, such as X?3.15N
Caché SQL NOT! and NOT& Operators Removed
As an assistance to customer migrating their applications to Caché. SQL now supports the function, SYSDATE
. This is equivalent to the SQL standard function, CURRENT_TIMESTAMP
, but the latter is preferred.
is now an SQL reserved word. Applications using SYSDATE as an identifier must switch to using the delimited identifier syntax, SYSDATE
, or change the name of the identifier, to resolve the conflict.
CREATE TABLE Changes Default Schema Assignment
In Caché SQL, there is no default package defined with a schema name of USER. An attempt to create a table with a schema name of USER when no package definition for USER exists will create a package mapping with a package and a schema name of USER.
This overrides the default mapping of the USER package to the SQLUser (schema) mapping.
In addition, USER is an SQL reserved word, but since there was no ambiguity in previous releases, the SQL parser allowed USER as a schema name.
In Caché 2007.1, USER is no longer allowed as the schema name in DDL statements unless it is a delimited identifier. This prevents ad hoc DDL development of tables, or other SQL objects, from creating a package mapping that overrides the Caché default mapping of User->SQLUser. Undelimited uses of USER will result in the new SQL error
SQLCODE = -312: Invalid schema name 'USER'
Oracle Timestamps and JDBC Gateway
When linking to any column defined in an Oracle database as an Oracle Date, the linking procedure incorrectly represents this as a Caché Date. This interpretation makes it impossible manipulate data items declared as Date (Timestamps, actually) via JDBC.
To obtain the correct behavior, once the table has been linked, it is necessary for the user to open the class definition in Studio, and change the datatype from %Date to %TimeStamp.
TuneTable and KeepUpToDate Flag
Caché provides the ability to gather statistics about the contents of a table to assist in optimizing queries. The method that can be invoked by applications at runtime is: $SYSTEM.SQL.TuneTable
The fifth parameter of $SYSTEM.SQL.TuneTable
is the KeepUpToDate flag. If this flag is set to 1, Caché does not mark the classes and tables as out of date when it modifies the values for ExtentSize and the Selectivity.
In Version 2007.1, KeepUpToDate has been extended to influence cached queries. If KeepUpToDate is set to 1, Caché will not purge cached queries as part of the TuneTable.
If the flag is 0, the default, then the class is marked as not up to date and all cached queries based on the table are purged. Running TuneTable on a live system with KeepUpToDate = 0 can cause application errors if the application is using any queries that are removed from the cache.
Changes for Sybase / SQL Server Compatibility
The following changes have been made to Caché for compatibility with Sybase and SQL Server:
the command, TRUNCATE TABLE, has been added
six new functions are now available for use in expressions: CHARINDEX, DATALENGTH, REPLACE, STUFF, and SYSDATE
two additions have been made to the list of reserved words: SYSDATE, and STATISTICS
New DDL Datatype Mappings
As of this version, the following new datatype mappings have been added to the system set of datatype maps:
||Maps In Caché To
|NATIONAL CHAR VARYING
|NATIONAL CHARACTER VARYING
The following is a short list of changes to statement parsing. Please refer to the SQL Reference
manual for further information.
The INTO keyword is now optional on an INSERT statement
The UPDATE and DELETE statements now allow a new FROM clause to support multi-table selection criteria
The FROM keyword is now optional in a DELETE statement
Operator / Predicate Changes
The following are new in version 2007.1:
The LIKE predicate supports letter case collation
%CONTAINSTERM has been added as a new comparison operator
Caché SQL now supports multiline comments with the same /* ... */ syntax used by Caché Objectscript.
Changes to %Text Class Stemming
In this version of Caché, the algorithm has been changed to restore the behavior of the originally published algorithm, because the amended algorithm does an inferior job on most words ending in -ed
. This change affects what terms go into an index based on the %Text.English
class, so any index that is not rebuilt across versions may experience slight differences with respect to text predicates on words ending in -ed
It is recommended that you rebuild %Text.English
text indices that were created on prior versions of Caché. Otherwise, a word that could end in -ed
might fail to match a different form of the word already in the index.
In this release of Caché, CALLIN no longer supports msql* functions. These must be changed to Caché SQL.
An error has been fixed that under some circumstances caused Caché to ignore maximum string length settings. This allowed strings whose length exceeded the maximum to be accepted instead of being detected as being too long. Subsequent use, for example, when the value was passed to a stored procedure or persisted, would correctly truncate the value. Values are now processed consistently for both the initial and subsequent uses.
There have been a number of changes and improvements to the system management portal since 5.2 was released. These are detailed in the administrator section.