Settings for the File Inbound Adapter

Provides reference information for settings of the file inbound adapter, EnsLib.File.InboundAdapter.

Summary

The inbound file adapter has the following settings:

Group	Settings
Basic Settings	File Path, File Spec, Archive Path, Work Path, Call Interval
Additional Settings	Subdirectory Levels, Charset, Append Timestamp, Semaphore Specification, Confirm Complete, File Access Timeout

The remaining settings are common to all business services. For information, see “Settings for All Business Services” in Configuring Ensemble Productions.

Append Timestamp

Append a time stamp to filenames in the Archive Path and Work Path directories; this is useful to prevent possible name collisions on repeated processing of the same filename.

If this value is empty or 0, no time stamp is appended.
If this setting is 1, then the standard template '%f_%Q' is appended.
For other possible values, see “Time Stamp Specifications for Filenames” in Configuring Ensemble Productions.

Archive Path

Full pathname of the directory where the adapter should place the input file after it has finished processing the data in the file. This directory must exist, and it must be accessible through the file system on the local Ensemble machine. If this setting is not specified, the adapter deletes the input file after its call to ProcessInput() returns.

To ensure that the input file is not deleted while your production processes the data from the file, InterSystems recommends that you set Archive Path and Work Path to the same directory. Alternatively, you can use only synchronous calls from your business service to process the data.

Call Interval

The polling interval for this adapter, in seconds. This is the time interval at which the adapter checks for input files in the specified locations.

Upon polling, if the adapter finds a file, it links the file to a stream object and passes the stream object to the associated business service. If several files are detected at once, the adapter sends one request to the business service for each individual file until no more files are found.

If the business service processes each file synchronously, the files will be processed sequentially. If the business service sends them asynchronously to a business process or business operation, the files might be processed simultaneously.

After processing all the available files, the adapter waits for the polling interval to elapse before checking for files again. This cycle continues whenever the production is running and the business service is enabled and scheduled to be active.

It is possible to implement a callback in the business service so that the adapter delays for the duration of the Call Interval between input files. For details, see “Defining Business Services” in Developing Ensemble Productions.

The default Call Interval is 5 seconds. The minimum is 0.1 seconds.
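
The polling cycle described above can be sketched in Python. This is only an illustration of the documented behavior, not the adapter's code; the names poll_once, run, and on_file are hypothetical, and the sketch assumes that the callback (standing in for the business service) is responsible for moving or deleting each file it receives.

```python
import os
import time

def poll_once(directory, on_file):
    """Hand every file currently in `directory` to `on_file`, one call per file."""
    count = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            on_file(path)          # one request per individual file
            count += 1
    return count

def run(directory, on_file, call_interval=5.0, cycles=None):
    """Poll repeatedly, sleeping `call_interval` seconds between polls.

    `cycles` limits the number of polls (None means run until interrupted).
    """
    interval = max(call_interval, 0.1)   # documented minimum is 0.1 seconds
    done = 0
    while cycles is None or done < cycles:
        poll_once(directory, on_file)
        done += 1
        if cycles is None or done < cycles:
            time.sleep(interval)         # wait before checking for files again
```

For example, run("/data/in", print, call_interval=5, cycles=3) would list the files in a hypothetical /data/in directory three times, five seconds apart.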

Charset

Specifies the character set of the input file. Ensemble automatically translates the characters from this character encoding. The setting value is not case-sensitive. Use Binary for binary files, or for any data in which newline and line feed characters are distinct or must remain unchanged, for example in HL7 Version 2 or EDI messages. Other settings may be useful when transferring text documents. Choices include:

Binary — Binary transfer
Ascii — Ascii mode FTP transfer but no character encoding translation
Default — The default character encoding of the local Ensemble server
Latin1 — The ISO Latin1 8-bit encoding
ISO-8859-1 — The ISO Latin1 8-bit encoding
UTF-8 — The Unicode 8-bit encoding
UCS2 — The Unicode 16-bit encoding
UCS2-BE — The Unicode 16-bit encoding (Big-Endian)
Any other alias from an international character encoding standard for which NLS (National Language Support) is installed in Ensemble

Use a value that is consistent with your implementation of OnProcessInput() in the business service:

When the Charset setting has the value Binary, the pInput argument of OnProcessInput() is of type %FileBinaryStream and contains bytes.
Otherwise, pInput is of type %FileCharacterStream and contains characters.
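
The difference between the two cases can be illustrated with a short Python analogy. It is not Ensemble code: the function read_input is hypothetical, and the mapping from setting values to Python codec names covers only three of the choices listed above.

```python
def read_input(path, charset):
    """Return bytes for Binary, otherwise text decoded with the named charset."""
    name = charset.strip().lower()      # the setting value is not case-sensitive
    if name == "binary":
        with open(path, "rb") as f:
            return f.read()             # raw bytes, nothing translated
    # Assumed mapping to Python codec names; other values raise KeyError here.
    codecs = {"latin1": "latin-1", "iso-8859-1": "latin-1", "utf-8": "utf-8"}
    # newline="" keeps line terminators exactly as they appear in the file.
    with open(path, "r", encoding=codecs[name], newline="") as f:
        return f.read()
```

A handler written for bytes fails or misreads data when it receives characters, and the reverse is also true, which is why the setting and the OnProcessInput() implementation need to agree.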

For background information on character translation in Caché, see “Localization Support” in the Caché Programming Orientation Guide.

Semaphore Specification

The Semaphore Specification allows you to indicate that the data file is complete and ready to be read by creating a second file that is used as a semaphore. The inbound file adapter waits until the semaphore file exists before checking the other conditions specified by the Confirm Complete requirements and then processing the data file. This allows the application creating the data file to ensure that the adapter waits until the data file is complete before processing it. The adapter tests only for the existence of the semaphore file and does not read the semaphore file contents.
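
On the sending side, the order of operations is what matters: write the data file completely, then create the semaphore file. The following Python sketch shows one way a source application might do this. The function publish and its parameters are hypothetical, and the semaphore file is left empty because the adapter does not read its contents.

```python
import os

def publish(directory, data_name, payload, semaphore_name):
    """Write the data file in full, then create an empty semaphore file."""
    data_path = os.path.join(directory, data_name)
    with open(data_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())           # make sure the data is on disk first
    sem_path = os.path.join(directory, semaphore_name)
    open(sem_path, "wb").close()       # only the file's existence is tested
    return data_path, sem_path
```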

If the Semaphore Specification is an empty string, the adapter does not wait for a semaphore file and processes the data file as soon as the conditions specified by the Confirm Complete requirements are met. If you are using a semaphore file to control when the adapter processes the data file, you should consider setting the Confirm Complete field to None.

The Semaphore Specification allows you to specify individual semaphore files for each data file or a single semaphore file to control multiple data files. You can use wildcards to pair semaphore files with data files, and can specify a series of patterns matching semaphore files to data files. The adapter always looks for a matching semaphore file in the same directory as the data file. If the adapter is looking for data files in subdirectories, the semaphore file must be in the same subdirectory level as its corresponding data file.

The general format for specifying the Semaphore Specification is:

[DataFileSpec=] SemaphoreFileSpec [;[DataFileSpec=] SemaphoreFileSpec]...

For example, if the Semaphore Specification is:

ABC*.TXT=ABC*.SEM

This means that the ABCTest.SEM semaphore file controls when the adapter processes the ABCTest.TXT file and that the ABCdata.SEM semaphore file controls when the adapter processes the ABCdata.TXT file.

Note:

In a semaphore specification, the * (asterisk) matches any character except dot. In a file specification, the asterisk matches any character including the dot.

You can have one semaphore file control multiple data files. For example, if the Semaphore Specification is:

*.DAT=DATA.SEM

The DATA.SEM semaphore file controls when the adapter processes all *.DAT files in the same directory. When the adapter looks for data files and their corresponding semaphore files, it loops through all the data files in each polling cycle. With the previous Semaphore Specification, if the adapter looks for DATA.SEM for the ABC.DAT file and does not find it, it continues looking for the semaphore files of the other data files. If DATA.SEM is created during this process, the adapter finds it when it looks for the semaphore file for XYZ.DAT. Even so, the adapter defers processing XYZ.DAT until the next polling interval, because a preceding data file, ABC.DAT, was waiting for the same semaphore file.

If you specify multiple pairings, separate them with a ; (semicolon). For example, if the Semaphore Specification is:

*.TXT=*.SEM;*.DAT=*.READY

The semaphore file MyData.SEM controls when the adapter processes MyData.TXT, but the semaphore file MyFile.READY controls when it processes MyFile.DAT.

The adapter finds the corresponding semaphore file for each data file by reading the Semaphore Specification from left to right. Once it determines the corresponding semaphore file, it stops reading the Semaphore Specification for that file. For example, if the Semaphore Specification is:

VIData.DAT=Special.SEM; *.DAT=*.SEM

The adapter looks for the semaphore file Special.SEM before it processes VIData.DAT, but it does not consider VIData.SEM as a semaphore file for VIData.DAT. It does consider stuff.SEM as the semaphore file for stuff.DAT because stuff.DAT did not match an earlier specification. Consequently, if you include multiple specifications that can match the same file, list the more specific specifications before the more general ones.

The data file pattern is case-sensitive, whereas the case sensitivity of the semaphore pattern depends on the operating system. For example, *.TXT=*.SEM applies only to data files whose names end with uppercase .TXT, but the operating system may not differentiate between *.SEM and *.sem. If the operating system is not case-sensitive, the adapter treats semaphore files ending in any case combination of .SEM as equivalent, but uses them only as semaphores for data files named *.TXT. In that situation the adapter cannot distinguish case in semaphore file names but can distinguish it in data file names.

If you only specify a single file specification and omit the = (equals) sign, the adapter treats that as the Semaphore Specification for all data files. For example, if the Semaphore Specification is:

*.SEM

This is equivalent to specifying a single wildcard to the left of the = (equals) sign:

*=*.SEM

In this case, the semaphore file MyFile.SEM controls the data file MyFile.txt and the semaphore file BigData.SEM controls the data file BigData.DAT.

If the semaphore file pattern contains no wildcard, it is the complete file name of the semaphore file. For example, if the Semaphore Specification is:

*.DAT=DataDone.SEM

Then the DataDone.SEM semaphore file controls when the adapter reads any data file with the .DAT file extension.

If a Semaphore Specification is specified and a data file does not match any of the patterns, then there is no corresponding semaphore file and the adapter will not process this data file. You can avoid this situation by specifying * as the last data file in the Semaphore Specification. For example, if the Semaphore Specification is:

*.DAT=*.SEM; *.DOC=*.READY; *=SEM.LAST

The SEM.LAST is the semaphore file for all files that do not end with .DAT or .DOC.
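
The matching rules described above can be sketched in Python. This illustrates the documented behavior and is not the adapter's implementation; the function names parse_spec and semaphore_for are hypothetical. The sketch assumes at most one * per pattern, and it assumes that the text matched by * in the data file pattern is cut at its first dot before it replaces * in the semaphore pattern. That assumption is consistent with the examples in this section but is not confirmed by it. Matching here is always case-sensitive.

```python
import re

def parse_spec(spec):
    """Split a Semaphore Specification into (data_pattern, semaphore_pattern) pairs."""
    pairs = []
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            data_pat, sem_pat = (part.strip() for part in item.split("=", 1))
        else:
            data_pat, sem_pat = "*", item      # bare spec means *=spec
        pairs.append((data_pat, sem_pat))
    return pairs

def semaphore_for(data_file, spec):
    """Return the semaphore file name for data_file, or None if nothing matches."""
    for data_pat, sem_pat in parse_spec(spec):         # read left to right
        regex = "(.*)".join(re.escape(p) for p in data_pat.split("*"))
        match = re.fullmatch(regex, data_file)         # case-sensitive match
        if match is None:
            continue
        captured = match.group(1) if match.groups() else ""
        stem = captured.split(".", 1)[0]               # assumption: stop at first dot
        return sem_pat.replace("*", stem)              # first matching pair wins
    return None
```

With the specification VIData.DAT=Special.SEM; *.DAT=*.SEM, calling semaphore_for("VIData.DAT", ...) yields Special.SEM and semaphore_for("stuff.DAT", ...) yields stuff.SEM. A data file that matches no pattern yields None, which corresponds to a file the adapter never processes.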

If an adapter is configured with a File Spec equal to *, the adapter usually considers all files in the directory as data files. However, if the adapter also has a Semaphore Specification and it recognizes a file as a semaphore file, it does not treat that file as a data file.

After the adapter has processed all the data files in a polling cycle, it deletes all the corresponding semaphore files.

Confirm Complete

Indicates the special measures that Ensemble should take to confirm complete receipt of a file. The options are:

List option	Integer value	Description
None	0	Take no special measures to determine if a file is complete.
Size	1	Wait until the reported size of the file in the File Path directory stops increasing. If the operating system reports the same file size for the duration of the File Access Timeout setting, then Ensemble considers the file complete. This option may not be sufficient when the source application is sluggish.
Rename	2	Read more data for a file until the operating system allows Ensemble to rename the file.
Readable	4	Consider the file complete if Ensemble can open it in Read mode.
Writable	8	Consider the file complete if Ensemble can open it in Write mode (as a test only; Ensemble does not write to the file).

The effectiveness of each option depends on the operating system and the details of the process that puts the file in the File Path directory.
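
As an illustration of the idea behind the Size option, the following Python sketch waits until a file's reported size has stayed the same for a given period. It is a simplified model, not Ensemble's implementation; the function name and the check_every and max_wait parameters are hypothetical, and stable_for plays the role of File Access Timeout.

```python
import os
import time

def wait_until_size_stable(path, stable_for=2.0, check_every=0.1, max_wait=60.0):
    """Return True once the file size has not changed for `stable_for` seconds.

    Returns False if that does not happen within `max_wait` seconds.
    """
    deadline = time.monotonic() + max_wait
    last_size = os.path.getsize(path)
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        if time.monotonic() - stable_since >= stable_for:
            return True
        time.sleep(check_every)
        size = os.path.getsize(path)
        if size != last_size:              # file changed, restart the clock
            last_size = size
            stable_since = time.monotonic()
    return False
```

The sketch also shows why this check can be insufficient: a source application that pauses for longer than stable_for between writes makes an incomplete file look complete.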

File Access Timeout

Amount of time in seconds that the system waits for information from the source application before confirming the complete receipt of a file. For more information, see Confirm Complete.

If you supply a decimal value, the system rounds the value up to the nearest whole number. The default value is 2.

File Path

Full pathname of the directory in which to look for files. This directory must exist, and it must be accessible through the file system on the local Ensemble machine.

File Spec

Filename or wildcard file specification for file(s) to retrieve. For the wildcard specification, use the convention that is appropriate for the operating system on the local Ensemble machine.

Subdirectory Levels

Number of levels of subdirectory depth under the given directory that should be searched for files.
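
The combined effect of File Path, File Spec, and Subdirectory Levels can be pictured with the Python sketch below. It is an illustration only: the function find_files is hypothetical, it uses Python's fnmatch wildcard rules rather than those of any particular operating system, and it assumes that a value of 0 means that only the File Path directory itself is searched.

```python
import fnmatch
import os

def find_files(root, file_spec="*", levels=0):
    """List files under `root` matching `file_spec`, descending at most `levels` deep."""
    matches = []

    def scan(directory, depth):
        for entry in sorted(os.scandir(directory), key=lambda e: e.name):
            if entry.is_file():
                # fnmatchcase: * matches any characters, including dots
                if fnmatch.fnmatchcase(entry.name, file_spec):
                    matches.append(entry.path)
            elif entry.is_dir() and depth < levels:
                scan(entry.path, depth + 1)

    scan(root, 0)
    return matches
```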

Work Path

Full pathname of the directory where the adapter should place the input file while processing the data in the file. This directory must exist, and it must be accessible through the file system on the local Ensemble machine. This setting is useful when the same filename is used for repeated file submissions. If Work Path is not specified, the adapter does not move the file while processing it.
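
The way Work Path and Archive Path affect a file's lifecycle can be summarized in a short Python sketch. This models the documented behavior only and is not the adapter's code; the function handle_file and the process callback are hypothetical, and the sketch ignores Append Timestamp and the case where both paths name the same directory.

```python
import os
import shutil

def handle_file(path, process, work_path=None, archive_path=None):
    """Move the file to the work directory (if any), process it, then archive or delete it."""
    name = os.path.basename(path)
    current = path
    if work_path:
        # With a work directory, the file leaves the input directory before processing.
        current = shutil.move(path, os.path.join(work_path, name))
    process(current)
    if archive_path:
        return shutil.move(current, os.path.join(archive_path, name))
    os.remove(current)        # no archive directory: the input file is deleted
    return None
```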