bloscCodec

2019.02.01

editors:
Marty Kraimer

This product is available via an open source license.

Preface

Background

In 2018 areaDetector added support for compressing and decompressing arrays, mostly for images. Marty Kraimer did a small part of the work related to ImageJ. This was the motivation for creating bloscCodec.

Both areaDetector and bloscCodec use blosc, in particular c-blosc, to do the actual compression and decompression.

Current Status

Existing Features
All examples described in this document work.
array types
Only numeric scalar arrays are supported. This means signed and unsigned integers of 8, 16, 32, and 64 bits, as well as float and double.
byte order
The present implementation does not handle byte order differences. Thus, except for signed and unsigned 8 bit integers, compression and decompression must be done on CPUs that have the same byte order.
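
To make the byte order limitation concrete, the following is a minimal sketch of what a fix might involve: detecting the host byte order and reversing the bytes of each element when the two sides differ. It is not part of bloscCodec; the names hostIsLittleEndian and swapElementBytes are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: true when the host stores the least significant byte first.
bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &probe, 1);
    return firstByte == 1;
}

// Hypothetical helper: reverses the bytes of each element in place.
// elementSize is the size of one element in bytes (1, 2, 4, or 8).
// Elements of size 0 or 1 are left untouched, which is why 8 bit
// integers need no byte order handling.
void swapElementBytes(void * data, std::size_t elementSize, std::size_t count)
{
    if (data == nullptr || elementSize < 2) return;
    unsigned char * bytes = static_cast<unsigned char *>(data);
    for (std::size_t i = 0; i < count; ++i) {
        unsigned char * element = bytes + i * elementSize;
        for (std::size_t lo = 0, hi = elementSize - 1; lo < hi; ++lo, --hi) {
            std::swap(element[lo], element[hi]);
        }
    }
}
```

A receiver that knows the sender's byte order could call swapElementBytes on the decompressed buffer whenever that order differs from hostIsLittleEndian(). How the sender's byte order would be communicated is left open here.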

Future Plans

Except for fixing the byte order problem, nothing else is currently planned.

The idea is to wait and see if anyone is interested in the bloscCodec service and provides feedback about what is missing. If anyone is interested please provide feedback.

Introduction

NOTE: In this document blosc refers to the code that comes with blosc, and bloscCodec refers to the code described in this document.

bloscCodec is a facility for compressing and decompressing scalar arrays, i.e. a PVScalarArray, which is described in pvDataCPP.

The following components of bloscCodec are described below:

bloscCodec
Code that calls blosc for a scalar array as defined by pvDataCPP.
bloscCodecRecord
A PVRecord as defined by: pvDatabaseCPP
It provides code for calling blosc for a scalar array from either a DBRecord or a PVRecord.
It can both compress and decompress.
exampleClient
Examples for client code. The most important are:
clientMonitorCodec
This asks a bloscCodecRecord to start or stop monitoring a PVRecord or DBRecord.
clientDecompressCodec
This shows how a client can decompress an array it gets from a bloscCodecRecord.
clientGetputCodec
This is client code that asks a bloscCodecRecord to compress or decompress.

building

Clone the code:

git clone https://github.com/mrkraimer/bloscCodecCPP.git
Then just do the following:
cd bloscCodecCPP
cp ExampleRELEASE.local configure/RELEASE.local
edit file configure/RELEASE.local
make

In configure/RELEASE.local change the locations of EPICS4_DIR and EPICS_BASE.

bloscCodec requires the following EPICS components:

pvDataCPP
normativeTypesCPP
pvAccessCPP
pva2pva
pvDatabaseCPP
pvaClientCPP

All are provided with a recent EPICS 7 release from the EPICS web site, epics-controls.

These could also be cloned from the master branches in epics-base.

Preparation for running examples

start ioc

In a window:
mrk> pwd
/home/epicsv4/masterCPP/bloscCodecCPP/iocBoot/bloscCodecIoc
mrk> ../../bin/linux-x86_64/bloscCodecIoc st.cmd
... LOTS OF OUTPUT
epics>
To see all PVRecords enter:
epics> pvdbl
PVRdoubleArray
PVRfloatArray
PVRint16Array
PVRint32Array
PVRint64Array
PVRint8Array
PVRuint16Array
PVRuint32Array
PVRuint64Array
PVRuint8Array
bloscCodecRecord
To see all DBRecords enter:
epics> dbl
DBRint8Array
DBRint16Array
DBRint32Array
DBRint64Array
DBRuint8Array
DBRuint16Array
DBRuint32Array
DBRuint64Array
DBRfloatArray
DBRdoubleArray
epics> 

monitor bloscCodecRecord

In order to better understand the client examples it helps to monitor the bloscCodecRecord.

In another window:

mrk> pvget -m -r "" -v bloscCodecRecord
You will see:
bloscCodecRecord structure 
    ubyte[] value []
    alarm_t alarm MINOR CLIENT  is idle 
        int severity 1
        int status 7
        string message  is idle
    time_t timeStamp <undefined>              
        long secondsPastEpoch 0
        int nanoseconds 0
        int userTag 0
    string channelName 
    int elementScalarType 0
    enum_t command (0) idle
        int index 0
        string[] choices ["idle", "get", "put", "startMonitor", "stopMonitor"]
    structure bloscArgs
        int compressedSize 0
        int decompressedSize 0
        int level 3
        enum_t compressor (0) blosclz
            int index 0
            string[] choices ["blosclz", "lz4", "lz4hc", "snappy", "zlib", "zstd"]
        enum_t shuffle (0) NOSHUFFLE
            int index 0
            string[] choices ["NOSHUFFLE", "SHUFFLE", "BITSHUFFLE"]
        int threads 1

monitor scalar array records

You can monitor one or more of the scalar array records. For example:

mrk> pvget -m PVRdoubleArray DBRdoubleArray
PVRdoubleArray <undefined>              []
DBRdoubleArray <undefined>              INVALID DRIVER UDF []

Sample examples

This section describes how to run some of the example client code and the resulting output.

start IOC

mrk> pwd
/home/epicsv4/masterCPP/bloscCodecCPP/iocBoot/bloscCodecIoc
mrk> ../../bin/linux-x86_64/bloscCodecIoc st.cmd

monitor bloscCodecRecord

mrk> pvget -m -r "" -v bloscCodecRecord

startMonitor

The following issues a command to bloscCodecRecord to start monitoring DBRdoubleArray:

mrk> pwd
/home/epicsv4/masterCPP/bloscCodecCPP
mrk> bin/linux-x86_64/clientMonitorCodec
_____clientMonitorCodec starting__
channelStateChange is Connected false
channelStateChange is Connected true
enter one of: startMonitor stopMonitor exit
startMonitor
enter channelName
DBRdoubleArray
do You want to modify any bloscArgs? answer y or n
n
startMonitor success

Run clientDecompressCodec

This is a client that monitors bloscCodecRecord. When bloscCodecRecord compresses, the client gets the compressed data and decompresses it.

mrk> pwd
/home/epicsv4/masterCPP/bloscCodecCPP
mrk> bin/linux-x86_64/clientDecompressCodec 
_____clientMonitorCodec starting__
channelStateChange is Connected false
channelStateChange is Connected true
enter one of: start stop exit
start
enter channelName
DBRdoubleArray

Leave this running.

Make a change to DBRdoubleArray

Execute the following:

mrk> pwd
/home/epicsv4/masterCPP/bloscCodecCPP
mrk> bin/linux-x86_64/clientPutArray 
_____clienPutArray starting__
enter put or exit or return
put
channelName
DBRdoubleArray
number elements
100
first element
1
number of times to repeat same number
10
max element value
100

output

On the window where bloscCodecRecord is being monitored:
bloscCodecRecord structure 
    alarm_t alarm 
        int severity 0
        string message startMonitor success
    time_t timeStamp 2019-01-30 09:50:38.165  
        long secondsPastEpoch 1548859838
        int nanoseconds 165344378
    string channelName DBRdoubleArray
    enum_t command (3) startMonitor
        int index 3
    structure bloscArgs
        int level 3
        enum_t compressor (0) blosclz
            int index 0
        enum_t shuffle (0) NOSHUFFLE
            int index 0
        int threads 1
bloscCodecRecord structure 
    ubyte[] value [2,1,0,1,32,3,0,0,32,3,0,0,157,0,0,0,20,0,0,0,133,
                   0,0,0,33,0,0,64,0,1,240,63,224,77,7,1,0,64,32,3,64,
                  0,0,64,96,5,224,64,7,1,8,64,32,75,32,0,1,8,64,64,5,
                  224,65,7,1,16,64,32,76,32,0,1,16,64,64,5,224,65,
                  7,1,20,64,32,76,32,0,1,20,64,64,5,224,65,7,1,24,64,32,
                  76,32,0,1,24,64,64,5,224,65,7,1,28,64,32,76,32,0,
                  1,28,64,64,5,224,65,7,1,32,64,32,76,32,0,1,32,64,64,5,
                  224,65,7,1,34,64,32,76,32,0,1,34,64,64,5,224,65,7,1,36,64]
    alarm_t alarm 
        string message compress success
    time_t timeStamp 2019-01-30 09:50:38.167  
        int nanoseconds 166709190
    string channelName DBRdoubleArray
    int elementScalarType 10
    enum_t command (1) get
        int index 1
    structure bloscArgs
        int compressedSize 157
        int decompressedSize 800
        int level 3
        enum_t compressor (0) blosclz
            int index 0
        enum_t shuffle (0) NOSHUFFLE
            int index 0
        int threads 1

On the window where clientDecompressCodec was started:

result success data
[1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,
5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,
7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,
9,9,9,9,9,9,9,9,9,9,9,10]
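
The first 16 bytes of the value array shown above can be read as a c-blosc header. The following sketch decodes them, assuming the c-blosc version 1 header layout (version, versionlz, flags, typesize, followed by three little-endian 32-bit sizes). The struct and function names are hypothetical and are not part of bloscCodec.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of the 16-byte header assumed to precede the compressed data.
struct BloscHeader {
    std::uint8_t version;     // blosc format version
    std::uint8_t versionLz;   // version of the internal compressor format
    std::uint8_t flags;       // shuffle, memcpy, and compressor bits
    std::uint8_t typeSize;    // element size that was passed to blosc
    std::uint32_t nbytes;     // uncompressed size in bytes
    std::uint32_t blockSize;  // internal block size in bytes
    std::uint32_t cbytes;     // compressed size in bytes
};

// Reads a 32-bit little-endian value independent of the host byte order.
std::uint32_t readLittleEndian32(const std::uint8_t * p)
{
    return static_cast<std::uint32_t>(p[0])
        | (static_cast<std::uint32_t>(p[1]) << 8)
        | (static_cast<std::uint32_t>(p[2]) << 16)
        | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Returns false when the buffer is too short to hold a header.
bool parseBloscHeader(const std::vector<std::uint8_t> & buffer, BloscHeader & header)
{
    if (buffer.size() < 16) return false;
    header.version = buffer[0];
    header.versionLz = buffer[1];
    header.flags = buffer[2];
    header.typeSize = buffer[3];
    header.nbytes = readLittleEndian32(&buffer[4]);
    header.blockSize = readLittleEndian32(&buffer[8]);
    header.cbytes = readLittleEndian32(&buffer[12]);
    return true;
}
```

For the sample above, the bytes 32,3,0,0 decode to 800 and the bytes 157,0,0,0 decode to 157, which agrees with the decompressedSize and compressedSize fields reported in bloscArgs (100 doubles of 8 bytes each give 800 bytes).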

bloscCodec

This is a wrapper for c-blosc. It provides a PVStructure and methods to call the c-blosc code.

bloscCodec structure

This is a structure for the arguments that can be passed to c-blosc. The descriptions below are taken from the header files provided by c-blosc. For a fuller explanation look at the header files in:

bloscCodecCPP/bloscCodecSrc/c-blosc/blosc

A pvStructure for the data is:

structure 
    int compressedSize 0
    int decompressedSize 0
    int level 3
    enum_t compressor
        int index 0
        string[] choices ["blosclz", "lz4", "lz4hc", "snappy", "zlib", "zstd"]
    enum_t shuffle
        int index 0
        string[] choices ["NOSHUFFLE", "SHUFFLE", "BITSHUFFLE"]
    int threads 1
where:
compressedSize
This is always set by blosc.
decompressedSize
This must always be set by code that calls blosc. Code that uses a pvScalarArray to call bloscCodec never needs to set this field since bloscCodec computes it while compressing.
level
The desired compression level. It must be a number between 0 (no compression) and 9 (maximum compression).
compressor
The name of one of the compressors shipped with blosc. See blosc for more details. Note that blosclz is supplied by blosc itself and the other compressors are packaged with blosc.
shuffle
Specifies whether the shuffle compression filters should be applied or not. BLOSC_NOSHUFFLE means not applying it, BLOSC_SHUFFLE means applying it at a byte level and BLOSC_BITSHUFFLE at a bit level (slower but may achieve better entropy alignment).
threads
Number of threads used by blosc.
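
To illustrate what the byte level shuffle filter does, here is a minimal scalar sketch. It groups byte 0 of every element together, then byte 1 of every element, and so on, which tends to place similar bytes next to each other before compression. This is an illustration of the idea only; it is not the optimized code that blosc uses, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte level shuffle: the output holds byte 0 of every element, then
// byte 1 of every element, and so on. Bytes beyond the last whole
// element are copied unchanged.
std::vector<std::uint8_t> byteShuffle(const std::vector<std::uint8_t> & input,
                                      std::size_t elementSize)
{
    std::vector<std::uint8_t> output(input);
    if (elementSize < 2) return output;
    const std::size_t count = input.size() / elementSize;
    for (std::size_t i = 0; i < count; ++i) {
        for (std::size_t b = 0; b < elementSize; ++b) {
            output[b * count + i] = input[i * elementSize + b];
        }
    }
    return output;
}

// Inverse of byteShuffle: restores the original element layout.
std::vector<std::uint8_t> byteUnshuffle(const std::vector<std::uint8_t> & input,
                                        std::size_t elementSize)
{
    std::vector<std::uint8_t> output(input);
    if (elementSize < 2) return output;
    const std::size_t count = input.size() / elementSize;
    for (std::size_t i = 0; i < count; ++i) {
        for (std::size_t b = 0; b < elementSize; ++b) {
            output[i * elementSize + b] = input[b * count + i];
        }
    }
    return output;
}
```

For example, three little-endian 32-bit integers 1, 2, 3 occupy the bytes 1,0,0,0,2,0,0,0,3,0,0,0. After byteShuffle with elementSize 4 they become 1,2,3 followed by nine zeros, a long run that a compressor can encode compactly.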

bloscCodec methods

class BloscCodec
{
public:
    static BloscCodecPtr create();
    static epics::pvData::StructureConstPtr getCodecStructure();
    bool compressBlosc(
        const epics::pvData::PVUByteArrayPtr & pvDest,
        const epics::pvData::PVScalarArrayPtr & pvSource,
        const epics::pvData::PVStructurePtr & pvBloscArgs);
    bool compressBlosc(
        const epics::pvData::PVUByteArrayPtr & pvDest,
        const void * decompressAddr, size_t decompressSize,
        const epics::pvData::PVStructurePtr & pvBloscArgs);
    bool decompressBlosc(
        const epics::pvData::PVUByteArrayPtr & pvSource,
        const epics::pvData::PVScalarArrayPtr & pvDest,
        const epics::pvData::PVStructurePtr & pvBloscArgs);
    bool decompressBlosc(
        const epics::pvData::PVUByteArrayPtr & pvSource,
        void * decompressAddr, size_t decompressSize,
        const epics::pvData::PVStructurePtr & pvBloscArgs);
    std::string getMessage();

    void initCodecStructure(const epics::pvData::PVStructurePtr & pvStructure);
};
where
create
Create an instance of BloscCodec.
getCodecStructure
Get an introspection interface for a bloscCodec structure.
compressBlosc
Compress a scalar array. The compressed array is in pvDest and compressedSize is set.
There are two methods. The first has the argument pvSource. This is the method that will be called by most clients.
The second method, which has arguments decompressAddr and decompressSize, is used by bloscCodecRecord for accessing a DBRecord.
decompressBlosc
Decompress a scalar array. The compressed array is in pvSource.
There are two methods. The first has the argument pvDest. This is the method that will be called by most clients.
The second method, which has arguments decompressAddr and decompressSize, is used by bloscCodecRecord for accessing a DBRecord.
getMessage
If compressBlosc or decompressBlosc returns false then getMessage returns a reason.
initCodecStructure
This initializes a pvStructure created with the introspection interface returned by getCodecStructure.

bloscCodecRecord

This is a PVRecord that compresses/decompresses data in a scalar array that resides in another record in the same IOC. The record can be either a PVRecord or a DBRecord.

bloscCodecRecord data

bloscCodecRecord structure 
    ubyte[] value []
    alarm_t alarm MINOR CLIENT  is idle 
        int severity 1
        int status 7
        string message  is idle
    time_t timeStamp <undefined>              
        long secondsPastEpoch 0
        int nanoseconds 0
        int userTag 0
    string channelName 
    int elementScalarType 0
    enum_t command (0) idle
        int index 0
        string[] choices ["idle", "get", "put", "startMonitor", "stopMonitor"]
    structure bloscArgs
        int compressedSize 0
        int decompressedSize 0
        int level 3
        enum_t compressor (0) blosclz
            int index 0
            string[] choices ["blosclz", "lz4", "lz4hc", "snappy", "zlib", "zstd"]
        enum_t shuffle (0) NOSHUFFLE
            int index 0
            string[] choices ["NOSHUFFLE", "SHUFFLE", "BITSHUFFLE"]
        int threads 1
where:
value
The array that holds the compressed data.
alarm
Shows the result of bloscCodecRecord processing.
timeStamp
The time when bloscCodecRecord::process was called.
channelName
The name of the record to compress or decompress. It must name a PVRecord or DBRecord in the same IOC that has the bloscCodecRecord.
elementScalarType
Set by bloscCodecRecord when it compresses.
command
Described in the next section.
bloscArgs
The arguments for blosc.

bloscCodecRecord processing

idle
This can be used to set other fields in the bloscCodecRecord without causing compression or decompression.
get
This is a request to compress channelName. The compressed array is in value.
put
This is a request to decompress value into channelName.
startMonitor
Start monitoring channelName. Each time an event occurs a get command is issued. All arguments passed to get are the same as when startMonitor was called.
Only one monitor at a time is supported. An error is returned via alarm if a monitor is already active.
stopMonitor
Stop monitoring.
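
The rule that only one monitor may be active at a time can be sketched as a small state tracker. This is not the bloscCodecRecord implementation; the class MonitorTracker and the error text are hypothetical, and treating stopMonitor with no active monitor as a success is an assumption. The enum values follow the order of the command choices shown earlier.

```cpp
#include <string>

// Indices follow the order of the command choices:
// ["idle", "get", "put", "startMonitor", "stopMonitor"].
enum class Command { idle = 0, get = 1, put = 2, startMonitor = 3, stopMonitor = 4 };

// Hypothetical stand-in that tracks only the monitor state of a record.
class MonitorTracker {
public:
    // Returns true on success. On failure, message() gives the reason,
    // playing the role that the alarm message plays in the record.
    bool process(Command command, const std::string & channelName)
    {
        switch (command) {
        case Command::startMonitor:
            if (active_) {
                message_ = "monitor already active on " + channelName_;
                return false;
            }
            active_ = true;
            channelName_ = channelName;
            message_ = "startMonitor success";
            return true;
        case Command::stopMonitor:
            // Assumption: stopping when nothing is monitored is not an error.
            active_ = false;
            channelName_.clear();
            message_ = "stopMonitor success";
            return true;
        default:
            // idle, get, and put do not change the monitor state.
            message_.clear();
            return true;
        }
    }

    bool isActive() const { return active_; }
    const std::string & message() const { return message_; }

private:
    bool active_ = false;
    std::string channelName_;
    std::string message_;
};
```

With this tracker, a second startMonitor fails until a stopMonitor has been processed, after which a new startMonitor succeeds again.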

exampleClient

clientMonitorCodec

This client asks a bloscCodecRecord to start or stop monitoring, i.e., it issues the commands startMonitor and stopMonitor to a bloscCodecRecord.

The help option is:

mrk> pwd
/home/epicsv4/masterCPP/bloscCodecCPP
mrk> bin/linux-x86_64/clientMonitorCodec -help
 -h -c codecChannelName - d debug  
default
-c bloscCodecRecord -d false

When it is started it issues the following prompt:

mrk> bin/linux-x86_64/clientMonitorCodec 
_____clientMonitorCodec starting__
channelStateChange is Connected false
channelStateChange is Connected true
enter one of: startMonitor stopMonitor exit

A simple example of startMonitor is:

startMonitor
enter channelName
PVRdoubleArray
do You want to modify any bloscArgs? answer y or n
n
startMonitor success
enter one of: startMonitor stopMonitor exit

The monitor is stopped as follows:

stopMonitor
stopMonitor success
enter one of: startMonitor stopMonitor exit

startMonitor also allows the client to specify blosc options. For example:

startMonitor
enter channelName
PVRdoubleArray
do You want to modify any bloscArgs? answer y or n
y
level is 3 do you want to change it?
y
enter level
4
compressor is blosclz do you want to change?
y
0=blosclz,1=lz4,2=lz4hc,3=snappy,4=zlib,5=zstd
1
shuffle is NOSHUFFLE do you want to change?
y
0=NOSHUFFLE,1=SHUFFLE,2=BITSHUFFLE
1
threads is 1 do you want to change it?
y
enter threads
3
startMonitor success
enter one of: startMonitor stopMonitor exit

clientDecompressCodec

This is a client that monitors a bloscCodecRecord. When a monitor event occurs it looks to see if the channelName from the record is the same as the client specified when start was issued. If not the same it just ignores the event. If the names are the same it creates a PVScalarArray and decompresses the value field of the bloscCodecRecord into it.

It is used as follows:

mrk> bin/linux-x86_64/clientDecompressCodec -help
 -h -c codecChannelName - d debug  
default
-c bloscCodecRecord -d false
mrk> bin/linux-x86_64/clientDecompressCodec 
_____clientMonitorCodec starting__
channelStateChange is Connected false
channelStateChange is Connected true
enter one of: start stop exit

When start is issued you will see output like:

enter one of: start stop exit
start
enter channelName
PVRdoubleArray
enter one of: start stop exit
monitorConnect PVRdoubleArray status Status [type=OK]
result success data
[1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,
5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,
7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,
9,9,9,9,9,9,9,9,9,9,9,10]

Note that clientMonitorCodec can be run to start monitoring and clientPutArray can be used to put values to channelName.

clientGetputCodec

This is a client that asks a bloscCodecRecord to compress and decompress.

It is used as follows:

mrk> bin/linux-x86_64/clientGetputCodec -help
 -h -c codecChannelName - d debug  
default
-c bloscCodecRecord -d false
mrk> bin/linux-x86_64/clientGetputCodec 
_____clietGetPutCodec starting__
channelStateChange isConnected false
channelStateChange isConnected true
enter one of: compress decompress exit

When compress is entered the client is asked for the same arguments as the startMonitor command of clientMonitorCodec.

enter one of: compress decompress exit
compress
enter channelName
PVRdoubleArray
do You want to modify any bloscArgs? answer y or n
n
compress success
enter one of: compress decompress exit

A pvaClientPutGet request is issued to the bloscCodecRecord. The put part of the request specifies the bloscCodec arguments and the command get, which means compress. The get part of the putGet requests all the fields required to issue a decompress request.

An example decompress request is:

enter one of: compress decompress exit
decompress
decompress success
enter one of: compress decompress exit

clientPutArray

This is a client that uses pvaClientPut to put to a scalar array field. The record can be either a PVRecord or a DBRecord. It is used to generate arrays that have redundant elements. This allows testing how much compression the various blosc methods achieve.

For example:

mrk> bin/linux-x86_64/clientPutArray 
_____clienPutArray starting__
enter put or exit or return
put
channelName
PVRdoubleArray
number elements
100
first element
1
number of times to repeat same number
10
max element value
100
enter put or exit or return
Produces the following:
mrk> pvget PVRdoubleArray
PVRdoubleArray 2019-01-29 10:58:37.118
[1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,
4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,
7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9,9,10]
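
The following sketch generates an array with the same shape as the output above. It is inferred from the prompts and the displayed result, not taken from the clientPutArray source: the function name is hypothetical, emitting each value one more time than the repeat count is an assumption made to match the 11 copies shown, and wrapping back to the first element once the maximum is exceeded is also an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical generator for an array with runs of repeated values.
// Each value is emitted (repeat + 1) times before moving to the next one
// (assumption, chosen to match the 11 copies per value shown above).
std::vector<double> makeRepeatedArray(std::size_t numElements, double first,
                                      std::size_t repeat, double maxValue)
{
    std::vector<double> result;
    result.reserve(numElements);
    double value = first;
    std::size_t emitted = 0;  // copies of the current value written so far
    for (std::size_t i = 0; i < numElements; ++i) {
        result.push_back(value);
        if (++emitted > repeat) {
            emitted = 0;
            value += 1.0;
            // Assumption: start over once the maximum is exceeded.
            if (value > maxValue) value = first;
        }
    }
    return result;
}
```

Calling makeRepeatedArray(100, 1, 10, 100) gives nine runs of 11 equal values (1 through 9) followed by a single 10, for 100 elements in total. Long runs like these are what let the compressors reduce 800 bytes to a much smaller buffer.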

clientCodec

This shows the default bloscArgs.

mrk> bin/linux-x86_64/clientCodec 
pvStructure
structure 
    int compressedSize 0
    int decompressedSize 0
    int level 3
    enum_t compressor
        int index 0
        string[] choices ["blosclz", "lz4", "lz4hc", "snappy", "zlib", "zstd"]
    enum_t shuffle
        int index 0
        string[] choices ["NOSHUFFLE", "SHUFFLE", "BITSHUFFLE"]
    int threads 1
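
As a companion to the defaults shown above, the following sketch mirrors bloscArgs as a plain C++ struct and checks the ranges documented earlier. It does not use pvData; the struct BloscArgs and the function validateBloscArgs are hypothetical, and requiring at least one thread is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Choices in the same order as the enum_t fields of bloscArgs.
const std::vector<std::string> compressorChoices =
    {"blosclz", "lz4", "lz4hc", "snappy", "zlib", "zstd"};
const std::vector<std::string> shuffleChoices =
    {"NOSHUFFLE", "SHUFFLE", "BITSHUFFLE"};

// Hypothetical plain mirror of bloscArgs with the default values shown above.
struct BloscArgs {
    int compressedSize = 0;
    int decompressedSize = 0;
    int level = 3;
    int compressor = 0;  // index into compressorChoices
    int shuffle = 0;     // index into shuffleChoices
    int threads = 1;
};

// Returns an empty string when the arguments are acceptable,
// otherwise a short reason.
std::string validateBloscArgs(const BloscArgs & args)
{
    if (args.level < 0 || args.level > 9) {
        return "level must be between 0 and 9";
    }
    if (args.compressor < 0
        || static_cast<std::size_t>(args.compressor) >= compressorChoices.size()) {
        return "compressor index out of range";
    }
    if (args.shuffle < 0
        || static_cast<std::size_t>(args.shuffle) >= shuffleChoices.size()) {
        return "shuffle index out of range";
    }
    if (args.threads < 1) {
        return "threads must be at least 1";  // assumption, see the lead-in
    }
    return std::string();
}
```

A default-constructed BloscArgs passes validation, and compressorChoices[args.compressor] then yields "blosclz", matching the default record contents.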