Example 1
Here I request data from a restricted station. I did not supply a password file (and the default one was not present), and I ask the arclink_fetch command to save the result as OutPutFile:
    # arclink_fetch -v -u mbianchi@gfz-potsdam.de -a localhost:18001 -o OutPutFile req
    Warnings detected on the command line:
    Default password file (dcidpasswords.txt) not found
    requesting routing from localhost:18001
    launching request thread (st79:18001)
    st79:18001: request 407 ready
    the following data requests were sent:
    datacenter name: Bianchi
    request ID: 407, Label: , Type: WAVEFORM, Encrypted: True, Args: resp_dict=true compression=bzip2 format=MSEED
    status: READY, Size: 178624, Info:
    volume ID: BIA, dcid: BIA, Status: OK, Size: 178624, Encrypted: True, Info:
    request: 2010,1,1,10,0,0 2010,1,1,11,0,0 GE APE BHZ .
    status: OK, Size: 91648, Info:
    request: 2010,1,1,10,0,0 2010,1,1,11,0,0 GE BKB BHZ .
    status: OK, Size: 112640, Info:
    file is encrypted but no password supplied.
    saved file: OutPutFile.bz2.openssl
One important detail in this example is that arclink_fetch recognizes that the resulting file is encrypted (and compressed) and automatically appends a .openssl (here .bz2.openssl) suffix to the filename supplied. If the request generates multiple files that cannot be merged, it also adapts the names by adding the request ID on the ArcLink server and the dcid of the data center that generated the file. To decrypt the received file using the password DVfe}D&D, you would do:
    # openssl des-cbc -v -d -pass pass:"DVfe}D&D" -in OutPutFile.bz2.openssl -out OutPutFile.bz2
    bytes read : 178624
    bytes written: 178602
The difference in size arises because the encrypted file carries the salt header and the padding that the algorithm requires; it can simply be ignored.
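The 22-byte difference reported by openssl can be checked by hand. The sketch below assumes the layout described later on this page (an 8-byte Salted__ magic plus an 8-byte salt) and the standard block padding that openssl applies by default, which always adds between 1 and 8 bytes. The function name encrypted_size is only illustrative.

```python
HEADER_SIZE = 16  # 8-byte "Salted__" magic + 8-byte salt
BLOCK_SIZE = 8    # DES block size in bytes


def encrypted_size(plain_size):
    """Expected size of the encrypted file for a given plaintext size.

    Assumes standard block padding, which always adds between 1 and
    BLOCK_SIZE bytes so that the payload ends on a block boundary.
    """
    padded = (plain_size // BLOCK_SIZE + 1) * BLOCK_SIZE
    return HEADER_SIZE + padded


print(encrypted_size(178602))  # 178624
```

For the file above, 178602 bytes of compressed data receive 6 bytes of padding and the 16-byte header, giving the 178624 bytes that were read.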
Once decryption is done, decompress the file (if needed):
    # bunzip2 OutPutFile.bz2
Your file is now decrypted and decompressed, ready to use with rdseed or any other program that reads SEED files.
Example 2
If your arclink_fetch program supports encrypted files (versions later than 2011.221), all you need to do is create a file containing the passwords you have received. The password file has a very simple two-column format: the first column is the dcid identifier and the second is the password you received. In our case the dcid is BIA, as can be seen in the arclink_fetch output of the first example. To set up your file, you could do something like:
    # echo "BIA DVfe}D&D" >> dcidpasswords.txt
    # cat dcidpasswords.txt
    BIA DVfe}D&D
    # arclink_fetch -v -u mbianchi@gfz-potsdam.de -a localhost:18001 -o OutPutFile req
    requesting routing from localhost:18001
    launching request thread (st79:18001)
    st79:18001: request 409 ready
    the following data requests were sent:
    datacenter name: Bianchi
    request ID: 409, Label: , Type: WAVEFORM, Encrypted: True, Args: resp_dict=true compression=bzip2 format=MSEED
    status: READY, Size: 178624, Info:
    volume ID: BIA, dcid: BIA, Status: OK, Size: 178624, Encrypted: True, Info:
    request: 2010,1,1,10,0,0 2010,1,1,11,0,0 GE APE BHZ .
    status: OK, Size: 91648, Info:
    request: 2010,1,1,10,0,0 2010,1,1,11,0,0 GE BKB BHZ .
    status: OK, Size: 112640, Info:
    saved file: OutPutFile
This file was decrypted on the fly, so you can use it without any further processing.
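For client authors who want to read the same two-column password file, a minimal parsing sketch follows. It is not the parser used by arclink_fetch; the function name parse_password_file is hypothetical, and the choice to let a later entry for the same dcid override an earlier one is an assumption.

```python
def parse_password_file(text):
    """Map each dcid to its password from two-column text.

    Assumption: when a dcid appears more than once (for example after
    appending with >>), the last entry wins.
    """
    passwords = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first run of whitespace only
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError("expected 'dcid password', got: %r" % line)
        dcid, password = parts
        passwords[dcid] = password
    return passwords


print(parse_password_file("BIA DVfe}D&D\n"))
```

The resulting dictionary can then be queried with the dcid reported for each volume.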
Example 3
When your request involves data from different data centers, arclink_fetch splits the request. If some of the requests return encrypted volumes and no password is available, it cannot merge the volumes obtained, and the results are saved as separate files:
    # arclink_fetch -v -u mbianchi@gfz-potsdam.de -a localhost:18001 -o OutPutFile req
    Warnings detected on the command line:
    Default password file (dcidpasswords.txt) not found
    requesting routing from localhost:18001
    launching request thread (st79:18001)
    launching request thread (erde.geophysik.uni-muenchen.de:18001)
    st79:18001: request 415 ready
    erde.geophysik.uni-muenchen.de:18001: request 77109 ready
    the following data requests were sent:
    datacenter name: Bianchi
    request ID: 415, Label: , Type: WAVEFORM, Encrypted: True, Args: resp_dict=true compression=bzip2 format=MSEED
    status: READY, Size: 78792, Info:
    volume ID: BIA, dcid: BIA, Status: OK, Size: 78792, Encrypted: True, Info:
    request: 2010,1,1,10,0,0 2010,1,1,11,0,0 GE APE BHZ .
    status: OK, Size: 91648, Info:
    datacenter name: LMU
    request ID: 77109, Label: , Type: WAVEFORM, Encrypted: False, Args: resp_dict=true compression=bzip2 format=MSEED
    status: READY, Size: 97700, Info:
    volume ID: LMU, dcid: , Status: OK, Size: 97700, Encrypted: False, Info:
    request: 2011,1,1,10,0,0 2011,1,1,11,0,0 BW MANZ SHZ .
    status: OK, Size: 156160, Info:
    cannot merge volumes saving volumes as individual files
    file is encrypted but no password supplied.
    saved file: OutPutFile.415.BIA.bz2.openssl
    saved file: OutPutFile.77109.LMU
It returns the files OutPutFile.415.BIA.bz2.openssl and OutPutFile.77109.LMU. The first file is compressed and encrypted, but the second is a plain SEED file and ready to use.
Again, once the password file is supplied, arclink_fetch is able to merge the results into a single SEED file for us:
    # arclink_fetch -v -u mbianchi@gfz-potsdam.de -a localhost:18001 -o OutPutFile req
    requesting routing from localhost:18001
    launching request thread (st79:18001)
    launching request thread (erde.geophysik.uni-muenchen.de:18001)
    st79:18001: request 393 ready
    erde.geophysik.uni-muenchen.de:18001: request 77103 ready
    the following data requests were sent:
    datacenter name: Bianchi
    request ID: 393, Label: , Type: WAVEFORM, Encrypted: True, Args: resp_dict=true compression=bzip2 format=MSEED
    status: READY, Size: 78792, Info:
    volume ID: BIA, dcid: BIA, Status: OK, Size: 78792, Encrypted: True, Info:
    request: 2010,1,1,10,0,0 2010,1,1,11,0,0 GE APE BHZ .
    status: OK, Size: 91648, Info:
    datacenter name: LMU
    request ID: 77103, Label: , Type: WAVEFORM, Encrypted: False, Args: resp_dict=true compression=bzip2 format=MSEED
    status: READY, Size: 97700, Info:
    volume ID: LMU, dcid: , Status: OK, Size: 97700, Encrypted: False, Info:
    request: 2011,1,1,10,0,0 2011,1,1,11,0,0 BW MANZ SHZ .
    status: OK, Size: 156160, Info:
    saved file: OutPutFile
This results in a single file, OutPutFile.
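The suffixes seen in the three examples tell you what still has to be done with a saved file. The sketch below infers the remaining steps from the filename alone; it relies only on the .openssl and .bz2 suffixes shown in the outputs above, and the function name postprocessing_steps is hypothetical.

```python
def postprocessing_steps(filename):
    """List the steps still needed before the file is a usable SEED file.

    Assumes the naming seen in the examples: an optional ".bz2" suffix
    for compressed data, followed by an optional ".openssl" suffix for
    encrypted data.
    """
    steps = []
    if filename.endswith(".openssl"):
        steps.append("decrypt")
        filename = filename[:-len(".openssl")]
    if filename.endswith(".bz2"):
        steps.append("decompress")
    return steps


for name in ("OutPutFile.415.BIA.bz2.openssl", "OutPutFile.77109.LMU"):
    print(name, postprocessing_steps(name))
```

The order of the returned steps matters: decryption has to happen before decompression, exactly as in Example 1.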
For client developers
To support encryption in ArcLink, some changes in the behavior of the ArcLink server were necessary. Those changes did not affect the request submission procedure, only the delivery of the request.
The most visible changes are:
1) The ArcLink status XML was modified to include the dcid parameter and the encryption flag.
2) The product downloaded when requesting miniSeed or fullSeed files may be an encrypted file (if the encryption flag is set to true).
3) Clients should take care to provide a valid e-mail address, as it is used for password generation. New passwords are sent to the address given.
Attention: most importantly, before processing files received from the ArcLink server, clients should check whether those files are unencrypted SEED files or whether they must be decrypted first. Note again that encrypted files cannot be merged the way miniSeed files normally are.
Dcid and Encryption flags
Two important parameters were added to the ArcLink status XML: the dcid and the encryption parameters. They indicate, respectively, the data center that prepared the volume and whether the volume received by the client will be encrypted.
Your client should use these parameters to decide what to do with the downloaded file, i.e. which actions are necessary to handle the file correctly. The encryption parameter is defined both at the Volume level and at the Request level.
Volume Level
An encryption flag equal to True at the Volume level indicates that the volume contains restricted data and that it will be encrypted (in the case of the ArcLink server) or already is encrypted (in the case of an ArcLink proxy). In both cases, what matters is that you will end up with encrypted data if you download this volume.
To decrypt the resulting data, you need a password issued by the data center indicated by the dcid flag of this volume.
Request Level
An encrypted parameter equal to True at the Request level indicates that the request contains at least one encrypted volume, and that downloading the full request (all the volumes together) will trigger the encryption of all volumes in the request.
When the full request is downloaded, the ArcLink server tries to concatenate all the volumes inside the request and encrypt the result on the fly before sending it to you. A problem can occur if some volumes are already pre-encrypted (this is true for the ArcLink proxy): those volumes cannot be concatenated, and the full download of the request returns an Error.
Your client should expect this download error or, as a safety rule, always download the request volume by volume and concatenate the results (rebuilding full SEED volumes) once the files have been decrypted. This is the solution currently used by arclink_fetch.
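The volume-by-volume rule can be sketched as a small planning step. This is only an illustration, not the arclink_fetch implementation: the Volume tuple is a hypothetical in-memory view of the dcid and encryption flags from the status XML, and the action names are invented for this example.

```python
from collections import namedtuple

# Hypothetical in-memory view of one volume from the status XML
Volume = namedtuple("Volume", ["volume_id", "dcid", "encrypted"])


def plan_download(volumes, passwords):
    """Decide, per volume, what the client must do after download.

    Returns (actions, can_merge): actions maps volume_id to one of
    "use", "decrypt" or "keep-encrypted"; can_merge is True only when
    every volume ends up as plain data.
    """
    actions = {}
    for vol in volumes:
        if not vol.encrypted:
            actions[vol.volume_id] = "use"
        elif passwords.get(vol.dcid):
            actions[vol.volume_id] = "decrypt"
        else:
            actions[vol.volume_id] = "keep-encrypted"
    can_merge = all(a != "keep-encrypted" for a in actions.values())
    return actions, can_merge


volumes = [Volume("BIA", "BIA", True), Volume("LMU", "", False)]
print(plan_download(volumes, {}))
print(plan_download(volumes, {"BIA": "DVfe}D&D"}))
```

With the two volumes of Example 3, the plan without a password keeps the BIA volume encrypted and rules out merging, while the plan with the BIA password allows a single merged file.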
The encrypted file / How to decrypt
The ArcLink server generates a file that should be compatible with the openssl command-line tool. This tool expects a file that starts with the magic string Salted__, followed by the actual salt as an 8-byte binary value, and then the encrypted data with the necessary padding. Schematically:
    8 bytes    + 8 bytes    + Multiple of 8 bytes block of data
    [Salted__]   [ffffffff]   [<Encrypted data follow><Padding>]
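Reading this layout in a client could look like the sketch below, which splits a received file into its salt and its encrypted payload and rejects anything that does not follow the schema. The function name split_salted_file is hypothetical.

```python
MAGIC = b"Salted__"


def split_salted_file(data):
    """Split an openssl 'Salted__' file into (salt, ciphertext)."""
    if len(data) < 16:
        raise ValueError("too short to hold magic and salt")
    if data[:8] != MAGIC:
        raise ValueError("missing Salted__ magic")
    salt, ciphertext = data[8:16], data[16:]
    if len(ciphertext) % 8 != 0:
        raise ValueError("ciphertext is not a multiple of 8 bytes")
    return salt, ciphertext


# Synthetic example: magic, an 8-byte salt, two 8-byte blocks
salt, ciphertext = split_salted_file(MAGIC + b"\x01" * 8 + b"\x00" * 16)
print(len(salt), len(ciphertext))  # 8 16
```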
The salt is used together with the password to derive the key and the IV, which are the binary sequences actually used in the encryption process by the openssl routines. To derive them, we use the EVP_BytesToKey method from the OpenSSL EVP interface.
The actual decryption of the data blocks can be done using the EVP interface of the OpenSSL library. You should initialize an EVP context with the derived key and IV, call the update method for each block of data, and finish with the final method. For more information, consult the EVP man pages for EVP_DecryptInit, EVP_DecryptUpdate and EVP_DecryptFinal.
For those of you who use Python, one possibility is the python-m2crypto library, which we used for the arclink_fetch client. The code segments that deal with the encryption are shown below; they decrypt the file while it is being received from the server:
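The derivation can be illustrated without OpenSSL bindings. The sketch below follows the MD5-based scheme of EVP_BytesToKey with a single iteration, which is also what the arclink_fetch code on this page implements: each digest is computed over the previous digest, the password and the salt, and the concatenated digests are cut into key and IV. It uses Python 3 bytes and hashlib, and the function name derive_key_iv is hypothetical.

```python
import hashlib


def derive_key_iv(password, salt, key_len=8, iv_len=8):
    """MD5-based derivation in the style of EVP_BytesToKey (count=1).

    password and salt are bytes; the defaults match the 8-byte key and
    8-byte IV needed for single DES in CBC mode.
    """
    material = b""
    block = b""
    while len(material) < key_len + iv_len:
        # Each round hashes the previous digest, the password and the salt
        block = hashlib.md5(block + password + salt).digest()
        material += block
    return material[:key_len], material[key_len:key_len + iv_len]


key, iv = derive_key_iv(b"DVfe}D&D", b"\x00" * 8)
print(len(key), len(iv))  # 8 8
```

For DES, key and IV together need 16 bytes, which is exactly one MD5 digest, so the loop runs only once; the all-zero salt here is a placeholder, since the real salt comes from bytes 8 to 16 of the received file.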
    try:
        from M2Crypto import EVP, util
        hasM2Crypto = True
    except:
        hasM2Crypto = False

    ...

    class SSLWrapper:
        def __init__(self, password):
            if not hasM2Crypto:
                raise Exception("Module M2Crypto was not found on this system.")
            self._cypher = None
            self._password = None
            if password is None:
                raise Exception ('Password should not be Empty')
            else:
                self._password = password

        def update(self, chunk):
            if self._cypher is None:
                if len(chunk) < 16:
                    raise Exception('Invalid first chunk (Size < 16).')
                if chunk[0:8] != "Salted__":
                    raise Exception('Invalid first chunk (expected: Salted__')
                [key, iv] = self._getKeyIv(self._password, chunk[8:16])
                self._cypher = EVP.Cipher('des_cbc', key, iv, 0)
                chunk = chunk[16:]
            if len(chunk) > 0:
                return self._cypher.update(chunk)
            else:
                return ''

        def final(self):
            if self._cypher is None:
                raise Exception('Wrapper has not started yet.')
            return self._cypher.final()

        def _getKeyIv(self, password, salt=None, size=8):
            chunk = None
            key = ""
            iv = ""
            while True:
                hash=EVP.MessageDigest('md5')
                if (chunk is not None):
                    hash.update(chunk)
                hash.update(password)
                if (salt is not None):
                    hash.update(salt)
                chunk = hash.final()
                i = 0
                if len(key) < size:
                    i = min(size - len(key), len(chunk))
                    key += chunk[0:i]
                if len(iv) < size and i < len(chunk):
                    j = min(size - len(iv), len(chunk) - i)
                    iv += chunk[i:i+j]
                if (len(key) == size and len(iv) == size):
                    break
            return [key,iv]
After defining this class, you can pass the first block of data received from the ArcLink server through a method that prepares the decryptor. You then apply the decryptor to each chunk of data received from the server, so that the data is decrypted automatically.
    ...

    def __getDecryptor(self, buf, password):
        try:
            SSL = None
            status = False
            if buf is None or len(buf) < 8:
                raise Exception("supplied Buffer smaller than 8, cannot find out encryption.")
            if buf[0:8] == "Salted__":
                status = True
                if password is None or password == "":
                    raise Exception('file is encrypted but no password supplied.')
                SSL = SSLWrapper(password)
        except Exception, e:
            logs.info(str(e))
        finally:
            return (SSL, status)
On download:
    decryptor = None
    firstBlock = True
    while bytes_read < size:
        buf = self.__fd.read(min(BLOCKSIZE, size - bytes_read))
        bytes_read += len(buf)
        if firstBlock:
            firstBlock = False
            (decryptor, encStatus) = self.__getDecryptor(buf, password)
            if decryptor is not None:
                buf = decryptor.update(buf)
        else:
            if decryptor is not None:
                buf = decryptor.update(buf)
        outfd.write(buf)
    if decryptor is not None:
        buf = decryptor.final()
        outfd.write(buf)
You can find more information on:
- http://www.openssl.org/
- http://chandlerproject.org/bin/view/Projects/MeTooCrypto
- http://tldp.org/LDP/LGNET/87/vinayak.html
- http://en.wikipedia.org/wiki/Data_Encryption_Standard
- http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation#Cipher-block_chaining_.28CBC.29
FAQ
Here we collect some comments on common failures found when using the encryption layer:
1) I am using an older version of arclink_fetch to get data from an encryption-enabled server. It crashes with an error like:
    Traceback (most recent call last):
      File "/usr/lib64/python2.7/threading.py", line 530, in __bootstrap_inner
        self.run()
      File "/home/pevans/seiscomp3/lib/python/seiscomp/arclink/manager.py", line 252, in run
        self.__req.download_data(fd, True, False)
      File "/home/pevans/seiscomp3/lib/python/seiscomp/arclink/manager.py", line 153, in download_data
        decomp=self.args.get("compression"))
      File "/home/pevans/seiscomp3/lib/python/seiscomp/arclink/client.py", line 311, in download_data
        buf = z.decompress(zbuf)
    IOError: invalid data stream
- Reason
- This is a known bug in older versions of arclink_fetch, which blindly try to decompress the data received from the ArcLink server.
- Solution
- Please update to a newer version, or modify line 253 (in version 2011.136) of the file arclink_fetch.py to disable the request for compressed data.
Change line 253 from:
req_args = {"compression": "bzip2"}
to this:
req_args = {}
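Client authors can avoid this class of failure by inspecting the first bytes of the received data before decompressing. The following sketch is not the fix applied in arclink_fetch; it only illustrates the check, using the Salted__ magic described on this page and the standard bzip2 magic BZh. The function name maybe_decompress is hypothetical.

```python
import bz2


def maybe_decompress(data):
    """Decompress bzip2 data, leaving encrypted or plain data alone."""
    if data[:8] == b"Salted__":
        # Encrypted: must be decrypted before any decompression
        return data
    if data[:3] == b"BZh":
        return bz2.decompress(data)
    return data


compressed = bz2.compress(b"seed data")
print(maybe_decompress(compressed))
```

Encrypted input is returned unchanged, so the caller can decrypt it first and then pass the result through the same function again.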
Test server
To help the development of new clients and the adaptation of existing ones, we are running at GEOFON a preview version of the new server with encryption support enabled. The server information is:
- Machine: webdc.eu
- Port: 36000
On this server we loaded the metadata from the GE network. We overrode the routing of all stations, giving a primary route to our encryption server and a secondary route to webdc.eu:18001, which will be routed wrongly to the encryption server at webdc.eu:36000. The station APE is defined as restricted on this server, but BKB is not. Also, please note that only the following time spans of data are available on the webdc.eu:36000 server:
- APE BHZ: 2010,001,00h00m20s to 2010,001,23h59m44s
- BKB BHZ: 2010,001,00h00m14s to 2010,001,23h59m54s
If you ask for anything outside those time spans, you will get a NODATA error.
Important: please contact me at mbianchi at gfz-potsdam dot de to say that you want access to the APE station on this server. After that, your first request should trigger an email from the server containing your password, which is needed to decrypt files from this test server.
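When writing tests against this server, it can help to check a request window locally before sending it. The sketch below hard-codes the two spans listed above (day 001 of 2010 is 1 January 2010). It assumes that any overlap with the available span is enough to obtain data, which this page does not confirm; the names AVAILABLE and has_data are hypothetical.

```python
from datetime import datetime

# Spans listed above; day 001 of 2010 is 1 January 2010
AVAILABLE = {
    ("APE", "BHZ"): (datetime(2010, 1, 1, 0, 0, 20),
                     datetime(2010, 1, 1, 23, 59, 44)),
    ("BKB", "BHZ"): (datetime(2010, 1, 1, 0, 0, 14),
                     datetime(2010, 1, 1, 23, 59, 54)),
}


def has_data(station, channel, start, end):
    """True if the requested window overlaps the available span.

    Assumption: partial overlap is treated as sufficient.
    """
    span = AVAILABLE.get((station, channel))
    if span is None:
        return False
    return start < span[1] and end > span[0]


# The window used in the examples: 2010,1,1,10,0,0 to 2010,1,1,11,0,0
print(has_data("APE", "BHZ", datetime(2010, 1, 1, 10), datetime(2010, 1, 1, 11)))
```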
A final note on this server:
This is a test server; do not use it for anything other than testing your new implementation. The data it distributes may be incorrect and should not be used for any real work.