Encoding user information into a numeric format

Hi all,
is there an algorithm out there that takes in a plain text message and returns a (short) string of integers? This would also need to be a reversible process. I am not looking for a strong ciphering algorithm, but rather a method to encode user information into a simple license number. The idea is to have this license number included in a digitally signed license file, from which I could then extract the user information to be displayed on a splash screen.
As such, the encoded string of numbers should not be very long, at most 10-15 digits, regardless of the length of the plain text message.
Thanks in advance,
Tero

2 Comments

Rik on 3 May 2019
Edited: Rik on 3 May 2019
You could use a hash, or even part of a hash, but that will not be easily reversible. Also, if you only encode 10 decimal digits, you can store only 10^10 distinct values. That sounds like a lot, but with a 26-letter alphabet it allows only 7 characters to be stored (26^7 < 10^10 < 26^8).
Reversible hashes are tricky to find.
We have to be very careful about how we talk about this. Your tags mentioned Caesar cipher which we cannot talk about for legal reasons. But we can talk about encoding data, and in one of those quirks of bad law we can talk about authentication methods.
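The capacity estimate above (10 decimal digits versus a 26-letter alphabet) can be checked with a few lines of MATLAB. This is only a minimal sketch: the sample word and the restriction to the uppercase letters A..Z are assumptions made for illustration, and the decoder must know the text length in advance, because leading 'A's map to zero.

```matlab
% How many symbols of a 26-letter alphabet fit into 10 decimal digits?
nDigits  = 10;
alphabet = 26;
maxChars = floor(nDigits * log(10) / log(alphabet));

% Reversible packing of a short uppercase word into one integer (base 26).
txt     = 'MATLABS';                        % hypothetical 7-letter sample
vals    = double(txt) - double('A');        % map A..Z to 0..25
weights = alphabet .^ (numel(vals)-1:-1:0); % positional weights
code    = sum(vals .* weights);             % below 26^7, so at most 10 digits

% Unpacking: repeated division by the alphabet size.
rest = code;
back = zeros(1, numel(vals));
for k = numel(vals):-1:1
    back(k) = mod(rest, alphabet);
    rest    = floor(rest / alphabet);
end
decoded = char(back + double('A'));
```

Anything longer than 7 letters, or any richer alphabet (lowercase, digits, spaces), no longer fits into 10 digits with this scheme.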

 Accepted Answer

Take the data and LZ-compress it. You can use a Java gzip method. Then base64-encode it. The result goes into your license file.
At runtime, base64-decode it, unzip it in memory, and extract the data.
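A minimal sketch of this round trip, done purely in memory, might look as follows. The sample text and all variable names are hypothetical. It assumes that MATLAB's JVM is available and that matlab.net.base64encode and matlab.net.base64decode exist in your release. The byte-by-byte read loop is slow, but should be acceptable for a few hundred bytes of license data.

```matlab
txt   = sprintf('Customer: Karl Heinz\nCompany: Example GmbH');  % hypothetical data
bytes = unicode2native(txt, 'UTF-8');

% Compress in memory with gzip.
byteStream = java.io.ByteArrayOutputStream();
gzipStream = java.util.zip.GZIPOutputStream(byteStream);
gzipStream.write(bytes, 0, numel(bytes));
gzipStream.close();
packed = typecast(byteStream.toByteArray(), 'uint8').';   % int8 column -> uint8 row

% This string would go into the license file.
licenseField = matlab.net.base64encode(packed);

% At runtime: base64-decode and decompress in memory.
packedBack = matlab.net.base64decode(licenseField);
inStream   = java.util.zip.GZIPInputStream( ...
    java.io.ByteArrayInputStream(packedBack));
restored = zeros(1, 0, 'uint8');
b = inStream.read();              % returns -1 at the end of the stream
while b ~= -1
    restored(end + 1) = b;
    b = inStream.read();
end
inStream.close();
txtBack = native2unicode(restored, 'UTF-8');
```

For short inputs the gzip header and trailer can make licenseField longer than the original text, so it is worth comparing numel(licenseField) with numel(txt) for realistic data.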

5 Comments

Is this approach expecting to have the .gz file available somewhere in the end user's computer? This is how I understood it after trying it out.
No, there are java gzip methods that work purely in memory.
This method works fine for me and is not too difficult to implement. Thank you. I'm RSA encrypting some 128-character SHA-512 hashes anyway, so the length of this compressed data doesn't actually matter that much.
The Java methods that Jan showed are a cleaner implementation. However, gzip to an external file is certainly an easier implementation.
Well, to be honest, I'm not really sure which solution I'm following... I took the CompressLib from the File Exchange and base64-encoded the result. This works :)

More Answers (1)

Jan on 3 May 2019
Edited: Jan on 4 May 2019
No, this cannot work. You can compress the text, e.g. in ZIP format, but this cannot guarantee a length of 10 to 15 bytes for the output. It would be pure magic if text could be compressed so efficiently in a reversible way.
You can create a text with the full user information and use an HMAC key to get a unique and secure vector:
Key = 'Hello,this is my secret key#3.14*';
Msg = sprintf(['Customer: Karl Heinz aus Rostock\n', ...
'Karlsonstraße 27']);
Key = uint8(Key(:)); % Same variable names as used in the code below
Msg = uint8(Msg(:));
Method = 'MD5';
BlockSize = 64; % 64 for: MD5, SHA-1, SHA-256, 128: SHA-384, SHA-512
Engine = java.security.MessageDigest.getInstance(Method);
% Hash the key if it is longer than the block size:
if length(Key) > BlockSize % Alternatively: In every case
    Engine.update(uint8(Key));
    Key = typecast(Engine.digest, 'uint8');
end
% Padding
KeySize = numel(Key);
ipad(1:BlockSize) = uint8(54); % 0x36
ipad(1:KeySize) = bitxor(uint8(54), Key);
opad(1:BlockSize) = uint8(92); % 0x5c
opad(1:KeySize) = bitxor(uint8(92), Key);
% Calculate the hash:
Engine.update(ipad);
Engine.update(uint8(Msg)); % Fails for empty Msg!
iHash = typecast(Engine.digest, 'uint8');
Engine.update(opad);
Engine.update(iHash);
HMAC = typecast(Engine.digest, 'uint8');
% Output:
HMAC = reshape(HMAC, 1, []); % As UINT8 vector
% Or Hex: HMAC = sprintf('%.2x', double(HMAC));
% Or base64: HMAC = matlab.net.base64encode(HMAC);
% Or B64 = org.apache.commons.codec.binary.Base64;
% HMAC = char(B64.encode(HMAC)).';
% Shorter: as base64, then: HMAC(HMAC == '=') = [];
With the shortened base64 encoding this gives you 22 characters. You cannot recreate the original message from it, but with your secret key you can prove that the HMAC belongs to this specific text file and that the text file has not been modified.
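As a cross-check of the manual construction, the same kind of tag can be computed and verified with the JVM's built-in javax.crypto.Mac class. This is only a sketch and not the method announced for the FileExchange: the variable names are hypothetical, the message is written with "ss" instead of "ß" to stay independent of the file encoding, and it assumes that MATLAB's JVM and matlab.net.base64encode are available.

```matlab
Key = unicode2native('Hello,this is my secret key#3.14*', 'UTF-8');
Msg = unicode2native(sprintf( ...
    'Customer: Karl Heinz aus Rostock\nKarlsonstrasse 27'), 'UTF-8');

% HMAC-MD5 through the Java crypto API.
Algo   = 'HmacMD5';
Signer = javax.crypto.Mac.getInstance(Algo);
Signer.init(javax.crypto.spec.SecretKeySpec(Key, Algo));
Tag = typecast(Signer.doFinal(Msg), 'uint8').';   % 16 bytes for MD5

% Short printable form: base64 without the '=' padding.
Short = matlab.net.base64encode(Tag);
Short(Short == '=') = [];

% Verification: recompute the tag with the secret key and compare.
Check       = typecast(Signer.doFinal(Msg), 'uint8').';
IsAuthentic = isequal(Tag, Check);

% A modified message must give a different tag.
Tampered        = [Msg, uint8('!')];
TamperedTag     = typecast(Signer.doFinal(Tampered), 'uint8').';
IsTamperedValid = isequal(Tag, TamperedTag);
```

The check only works where the secret key is available, so the key has to be embedded in the application that verifies the license text.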
I'm going to publish this method in the FileExchange soon.
By the way, the method for zipping some data:
import java.io.*;
import java.util.zip.*;
Msg = sprintf(['Customer: Karl Heinz aus Rostock\n', ...
'Karlsonstraße 27']);
ByteData = uint8(Msg(:)); % [EDITED, not TYPECAST()]
ByteStream = ByteArrayOutputStream();
ZIPStream = ZipOutputStream(ByteStream);
ZIPStream.setLevel(9);
entry = ZipEntry('Value');
entry.setSize(numel(ByteData));
ZIPStream.putNextEntry(entry);
ZIPStream.write(ByteData);
ZIPStream.closeEntry();
ZIPStream.close();
Byte = ByteStream.toByteArray();
You get 172 bytes for this input with 49 characters. ZIP has more advantages for longer input: For Msg = repmat(Msg, 1, 10) the output has 177 bytes, just 5 bytes more.
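Reading such an archive back works in memory as well. The following is a minimal sketch under a few assumptions: the variable names are hypothetical, the message uses "ss" instead of "ß" to stay independent of the file encoding, and the entry is read one byte at a time, which is only reasonable for short data.

```matlab
% Build a small in-memory ZIP archive, similar to the code above.
Msg      = sprintf('Customer: Karl Heinz aus Rostock\nKarlsonstrasse 27');
ByteData = unicode2native(Msg, 'UTF-8');
ByteStream = java.io.ByteArrayOutputStream();
ZIPStream  = java.util.zip.ZipOutputStream(ByteStream);
ZIPStream.putNextEntry(java.util.zip.ZipEntry('Value'));
ZIPStream.write(ByteData, 0, numel(ByteData));
ZIPStream.closeEntry();
ZIPStream.close();
Zipped = typecast(ByteStream.toByteArray(), 'uint8').';   % int8 column -> uint8 row

% Read the single entry back.
InStream = java.util.zip.ZipInputStream( ...
    java.io.ByteArrayInputStream(Zipped));
InStream.getNextEntry();          % position the stream at the entry 'Value'
Restored = zeros(1, 0, 'uint8');
b = InStream.read();              % returns -1 at the end of the entry
while b ~= -1
    Restored(end + 1) = b;
    b = InStream.read();
end
InStream.close();
MsgBack = native2unicode(Restored, 'UTF-8');
```

Comparing numel(Zipped) with numel(ByteData) shows how much of the output is container overhead for such a short input.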
I assume that the HMAC together with the clear text message in a text file is better suited to your needs.

5 Comments

thanks, I need some time to figure this out :)
Just to point out that I already possess a powerful asymmetric RSA encryption, which is used to digitally sign my license file. I can be sure nobody tampers with it.
My original intent was to find a magical algorithm that would use some clever math to encode basic user data like company name and end user's name into some length of a (numerical) string. I'm not that picky about the length, but it has to physically fit into the splash screen - kind of the same what's seen on Matlab splash while starting.
You probably are not going to do better than 4:1 compression and maybe not even that good.
@Tero: Does this mean that the java.util.zip or gzip methods solve your problem? Whether the numerical vector fits on the splash screen depends on how you display it: As a gray scale block? As UINT8 or base64 encoded? Which detail of Matlab's splash screen do you mean?
What about displaying the name and the company of the customer in clear text on the splash screen? This is the standard method in many software products.
Perhaps, I just need to test and see how it works - this is new stuff for me.
I'm doing a simple text-to-bitmap conversion, which I attach to the .png splash to display some user data (on the Matlab splash the license number is displayed, among other things). I have the "clean" splash image data embedded in my standalone, which gets updated based on the data extracted from the license file.
And like you said, I will be displaying the names etc. in clear text. I was just looking for some ways not needing to write all that data into the license file as is.
So the point of all this is to have only one standalone that everybody can download and install, while the license file is user specific, and the standalone can use the data from that file without the need to compile a new version every time.
But anyway, I could also just display a license number and keep a record of the owners myself.
Put some sample data into a text file without any headers but with a field delimiter, even if it is only a newline. Use a short file name and run gzip -9 on it. Also gzip an empty file whose name has the same length. Subtract the two lengths to get an estimate of the encoded length needed for in-memory encoding with the Java methods Jan showed. If it is significantly longer than you are willing to store in the license file, then you are out of luck. If it gets close, it is time to start testing with the Java code.
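The same estimate can be sketched directly in MATLAB without touching the file system. This is a rough approximation of the procedure above, not a replacement for it: the sample record is hypothetical, GZIPOutputStream uses its default compression level rather than -9, and an in-memory stream stores no file name, so the subtraction only removes the fixed gzip header and trailer.

```matlab
% Sample record and an empty input for comparison (both hypothetical).
samples = {unicode2native(sprintf('Example GmbH\nJane Doe'), 'UTF-8'), ...
           zeros(1, 0, 'uint8')};
sizes = zeros(1, numel(samples));

for k = 1:numel(samples)
    byteStream = java.io.ByteArrayOutputStream();
    gzipStream = java.util.zip.GZIPOutputStream(byteStream);
    data = samples{k};
    if ~isempty(data)             % nothing to write for the empty input
        gzipStream.write(data, 0, numel(data));
    end
    gzipStream.close();
    sizes(k) = byteStream.size(); % compressed length in bytes
end

overhead = sizes(2);              % fixed gzip cost for empty input
payload  = sizes(1) - sizes(2);   % estimate of the encoded data itself
rawBytes = numel(samples{1});     % uncompressed length for comparison
```

If payload is not clearly smaller than rawBytes for realistic records, compression does not buy anything for this kind of data.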

Asked:

on 3 May 2019

Commented:

on 6 May 2019
