# Steganography
Steganography means hiding secret [information](information.md) within some unrelated data by embedding it in a way that's very hard to notice; for example it is possible to hide text messages in a digital photograph by slightly modifying the colors of the image pixels -- that photo then looks just like an innocent picture while in fact bearing extra information for those who know it's there and can read it. Steganography differs from [encryption](encryption.md) by trying to avoid even the suspicion of secret communication.

There are many uses of steganography, for example secret communication, bypassing [censorship](censorship.md) or secretly tracking a piece of digital media with an invisible [watermark](watermark.md) (game companies have used steganography to identify which tester's game client was used to leak pre-release footage of their games). [Cicada 3301](cicada.md) has famously used steganography in its puzzles.

Steganography may need to take into account the possibility of the data being slightly modified, for example pictures exchanged on the [Internet](internet.md) lose quality due to repeated [compression](compression.md), cropping and format conversions. Robust methods may be used to preserve the embedded information even in these cases.

Some notable methods and practices of steganography include:

- Embedding in **text**, e.g. making intentional typos in certain places, using extra whitespace or zero-width characters, modifying formatting and case or using [Unicode](unicode.md) homoglyphs can all carry information.
- Embedding in **images**. One of the simplest methods is storing data in the least significant bits of pixel values (a change the human eye won't notice). Advanced methods may e.g. modify statistical properties of the image such as its color [histogram](histogram.md).
- Embedding in **sound**, **video**, vector graphics and all other kinds of media is possible.
- All kinds of data can be embedded given enough storage capacity of the carrying medium (e.g. it is possible to store an image in text, sound in another sound etc.).
- Information that's present but normally random or unimportant can be used for embedding, e.g. the specific order of items in a list (its [permutation](premutation.md)) can bear information, as can the length of time delays in timed data, the amount of noise in data etc.

The following two pictures each encode a different text, which is written under the picture. (The encoding method as well as the whole code are given further below.)
```
-,~.',.
'.,, -.....~.
_..-, .-.'.,~>"
,,-.,-.'',,-.!$$r><l+
'':~:~:::'.,.'.,,-.,+$$Ofls5'
.~:~''':::~:;:_::;_.,',,_:s"(!s^x}$;
-.~JJ<;.'::;~;;_::_::::_;<+F!v!r{888
-.<#O#$$ezrslll)l+lfr}{V$88#$!
'~+588#O$8$O8$8#O$#$O5,
,sOO#$O8$8#88#.
O8$$ $Opa{a{^x_
$@ M$ 8W M$ eFc>!vs:;sJ:/, , -:_!\
$@ @8 8@ @# ^esCc+//s^;_^c+cc+cc+cc+cCi
88#8 8# 88 #exoCC+cc>^^s^^s^^s^^s^^s//>/c+c/
#8 8#88 #VxsFC)CC+cc+cc+cc+//+cc+cc+cc)/
8#88 #8Vi^^oFFoFFoFFoFFoF^oFFoFFoFFs
8@ W8 8#88 s88#Ve}xxixxiee}ee}eeixxi^^_
88#88# 8@- 8@ #8 8#88#88#88#88#88#88#88#x
8W @8 #@, #88#8 ,#88#88#88#88#88#88#88)
88#8 8W ,8# ,V#88#88#88#88#88#^
8W 88 -C8#88#88#8^-
,8#8,-^Vie,
_;;oF^#88+c,-cc),
,>,, -,/>/,
sCC)c/s^/s,
,_,,
```

*The ability of accurate observation is often called cynicism by those who have not got it.*
```
-,~.,-,
,-.. '.'.,,_.
;-'.' ,,-.,':/+
'.,,,-'.,.,-,^#$r\<("
,,;;;__;~.',...'.,,,+$#$flsV,
-~:;.',:~:;;_~::~~:.'.'._~!cls!!x}$;
',;cc/_,,_;;_;;_;;_;;_;;_/co^^sxe#88
-,/#88#8eix^)CC)Cc)Fx}eV#88#8^
-;cb88#88#88#88#88#88b,
,s88#88#88#88#,
8#88 #8V}ee}^x_
8@ W8 8W @8 }Fc>^^s;;sc;>, , -;;s/
8W @8 #@ @# ^esCc+//s^;_^c+cc+cc+cc+cCi
88#8 8# 88 #exoCC+cc>^^s^^s^^s^^s^^s//>/c+c/
#8 8#88 #VxsFC)CC+cc+cc+cc+//+cc+cc+cc)/
8#88 #8Vi^^oFFoFFoFFoFFoF^oFFoFFoFFs
8@ W8 8#88 s88#Ve}xxixxiee}ee}eeixxi^^_
88#88# 8@- 8@ #8 8#88#88#88#88#88#88#88#x
8W @8 #@, #88#8 ,#88#88#88#88#88#88#88)
88#8 8W ,8# ,V#88#88#88#88#88#^
8W 88 -C8#88#88#8^-
,8#8,-^Vie,
_;;oF^#88+c,-cc),
,>,, -,/>/,
sCC)c/s^/s,
,_,,
```

*To be or not to be. That is the question.*
## Example Code

Here is a quite basic [C](c.md) program that hides text in ASCII grayscale pictures (it was used to generate the examples above):

```
#include <stdio.h>

const char alphabet[] = // our 6 bit alphabet, position = code
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .";

const char groups[] = // newline separated groups of similar brightness chars
  "@WM0" "\n" "Q&%R" "\n" "8#O$" "\n" "B69g" "\n"
  "NUmP" "\n" "w3XD" "\n" "Vbp5" "\n" "d24S" "\n"
  "qkGK" "\n" "EHAZ" "\n" "hY[T" "\n" "e}a{" "\n"
  "]y17" "\n" "Fofu" "\n" "n?Ij" "\n" "C)(l" "\n"
  "xizr" "\n" "^sv!" "\n" "t*=L" "\n" "/>\\<" "\n"
  "c+J\"" "\n" ";_~:" "\n" ",-'." "\n";

unsigned char toAlphabet(unsigned char c) // encodes char to 6 bit alphabet
{
  for (unsigned char i = 0; i < 64; ++i)
    if (alphabet[i] == c)
      return i;

  return 63;
}

unsigned char fromAlphabet(unsigned char c) // decodes char from 6 bit alphabet
{
  return alphabet[c & 0x3f];
}

int canEncode(char c) // says if specific visual char can be used for encoding
{
  if (c == '\n')
    return 0;

  for (int i = 0; i < sizeof(groups) - 1; ++i)
    if (groups[i] == c)
      return 1;

  return 0;
}

const char *seekGroup(char c) // helper, seeks to similar brightness group
{
  const char *s = groups;

  while (c != *s)
    s++;

  while (*s != '\n')
    s++;

  s--;

  return s;
}

char encode(char c, int n) // encodes value n in given encodable char
{
  const char *s = seekGroup(c);

  while (n)
  {
    s--;
    n--;
  }

  return *s;
}

int decode(char c) // decodes value n from given encodable char
{
  int n = 0;

  const char *s = seekGroup(c);

  while (*s != c)
  {
    s--;
    n++;
  }

  return n;
}

int main(int argc, char **argv)
{
  unsigned char currentChar = 0;
  int pos = 0, done = 0;

  while (1) // read all chars from input
  {
    int c = getchar();

    if (c == EOF)
      break;

    if (argc < 2)
    {
      // decoding text from image

      if (canEncode(c))
      {
        currentChar = (currentChar << 2) | decode(c);

        if (pos % 3 == 2)
        {
          putchar(fromAlphabet(currentChar));
          currentChar = 0;
        }

        pos++;
      }
    }
    else
    {
      // encoding text into image

      if (canEncode(c))
      {
        unsigned char c2 = !done ? argv[1][pos / 3] : 0;

        if (!c2)
        {
          done = 1;
          c2 = ' ';
        }

        c2 = (toAlphabet(c2) >> ((2 - pos % 3) * 2)) & 0x03;
        c = encode(c,c2);
        pos++;
      }

      putchar(c);
    }
  }

  return 0;
}
```

Usage is as follows: make a file with a grayscale [ASCII art](ascii_art.md) picture, then pass it to the standard input of this program and give the text you want to encode as the first argument, for example: `cat picture.txt | ./program "hello"`. The maximum length of the text is given by the count of usable characters in the input image (each symbol of the text takes three of them). The program will print out the image with the text embedded in it. To read the text back from the image, pass the picture to the program's input without any arguments, for example: `cat picture2.txt | ./program`. The text will be written to the terminal.

The method used is this: firstly, for the encoded message we use our own 6 bit alphabet -- this only allows us to represent 64 symbols (which we have chosen to be lowercase and uppercase letters, digits, space and period; any other character gets encoded as a period) but, needing fewer bits per symbol than 8 bit ASCII, lets us store more symbols in the same picture. Each 6 bit symbol of our alphabet is encoded as three bit pairs (3 * 2 = 6). One bit pair is encoded in one ASCII art character by altering that character slightly -- we define groups of ASCII characters that have similar brightness. Each of these groups consists of 4 characters (e.g. `@WM0` is the group of darkest characters), so a character can be used to encode 2 bits (one bit pair of the encoded symbol). The code counts positions from the end of the group: the last character in the group encodes `00`, the second to last `01` etc. However not all ASCII art characters can be used for encoding, for example space (` `) has no similar brightness characters, so these are just skipped.