Base64 encoding library for Arduino
Base64 is an encoding system that uses 64 symbols, grouped into messages whose length is a multiple of four. When the encoded information is shorter, these messages (data packets) are completed with an additional padding symbol, usually the equal sign (=), so 65 symbols are used in total.
With 64 symbols you can use the 10 digits and the uppercase and lowercase letters (26+26) of the ASCII code. The problem is that this gives only 62 unambiguous symbols, so two more are needed, and these vary between implementations. Although the digits and letters are sometimes loosely called the "printable ASCII characters", the truly printable characters are the 95 that range from code 32 (space) to code 126 (~).
The most widely used Base64 implementation, that of PEM, which is also used by MIME, works with the extra signs "+" and "/" and uses the "=" sign as padding so that packets have a length that is a multiple of four. The letters A-Z occupy positions 0-25, the letters a-z occupy positions 26-51, the digits 0-9 occupy positions 52-61, the plus sign (+) occupies position 62, and the slash (/) occupies position 63.
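As a minimal sketch of that table, the following function maps a 6-bit value to its symbol in the PEM/MIME alphabet. The name base64_symbol is an assumption made for this illustration; it is not part of the library shown later, which uses a lookup array instead.

```cpp
#include <stdint.h>

// Hypothetical helper (not part of the library): returns the PEM/MIME
// Base64 symbol for a 6-bit value. The caller is assumed to pass 0-63.
char base64_symbol(uint8_t value)
{
    if (value < 26) return (char)('A' + value);        // 0-25  -> A-Z
    if (value < 52) return (char)('a' + (value - 26)); // 26-51 -> a-z
    if (value < 62) return (char)('0' + (value - 52)); // 52-61 -> 0-9
    return (value == 62) ? '+' : '/';                  // 62 -> '+', 63 -> '/'
}
```

A lookup array costs 64 bytes of memory but avoids the comparisons; on a microcontroller either choice may be reasonable.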
To represent data in Base64, groups of 6 bits are taken from the original data and each group is represented with the corresponding symbol. If the last group has fewer than 6 bits, it is filled with zeros on the right. If the resulting number of symbols is not a multiple of four, the message is filled with equal signs on the right.
The following image shows the ASCII encoding of a text ("ohmio") and how it is converted to Base64. Since the result has 7 symbols, the final message must be completed with one equal sign at the end. The ASCII text "ohmio" is therefore equivalent to "b2htaW8=" in Base64.
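The procedure can be sketched for a complete buffer held in memory, which is simpler than encoding byte by byte as the information arrives. The function name encode_base64 and the ALPHABET table are assumptions made for this illustration only.

```cpp
#include <stddef.h>
#include <stdint.h>

static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical whole-buffer encoder, for illustration only.
// `out` must have room for 4 * ((length + 2) / 3) + 1 characters.
// Returns the number of characters written, not counting the final '\0'.
size_t encode_base64(const uint8_t *in, size_t length, char *out)
{
    size_t written = 0;
    for (size_t i = 0; i < length; i += 3)
    {
        size_t remaining = length - i; // bytes left for this group
        // Pack up to three bytes into 24 bits; missing bytes count as zeros.
        uint32_t group = (uint32_t)in[i] << 16;
        if (remaining > 1) group |= (uint32_t)in[i + 1] << 8;
        if (remaining > 2) group |= in[i + 2];
        // Split the 24 bits into four 6-bit codes, padding with '=' where
        // the original data ran out.
        out[written++] = ALPHABET[(group >> 18) & 0x3F];
        out[written++] = ALPHABET[(group >> 12) & 0x3F];
        out[written++] = (remaining > 1) ? ALPHABET[(group >> 6) & 0x3F] : '=';
        out[written++] = (remaining > 2) ? ALPHABET[group & 0x3F] : '=';
    }
    out[written] = '\0';
    return written;
}
```

Calling encode_base64 with the five bytes of "ohmio" produces the eight characters "b2htaW8=": seven symbols plus one equal sign of padding.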
Specific uses of Base64 encoding usually also impose a maximum line length. The MIME implementation limits each line to 76 characters. Lines are normally separated by a carriage return (CR, value 0x0D in ASCII) followed by a line feed (LF, ASCII code 0x0A).
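A minimal sketch of that line-length rule, assuming the Base64 characters are already available in a buffer: the function below copies them and inserts CR LF after every 76 characters. The name wrap_mime_lines is hypothetical.

```cpp
#include <stddef.h>

// Hypothetical helper: copies `length` Base64 characters from `in` to `out`,
// inserting CR LF after every 76th character.
// `out` needs room for length + 2 * (length / 76) + 1 characters.
// Returns the number of characters written, not counting the final '\0'.
size_t wrap_mime_lines(const char *in, size_t length, char *out)
{
    const size_t LINE_LENGTH = 76;
    size_t written = 0;
    for (size_t i = 0; i < length; i++)
    {
        out[written++] = in[i];
        if ((i + 1) % LINE_LENGTH == 0)
        {
            out[written++] = '\r'; // CR, 0x0D
            out[written++] = '\n'; // LF, 0x0A
        }
    }
    out[written] = '\0';
    return written;
}
```

Because 76 is a multiple of four, a line break never falls inside a four-symbol packet.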
The added difficulty of implementing Base64 encoding on a device with few resources, as is often the case with a microcontroller, is that you have to encode the information as it arrives, or with a minimal buffer. This also requires some way of indicating that the end of the original message has been reached, for example by adding a special code or by using a pin whose level (synchronized with reception) indicates the status of the message.
The example code below is a library for Arduino that encodes in Base64 following both criteria: it encodes the information as it arrives (without a buffer) and waits for a signal to finish.
//base64.h
#ifndef BASE64_H
#define BASE64_H
#include <string.h> // memcpy
#define LONGITUD_LINEA 76 // Maximum line length (MIME)
#define MASCARA_B64 0b00111111 // Keeps the 6 least significant bits
#define ULTIMO_CODIGO_BASE64 64 // 64 characters plus the equal sign (counting from zero)
// Maximum number of characters in the partial result of the encoding. It can be
// 1 if the end of a block (3 bytes in the original, 4 in the conversion) has not been reached,
// 2 if the end of a block has been reached,
// 3 if the end of a block has not been reached but the maximum line length is exceeded,
// 4 if the end of a block is reached and the maximum line length is exceeded,
// 5 if one equal sign of padding is needed, or 6 if two equal signs are needed
#define MAXIMA_LONGITUD_RESULTADO 6
class Base64
{
  private:
    unsigned char simbolo_base64[ULTIMO_CODIGO_BASE64+1]; // The 64 Base64 symbols plus the padding sign (=); not NUL-terminated
    unsigned int numero_valor; // Position (starting at zero) of the value being converted within the complete original message
    unsigned int numero_codigo; // Position of the last code calculated. It could be limited to the line width (LONGITUD_LINEA, 76 characters), but a wider type also allows keeping a running count
    unsigned char resto_base64; // Remainder left over after calculating the last code
    unsigned char resultado[MAXIMA_LONGITUD_RESULTADO+1]; // Result of the current conversion. If it is the final one it can include symbol 64 (=) once or twice
    unsigned char contador_caracteres_resultado=0; // Number of characters currently stored in resultado
    void acumular_resultado(unsigned char valor);
  public:
    Base64();
    ~Base64();
    void iniciar_conversion();
    unsigned char *convertir(unsigned char valor_original, bool terminar_conversion);
    unsigned char *convertir(unsigned char valor_original);
    unsigned char *terminar();
};
#endif
//base64.cpp
#include "base64.h"
Base64::Base64() // Constructor
{
  // Copy the 64 symbols plus the padding sign (65 bytes, no terminating \0)
  memcpy(simbolo_base64,"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=",ULTIMO_CODIGO_BASE64+1);
  iniciar_conversion();
}
Base64::~Base64() // Destructor
{
}
void Base64::iniciar_conversion() // Reset the state before encoding a new message
{
  numero_valor=0;
  numero_codigo=0;
  resto_base64=0;
}
unsigned char *Base64::convertir(unsigned char valor_original, bool terminar_conversion) // Value to convert to Base64 and whether it is the last one
{
  convertir(valor_original);
  if(terminar_conversion)
  {
    terminar(); // Appends the last code and the padding to the current result
  }
  return resultado;
}
unsigned char *Base64::convertir(unsigned char valor_original) // Value to convert to Base64
{
  unsigned char desplazamiento;
  contador_caracteres_resultado=0;
  // Code: the high bits of this byte joined with the remainder of the previous one
  acumular_resultado((valor_original>>(2+(numero_valor%3)*2))|resto_base64);
  desplazamiento=4-(numero_valor%3)*2;
  // Remainder: the low bits of this byte, moved to the top of the next code
  resto_base64=(valor_original&(MASCARA_B64>>desplazamiento))<<desplazamiento;
  if(((numero_codigo+1)%4==0)) // After the third byte the remainder is the fourth code
  {
    acumular_resultado(resto_base64);
    resto_base64=0;
  }
  numero_valor++;
  resultado[contador_caracteres_resultado]=0;
  return resultado;
}
unsigned char *Base64::terminar()
{
  // Note: this appends to whatever the last call to convertir left in resultado,
  // so the returned text also contains those earlier characters
  if(numero_codigo%4)
  {
    acumular_resultado(resto_base64);
    while(numero_codigo%4)
    {
      acumular_resultado(ULTIMO_CODIGO_BASE64); // Pad with =
    }
  }
  resultado[contador_caracteres_resultado]=0;
  iniciar_conversion();
  return resultado;
}
void Base64::acumular_resultado(unsigned char valor)
{
  numero_codigo++;
  resultado[contador_caracteres_resultado++]=simbolo_base64[valor];
  if((numero_codigo%LONGITUD_LINEA)==0) // End of a 76-character line
  {
    resultado[contador_caracteres_resultado++]=13; // CR "\r"
    resultado[contador_caracteres_resultado++]=10; // LF "\n"
  }
}
The core of the Base64 code calculation is the expression:
(valor_original>>(2+(numero_valor%3)*2))|resto_base64
and the remainder is calculated with the expression:
(valor_original&(MASCARA_B64>>desplazamiento))<<desplazamiento
where desplazamiento is a value calculated with the expression:
4-(numero_valor%3)*2
These expressions are obtained by generalizing the calculation of each of the four Base64 codes that result from encoding three bytes of the original data.
Base64=((byte_1>>2)|resto)&0b00111111
resto=(byte_1&0b00000011)<<4
Base64=((byte_2>>4)|resto)&0b00111111
resto=(byte_2&0b00001111)<<2
Base64=((byte_3>>6)|resto)&0b00111111
resto=(byte_3&0b00111111)<<0
Base64=((byte_3>>0)|resto)&0b00111111
resto=(byte_3&0b00111111)<<0
In the pseudocode above, the text Base64 refers to the Base64 code being calculated. The expression byte_n refers to the nth byte being encoded. The text resto represents the leftover bits of the byte being encoded. At the beginning of the calculation the remainder is assumed to be zero.
For clarity, the 6-bit mask has been included in the calculation of all the codes in the previous pseudocode, although it is only needed for the last of them, since the others are shifted so that the two most significant bits are always lost.
As can be seen, the fourth code is all remainder and there is no need to calculate a remainder afterwards; it is therefore only necessary to perform three steps, one per encoded byte. It is important to remember that, if the third byte of a packet is not encoded, the last Base64 code obtained has to be filled with zeros on the right.
To generalize, the right shift in the expression that calculates the Base64 code can be written as 2+(numero_byte%3)*2, so that the part inside the parentheses cycles from zero to two, giving shifts of 2, 4 and 6 at each step. Of course this is not the only way to generalize, but I have chosen it for functionality and, above all, for clarity. Since the mask (AND) was only necessary for the fourth code, and it has already been seen that this code does not need to be calculated (it is all remainder), the mask is left out of the final expression to simplify it, although we must remember that only the 6 least significant bits of the data type used (byte) are taken.
The left shift of the remainder can be generalized in an analogous way. The mask that is applied (AND) undergoes the same bit shift, but in the opposite direction. That is the reason for calculating the displacement with 4-(numero_valor%3)*2 before applying it in the direction that corresponds to each part of the expression.
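The generalized expressions can be checked in isolation with a small sketch. The function encode_step below is a hypothetical stand-alone version of the per-byte step, written only for this illustration; it applies the two expressions to one byte and keeps the remainder between calls.

```cpp
#include <stdint.h>

// Hypothetical stand-alone version of the per-byte step, for illustration.
// `index` is the position of `value` in the message; `remainder` carries the
// leftover bits between calls and must start at zero.
// Returns the 6-bit code produced by this byte.
uint8_t encode_step(uint8_t value, unsigned int index, uint8_t *remainder)
{
    uint8_t step = (uint8_t)(index % 3);          // 0, 1 or 2 within the group
    // High bits of this byte joined with the previous remainder.
    uint8_t code = (uint8_t)((value >> (2 + step * 2)) | *remainder);
    uint8_t shift = (uint8_t)(4 - step * 2);      // 4, 2 or 0
    // Low bits of this byte, moved to the top of the next code.
    *remainder = (uint8_t)((value & (0x3F >> shift)) << shift);
    return code;
}
```

For the bytes 'o', 'h' and 'm' the three calls return 27, 54 and 33, and the remainder left after the third call is 45, which is the fourth code. In the PEM/MIME alphabet these values correspond to "b2ht".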
The following example shows how to use the library to encode a text string (remember that Base64 can be used for any data set, such as an image, for example). In the following code there are a couple of details that are interesting to clarify. First, a special symbol (the ~ symbol) has been used to indicate the end of the text, instead of a hardware signal or indicating the length of the text. Logically, that symbol cannot be part of the data that is encoded.
The second issue, as important as it is obvious, is that the decoder at the destination must know how the information it receives is represented. The text includes characters that do not belong to the printable ASCII set (codes 32 to 126), such as accented letters. Arduino represents each of these characters with two or more bytes (UTF-8). None of those bytes is zero, so the usual \0 terminator would still work for UTF-8 text; however, it cannot be relied on in general, since binary data (or text in encodings such as UTF-16) frequently contains zero bytes.
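To make the byte-level view concrete, here is a small sketch that counts the bytes that precede the ~ sentinel, assuming the text is stored as UTF-8. The helper length_until and the SAMPLE string are hypothetical and serve only as an illustration; the letter ñ is written with its two explicit UTF-8 bytes (0xC3 0xB1).

```cpp
#include <stddef.h>

// Hypothetical helper: counts the bytes that precede the sentinel.
// The text is assumed to contain the sentinel.
size_t length_until(const char *text, char sentinel)
{
    size_t length = 0;
    while (text[length] != sentinel)
    {
        length++;
    }
    return length;
}

// "niña~" with ñ spelled out as its UTF-8 bytes. The literal is split so that
// the 'a' is not read as part of the hexadecimal escape.
static const char SAMPLE[] = "ni\xC3\xB1" "a~";
```

The word has four characters, but length_until(SAMPLE, '~') counts five bytes, and those five bytes are what the encoder receives.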
#include "base64.h"
char texto_prueba[]="La bella y graciosa moza marchose a lavar la ropa.\nLa mojó en el arroyuelo y cantando la lavó.\nLa frotó sobre una piedra, la colgó de un abedul.\nDespués de lavar la ropa, la niña se fue al mercado.\nUn pastor vendía ovejas pregonando a viva voz:\nved qué oveja, ved qué lana, ved qué bestia, qué animal.\nLa niña la vio muy flaca, sin embargo le gustó.\nYo te pago veinte escudos y no discutamos más.\nVuelve la niña cantando muy contenta con su oveja.\nCuando llegaron al bosque la ovejita se escapó.\nLa niña desesperada arrojose encima de ella.\nVelozmente y con destreza aferrola por detrás.\nLlegaba por el camino jinete de altivo porte.\nDescendió de su caballo y a la niña le cantó…~";
unsigned char *resultado; // Same type that convertir returns
Base64 base64;
void setup()
{
  Serial.begin(9600);
  #if defined(__AVR_ATmega32U4__) || defined(__AVR_ATmega16U4__) // Is it an Arduino Leonardo (ATmega32U4)?
    while(!Serial){}; // Wait for the Arduino Leonardo
  #endif
  // Show the original text
  unsigned int contador=0;
  while(texto_prueba[contador]!='~')
  {
    //Serial.println(String(texto_prueba[contador])+"="+String(texto_prueba[contador],DEC));
    Serial.print(texto_prueba[contador]);
    contador++;
  }
  Serial.println("\n");
  // Show the text encoded in Base64
  contador=0;
  while(texto_prueba[contador]!='~')
  {
    // Encode one byte; the second argument is true when the next character is the end mark
    resultado=base64.convertir(texto_prueba[contador],texto_prueba[contador+1]=='~');
    byte contador_resultado=0;
    while(resultado[contador_resultado]>0)
    {
      Serial.write(resultado[contador_resultado]); // Send the character as a raw byte
      contador_resultado++;
    }
    contador++;
  }
}
void loop()
{
}
In the previous example, the call to base64.convertir inside the second loop shows how the library is used to encode in Base64. You only need to pass the method convertir each byte you want to encode and, optionally, whether it is the last one; alternatively, finish the conversion by calling the method terminar when you reach the end.
As the screenshot below shows, the example program first displays the text to be encoded, in this case the beginning of a famous song by the great Les Luthiers, and then the result of encoding it in Base64 using the MIME line length.