Consider the program converter.c that converts an input
file from UTF-8 encoded Unicode characters to UTF-16 encoded Unicode characters. This program
contains the function decode_UTF8
that takes an 8-bit value (from the input file)
and assumes it is the lead byte of a UTF-8 encoded character. It returns the 32-bit Unicode
"code point" corresponding to that character. To do this, it might need to read the input file
several more times. See Wikipedia's page on
UTF-8 for more information about the encoding.
Using Frama-C's EVA plug-in, analyze this function to ensure that the value assigned to code_point
in each branch of the if... else... is in a sensible range.
You will have to think about several things, and you may need to look up how to best solve these issues.
To fully exercise the function, Frama-C needs to know that the input value is in the range 0 .. 255. You could do this by writing a driver program that feeds the function all the allowed values (for example, in a loop). You might also be able to use an ACSL precondition.
You will need to provide a stub for getc
so Frama-C knows that it might return
any value in the set { EOF, 0 .. 255 }. Notice that EOF (usually -1) is a special value that
is outside the normal range of an unsigned 8-bit byte.
Frama-C needs to know that the exit
function terminates execution of the
program. It does not return.
Write a document that describes what you did, maybe including any driver program or stub functions, if appropriate, and that summarizes the results of your analysis. Submit your document to Moodle.