#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "utrac.h"
#include "debug.h"
Defines | |
#define | UT_DEBUG 1 |
Functions | |
bool | ut_unicode_invalid (ulong unicode) |
Return false if unicode scalar value is invalid. | |
UtCode | ut_distrib_utf_pass (UtText *text) |
Scan the text to calculate frequency distribution and UTF-8 correctness. | |
void | ut_change_EOL1toEOL2 (char *beg, char *end) |
Change all UT_EOL_CHAR to UT_EOL_ALT_CHAR, from beg to end-1. | |
UtCode | ut_eol_pass (UtText *text) |
Scan the text to detect EOL type and replace EOL by UT_EOL_CHAR or UT_EOL_ALT_CHAR. |
Definition in file ut_recognition1.c.
void ut_change_EOL1toEOL2 (char *beg, char *end)
Change all UT_EOL_CHAR to UT_EOL_ALT_CHAR, from beg to end-1.
Definition at line 194 of file ut_recognition1.c. References UT_EOL_CHAR. Referenced by ut_eol_pass().
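The in-place replacement over the half-open range from beg to end-1 can be sketched as follows. This is a minimal illustration, not the library's code: the names `sketch_change_eol1_to_eol2`, `SKETCH_EOL_CHAR` and `SKETCH_EOL_ALT_CHAR` are hypothetical, the value chosen for the alternate marker is an assumption (only UT_EOL_CHAR is described as the null char), and the sketch returns a count where the documented function returns void.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only: the documentation describes
 * UT_EOL_CHAR as the null char; the alternate marker value is a guess. */
#define SKETCH_EOL_CHAR     '\0'
#define SKETCH_EOL_ALT_CHAR '\n'

/* Replace every SKETCH_EOL_CHAR in [beg, end) with SKETCH_EOL_ALT_CHAR.
 * Returns the number of bytes changed (the documented function returns void). */
static size_t sketch_change_eol1_to_eol2(char *beg, char *end)
{
    size_t changed = 0;
    assert(beg != NULL && end != NULL && beg <= end);
    for (char *p = beg; p < end; ++p) {
        if (*p == SKETCH_EOL_CHAR) {
            *p = SKETCH_EOL_ALT_CHAR;
            ++changed;
        }
    }
    return changed;
}
```

Because end is exclusive, a marker stored exactly at end is left untouched, which matches the "from beg to end-1" wording.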
UtCode ut_distrib_utf_pass (UtText *text)
Scan the text to calculate frequency distribution and UTF-8 correctness. This function calculates the frequency distribution: for each i between 0 and 255, text->distribution[i] equals the number of bytes with value i in the text. This distribution is used to determine whether the file is binary or ASCII. The text is scanned for UTF-8 errors in the same pass.
Definition at line 62 of file ut_recognition1.c. References UtText::charset, UtSession::charset, UtText::data, UtText::distribution, UtText::flags, UtSession::nb_charsets, UtSession::progress_function, UtText::size, UtCharset::type, ut_distrib_utf_pass(), UT_PROCESS_STEP, UT_THRESHOLD_CONTROL_CHAR, UT_THRESHOLD_UTF8, ut_unicode_invalid(), and ut_update_progress(). Referenced by ut_distrib_utf_pass(), and ut_recognize().
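As an illustration of the byte frequency distribution described above, the following sketch fills a 256-entry table and counts control bytes as a rough hint of binary content. The function names are hypothetical, the UTF-8 check is omitted, and the choice of which control bytes to count is an assumption; how the library applies UT_THRESHOLD_CONTROL_CHAR is not shown in this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill counts[i] with the number of bytes equal to i in data[0..size). */
static void sketch_byte_distribution(const unsigned char *data, size_t size,
                                     unsigned long counts[256])
{
    assert(counts != NULL && (data != NULL || size == 0));
    memset(counts, 0, 256 * sizeof counts[0]);
    for (size_t i = 0; i < size; ++i)
        counts[data[i]]++;
}

/* Count control bytes other than TAB, LF and CR, plus DEL (0x7F).
 * A large share of such bytes suggests binary rather than text data
 * (assumed heuristic, not the library's exact rule). */
static unsigned long sketch_count_control(const unsigned long counts[256])
{
    unsigned long n = 0;
    for (int c = 0; c < 0x20; ++c)
        if (c != '\t' && c != '\n' && c != '\r')
            n += counts[c];
    return n + counts[0x7F];
}
```

A caller would compare the control-byte count against the total size with a threshold of its own choosing.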
UtCode ut_eol_pass (UtText *text)
Scan the text to detect EOL type and replace EOL by UT_EOL_CHAR or UT_EOL_ALT_CHAR. EOLs are recognized and replaced by UT_EOL_CHAR (null char), and possibly by UT_EOL_ALT_CHAR if the EOL type is UT_EOL_CRLF_CR or UT_EOL_CRLF_LF (see UtEolType). ut_session->progress_function() is called only if (text->flags & UT_F_TRANSFORM_EOL) is nonzero.
Definition at line 277 of file ut_recognition1.c. References UtText::data, UtText::eol, UtText::eol_alt, UtText::flags, UtText::nb_lines, UtText::nb_lines_alt, UtSession::progress_function, UtText::size, UtText::skip_char, ut_change_EOL1toEOL2(), UT_EOF_CHAR, UT_EOL_CHAR, UT_EOL_MIX, ut_eol_pass(), UT_PROCESS_STEP, ut_update_progress(), and UtEolType.
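The detection half of an EOL pass can be sketched by tallying CRLF pairs, lone CR and lone LF. This is only an assumption about the general approach: `SketchEolCounts` and `sketch_count_eols` are hypothetical names, the library's UtEolType values are not reproduced, and the in-place replacement by UT_EOL_CHAR is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tallies; the library's UtEolType enum is not reproduced here. */
typedef struct {
    size_t crlf; /* CR immediately followed by LF */
    size_t cr;   /* CR not followed by LF */
    size_t lf;   /* LF not preceded by CR */
} SketchEolCounts;

/* Count each kind of line ending in data[0..size). */
static SketchEolCounts sketch_count_eols(const char *data, size_t size)
{
    SketchEolCounts n = {0, 0, 0};
    assert(data != NULL || size == 0);
    for (size_t i = 0; i < size; ++i) {
        if (data[i] == '\r') {
            if (i + 1 < size && data[i + 1] == '\n') {
                n.crlf++;
                ++i; /* consume the LF of the CRLF pair */
            } else {
                n.cr++;
            }
        } else if (data[i] == '\n') {
            n.lf++;
        }
    }
    return n;
}
```

A text in which more than one of the three counters is nonzero mixes EOL conventions, which is presumably the situation the documented UT_EOL_MIX, UT_EOL_CRLF_CR and UT_EOL_CRLF_LF types cover.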