[patches] Rough cut at wide character option group
- To: patches@xxxxxxxxxx
- Subject: [patches] Rough cut at wide character option group
- From: Jim Blandy <jimb@xxxxxxxxxxxxxxxx>
- Date: Mon, 12 Nov 2007 05:51:11 -0800
Here's a rough cut at a wide character option group. It does compile
and run 'hello, world', but I haven't gone through and cleaned up the
test results yet. For that reason, I don't want to announce numbers,
but it looks like the size reduction here will be substantial. It
should be interesting to re-run the uClibc comparison.
One fun part about this patch is that, rather than explicitly excising
code that refers to wide character functions, I was in several cases
able to simply help the compiler see that the code referring to them
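would never be reached. This helped keep the patch size down quite a
bit. It also means that, when OPTION_POSIX_C_LANG_WIDE_CHAR is
disabled, there is now an additional reason EGLIBC must be compiled
with optimization: without it, the compiler would presumably keep the
unreachable references to the omitted wide character functions, and
the link would fail.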
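To illustrate the approach, here is a minimal sketch, not code from the
patch. DEMO_WIDE_CHAR, struct demo_stream, demo_is_wide and the flush
functions are hypothetical stand-ins for __OPTION_POSIX_C_LANG_WIDE_CHAR,
_IO_FILE, _IO_is_wide and the libio flush paths. When the stand-in option
is 0, the predicate is a constant, so an optimizing compiler can discard
the wide branch and, with it, the reference to the wide function.

```c
/* Hypothetical stand-in for the generated __OPTION_POSIX_C_LANG_WIDE_CHAR
   macro; 0 models the option group being disabled.  */
#ifndef DEMO_WIDE_CHAR
# define DEMO_WIDE_CHAR 0
#endif

/* Hypothetical, much-simplified stream: mode > 0 means wide-oriented.  */
struct demo_stream
{
  int mode;
};

/* With the group enabled this tests the stream; with it disabled it is
   a compile-time constant.  */
#if DEMO_WIDE_CHAR
# define demo_is_wide(s) ((s)->mode > 0)
#else
# define demo_is_wide(s) (0)
#endif

static int demo_flush_narrow (struct demo_stream *s) { (void) s; return 1; }
static int demo_flush_wide (struct demo_stream *s) { (void) s; return 2; }

/* The wide branch is still written out in the source, but it is
   unreachable whenever demo_is_wide expands to (0).  */
int
demo_flush (struct demo_stream *s)
{
  return demo_is_wide (s) ? demo_flush_wide (s) : demo_flush_narrow (s);
}
```

In this sketch both functions are defined, so it links at any
optimization level. In the patch the wide variants are left out of the
build entirely, which is why the dead branch has to be optimized away.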
ChangeLog.eglibc:
2007-11-12 Jim Blandy <jimb@xxxxxxxxxxxxxxxx>
Implement the OPTION_POSIX_C_LANG_WIDE_CHAR option group.
* option-groups.def (OPTION_POSIX_C_LANG_WIDE_CHAR): New entry.
(OPTION_EGLIBC_LOCALE_CODE, OPTION_POSIX_WIDE_CHAR_DEVICE_IO):
Note dependence on OPTION_POSIX_C_LANG_WIDE_CHAR.
* option-groups.defaults (OPTION_POSIX_C_LANG_WIDE_CHAR):
Initialize.
* stdlib/Makefile (routines): Put in group: mblen mbstowcs mbtowc
wcstombs wctomb wcstoimax wcstoumax
* debug/Makefile (routines): Put in group: wctomb_chk wcscpy_chk
wmemcpy_chk wmemmove_chk wmempcpy_chk wcpcpy_chk wcsncpy_chk
wcscat_chk wcsncat_chk wmemset_chk wcpncpy_chk swprintf_chk
vswprintf_chk wcrtomb_chk mbsnrtowcs_chk wcsnrtombs_chk
mbsrtowcs_chk wcsrtombs_chk mbstowcs_chk wcstombs_chk.
* wcsmbs/Makefile (routines): Put in group: wcscat wcschr wcscmp
wcscpy wcscspn wcsdup wcslen wcsncat wcsncmp wcsncpy wcspbrk
wcsrchr wcsspn wcstok wcsstr wmemchr wmemcmp wmemmove wcpcpy
wcpncpy wmempcpy btowc wctob mbsinit mbrlen mbrtowc wcrtomb
mbsrtowcs wcsrtombs mbsnrtowcs wcsnrtombs wcsnlen wcschrnul wcstol
wcstoul wcstoll wcstoull wcstod wcstold wcstof wcstol_l wcstoul_l
wcstoll_l wcstoull_l wcstod_l wcstold_l wcstof_l wcscoll wcsxfrm
wcwidth wcswidth wcscoll_l wcsxfrm_l wcscasecmp wcsncase
wcscasecmp_l wcsncase_l wcsmbsload mbsrtowcs_l isoc99_swscanf
isoc99_vswscanf
* time/Makefile (routines): Put in group: wcsftime wcsftime_l
* libio/Makefile (routines): When group is disabled, add
wdummyfileops. Put in group: wfiledoalloc iowpadn swprintf
vswprintf iovswscanf swscanf wgenops wstrops wfileops wmemstream
* libio/wdummyfileops.c: New file. Provide a dummy definition for
the _IO_FILE functions that prints an error message and dies.
* libio/libioP.h: #include <gnu/option-groups.h>.
(_IO_is_wide): New macro. Used as necessary to excise references
to wide character code.
* libio/iofwide.c: #include <gnu/option-groups.h>.
(_IO_fwide): When the group is disabled, provide a simplified
definition that aborts if the caller attempts to make a stream
wide-oriented.
* wctype/Makefile (routines): Put in group: wcfuncs wctype
iswctype wcfuncs_l wctype_l iswctype_l wctrans_l
* posix/fnmatch_loop.c (FCT) (either internal_fnmatch or
internal_fnwmatch): Handle character categories accessed via
'wctype' only if the group is enabled.
* stdio-common/Makefile (routines): Put in group: vfwprintf
vfwscanf printf-parsewc
* stdio-common/printf_fp.c (__printf_fp): When the group is
disabled, fix 'wide' at zero.
* stdio-common/printf_fphex.c (__printf_fphex): Same.
* stdio-common/printf_size.c (__printf_size): Same.
* stdio-common/vfprintf.c: #include <gnu/option-groups.h>.
(MULTIBYTE_SUPPORT): New macro.
(process_string_arg): Test MULTIBYTE_SUPPORT as needed. Avoid
multibyte or wide-character-based features when it is not true.
* stdio-common/vfscanf.c: #include <gnu/option-groups.h>.
(MULTIBYTE_SUPPORT): New macro.
(_IO_vfscanf_internal): Test MULTIBYTE_SUPPORT as needed; avoid
multibyte or wide-character-based features when it is not true.
Fix 'map' at NULL when the group is disabled.
Make the regular expression matching code better respect
the OPTION_EGLIBC_LOCALE_CODE option group.
* posix/regex_internal.h: #include <gnu/option-groups.h>.
(string_mb_cur_max, dfa_mb_cur_max): New macros for accessing the
'mb_cur_max' fields of re_string_t and re_dfa_t, whose values can
be constant when the group is disabled. Use them throughout.
* posix/regex_internal.c: Use string_mb_cur_max and dfa_mb_cur_max
as appropriate.
* posix/regcomp.c: Same.
(re_compile_fastmap_iter): Process COMPLEX_BRACKET nodes only when
the group is enabled.
(parse_bracket_exp): Process MB_CHAR elements only when the group
is enabled. Fix 'nrules' at zero, for the compiler's benefit.
(parse_bracket_element): Create MB_CHAR elements only when the
group is enabled.
(build_equiv_class): When the group is disabled, we know there
will be no collation rules.
(build_charclass): When the group is disabled, do not try to
process references to wide character categories accessed via
'wctype'.
* posix/regexec.c: Use string_mb_cur_max and dfa_mb_cur_max
as appropriate.
(find_collation_sequence_value): Define function only when the
group is enabled.
(check_node_accept_bytes): Check character against 'wctype' style
classes only if group is enabled. When the group is disabled,
skip collation-rule-based matching.
* posix/fnmatch.c: #include <gnu/option-groups.h>.
Define HANDLE_MULTIBYTE only when OPTION_EGLIBC_LOCALE_CODE is
enabled.
* stdio-common/_i18n_number.h (_i18n_number_rewrite): Provide only
a trivial definition when the group is disabled.
* option-groups.def (OPTION_POSIX_WIDE_CHAR_DEVICE_IO): Doc fix:
document the effect on support for 'ccs=CHARSET' strings in fopen
and friends.
* option-groups.mak (option-disabled): New function.
* include/libc-symbols.h (attribute_always_inline): New.
* scripts/option-groups.awk: Generate preprocessor conditionals to
protect gnu/option-groups.h from multiple #inclusion.
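As a rough illustration of the libio/wdummyfileops.c entry above, the
sketch below fills an operations table with stubs that report the
problem and abort. It is a simplified assumption, not the patch itself:
struct demo_ops, demo_wide_ops and the demo_disabled_* functions are
hypothetical names, and the table has two slots where the real
struct _IO_jump_t has many more.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical two-entry operations table standing in for the much
   larger struct _IO_jump_t.  */
struct demo_ops
{
  int (*overflow) (void *stream, int ch);
  int (*sync) (void *stream);
};

/* Report the misconfiguration and terminate; this never returns.  */
static _Noreturn void
demo_wide_support_disabled (void)
{
  fputs ("wide character I/O requested, but support was compiled out\n",
         stderr);
  abort ();
}

/* One stub per distinct function signature in the table.  */
static int
demo_disabled_overflow (void *stream, int ch)
{
  (void) stream;
  (void) ch;
  demo_wide_support_disabled ();
}

static int
demo_disabled_sync (void *stream)
{
  (void) stream;
  demo_wide_support_disabled ();
}

/* Every slot points at a stub, so any wide operation fails loudly
   instead of jumping through a null pointer.  */
const struct demo_ops demo_wide_ops =
{
  demo_disabled_overflow,
  demo_disabled_sync
};
```

The point of the design is that code which still refers to the table
links as before, and a program that really does reach a wide operation
gets a diagnostic rather than a silent crash.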
Index: stdlib/Makefile
===================================================================
--- stdlib/Makefile (revision 4185)
+++ stdlib/Makefile (working copy)
@@ -38,7 +38,6 @@
exit on_exit atexit cxa_atexit cxa_finalize old_atexit \
abs labs llabs \
div ldiv lldiv \
- mblen mbstowcs mbtowc wcstombs wctomb \
random random_r rand rand_r \
drand48 erand48 lrand48 nrand48 mrand48 jrand48 \
srand48 seed48 lcong48 \
@@ -52,10 +51,13 @@
system canonicalize \
a64l l64a \
getsubopt xpg_basename fmtmsg \
- strtoimax strtoumax wcstoimax wcstoumax \
+ strtoimax strtoumax \
getcontext setcontext makecontext swapcontext
routines-$(OPTION_EGLIBC_LOCALE_CODE) += \
strfmon strfmon_l
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) += \
+ mblen mbstowcs mbtowc wcstombs wctomb \
+ wcstoimax wcstoumax
ifeq (yy,$(OPTION_EGLIBC_LOCALE_CODE)$(OPTION_POSIX_REGEXP))
routines-y += rpmatch
endif
Index: debug/Makefile
===================================================================
--- debug/Makefile (revision 4185)
+++ debug/Makefile (working copy)
@@ -35,21 +35,23 @@
read_chk pread_chk pread64_chk recv_chk recvfrom_chk \
readlink_chk readlinkat_chk getwd_chk getcwd_chk \
realpath_chk ptsname_r_chk fread_chk fread_u_chk \
- wctomb_chk wcscpy_chk wmemcpy_chk wmemmove_chk wmempcpy_chk \
- wcpcpy_chk wcsncpy_chk wcscat_chk wcsncat_chk wmemset_chk \
- wcpncpy_chk \
- swprintf_chk vswprintf_chk \
confstr_chk getgroups_chk ttyname_r_chk \
- gethostname_chk getdomainname_chk wcrtomb_chk mbsnrtowcs_chk \
- wcsnrtombs_chk mbsrtowcs_chk wcsrtombs_chk mbstowcs_chk \
- wcstombs_chk \
+ gethostname_chk getdomainname_chk \
stack_chk_fail fortify_fail \
$(static-only-routines)
routines-$(OPTION_EGLIBC_GETLOGIN) += getlogin_r_chk
routines-$(OPTION_EGLIBC_BACKTRACE) += backtrace backtracesyms backtracesymsfd
routines-$(OPTION_POSIX_WIDE_CHAR_DEVICE_IO) \
- += wprintf_chk fwprintf_chk \
+ += wprintf_chk fwprintf_chk \
vwprintf_chk vfwprintf_chk fgetws_chk fgetws_u_chk
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) \
+ += wctomb_chk wcscpy_chk wmemcpy_chk wmemmove_chk wmempcpy_chk \
+ wcpcpy_chk wcsncpy_chk wcscat_chk wcsncat_chk wmemset_chk \
+ wcpncpy_chk \
+ swprintf_chk vswprintf_chk \
+ wcrtomb_chk mbsnrtowcs_chk \
+ wcsnrtombs_chk mbsrtowcs_chk wcsrtombs_chk mbstowcs_chk \
+ wcstombs_chk
static-only-routines := warning-nop stack_chk_fail_local
Index: scripts/option-groups.awk
===================================================================
--- scripts/option-groups.awk (revision 4185)
+++ scripts/option-groups.awk (working copy)
@@ -18,18 +18,20 @@
# Print final values.
END {
- print "/* This file is automatically generated."
+ print "/* This file is automatically generated by scripts/option-groups.awk"
+ print " in the EGLIBC source tree."
+ print ""
print " It defines macros that indicate which EGLIBC option groups were"
print " configured in 'option-groups.config' when this C library was"
print " built. For each option group named OPTION_foo, it #defines"
print " __OPTION_foo to be 1 if the group is enabled, or leaves that"
- print " symbol undefined if the group is disabled."
+ print " symbol undefined if the group is disabled. */"
print ""
- print " It is generated by scripts/option-groups.awk in the EGLIBC"
- print " source tree. */"
+ print "#ifndef __GNU_OPTION_GROUPS_H"
+ print "#define __GNU_OPTION_GROUPS_H"
print ""
- # Sort the variables by name.
+ # Produce a sorted list of variable names.
i=0
for (var in vars)
names[i++] = var
@@ -49,4 +51,7 @@
# option-groups.def.
}
}
+
+ print ""
+ print "#endif /* __GNU_OPTION_GROUPS_H */"
}
Index: wcsmbs/Makefile
===================================================================
--- wcsmbs/Makefile (revision 4185)
+++ wcsmbs/Makefile (working copy)
@@ -27,9 +27,13 @@
headers := wchar.h bits/wchar.h bits/wchar2.h bits/wchar-ldbl.h
distribute := wcwidth.h wcsmbsload.h
-routines := wcscat wcschr wcscmp wcscpy wcscspn wcsdup wcslen wcsncat \
+# These functions are used by printf_fp.c, even in the plain case; see
+# comments there for OPTION_EGLIBC_LOCALE_CODE.
+routines := wmemcpy wmemset
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) \
+ := wcscat wcschr wcscmp wcscpy wcscspn wcsdup wcslen wcsncat \
wcsncmp wcsncpy wcspbrk wcsrchr wcsspn wcstok wcsstr wmemchr \
- wmemcmp wmemcpy wmemmove wmemset wcpcpy wcpncpy wmempcpy \
+ wmemcmp wmemmove wcpcpy wcpncpy wmempcpy \
btowc wctob mbsinit \
mbrlen mbrtowc wcrtomb mbsrtowcs wcsrtombs \
mbsnrtowcs wcsnrtombs wcsnlen wcschrnul \
Index: time/Makefile
===================================================================
--- time/Makefile (revision 4185)
+++ time/Makefile (working copy)
@@ -31,7 +31,9 @@
tzfile getitimer setitimer \
stime dysize timegm ftime \
getdate strptime strptime_l \
- strftime wcsftime strftime_l wcsftime_l
+ strftime strftime_l
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) \
+ := wcsftime wcsftime_l
aux-$(OPTION_EGLIBC_LOCALE_CODE) += alt_digit era lc-time-cleanup
distribute := datemsk
Index: libio/libioP.h
===================================================================
--- libio/libioP.h (revision 4185)
+++ libio/libioP.h (working copy)
@@ -36,6 +36,10 @@
/*# include <comthread.h>*/
#endif
+#if defined _LIBC
+# include <gnu/option-groups.h>
+#endif
+
#include <math_ldbl_opt.h>
#include "iolibio.h"
@@ -493,8 +497,20 @@
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
+
+/* _IO_is_wide (fp) is roughly equivalent to '_IO_fwide (fp, 0) > 0',
+ except that when OPTION_POSIX_C_LANG_WIDE_CHAR is disabled, it
+ expands to a constant, allowing the compiler to realize that it can
+ eliminate code that references wide stream handling functions.
+ This, in turn, allows us to omit them. */
+#if __OPTION_POSIX_C_LANG_WIDE_CHAR
+# define _IO_is_wide(_f) ((_f)->_mode > 0)
+#else
+# define _IO_is_wide(_f) (0)
+#endif
+
# define _IO_do_flush(_f) \
- ((_f)->_mode <= 0 \
+ (! _IO_is_wide (_f) \
? INTUSE(_IO_do_write)(_f, (_f)->_IO_write_base, \
(_f)->_IO_write_ptr-(_f)->_IO_write_base) \
: INTUSE(_IO_wdo_write)(_f, (_f)->_wide_data->_IO_write_base, \
Index: libio/ioseekoff.c
===================================================================
--- libio/ioseekoff.c (revision 4185)
+++ libio/ioseekoff.c (working copy)
@@ -62,7 +62,7 @@
else
abort ();
}
- if (_IO_fwide (fp, 0) < 0)
+ if (! _IO_is_wide (fp))
INTUSE(_IO_free_backup_area) (fp);
else
INTUSE(_IO_free_wbackup_area) (fp);
Index: libio/iofwide.c
===================================================================
--- libio/iofwide.c (revision 4185)
+++ libio/iofwide.c (working copy)
@@ -27,6 +27,7 @@
#include <libioP.h>
#ifdef _LIBC
+# include <gnu/option-groups.h>
# include <dlfcn.h>
# include <wchar.h>
#endif
@@ -44,6 +45,8 @@
#endif
+#if ! defined _LIBC || __OPTION_POSIX_C_LANG_WIDE_CHAR
+
/* Prototypes of libio's codecvt functions. */
static enum __codecvt_result do_out (struct _IO_codecvt *codecvt,
__mbstate_t *statep,
@@ -521,3 +524,26 @@
return MB_CUR_MAX;
#endif
}
+
+#else
+/* OPTION_POSIX_C_LANG_WIDE_CHAR is disabled. */
+
+#undef _IO_fwide
+int
+_IO_fwide (fp, mode)
+ _IO_FILE *fp;
+ int mode;
+{
+ /* Die helpfully if the user tries to create a wide stream; I
+ disbelieve that most users check the return value from
+ 'fwide (fp, 1)'. */
+ assert (mode <= 0);
+
+ /* We can only make streams byte-oriented, which is trivial. */
+ if (mode < 0)
+ fp->_mode = -1;
+
+ return fp->_mode;
+}
+
+#endif
Index: libio/fileops.c
===================================================================
--- libio/fileops.c (revision 4185)
+++ libio/fileops.c (working copy)
@@ -176,7 +176,7 @@
/* Free buffer. */
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
- if (fp->_mode > 0)
+ if (_IO_is_wide (fp))
{
if (_IO_have_wbackup (fp))
INTUSE(_IO_free_wbackup_area) (fp);
@@ -343,6 +343,7 @@
cs = strstr (last_recognized + 1, ",ccs=");
if (cs != NULL)
{
+#if __OPTION_POSIX_WIDE_CHAR_DEVICE_IO
/* Yep. Load the appropriate conversions and set the orientation
to wide. */
struct gconv_fcts fcts;
@@ -407,6 +408,12 @@
/* Set the mode now. */
result->_mode = 1;
+#else
+ /* Treat this as if we couldn't find the given character set. */
+ (void) INTUSE(_IO_file_close_it) (fp);
+ __set_errno (EINVAL);
+ return NULL;
+#endif
}
}
#endif /* GNU libc */
Index: libio/Makefile
===================================================================
--- libio/Makefile (revision 4185)
+++ libio/Makefile (working copy)
@@ -28,15 +28,13 @@
routines := \
filedoalloc iofclose iofdopen iofflush iofgetpos iofgets iofopen \
- iofopncook iofputs iofread iofsetpos ioftell wfiledoalloc \
+ iofopncook iofputs iofread iofsetpos ioftell \
iofwrite iogetdelim iogetline iogets iopadn iopopen ioputs \
ioseekoff ioseekpos iosetbuffer iosetvbuf ioungetc \
iovsprintf iovsscanf \
iofgetpos64 iofopen64 iofsetpos64 \
- iowpadn \
- putchar putchar_u swprintf \
- vswprintf iovswscanf swscanf wgenops \
- wstrops wfileops iofwide wmemstream \
+ putchar putchar_u \
+ iofwide \
\
clearerr feof ferror fileno fputc freopen fseek getc getchar \
memstream pclose putc putchar rewind setbuf setlinebuf vasprintf \
@@ -47,6 +45,14 @@
__fpurge __fpending __fsetlocking \
\
libc_fatal fmemopen
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) += \
+ wfiledoalloc \
+ iowpadn \
+ swprintf \
+ vswprintf iovswscanf swscanf wgenops \
+ wstrops wfileops wmemstream
+routines-$(call option-disabled, OPTION_POSIX_C_LANG_WIDE_CHAR) += \
+ wdummyfileops
routines-$(OPTION_POSIX_WIDE_CHAR_DEVICE_IO) += \
fputwc fputwc_u getwc getwc_u getwchar getwchar_u iofgetws iofgetws_u \
iofputws iofputws_u iogetwline ioungetwc putwc putwc_u \
Index: libio/ioseekpos.c
===================================================================
--- libio/ioseekpos.c (revision 4185)
+++ libio/ioseekpos.c (working copy)
@@ -36,7 +36,7 @@
/* If we have a backup buffer, get rid of it, since the __seekoff
callback may not know to do the right thing about it.
This may be over-kill, but it'll do for now. TODO */
- if (_IO_fwide (fp, 0) <= 0)
+ if (! _IO_is_wide (fp))
{
if (_IO_have_backup (fp))
INTUSE(_IO_free_backup_area) (fp);
Index: libio/wdummyfileops.c
===================================================================
--- libio/wdummyfileops.c (revision 0)
+++ libio/wdummyfileops.c (revision 0)
@@ -0,0 +1,161 @@
+/* Copyright (C) 2007 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA.
+
+ As a special exception, if you link the code in this file with
+ files compiled with a GNU compiler to produce an executable,
+ that does not cause the resulting executable to be covered by
+ the GNU Lesser General Public License. This exception does not
+ however invalidate any other reasons why the executable file
+ might be covered by the GNU Lesser General Public License.
+ This exception applies to code released by its copyright holders
+ in files containing the exception. */
+
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <libioP.h>
+
+static void __THROW __attribute__ ((__noreturn__))
+_IO_wfile_wide_char_support_disabled (void)
+{
+ static const char errstr[]
+ = ("The application tried to use wide character I/O, but libc.so"
+ " was compiled\n"
+ "with the OPTION_POSIX_C_LANG_WIDE_CHAR option group disabled.\n");
+ __libc_write (STDERR_FILENO, errstr, sizeof (errstr) - 1);
+ abort ();
+}
+
+static void
+_IO_wfile_disabled_void_int (_IO_FILE *fp, int x)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static int
+_IO_wfile_disabled_int_int (_IO_FILE *fp, int x)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static int
+_IO_wfile_disabled_int_none (_IO_FILE *fp)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_size_t
+_IO_wfile_disabled_xsputn (_IO_FILE *fp, const void *data, _IO_size_t n)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_size_t
+_IO_wfile_disabled_xsgetn (_IO_FILE *fp, void *data, _IO_size_t n)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_off64_t
+_IO_wfile_disabled_seekoff (_IO_FILE *fp, _IO_off64_t off, int dir, int mode)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_off64_t
+_IO_wfile_disabled_seekpos (_IO_FILE *fp, _IO_off64_t pos, int flags)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_FILE *
+_IO_wfile_disabled_setbuf (_IO_FILE *fp, char *buffer, _IO_ssize_t length)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_ssize_t
+_IO_wfile_disabled_read (_IO_FILE *fp, void *buffer, _IO_ssize_t length)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_ssize_t
+_IO_wfile_disabled_write (_IO_FILE *fp, const void *buffer, _IO_ssize_t length)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static _IO_off64_t
+_IO_wfile_disabled_seek (_IO_FILE *fp, _IO_off64_t offset, int mode)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static int
+_IO_wfile_disabled_close (_IO_FILE *fp)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static int
+_IO_wfile_disabled_stat (_IO_FILE *fp, void *buf)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static int
+_IO_wfile_disabled_showmanyc (_IO_FILE *fp)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static void
+_IO_wfile_disabled_imbue (_IO_FILE *fp, void *locale)
+{
+ _IO_wfile_wide_char_support_disabled ();
+}
+
+static const struct _IO_jump_t _IO_wfile_jumps_disabled =
+{
+ JUMP_INIT_DUMMY,
+ JUMP_INIT(finish, _IO_wfile_disabled_void_int),
+ JUMP_INIT(overflow, _IO_wfile_disabled_int_int),
+ JUMP_INIT(underflow, _IO_wfile_disabled_int_none),
+ JUMP_INIT(uflow, _IO_wfile_disabled_int_none),
+ JUMP_INIT(pbackfail, _IO_wfile_disabled_int_int),
+ JUMP_INIT(xsputn, _IO_wfile_disabled_xsputn),
+ JUMP_INIT(xsgetn, _IO_wfile_disabled_xsgetn),
+ JUMP_INIT(seekoff, _IO_wfile_disabled_seekoff),
+ JUMP_INIT(seekpos, _IO_wfile_disabled_seekpos),
+ JUMP_INIT(setbuf, _IO_wfile_disabled_setbuf),
+ JUMP_INIT(sync, _IO_wfile_disabled_int_none),
+ JUMP_INIT(doallocate, _IO_wfile_disabled_int_none),
+ JUMP_INIT(read, _IO_wfile_disabled_read),
+ JUMP_INIT(write, _IO_wfile_disabled_write),
+ JUMP_INIT(seek, _IO_wfile_disabled_seek),
+ JUMP_INIT(close, _IO_wfile_disabled_close),
+ JUMP_INIT(stat, _IO_wfile_disabled_stat),
+ JUMP_INIT(showmanyc, _IO_wfile_disabled_showmanyc),
+ JUMP_INIT(imbue, _IO_wfile_disabled_imbue)
+};
+
+strong_alias (_IO_wfile_jumps_disabled, _IO_wfile_jumps)
+libc_hidden_data_def (_IO_wfile_jumps)
+strong_alias (_IO_wfile_jumps_disabled, _IO_wfile_jumps_mmap)
+strong_alias (_IO_wfile_jumps_disabled, _IO_wfile_jumps_maybe_mmap)
Index: libio/__fpurge.c
===================================================================
--- libio/__fpurge.c (revision 4185)
+++ libio/__fpurge.c (working copy)
@@ -22,7 +22,7 @@
void
__fpurge (FILE *fp)
{
- if (fp->_mode > 0)
+ if (_IO_is_wide (fp))
{
/* Wide-char stream. */
if (_IO_in_backup (fp))
Index: include/libc-symbols.h
===================================================================
--- include/libc-symbols.h (revision 4185)
+++ include/libc-symbols.h (working copy)
@@ -133,6 +133,8 @@
# endif
+#define attribute_always_inline __attribute__ ((always_inline))
+
#else /* __ASSEMBLER__ */
# ifdef HAVE_ASM_SET_DIRECTIVE
Index: wctype/Makefile
===================================================================
--- wctype/Makefile (revision 4185)
+++ wctype/Makefile (working copy)
@@ -19,12 +19,16 @@
#
# Sub-makefile for wctype portion of the library.
#
+include ../option-groups.mak
+
subdir := wctype
headers := wctype.h
distribute := wchar-lookup.h
-routines := wcfuncs wctype iswctype wctrans towctrans \
- wcfuncs_l wctype_l iswctype_l wctrans_l towctrans_l
+routines := wctrans towctrans towctrans_l
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) \
+ := wcfuncs wctype iswctype \
+ wcfuncs_l wctype_l iswctype_l wctrans_l
tests := test_wctype test_wcfuncs
Index: option-groups.mak
===================================================================
--- option-groups.mak (revision 4185)
+++ option-groups.mak (working copy)
@@ -15,6 +15,16 @@
# defaults from option-groups.defaults.
-include $(option_group_config_file)
+# $(call option-disabled, VAR) is 'y' if VAR is not 'y', or 'n' otherwise.
+# VAR should be a variable name, not a variable reference; this is
+# less general, but more terse for the intended use.
+# You can use it to add a file to a list if an option group is
+# disabled, like this:
+# routines-$(call option-disabled, OPTION_POSIX_C_LANG_WIDE_CHAR) += ...
+define option-disabled
+$(firstword $(subst y,n,$(filter y,$($(1)))) y)
+endef
+
# Establish 'routines-y', etc. as simply-expanded variables.
aux-y :=
extra-libs-others-y :=
Index: option-groups.def
===================================================================
--- option-groups.def (revision 4185)
+++ option-groups.def (working copy)
@@ -401,6 +401,7 @@
config OPTION_EGLIBC_LOCALE_CODE
bool "Locale functions"
+ depends OPTION_POSIX_C_LANG_WIDE_CHAR
help
This option group includes locale support functions, programs,
and libraries. With OPTION_EGLIBC_LOCALE_FUNCTIONS disabled,
@@ -645,6 +646,35 @@
performing word expansion in the manner of the shell, and the
accompanying 'wordfree' function.
+config OPTION_POSIX_C_LANG_WIDE_CHAR
+ bool "ISO C library wide character functions, excluding I/O"
+ help
+ This option group includes the functions defined by the ISO C
+ standard for working with wide and multibyte characters in
+ memory. Functions for reading and writing wide and multibyte
+ characters from and to files fall in the
+ OPTION_POSIX_WIDE_CHAR_DEVICE_IO option group.
+
+ This option group includes the following functions:
+
+ btowc mbsinit wcscspn wcstoll
+ iswalnum mbsrtowcs wcsftime wcstombs
+ iswalpha mbstowcs wcslen wcstoul
+ iswblank mbtowc wcsncat wcstoull
+ iswcntrl swprintf wcsncmp wcstoumax
+ iswctype swscanf wcsncpy wcsxfrm
+ iswdigit towctrans wcspbrk wctob
+ iswgraph towlower wcsrchr wctomb
+ iswlower towupper wcsrtombs wctrans
+ iswprint vswprintf wcsspn wctype
+ iswpunct vswscanf wcsstr wmemchr
+ iswspace wcrtomb wcstod wmemcmp
+ iswupper wcscat wcstof wmemcpy
+ iswxdigit wcschr wcstoimax wmemmove
+ mblen wcscmp wcstok wmemset
+ mbrlen wcscoll wcstol
+ mbrtowc wcscpy wcstold
+
config OPTION_POSIX_REGEXP
bool "Regular expressions"
help
@@ -668,6 +698,7 @@
config OPTION_POSIX_WIDE_CHAR_DEVICE_IO
bool "Input and output functions for wide characters"
+ depends OPTION_POSIX_C_LANG_WIDE_CHAR
help
This option group includes functions for reading and writing
wide characters to and from <stdio.h> streams.
@@ -692,6 +723,25 @@
some of these functions; you will not be able to link or run
C++ programs if you disable this option group.
+ This option group also affects the behavior of the following
+ functions:
+
+ fdopen
+ fopen
+ fopen64
+ freopen
+ freopen64
+
+ These functions all take an OPENTYPE parameter which may
+ contain a string of the form ",ccs=CHARSET", indicating that
+ the underlying file uses the character set named CHARSET.
+ This produces a wide-oriented stream, which is only useful
+ when the functions included in this option group are present.
+ If the user attempts to open a file specifying a character set
+ in the OPENTYPE parameter, and EGLIBC was built with this
+ option group disabled, the function returns NULL, and sets
+ errno to EINVAL.
+
# This helps Emacs users browse this file using the page motion commands
# and commands like 'pages-directory'.
Index: posix/regcomp.c
===================================================================
--- posix/regcomp.c (revision 4185)
+++ posix/regcomp.c (working copy)
@@ -304,7 +304,7 @@
{
re_dfa_t *dfa = (re_dfa_t *) bufp->buffer;
int node_cnt;
- int icase = (dfa->mb_cur_max == 1 && (bufp->syntax & RE_ICASE));
+ int icase = (dfa_mb_cur_max (dfa) == 1 && (bufp->syntax & RE_ICASE));
for (node_cnt = 0; node_cnt < init_state->nodes.nelem; ++node_cnt)
{
int node = init_state->nodes.elems[node_cnt];
@@ -314,9 +314,9 @@
{
re_set_fastmap (fastmap, icase, dfa->nodes[node].opr.c);
#ifdef RE_ENABLE_I18N
- if ((bufp->syntax & RE_ICASE) && dfa->mb_cur_max > 1)
+ if ((bufp->syntax & RE_ICASE) && dfa_mb_cur_max (dfa) > 1)
{
- unsigned char *buf = alloca (dfa->mb_cur_max), *p;
+ unsigned char *buf = alloca (dfa_mb_cur_max (dfa)), *p;
wchar_t wc;
mbstate_t state;
@@ -347,7 +347,11 @@
re_set_fastmap (fastmap, icase, ch);
}
}
-#ifdef RE_ENABLE_I18N
+
+ /* When OPTION_EGLIBC_LOCALE_CODE is disabled, the current
+ locale is always C, which has no rules and no multi-byte
+ characters. */
+#if defined RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE
else if (type == COMPLEX_BRACKET)
{
int i;
@@ -371,7 +375,7 @@
re_set_fastmap (fastmap, icase, i);
}
# else
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
for (i = 0; i < SBC_MAX; ++i)
if (__btowc (i) == WEOF)
re_set_fastmap (fastmap, icase, i);
@@ -384,7 +388,7 @@
memset (&state, '\0', sizeof (state));
if (__wcrtomb (buf, cset->mbchars[i], &state) != (size_t) -1)
re_set_fastmap (fastmap, icase, *(unsigned char *) buf);
- if ((bufp->syntax & RE_ICASE) && dfa->mb_cur_max > 1)
+ if ((bufp->syntax & RE_ICASE) && dfa_mb_cur_max (dfa) > 1)
{
if (__wcrtomb (buf, towlower (cset->mbchars[i]), &state)
!= (size_t) -1)
@@ -392,7 +396,7 @@
}
}
}
-#endif /* RE_ENABLE_I18N */
+#endif /* RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE */
else if (type == OP_PERIOD
#ifdef RE_ENABLE_I18N
|| type == OP_UTF8_PERIOD
@@ -835,7 +839,7 @@
dfa->mb_cur_max = MB_CUR_MAX;
#ifdef _LIBC
- if (dfa->mb_cur_max == 6
+ if (dfa_mb_cur_max (dfa) == 6
&& strcmp (_NL_CURRENT (LC_CTYPE, _NL_CTYPE_CODESET_NAME), "UTF-8") == 0)
dfa->is_utf8 = 1;
dfa->map_notascii = (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MAP_TO_NONASCII)
@@ -865,7 +869,7 @@
#endif
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
{
if (dfa->is_utf8)
dfa->sb_char = (re_bitset_ptr_t) utf8_sb_map;
@@ -1726,7 +1730,7 @@
token->word_char = 0;
#ifdef RE_ENABLE_I18N
token->mb_partial = 0;
- if (input->mb_cur_max > 1 &&
+ if (string_mb_cur_max (input) > 1 &&
!re_string_first_byte (input, re_string_cur_idx (input)))
{
token->type = CHARACTER;
@@ -1747,7 +1751,7 @@
token->opr.c = c2;
token->type = CHARACTER;
#ifdef RE_ENABLE_I18N
- if (input->mb_cur_max > 1)
+ if (string_mb_cur_max (input) > 1)
{
wint_t wc = re_string_wchar_at (input,
re_string_cur_idx (input) + 1);
@@ -1861,7 +1865,7 @@
token->type = CHARACTER;
#ifdef RE_ENABLE_I18N
- if (input->mb_cur_max > 1)
+ if (string_mb_cur_max (input) > 1)
{
wint_t wc = re_string_wchar_at (input, re_string_cur_idx (input));
token->word_char = IS_WIDE_WORD_CHAR (wc) != 0;
@@ -1961,7 +1965,7 @@
token->opr.c = c;
#ifdef RE_ENABLE_I18N
- if (input->mb_cur_max > 1 &&
+ if (string_mb_cur_max (input) > 1 &&
!re_string_first_byte (input, re_string_cur_idx (input)))
{
token->type = CHARACTER;
@@ -2175,7 +2179,7 @@
return NULL;
}
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
{
while (!re_string_eoi (regexp)
&& !re_string_first_byte (regexp, re_string_cur_idx (regexp)))
@@ -2313,7 +2317,7 @@
*err = REG_ESPACE;
return NULL;
}
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
dfa->has_mb_node = 1;
break;
case OP_WORD:
@@ -2606,7 +2610,7 @@
However, for !_LIBC we have no collation elements: if the
character set is single byte, the single byte character set
that we build below suffices. parse_bracket_exp passes
- no MBCSET if dfa->mb_cur_max == 1. */
+ no MBCSET if dfa_mb_cur_max (dfa) == 1. */
if (mbcset)
{
/* Check the space of the arrays. */
@@ -2769,11 +2773,13 @@
return __collseq_table_lookup (collseqwc, wc);
}
}
+#if __OPTION_EGLIBC_LOCALE_CODE
else if (br_elem->type == MB_CHAR)
{
if (nrules != 0)
return __collseq_table_lookup (collseqwc, br_elem->opr.wch);
}
+#endif
else if (br_elem->type == COLL_SYM)
{
size_t sym_name_len = strlen ((char *) br_elem->opr.name);
@@ -2851,7 +2857,7 @@
However, if we have no collation elements, and the character set
is single byte, the single byte character set that we
build below suffices. */
- if (nrules > 0 || dfa->mb_cur_max > 1)
+ if (nrules > 0 || dfa_mb_cur_max (dfa) > 1)
{
/* Check the space of the arrays. */
if (BE (*range_alloc == mbcset->nranges, 0))
@@ -2969,7 +2975,10 @@
re_bitset_ptr_t sbcset;
#ifdef RE_ENABLE_I18N
re_charset_t *mbcset;
- int coll_sym_alloc = 0, range_alloc = 0, mbchar_alloc = 0;
+ int coll_sym_alloc = 0, range_alloc = 0;
+#if __OPTION_EGLIBC_LOCALE_CODE
+ int mbchar_alloc = 0;
+#endif
int equiv_class_alloc = 0, char_class_alloc = 0;
#endif /* not RE_ENABLE_I18N */
int non_match = 0;
@@ -2979,7 +2988,13 @@
#ifdef _LIBC
collseqmb = (const unsigned char *)
_NL_CURRENT (LC_COLLATE, _NL_COLLATE_COLLSEQMB);
+#if __OPTION_EGLIBC_LOCALE_CODE
nrules = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES);
+#else
+ /* This is true when OPTION_EGLIBC_LOCALE_CODE is disabled, but the
+ compiler can't figure that out. */
+ nrules = 0;
+#endif
if (nrules)
{
/*
@@ -3103,7 +3118,7 @@
#else
# ifdef RE_ENABLE_I18N
*err = build_range_exp (sbcset,
- dfa->mb_cur_max > 1 ? mbcset : NULL,
+ dfa_mb_cur_max (dfa) > 1 ? mbcset : NULL,
&range_alloc, &start_elem, &end_elem);
# else
*err = build_range_exp (sbcset, &start_elem, &end_elem);
@@ -3119,7 +3134,7 @@
case SB_CHAR:
bitset_set (sbcset, start_elem.opr.ch);
break;
-#ifdef RE_ENABLE_I18N
+#if defined RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE
case MB_CHAR:
/* Check whether the array has enough space. */
if (BE (mbchar_alloc == mbcset->nmbchars, 0))
@@ -3137,7 +3152,7 @@
}
mbcset->mbchars[mbcset->nmbchars++] = start_elem.opr.wch;
break;
-#endif /* RE_ENABLE_I18N */
+#endif /* RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE */
case EQUIV_CLASS:
*err = build_equiv_class (sbcset,
#ifdef RE_ENABLE_I18N
@@ -3187,11 +3202,11 @@
#ifdef RE_ENABLE_I18N
/* Ensure only single byte characters are set. */
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
bitset_mask (sbcset, dfa->sb_char);
if (mbcset->nmbchars || mbcset->ncoll_syms || mbcset->nequiv_classes
- || mbcset->nranges || (dfa->mb_cur_max > 1 && (mbcset->nchar_classes
+ || mbcset->nranges || (dfa_mb_cur_max (dfa) > 1 && (mbcset->nchar_classes
|| mbcset->non_match)))
{
bin_tree_t *mbc_tree;
@@ -3260,7 +3275,7 @@
re_token_t *token, int token_len, re_dfa_t *dfa,
reg_syntax_t syntax, int accept_hyphen)
{
-#ifdef RE_ENABLE_I18N
+#if defined RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE
int cur_char_size;
cur_char_size = re_string_char_size_at (regexp, re_string_cur_idx (regexp));
if (cur_char_size > 1)
@@ -3270,7 +3285,7 @@
re_string_skip_bytes (regexp, cur_char_size);
return REG_NOERROR;
}
-#endif /* RE_ENABLE_I18N */
+#endif /* RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE */
re_string_skip_bytes (regexp, token_len); /* Skip a token. */
if (token->type == OP_OPEN_COLL_ELEM || token->type == OP_OPEN_CHAR_CLASS
|| token->type == OP_OPEN_EQUIV_CLASS)
@@ -3350,7 +3365,9 @@
build_equiv_class (bitset_t sbcset, const unsigned char *name)
#endif /* not RE_ENABLE_I18N */
{
-#ifdef _LIBC
+ /* When __OPTION_EGLIBC_LOCALE_CODE is disabled, only the C locale
+ is supported; it has no collation rules. */
+#if defined _LIBC && __OPTION_EGLIBC_LOCALE_CODE
uint32_t nrules = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES);
if (nrules != 0)
{
@@ -3423,7 +3440,7 @@
mbcset->equiv_classes[mbcset->nequiv_classes++] = idx1;
}
else
-#endif /* _LIBC */
+#endif /* _LIBC && __OPTION_EGLIBC_LOCALE_CODE */
{
if (BE (strlen ((const char *) name) != 1, 0))
return REG_ECOLLATE;
@@ -3457,7 +3474,7 @@
&& (strcmp (name, "upper") == 0 || strcmp (name, "lower") == 0))
name = "alpha";
-#ifdef RE_ENABLE_I18N
+#if defined RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE
/* Check the space of the arrays. */
if (BE (*char_class_alloc == mbcset->nchar_classes, 0))
{
@@ -3473,7 +3490,7 @@
*char_class_alloc = new_char_class_alloc;
}
mbcset->char_classes[mbcset->nchar_classes++] = __wctype (name);
-#endif /* RE_ENABLE_I18N */
+#endif /* RE_ENABLE_I18N && __OPTION_EGLIBC_LOCALE_CODE */
#define BUILD_CHARCLASS_LOOP(ctype_func) \
do { \
@@ -3584,7 +3601,7 @@
#ifdef RE_ENABLE_I18N
/* Ensure only single byte characters are set. */
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
bitset_mask (sbcset, dfa->sb_char);
#endif
@@ -3596,7 +3613,7 @@
goto build_word_op_espace;
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
{
bin_tree_t *mbc_tree;
/* Build a tree for complex bracket. */
Index: posix/regex_internal.c
===================================================================
--- posix/regex_internal.c (revision 4185)
+++ posix/regex_internal.c (working copy)
@@ -44,8 +44,8 @@
int init_buf_len;
/* Ensure at least one character fits into the buffers. */
- if (init_len < dfa->mb_cur_max)
- init_len = dfa->mb_cur_max;
+ if (init_len < dfa_mb_cur_max (dfa))
+ init_len = dfa_mb_cur_max (dfa);
init_buf_len = (len + 1 < init_len) ? len + 1: init_len;
re_string_construct_common (str, len, pstr, trans, icase, dfa);
@@ -56,7 +56,7 @@
pstr->word_char = dfa->word_char;
pstr->word_ops_used = dfa->word_ops_used;
pstr->mbs = pstr->mbs_allocated ? pstr->mbs : (unsigned char *) str;
- pstr->valid_len = (pstr->mbs_allocated || dfa->mb_cur_max > 1) ? 0 : len;
+ pstr->valid_len = (pstr->mbs_allocated || dfa_mb_cur_max (dfa) > 1) ? 0 : len;
pstr->valid_raw_len = pstr->valid_len;
return REG_NOERROR;
}
@@ -83,7 +83,7 @@
if (icase)
{
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
{
while (1)
{
@@ -92,7 +92,7 @@
return ret;
if (pstr->valid_raw_len >= len)
break;
- if (pstr->bufs_len > pstr->valid_len + dfa->mb_cur_max)
+ if (pstr->bufs_len > pstr->valid_len + dfa_mb_cur_max (dfa))
break;
ret = re_string_realloc_buffers (pstr, pstr->bufs_len * 2);
if (BE (ret != REG_NOERROR, 0))
@@ -106,7 +106,7 @@
else
{
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
build_wcs_buffer (pstr);
else
#endif /* RE_ENABLE_I18N */
@@ -131,7 +131,7 @@
re_string_realloc_buffers (re_string_t *pstr, int new_buf_len)
{
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
{
wint_t *new_wcs = re_realloc (pstr->wcs, wint_t, new_buf_len);
if (BE (new_wcs == NULL, 0))
@@ -171,7 +171,7 @@
pstr->trans = trans;
pstr->icase = icase ? 1 : 0;
pstr->mbs_allocated = (trans != NULL || icase);
- pstr->mb_cur_max = dfa->mb_cur_max;
+ pstr->mb_cur_max = dfa_mb_cur_max (dfa);
pstr->is_utf8 = dfa->is_utf8;
pstr->map_notascii = dfa->map_notascii;
pstr->stop = pstr->len;
@@ -197,7 +197,7 @@
{
#ifdef _LIBC
unsigned char buf[MB_LEN_MAX];
- assert (MB_LEN_MAX >= pstr->mb_cur_max);
+ assert (MB_LEN_MAX >= string_mb_cur_max (pstr));
#else
unsigned char buf[64];
#endif
@@ -220,7 +220,7 @@
{
int i, ch;
- for (i = 0; i < pstr->mb_cur_max && i < remain_len; ++i)
+ for (i = 0; i < string_mb_cur_max (pstr) && i < remain_len; ++i)
{
ch = pstr->raw_mbs [pstr->raw_mbs_idx + byte_idx + i];
buf[i] = pstr->mbs[byte_idx + i] = pstr->trans[ch];
@@ -268,7 +268,7 @@
size_t mbclen;
#ifdef _LIBC
char buf[MB_LEN_MAX];
- assert (MB_LEN_MAX >= pstr->mb_cur_max);
+ assert (MB_LEN_MAX >= string_mb_cur_max (pstr));
#else
char buf[64];
#endif
@@ -360,7 +360,7 @@
{
int i, ch;
- for (i = 0; i < pstr->mb_cur_max && i < remain_len; ++i)
+ for (i = 0; i < string_mb_cur_max (pstr) && i < remain_len; ++i)
{
ch = pstr->raw_mbs [pstr->raw_mbs_idx + src_idx + i];
buf[i] = pstr->trans[ch];
@@ -555,8 +555,9 @@
}
/* This function re-construct the buffers.
- Concretely, convert to wide character in case of pstr->mb_cur_max > 1,
- convert to upper case in case of REG_ICASE, apply translation. */
+ Concretely, convert to wide character in case of
+ string_mb_cur_max (pstr) > 1, convert to upper case in case of
+ REG_ICASE, apply translation. */
static reg_errcode_t
internal_function
@@ -567,7 +568,7 @@
{
/* Reset buffer. */
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
memset (&pstr->cur_state, '\0', sizeof (mbstate_t));
#endif /* RE_ENABLE_I18N */
pstr->len = pstr->raw_len;
@@ -658,7 +659,7 @@
pstr->tip_context = re_string_context_at (pstr, offset - 1,
eflags);
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
memmove (pstr->wcs, pstr->wcs + offset,
(pstr->valid_len - offset) * sizeof (wint_t));
#endif /* RE_ENABLE_I18N */
@@ -687,7 +688,7 @@
#endif
pstr->valid_len = 0;
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
{
int wcs_idx;
wint_t wc = WEOF;
@@ -699,7 +700,7 @@
/* Special case UTF-8. Multi-byte chars start with any
byte other than 0x80 - 0xbf. */
raw = pstr->raw_mbs + pstr->raw_mbs_idx;
- end = raw + (offset - pstr->mb_cur_max);
+ end = raw + (offset - string_mb_cur_max (pstr));
if (end < pstr->raw_mbs)
end = pstr->raw_mbs;
p = raw + offset - 1;
@@ -791,7 +792,7 @@
/* Then build the buffers. */
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
{
if (pstr->icase)
{
@@ -829,7 +830,7 @@
return re_string_peek_byte (pstr, idx);
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1
+ if (string_mb_cur_max (pstr) > 1
&& ! re_string_is_single_byte_char (pstr, pstr->cur_idx + idx))
return re_string_peek_byte (pstr, idx);
#endif
@@ -918,7 +919,7 @@
return ((eflags & REG_NOTEOL) ? CONTEXT_ENDBUF
: CONTEXT_NEWLINE | CONTEXT_ENDBUF);
#ifdef RE_ENABLE_I18N
- if (input->mb_cur_max > 1)
+ if (string_mb_cur_max (input) > 1)
{
wint_t wc;
int wc_idx = idx;
@@ -1429,7 +1430,7 @@
dfa->nodes[dfa->nodes_len].constraint = 0;
#ifdef RE_ENABLE_I18N
dfa->nodes[dfa->nodes_len].accept_mb =
- (type == OP_PERIOD && dfa->mb_cur_max > 1) || type == COMPLEX_BRACKET;
+ (type == OP_PERIOD && dfa_mb_cur_max (dfa) > 1) || type == COMPLEX_BRACKET;
#endif
dfa->nexts[dfa->nodes_len] = -1;
re_node_set_init_empty (dfa->edests + dfa->nodes_len);
Index: posix/regex_internal.h
===================================================================
--- posix/regex_internal.h (revision 4185)
+++ posix/regex_internal.h (working copy)
@@ -27,6 +27,10 @@
#include <stdlib.h>
#include <string.h>
+#if defined _LIBC
+# include <gnu/option-groups.h>
+#endif
+
#if defined HAVE_LANGINFO_H || defined HAVE_LANGINFO_CODESET || defined _LIBC
# include <langinfo.h>
#endif
@@ -373,6 +377,13 @@
};
typedef struct re_string_t re_string_t;
+/* When OPTION_EGLIBC_LOCALE_CODE is disabled, this is always 1;
+ help the compiler make use of that fact. */
+#if __OPTION_EGLIBC_LOCALE_CODE
+# define string_mb_cur_max(str) ((str)->mb_cur_max + 0)
+#else
+# define string_mb_cur_max(str) (1)
+#endif
struct re_dfa_t;
typedef struct re_dfa_t re_dfa_t;
@@ -657,6 +668,14 @@
__libc_lock_define (, lock)
};
+/* When OPTION_EGLIBC_LOCALE_CODE is disabled, this is always 1;
+ help the compiler make use of that fact. */
+#if __OPTION_EGLIBC_LOCALE_CODE
+# define dfa_mb_cur_max(dfa) ((dfa)->mb_cur_max + 0)
+#else
+# define dfa_mb_cur_max(dfa) (1)
+#endif
+
#define re_node_set_init_empty(set) memset (set, '\0', sizeof (re_node_set))
#define re_node_set_remove(set,id) \
(re_node_set_remove_at (set, re_node_set_contains (set, id) - 1))
@@ -717,7 +736,7 @@
re_string_char_size_at (const re_string_t *pstr, int idx)
{
int byte_idx;
- if (pstr->mb_cur_max == 1)
+ if (string_mb_cur_max (pstr) == 1)
return 1;
for (byte_idx = 1; idx + byte_idx < pstr->valid_len; ++byte_idx)
if (pstr->wcs[idx + byte_idx] != WEOF)
@@ -729,7 +748,7 @@
internal_function __attribute ((pure))
re_string_wchar_at (const re_string_t *pstr, int idx)
{
- if (pstr->mb_cur_max == 1)
+ if (string_mb_cur_max (pstr) == 1)
return (wint_t) pstr->mbs[idx];
return (wint_t) pstr->wcs[idx];
}
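A note on the dfa_mb_cur_max / string_mb_cur_max macros above: the
following is a minimal sketch of the technique, not code from the
patch. It assumes that the "+ 0" is there to keep the macro from being
used as an assignment target. SKETCH_LOCALE_CODE, struct sketch_dfa,
sketch_mb_cur_max and sketch_char_size are all hypothetical names;
SKETCH_LOCALE_CODE stands in for __OPTION_EGLIBC_LOCALE_CODE.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for __OPTION_EGLIBC_LOCALE_CODE; 0 means the
   option group is disabled.  */
#ifndef SKETCH_LOCALE_CODE
# define SKETCH_LOCALE_CODE 0
#endif

/* Hypothetical, cut-down stand-in for re_dfa_t.  */
struct sketch_dfa
{
  int mb_cur_max;
};

#if SKETCH_LOCALE_CODE
/* "+ 0" turns the expansion into an rvalue, so the macro cannot be
   used as an assignment target.  */
# define sketch_mb_cur_max(dfa) ((dfa)->mb_cur_max + 0)
#else
/* A constant: the optimizer can discard every "> 1" branch.  */
# define sketch_mb_cur_max(dfa) (1)
#endif

/* Return how many bytes the next character occupies.  MULTIBYTE_LEN
   is the length a multibyte decoder would have reported.  */
static size_t
sketch_char_size (const struct sketch_dfa *dfa, size_t multibyte_len)
{
  assert (MB_LEN_MAX >= sketch_mb_cur_max (dfa));
  if (sketch_mb_cur_max (dfa) > 1)
    return multibyte_len;   /* Unreachable when the macro is 1.  */
  return 1;
}
```

With the default SKETCH_LOCALE_CODE of 0, the multibyte branch is dead
code whatever value the structure field holds. The compiler only drops
it, along with any references to wide character functions inside it,
when optimization is enabled.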
Index: posix/fnmatch_loop.c
===================================================================
--- posix/fnmatch_loop.c (revision 4185)
+++ posix/fnmatch_loop.c (working copy)
@@ -277,7 +277,7 @@
/* Leave room for the null. */
CHAR str[CHAR_CLASS_MAX_LENGTH + 1];
size_t c1 = 0;
-#if defined _LIBC || (defined HAVE_WCTYPE_H && defined HAVE_WCHAR_H)
+#if defined _LIBC ? __OPTION_POSIX_C_LANG_WIDE_CHAR : (defined HAVE_WCTYPE_H && defined HAVE_WCHAR_H)
wctype_t wt;
#endif
const CHAR *startp = p;
@@ -307,7 +307,7 @@
}
str[c1] = L('\0');
-#if defined _LIBC || (defined HAVE_WCTYPE_H && defined HAVE_WCHAR_H)
+#if defined _LIBC ? __OPTION_POSIX_C_LANG_WIDE_CHAR : (defined HAVE_WCTYPE_H && defined HAVE_WCHAR_H)
wt = IS_CHAR_CLASS (str);
if (wt == 0)
/* Invalid character class name. */
Index: posix/regexec.c
===================================================================
--- posix/regexec.c (revision 4185)
+++ posix/regexec.c (working copy)
@@ -185,11 +185,11 @@
static int check_node_accept_bytes (const re_dfa_t *dfa, int node_idx,
const re_string_t *input, int idx)
internal_function;
-# ifdef _LIBC
+# if defined _LIBC && __OPTION_EGLIBC_LOCALE_CODE
static unsigned int find_collation_sequence_value (const unsigned char *mbs,
size_t name_len)
internal_function;
-# endif /* _LIBC */
+# endif /* _LIBC && __OPTION_EGLIBC_LOCALE_CODE */
#endif /* RE_ENABLE_I18N */
static int group_nodes_into_DFAstates (const re_dfa_t *dfa,
const re_dfastate_t *state,
@@ -711,7 +711,7 @@
incr = (range < 0) ? -1 : 1;
left_lim = (range < 0) ? start + range : start;
right_lim = (range < 0) ? start : start + range;
- sb = dfa->mb_cur_max == 1;
+ sb = dfa_mb_cur_max (dfa) == 1;
match_kind =
(fastmap
? ((sb || !(preg->syntax & RE_ICASE || t) ? 4 : 0)
@@ -3405,7 +3405,7 @@
if (BE (dest_states_word[i] == NULL && err != REG_NOERROR, 0))
goto out_free;
- if (dest_states[i] != dest_states_word[i] && dfa->mb_cur_max > 1)
+ if (dest_states[i] != dest_states_word[i] && dfa_mb_cur_max (dfa) > 1)
need_word_trtable = 1;
dest_states_nl[i] = re_acquire_state_context (&err, dfa, &follows,
@@ -3547,7 +3547,7 @@
else if (type == OP_PERIOD)
{
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
bitset_merge (accepts, dfa->sb_char);
else
#endif
@@ -3598,7 +3598,7 @@
continue;
}
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
for (j = 0; j < BITSET_WORDS; ++j)
any_set |= (accepts[j] &= (dfa->word_char[j] | ~dfa->sb_char[j]));
else
@@ -3617,7 +3617,7 @@
continue;
}
#ifdef RE_ENABLE_I18N
- if (dfa->mb_cur_max > 1)
+ if (dfa_mb_cur_max (dfa) > 1)
for (j = 0; j < BITSET_WORDS; ++j)
any_set |= (accepts[j] &= ~(dfa->word_char[j] & dfa->sb_char[j]));
else
@@ -3789,12 +3789,6 @@
if (node->type == COMPLEX_BRACKET)
{
const re_charset_t *cset = node->opr.mbcset;
-# ifdef _LIBC
- const unsigned char *pin
- = ((const unsigned char *) re_string_get_buffer (input) + str_idx);
- int j;
- uint32_t nrules;
-# endif /* _LIBC */
int match_len = 0;
wchar_t wc = ((cset->nranges || cset->nchar_classes || cset->nmbchars)
? re_string_wchar_at (input, str_idx) : 0);
@@ -3806,6 +3800,7 @@
match_len = char_len;
goto check_node_accept_bytes_match;
}
+#if __OPTION_EGLIBC_LOCALE_CODE
/* match with character_class? */
for (i = 0; i < cset->nchar_classes; ++i)
{
@@ -3816,8 +3811,16 @@
goto check_node_accept_bytes_match;
}
}
+#endif
-# ifdef _LIBC
+ /* When __OPTION_EGLIBC_LOCALE_CODE is disabled, only the C
+ locale is supported; it has no collation rules. */
+# if defined _LIBC && __OPTION_EGLIBC_LOCALE_CODE
+ const unsigned char *pin
+ = ((const unsigned char *) re_string_get_buffer (input) + str_idx);
+ int j;
+ uint32_t nrules;
+
nrules = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES);
if (nrules != 0)
{
@@ -3910,8 +3913,12 @@
}
}
else
-# endif /* _LIBC */
+# endif /* _LIBC && __OPTION_EGLIBC_LOCALE_CODE */
{
+ /* In the _LIBC version, if OPTION_EGLIBC_LOCALE_CODE is
+ disabled, there can be no multibyte range endpoints, and
+ cset->nranges is always zero. */
+#if __OPTION_EGLIBC_LOCALE_CODE
/* match with range expression? */
#if __GNUC__ >= 2
wchar_t cmp_buf[] = {L'\0', L'\0', wc, L'\0', L'\0', L'\0'};
@@ -3930,6 +3937,7 @@
goto check_node_accept_bytes_match;
}
}
+#endif /* __OPTION_EGLIBC_LOCALE_CODE */
}
check_node_accept_bytes_match:
if (!cset->non_match)
@@ -3945,7 +3953,7 @@
return 0;
}
-# ifdef _LIBC
+# if defined _LIBC && __OPTION_EGLIBC_LOCALE_CODE
static unsigned int
internal_function
find_collation_sequence_value (const unsigned char *mbs, size_t mbs_len)
@@ -4003,7 +4011,7 @@
return UINT_MAX;
}
}
-# endif /* _LIBC */
+# endif /* _LIBC && __OPTION_EGLIBC_LOCALE_CODE */
#endif /* RE_ENABLE_I18N */
/* Check whether the node accepts the byte which is IDX-th
@@ -4088,7 +4096,7 @@
if (pstr->icase)
{
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
{
ret = build_wcs_upper_buffer (pstr);
if (BE (ret != REG_NOERROR, 0))
@@ -4101,7 +4109,7 @@
else
{
#ifdef RE_ENABLE_I18N
- if (pstr->mb_cur_max > 1)
+ if (string_mb_cur_max (pstr) > 1)
build_wcs_buffer (pstr);
else
#endif /* RE_ENABLE_I18N */
Index: posix/fnmatch.c
===================================================================
--- posix/fnmatch.c (revision 4185)
+++ posix/fnmatch.c (working copy)
@@ -31,6 +31,10 @@
#include <fnmatch.h>
#include <ctype.h>
+#if defined _LIBC
+# include <gnu/option-groups.h>
+#endif
+
#if HAVE_STRING_H || defined _LIBC
# include <string.h>
#else
@@ -132,7 +136,7 @@
# define ISWCTYPE(WC, WT) iswctype (WC, WT)
# endif
-# if (HAVE_MBSTATE_T && HAVE_MBSRTOWCS) || _LIBC
+# if (HAVE_MBSTATE_T && HAVE_MBSRTOWCS) || (_LIBC && __OPTION_EGLIBC_LOCALE_CODE)
/* In this case we are implementing the multibyte character handling. */
# define HANDLE_MULTIBYTE 1
# endif
Index: stdio-common/printf_fp.c
===================================================================
--- stdio-common/printf_fp.c (revision 4185)
+++ stdio-common/printf_fp.c (working copy)
@@ -150,6 +150,10 @@
wchar_t thousands_sep, int ngroups)
internal_function;
+/* Ideally, when OPTION_EGLIBC_LOCALE_CODE is disabled, this should do
+ all its work in ordinary characters, rather than doing it in wide
+ characters and then converting at the end. But that is a challenge
+ for another day. */
int
___printf_fp (FILE *fp,
@@ -211,7 +215,14 @@
mp_limb_t cy;
/* Nonzero if this is output on a wide character stream. */
+#if __OPTION_POSIX_C_LANG_WIDE_CHAR
int wide = info->wide;
+#else
+ /* This should never be called on a wide-oriented stream when
+ OPTION_POSIX_C_LANG_WIDE_CHAR is disabled, but the compiler can't
+ be trusted to figure that out. */
+ const int wide = 0;
+#endif
/* Buffer in which we produce the output. */
wchar_t *wbuffer = NULL;
Index: stdio-common/printf_fphex.c
===================================================================
--- stdio-common/printf_fphex.c (revision 4185)
+++ stdio-common/printf_fphex.c (working copy)
@@ -145,7 +145,14 @@
int done = 0;
/* Nonzero if this is output on a wide character stream. */
+#if __OPTION_POSIX_C_LANG_WIDE_CHAR
int wide = info->wide;
+#else
+ /* This should never be called on a wide-oriented stream when
+ OPTION_POSIX_C_LANG_WIDE_CHAR is disabled, but the compiler can't
+ be trusted to figure that out. */
+ const int wide = 0;
+#endif
/* Figure out the decimal point character. */
Index: stdio-common/_i18n_number.h
===================================================================
--- stdio-common/_i18n_number.h (revision 4185)
+++ stdio-common/_i18n_number.h (working copy)
@@ -19,10 +19,13 @@
#include <wchar.h>
#include <wctype.h>
+#include <gnu/option-groups.h>
#include "../locale/outdigits.h"
#include "../locale/outdigitswc.h"
+#if __OPTION_EGLIBC_LOCALE_CODE
+
static CHAR_T *
_i18n_number_rewrite (CHAR_T *w, CHAR_T *rear_ptr)
{
@@ -93,3 +96,13 @@
return w;
}
+
+#else
+
+static CHAR_T *
+_i18n_number_rewrite (CHAR_T *w, CHAR_T *rear_ptr)
+{
+ return w;
+}
+
+#endif
Index: stdio-common/printf_size.c
===================================================================
--- stdio-common/printf_size.c (revision 4185)
+++ stdio-common/printf_size.c (working copy)
@@ -24,6 +24,7 @@
#include <math.h>
#include <printf.h>
#include <libioP.h>
+#include <gnu/option-groups.h>
/* This defines make it possible to use the same code for GNU C library and
@@ -117,7 +118,14 @@
struct printf_info fp_info;
int done = 0;
+#if __OPTION_POSIX_C_LANG_WIDE_CHAR
int wide = info->wide;
+#else
+ /* This should never be called on a wide-oriented stream when
+ OPTION_POSIX_C_LANG_WIDE_CHAR is disabled, but the compiler can't
+ be trusted to figure that out. */
+ const int wide = 0;
+#endif
/* Fetch the argument value. */
Index: stdio-common/Makefile
===================================================================
--- stdio-common/Makefile (revision 4185)
+++ stdio-common/Makefile (working copy)
@@ -30,7 +30,7 @@
_itoa _itowa itoa-digits itoa-udigits itowa-digits \
vfprintf vprintf printf_fp reg-printf printf-prs printf_fphex \
printf_size fprintf printf snprintf sprintf asprintf dprintf \
- vfwprintf vfscanf vfwscanf \
+ vfscanf \
fscanf scanf sscanf \
perror psignal \
tmpfile tmpfile64 tmpnam tmpnam_r tempnam tempname \
@@ -39,12 +39,18 @@
flockfile ftrylockfile funlockfile \
isoc99_scanf isoc99_vscanf isoc99_fscanf isoc99_vfscanf isoc99_sscanf \
isoc99_vsscanf
+# Ideally, _itowa and itowa-digits would be in this option group as
+# well, but they are used unconditionally by printf_fp and printf_fphex,
+# and it didn't seem straightforward to disentangle them.
+routines-$(OPTION_POSIX_C_LANG_WIDE_CHAR) += \
+ vfwprintf vfwscanf
include ../Makeconfig
install-headers-nosubdir: $(inst_includedir)/bits/stdio_lim.h
-aux := errlist siglist printf-parsemb printf-parsewc fxprintf
+aux := errlist siglist printf-parsemb fxprintf
+aux-$(OPTION_POSIX_C_LANG_WIDE_CHAR) += printf-parsewc
distribute := _itoa.h _itowa.h _i18n_number.h \
printf-parse.h stdio_lim.h.in tst-unbputc.sh tst-printf.sh
Index: stdio-common/vfprintf.c
===================================================================
--- stdio-common/vfprintf.c (revision 4185)
+++ stdio-common/vfprintf.c (working copy)
@@ -31,6 +31,7 @@
#include "_itoa.h"
#include <locale/localeinfo.h>
#include <stdio.h>
+#include <gnu/option-groups.h>
/* This code is shared between the standard stdio implementation found
in GNU C library and the libio implementation originally found in
@@ -106,6 +107,12 @@
# define EOF WEOF
#endif
+#if __OPTION_EGLIBC_LOCALE_CODE
+# define MULTIBYTE_SUPPORT (1)
+#else
+# define MULTIBYTE_SUPPORT (0)
+#endif
+
#include "_i18n_number.h"
/* Include the shared code for parsing the format string. */
@@ -1075,7 +1082,7 @@
# define process_string_arg(fspec) \
LABEL (form_character): \
/* Character. */ \
- if (is_long) \
+ if (MULTIBYTE_SUPPORT && is_long) \
goto LABEL (form_wcharacter); \
--width; /* Account for the character itself. */ \
if (!left) \
@@ -1089,6 +1096,8 @@
break; \
\
LABEL (form_wcharacter): \
+ if (! MULTIBYTE_SUPPORT) \
+ goto LABEL (form_character); \
{ \
/* Wide character. */ \
char buf[MB_CUR_MAX]; \
@@ -1144,14 +1153,15 @@
len = 0; \
} \
} \
- else if (!is_long && spec != L_('S')) \
+ else if (! MULTIBYTE_SUPPORT || (!is_long && spec != L_('S'))) \
{ \
if (prec != -1) \
{ \
/* Search for the end of the string, but don't search past \
the length (in bytes) specified by the precision. Also \
don't use incomplete characters. */ \
- if (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MB_CUR_MAX) == 1) \
+ if (! MULTIBYTE_SUPPORT \
+ || _NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MB_CUR_MAX) == 1) \
len = __strnlen (string, prec); \
else \
{ \
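A note on the MULTIBYTE_SUPPORT tests above: the following is a rough
sketch of the same pattern outside vfprintf.c, not code from the
patch. The constant sits in an ordinary "if", so both branches are
still parsed and type-checked, but the multibyte branch can be
optimized away. SKETCH_LOCALE_CODE, SKETCH_MULTIBYTE_SUPPORT and
sketch_print_len are hypothetical names, and the mbrlen loop is only
an assumed stand-in for the multibyte scan that the real code performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for __OPTION_EGLIBC_LOCALE_CODE.  */
#ifndef SKETCH_LOCALE_CODE
# define SKETCH_LOCALE_CODE 0
#endif

#if SKETCH_LOCALE_CODE
# define SKETCH_MULTIBYTE_SUPPORT (1)
#else
# define SKETCH_MULTIBYTE_SUPPORT (0)
#endif

/* Return the number of bytes of STRING to print, searching no further
   than PREC bytes and never counting an incomplete character.  */
static size_t
sketch_print_len (const char *string, size_t prec, int mb_cur_max)
{
  size_t len = 0;

  assert (string != NULL);

  if (! SKETCH_MULTIBYTE_SUPPORT || mb_cur_max == 1)
    {
      /* Single-byte case: a bounded length search suffices.  */
      while (len < prec && string[len] != '\0')
        ++len;
      return len;
    }

  /* Multibyte case: advance one whole character at a time.  This
     branch is dead code when SKETCH_MULTIBYTE_SUPPORT is 0.  */
  mbstate_t state;
  memset (&state, '\0', sizeof state);
  while (len < prec)
    {
      size_t n = mbrlen (string + len, prec - len, &state);
      if (n == 0 || n == (size_t) -1 || n == (size_t) -2)
        break;
      len += n;
    }
  return len;
}
```

With the default SKETCH_LOCALE_CODE of 0, the function takes the
single-byte path even when the caller passes a larger mb_cur_max,
which mirrors how the patched code falls back to __strnlen.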
Index: stdio-common/vfscanf.c
===================================================================
--- stdio-common/vfscanf.c (revision 4185)
+++ stdio-common/vfscanf.c (working copy)
@@ -29,6 +29,7 @@
#include <wctype.h>
#include <bits/libc-lock.h>
#include <locale/localeinfo.h>
+#include <gnu/option-groups.h>
#ifdef __GNUC__
# define HAVE_LONGLONG
@@ -133,6 +134,12 @@
# define WINT_T int
#endif
+#if __OPTION_EGLIBC_LOCALE_CODE
+# define MULTIBYTE_SUPPORT (1)
+#else
+# define MULTIBYTE_SUPPORT (0)
+#endif
+
#define encode_error() do { \
errval = 4; \
__set_errno (EILSEQ); \
@@ -360,7 +367,8 @@
#endif
#ifndef COMPILE_WSCANF
- if (!isascii ((unsigned char) *f))
+ if (MULTIBYTE_SUPPORT
+ && !isascii ((unsigned char) *f))
{
/* Non-ASCII, may be a multibyte. */
int len = __mbrlen (f, strlen (f), &state);
@@ -645,7 +653,9 @@
break;
case L_('c'): /* Match characters. */
- if ((flags & LONG) == 0)
+ scan_character:
+ if (! MULTIBYTE_SUPPORT
+ || (flags & LONG) == 0)
{
if (width == -1)
width = 1;
@@ -798,6 +808,9 @@
}
/* FALLTHROUGH */
case L_('C'):
+ if (! MULTIBYTE_SUPPORT)
+ goto scan_character;
+
if (width == -1)
width = 1;
@@ -947,7 +960,9 @@
break;
case L_('s'): /* Read a string. */
- if (!(flags & LONG))
+ scan_string:
+ if (! MULTIBYTE_SUPPORT
+ || !(flags & LONG))
{
STRING_ARG (str, char, 100);
@@ -1124,6 +1139,8 @@
/* FALLTHROUGH */
case L_('S'):
+ if (! MULTIBYTE_SUPPORT)
+ goto scan_string;
{
#ifndef COMPILE_WSCANF
mbstate_t cstate;
@@ -1365,10 +1382,17 @@
const char *mbdigits[10];
const char *mbdigits_extended[10];
#endif
+#if __OPTION_EGLIBC_LOCALE_CODE
/* "to_inpunct" is a map from ASCII digits to their
equivalent in locale. This is defined for locales
which use an extra digits set. */
wctrans_t map = __wctrans ("to_inpunct");
+#else
+ /* The map is always null when
+ OPTION_EGLIBC_LOCALE_CODE is disabled, but the
+ compiler can't figure that out. */
+ wctrans_t map = NULL;
+#endif
int n;
from_level = 0;
@@ -2027,7 +2051,8 @@
}
wctrans_t map;
- if (__builtin_expect ((flags & I18N) != 0, 0)
+ if (MULTIBYTE_SUPPORT
+ && __builtin_expect ((flags & I18N) != 0, 0)
/* Hexadecimal floats make no sense, fixing localized
digits with ASCII letters. */
&& !(flags & HEXA_FLOAT)
@@ -2280,7 +2305,7 @@
break;
case L_('['): /* Character class. */
- if (flags & LONG)
+ if (MULTIBYTE_SUPPORT && (flags & LONG))
STRING_ARG (wstr, wchar_t, 100);
else
STRING_ARG (str, char, 100);
@@ -2352,7 +2377,7 @@
conv_error();
#endif
- if (flags & LONG)
+ if (MULTIBYTE_SUPPORT && (flags & LONG))
{
size_t now = read_in;
#ifdef COMPILE_WSCANF