> {0} initializer in C or C++ for unions no longer guarantees clearing of the whole union (except for static storage duration initialization), it just initializes the first union member to zero. If initialization of the whole union including padding bits is desirable, use {} (valid in C23 or C++) or use -fzero-init-padding-bits=unions option to restore old GCC behavior.
This is going to silently break so much existing code, especially union-based type punning in C code. {0} used to guarantee full zeroing and {} did not, and step by step we've flipped the situation to the reverse. The only sensible thing, in terms of not breaking old code, would be to have both {0} and {} zero-initialize the whole union.
I'm sure this change was discussed in depth on the mailing list, but it's absolutely mind-boggling to me.
Fun fact: GCC decided to adopt Clang's (old) behavior at the same time Clang decided to adopt GCC's (old) behavior.
So now you have this matrix of behaviors:
* Old GCC: Initializes whole union.
* New GCC: Initializes first member only.
* Old Clang: Initializes first member only.
* New Clang: Initializes whole union.
That's funny and sad at the same time.
And it shows a deeper problem: even though they are willing to align their behavior with each other, they failed to communicate and discuss what the best approach would be. That's a bit tragic, IMO.
I would argue the even deeper problem is that it's implementation defined. Should be in the spec and they should conform to the spec. That's why I'm so paranoid and zeroize things myself. Too much hassle to remember what is or isn't zero.
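For what it's worth, "zeroize things myself" can be as small as a memset wrapper. This is only a sketch: `union value`, `value_clear` and `value_is_all_zero` are made-up names for illustration, not anything from GCC or the release notes.

```c
#include <string.h>

/* Hypothetical union for illustration: the first member is much
   smaller than the union itself. */
union value {
    unsigned char tag;     /* first member: 1 byte */
    unsigned long long u;  /* widest member */
};

/* Zero every byte of the object, including the bytes that lie outside
   the first member, no matter how the compiler treats {0} or {}. */
void value_clear(union value *v) {
    memset(v, 0, sizeof *v);
}

/* Returns 1 if every byte of *v is zero, 0 otherwise. */
int value_is_all_zero(const union value *v) {
    const unsigned char *p = (const unsigned char *)v;
    for (size_t i = 0; i < sizeof *v; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

The memset does not depend on compiler version or on -fzero-init-padding-bits, which is the whole point of being paranoid here.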
It is like an era of average.
Since having multiple compilers is often touted as an advantage, how often do situations like what you're describing happen compared to the opposite — when a second compiler surfaces bugs in one's application or the other compiler?
i will call it "webification" of C!
I thought that {} should always initialize everything regardless of whether or not there is anything in between the braces, and that {0} should only be valid if the first member is a numeric or pointer type (but otherwise has the same effect as {} with nothing in between). I thought that would make more sense, wouldn't it?
(If you write {} with multiple values when initializing a union, then it should be an error unless all of the values are the same and all of the corresponding members (the first few if you do not explicitly specify which ones) are of the same type as each other.)
C never had {} until C23. In C {0} was the only way to explicitly zero-initialize a structure in a generic manner. It works because in C initializer lists are applied to members as-if nested structures are flattened out lexically.
However, a long time ago C++ went in a completely different direction with initializer lists, and gcc and clang started emitting warnings (in C mode) about otherwise perfectly valid C code, thus the adoption of C++'s {} for C23. {0} is still technically valid C23, though, as well as valid C89, C90, C99, and C11. In fact, reading both C23 and C89 I'm struck by how little the language has changed:
C89 3.5.7p16:
> If the aggregate contains members that are aggregates or unions, or if the first member of a union is an aggregate or union, the rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the members of the subaggregate or the first member of the contained union. Otherwise, only enough initializers from the list are taken to account for the members of the first subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next member of the aggregate of which the current subaggregate or contained union is a part.
C23 6.7.10p21:
> If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the elements or members of the subaggregate or the contained union. Otherwise, only enough initializers from the list are taken to account for the elements or members of the subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next element or member of the aggregate of which the current subaggregate or contained union is a part.
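A small sketch of the flattening rule quoted above (brace elision). The `struct point` / `struct line` types and the `make_*` functions are invented for illustration. Note that GCC and Clang may warn about the missing inner braces under -Wmissing-braces, which is the warning noise mentioned earlier.

```c
struct point { int x, y; };
struct line  { struct point a, b; int color; };

/* Inner braces omitted: initializers are consumed in lexical order,
   so 1,2 go to a, 3,4 go to b, and 5 goes to color. */
struct line make_flat(void) {
    struct line l = {1, 2, 3, 4, 5};
    return l;
}

/* Fewer initializers than members: a = {1, 2}, b.x = 3, and the
   remaining members (b.y, color) are zero-initialized. */
struct line make_partial(void) {
    struct line l = {1, 2, 3};
    return l;
}

/* The {0} idiom: the single 0 lands on a.x via the same rule and
   every other member is implicitly zero-initialized. */
struct line make_zero(void) {
    struct line l = {0};
    return l;
}
```

For structs this zeroes every named member. The union case discussed in this thread is different because only the first member gets initialized this way.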
This was my instinct too, until I got this little tickle in the back of my head that maybe I remembered that Clang was already acting like this, so maybe it won't be so bad. Notice 32-bit wzr vs 64-bit xzr:
$ cat union.c && clang -O1 -c union.c -o union.o && objdump -d union.o
union foo {
    float f;
    double d;
};
void create_f(union foo *u) {
    *u = (union foo){0};
}
void create_d(union foo *u) {
    *u = (union foo){.d=0};
}
union.o: file format mach-o arm64
Disassembly of section __TEXT,__text:
0000000000000000 <ltmp0>:
       0: b900001f    str wzr, [x0]
       4: d65f03c0    ret
0000000000000008 <_create_d>:
       8: f900001f    str xzr, [x0]
       c: d65f03c0    ret
Ah, I can confirm what I see elsewhere in the thread, this is no longer true in Clang. That first clang was Apple Clang 17---who knows what version that actually is---and here is Clang 20:
$ /opt/homebrew/opt/llvm/bin/clang-20 -O1 -c union.c -o union.o && objdump -d union.o
union.o: file format mach-o arm64
Disassembly of section __TEXT,__text:
0000000000000000 <ltmp0>:
       0: f900001f    str xzr, [x0]
       4: d65f03c0    ret
0000000000000008 <_create_d>:
       8: f900001f    str xzr, [x0]
       c: d65f03c0    ret
Looks like that change is clang ≤19 to clang 20: https://godbolt.org/z/7zrocxGaq
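If you'd rather poke at this at runtime than read disassembly, here is a rough sketch of the same experiment. It reuses `union foo` from the snippet above; `zero_bytes_after_init` is a made-up helper. It assumes an all-zero bit pattern for 0.0f (true for IEEE 754). The bytes outside the float are unspecified after the assignment, so the count can legitimately differ between compilers, versions and optimization levels.

```c
#include <string.h>

union foo {
    float f;
    double d;
};

/* Fill the object with 0xFF, assign (union foo){0}, then count how
   many bytes ended up zero. Whole-union zeroing gives
   sizeof(union foo); first-member-only zeroing may leave some of the
   0xFF bytes in place. */
size_t zero_bytes_after_init(void) {
    union foo u;
    memset(&u, 0xFF, sizeof u);
    u = (union foo){0};

    const unsigned char *p = (const unsigned char *)&u;
    size_t n = 0;
    for (size_t i = 0; i < sizeof u; i++) {
        n += (p[i] == 0);
    }
    return n;
}
```

Only the lower bound is dependable: at least sizeof(float) bytes are zero. Anything above that is the compiler being generous.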
> This is going to silently break so much existing code
The code was already broken. It was undefined behavior.
That's a problem with C and its undefined behavior minefields.
GCC has long been known to define undefined behavior in C unions. In particular, type punning in unions is undefined behavior under the C and C++ standards, but GCC (and Clang) define it.
I have always thought that punning through a union was legal in C but UB in C++, and that punning through incompatible pointer casting was UB in both.
I am basing this entirely on memory and the wikipedia article on type punning. I welcome extremely pedantic feedback.
> type punning in unions is undefined behavior under the C and C++ standards
Union type punning is entirely valid in C, but UB in C++ (one of the surprisingly many subtle but still fundamental differences between C and C++). There's specifically a (somewhat obscure) footnote about this in the C standard, which has also been clarified further in one of the recent C standards.
EDIT: This comment is wrong, see fsmv’s comment below. Leaving for posterity because I’m no coward!
- - -
Undefined behavior only means that the spec leaves a particular situation undefined and that the compiler implementor can do whatever they want. Every compiler defines undefined behavior, whether it’s documented (or easy to qualify, or deterministic) or not.
It is in poor taste that gcc has had widely used, documented behaviors that are changing, especially in a point release.
When you have a big system many people rely on you generally try to look for ways to keep their code working - not look for the changes you’re contractually allowed to make.
GCC probably has a better justification than “we are allowed to”.
> GCC probably has a better justification than “we are allowed to”.
Maybe, but I've seen GCC people justify such changes with little more than "it's UB, we can change it, end of story", so I wouldn't assume it.
Undefined in the standard doesn't mean undefined in GCC. Type-punning through unions has always been a special case that GCC has taken care with beyond the standard.
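For anyone following along, this is the kind of union punning being argued about, next to the memcpy version that nobody disputes. A sketch only: the function names are invented, and it assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Union punning: write one member, read another. Commenters above
   describe this as valid in C and UB in C++, with GCC documenting it
   as supported. */
uint32_t float_bits_union(float f) {
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* memcpy punning: copies the object representation and is fine in
   both C and C++. Compilers typically turn it into a plain move. */
uint32_t float_bits_memcpy(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both return 0x3F800000 for 1.0f on an IEEE 754 target. Note that neither version depends on how {0} initializes a union, since the member is written explicitly before the read.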
Really excited about #embed support:
> C: #embed preprocessing directive support.
> C++: P1967R14, #embed (PR119065)
See also:
https://news.ycombinator.com/item?id=32201951 - Embed is in C23 (2022-07-23)
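A tiny sketch of what #embed looks like in C, for those who haven't seen it. To stay self-contained it embeds its own source file via `__FILE__`; a real use would name an asset file instead. `HAVE_EMBED`, `self_source` and `embedded_size` are names invented here, and the fallback branch only exists so the snippet still builds on compilers without #embed.

```c
#include <stddef.h>

/* Probe for #embed with the C23 __has_embed preprocessor operator.
   The nested #if keeps older preprocessors from choking on it. */
#if defined(__has_embed)
#  if __has_embed(__FILE__)
#    define HAVE_EMBED 1
#  endif
#endif

#ifdef HAVE_EMBED
/* Expands to a comma-separated list of the file's byte values. */
static const unsigned char self_source[] = {
#embed __FILE__
};
#else
/* Fallback for compilers without #embed support. */
static const unsigned char self_source[] = "#embed not supported here";
#endif

/* Number of bytes that ended up in the array. */
size_t embedded_size(void) {
    return sizeof self_source;
}
```

With a supporting compiler the array holds the raw bytes of the file, with no xxd or build-script step needed.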
I'd really wish for an `std::embed<...>` that would be a consteval function (IIRC there is a proposal for this, but I don't know its status). The less pre-processor stuff going on the less there is to worry about, the syntax would end up much cleaner and you can create your own wrapper functions.
"C++ Modules have been greatly improved."
It would be nice to know what these great improvements actually are.
Later in the article, it mentions:
> Improved experimental support for C++23, including:
> std and std.compat modules (also supported for C++20).
From https://developers.redhat.com/articles/2025/04/24/new-c-feat...: The next major version of the GNU Compiler Collection (GCC), 15.1, is expected to be released in April or May 2025.
GCC 15 greatly improved the modules code. For instance, module std is now supported (even in C++20 mode).
In GCC 14, C++ modules were unusable (incomplete, full of bugs, no std modules, etc). I haven't tried 15 yet but if that changed, then it definitely qualifies for a "great improvement".
Still no std modules but otherwise likely usable. Modules are ready for early adopters to use and start writing the books on what you should do. (Not how to do it; those books are mostly written, though not in print. What you should do, as in: was import std a good idea, or should containers and algorithms have been split, or maybe something I haven't thought of.)
those were the greatest improvements of all time. all of them. :D
Interesting to see some improvements being done to Modula-2 frontend as well.
Finally, musttail, can't wait to try that out.
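A minimal sketch of what trying it out could look like, assuming the GNU attribute spelling on a return statement. `MUSTTAIL` and `sum_to` are names made up for this example; the macro expands to nothing on compilers that don't report the attribute, so the code still builds there (just without the guarantee).

```c
/* Use musttail only where the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* unsupported: falls back to a plain return */
#endif

/* Accumulator-style sum of 1..n. With musttail the recursive call
   must be compiled as a jump, so the stack does not grow with n;
   if the compiler cannot do that, it reports an error instead of
   silently emitting a normal call. */
unsigned long sum_to(unsigned long n, unsigned long acc) {
    if (n == 0) {
        return acc;
    }
    MUSTTAIL return sum_to(n - 1, acc + n);
}
```

Calling `sum_to(100, 0)` gives 5050. The interesting part is that large n should be safe even without optimization when the attribute is honored.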
Any hope for HaikuOS + WinLibs? GDC would be greatly appreciated.