codersnotes

Ill-Advised C++ Rant, Part 2

March 6th, 2016

I don't understand the C++ committee. Every few years they manage to pull together enough collective competence to propose some new features, and yet we end up with things like a network I/O library, a file system wrapper, a 2D vector graphics library, or more usually things so complicated that I can't even understand what they do.

Meanwhile things that would be actually useful - things to fix glaring giant holes in the language - just go ignored. C was invented around 1972-ish: that's literally 44 years ago, and yet we're still having to put up with inventing our own workarounds to fix the language problems.

Here's a list of some great unsolved problems in computer science that I don't think we'll ever see addressed within our lifetimes:


Calculating the size of a static array.

int myarray[] = { 4, 5, 6, 7, 8, 9 };
int x = countof(myarray); // returns 6

This is such a simple thing to add. I have no idea why they won't put it in the standard. I mean I can write my own version, sure, but when you find yourself writing exactly the same boilerplate every time you start a new project, surely that's a sign that the language should be doing that stuff for you?

Here's an example implementation for you. If you're using Visual C++, there's a built-in version (_countof). They could literally cut-n-paste this one line into <stddef.h> today.

#define countof(X) (sizeof(X) / sizeof((X)[0]))
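
One catch with the macro: hand it a pointer instead of an array and it compiles fine and returns garbage. Here's a minimal sketch of a stricter C++ flavour, assuming you're OK with a template. The name countof is just reused from above; it isn't a standard function.

```cpp
#include <cstddef>

// Only binds to real arrays: passing a pointer fails to compile
// instead of silently dividing sizeof(pointer) by sizeof(element).
template <typename T, std::size_t N>
constexpr std::size_t countof(T (&)[N]) noexcept {
    return N;
}

int myarray[] = { 4, 5, 6, 7, 8, 9 };
static_assert(countof(myarray) == 6, "six elements");
```

Which is, of course, yet more boilerplate to paste into every new project.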

Getting the maximum value of an enum

You often need to know the number of elements in an enumeration. You can do it yourself by putting an extra element (e.g. COLOR_MAX) at the end, but then compilers warn when you use a switch statement without checking for that case. Of course, that case can never come up, but now you have to check for it anyway. Wouldn't it be nice if the compiler just added an extra value? Here's an example syntax:

enum Color { red, green, blue };
for (int n=0;n<Color.count;n++) { /*...*/ }
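
For comparison, the sentinel workaround looks roughly like this. COLOR_MAX, describe() and count_colors() are names made up for this sketch; the thing to notice is the extra case that exists only to keep the compiler quiet.

```cpp
enum Color { red, green, blue, COLOR_MAX };  // COLOR_MAX is the hand-added sentinel

const char *describe(Color c) {
    switch (c) {
        case red:   return "warm";
        case green: return "natural";
        case blue:  return "cool";
        case COLOR_MAX: break;  // never a real color, listed only to silence the switch warning
    }
    return "invalid";
}

int count_colors() {
    int n = 0;
    for (int i = 0; i < COLOR_MAX; i++) n++;  // visits red, green, blue
    return n;
}
```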

Converting an enum to a string

enum Color { red, green, blue };
Color color = red;
const char *s = stringof(color); // returns "red"

Would that really be so hard? If you want this today, you either have to use X-Macros to create a horrific macro mess, or write your own little Perl script to preprocess the source code for you.
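
In case you've been spared X-Macros so far, here's a small sketch of the mess in question. COLOR_LIST, AS_ENUMERATOR and AS_STRING are hypothetical names, and stringof() here is only a hand-rolled stand-in for the built-in I'm asking for.

```cpp
// The single list of enumerators; everything else is generated from it.
#define COLOR_LIST(X) X(red) X(green) X(blue)

#define AS_ENUMERATOR(name) name,
#define AS_STRING(name)     #name,

enum Color { COLOR_LIST(AS_ENUMERATOR) };

// Hypothetical stand-in for the stringof() proposed above.
// Assumes c is one of the listed enumerators; no bounds check.
inline const char *stringof(Color c) {
    static const char *const names[] = { COLOR_LIST(AS_STRING) };
    return names[c];
}
```

It works, but every enum that wants names now has to be declared through a macro list instead of as a plain enum.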

Converting a string to an enum

const char *str = "red";
Color color = parse_enum(Color, str); // returns 'red', or Color.count on error

If you have the stringof() proposed above, you can now write this macro yourself. However if the compiler were to make a built-in version it could do a much better job at code-generation (e.g. building a small hash table instead of checking each string in order).
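
The do-it-yourself version ends up looking something like this sketch. Color_count stands in for the Color.count syntax from earlier, and color_names and parse_color() are made-up names. It checks each string in order, which is exactly the code generation a built-in could beat.

```cpp
#include <cstring>

enum Color { red, green, blue, Color_count };  // Color_count stands in for Color.count

static const char *const color_names[Color_count] = { "red", "green", "blue" };

// Linear search: compares against each name in turn.
Color parse_color(const char *str) {
    for (int n = 0; n < Color_count; n++) {
        if (std::strcmp(str, color_names[n]) == 0)
            return static_cast<Color>(n);
    }
    return Color_count;  // error value, mirroring "Color.count on error"
}
```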

#pragma once

Why is #pragma once not standardized despite literally every compiler supporting it? Or we could even get really crazy and augment #include with something more intelligent, but let's not get carried away here.

C99 designated initializers

It's only been 17 years since C got designated initializers, maybe they could be added to C++ too now? I understand it may be too soon, let's wait until 2033; don't want to rush into things now.

struct Enemy { float health; Color team; };
Enemy guard = {
    .health = 100,
    .team = red,
};

Binary file includes

#incbin "mypicture.jpg" mydata
Image *image = load_jpeg(&mydata_bytes, mydata_size);

Why can't I embed data right into my program? Devpac could do this 24 years ago for Christ's sake. I suppose I'll just write my own script to convert the binary to C, and then include that, and then fight the IDE/build system to add an extra step. Of course, that means anyone wanting to use my library now has to use my build system too.
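
The conversion step itself is trivial, which is sort of the point. Here's a minimal sketch of its core, assuming the bytes are already loaded and the input isn't empty. to_c_array() is a hypothetical helper; the _bytes and _size suffixes follow the example above.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Turns raw bytes into C source text declaring <name>_bytes and <name>_size.
// Assumes data is non-empty (an empty initializer list would not compile).
std::string to_c_array(const std::vector<unsigned char> &data, const std::string &name) {
    std::string out = "const unsigned char " + name + "_bytes[] = {";
    char buf[8];
    for (std::size_t i = 0; i < data.size(); i++) {
        // Each byte becomes "0xNN," (a trailing comma is legal in an initializer).
        std::snprintf(buf, sizeof buf, "0x%02x,", static_cast<unsigned>(data[i]));
        out += buf;
    }
    out += "};\n";
    out += "const unsigned long " + name + "_size = sizeof(" + name + "_bytes);\n";
    return out;
}
```

Reading the file, writing the output, and wiring it into the build are left as the tedious part.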

FourCC codes

FourCC codes are still not in the standard despite every compiler supporting them, and despite them being in widespread use for the past 30 years.

uint32_t tag = 'IHDR';
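
As things stand, the value of a multi-character constant like that is implementation-defined, so if you want a guaranteed result you end up writing something like the sketch below. fourcc() is a made-up helper, and putting the first character in the most significant byte is this sketch's own choice. The static_assert assumes an ASCII execution character set.

```cpp
#include <cstdint>

// Packs four characters into a 32-bit tag, first character in the top byte.
constexpr std::uint32_t fourcc(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8)  |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// In ASCII: I = 0x49, H = 0x48, D = 0x44, R = 0x52.
static_assert(fourcc('I', 'H', 'D', 'R') == 0x49484452u, "packed big-end first");
```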

do/while(0)

44 years later and we're still having to wrap all our macros in do/while(0) blocks just to make the things actually work. I mean would it kill them to introduce a better syntax? Maybe a #macro directive that treated its contents as a statement automatically? You could even make it so you didn't have to wrap the arguments in extra brackets.

If you went really nuts, you could support multi-line macros.

#macro my_assert(X) {
    if (!X) { 
        error();
    }
}
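
For anyone who hasn't hit this yet, here's a small sketch of what we write today and why. MY_ASSERT, check() and the errors counter are invented for the example.

```cpp
static int errors = 0;
inline void error() { errors++; }

// Wrapped in do/while(0) so the expansion is a single statement,
// and X is parenthesized so an argument like "a == b" is negated as a whole.
#define MY_ASSERT(X) do { if (!(X)) { error(); } } while (0)

int check(int value) {
    if (value > 0)
        MY_ASSERT(value < 10);  // the trailing ';' completes the do/while
    else
        return -1;              // with a bare { } block in the macro, this else would not parse
    return errors;
}
```

Drop the do/while(0) and the semicolon after the macro call ends the if statement early, leaving the else with nothing to attach to.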

A preprocessor that vaguely resembles a sane programming language

A macro is just a tiny little program that runs at compile time. The system that C/C++ implements is technically a programming language, and may even be Turing-complete for all I know, but it's the most bass-ackward language I've ever seen.

Lisp has had a decent macro system since 1963 (53 years). It's not even that hard to do something better than what we already have; see this example of a Lua preprocessor that uses Lua as its preprocessing language, in only 21 lines of code.

There's a million other preprocessor improvements I could list, like getting the count of the __VA_ARGS__ array, but honestly a new C++ preprocessor is a whole article on its own.
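
To give you a taste, this is roughly how people count __VA_ARGS__ today. It's a sketch assuming a standard-conforming preprocessor. PP_ARG_N and PP_COUNT are hypothetical names, and it only copes with 1 to 8 arguments.

```cpp
// Argument-shifting trick: the real arguments push the descending number
// list to the right, so whatever lands in slot N is the argument count.
// Handles 1 to 8 arguments; zero arguments is not handled in this sketch.
#define PP_ARG_N(p1, p2, p3, p4, p5, p6, p7, p8, N, ...) N
#define PP_COUNT(...) PP_ARG_N(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

static_assert(PP_COUNT(a) == 1, "one argument");
static_assert(PP_COUNT(a, b, c) == 3, "three arguments");
```

Want nine arguments? Go edit both macros by hand.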

Why can't I iterate the members of a struct?

struct Player { float x, y, z; int health; };
member_t *m = &Player.members;
for (size_t n=0;n<countof(Player.members);n++) { /*...*/ }

It doesn't add any overhead, the compiler already has all that information right there, it just needs to save out a damn table when asked to.

Or I guess I'll just go write my own half-assed reflection system, gee I've only done that a million times before.
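
Said half-assed reflection system usually starts out like this sketch. The member_t layout, the MEMBER macro and the Player_members table are all assumptions of mine, not anything the language gives you.

```cpp
#include <cstddef>

struct Player { float x, y, z; int health; };

// Hypothetical member_t: the per-field record the compiler could emit for us.
struct member_t {
    const char *name;
    std::size_t offset;
    std::size_t size;
};

#define MEMBER(T, field) { #field, offsetof(T, field), sizeof(T::field) }

// Must be kept in sync with the struct by hand.
static const member_t Player_members[] = {
    MEMBER(Player, x),
    MEMBER(Player, y),
    MEMBER(Player, z),
    MEMBER(Player, health),
};
static const std::size_t Player_member_count =
    sizeof(Player_members) / sizeof(Player_members[0]);
```

Add a field to Player and forget the table, and nothing will tell you.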

Switch/case is still stupid

switch/case still doesn't break by default. Now obviously you can't change it because of compatibility with existing programs, but could you not add a new statement with better behaviour? (match? select? I dunno)

void parse(char x) {
    match(x) {
        default:                   error();
        case 'a' .. 'z' || '_':    letter(x);
        case '0' .. '9':           digit(x);
    }
}

Notice that .. range operator I used? gcc already has case ranges as an extension (spelled ... there), and it's still not in the standard despite being obviously fantastic.

Why can't I switch on pointers? Or strings? If it has an operator==, surely I should be able to switch on it? Anything that can be compared for equality can be reduced to a chain of if-else statements, and yet I have to do that manually - the compiler won't do it itself.
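
Here's the manual reduction in question, as a sketch. The Token enum, classify() and classify_word() are invented for illustration, and the range checks assume a character set where letters and digits are contiguous (ASCII).

```cpp
#include <cstring>

enum Token { tok_letter, tok_digit, tok_error };

// What match(x) from above has to become by hand: ranges spelled
// out as comparisons, with the default case moved to the end.
Token classify(char x) {
    if ((x >= 'a' && x <= 'z') || x == '_') return tok_letter;
    else if (x >= '0' && x <= '9')          return tok_digit;
    else                                    return tok_error;
}

// Same idea for strings, which switch cannot take at all.
Token classify_word(const char *s) {
    if (std::strcmp(s, "letter") == 0)     return tok_letter;
    else if (std::strcmp(s, "digit") == 0) return tok_digit;
    else                                   return tok_error;
}
```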

Still no strong typedefs

typedef strong int Handle;

Handle x = 4; // "error: cannot implicitly cast '4' to 'Handle'"

Please? I mean it's only been 33 years since Ada had them, so I understand there may not have been enough time to implement this yet. Language committees are busy people.
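
Until then, the nearest thing is wrapping the int in a struct yourself. A minimal sketch follows; this Handle is a hand-written stand-in, and you get to write every operator you need (only == here) on your own.

```cpp
#include <type_traits>

// Hypothetical stand-in for "typedef strong int Handle".
struct Handle {
    explicit Handle(int v) : value(v) {}  // explicit blocks "Handle x = 4;"
    int value;
};

inline bool operator==(Handle a, Handle b) { return a.value == b.value; }

// The conversion from int has to be spelled out at every call site.
inline Handle make_handle() { return Handle(4); }

// An int no longer converts to a Handle implicitly.
static_assert(!std::is_convertible<int, Handle>::value, "no implicit int -> Handle");
```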

Types as first-class primitives

struct Player { float x, y, z; };
type_info *t = typeid(Player);
printf("%s\n", t->name()); // "Player"
Player *p = (Player *)t->new(); // allocate a new Player
for (size_t n=0;n<t->members_count;n++) { /*...*/ }

C++ claims to already have run-time type information, via the typeid keyword. Unfortunately, while this does indeed return type information, they forgot to actually put anything useful in it. Here's the contents of the returned type_info according to MSDN:

class type_info {
public:
    virtual ~type_info();
    size_t hash_code() const;
    _CRTIMP_PURE bool operator==(const type_info& rhs) const;
    _CRTIMP_PURE bool operator!=(const type_info& rhs) const;
    _CRTIMP_PURE int before(const type_info& rhs) const;
    _CRTIMP_PURE const char* name() const;
    _CRTIMP_PURE const char* raw_name() const;
};

Wait, that's it? I can get its name, and compare the type_info. But that's it. No way to create a new object given a type, no way to iterate the list of members, no way to get the size of a type given its type_info, nothing.

RTTI in C++, right now, is a useless feature that most people just disable on the command-line. It's like they implemented it enough to satisfy one particular feature they wanted, and then just got bored after that.

What if I want to create an instance of a type given a string containing its name? Nope sorry, guess you'll just have to write out a giant factory method again yourself. But don't expect the language to actually help you in any way.
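
For the record, the giant factory looks something like this sketch, just with a lot more entries. Bullet, create_fn, registry() and create_by_name() are all names I've made up here.

```cpp
#include <map>
#include <string>

struct Player { float x, y, z; };
struct Bullet { float x, y; };

typedef void *(*create_fn)();

// One hand-written creator per type...
static void *create_player() { return new Player(); }
static void *create_bullet() { return new Bullet(); }

// ...and a hand-maintained table from type name to creator.
static const std::map<std::string, create_fn> &registry() {
    static const std::map<std::string, create_fn> table = {
        { "Player", &create_player },
        { "Bullet", &create_bullet },
    };
    return table;
}

// Returns a new object of the named type, or nullptr for an unknown name.
// The caller owns the result and has to know what to cast it back to.
void *create_by_name(const std::string &name) {
    auto it = registry().find(name);
    return it == registry().end() ? nullptr : it->second();
}
```

Every new type means another creator function and another table entry, typed in by hand.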

Wikipedia also has this to say:

Some aspects of the returned object are implementation-defined, such as std::type_info::name(), and cannot be relied on across compilers to be consistent.

Which is nice.


Epilogue

There will be some people who, upon reading this, will just say "Oh, well you can just do [whatever hack] in C++ which kinda does that right now". I don't care. I know how to work around these things, that's not the point. I want the damn language to do it for me. I want the people designing the language to pull their fingers out of their asses and do their god damn jobs.

Written by Richard Mitton, software engineer and travelling wizard.

Follow me on twitter: http://twitter.com/grumpygiant