My C++ Wishlist

Here's a list of features I think I'd like to have in C++. Some of them are pretty half-baked, but blue-sky ideas can be fun even if they aren't always technically useful.

Customized error messages

One of the really cool things about templates in C++ is that you can make them "break the build" when the program uses them the wrong way.

In libpqxx, for example, I have template functions for converting various types to and from a domain-specific string representation. Those templates are specialized for many built-in types, but not for char. That was a deliberate choice: a char may represent a character, but it could also be a small integer--and whether plain char is signed or unsigned even differs between platforms. So I'd rather have the build fail and force the programmer to make the intended meaning explicit.

But when the build fails with a link error about the missing specialization, the programmer probably thinks it's a bug in the library. The message that the library doesn't want this code to compile doesn't come across.

I work around this by providing a specialization that invokes a function error_ambiguous_string_conversion. I declare that function but leave out any definition. A program that tries to convert a char to a string representation will now fail to link, and the build error will at least contain this helpful function name. Hopefully the programmer will try to figure out why the build failed, and end up at the declaration for error_ambiguous_string_conversion where the problem is documented.
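The trick looks roughly like this. This is a simplified sketch, not the actual libpqxx code; my_to_string and the details around it are invented here for illustration:

```cpp
#include <sstream>
#include <string>

// Deliberately declared but never defined.  If this name turns up in a link
// error, the explanation lives right here: converting a char to a string is
// ambiguous--wrap it in a std::string if you mean a character, or convert to
// int if you mean a small integer.
void error_ambiguous_string_conversion(char);

template<typename T> std::string my_to_string(const T &t)
{
  std::ostringstream s;
  s << t;
  return s.str();
}

// Specialization for char: any program that actually uses this will fail to
// link, with error_ambiguous_string_conversion named in the error message.
template<> inline std::string my_to_string(const char &c)
{
  error_ambiguous_string_conversion(c);
  return std::string();
}
```

my_to_string(5) compiles and links fine; my_to_string('5') compiles, but linking fails with error_ambiguous_string_conversion in the message, pointing the reader at the documented declaration.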

But wouldn't it be nice if I could customize the error message completely? Imagine I could just say:

template<> std::string to_string(const char &)
[
  "String conversion for type char is ambiguous.  If you really mean a "
  "character, create a std::string directly.  If you mean a small integer, "
  "convert to int or another unambiguous integral type."
];

That would mean that this function is declared but has no definition. If the program tries to invoke it, the compiler will emit my custom error message to the user.

This would also be helpful for nocopy base classes: current practice for classes that shouldn't be copied is to declare the copy constructor and copy assignment operator but not define them. A custom error message would make that trick a bit more user-friendly.
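The current practice mentioned above looks like this (essentially what boost::noncopyable does; the class names here are my own):

```cpp
class noncopyable
{
protected:
  noncopyable() {}
  ~noncopyable() {}
private:
  // Declared but never defined: any attempt to copy a derived object fails
  // at compile time (or link time, from within the class)--with a rather
  // unhelpful error message.
  noncopyable(const noncopyable &);
  noncopyable &operator=(const noncopyable &);
};

// A class that should never be copied simply inherits privately:
class database_session : private noncopyable
{
};
```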

Optimization Goals

For ages now we've been optimizing CPU performance. Higher clock speeds, more aggressive optimizations, faster memory to keep the CPU fed. It all matters, because practically all of the CPU's time is spent either waiting for memory or cranking through tight loops. The benchmarks we use to guide all this work say so. That's one big reason why we still use so many statically compiled languages like C++: you get to spend a lot more time finding ways to make those loops run just a little bit faster. We need multimedia extensions so we can speed up those crucial loops in multimedia applications: it's low-hanging fruit for an important chunk of perceived performance in everyday computing tasks.

Then there are those who say that computers are "fast enough" now, or at least for most users. Most of the computer's time is spent waiting for something it can do for us. We don't need our word processors or web browsers to be any faster. What matters now is to produce lower-power processors with more functionality, so computers can do ever more and ever smarter work for us. Some say we've even run out of ideas to put the computing power we have to effective use. Processors have been getting faster at an astounding exponential rate, and although memory hasn't exactly been keeping up, it's gotten a lot faster too. This has kept system speeds skyrocketing faster than we know how to waste it on slower software.

Bollocks!

Both positions are full of good, technically correct points. But I think they misrepresent what's been happening: the CPU cranking through loops is no longer the major time waster in most computer use. The computer isn't fast enough, and never will be. But ever-faster processors and ever-more efficient loops have hit a point of diminishing returns. It's a simple application of Amdahl's Law: if n% of runtime is spent on a particular part of the job, then the best you can hope for from optimizing that part is an n% reduction in total runtime. And that's if you manage to get the task done infinitely fast; if you speed it up by a factor of x, you only shed n%-(n%/x) of overall runtime, while getting to a higher x typically requires a disproportionately large amount of effort.

The moral is obvious: don't waste large efforts getting a large x for a low n. Double the speed of a 10% part of your workload, and what have you gained? An overall 5% improvement. Which is great, but not as much as you could have gotten from a mere 20% speedup on a task that took 50% of your time.

So where are our systems wasting their time (and ours)? Don't believe the benchmarks. Those tend to measure the straightforward, classic computing tasks that used to take so long, way back when. In fact, the trend I'm talking about has already reached the point where it's getting hard to find good benchmarks for that kind of thing. The first big problem in composing "practical" benchmark tests today is finding realistic, straight-through processing tasks that take long enough to run! The second big problem is eliminating the huge, highly variable "warmup" cost of running a benchmark--a cost that would otherwise grossly distort the measured runtime of the program. We need consistent, reproducible results that are subject to all kinds of rational analysis.

Waaaaaiit a minute. Ho! Step back. What's happening here? We're doing our best to eliminate this large, highly variable cost from our benchmarks. But it's getting harder and harder to make the straight-through processing take up a significant part of the total. Now that is an institutional blind spot. We're focusing on a low n so we can fight over the best x!

Back to the high-n part of the job, therefore. There are some interesting facts about that part of the workload, quite widely known in various circles.

Below are some things we could conceivably do about the real performance problem...

Lazy vtables

For every class that has virtual member functions (including virtual destructors--which, strictly speaking, are special member functions), a typical C++ compiler will emit a vtable. This is a list of pointers to the various virtual functions in that class. Every object of the class will have a pointer to its vtable (the vptr), and each class in the hierarchy may have its vtable point to different implementations of the same virtual function.

All of these functions need to be looked up by the system linker/loader when the code is loaded in memory for execution, so the vtable can be filled with the addresses where the various virtual function implementations happen to have been loaded into memory. Theoretically speaking this is only needed if the class, nay even the individual function is actually used--but the conventional model of separate compilation and linkage makes that impossible for the compiler to predict. Perhaps we could save a lot of time if we didn't need to do all these lookups; lots of them are never used on the average run of a typical program. But inserting run-time checks would be too costly: "I'm going to call this virtual function. Wait a minute, have I looked up its address yet? If not, do it now." That would defeat the entire purpose of the exercise.

Something we could do, however, is to replace the function pointers in the vtable with stubs. The pointers would not be directed to the virtual function implementations themselves on loading; instead, they would all point to a single stub function that does the following:

  1. Identify the function that the caller really intended to call.
  2. See whether the function's address has been looked up yet. If not:
    1. Look it up.
    2. Replace the pointer in the vtable with the newly found address.
  3. Jump to the originally intended function.
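The steps above can be sketched in portable C++, with an ordinary table of function pointers standing in for the vtable. All the names here are invented for the sketch:

```cpp
#include <cassert>

typedef void (*fn_ptr)();

int calls_to_real_function = 0;
void real_function() { ++calls_to_real_function; }

// Our stand-in "vtable": one slot, initially pointing at the stub.
fn_ptr vtable[1];

// Stand-in for the loader's (expensive) symbol lookup.
fn_ptr look_up_symbol(int slot)
{
  assert(slot == 0);
  return real_function;
}

// The stub: identify the intended slot (step 1), resolve and patch it
// (step 2), then jump to the function the caller really wanted (step 3).
void stub_for_slot0()
{
  fn_ptr resolved = look_up_symbol(0);
  vtable[0] = resolved;  // later calls bypass the stub entirely
  resolved();
}

// What "loading" would do: fill the table with stubs, not real addresses.
void load_time_init() { vtable[0] = stub_for_slot0; }
```

Only the first call through the slot pays for the lookup; every call after that is an ordinary indirect call.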

After the program has been running for a while, all functions that are actually used would have been looked up and run at full speed, just as in the conventional implementation.

This may look unnecessarily complex. There is a technical reason for that: the program may be multi-threaded, and doing things this way avoids the need for a lock to synchronize multiple threads that may try to make that crucial first call at the same time. Writing the new address to the vtable should be atomic on most architectures, without the need for any synchronization; the compiler implementation would know about this anyway.

This trick could greatly reduce the number of symbols that need to be looked up at program load time, probably reducing the time needed to get some first useful responses from it, at a hopefully small extra cost in initial performance.

There has to be a catch, of course, and there is.

What if the program does not call the virtual function, but tries to take its address instead? You could just let it read the stub, and there are things you could do to make that "patch itself out of the code" to some extent when a call is made through the pointer. But who ever wants to take the address of a virtual function in order to call it? That's exactly what virtual functions are for in C++: to take the "pointer dancing" for function calls with runtime binding out of the programmer's hands--look through the Linux kernel source for some examples of this ugly stuff. No, a much more likely reason why someone could want to take the function's address is to obtain some runtime type information. Compare the function pointer with another. And since that does not involve a call, using stubs here becomes really difficult.

Assuming that such cases are rare, however, it may be well worth the cost to insert a runtime check for this usage. Just check that every virtual function whose address is taken has had its address looked up, then proceed as with a conventional vtable. If that case is rare enough, then as per Amdahl's Law (a low n and an x below 1 for taking function addresses, but a higher n and a good x for loading effort) this could be a big win.

This is not a new idea. I used it in the 1990s, and I'm sure others came up with it before me. A particularly nice property of the "atomic update" trick is that it is also idempotent: even on a CPU that keeps its instruction and data caches separate, there is no need to flush caches to make the change stick.

Daemonized Libraries

Given the cost of creating processes and loading libraries, wouldn't it be great if we could somehow reduce the number of libraries that need to be loaded in the first place? What follows is probably a very silly idea in practice, but let's take a look anyway.

If we had a proper, extremely lightweight inter-process communication mechanism, which most Unix-like systems unfortunately don't seem to have, we could perhaps divest a lot of "library" work to multithreaded daemon processes. A single daemon could serve lots of client processes, loading multiple libraries into its own address space as needed. The clients would pass numeric identifiers for the library functions they wanted to call, marshal the arguments and send them over to the daemon (just like conventional RPC, CORBA, SOAP etc. requests), and wait for the result.

This means slower library calls, of course, but in return for a potentially vast reduction in the amount of loader work to be done, as well as in memory usage. And where shared libraries call other shared libraries, well, those calls would be just as efficient as they were. More efficient perhaps, if run-time optimizations are performed. Modern developments in virtual machines and emulation have seen some interesting work in this field. Plus, some low-level parallelism would become very easy to attain: "call this function, but continue processing while the daemon is running. I'll check on the result later."

Running daemons is often difficult, of course, and involves lots of system interaction. Attempts like ActiveX have drawn a lot of fire in the past. What about bugs and memory pollution? What if lots of client processes end up waiting to be served by that daemon? The difference, I think, with what we already have is that existing implementations either pretend to be objects or manage singleton resources.

A "daemonized library" would be more like a stateless worker process. There can be any number of instances; a supervisor in the vein of the Apache webserver (which also dynamically manages a variable number of stateless daemons, if you come to think about it) can spawn daemon processes on demand; stop them from time to time just in case they should get polluted by internal bugs; adjust their numbers to suit the number of concurrent threads supported by the hardware, and so on. With careful design, even runtime library upgrades may be possible, without stopping any running applications to reload them with the new libraries.

So what about memory protection? It would definitely blur the boundaries of memory protection, which is now neatly confined (mostly) to individual processes. Well, every system user would still have separate daemon processes of course. You could also have other context boundaries. In the extreme case, one could spawn a new daemon process for every new child process created from, say, the same shell process, and reuse it in that same process tree just like a regular library.

Would that be costly? Probably. But at least one could pre-spawn them daemons, so they'd be ready for use the moment they're needed--rather than having to be loaded after the application that the user is expecting a response from. That is not something we can do with conventional libraries for many reasons, and it may help "perceived" performance. "Perceived" may sound like the user is a fool who doesn't understand what performance really is. The main point of this chapter is that maybe we techies need to put some effort into learning to see things his way!

Parallelism

Everyone "knows," and has "known" for some time now, that parallelism is going to be the next major step up in performance.

As far as I can see, it hasn't happened. There have been many nice academic ideas that weren't quite ready for practical use, or only worked in environments that weren't (such as overly restrictive languages). Then there are the various threading implementations for the major programming environments, which have more or less standardized but remain messy. And there's the fact that C and C++ don't exactly help make it easy. Java has good ideas on how to do conventional multithreading, but it's still just that--conventional multithreading.

In my humble opinion, we've been looking the wrong way. For an example of how to approach these things, look at const.

I love const. It cuts both ways: it expresses an important semantic property in the source code (good for the programmer), but it also provides a powerful piece of information to the optimizer.

A viable approach to parallelism should be like that: first and foremost a way of specifying correct software, so people have reason to adopt it in their daily programming; and then we can look at how to use it for optimizing the resulting code. That requires a leap of faith: start using it before we know it's going to be designed "just right" from the optimizer's point of view. The same happened with const: what optimizers really wanted was something more like restrict (a C99 feature), and practical experience with const was the only way to make clear just what was needed. The ensuing difficulty in designing a stronger successor indicates to me that const was a stroke of genius. I can only hope that future languages will allow the programmer to define similar attributes for their own classes.

So, I say, design useful correctness semantics for parallelism first, and trust Heaven that it will provide optimization opportunities. Even if it doesn't, a construct that provides this much will have gained us something.

Really easy to say, right? We need a construct that enhances correctness before all else. How can you take a problem that puzzles many, and introduces uncountable (and more often than not, un-debuggable) bugs into the world, and somehow make it enhance correctness?

I think there are things we can do; see below. But first and foremost, we need to try. It's hard, but it's worthwhile. The promised Era of Parallelism will not come all by itself. We can start by thinking about dependency relationships (and the lack of same) between tasks.

Parallelism for Correctness

In case you thought this was all theoretical, here's a list of real-world examples that could be vastly improved with a good model for dependencies and independent tasks. Some of them are largely about error handling, which is a particularly important (and underrated) area of programming.

What these examples have in common is the spilled-coffee effect: you set the computer to work on some long-running task and go off to get a cup of coffee in the meantime. By the time you get back, you expect it to be more or less finished.

Instead, you find that almost none of the work has been done. For most of the time you've been away, the program has been sitting there displaying some equivalent of "Press Any Key to Continue." This is of course the point where you may spill your coffee; even if you don't, the coffee break has, in a sense, been wasted.

Like I said, error handling is particularly important. Few programmers spend much time thinking about it. There are good reasons for this. It's against human nature to focus on what might not work; programmers know their own programs intimately enough to understand most of the errors they run into, and subconsciously avoid them in the first place; and frankly, it's usually not very clear what to do with errors anyway.

So programmers tend to ignore errors. Users, as a result, get bombarded with unclear, unspecific, dumb, or superfluous error messages and tend to ignore them also. A good model for expressing interdependence of tasks can help here: if we stop treating exceptions like hot potatoes that must be passed on to the user immediately, we can aggregate them a bit. Present them in more sensible ways. Recognize what's going on before passing the buck to whoever's sitting outside the computer. "The following error occurred for a bunch of different files." I've seen some programs do that, but not many, and never with multi-selectable lists to make it easy to manipulate the program's reactions to each. Let's find better ways of handling errors, outside of the regular sequential code path!

In the following sections I'll show a very, very simple construct that could help solve all these problems--given one very simple extension to the language.

Cloning Exceptions

We need a way to be able to create copies of exception objects, without full static type information. Why is this under "parallellism?" Because it allows us to write frameworks that perform multiple tasks independently of one another, and gather up their successes and failures for later processing. Just catch any exception coming out of each individual task, add it to a container of results, and proceed to the next independent task.

This can really help. See my pipeline class in libpqxx, the C++ client API to PostgreSQL. This is the kind of "execution management" class that we'll need in order to support more liberal execution models.

(In case you didn't know it, PostgreSQL or "Postgres" for short is a fully enterprise-ready, open-source database management system. I vastly prefer it over MySQL for standards compliance and even, to my own surprise, ease of use. Its licensing is also more liberal than MySQL's, although I must confess to being a GPL man myself.)

Primitive Example: pipeline

The pipeline class is conceptually simple: instead of executing queries in synchronous fashion ("execute this query and give me the result"), you create a pipeline object. You feed your queries into the "front" end, and retrieve results from the "back" end. They are executed in strict insertion order, but apart from that, sequencing is controlled by the pipeline object. This allows for several optimizations:

  1. Latency hiding: your program can go and do something useful while the query is executing. No need to waste the time, nor to risk a gruesome debugging effort to get your event handling right.
  2. Concatenation: PostgreSQL happens to allow queries to be concatenated and sent to the server collectively. If the server is far away on the network, and you have a lot of queries to perform that you can formulate completely before the first one completes, they can all be bundled and transmitted in bulk.
  3. Server-side Pipelining: some database management systems are implemented internally as pipelines. One stage may receive and parse a query, then pass it on to the next which optimizes it; the following stage may plan data access patterns for it, the next (or next few) execute it, and so on until the results are sent back to the client. If these pipeline stages are implemented as independent threads running on separate processors, for instance, then sending more consecutive queries to the server at once may allow these stages to work on the queries concurrently.

Of course, the pipeline's error handling gets a little complicated. If one of your queries fails, you want an exception when you retrieve its result--not before, when it is executed, and you've still got some perfectly good previous results waiting to be retrieved (literally "in the pipeline"). So the pipeline class retains any error information and doesn't convert it to exceptions until the right moment.
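The deferral can be modelled in a toy class. This is not the actual libpqxx interface; the names and the "failure" condition are invented here to show the principle of retaining errors until retrieval:

```cpp
#include <queue>
#include <stdexcept>
#include <string>

// Toy pipeline: "executes" queries immediately on insertion, but saves any
// failure and rethrows it only when the corresponding result is retrieved.
class toy_pipeline
{
  struct outcome { std::string result; std::string error; };
  std::queue<outcome> m_results;

public:
  void insert(const std::string &query)
  {
    outcome o;
    if (query.empty()) o.error = "empty query";   // our stand-in failure mode
    else o.result = "ok: " + query;
    m_results.push(o);
  }

  // Results come back in strict insertion order; a failed query throws here,
  // not at insert() time, so earlier good results are not disturbed.
  std::string retrieve()
  {
    outcome o = m_results.front();
    m_results.pop();
    if (!o.error.empty()) throw std::runtime_error(o.error);
    return o.result;
  }
};
```

A failing query in the middle of the batch doesn't prevent retrieving the results before it, or after it.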

Implementing Exception Cloning

The pipeline is lucky in this regard because it exists on the boundary between C (which doesn't have exceptions) and C++ (which does). In the general case, this is much harder to do. What if the query could just throw any C++ exception? About the best we could do, I guess, is to catch as many known exception types as possible, and for each of them, include code to create a copy on the free store (using new), and store that somewhere. We can't return a reference to the original exception, because its lifetime will end at the end of the catch block, and passing copies around directly pretty much restricts us to one basic exception type.

What we arrive at is not very scalable, nor does it lend itself to very effective reuse:

#include <new>
#include <stdexcept>
#include <string>

// Provided by the surrounding program:
void execute(const std::string &);
void post_query_work();

template<typename T> inline std::exception *clone(const T &t)
{
  return new T(t);
}

std::exception *perform_one(std::string query)
{
  std::exception *fail = 0;
  try
  {
    execute(query);
  }
  catch (const std::bad_alloc &e) { fail=clone(e); }
  catch (const std::length_error &e) { fail=clone(e); }
  catch (const std::domain_error &e) { fail=clone(e); }
  catch (const std::out_of_range &e) { fail=clone(e); }
  catch (const std::invalid_argument &e) { fail=clone(e); }
  // TODO: Add more exception types here!

  try { post_query_work(); } catch (...) { }

  return fail;
}

Obviously we need something better. Making run-time type information more accessible would be nice: then we could ask the compiler's run-time system to copy-construct the exception object for us, whatever its type. But that would require a fairly radical change in the language (which I hope to write more about in the future). Let's assume that we only have simple tools, and small changes in the language at our disposal.

As a "quick fix," why not give std::exception a new member function along the lines of:

namespace std
{
class exception
{
public:
  virtual auto_ptr<exception> clone() const
  {
    auto_ptr<exception> result(new exception(*this));
    return result;
  }

  // ...
};
}

The std::exception hierarchy already has a vtable, so this is no great loss in terms of performance. Now, every concrete class derived from std::exception would hopefully implement this function, always returning a copy of itself as a std::auto_ptr<std::exception> to avoid changing the return type--but dynamically it would always point to an exception of the right type. Programmers implementing exception classes of their own would be called upon to do the same.
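A derived class's override might look like this. It's a sketch with an invented class name, and it returns a plain pointer rather than the auto_ptr above, purely to keep the sketch self-contained; ownership passes to the caller either way:

```cpp
#include <stdexcept>
#include <string>

// Sketch: how a concrete exception class would implement the proposed
// clone().  The covariant return type still "is a" pointer to exception,
// so callers that only know about std::exception remain happy.
class broken_connection : public std::runtime_error
{
public:
  explicit broken_connection(const std::string &msg)
    : std::runtime_error(msg) {}

  virtual broken_connection *clone() const
  {
    return new broken_connection(*this);
  }
};
```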

So what about scalability--what if some exception class is not supported? The answer is not great, but better than what we had before. If some third-party exception class fails to implement this function, then some details for that exception class will be lost, and you'd get a copy-constructed object of some parent class of the real exception. This is known as slicing and happens a lot in current exception programming as well. At least you should still have the what() string. It's regrettable, but not a new problem.

Moreover, if this were part of the standard, more programmers would support it, and at some point one would simply expect it to be there. Remember how old C++ libraries used to grow into frameworks and define their own complete exception hierarchies, completely unrelated to anything in the Standard Library? That slowly went away as the custom exception classes were brought into line with the std::exception hierarchy--and exceptions without clone() should go the same way.

Using It

So let's say all relevant exception classes support clone(). What do we do with it? To answer that, here's a simple class that performs a set of actions that are not interdependent. At some future point, this could be optimized to distribute these tasks over multiple threads--but what it does for now is express that these tasks can be executed in any order, and we don't want to just cancel or hold everything whenever an exception comes out of one of them.

/** If you've got a job you'd like us to perform, wrap it in a class derived
 * from job, create an object of your class, and pass it to perform() (see
 * below).  A job is a functor.
 */
class job
{
  std::string m_name;
public:
  job(const std::string &jobname) : m_name(jobname) {}
  virtual ~job() {}
  const std::string &name() const throw () { return m_name; }

  /// Overridable: the action to be performed.  Feel free to throw exceptions.
  virtual void operator()() throw (std::exception) =0;
};

/** Performs jobs.  Pass it a sequence of pointers to jobs.  Any failed jobs are
 * logged to cerr, but only the first (if any) will throw an exception.
 */
template<typename ITER> void perform(ITER begin, ITER end)
{
  std::auto_ptr<std::exception> error;

  for ( ; begin != end; ++begin) try
  {
    (**begin)();
  }
  catch (const std::exception &e)
  {
    std::cerr << "Job " << (*begin)->name() << " failed: "
              << e.what() << std::endl;
    if (!error.get()) error = e.clone();
  }
  // Note: throwing *error copies through the static type std::exception,
  // which slices; rethrowing in the dynamic type would be the next wish.
  if (error.get()) throw *error;
}

Object Sync

In conventional multithreaded programming, you often obtain a lock, access an object related to the lock, release the lock, and continue. This exposes you to two risks that the language doesn't help with:

  1. When you think you're reading (parts of) the object from memory, you may actually be reading from processor registers because the compiler "cached" them for you during a previous access, or "prefetched" possibly incorrect values before you obtained the lock.
  2. After you've modified the object and released the lock, the compiler may still be keeping some of the values you've written in registers, either to write them back after you release the lock (creating a race condition) or not writing them back at all because it thinks it knows exactly how and when it will be read/written next.

Lots of C and C++ programs deal with this problem by ignoring it. This will usually work because the compiler isn't smart enough to optimize across the external lock/unlock function calls or inlined assembly instructions. But compilers get smarter (some already do whole-program analysis), and we normally want them to optimize as aggressively as possible. In fact, even with the lock/unlock functions, it can be good to have optimization "think across" the calls for all matters except the locked object.

We do have one construct to help us explain to a compiler that an object may be shared between different threads or execution contexts: volatile. But this vaguely defined modifier affects the entire lifetime of the object, so all code accessing the object is kept less efficient--just because we need to synchronize it at two points in the program!

What we need, I guess, is a targeted synchronization primitive. Imagine:

namespace std
{
template<typename T> void sync(volatile T &) throw ();
}

Now, doing a std::sync(foo) on an object foo would tell the compiler, "I don't want you to keep any part of foo in registers across this call, because another execution context may access it." It would also help express to anyone reading the code which is the, ahem, object of that lock. I don't think the volatile is strictly necessary (heck, it might even cause compilation problems) but I'm including it here mostly for artistic reasons.

Compilers could implement the template in various ways.
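For the GCC family, for instance, a minimal sketch is an empty asm statement with a "memory" clobber (namespace my invented here so as not to touch std). Note this is a compiler barrier only: it stops the compiler from caching the object in registers across the call, but it is not a hardware memory fence, which a real lock implementation would add:

```cpp
namespace my
{
// Sketch: the empty asm claims to clobber memory, forcing GCC to discard
// any register-cached copies of objects at this point and reload them on
// the next access.  A compiler barrier, not a CPU fence.
template<typename T> void sync(volatile T &)
{
  __asm__ __volatile__("" ::: "memory");
}
}
```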

Class Member Names

Some people like to prefix the names of class member variables with underscores (_foo), Hungarian-looking lower-case letters (pFoo), or other prefixes (i_Foo, m_foo). I'm one of them. I dislike Hungarian notation but like to know when code is referring to a member variable of its containing class. I find it really helps, both in maintaining my own code and in reading somebody else's code. Example code snippets can be shorter and clearer given such a convention, because they no longer need to state whether a particular variable is a class member.

Unfortunately, it's not going to work very well unless everyone can agree on a single convention--which they can't. None of them looks quite natural to me. It's often better to express things inside the language than using conventions on top of the language, so here's an idea on how to do that.

Dotted Names

The idea is to allow class member names to start with a dot:

struct bar
{
  int .foo;

  void func() { .foo++; }
};

That may not seem to make much sense, except there is one special rule: if the class member is explicitly qualified as such, the dot can (or must, really) be omitted. So it only shows up when referring to an unqualified name, but in that case it is also enforced by the compiler. This should work for member variables and member functions alike:

struct splat
{
  static int .foo;
  int .myfoo;

  static void .func()
  {
    .foo++;
    splat::foo++;
    foo++;	// ERROR: "foo" undeclared
  }

  void .anotherfunc()
  {
    .myfoo++;
    this->myfoo++;
    (*this).myfoo++;
    .func();
    splat::func();
    func();	// ERROR: "func()" undeclared
  }
};

There could be serious parser problems with this idea; but if it's possible then I think it would solve a lot of problems. It relies on standard language notation to indicate class membership, so no more convention conflicts. It's compiler-enforceable. And it's backwards-compatible: you can even rename a public class member using this notation without breaking code outside the class. Only code inside the class would have to be adapted!

In practice, I also run into naming conflicts elsewhere:

struct atype
{
  int property;

  // Change spelling of parameter to avoid name conflict
  explicit atype(int Property) : property(Property) { }
};

With the proposed "dotted names," that could become the slightly more elegant

struct atype
{
  int .property;

  // Struct member name starts with dot, parameter name doesn't and can't
  explicit atype(int property) : .property(property) { }
};

A problem that remains, of course, is that property names often come in threes:

struct record
{
  string .name;
  explicit record(string name) : .name(name) { }
  string .name() const { return .name; }	// Which .name do you mean!?
};

Ideas on solving that case are more than welcome. But even without it, I think "dotted names" could be a great maintainability enhancer.

Templates

New template bracket syntax

If templates were a bit easier to parse, we'd be rid of many little problems we have now. The trouble is that the <angled brackets> we use for templates already had various other meanings (in the <, <=, << etc. operators), and that because of this, they are not consistently matched in pairs the way the other bracket characters ([{}]) are. This is also a pain in the backside for other programs that try to parse the language, such as syntax-colouring plugins for editors.

If we had some easily recognizable syntax such as

template![int T, typename C]

then we would be rid of warts like set<set<int> > needing that piece of whitespace to avoid parsing >> as a shift operator, or the abominable typename A<T>::template B<int> b; (as found in the gcc 3.4 series changelog). As David Vandevoorde points out (thanks David!) the typename keyword would have to remain, but the template in the middle would become unnecessary.

Non-intrusive tuples

It bugs me that associative containers use std::pair, with its first and second, whereas sequential containers just contain their value types--meaning that algorithms written to deal with the two kinds of iterators may have to look very different:

template<typename IT> void dump_values(IT begin, IT end)
{
  for ( ; begin != end; ++begin) cout << *begin << endl;
}

// Special version for associative containers--sorry!
template<typename IT> void dump_values(IT begin, IT end)
{
  for ( ; begin != end; ++begin) cout << begin->second << endl;
}

Sure you could add a template argument specifying how to access the iterators, and either write your own functor that just returns an iterator's ->second or use the standard library's nifty lambda-calculus tricks (not all of which made it into the Standard) to achieve the same effect. But in practice it's rarely worth the trouble. Your code usually gets longer, not shorter, and harder, not easier to maintain.

What if we had some way of specializing what first and second mean in the context of a container? What if we accessed iterators using templated key() and value() functions defined by the container? A set would be nothing more than a map where the two iterator access functions return the same value, or where value() returns void (a set<T> would then be a map<T,void>). A type like vector could implement a key() that computed an iterator position's index, making it work from your algorithm's point of view like an associative container indexed by integral values. Which is what it is, really. The only remaining difference besides performance specifications would be that a vector must have a contiguous key range. That may not be of any interest to most algorithms!

Why do I propose making these functions members of the container rather than of the iterator? Several reasons:

  1. It's non-intrusive on the iterator, meaning that it would still work with pointers. One of the strong points of the standard library's iterator model is that it also covers pointers.
  2. Containers like vector may need access to private members to compute an iterator position's key() efficiently.
  3. It would be easy to make the key() and/or value() functions overridable through the container's template parameters.

The iterator's traits would also have to provide a way to access these functions so sequence algorithms (which don't know about containers) could access them.

There is one big chicken-and-egg problem with all this: if these functions are added as container template parameters, how do you specify the defaults? You can't write "I want a set<int> with, as its value() function, whatever is the builtin default for a set of int with the default implementation for value()." It's an infinitely recursive definition. Perhaps the problem can be solved with an extra layer of templatization on the contained type (so that set and map, for instance, would only differ in this detail) but of course we also want to keep containers simple.

Did I just say that? Make it "keep containers at or near their current daunting level of complexity." At least until compiler error diagnostics improve dramatically.

Namespaces

Access protection

Imagine being able to write:

namespace foo 
{
private:
	void bar();
public:
	int splat(); 
}

What I do to get a similar effect in libpqxx is to define a public namespace pqxx, and within that, a nested namespace called internal which is documented as something that users should not touch. With access protection I could actually enforce this, and know that I wouldn't be breaking any third-party code when changing classes etc. in there.

Namespace inheritance

It may not make all that much sense in practice, but allowing inheritance between namespaces would give protected a meaning in namespace context. Presumably it could be "tucked underneath" the definition of class inheritance in the same way that namespaces themselves have been turned into a fitting component of the existing class inheritance mechanism.

Static virtual functions

Once you start doing namespace inheritance, you might as well have virtual functions in namespaces (and, by extension, static virtual functions in classes as well). "Dynamic" dispatch (actually I suppose it could be optimized away at link time) would be based on the calling function's namespace, regardless of whether the implementation's declaration is known to the compiler at the call site.

Access protection

"prohibited" protection level

Imagine a protection level "beyond private," which doesn't allow any access at all--not even from within the declaring class.

"prohibited" would be useful for disallowing copy constructors, copy assignment operators or default constructors that the compiler would implicitly generate otherwise. Standard practice for accomplishing this is currently to declare them as private and deliberately fail to implement them, but it's sort of a dirty trick and errors may not be caught until link time. Make them prohibited and you'll get a compile-time error if you try to call them even from within the class itself. The new protection level would also be useful for implementing virtual member functions that should only be accessed through virtual calls. Virtual functions can be dangerous, believe it or not (ever called one from a constructor or destructor?) so it may be useful to restrict access to them even from within the defining class.

Update: David Vandevoorde tells me that work is being done on a special syntax to suppress automatic generation of constructors and assignment operators, making the "false declaration" trick unnecessary.

Ideally the new keyword would precede "private" in alphabetical order so it fits into the private - protected - public progression. But at least prohibited starts with a 'p'.

Update: Bart Samwel found a word that meets these requirements and has a fitting meaning: precluded.

Member function access levels

Imagine being able to specify the level of access a member function has to its class: private (default and maximum), protected (as if it were defined in a derived class), or public (as if it weren't a member at all).

You'd be able to define member functions that look like perfectly normal members from the syntactic point of view, but can be implemented purely in terms of the class' public interface. You could add functionality to a class in the form of member functions without losing track of what state transitions the objects can go through; the number of "privileged" member functions would remain limited.

This is currently worked around by not making the "unprivileged" functions members at all. But if the interface between the class and the function ever changes, for instance because an optimization opportunity is found that requires more access to the class' implementation details, you can either move the function back into the class (and break source compatibility) or make it a friend (and break the whole model).

Builtin Types

Sub-byte types

typedef bool bit_t:1;

You wouldn't be able to take its address or its size, which complicates the way arrays are defined, but it could be nice to have packed-bit arrays in the language.

likely and unlikely

It'd be nice if we had likely bool and unlikely bool types (somewhat similar to signed char and unsigned char), with optional compiler warnings for such probable mistakes as implicit conversions between the two, exceptions thrown on likely conditions, etc. (Note that operator!(likely) would return unlikely and vice versa.)

Good for static compiler optimization as well as code clarity. As a side effect, the familiar if (unlikely(x < 0)) syntax from the Linux kernel (a macro wrapped around gcc's __builtin_expect) would also work. And with unlikely being a builtin type instead of an operator or function, there is less temptation to read that statement as "if x is unlikely to be negative," which is not what it means.

I wonder if these two new types should be seen as different from regular bool for purposes of overloading and template specialization... On the one hand it would make things much more complex, but on the other, it might allow for some additional optimizations.

void Objects

Why can't we have void variables? Some compilers used to implement this as a C language extension. In C++ it could be even more fun. Consider:

template<typename T> inline T foo(T t)
{
	return bar(t);
}

What happens if, for some type T, bar(t) returns void? Should instantiation of this template fail over such a minor point? Hell, no! There will be an error if the program tries to receive the return value into a variable (unless that too is void, of course), and unless that happens, there is no problem whatsoever.

Promoting, Converting, or Returning void

Returning a non-void value from a void function would still yield a compiler warning, but this time as a type demotion rather than as a special case. Returning a void from a non-void function (say it returns type T) could either be an error or return a default-constructed T. T's default constructor doubles as a "promotion" from void to T.

This raises the question of whether we'd need to have explicit default constructors as well, but I suspect there wouldn't be much need.

Container Library

Once we allow void objects, some simple templates may become implementable as degenerate cases of more complex ones:

template<typename T> typedef map<T,void> set<T>;

The standard library may specialize this to contain no object storage at all. The set<void> would then hold no more than a boolean to indicate whether an "object" had been inserted (or a counter if we decide that all void values should differ, instead of considering them all equal). Personally I'd like to say all void objects were not only equal, but actually the same object. But that may not be feasible; see below.

Storage

If void is allowed to have zero size, arrays and other variables of void type could be optimized away entirely in all cases. That may reduce the need for manual specialization when we reduce simple templates to degenerate cases of generic ones; you'd be almost guaranteed maximum efficiency for the void case, as if it were a manual re-implementation.

On the other hand, if it better fits the Standard to give each void object its own identity and nonzero size, then this could be a convenient way to tell the compiler to generate a unique address (without necessarily allocating storage to it). Sometimes that's useful too. We could just define sizeof(void)==1 and specify that the compiler may avoid assigning real storage to it where it can. That would be consistent with the notion that incrementing a void pointer increases it by one, as e.g. some code in Linux assumes (that's a gcc extension, though, not standard C).

Arguments

A nasty consequence of all this is that a function's number of arguments suddenly becomes a relative thing. If a function declared as void foo(); may be called as either foo() or foo(void()), why not allow foo(void(),void()) as well? That will take some more thinking, but I'm sure there's a way of living with it.

It may even be put to use. Function templates that accept "up to 5" arguments for example could in some cases be implemented as a single template, some of whose arguments may be of type void. Present practice requires 6 different definitions for zero arguments, one argument etc.:

template<typename A, typename B, typename C, typename D, typename E>
inline void print_total(A a, B b, C c, D d, E e)
{
	// Add and print.  Addition operators taking void arguments are all
	// defined as no-ops.
	cout << a + b + c + d + e << endl;
}

To be fair though, this is probably an exceptional case. Most templates will probably fail to compile when compiled with void as an argument type. But that's all compile-time trouble, not run-time surprises, and if those cases can be fixed it may eventually make for more elegant programs.
