My C++ Wishlist

Here's a list of features I think I'd like to have in C++. Some of them are pretty half-baked, but blue-sky ideas can be fun even if they aren't always technically useful.

Most of this list is old (2008 or older). And guess what? Some of my wishes have been granted! They are archived at the bottom. I do have some wishes left for C++14 though.

noexcept(auto)

(This wish was added in April/May 2015. I've since found out that it has already been proposed. I know there are possible complications; the differences are that (1) I vaguely imagined that noexcept would be deduced after a call was resolved, and (2) my version leaves the compiler the freedom to be inconsistent in its noexcept deductions for one and the same function.)

By default in C++14, a function is declared noexcept(false). This means that it does not make the noexcept guarantee — in other words, it may throw exceptions.

I would like to see the default changed to noexcept(auto). Which does not exist. But I would like it to mean: "At points where the compiler is aware of the function's implementation, treat it as conditionally noexcept depending on the union of the noexcept-ness of everything it does. Where the compiler is not aware of the implementation, treat it as noexcept(false)."

In other words: figure out whether the function can throw exceptions, and if you know that it can't, treat it as if it were declared noexcept. But, make no interface guarantees to humans.

Of course there's a downside. The exact same function could be noexcept(true) in one place where it's referenced, and noexcept(false) in another. How bad would that be? I have no idea. Making no assumption even though you could is not so bad, just not as fast. The converse might happen: some other piece of code may hear from elsewhere that your function can't throw, even though the compiler at that point hasn't made that determination. But in that case, the function still can't throw! It may not matter.

And then there's the upside. Compilers can generate faster code for some functions even when its authors do not want to make any interface promises. But even libraries can join in, choosing faster implementations when the compiler detects exception-free regions of code. (They do this already, e.g. move-constructing elements when reallocating a vector, instead of copy-constructing them, when it knows the move constructor can't throw any exceptions that might leave the vector in a bad state.)

It's a bit similar to auto return types, where you don't need to declare your function's return type in C++14. The difference is that there's more room for quality-of-implementation choices: a weakly-optimizing compiler could stick with the C++14 default of noexcept(false). A slightly better one could do this only for inline functions. A more aggressive optimizer could infer noexcept for non-inline functions within the same translation unit. Whole-program optimizers might carry the information across translation units for perfect noexcept information everywhere. The difference may affect the code here and there, especially in STL containers, but not functionally. The only difference would be in the optimization opportunities.

A keyword for function declarations and definitions

(Added May 2015)

In spite of all the available tools, I still like to search a codebase with grep. But it's not as useful as I would like for looking up function declarations and definitions.

It would be insanely practical if C++ had a keyword to mark functions, just as it does for classes, and just as most recent languages do. Especially now that return types can come after the signature: a project style guide can demand that a function name be on the first line of its declaration, and now you can search for e.g. "func .* myroutine(" to find just the declarations of myroutine() but not any of its uses.

Over time, of course, you could hope that the keyword would become the norm. It could make it easier to tell function declarations from variable definitions with initializers, if that will still be an issue. Parsers and syntax highlighers might have an easier time. Humans almost certainly would.

Optimization Goals

For ages now we've been optimizing CPU performance. Higher clock speeds, more aggressive optimizations, faster memory to keep the CPU fed. It all matters, because practically all of the CPU's time is spent either waiting for memory or cranking through tight loops. The benchmarks we use to guide all this work say so. That's one big reason why we still use so many statically compiled languages like C++: you get to spend a lot more time finding ways to make those loops run just a little bit faster. We need multimedia extensions so we can speed up those crucial loops in multimedia applications: it's low-hanging fruit for an important chunk of perceived performance in everyday computing tasks.

Then there are those who say that computers are "fast enough" now, or at least for most users. Most of the computer's time is spent waiting for something it can do for us. We don't need our word processors or web browsers to be any faster. What matters now is to produce lower-power processors with more functionality, so computers can do ever more and ever smarter work for us. Some say we've even run out of ideas to put the computing power we have to effective use. Processors have been getting faster at an astounding exponential rate, and although memory hasn't exactly been keeping up, it's gotten a lot faster too. This has kept system speeds skyrocketing faster than we know how to waste it on slower software.

Bollocks!

Both positions are full of good, technically correct points. But I think they misrepresent what's been happening: the CPU cranking through loops is no longer the major time waster in most computer use. The computer isn't fast enough, and never will be. But ever-faster processors and ever-more efficient loops have hit a point of diminishing returns. It's a simple application of Amdahl's Law: if n% of runtime is spent on a particular part of the job, then the best you can hope for in optimizing that one task is an n% speedup. And that's if you manage to get the task done infinitely fast; if you speed it up by a factor of x, you only shed n%-(n%/x) of overall runtime, while getting to a higher x typically requires a disproportionately large amount of effort.

The moral is obvious: don't waste large efforts getting a large x for a low n. Double the speed of a 10% part of your workload, and what have you gained? An overall 5% improvement. Which is great, but not as much as you could have gotten from a mere 20% speedup on a task that took 50% of your time.

So where are our systems wasting their time (and ours)? Don't believe the benchmarks. Those tend to measure the straightforward, classic computing tasks that used to take so long, way back when. In fact, this trend I'm talking about has already come to the point where it's getting hard to find good benchmarks for that kind of thing. The big problem in composing "practical" benchmark tests today is finding realistic, straight-through processing tasks that take long enough to run! Second big problem: to eliminate the huge, highly variable "warmup" cost of running a benchmark — a cost that would otherwise grossly distort the real runtime of the program. We need consistent, reproduceable results that are subject to all kinds of rational analysis.

Waaaaaiit a minute. Ho! Step back. What's happening here? We're doing our best to eliminate this large, highly variable cost from our benchmarks. But it's getting harder and harder to make the straight-through processing take up a significant part of the total. Now that is an institutional blind spot. We're focusing on a low n so we can fight over the best x!

Back to the high-n part of the job, therefore. Here are some interesting facts about that, which are quite widely known in various circles:

Below are some things we could conceivably do about the real performance problem...

Lazy vtables

For every class that has virtual member functions (or destructors — strictly speaking they're not called member functions I believe), a typical C++ compiler will emit a vtable. This is a list of pointers to the various virtual functions in that class. Every object of the class will have a pointer to its vtable (the vptr), except that each class in the hierarchy may have its pointers point to different implementations of the same virtual function.

All of these functions need to be looked up by the system linker/loader when the code is loaded in memory for execution, so the vtable can be filled with the addresses where the various virtual function implementations happen to have been loaded into memory. Theoretically speaking this is only needed if the class, nay even the individual function is actually used — but the conventional model of separate compilation and linkage makes that impossible for the compiler to predict. Perhaps we could save a lot of time if we didn't need to do all these lookups; lots of them are never used on the average run of a typical program. But inserting run-time checks would be too costly: "I'm going to call this virtual function. Wait a minute, have I looked up its address yet? If not, do it now." That would defeat the entire purpose of the exercise.

Something we could do, however, is to replace the function pointers in the vtable with stubs. The pointers would not be directed to the virtual function implementations themselves on loading; instead, they would all point to a single stub function that does the following:

  1. Identify the function that the caller really intended to call.
  2. See whether the function's address has been looked up yet. If not:
    1. Look it up.
    2. Replace the pointer in the vtable with the newly found address.
  3. Jump to the originally intended function.

After the program has been running for a while, all functions that are actually used would have been looked up and run at full speed, just as in the conventional implementation.

This may look unnecessarily complex. There is a technical reason for that: the program may be multi-threaded, and doing things this way avoids the need for a lock to synchronize multiple threads that may try to make that crucial first call at the same time. Writing the new address to the vtable should be atomic on most architectures, without the need for any synchronization; the compiler implementation would know about this anyway.

This trick could greatly reduce the number of symbols that need to be looked up at program load time, probably reducing the time needed to get some first useful responses from it, at a hopefully small extra cost in initial performance.

There has to be a catch, of course, and there is.

What if the program does not call the virtual function, but tries to take its address instead? You could just let it read the stub, and there are things you could do to make that "patch itself out of the code" to some extent when a call is made through the pointer. But who ever wants to take the address of a virtual function in order to call it? That's exactly what virtual functions are for in C++: to take the "pointer dancing" for function calls with runtime binding out of the programmer's hands — look through the Linux kernel source for some examples of this ugly stuff. No, a much more likely reason why someone could want to take the function's address is to obtain some runtime type information. Compare the function pointer with another. And since that does not involve a call, using stubs here becomes really difficult.

Assuming that such cases are rare, however, it may be well worth the cost to insert a runtime check for this usage. Just check that every virtual function whose address is taken has had its address looked up, then proceed as with a conventional vtable. If that case is rare enough, as per Amdahl's law (a low n and negative x for taking function addresses, but higher n and good x for load effort) this could be a big win.

This is not a new idea. I've used this in the 1990's, and I'm sure others have come up with it before me. A particularly nice part of the "atomic update" trick is that it is also idempotent: even on a CPU that keeps its instruction and data caches separate, there is no need to flush caches to make the change stick.

Parallellism

(This wish dates back to 2005. The Rust programming language has pretty much granted it, though not in C++!)

Everyone "knows," and has "known" for some time now, that parallellism is going to be the next major step up in performance.

As far as I can see, it hasn't happened. There have been many nice academic ideas that weren't quite ready for practical use, or only worked in environments that weren't (such as overly restrictive languages). Then there are the various threading implementations for the major programming environments, which have more or less standardized but remain messy. And there's the fact that C and C++ don't exactly help make it easy. Java has good ideas on how to do conventional multithreading, but it's still just that — conventional multithreading.

In my humble opinion, we've been looking the wrong way. For an example of how to approach these things, look at const.

I love const. It cuts both ways: it expresses an important semantic property in the source code (good for the programmer), but it also provides a powerful piece of information to the optimizer.

A viable approach to parallellism should be like that: first and foremost a way of specifying correct software, so people have reason to adopt it in their daily programming; and then we can look at how to use it for optimizing the resulting code. That requires a leap of faith: start using it before we know it's going to be designed "just right" from the optimizer's point of view. The same happened with const: what optimizers really wanted was something more like restrict (a C99 feature), and practical experience with const was the only way to make clear just what was needed. The ensuing difficulty in designing a stronger successor indicates to me that const was a stroke of genius. I can only hope that future languages will allow the programmer to define similar attributes for their own classes.

So, I say, design useful correctness semantics for parallellism first, and trust that it will provide optimization opportunities. Even if it doesn't, a construct that provides this much will have gained us something.

Really easy to say, right? We need a construct that enhances correctness before all else. How can you take a problem that puzzles many, and introduces uncountable (and more often than not, un-debuggable) bugs into the world, and somehow make it enhance correctness?

I think there are things we can do; see below. But first and foremost, we need to try. It's hard, but it's worthwhile. The promised Era of Parallellism will not come all by itself. We can start by thinking about dependency relationships (and lack of same) between tasks.

Parallellism for Correctness

In case you thought this was all theoretical, here's a list of real-world examples that could be vastly improved with a good model for dependencies and independent tasks. Some of them are largely about error handling, which is a particularly important (and underrated) area of programming.

What these examples have in common is the spilled-coffee effect: you set the computer to work on some long-running task and go off to get a cup of coffee in the meantime. By the time you get back, you expect it to be more or less finished.

Instead, you find that almost none of the work has been done. For most of the time you've been away, the program has been sitting there displaying some equivalent of "Press Any Key to Continue." This is of course the point where you may spill your coffee; even if you don't, the coffee break has, in a sense, been wasted.

Like I said, error handling is particularly important. Few programmers spend much time thinking about it. There are good reasons for this. It's against human nature to focus on what might not work; the programmers are intimate enough to understand most of their errors when using their programs, and subconsciously avoid them in the first place; and frankly, it's not usually very clear what to do with them anyway.

So programmers tend to ignore errors. Users, as a result, get bombarded with unclear, unspecific, dumb, or superfluous error messages and tend to ignore them also. A good model for expressing interdependence of tasks can help here: if we stop treating exceptions like hot potatoes that must be passed on to the user immediately, we can aggregate them a bit. Present them in more sensible ways. Recognize what's going on before passing the buck to whoever's sitting outside the computer. "The following error occurred for a bunch of different files." I've seen some programs do that, but not many, and never with multi-selectable lists to make it easy to manipulate the program's reactions to each. Let's find better ways of handling errors, outside of the regular sequential code path!

In the following sections I'll show a very, very simple construct that could help solve all these problems — given one very simple extension to the language.

Cloning Exceptions

We need a way to create copies of exception objects, without full static type information. Why is this under "parallellism?" Because it allows us to write frameworks that perform multiple tasks independently of one another, and gather up their successes and failures for later processing. Just catch any exception coming out of each individual task, add it to a container of results, and proceed to the next independent task.

This can really help. See my pipeline class in libpqxx, the C++ client API to PostgreSQL. This is the kind of "execution management" class that we'll need in order to support more liberal execution models.

(In case you didn't know it, PostgreSQL or "Postgres" for short is a fully enterprise-ready, open-source database management system. I vastly prefer it over MySQL for standards compliance and even, to my own surprise, ease of use. Its licensing is also more liberal than MySQL's, although I must confess to being a GPL man myself.)

Primitive Example: pipeline

The pipeline class is conceptually simple: instead of executing queries in synchronous fashion ("execute this query and give me the result"), you create a pipeline object. You feed your queries into the "front" end, and retrieve results from the "back" end. They are executed in strict insertion order, but apart from that, sequencing is controlled by the pipeline object. This allows for several optimizations:

  1. Latency hiding: your program can go and do something useful while the query is executing. No need to waste the time, nor to risk a gruesome debugging effort to get your event handling right.
  2. Concatenation: PostgreSQL happens to allow queries to be concatenated and sent to the server collectively. If the server is far away on the network, and you have a lot of queries to perform that you can formulate completely before the first one completes, they can all be bundled and transmitted in bulk.
  3. Server-side Pipelining: some database management systems are implemented internally as pipelines. One stage may receive and parse a query, then pass it on to the next which optimizes it; the following stage may plan data access patterns for it, the next (or next few) execute it, and so on until the results are sent back to the client. If these pipeline stages are implemented as independent threads running on separate processors, for instance, then sending more consecutive queries to the server at once may allow these stages to work on the queries concurrently.

Of course, the pipeline's error handling gets a little complicated. If one of your queries fails, you want an exception when you retrieve its result — not before, when it is executed, and you've still got some perfectly good previous results waiting to be retrieved (literally "in the pipeline"). So the pipeline class retains any error information and doesn't convert it to exceptions until the right moment.

Implementing Exception Cloning

The pipeline is lucky in this regard because it exists on the boundary between C (which doesn't have exceptions) and C++ (which does). In the general case, this is much harder to do. What if the query could just throw any C++ exception? About the best we could do, I guess, is to catch as many known exception types as possible, and for each of them, include code to create a copy on the free store (using new), and store that somewhere. We can't return a reference to the original exception, because its lifetime will end at the end of the catch block, and passing copies around directly pretty much restricts us to one basic exception type.

What we arrive at is not very scalable, nor does it lend itself to very effective reuse:

template<typename T> inline std::exception *clone(const T &t)
{
  return new T(t);
}

exception *perform_one(std::string query)
{
  std::exception *fail = 0;
  try
  {
    execute(query);
  }
  catch (const std::bad_alloc &e) { fail=clone(e); }
  catch (const std::length_error &e) { fail=clone(e); }
  catch (const std::domain_error &e) { fail=clone(e); }
  catch (const std::out_of_range_error &e) { fail=clone(e); }
  catch (const std::invalid_argument &e) { fail=clone(e); }
  // TODO: Add more exception types here!

  try { post_query_work(); } catch (...) { }

  return fail;
}

Obviously we need something better. Making run-time type information more accessible would be nice: then we could ask the compiler's run-time system to copy-construct the exception object for us, whatever its type. But that would require a fairly radical change in the language (which I hope to write more about in the future). Let's assume that we only have simple tools, and small changes in the language at our disposal.

As a "quick fix," why not give std::exception a new member function along the lines of:

namespace std
{
class exception
{
public:
  virtual auto_ptr<exception> clone() const
  {
    auto_ptr<exception> result(new exception(*this));
    return result;
  }

  // ...
};

The std::exception hierarchy already has a vtable, so this is no great loss in terms of performance. Now, every concrete class derived from std::exception would hopefully implement this function, always returning a copy of itself as a std::auto_ptr<std::exception> to avoid changing the return type — but dynamically it would always point to an exception of the right type. Programmers implementing exception classes of their own would be called upon to do the same.

So what about scalability — what if some exception class is not supported? The answer is not great, but better than what we had before. If some third-party exception class fails to implement this function, then some details for that exception class will be lost, and you'd get a copy-constructed object of some parent class of the real exception. This is known as slicing and happens a lot in current exception programming as well. At least you should still have the what() string. It's regrettable, but not a new problem.

Moreover, if this were part of the standard, more programmers would support it and at some point one would simply expect it to be there. Remember how old C++ libraries used to grow into frameworks and define their own complete exception hierarchy, completely unrelated to anything in the Standard Library? That slowly went away, the custom exception classes were brought into line with the std::exception hiearchy, and so should exceptions without clone().

Using It

So let's say all relevant exception classes support clone(). What do we do with it? To answer that, here's a simple class that performs a set of actions that are not interdependent. At some future point, this could be optimized to distribute these tasks over multiple threads — but what it does for now is express that these tasks can be executed in any order, and we don't want to just cancel or hold everything whenever an exception comes out of one of them.

/** If you've got a job you'd like us to perform, wrap it in a class derived
 * from task, create an object of your class, and pass it to perform() (see
 * below).  A job is a functor.
 */
class job
{
  std::string m_name;
public:
  job(const std::string &jobname) : m_name(jobname) {}
  const std::string &name() const throw () { return m_name; }

  /// Overridable: the action to be performed.  Feel free to throw exceptions.
  virtual void operator()() throw (std::exception) =0;
};

/** Performs jobs.  Pass it a sequence of pointers to jobs.  Any failed jobs are
 * logged to cerr, but only the first (if any) will throw an exception.
 */
template<typename ITER> void perform(ITER begin, ITER end)
{
  exception *error = 0;

  for ( ; begin != end; ++begin) try
  {
    (**begin)();
  }
  catch (const exception &e)
  {
    std::cerr << "Job " << (*begin)->name() << " failed: "
              << e.what() << std::endl;
    if (!error) error = e.clone();
  }
  if (error) throw *error;
}

Object Sync

(This wish dates back to 2004. In retrospect it's probably not a good idea. It's also probably moot, as of C++11 with its threading memory model.)

In conventional multithreaded programming, you often obtain a lock, access an object related to the lock, release the lock, and continue. This exposes you to two risks that the language doesn't help with:

  1. When you think you're reading (parts of) the object from memory, you may actually be reading from processor registers because the compiler "cached" them for you during a previous access, or "prefetched" possibly incorrect values before you obtained the lock.
  2. After you've modified the object and released the lock, the compiler may still be keeping some of the values you've written in registers, either to write them back after you release the lock (creating a race condition) or not writing them back at all because it thinks it knows exactly how and when it will be read/written next.

Lots of C and C++ programs deal with this problem by ignoring it. This will usually work because the compiler isn't smart enough to optimize across the external lock/unlock function calls or inlined assembly instructions. But compilers get smarter (some already do whole-program analysis), and we normally want them to optimize as aggressively as possible. In fact, even with the lock/unlock functions, it can be good to have optimization "think across" the calls for all matters except the locked object.

We do have one construct to help us explain to a compiler that an object may be shared between different threads or execution contexts: volatile. But this vaguely defined modifier affects the entire lifetime of the object, so all code accessing the object is kept less efficient — just because we need to synchronize it at two points in the program!

What we need, I guess, is a targeted synchronization primitive. Imagine:

namespace std
{
template<typename T> void sync(volatile T &) throw ();
}

Now, doing a std::sync(foo) on an object foo would tell the compiler, "I don't want you to keep any part of foo in registers across this call, because another execution context may access it." It would also help express to anyone reading the code which is the, ahem, object of that lock. I don't think the volatile is strictly necessary (heck, it might even cause compilation problems) but I'm including it here mostly for artistic reasons.

Compilers could implement the template in various ways:

Class Member Names

Some people like to prefix the names of class member variables with underscores (_foo), Hungarian-looking lower-case letters (pFoo), or other prefixes (i_Foo, m_foo). I'm one of them. I dislike Hungarian notation but like to know when code is referring to a member variable of its containing class. I find it really helps, both in maintaining my own code and in reading somebody elses code. Example code snippets can be shorter and clearer given such a convention, because they no longer need to state whether a particular variable is a class member.

Unfortunately, it's not going to work very well unless everyone can agree on a single convention — which they can't. None of them looks quite natural to me. It's often better to express things inside the language than using conventions on top of the language, so here's an idea on how to do that.

Dotted Names

The idea is to allow class member names to start with a dot:

struct bar
{
  int .foo;

  void func() { .foo++; }
};

That may not seem to make much sense, except there is one special rule: if the class member is explicitly qualified as such, the dot can (or must, really) be omitted. So it only shows up when referring to an unqualified name, but in that case it is also enforced by the compiler. This should work for member variables and member functions alike:

struct splat
{
  static int .foo;
  int .myfoo;

  static void .func()
  {
    .foo++;
    splat::foo++;
    foo++;	// ERROR: "foo" undeclared
  }

  void .anotherfunc()
  {
    .myfoo++;
    this->myfoo++;
    (*this).myfoo++;
    .func();
    splat::func();
    func();	// ERROR: "func()" undeclared
  }
};

There could be serious parser problems with this idea; but if it's possible then I think it would solve a lot of problems. It relies on standard language notation to indicate class membership, so no more convention conflicts. It's compiler-enforceable. And it's backwards-compatible: you can even rename a public class member using this notation without breaking code outside the class. Only code inside the class would have to be adapted!

In practice, I also run into naming conflicts elsewhere:

struct atype
{
  int property;

  // Change spelling of parameter to avoid name conflict
  explicit atype(int Property) : property(Property) { }
};

With the proposed "dotted names," that could become the slightly more elegant

struct atype
{
  int .property;

  // Struct member name starts with dot, parameter name doesn't and can't
  explicit atype(int property) : .property(property) { }
};

A problem that remains, of course, is that property names often come in threes:

struct record
{
  string .name;
  explicit record(string name) : .name(name) { }
  string .name() const { return .name; }	// Which .name do you mean!?
};

Ideas on solving that case are more than welcome. But even without it, I think "dotted names" could be a great maintainability enhancer.

Templates

New template bracket syntax

(This wish dates back to 2004. At least the tools have improved, and C++11 fixes the problem with adjacent closing angled-brackets looking like the left-shift operator.)

If templates were a bit easier to parse, we'd be rid of many little problems we have now. The problem is that the <angled brackets> we use for templates had various different meanings already (in the <, <=, << etc. operators), and that because of this, they are not consistently matched in pairs as the other bracket characters ([{}]) are. This is also a pain in the backside for other programs that try to parse the language, such as syntax-colouring plugins for editors.

If we had some easily recognizable syntax such as

template![int T, typename C]

then we would be rid of warts like set<set<int> > needing that piece of whitespace to avoid parsing >> as a shift operator, or the abominable typename A<T>::template B<int> b; (as found in the gcc 3.4 series changelog). As David Vandevoorde points out (thanks David!) the typename keyword would have to remain, but the template in the middle would become unnecessary.

Non-intrusive tuples

(This list dates back to 2004. C++11 introduces std::tuple, auto type deduction, lambdas, and initializer lists, giving me most of what I wanted. But in 2015 some of the wish still looks pretty neat.)

It bugs me that associative containers use std::pair, with its first and second, whereas sequential containers just contain their value types — meaning that algorithms written to deal with the two kinds of iterators may have to look very different:

template<typename IT> void dump_values(IT begin, IT end)
{
  for ( ; begin != end; ++begin) cout << *begin << endl;
}

// Special version for associative containers--sorry!
template<typename IT> void dump_values(IT begin, IT end)
{
  for ( ; begin != end; ++begin) cout << begin->second << endl;
}

Sure you could add a template argument specifying how to access the iterators, and either write your own functor that just returns an iterator's ->second or use the standard library's nifty lambda-calculus tricks (not all of which made it into the Standard) to achieve the same effect. But in practice it's rarely worth the trouble. Your code usually gets longer, not shorter, and harder, not easier to maintain.

What if we had some way of specializing what first and second mean in the context of a container? What if we accessed iterators using templated key() and value() functions defined by the container? A set would be nothing more than a map where the two iterator access functions return the same value, or where value() returns void (a set<T> would then be a map<T,void>). A type like vector could implement a key() that computed an iterator position's index, making it work from your algorithm's point of view like an associative container indexed by integral values. Which is what it is, really. The only remaining difference besides performance specifications would be that a vector must have a contiguous key range. That may not be of any interest to most algorithms!

Why do I propose making these functions members of the container rather than of the iterator? Several reasons:

  1. It's non-intrusive on the iterator, meaning that it would still work with pointers. One of the strong points of the standard library's iterator model is that it also covers pointers.
  2. Containers like vector may need access to private members to compute an iterator position's key() efficiently.
  3. It would be easy to make the key() and/or value() functions overridable through the container's template parameters.

The iterator's traits would also have to provide a way to access these functions so sequence algorithms (which don't know about containers) could access them.

There is one big chicken-and-egg problem with all this: if these functions are added as container template parameters, how do you specify the defaults? You can't write "I want a set<int> with, as its value() function, whatever is the builtin default for a set of int with the default implementation for value()." It's an infinitely recursive definition. Perhaps the problem can be solved with an extra layer of templatization on the contained type (so that set and map, for instance, would only differ in this detail) but of course we also want to keep containers simple.

Did I just say that? Make it "keep containers at or near their current daunting level of complexity." At least until compiler error diagnostics improve dramatically.

Namespaces

Access protection

Imagine being able to write:

namespace foo 
{
private:
	void bar();
public:
	int splat(); 
}

What I do to get a similar effect in libpqxx is to define a public namespace pqxx, and within that, a nested namespace called internal which is documented as something that users should not touch. With access protection I could actually enforce this, and know that I wouldn't be breaking any third-party code when changing classes etc. in there.

Access protection

Member function access levels

Imagine being able to specify the level of access a member function has to its class: private (default and maximum), protected (as if it were defined in a derived class), or public (as if it weren't a member at all).

You'd be able to define member functions that look like perfectly normal members from the syntactic point of view, but can be implemented purely in terms of the class' public interface. You could add functionality to a class in the form of member functions without losing oversight of what state transitions the objects can go through; the number of "privileged" member functions would remain limited.

This is currently worked around by not making the "unprivileged" functions members at all. But if the interface between the class and the function ever changes, for instance because an optimization opportunity is found that requires more access to the class' implementation details, you can either move the function back into the class (and break source compatibility) or make it a friend (and break the whole model).

Builtin Types

Sub-byte types

typedef bool bit_t:1;

You wouldn't be able to take its address or its size, which complicates the way arrays are defined, but it could be nice to have packed-bit arrays in the language.

likely and unlikely

It'd be nice if we had likely bool and unlikely bool types (somewhat similar to signed char and unsigned char, with optional compiler warnings for such probable mistakes as implicit conversions between the two, exceptions thrown on likely conditions, etc. (Note that operator!(likely) would return unlikely and vice versa).

Good for static compiler optimization as well as code clarity. As a side effect, C99's if (unlikely(x < 0)) syntax would also work. And with unlikely being a builtin type instead of an operator or function, there is less temptation to read that statement as "if x is unlikely to be negative," which is not what it means.

I wonder if these two new types should be seen as different from regular bool for purposes of overloading and template specialization... On the one hand it would make things much more complex, but on the other, it might allow for some additional optimizations.

void Objects

Why can't we have void variables? Some compilers used to implement this as a C language extension. In C++ it could be even more fun. Consider:

template<typename T> inline T foo(T t)
{
	return bar(t);
}

What happens if for some type T, bar<T>() returns void? Should instantiation of this template fail over such a minor point? Hell, no! There will be an error if the program tries to receive the return value into a variable (unless that too is void, of course) and unless that happens, there is no problem whatsoever.

Promoting, Converting, or Returning void

Returning a non-void value from a void function would still yield a compiler warning, but this time as a type demotion rather than as a special case. Returning a void from a non-void function (say it returns type T) could either be an error or return a default-constructed T. T's default constructor doubles as a "promotion" from void to T.

This raises the question of whether we'd need to have explicit default constructors as well, but I suspect there wouldn't be much need.

Container Library

Once we allow void objects, some simple templates may become implementable as degenerate cases of more complex ones:

template<typename T> typedef map<T,void> set<T>;

The standard library may specialize this to contain no object storage at all. The set<void> would then hold no more than a boolean to indicate whether an "object" had been inserted (or a counter if we decide that all void values should differ, instead of considering them all equal). Personally I'd like to say all void objects were not only equal, but actually the same object. But that may not be feasible; see below.

Storage

If void is allowed to have zero size, arrays and other variables of void type could be optimized away entirely in all cases. That may reduce the need for manual specialization when we reduce simple templates to degenerate cases of generic ones; you'd be almost guaranteed maximum efficiency for the void case, as if it were a manual re-implementation.

On the other hand, if it better fits the Standard to give each void object its own identity and nonzero size, then this could be a convenient way to tell the compiler to generate a unique address (without necessarily allocating storage to it). Sometimes that's useful too. We could just define sizeof(void)==1 and specify that the compiler may avoid assigning real storage to it where it can. That would be consistent with the notion that incrementing a void pointer increases it by one, as e.g. some code in Linux assumes (which probably means it's part of the C99 standard).

Arguments

A nasty consequence of all this is that a function's number of variables suddenly becomes a relative thing. If a function declared as void foo(); may be called as either foo() or foo(void()), why not allow foo(void(),void()) as well? That will take some more thinking, but I'm sure there's a way of living with it.

It may even be put to use. Function templates that accept "up to 5" arguments for example could in some cases be implemented as a single template, some of whose arguments may be of type void. Present practice requires 6 different definitions for zero arguments, one argument etc.:

template<typename A, typename B, typename C, typename D, typename E>
inline void print_total(A a, B b, C c, D d, E e)
{
	// Add and print.  Addition operators taking void arguments are all
	// defined as no-ops.
	cout << a + b + c + d + e << endl;
}

To be fair though, this is probably an exceptional case. Most templates will probably fail to compile when compiled with void as an argument type. But that's all compile-time trouble, not run-time surprises, and if those cases can be fixed it may eventually make for more elegant programs.

Miscellany

Granted Wishes

Customized error messages

(This wish dates back to 2004. It was granted in C++11, as static_assert. Deleted member functions, also introduced in C++11, address one of the use-cases specifically.)

One of the really cool things about templates in C++ is that you can make them "break the build" when the program uses them the wrong way.

In libpqxx, for example, I have template functions for converting various types to and from a domain-specific string representation. Those templates are specialized for many built-in types, but not for char. That was a deliberate choice: a char may represent a character, but it could also be a small integer (unsigned if the program was written with portability in mind). So I'd rather have the build fail and force the programmer to work around it.

But when the build fails with a link error about the missing specialization, the programmer probably thinks it's a bug in the library. The message that the library doesn't want this code to compile doesn't come across.

I work around this by providing a specialization that invokes a function error_ambiguous_string_conversion. I declare that function but leave out any definition. A program that tries to convert a char to a string representation will now fail to link, and the build error will at least contain this helpful function name. Hopefully the programmer will try to figure out why the build failed, and end up at the declaration for error_ambiguous_string_conversion where the problem is documented.

But wouldn't it be nice if I could customize the error message completely? Imagine I could just say:

template<> std::string to_string(const char &)
[
  "String conversion for type char is ambiguous.  If you really mean a "
  "character, create a std::string directly.  If you mean a small integer, "
  "convert to int or another unambiguous integral type."
];

That would mean that this function is declared but has no definition. If the program tries to invoke it, the compiler will emit my custom error message to the user.

This would also be helpful for nocopy base classes: current practice for classes that shouldn't be copied is to declare the copy constructor and copy assignment operator but not define them. A custom error message would make that trick a bit more user-friendly.

"prohibited" protection level

(This wish dates back to 2004. C++11 lets you mark constructors etc. as =delete, which is even better than doing it through access protection.)

Imagine a protection level "beyond private," which doesn't allow any access at all — not even from within the declaring class.

"prohibited" would be useful for disallowing copy constructors, copy assignment operators or default constructors that the compiler would implicitly generate otherwise. Standard practice for accomplishing this is currently to declare them as private and deliberately fail to implement them, but it's sort of a dirty trick and errors may not be caught until link time. Make them prohibited and you'll get a compile-time error if you try to call them even from within the class itself. The new protection level would also be useful for implementing virtual member functions that should only be accessed through virtual calls. Virtual functions can be dangerous, believe it or not (ever called one from a constructor or destructor?) so it may be useful to restrict access to them even from within the defining class.

Update: David Vandevoorde tells me that work is being done on a special syntax to suppress automatic generation of constructors and assignment operators, making the "false declaration" trick unnecessary.

Ideally the new keyword would precede "private" in alphabetical order so it fits into the private - protected - public progression. But at least prohibited starts with a 'p'.

Update: Bart Samwel found a word that meets these requirements and has a fitting meaning: precluded.

Namespace inheritance

(This wish dates back to 2004. Inline namespaces, introduced in C++11, allow some of these ideas.)

It may not make all that much sense in practice, but allowing inheritance between namespaces would give protected a meaning in namespace context. Presumably it could be "tucked underneath" the definition of class inheritance in the same way that namespaces themselves have been turned into a fitting component of the existing class inheritance mechanism.

Static virtual functions

Once you start doing namespace inheritance, you might as well have virtual functions in namespaces (and, by extension, static virtual functions in classes as well). "Dynamic" dispatch (actually I suppose it could be optimized away at link time) would be based on the calling function's namespace, regardless of whether the implementation's declaration is known to the compiler at the call site.

History


Valid XHTML 1.0! Created with vim Back to Jeroen Vermeulen Home