Complexity of std::list::size() is... O(N)

I want to share my recent "discovery" that was very shocking to me. Reading comments to a slashdot story about the C++0x standard I read something very interesting: std::list::size() has O(N) complexity in the gcc compiler.

For many years I thought it's obvious that std::list::size() has constant complexity and I've used it relaying on that "fact". I was more convinced by some nice sites with C++ documentation I use like cppreference.com. I was really surprised by the slashdot comment and posts it refers to, so I've made a simple benchmark:

  1. #include <iostream>
  2. #include <list>
  3.  
  4. using namespace std;
  5.  
  6. int main ()
  7. {
  8. list< int > l;
  9.  
  10. for (int i = 0; i < 1000000; i++)
  11. l.push_back (i);
  12.  
  13. for (int i = 0; i < 1000000; i++)
  14. cout << l.size () << endl;
  15.  
  16. return 0;
  17. }

I've compiled it with gcc 4.4.5 (standard GCC in the current Ubuntu), run this program and waited forever to finish... So it's true! Using gdb or doing a little research one can find that calling std:list::size() actually does this:

  1. typename iterator_traits<_InputIterator>::difference_type __n = 0;
  2. while (__first != __last)
  3. {
  4. ++__first;
  5. ++__n;
  6. }
  7. return __n;

Comments

list.size is O(1) in C++11

In C++11, list's size cost const time.

Why is it shocking to you?

Why is it shocking to you? Did you ever implement a list yourself? Did it have a size readily available or was is just a bunch of linked nodes?

Because it seems easy to have

Because it seems easy to have a variable that holds the number of items in the list and is updated every time the list is modified. The size() method in hash based structures like unordered_map is constant so I wrongly assumed size() for a list is also constant and didn't see any reason why not to implement a counter variable.

List is not vector

Set aside any performance consideration, this is the correct thing to do. Different containers, different purpuse. With a vector, it's easy, because it is only a ( __first - __last) calculation, and a list could be expanded by their nodes, which the main container know nothing about. If you store the size in the list object, you could end up with bad values. Fortunately, if you decide to use list for some reason, probably you won't query the size a lot of times because the task it solves are generally not requiring such a thing. Occasional size query with O(N) complexity isn't a huge deal.

Even more shocking is the

Even more shocking is the memory profile on all those STL containers. I wrote a blog post about my quick findings: http://gjcarneiro.blogspot.com/2010/10/c-stl-containers-memory.html

I've just seen that the new

I've just seen that the new standard is going to make size() a constant time operation. I think this is a bad move because I rely on having constant time splice(). They should have removed the size member as they did for forward_list. It is better to break source code than to break at runtime. Maybe the GCC folks will not implement the constant size().