Raw loops vs. STL algorithms

published at 04.02.2016 10:46 by Jens Weller

Since last week I am working on my CMS for static HTML pages again, and so the series about Building applications with Qt and boost continues. Today its about using STL algorithms, or how Sean Parent once said "no raw loops!". Now, I am not Sean Parent, and not even the implementers of the STL are perfect. Most code which I write is application code, which then powers Meeting C++. Also, I don't know all STL algorithms, and some times its just to tempting to write a little loop instead of searching the STL for the specific algorithm. Yesterday I had such a case.

I'm working on the generation of the menu, which is an important part towards generating the actual HTML with boostache. So, connected to this, is the problem to find the relative path between two pages. Directories form a tree, and in every directory there is at least one html page. Given two pages, what is the relative path between page A and the current page?

Raw loop solution

There is no STL algorithm to solve this in one go, I thought about using boost::filesystem, as it has a relative path function, but my data model is actually a tree of directories and pages, so I am not sure if boost::filesystem would also work on non existing paths. After all this code needs to be executed before a single file is written.

The core of the raw loop solution are 3 raw loops:

auto& vec_cp = dircache[current_page->id()];//menu page dir1/dir1_1/p
Page* p = node->get< Page >();
auto& vec_tp = dircache[p->getId()];// this page  dir1/dir1_2/p
size_t same =0;
while(vec_cp.size() > same && vec_tp.size() > same && vec_cp[same] == vec_tp[same])
std::string path;
for(size_t diff_cp = vec_cp.size() - same;diff_cp > 0;--diff_cp)
    path += "../";
for(size_t diff_tp = vec_tp.size() - same; diff_tp + same < vec_tp.size(); ++diff_tp)
    path += vec_tp[same + diff_tp] + "/";
path+= p->getName() + ".html";

Counting the loop and the variable declaration the raw loop is 7 loc, and maybe there is a clever trick to replace the first for loop. Both for loops can easily be written as for_each, but what is with the while loop? Which STL algorithm could handle this? Running on two ranges at the same time, also checking which range ends first?

STL algorithm solution

As I mentioned, turning the for loops into std::for_each and using a lambda for the body is kind of easy. The while loop is replaced by std::mismatch, giving the first iterator pair where the values mismatch:

auto& vec_cp = dircache[current_page->id()];//menu page dir1/dir1_1/p
Page* p = node->get< Page >();
auto& vec_tp = dircache[p->getId()];// this page  dir1/dir1_2/p
auto it_pair = std::mismatch(vec_cp.begin(),vec_cp.end(),vec_tp.begin(),vec_tp.end());
std::string path;
std::for_each(vec_cp.begin(), it_pair.first,[&path](const std::string&){path += "../";});
std::for_each(it_pair.second, vec_tp.end(),[&path](const std::string& s){path += s +"/";});
path += p->getName() + ".html";

This code uses the C++14 version of mismatch, earlier standards are lacking a version, where it didn't matter which of the two ranges was shorter, the first had to be the shorter one. The STL solution is only 3 lines of code, and uses iterators instead of indexes. The variable same and the counters are not needed anymore.

Ranged for loops

Are the new fancy C++11 range for loops also raw loops? A ranged for loop can be replaced by std::transform or std::for_each. Unlike these algorithms, the for loop is usually only executed over the whole range. The STL algorithms offer more flexibility regarding the start and end. Sean Parent noted in his C++ Seasoning talk, that its ok to use range based for loops in the role of transform and for_each:

I think, when you need to iterate over the full container, a range based for loop is more readable, then a transform or for_each. So I prefer it, but it always should be exactly one of the two.

As an important difference is that inside a raw loop certain keywords can change the behavior of the loop: break, continue and return. Usually these are accompanied with an if/else, and are often replaceable with an *_if algorithm:

Or simply can be applied inside a predicate in algorithms like rotate, partition, and maybe any_of, all_of or none_of?

So no raw loops?

The range based for loop is the only loop which is not inherited from C in C++. So, when ever you write raw loops, your code is closer to C (with classes) then to C++ how its understood today. The STL wraps the different use cases of loops in named algorithms, and hence makes your code more readable. The used algorithm hints what the code is doing, once you start using algorithms, raw loops will look like boiler plate code to you.

Yet, you only can apply STL algorithms, if you have the interface to do so. Which means begin and end iterators. When I work with Qt, there are some cases, when you only get the count and a method to access item n, like in a QListWidget:

for(int i = 0,s=ui->lst_feeds->count();i < s; ++i)
    auto* item = ui->lst_feeds->item(i);
    auto si = item->data(Qt::UserRole).value< FeedItem::SharedItem >();
    if(si && si->contains(list))

This code checks the list items, if they are in the current data set selected. The QListWidget displays the available feeds inside a panel for lists, so when a list is selected, you can check in which feeds it will appear.

But, of course there is also a solution to this, you can write a proxy that has a iterator facade towards the STL, and wrapps the access pattern of QListWidget.

So for most of us, it could be no raw loops except in legacy code that we can't change.