How to efficiently find the n’th biggest element in a collection

Looking at std::nth_element the other day, I noticed that it’s complexity is O(n) for a collection of size n, and wondered how that was achieved. A basic implementation of an algorithm to find the ith-biggest element might start by sorting the collection and index to the ith afterwards – complexity O(n*log(n)):

int ith_element_by_sorting( std::vector<int> input, int i )
{
    std::sort( std::begin(input), std::end(input) );
    return input[i];
}

It’s a small step to realise then that you don’t need to sort all n elements of the collection – only the first i elements, so complexity O(n*log(i)):

int ith_element_by_partial_sort( std::vector<int> input, int i )
{
    std::partial_sort( std::begin( input ), std::begin( input ) + i + 1, std::end( input ) );
    return input[i];
}

But the real trick is that you don’t need to do any sorting at all. That’s the approach taken by quickselect, which is the selection sibling to quicksort, and achieves O(n) complexity on average:

int partition( std::vector<int>& values, int left, int right, int pivot )
{
     auto pivotValue = values[ pivot ];

     std::swap( values[pivot], values[right] ); // Move pivot to end
     auto store_pos = left;

     for ( int j = left; j < right; ++j )
     {
         if ( values[j] < pivotValue )
         {
             std::swap( values[ store_pos ], values[j] );
             ++store_pos;
         }
     }

     std::swap( values[right], values[ store_pos ] );  // Move pivot to its final place

     return store_pos;
}

int quickselect( std::vector<int> values, int left, int right, int i )
{
    // Unordered - no need to sort values at all, 
    // instead we recursively partition only the subset of values
    // containing the i'th element, until we have either
    // a) trivial subset of 1
    // b) pivot is moved to exactly the location we wanted 
    while( 1 )
    {
         if ( left == right )
         {
             return values[left];
         }

         // Pick a pivot from middle of values.
         // Better options are a random pivot or median of 3
         auto pivot = (left + right)/2; 

         // Move anything smaller than values[pivot] to the left of pivot,
         // and return updated position of pivot
         pivot = partition( values, left, right, pivot );

         if ( pivot == i )
         {
             return values[i];
         }
         else if ( i < pivot )
         {
             right = pivot - 1;
         }
         else
         {
             left = pivot + 1;
         }
    }
}

int ith_element_by_quickselect( const std::vector<int>& input, int i )
{
    return quickselect( input, 0, input.size()-1, i );
}

int ith_element( const std::vector<int>& input, int i )
{
    if ( i < 0 || i >= input.size() )
    {
        std::ostringstream ss;
        ss << "Input '" << i << "' outside range [0," << input.size() << ")";
        throw std::out_of_range( ss.str() );
    }

    return ith_element_by_quickselect( input, i );
}

Here’s some test code to check that the implementation works:

template <typename F, typename T>
void should_be( T t, F f, const std::string& message )
{
    try
    {
        std::ostringstream ss;

        auto got = f();
        if ( got != t )
        {
            ss << message << " got " << got << ", expected " << t;
            throw std::runtime_error( ss.str() );
        }
        else
        {
            ss << "OK: " << message << ": got " << got;
            std::cout << ss.str() << "\n";
        }
    }
    catch( const std::exception& ex )
    {
        // Report error if either f() threw or we found unexpected value
        std::cout << "ERROR: " << ex.what() << "\n";
    }
}

template <typename F>
void should_throw( F f, const std::string& message )
{
    try
    {
        f();
    }
    catch( const std::exception& ex )
    {
        std::cout << "OK: " << message << ": threw \"" << ex.what() << "\"\n";
        return;
    }

    std::cout << "ERROR: " << message << " should have thrown\n";
}

#define SHOULD_BE( t, expr ) should_be( t, [](){ return expr; }, #expr )
#define SHOULD_THROW( expr ) should_throw( [](){ expr; }, #expr )

void testIthElement()
{
    SHOULD_THROW( ith_element( {}, 0 ) );
    SHOULD_THROW( ith_element( {1,2}, -1 ) );
    SHOULD_THROW( ith_element( {1,2,3}, 3 ) );

    SHOULD_BE( 1, ith_element( {1}, 0 ) );
    SHOULD_BE( 0, ith_element( {0,1,2,3,4,5,6,7,8}, 0 ) );
    SHOULD_BE( 2, ith_element( {0,1,2,3,4,5,6,7,8}, 2 ) );
    SHOULD_BE( 6, ith_element( {5,4,7,6,1,2,0,8,3}, 6 ) );
    SHOULD_BE( 8, ith_element( {5,4,7,6,1,2,0,8,3}, 8 ) );
    SHOULD_BE( 5, ith_element( {5,5,5,5,5,5}, 1 ) );
}

Here’s the output:
QuickSelect

In fact, reviewing old posts on this blog, I found this link that dates back to 2013, when std::nth_element was only required by the standard to have worst-case O(N^2) complexity, with O(N) on average – precisely what you’d get from quickselect. Now though, thanks to introselect, O(N) is possible.

Leave a comment

Filed under C++, C++ Code, Programming

Tech Book: Effective Modern C++, Scott Meyers

EffectiveModernC++When Scott Meyers announced his retirement from C++ duties, I thought I’d get a copy of his latest book. There’s plenty in it to interest even seasoned C++ developers – as always, Scott’s insight and in-depth examples are worth studying. I’ve tried out most of the C++11 features, but still learnt a lot working through this book.

Uniform Initialisation and Auto
Meyers points out that compared to parentheses or just plain “=”, braced initialisation can be used in the most contexts, avoids his famous “C++’s most vexing parse”, and will issue compiler errors if narrowing occurs. However, for types that can take a std::initialiser_list in their constructor, the compiler is required to favour std::initialiser_list over any other interpretation. That’s a deterrent to use of braced initialisation with types that have a std::initializer_list constructor – and also precludes using auto with braced initialisation.

auto x = { 4 }; // type of x is inferred to be std::initializer_list&lt;int&gt;!

How to use the new “= delete” idiom
Meyers recommends declaring deleted methods as public to get better error messages from compilers that check for accessibility before deleted status. He also points out that you can declare non-member functions as deleted (I’d only seen examples of copy-constructor/copy-assignment operator as deleted before).

Use of noexcept on move operations
This one is a classic example of a combination of language features having an unexpected effect. std::vector::push_back offers the strong exception guarantee. It can achieve this using any copy constructor and being sure not to change the original contents of the vector until any new elements or copies have been constructed. Hence, move constructors are only preferred if it is exception-safe to do so – which means they must be marked as noexcept (and if that isn’t appropriate, you just won’t get move semantics pushing instances of your type into a std::vector).

Recursion with constexpr
Having toyed with template meta-programming in the past, this use of constexpr appeals to me:

constexpr int pow( int b, int exp ) noexcept
{
    return (exp == 0 ? 1 : b * pow( b, exp-1 ));
}
constexpr auto eight = pow( 2, 3 );

void testConstExpr()
{
    std::cout &lt;&lt; &quot;2^3 = &quot; &lt;&lt; eight &lt;&lt; &quot;\n&quot;;
}

It’s much more succinct than the equivalent TMP version, and still performs the calculation at compile time.

Summary
In the past, I’ve recommended Effective C++ for developers who have little experience of the language and wish to improve their skills. I think this book is a bit too advanced for that, particularly given the chapters on template type deduction and decltype being in the first part of the book! So read this book if you’ve got a few years’ experience of C++ already, but look at Effective C++ (3rd Edition) to improve on the basics.

Four stars

Leave a comment

Filed under C++, C++ Code, Programming, Tech Book

C++14 Language Extensions

Whilst reading Scott Meyers’ Effective Modern C++, I was looking for some clarification on ‘generalised lambda capture’ and came across this C++ 14 feature list. It was interesting to compare to the list of expected features that Bjarne Stroustrup presented at the ACCU 2013 conference – he described all of the following, which made it into C++14:

  • Generalised return type deduction for normal functions, not just lambdas
  • Generic (polymorphic) lambdas
  • Generalised constexpr functions (listed as Extended constexpt on the feature list)
  • Binary Literals
  • Digit Separators
  • Deprecated Attribute

However, this one appears not to have made the cut:

  • Runtime-sized arrays

This would be a great new feature and I blogged about it at the time – but it appears the work has stalled.

Leave a comment

Filed under C++, Programming

BCS Meetup – The Trust Machine, Simon Taylor

BCS LogoThe BCS hosted a presentation by Simon Taylor of Barclays Bank.

Meetup - The Trust Machine

With the full potential of the uses of blockchain databases still yet to be discovered, there is a race, led by financial services, following the growth of Bitcoin, to find new, transformational business models that will exploit this technology.

Simon will explain that, while a Blockchain is just a kind of shared database, it is quite different because it creates a system of transparent, unalterable and permanent records of agreements. The impact of this is to allow all parties to organise themselves without fear of cheating. Simon will explain how cryptographic keys are used to create this trust, how it stops any single party from having more control than any other, and how, by allowing parties, for instance a buyer and a seller, to directly connect, it is seen as an opportunity and a threat by established trusted intermediaries, like his own organisation.

Simon covered a lot of interesting use cases in his presentation and shone light on the differences between bitcoin/blockchain as a technology. For example, this scenario was in his talk and on his blog:

Problem: When an investor comes to raise a Seed, things get messy, people fall out. Solution: Store this on a Blockchain at NASDAQ which has perfect time stamps, and digital signatures and no database administrator can edit the record without signatures of the founders. Could you do this with a database? Sure, but you’d lose that audit trail.

There are plenty of links to further reading on the blog post, 10 things you should know about Blockchains.  Also, What is the Blockchain and why should you care on Recode is worth a read.

Leave a comment

Filed under Meetup, Technology

F# Meetup – Microservices Chaos Testing

Meetup - Microservice Chaos TestingThis was the first event I’ve attended by the F#unctional Londoners group and the venue at Skillsmatter.com was excellent.

Some of the biggest growing pains we’ve experienced with our microservice architecture at Jet is in preparing for system outages.

In this talk, Rachel will discuss Jet.com’s chaos testing methods and code in depth, as well as lay out a path to implementation that everyone can use.

I haven’t used chaos testing in my own work, but in the world of distributed services, it makes sense to test the robustness of the system to failures on individual nodes. Rachel’s story was quite compelling, even if her own developers aren’t all convinced of the attractions of chaos testing on the production system just yet!

Leave a comment

Filed under F#, Meetup, Programming, Technology

Calculator App for Apple Watch

I was surprised that Apple didn’t include a calculator app with the Apple Watch. In the 1980’s, calculator watches were cool – having missed out then, I was keen for my Apple watch to have one.CasioCalculatorWatch

I chose to write my own calculator WatchKit App for fun, rather than purchasing one from the App Store. It’s a good choice for a first project, because the user interface is static and the focus is more on getting up the learning curve of WatchKit development. Here are some of the lessons I learnt in the process.

How to get the text from a WatchKit label
Suppose you’ve set up an outlet for a label that displays the input figures in your app – then you’d expect to be able to get the text back from it:

@IBOutlet var labelSum: WKInterfaceLabel!
//...
labelSum.setText( "42.0" )
let digits = labelSum.getText() // error - does not compile

It seems this is not supported in Xcode 7.2/Watch OS 2.1. Instead, you have to track the state in a variable and use that to populate the label.
How to store state in a WatchKit App
For a calculator App at least, you need a state machine to keep track of state, because at different stages you may be inputting the first or second number in the calculation, or re-using the previous answer in another calculation. Swift enum is a discriminated union that is well suited to this:

// Define enum type outside your interface controller
enum CalculationState
{
    case BuildingLHS( String )
    case BuildingRHS( LHS : String, Op : Operation, RHS : String ) // Waiting for Equals
    case WaitingForOperation( String ) // Got answer already, may do another operation
}

// Declare a variable to hold the state 
// as a member inside the interface controller class
class InterfaceController: WKInterfaceController {
    // ...
    var state : CalculationState
}

Use ‘RelativeToContainer’ to scale layout
When defining the UI elements on your story board, the interface controller is only approximately the size of the watch screen. My UI has all the buttons displayed at once, so it’s important to make maximum use of the screen size.
Calc - RelativeToContainer
Here, buttons 7, 8 and 9 are in a horizontal group. To fill that container, we need to use the ‘RelativeToContainer’ scaling style and fill in the proportion each UI element should take. For height, it’s 1 (i.e. the whole container), whereas for width, it’s one third (i.e. 0.33-ish). Personally, I think it would have been more obvious that this is the scaling style to choose if the proportion was displayed as a percentage, rather than a value relative to one.
How to set completion after animation
The WatchKit animation API lacks the ability to specify a completion function that runs after the initial animation changes. This is awkward if you want to flash a UI element from one colour and return to the original colour – if you run both animations in parallel, the colour doesn’t change. I used this code to add an extension method – then I could easily flash UI elements as below:

    func flashSum( origColor : UIColor, flashColor : UIColor ){
        animateWithDuration(0.25,
            animations: { () -> Void in self.labelSum.setTextColor( flashColor ) },
            completion: { () -> Void in self.labelSum.setTextColor( origColor )}
        )
    }

How to run WatchKit App on hardwareThis should be as simple as plugging the hardware into your MacBook (i.e. the iPhone that’s paired to the Apple Watch), selecting the correct device from the devices list, then running the App in the debugger. However, there are numerous pitfalls:

  • You need either a developer licence or a personal team set up. See Team | Identity in the Project settings
  • Xcode may think that the iPhone and Watch are unpaired – restarting Xcode solved this one for me, other people have had to re-boot their watch and/or phone
  • Xcode may not have the symbols for your Watch OS (this happened after I updated to Watch OS 2.1) – however, it seems happy to download them once you connect to the internet

Re-booting, re-connecting the phone to the MacBook, re-starting Xcode eventually sorted this out.
Conclusion
I’d already worked through a couple of tutorials for writing WatchKit apps, but you learn far more by writing your own.
Calc On Watch
The end result looks great, although the buttons are slightly too small for frequent use, even on a 42mm Apple Watch.

1 Comment

Filed under Programming, Swift

IET Meetup: Turing Lecture 2016, The Internet of Me, Robert Schukai

IETLogoThis years Prestige Turing Lecture was given by Robert Schukai, Head of Applied Innovation for Thomson Reuters.

TuringTrustAn opening address was given by the great nephew of Alan Turing, on behalf of TuringTrust.co.uk. This organisation builds on the legacy of Turing by distributing pre-owned computing equipment, both in the UK and particularly to schools in Africa who have no facilities.

The lecture hall was completely sold out for this talk and the speaker lived up to his billing, giving a mixed media presentation with great passion and insight. He took the audience on a journey from the introduction and incredible growth of mobile technology, both in terms of number of users, speed of data transfer, and bulk of data stored annually. Then he showed that, with the advent of new applications such as Genomic Profiling, and/or the Internet of Things, today’s data footprint will be blitzed by that of the future.

He introduced Cognitive Computing by way of IBM’s Jon Iwata talking about Watson and Robert Schukaithen illustrated the rate of progress of Artificial Intelligence with reference to Google’s Deep Learning having mastered the game Go a full decade earlier than predicted.

Schukai’s vision for the near future is of “DayFlow“, a seamless user experience where their needs are met across devices, throughout the day, with content being proactively displayed just as the user needs it.

I suspect my iPhone is already there – it volunteers the Bus Timetable app first thing in the morning, just as I’m ready to leave for work.

Leave a comment

Filed under Meetup, Technology

Book Review: Angel’s Flight, Michael Connelly

AngelsFlightThis is another Harry Bosch thriller by Michael Connelly. This one is set at a period of time when Harry is married to Eleanor, but the marriage is in trouble. A bad time, then, to be assigned to a highly sensitive case which could trigger riots in the discontented city if handled injudiciously.

Bosch has to handle inter-departmental politics and work with a team headed by Chasten, his sworn enemy who has investigated his own conduct in the past. The case centres on the death of Elias, a celebrity lawyer known for taking, and usually winning, cases against the LAPD.

Harry is assisted by Kizmin Rider and Jerry Edgar, but has to watch out for a high-level leak from within the case. It’s a good read, though the background case of a murdered young girl is rather harrowing.
Three Stars

Leave a comment

Filed under Book Review

IET Meetup – Wireless Charging Buses in the UK

IETLogoThis is the first meeting I’ve been to at the IET since the refurbishment of the premises at Savoy Place, London.

Savoy Place

The location has always been amazing, but now the venue makes the most of it, with new entertainment space on the first floor leading to balconies that overlook the River Thames.

The presentation by Murat Basarir covered a Joint Venture by Arup and Mitsui to trial 8 electric buses on an urban bus route in Milton Keynes. The linear route is 24km long and takes about 50 minutes from one end to the other. Whilst they considered many battery charging options, they decided to install wireless charging plates into the road at the terminals, giving the bus 10 minutes to top-up before it turns round and runs the route in the opposite direction. There’s no need for the driver to physically plug in or swap batteries, so s/he gets a ten minute breaks, as per routine with diesel buses. The regular top-ups increase the range of the bus, so that it only needs a longer, trickle charge overnight back at the depot.

Leave a comment

Filed under Meetup, Technology

How to conditionally compile against different .NET framework versions in F#

I recently needed to supply two variations of an algorithm in an F# source file – one for .NET 3.5 consumers and another for .NET 4.0 consumers (the later framework version allows parallel execution using the FSharp.Collections.ParallelSeq). One way is to tell msbuild to reference the library, conditional on the version of the .NET framework – and also, define a constant which can be used within the source code to choose the algorithm.

In msbuild, define a constant according to the framework

<PropertyGroup Condition=" '$Framework' == 'NET35' ">
  <DefineConstants>NET35</DefineConstants>
</PropertyGroup>
<PropertyGroup Condition=" '$Framework' == 'NET40' ">
  <DefineConstants>NET40</DefineConstants>
</PropertyGroup>

In msbuild, reference library if .NET 4.0

<ItemGroup Condition=" '$Framework' == 'NET40' ">
  <Reference Include="FSharp.Collections.ParallelSeq"/>
</ItemGroup>

Specify conditional compilation directives in the source code

let doMapping = ... // some transformation
#if NET35
    items |> Seq.map doMapping
#else
    items |> FSharp.Collections.ParallelSeq.PSeq.map doMapping

Leave a comment

Filed under F#, Programming