Meaningful Connections @ Facebook

Facebook invited me (and a few hundred others!) to a private event at their London offices to network and hear presentations from several of their engineers.

Allan Mertner – Welcome
Allan was previously at King (makers of Candy Crush) and joined just five months ago. He leads the Ads group in London – the largest group in Facebook London. The culture is to move fast and ship things, to make things better – at scale.

Damien Lefortier – Causal Modeling: Delivering Incremental Advertiser Value
Incrementality
A conversion is any event of interest on the advertiser’s side (subscribing, purchasing, etc.). An incremental conversion is one caused by the ad that would not have happened otherwise, so reporting incremental conversions is a goal for any advertising platform. But incrementality is hard to assess: how do you account for people who would have discovered the advertiser anyway?

Attribution frameworks
Attribution is used as a proxy for incrementality, determining which conversions count for both reporting and delivery. E.g. 1-day or 7-day click-through: record that the person clicked the ad, then count the conversion if it happens on the advertiser’s site within that window. Or 1-day view-through: the person was shown the ad and converted within a day. But not all attributed conversions are incremental, since some would have happened anyway.

Delivering incremental value
First you need to measure incremental conversions! Only then can you assess whether you’re improving your delivery. One way is to take random samples of people who are shown or not shown the ad, and compare how many convert in each group.
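
The shown/not-shown comparison can be sketched in a few lines of C++. This is only an illustration of the arithmetic, not Facebook's method: the GroupStats type, the function names and the numbers in the usage note are all assumptions made for the example.

```cpp
// Hypothetical summary of one randomly sampled group of people.
struct GroupStats {
    long people;       // size of the group
    long conversions;  // how many of them converted
};

constexpr double conversion_rate(GroupStats g) {
    return g.people > 0 ? static_cast<double>(g.conversions) / g.people : 0.0;
}

// Estimated number of conversions in the exposed group that were caused by
// the ad: observed conversions minus the conversions we would expect from
// the same number of people at the holdout (not shown) rate.
constexpr double incremental_conversions(GroupStats exposed, GroupStats holdout) {
    const double expected_anyway = conversion_rate(holdout) * exposed.people;
    return exposed.conversions - expected_anyway;
}
```

With made-up groups of 1000 people each, 50 conversions when shown the ad and 30 when not shown, the estimate is roughly 20 incremental conversions; the other 30 would have happened anyway.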

Causal modelling – model the conversion probability both given ad exposure and given no ad exposure. No need for attribution with this approach. E.g. 1 – Natural experiments: use the natural randomisation in the system to get data, then use it to train a model (but this needs correction for biases). E.g. 2 – Small randomised trials: collect data directly, but this typically means less data to train models.

A model can not only be used for measuring, but also predicting the impact of an ad campaign. It’s a hard, and interesting, machine learning problem. But it actually works – such as for Brand Awareness following ad campaigns.

Brian Rosenthal – How Facebook culture influenced our ad system evolution
Brian has been at Facebook for over 8 years and has worked on much of the Ads codebase.

Mission driven, monetisation business. Facebook don’t want you to see ads that waste your time and the advertiser’s money. This helps frame the problem as one of efficiency – which is appealing to engineers!

Stage 1: Launching Quickly
Ads started in 2004/2005 – it was initially a college site, so the adverts were related to colleges, storage and moving companies. Ads were integrated as part of the social media interface, not just as banner ads. They had an advert creation interface, where you could add your ad, with identification of who was running it. Then in 2007 came an auction where you could bid per click (a higher bid increased the likelihood of the ad being seen). In 2009, you could finally edit your ads (previously, they could only be created or bulk uploaded).

Stage 2: Embrace best practices
They figured out the right principles: unlike stage 1 (which was quick and dirty to launch), they now really understood the problems. Bugs caused by database inconsistencies were fixed by adopting modular, transactional units, which took 12 months to ensure consistency (half of that time spent writing tests). They enforced a strict API layer (so internal code had to use the same API as outside users). The auction functionality matured into multi-objective optimisation: you could now bid on all the actions you wanted.

Stage 3: Deeper lessons
Try things again, decouple concerns. In 2011, the world was moving to mobile, but ads didn’t work on mobile! They tested ads on mobile and it was a failure. Trying again with deeper knowledge helped to get it right. Decoupled concerns – an early design error was to have objectives and ad formats entangled. These were separated so that every format supported every objective and the code was independent. Dynamic Ads introduced – made the ad more suited to the person (‘customer segmentation’). Objective-based ad buying – do you want to increase adds to cart or checkouts?

Current problems
Scaling, fighting fraud, quality, malicious actors. Biggest thing has been to be adaptive, to change what they do to meet demand.

Ben Savage – Reducing Unintentional Clicks
Ben joined Facebook in 2013 – initially thinking it would be for a year or so, but has stayed because it’s a fun place to work!

Facebook has a long term view – they need their advertisers to be happy, getting return on investment and increasing their budget year-on-year. This enables Facebook to increase the sophistication of their offering – they used to bill by Clicks, now can model conversions.

Audience Network – the problem was that the value of a click was lower than that on Facebook news feed. They saw that the number of clicks reported was not well associated with the number of significant website sessions. Hypothesis – this poor conversion was due to unintentional clicks. This was frustrating for the users and reduced the value per click.

Tinder uses audience network for its ads. The click goes to Web View – so you can measure the time spent in the browser (before they go back to the launching app). At Facebook, Data Wins Arguments. If there are short sessions, it’s likely the click was unintentional. Could they remove the financial incentive to get these unintentional clicks?

Well, what if they stopped billing advertisers for such clicks? Facebook blogged that it would stop billing for them, even though it reduced revenue, because it’s the right thing to do. As an engineer, that’s pretty rewarding. The hope is that advertisers would then re-design ads (such as removing active whitespace areas) because clicks on those areas would no longer be counted anyway.
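
The dwell-time idea can be sketched as a simple filter. This is a hypothetical illustration rather than Facebook's implementation: the Click type, the is_billable name and in particular the two-second threshold are assumptions for the example.

```cpp
#include <chrono>

// Hypothetical record of one ad click: how long the user stayed in the
// web view before returning to the launching app.
struct Click {
    std::chrono::milliseconds dwell;
};

// Assumed threshold: sessions shorter than this are treated as unintentional.
constexpr std::chrono::milliseconds kMinDwell{2000};

// A click is billed only if the session lasted at least the threshold.
constexpr bool is_billable(Click click,
                           std::chrono::milliseconds threshold = kMinDwell) {
    return click.dwell >= threshold;
}
```

Under these assumptions a 300 ms bounce is not billed, while a 5 second session is.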

Wrap up
The panel answered questions from the audience. One development they would love to see, for users who allow the Facebook mobile app to track their location, is linking the showing of an ad to the user physically visiting the store to make a purchase. A feature that’s already available is “Why am I seeing this?”, for when you’re wondering why a particular ad was selected for you.

Filed under Meetup, Technology

Book Review: The Third Option, Vince Flynn

This implausible thriller features Mitch Rapp, an assassin who works undercover for the US government, but wishes to retire and settle down with Anna. Anna is the love of his life, a TV news reporter whom he met whilst saving the life of the President, in a previous adventure.

Mitch is assigned one last job – to assassinate an arms dealer who has broken sanctions by selling to terrorists. Mitch (aka “Iron Man” because he’s invincible) is the Third Option – when the government sees that diplomacy has failed and military action is not viable, they turn to him. The job goes smoothly – that is, until Mitch is shot by one of his own team, a double agent working for a shadowy figure who wishes to discredit the President. Apparently, an undercover operation like this blowing up in a foreign country would be enough to destabilise the leadership of the CIA, leaving the door open for a Presidential bid.

Mitch hooks up with a team to track down the traitor and exact revenge. A counter-team is also assigned by the shadowy figure to track down the traitors and eliminate them before Mitch finds them (otherwise, they might talk, you see). And another lone assassin (a top European model no less) is engaged to terminate the leader of the counter-team. Seriously.

Three stars

Filed under Book Review

C++ London Meetup: Distributed C++

This is the first time that the Stockholm C++ and London C++ groups have combined to produce a series of lightning talks, half given from Sweden and half from England.

Bjorn Fahller: A variant of recursive descent parser
Raised issues with generators and lexers when trying to do this, particularly for debugging. What about std::variant? It knows the type that it’s holding, and std::visit calls whichever function call operator overload of the visitor matches that type.

// visitor class is defined elsewhere, too far from where it is used!
visitor visitor_obj;
int i = std::visit(visitor_obj, variant_var);

// Alternative - Bjorn's overload approach with lambdas
auto t = lexer.next_token();
std::visit(
  overload{
    [=](ident var){ return lookup(var.value); },
    [=](number n){ return n.value; }
  },
  t);

The overload template makes clever use of deriving from each lambda via variadic templates, then exposing every inherited call operator with a C++17 pack expansion in a using-declaration.
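
For reference, a minimal sketch of the overload idiom follows. The ident and number token types, the token alias and the lookup function are hypothetical stand-ins named after the snippet above, not code from the talk.

```cpp
#include <string_view>
#include <variant>

// Inherit from every lambda and expose all of their call operators.
template <typename... Fs>
struct overload : Fs... {
    using Fs::operator()...;
};

// Deduction guide so that overload{lambda1, lambda2} works in C++17.
template <typename... Fs>
overload(Fs...) -> overload<Fs...>;

// Hypothetical token types for a tiny expression lexer.
struct ident  { std::string_view value; };
struct number { int value; };
using token = std::variant<ident, number>;

// Hypothetical symbol table: only the name "answer" is defined.
constexpr int lookup(std::string_view name) {
    return name == "answer" ? 42 : 0;
}

// std::visit selects the lambda whose parameter matches the held type.
constexpr int evaluate(const token& t) {
    return std::visit(
        overload{
            [](ident var) { return lookup(var.value); },
            [](number n)  { return n.value; }
        },
        t);
}
```

The handlers sit right next to the call to std::visit, which is the readability gain over a visitor class defined elsewhere.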

Bjorn also referenced Matt Kline’s post “std::visit is everything wrong with modern C++”.

Mikael Rosbacke – Yet another state machine tool
How to make it easy to write state machines? Hierarchy, entry/exit action, queuing, no heap allocations, no code generation.

Mikael’s framework provides a finite state machine base class from which you derive and post events as they arrive. Each state derives from a framework state base class, and provides an event method to transition into other states.

See https://www.github.com/rosbacke/mcu-tools.

Simon Pettersson – The Art of manufacturing types
Compile time lookup-table – with a very convincing demo in compiler explorer, where marking the lookup result as constexpr changed the compiled code to simply a constant.

See https://github.com/simonvpe/cmap
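
The effect shown in the demo can be reproduced with a generic sketch like the one below. It is not the cmap API: the table contents and the lookup function are assumptions chosen for illustration.

```cpp
#include <array>
#include <string_view>
#include <utility>

// Hypothetical compile-time table mapping names to values.
constexpr std::array<std::pair<std::string_view, int>, 3> kTable{{
    {"one", 1}, {"two", 2}, {"three", 3}}};

// Linear search that is usable in constant expressions.
constexpr int lookup(std::string_view key, int fallback = -1) {
    for (const auto& entry : kTable) {
        if (entry.first == key) return entry.second;
    }
    return fallback;
}

// Marking the result constexpr forces the search to run at compile time,
// so only the constant itself needs to appear in the generated code.
constexpr int two = lookup("two");
static_assert(two == 2);
```

Dropping the constexpr on the result leaves the compiler free to emit the search loop at run time, which is the difference the compiler explorer demo made visible.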

Paul Dreik – what is this std::forward thing?
A beginner’s guide to forwarding references – suppose you want to write a wrapper function that does some action then calls an underlying function. Inside the wrapper the named parameter t is always an lvalue, so without std::forward the f(S&) overload would be called even when the caller passed a temporary. You need to use std::forward like this:

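#include <cstdio>   // puts
#include <utility>  // std::forward

struct S{};

void f(S& s){ puts("f(S&)"); }
void f(S&& s){ puts("f(S&&)"); }

template<typename T>
void wrap(T&& t){
  // forwards t with the value category the caller used
  f(std::forward<T>(t));
}

int main(){
  S s;

  wrap( s );    // lvalue: prints f(S&)
  wrap( S() );  // rvalue: prints f(S&&)
}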

Dominic Jones – Reflecting on names – Facilitating expression tree transforms
Would like to transform expressions differently according to the variable inputs:

transform(a + a) -> 2 * a
transform(a + b) -> a + b

Thoughts are to introduce varid, a keyword based on the position of the declaration of the variables. Evaluated at compile time, a bit like address-of, possibly the hash of the file name, row and column of the referenced variable. The use is for faster automatic differentiation.

Phil Nash – a Composable Command Line Parser
When writing Catch 1.0, Phil wrote Clara 0.x – a command line parser library (but it never reached maturity). As part of Catch 2.0, he has written and completed Clara 1.0! The latest version of the library introduces class Opt which declares command line options, which are then combined with the pipe operator.

auto a =
      Opt( width, "width" )
        ["-w"]["--width"]
        ("How wide should it be?")
    | Opt( name, "name" )
        ["-n"]["--name"]
        ("By what name should I be known");

See github.com/philsquared.

C++ London University
Tristan is a volunteer at this initiative to support interested parties in learning C++. Hosted by Mirriad, it comprises a mixture of lectures and exercises covering the basics. There have been 4 meetings so far, and they are looking for tutors and more students! Tristan Brindle – tcbrindle@gmail.com, http://www.cpplondonuni.com, github.com/cpplondonuni

Ian Sheret – Automatic Differentiation in C++
Automatic differentiation is useful for mathematical functions in various domains. Take z = f(x,y), e.g. the hypotenuse of a right-angled triangle as the root of the sum of squares. But how does z change if x or y changes by some epsilon? You can use Ceres Jets to keep track of the derivatives. Templatise the function so that you can pass in Jet variables (instead of simply doubles). Then the return value automatically carries the derivatives as well as the function value!
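
The idea behind Jets can be sketched with a hand-rolled dual number. This is a simplified stand-in that tracks a single derivative, not the Ceres API: the Dual type and both function names are assumptions for the example.

```cpp
#include <cmath>

// Minimal forward-mode dual number: a value plus its derivative with
// respect to one chosen input.
struct Dual {
    double value;
    double deriv;
};

constexpr Dual operator+(Dual a, Dual b) {
    return {a.value + b.value, a.deriv + b.deriv};
}

constexpr Dual operator*(Dual a, Dual b) {
    // product rule
    return {a.value * b.value, a.value * b.deriv + a.deriv * b.value};
}

inline Dual sqrt(Dual a) {
    // chain rule: d/dx sqrt(u) = u' / (2 * sqrt(u))
    const double root = std::sqrt(a.value);
    return {root, a.deriv / (2.0 * root)};
}

// Templated so the same code accepts plain doubles or Dual values.
template <typename T>
constexpr T squared_hypotenuse(T x, T y) {
    return x * x + y * y;
}

template <typename T>
T hypotenuse(T x, T y) {
    using std::sqrt;  // std::sqrt for double, the overload above for Dual
    return sqrt(squared_hypotenuse(x, y));
}
```

Seeding x with derivative 1 and y with derivative 0, hypotenuse(Dual{3.0, 1.0}, Dual{4.0, 0.0}) yields the value 5 together with the derivative 0.6, which is x/z for the 3-4-5 triangle.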

Andrew Gresyk – Effective Screening Interviews
Consider cultural fit and technical fit – but if they do a separate cultural interview, 80% of people pass that anyway. Need to test communication and ability to write real code – asking knowledge questions only is not sufficient. Replace questions with practical exercises.

Jamie Taylor – Beyond SSO: The Merits of Fixed-Length Strings
std::string – easier and safer to work with than a C string. It handles memory allocation, and uses the short string optimisation (SSO): short strings are stored in a buffer inside the string object itself, so no dynamic allocation is needed.

But the length of the SSO buffer is implementation-defined. If the string is too long, it will dynamically allocate. Also, the std::string object (typically 32 bytes) will frequently be much larger than a very small string you wish to create.

// fl::string is an alternative where the size of the buffer
// is customisable in the template parameters
#include <cstddef>

namespace fl {
template<std::size_t length>
class string
{
  // Implementation elided!
  char m_data[length];
};
}

Don’t want to pay full cost of e.g. 32 byte std::string if your string will only be 3 characters. This allows you to specify the maximum length in the template. Get average 6x and 13x speed-ups for 8 and 32 character strings for both creation and access of string keys in maps. Can be more friendly for your cache and great for low-latency.

Vittorio Romeo – you must type it out 3 times
Vittorio wants to write a generic log and call method with a noexcept handler – but has to type the same forwarding signature three times! Solution could be the “=>” operator that’s been proposed.
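
The repetition looks something like the sketch below, where the same forwarding expression appears in the noexcept specification, the return type and the body. The log_and_call name and the two helper functors are assumptions for illustration, and the logging itself is left as a comment.

```cpp
#include <utility>

template <typename F, typename... Args>
constexpr auto log_and_call(F&& f, Args&&... args)
    noexcept(noexcept(std::forward<F>(f)(std::forward<Args>(args)...)))  // 1st time
    -> decltype(std::forward<F>(f)(std::forward<Args>(args)...))         // 2nd time
{
    // a real version would write a log message here
    return std::forward<F>(f)(std::forward<Args>(args)...);              // 3rd time
}

// Hypothetical callables used to check that noexcept is propagated.
struct AddOne {
    constexpr int operator()(int x) const noexcept { return x + 1; }
};

struct MayThrow {
    int operator()(int x) const { return x; }
};
```

The wrapper is noexcept for AddOne and potentially throwing for MayThrow, but only because the expression was typed out three times.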

Filed under C++, Meetup, Programming

Video: Meta – Toward Generative C++, Herb Sutter

Herb Sutter shared this video of his Qt 2017 talk on his personal metaclasses project.

I first heard about this development at ACCU 2017. The benefits of standardising best practices for definitions of interfaces/value types etc are huge and would let the developer concentrate more on the business problem they are solving, rather than the technicalities of the language.

Filed under C++, Programming, Uncategorized, Video

Book Review: No Middle Name, Lee Child

This is a collection of short stories featuring Jack Reacher. Despite reading a few negative reviews, I found this book pretty good. I’m sure a lot of Jack Reacher fans will be interested to read about Reacher’s childhood – but I can imagine it would have been hard to explore that in a complete novel.

“Second Son” is set when Reacher is 13 and newly arrived at a military base. Whilst his upbringing is mentioned in other books, the relocation from one base to another is shown to be a big part of his life. He has to find his feet pretty quickly when surrounded by openly hostile kids – and his loathing for running means that in a fight-or-flight situation, the choice is already made.

“High Heat” is set a few years later – Reacher goes to the city at 16, purely to look around before visiting his brother. As a man, we see that he gets involved whenever he witnesses an injustice – as a young man, he was already inserting himself into adult conflicts, and somehow coming out on top despite tough odds.

“James Penney’s new Identity” stands out because Jack Reacher is really incidental to the main plot – I don’t think Lee Child has written many books without Reacher (any?), but this shows that he has more than enough ideas if he wanted to invent another character. But Reacher is so popular, you can’t blame him for giving the public what they want.

The best stories are at the start – the last few are shorter too, but by then I’d had a great time reading the book anyway.
Four stars

Filed under Book Review

Rambling: Great Gable, Lake District


This walk starts from Seathwaite in the Lake District. My Pathfinder Guide advertised that the village has a cafe, but there was no sign of it. Wikipedia mentions that Seathwaite is the wettest inhabited place in England – luckily for me, the weather was glorious. A local farmer provides car parking in a field for a small fee.

As you look up from the car park, the peaks look pretty daunting and the walk starts with a tough climb straight up the valley side. Actually, most of this walk involves climbing or descending – the book wasn’t kidding when it wrote “you need to be reasonably fit to tackle it”. The first peak is Green Gable, offering spectacular views of Scafell Pike and Buttermere. After a steep descent and a scramble back up, you then reach Great Gable itself (site of a memorial to members of the Fell & Rock Climbing Club who died in World War 1). The long descent down to Styhead Tarn is arduous, then the sensible choice is to take the path back via Stockley Bridge. I took the more difficult route to see the waterfalls at Taylorgill – the path is ill-defined and tracks the areas of peat bog, so not recommended.

The walk was listed in the book as 6 miles, and my phone recorded 25000 steps and 231 floors climbed.

Green Gable Summit

Filed under Rambling

Book Review: The Lost Fleet – Dauntless, Jack Campbell

This is the first book in the “Lost Fleet” series, featuring Captain Jack Geary. I read “Guardian”, a book from later in the series earlier this year, and was hoping that this book would describe the moment that Jack Geary’s survival capsule was found in space. However, this book goes back to a period just a few weeks after he has thawed out (!) and is adjusting to life in the future. It’s 100 years after he’d famously escaped his last stand in a battle against the Syndicate Worlds, but he has no recollection of the passing of a century. Moreover, he was promoted to Captain upon his supposed death, and soon finds himself running the entire fleet due to his length of service and the ensuing legends that have built over the years.

The book covers a number of space battles, as well as describing the difficulty Geary faces in retraining his team in the lost arts of combat at near light speed. He faces opposition in the boardroom too – not everyone is happy to be shown the error of their methods. It’s an enjoyable read and highly similar to Guardian – it will be interesting to read a third from the series to see if the author follows the same template throughout.
Four stars

Filed under Book Review