Network, Neural Network. License to model?

Image from James Bond starting sequence.

I’ve created a neural network analysis of the James Bond movies using Neuromat’s Model Manager software. The software implements a neural network within a Bayesian statistics framework, using the methods developed by David MacKay.

Training the model

I hoped to find which factors are important in making a successful Bond film, and then to predict the revenue of Casino Royale, the James Bond film to be released in November. The model is limited to variables that can be easily quantified and to which I had easy access.

The inputs were the number of female conquests by Bond, the number of Martinis he drinks, the number of licensed kills, the year of the film, and the number of times Bond introduces himself with the catch-phrase ‘Bond, James Bond’. The worldwide box office for each film in dollars was converted to a present-day value, using the year of release and US inflation rate data.
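The inflation adjustment can be sketched as compounding each year's rate from the release year up to 2006. This is only a minimal illustration: the function name `to_present_value`, the convention that a year's rate applies from that year to the next, and the flat 3% rate in the usage lines are all assumptions, not the actual US inflation data used for the table.

```python
def to_present_value(nominal, release_year, target_year, annual_inflation):
    """Compound a nominal amount forward from release_year to target_year.

    annual_inflation maps a year to that year's inflation rate as a fraction
    (assumed convention: the rate for year Y carries prices from Y to Y + 1).
    """
    value = nominal
    for year in range(release_year, target_year):
        value *= 1.0 + annual_inflation[year]
    return value


# Hypothetical flat 3% rate, purely for illustration (not real inflation data).
rates = {year: 0.03 for year in range(1962, 2006)}
example = to_present_value(100.0, 2004, 2006, rates)
print(round(example, 2))
```

With real yearly rates in place of the flat placeholder, the same loop would turn each film's nominal takings into 2006 dollars.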

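After adjusting the box office takings for inflation, the database looked like this (‘BJB’ is the number of ‘Bond, James Bond’ introductions and ‘M$*2006’ is the box office in millions of 2006 dollars):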

   "Conquests"   "Martinis"   "Kills"   "BJB"    "Year"    "M$*2006"      "Label"
        2              2         16       1      2002       470.44      "Die_Another_Day"
        3              1         19       2      1999       423.91      "The_World_Is_Not_Enough"
        3              1         25       1      1997       419.98      "Tomorrow_Never_Dies"
        2              1         12       1      1995       465.61      "GoldenEye"
        2              1         12       1      1989       262.48      "Licence_To_Kill"
        2              2          2       1      1987       347.68      "The_Living_Daylights"
        4              0          5       2      1985       292.92      "A_View_To_A_Kill"
        2              0         14       1      1983       381.22      "Octopussy"
        2              0         11       2      1981       480.78      "For_Your_Eyes_Only"
        3              1         14       1      1979       651.71      "Moonraker"
        3              1         14       1      1977       690.11      "The_Spy_Who_Loved_Me"
        2              0          1       2      1974       385.46      "The_Man_With_The_Golden_Gun"
        3              0          6       1      1973       658.51      "Live_and_Let_die"
        1              0          7       1      1971       652.83      "Diamonds_are_Forever"
        3              1          8       2      1969       408.40      "On_Her_Majesty's_Secret_Service"
        3              1         21       0      1967       758.09      "You_only_live_twice"
        3              0         22       0      1965      1004.90      "Thunderball"
        2              1         10      1.5     1964       900.42      "Goldfinger"
        4              0          17      0      1963       575.94      "From_Russia_with_Love"
        3              2          5       1      1962       439.60      "Dr_No"

The data is also represented in the figure directly below, with each variable normalised by dividing by its maximum value in the database.

Variation of inputs and box office with year.
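The normalisation used for the figure is simply each column divided by its own maximum. A minimal sketch, using the kills and box office columns copied from the table above (the function name `normalise_by_max` is just an illustrative choice):

```python
def normalise_by_max(values):
    """Scale a column so that its largest entry becomes 1.0."""
    peak = max(values)
    return [v / peak for v in values]


# Columns copied from the database, in table order (Die Another Day first).
kills = [16, 19, 25, 12, 12, 2, 5, 14, 11, 14,
         14, 1, 6, 7, 8, 21, 22, 10, 17, 5]
box_office = [470.44, 423.91, 419.98, 465.61, 262.48, 347.68, 292.92,
              381.22, 480.78, 651.71, 690.11, 385.46, 658.51, 652.83,
              408.40, 758.09, 1004.90, 900.42, 575.94, 439.60]

norm_kills = normalise_by_max(kills)
norm_box = normalise_by_max(box_office)
```

After this step every variable lies between 0 and 1, so kills and box office can share one axis.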

After training on half of the data, 208 candidate models were tested on their ability to predict the unseen half. When I tried to build a committee from the best models, the best predictions turned out to come from a single model. This model was retrained using all of the data. Bayesian inference should automatically guard against overtraining, since each model represents a distribution of weights and complex relationships are penalised.

The graph below is a plot of the output against the target after selection of the committee and retraining with all the data. The failure of all the points to lie on the line could indicate that we haven’t taken account of all the factors which influence the box office takings; the two most profitable films, Goldfinger and Thunderball, out-perform the expectation of the model.
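The committee selection step can be illustrated roughly as follows: rank the candidate models by their error on the unseen data, then average the predictions of the best one, two, three and so on, and keep whichever committee size gives the lowest error. This is a sketch of the general idea only, not the Model Manager implementation; the function names, the use of mean squared error as the ranking measure, and the three sets of predictions are all hypothetical.

```python
def mean_squared_error(predicted, target):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def best_committee(model_predictions, target):
    """Rank models by test error, then find the committee size with lowest error."""
    ranked = sorted(model_predictions,
                    key=lambda pred: mean_squared_error(pred, target))
    best_size, best_error = 0, float("inf")
    for size in range(1, len(ranked) + 1):
        members = ranked[:size]
        # Committee prediction for each film is the mean of its members' predictions.
        committee = [sum(vals) / size for vals in zip(*members)]
        error = mean_squared_error(committee, target)
        if error < best_error:
            best_size, best_error = size, error
    return best_size, best_error


# Hypothetical test-set targets and model predictions (not the real results).
target = [400.0, 650.0, 900.0]
model_predictions = [
    [410.0, 640.0, 880.0],
    [300.0, 500.0, 700.0],
    [500.0, 800.0, 1000.0],
]
size, error = best_committee(model_predictions, target)
```

In this made-up example adding the weaker models only increases the error, so the selected committee has a single member, which mirrors what happened with the Bond data.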
Committee predictions of the training data.

The significance of each input as perceived by the model shows that the year of the film and the number of kills have a strong influence on the box office takings, as we can see directly below. The number of conquests also has an influence, but the number of Martinis drunk and the use of the ‘Bond, James Bond’ catch-phrase are not very important at the box office.

James Bond input Significances

Bond Movie Trends

To see the trends in the data, predictions were made with every input held at its average value in the database while one input at a time was stepped. The average values were 2.6 conquests, 0.75 Martinis, 12.05 kills, 1.12 utterings of ‘Bond, James Bond’ and a year of 1979.
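The averages quoted above can be reproduced from the database, and the stepping procedure amounts to building rows in which one input varies while the rest stay at their averages. In the sketch below the helper name `sweep` and the choice to step across the observed minimum-to-maximum range are assumptions; the resulting rows would then be passed to the trained model, which is not shown.

```python
# Input columns copied from the database, in table order (Die Another Day first).
inputs = {
    "Conquests": [2, 3, 3, 2, 2, 2, 4, 2, 2, 3, 3, 2, 3, 1, 3, 3, 3, 2, 4, 3],
    "Martinis": [2, 1, 1, 1, 1, 2, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 2],
    "Kills": [16, 19, 25, 12, 12, 2, 5, 14, 11, 14,
              14, 1, 6, 7, 8, 21, 22, 10, 17, 5],
    "BJB": [1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 2, 0, 0, 1.5, 0, 1],
    "Year": [2002, 1999, 1997, 1995, 1989, 1987, 1985, 1983, 1981, 1979,
             1977, 1974, 1973, 1971, 1969, 1967, 1965, 1964, 1963, 1962],
}

averages = {name: sum(col) / len(col) for name, col in inputs.items()}


def sweep(name, steps):
    """Hold every input at its average and step `name` across its observed range."""
    low, high = min(inputs[name]), max(inputs[name])
    rows = []
    for i in range(steps):
        row = dict(averages)
        row[name] = low + (high - low) * i / (steps - 1)
        rows.append(row)
    return rows


# 25 rows with kills running from 1 to 25 and all other inputs at their averages.
kill_sweep = sweep("Kills", 25)
```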

Bond, James Bond
Pierce Brosnan drinking a Martini cocktail.
James Bond, Martinis

There is a linear decrease in the worldwide box office takings with year. This may be due to a decrease in the popularity of James Bond or to a general decrease in the total size of the international box office. This prediction is for the average number of kills; however, films since 1995 had a higher than average number of kills (and conquests, Martinis and ‘Bond, James Bond’s) and made ‘average’ box office takings, as seen in the database. The recent films have therefore maintained their box office takings at about 400 million dollars.

There may be a trend for recent films to make less at the box office but more from secondary sources such as movie rental, merchandising and cross-promotion, in which case the box office takings may not be the best index of the profitability or popularity of a film. For example, many James Bond computer games exist, such as GoldenEye, which was popular in its own right and would have generated a large amount of revenue.
James Bond, Year

The number of kills in the film has a strong positive correlation with the box office takings. The film with the most kills was Tomorrow Never Dies (1997) with 25, followed by Thunderball (1965) with 22 and You Only Live Twice (1967) with 21. According to the model, the trend continues to higher numbers of kills. It seems that action is popular in James Bond films.
James Bond, Kills

Daniel Craig poses with the Bond girls from Casino Royale 2006.

Surprisingly, the number of Bond’s conquests has a negative correlation with the box office takings. So according to the neural network analysis, the producers should minimise the number of times Bond has to make this sacrifice in the line of duty. There is no film in which James Bond fails to make a female conquest, but according to the neural network this would be more profitable.

James Bond, Conquests

In conclusion

According to this simple analysis, the box office takings of the next James Bond movie can be maximised by increasing the number of on-screen kills and, contrary to expectation, by minimising the number of conquests. The neural network predicts a slightly larger box office with 0 conquests than with 1 conquest, although no Bond film exists where he does not go to bed with a Bond girl. It would be very brave of the producers to take this action, since conquests are characteristic of the Bond films; however, it seems excesses aren’t appreciated, presumably because they take time away from other kinds of action with broader appeal (or plot development?).

Of the factors included in the database, we saw that the number of Martinis drunk and the use of the Bond catch-phrase were not regarded as significant and had flat trend lines with small error bars. These factors can be safely removed from the database in a future model. Many other possible inputs can be imagined. For example, it is possible to get estimates of the budget for each film, which should be related to the number of stunts, or we could count the number of explosions, or the time that ‘Bond girls’ are on screen.

One factor which is not simple to include, but which is probably the most frequently discussed, is the actor playing Bond. There is no simple way to include this objectively in the model without extra information. Perhaps the best way would be to use the wage the actor received, which should at least be a measure of his popularity as perceived by the production team.

Daniel Craig will play Bond in the 2006 movie, Casino Royale

Daniel Craig will play bond in the 2006 movie, Casino Royale

If 2006 movie had same inputs as previous movies
The graph directly above shows the prediction of the box office if the previous movies were released in 2006. In reality the last 3 movies made 420, 424 and 470 million dollars at the box office; however, according to this model they should have made 250, 300 and 150 million if released in 2006. If we simply take an average of the last 3 movies, we could expect the Bond movie to make about 440 million. The highest revenue predicted by the model would be for a new version of Thunderball, which is predicted to make 350-500 million. It seems that the model has been influenced by the downward trend in box office revenue between 1975 and 1990.

The prediction for the ‘average’ film above says that we can expect the movie to make 350 million dollars. Since the production of the film will have included some analysis of the previous movies, we can expect that it will have a large amount of violence and action, and so a high number of kills.

The failure of the model to predict the box office revenue of the most popular films suggests that not all of the most important factors have been included in the model. It might be worth including other factors for which data is available, for example the estimated budget of the film. The model has succeeded in showing us trends in the data which can give us some idea of how a Bond film can be optimised.


3 Responses

  1. Can I make predictions for the unofficial Bond film ‘Never Say Never Again’ (1983, with Sean Connery), the original ‘Casino Royale’ (1967, with Peter Sellers; only 4 movies before the first spoof film!), or ‘The Spy Who Shagged Me’ with Austin Powers 😉

  2. That’s a pretty interesting thing to work on. Good idea.

  3. The actual takings of the Bond film at the box office have been $594,293,106.

    Although I haven’t measured the various inputs for the new film (conquests, deaths, Martinis, etc.), my prediction for the best performing movie, Thunderball, said the most that the movie should be expected to make was $500 million at the box office. Inflation has been high but shouldn’t affect the result by more than 10%. (Maybe changes in exchange rates make the dollar earnings higher?)

    If anyone knows the number of people killed in the movie, that would help me test my model more thoroughly.

    We can conclude that the latest James Bond film has done exceptionally well at the box office: either Daniel Craig is a very popular James Bond, or there has been an increase in cinema-going, which wasn’t an input in the model.
