2015 General Election Prediction: Wisdom of the Crowds

Introduction

The 2015 UK General Election looks like being one of the closest, and hardest to predict, for many years. With 650 seats being contested, one party needs to win more than half the seats (326) to be able to form a government. Most, if not all, polls are predicting a hung parliament, with the likelihood that the UK will have another coalition government, though what form that will take is open to much debate.

It is not difficult to find predictions for the election result. They tend to fall into two categories: the percentage share of the vote, or the number of seats that will be won by each party. The number of seats is of most interest, as this is what determines the formation of the next government.

Wisdom of the Crowds

In 1907, Francis Galton reported in Nature an event that had taken place at a country fair, where around 800 people were asked to guess the weight of an ox. The average guess was 1,197 pounds. The actual weight was 1,198 pounds, which is close enough to the average guess to be considered just about spot on. Importantly, many of the people who participated could be considered experts, such as farmers and butchers, but many others were far from experts – they were simply people attending the fair. Also importantly, not a single person guessed the correct weight; only one person guessed 1,197 and two people guessed 1,199.

This concept of the Wisdom of the Crowds was popularised in a 2004 book by James Surowiecki, arguing that the opinion of a large number of people will do better than the judgement of a few experts.

2010 General Election

Wisdom of the Crowds was used to predict the 2010 general election. Martin Boon, of ICM Research, showed that “the Wisdom of Crowds approach at the 2010 general election would have produced the most accurate final pre-election prediction”.

Henretty and Jennings

Chris Henretty and Will Jennings have used the Wisdom of Crowds to predict the number of seats for each major party in the 2015 General Election. They surveyed 2,338 people, with 537 responding. They asked two questions, one about the percentage share of the votes and one about the number of seats for the major parties. Their report (published on 03 Mar 2015) gives the following predicted seats.

 

Party         Seats
Con           278.4
Lab           282.3
Lib Dem        24.8
SNP            28.7
Greens          1.9
Plaid Cymru     3.3
UKIP            6.6
Others         13.4

Our Data

Drawing inspiration from this study, we draw on other predictions to see how they compare with the study of Henretty and Jennings. Our study looks at 24 different predictions, aggregating them to produce our own.

Our data is drawn from a variety of sources.

  • The Henretty and Jennings study is used, recognising that it incorporates over 500 individual predictions.
  • A recent BBC Panorama program asked Nate Silver for his predictions. Silver is an American statistician who has successfully predicted the outcome of the last two US presidential elections.
  • Data was taken from several spread betting firms, taking the middle of the spread as their prediction.
  • The London School of Economics asked a number of election forecasters at a conference they held on the 27 Mar 2015 for their predictions. These have been incorporated into our predictions below.
  • Some newspapers publish predictions, and these are used in our model.
  • Some on-line prediction web sites were used.

We decided against using the 2010 results, or the current parliamentary standings, although we show the predictions using these two additional pieces of data, just for completeness.

One issue that has to be considered is missing data. Predictors do not always provide predictions for all the parties, but provide an aggregated figure in Others for some of the parties. Some predictors also exclude Northern Ireland so they only supply 632 predictions, rather than the full 650. We work around this as best we can.

In order to calculate our predictions, we average all the predictions under consideration. We then normalise the figures for each party so that the total number of seats adds up to 650.
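The averaging and normalising step can be sketched in a few lines of Python. This is only a minimal illustration, not the spreadsheet we actually used: the function name, the made-up forecasts and the rule for missing data (a party is simply skipped for any forecast that does not mention it) are all assumptions.

```python
def aggregate_predictions(predictions, total_seats=650):
    """Average seat predictions per party, then rescale so they sum to total_seats."""
    parties = sorted({party for forecast in predictions for party in forecast})
    averages = {}
    for party in parties:
        # Assumption: a forecast that omits a party is left out of that party's average
        values = [forecast[party] for forecast in predictions if party in forecast]
        averages[party] = sum(values) / len(values)
    # Normalise so the averaged figures add up to the full number of seats
    scale = total_seats / sum(averages.values())
    return {party: average * scale for party, average in averages.items()}


# Two made-up forecasts, purely for illustration
example = [
    {"Con": 280, "Lab": 270, "Others": 100},
    {"Con": 270, "Lab": 280, "Others": 82},  # e.g. a forecast covering only 632 seats
]
result = aggregate_predictions(example)
```

Rescaling after averaging means a forecast that covers only 632 seats still contributes, while the final figures always total 650.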

Predictions

Our predictions are shown in the table below. The Excl. 2010 column shows the predictions when the 2010 results or the current parliamentary standings are not taken into account. The Incl. 2010 results are shown just for comparison.

 

Party         Excl. 2010   Incl. 2010
Con              279          281
Lab              274          272
Lib Dem           24           27
SNP               46           42
Greens             1            1
Plaid Cymru        3            3
UKIP               3            3
Others            20           21

The two sets of figures are reasonably close with the obvious differences being the higher prediction of the SNP and the lower prediction of the Lib Dems, which reflects the (potential) changing fortunes of the two parties since the last general election.

Concluding Remarks

I guess, not surprisingly, we are also predicting a hung parliament, with the Conservatives having a slight lead over Labour. If our predictions are accurate, a coalition with the SNP would give a combined total of 325 seats – not quite the 326 needed to give an overall majority. Now that would be interesting!

How can academics use Twitter effectively?

I have been using Twitter for a few years now, but it has been a while since I blogged on this topic.

When I started tweeting I knew that I would not be able to tweet every day, and weeks (or even months) could go by without me tweeting, and, to me, that would not give a very good impression.

Before I signed up I looked at how I could tweet automatically. You quickly come across services such as Hootsuite, which enable you to schedule tweets. This is a very nice service but you still have to do something on a regular basis. As before, it is easy to forget, not have time etc. and you suddenly find yourself tweetless for weeks (or months).

That was the main reason I developed my own Twitter application. This enables me to do one tweet a day. This means that I am tweeting regularly and I hope that it is supplemented by a liberal smattering of ‘live’ tweets, as well as retweets.

I have automated two types of tweets.

  1. I have a list of my publications, held as bibtex, and I tweet those.
  2. I have another database, which holds (I hope) items of interest – let’s call them News.

Each day (it could be set up to do more), I choose (randomly) whether to tweet a News item, or a paper. Once that decision has been made, I just choose randomly from the relevant database/bibtex file.
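For anyone curious, the daily choice can be sketched like this. It is not the code of my actual application; the function name and the two lists standing in for the News database and the bibtex file are hypothetical.

```python
import random


def pick_daily_tweet(news_items, papers, rng=random):
    """Choose a source at random, then an item at random from that source."""
    source = rng.choice(["news", "paper"])
    pool = news_items if source == "news" else papers
    return source, rng.choice(pool)


# Hypothetical stand-ins for the News database and the list of publications
source, item = pick_daily_tweet(["A news item"], ["A paper title"])
```

Passing a seeded random.Random instance as rng makes the choice repeatable, which is handy when testing.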

But, this set me thinking. What would academics actually want from a Twitter service? I am specifically thinking about sending tweets, but I would also be happy to hear about reading them.

If you have any views, I would be really interested to know (as I might do it!).

Please just add a comment which specifies your ‘Tweeting wish list’, from your academic perspective.

 

Can Artificial Intelligence be used in the Board Room?

Computers are knocking on the door of the company boardroom

Graham Kendall, University of Nottingham

While women sitting on company boards remains a much-discussed topic, there is something new waiting to take a seat at the table: artificial intelligence, computers with company voting rights.

Deep Knowledge Ventures has appointed an algorithm called VITAL (Validating Investment Tool for Advancing Life Sciences) as a member of its board. It uses state-of-the-art analytics to assist in the process of making investment decisions in a given technology.

Of course, companies have used computer assisted analysis to analyse investment opportunities for a long time, but is the vision of a computer with equal voting rights as human board members a bit far-fetched?

Defining artificial intelligence


Alan Turing Wikimedia Commons

What does the future hold with regard to the influence of computers on business decisions – and can they ever be used in place of a human board member? The Turing Test, formulated by Alan Turing in the 1950s, provides a strict interpretation of machine intelligence. A human participant must be unable to tell whether they are communicating (through a typed, text medium) with a computer or a human. If the human participant cannot reliably tell whether their conversation partner is a computer, then Turing would argue the computer has demonstrated intelligence.


Numberphile: The Turing Test

Not everybody agrees that passing the Turing Test is enough for a computer to exhibit intelligence. In his Chinese Room argument, the philosopher John Searle described a closed room, into which a sentence written in Chinese is fed. A response emerges from the room, written in Chinese, that correctly answers the questions or conversational cues in the sentence submitted. The assumption could be made that inside the room is someone who can speak Chinese.

Instead, inside the room is a human who cannot speak Chinese but is equipped with manuals that exhaustively provide the appropriate Chinese characters to produce in response to those received. The argument holds that an appropriately programmed computer (the person in the room) could pass the Turing Test (by producing convincing Chinese) but would still not have an intelligent mind that we would regard as human intelligence (by understanding Chinese).


The Chinese Room

A computer in the boardroom

If we want computers to make business decisions and even have equal voting rights on a company board, what would a computer have to do in order for the other board members to have confidence in its decisions?

Part of the challenge of the Turing Test is syntax versus semantics. Compare the sentences “Fruit flies like bananas” and “Time flies like an arrow”. The sentence structure is similar but the meaning is entirely different, making it a linguistic challenge.

Even a very simple conversation relies upon a substantial amount of linguistic knowledge and understanding. Consider the following questions:

  • What was the result of the big match last night?
  • I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play? (these chess moves are from Turing’s original paper)
  • What book do you think of if I say 42?

These might seem easy for humans to understand, but are challenging for a computer. Thankfully, a computer making business decisions is not faced with such a general task as the Turing Test. But if we are serious about having a computer as a full member of a company board, what are the hurdles that need to be addressed? Here is an (almost certainly incomplete) list.

  1. Access to LOTS of data: An automated approach to decision-making will require the use of big data. Company reports and accounts, economic data such as share prices, interest rates and exchange rates, and government statistics such as employment rates and house prices would all be obvious inputs. More subjective data such as newspapers, social media feeds and blogs might also be useful. Peer-reviewed scientific papers might also provide insight. Of course, as always, the challenge with big data is to process the large quantities of data that will be of different types (figures, text, charts), stored in different ways, and have missing elements.
  2. Cost: Much of the data required is likely to generate significant costs. Social media feeds may be free (but not always), but stock market information, company accounts, government data, scientific papers and so on are generally commercial products that must be paid for. In addition, there is the cost of developing and maintaining the system. The algorithm is likely to require continual development by highly skilled analysts and programmers.
  3. Complexity: Big data algorithms will be central to the boardroom decision support algorithm, but they will be underpinned by advanced analytics, many of which we are only just starting to understand and develop. To have a real impact there is likely to be some research required which would require staff with the relevant skills.

So, are we really at a point where a computer could take its place on the board? Technically it’s possible, but the costs to develop and maintain the system, as well as to subscribe to the data that is required, probably mean that it is not within the reach of most companies, and I suspect that the money would be better spent on a human decision maker – at least for now.

The Conversation

This article was originally published on The Conversation. Read the original article.

The science that makes us spend more in supermarkets, and feel good while we do it

Graham Kendall, University of Nottingham

When you walk into a supermarket, you probably want to spend as little money as possible. The supermarket wants you to spend as much money as possible. Let battle commence.

As you enter the store your senses come under assault. You will often find that fresh produce (fruit, vegetables, flowers) is the first thing you see. The vibrant colours put you in a good mood, and the happier you are the more you are likely to spend.

Your sense of smell is also targeted. Freshly baked bread or roasting chickens reinforce how fresh the produce is and make you feel hungry. You might even buy a chicken “to save you the bother of cooking one yourself”. Even your sense of hearing may come under attack. Music with a slow rhythm tends to make you move slower, meaning you spend more time in the store.

Fresh Produce at a Supermarket

 

Supermarkets exploit human nature to increase their profits. Have you ever wondered why items are sold in packs of 225g, rather than 250g? Cynics might argue that this is to make it more difficult to compare prices as we are working with unfamiliar weights. Supermarkets also rely on you not really checking what you are buying. You might assume that buying in bulk is more economic. This is not always the case. Besides, given that almost half of our food is believed to be thrown away, your savings might end up in the bin anyway.

Strategies such as those above get reported in the media on a regular basis. Mark Armstrong analysed retail discounting strategies for The Conversation last year, for example, and the Daily Mail recently published a feature on making “rip offs look like bargains”.

You might think that awareness of these strategies would negate their effectiveness, but that doesn’t appear to be the case. It would be a strong person that does not give way to an impulse buy occasionally and, for the supermarkets, the profits keep flowing.

Product placement

There are marketing strategies which you may not be aware of that also have an effect on our buying habits. Have you ever considered how supermarkets decide where to place items on the shelves and, more importantly, why they place them where they do?

When you see items on a supermarket shelf, you are actually looking at a planogram. A planogram is defined as a “diagram or model that indicates the placement of retail products on shelves in order to maximise sales”.

Planograms in action

Within these planograms, one phrase commonly used is “eye level is buy level”, indicating that products positioned at eye level are likely to sell better. You may find that the more expensive options are at eye level or just below, while the store’s own brands are placed higher or lower on the shelves. Next time you are in a supermarket, just keep note of how many times you need to bend down, or stretch, to reach something you need. You might be surprised.

The “number of facings”, that is how many items of a product you can see, also has an effect on sales. The more visible a product, the higher the sales are likely to be. The location of goods in an aisle is also important. There is a school of thought that goods placed at the start of an aisle do not sell as well. A customer needs time to adjust to being in the aisle, so it takes a little time before they can decide what to buy.

You might think that designing a good planogram is about putting similar goods together: cereals, toiletries, baking goods and so on. However, supermarkets have found it makes sense to place some goods together even though they are not in the same category. Beer and crisps is an obvious example. If you are buying beer, crisps seem like a good idea, and convenience makes a purchase more likely. You may also find that they are the high quality brands, but “that’s okay, why not treat ourselves?”

This idea of placing complementary goods together is a difficult problem. Beer and crisps might seem an easy choice but this could have an effect on the overall sales of crisps, especially if the space given to crisps in other parts of the store is reduced. And what do you do with peanuts, have them near the beer as well?

Supermarkets will also want customers to buy more expensive products – a process known as “upselling”. If you want to persuade the customer to buy the more expensive brand of lager, how should you arrange the store? You still need to stock the cheaper options, for those who really are on a budget. But for the customers that can afford it, you want them to choose the premium product. Getting that balance right is not easy. My colleagues and I are among the researchers striving to develop the perfect algorithm, taking into account the size, height and depth of shelves, to direct customers to the right product at the right time.

Shoppers won’t always obey the science, but these techniques are retailers’ most effective tools in the fight for our weekly budget. The battle between supermarkets and their customers continues.

This article was originally published on The Conversation.
Read the original article.

What is your Erdös Number?

Paul Erdös (1913-1996) was one of the most prolific mathematicians. He wrote over 1500 papers in his lifetime and collaborated with over 500 people. As a tribute, his friends created the Erdös number, which is a tongue-in-cheek way of asking how closely you are associated with the top mathematicians.

Erdös himself has an Erdös number of 0. A co-author of Erdös has an Erdös number of 1, if you have written a paper with one of those co-authors you have an Erdös number of 2, and so on.
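Seen this way, an Erdös number is just the shortest path to Erdös in the co-authorship graph, which a breadth-first search will find. The sketch below is only an illustration: the tiny graph and the author names "A", "B" and "C" are made up, and real co-authorship data would have to come from one of the web sites mentioned later in this post.

```python
from collections import deque


def erdos_numbers(coauthors, source="Erdös"):
    """Breadth-first search: the fewest co-authorship links from source to each author."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in coauthors.get(author, ()):
            if other not in distances:
                # First time we reach this author, so this is the shortest route
                distances[other] = distances[author] + 1
                queue.append(other)
    return distances


# A made-up co-authorship graph: A wrote with Erdös, B with A, C with B
graph = {
    "Erdös": ["A"],
    "A": ["Erdös", "B"],
    "B": ["A", "C"],
    "C": ["B"],
}
numbers = erdos_numbers(graph)
```

Anyone who never appears in the result has no chain of co-authors leading back to Erdös at all.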

I found out recently that my Erdös number is 3. That’s not too bad. I can never get an Erdös number of 1, but perhaps (one day) I might write a paper with somebody who has an Erdös number of 1, giving me an Erdös number of 2.

If you are interested in finding out what your Erdös number is, I found two very good web sites:

  1. The Erdös Number Project at http://www.oakland.edu/enp/
  2. The Microsoft Academic Search at http://academic.research.microsoft.com/VisualExplorer#2019273&1112639

If you are interested in reading more about this fascinating man, and his life, I would highly recommend The Man Who Loved Only Numbers.


Header Image: Erdös Number (downloaded from Google 26 Nov 2013, labelled as free to reuse): URL https://commons.wikimedia.org/wiki/File:Erdosnumber.png


This post was originally published at the University of Nottingham.

A Day in the Life of Pi

By Graham Kendall, University of Nottingham

Most people have heard of the mathematical constant Pi (π), and will know that it’s roughly 3.14. Taking inspiration from these three digits, March 14 (3/14 in the US date format) is heralded as international Pi Day, first marked by US physicist Larry Shaw in 1988.

This year brings a unique opportunity to demonstrate an entirely unnecessary degree of zeal by marking Pi Day correct to nine decimal places on March 14, 2015, at 9.26am 53sec – corresponding to 3.141592653, the first 10 digits of Pi. If you’re too busy this weekend, you could book in July 22 – another way of expressing Pi approximately is the fraction 22/7.

Pi Pie at Delft University

 

Pi is calculated as the ratio of a circle’s circumference to its diameter.

Pi is always the same value, no matter the size of the circle, which makes it an important mathematical constant.

The ancient Babylonians calculated the area of a circle as three times the square of its radius, which gives a value of three for Pi, later refining the value to 3.125. Archimedes of Syracuse (287-212 BCE) approximated Pi by drawing polygons just inside and just outside a circle. By increasing the number of sides of the polygons, Pi could be calculated to higher levels of accuracy.
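To see the polygon idea in action, here is a minimal sketch. It is a modern restatement rather than Archimedes’ own working: it assumes a circle of diameter 1, starts from hexagons, and uses the standard perimeter-doubling recurrence (a harmonic mean followed by a geometric mean). The function name is an assumption.

```python
import math


def archimedes_bounds(doublings):
    """Bound Pi between the perimeters of inscribed and circumscribed polygons."""
    lower = 3.0                 # perimeter of a hexagon inside a circle of diameter 1
    upper = 2.0 * math.sqrt(3)  # perimeter of a hexagon outside the same circle
    for _ in range(doublings):
        # Doubling the number of sides: harmonic mean, then geometric mean
        upper = 2.0 * lower * upper / (lower + upper)
        lower = math.sqrt(lower * upper)
    return lower, upper


# Four doublings take the hexagon to a 96-sided polygon
low, high = archimedes_bounds(4)
```

Each doubling squeezes the two perimeters closer together, with Pi always trapped between them.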

Even today, calculating Pi to ever increasing levels of accuracy continues – it has now been found to an accuracy of over 13 trillion digits. There’s no reason to suspect this record will remain forever, even though only about 39 decimal places are sufficient for astronomical precision. There is no reason to be more precise for practical purposes but it is good scientific sport of sorts to strive to be ever more accurate.

Ever decreasing Pi.

Some Properties of Pi

Pi is an irrational number, which means it cannot be accurately represented as a fraction, a/b, where a and b are integers. An approximation is to express it as 22/7 (3.1428…) which is inaccurate by 0.04025%. A closer approximation is 104348/33215, which has a far smaller error of 0.00000001056% but is still, technically, wrong.
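The error figures quoted above are easy to check. A small sketch, assuming ordinary double-precision arithmetic is accurate enough for the purpose (the function name is my own):

```python
import math
from fractions import Fraction


def percent_error(approximation):
    """Relative error of an approximation to Pi, expressed as a percentage."""
    return abs(float(approximation) - math.pi) / math.pi * 100


error_simple = percent_error(Fraction(22, 7))
error_closer = percent_error(Fraction(104348, 33215))
```

Both fractions come out very slightly larger than Pi, so the errors are overestimates in each case.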

Pi is also a transcendental number: put simply, a number that cannot be reduced algebraically (more accurately, a number that is not the root of any non-zero polynomial equation with rational coefficients).

The proof that Pi is transcendental was found in 1882, but it had been known for much longer that if Pi was transcendental then it would be impossible to square the circle – to construct a square with the same area as a circle.


Numberphile: Squaring the Circle

Putting Pi to use

Among the unusual uses for Pi is its relation to the nature of meandering rivers. A river’s path is described by its sinuosity, its tendency to wind from side to side as it traverses a plain. This is described mathematically as the length of its winding path divided by the length of the river as the crow flies. The average river has a sinuosity of about 3.14.

Albert Einstein actually made some observations about why rivers meandered as they did. He noticed that water flows faster around the outside of a bend, eroding that bank more quickly. This creates a larger bend. These bends eventually meet and the river forms a short cut through them. Hans-Henrik Stolum used these observations and noted a relationship with chaos theory, which suggests that, despite rivers straightening out as they cut through the short cuts, the sinuosity tends to move back towards Pi.


Numberphile: Pi and Sinuosity

Further examples of where Pi appears in the real world can be seen in a BBC item written for Pi Day 2008, and this New Scientist item written for the 2010 Pi Day. For example, Pi can be found in the measurements of the Great Pyramid of Giza, the angular distances of stars in the sky and in a song by Kate Bush. Included in the lyrics were the first hundred digits, or so, of Pi, but she went slightly wrong at around digit 50.


Pi by Kate Bush

How to celebrate Pi day?

If you fancy some Pi-related entertainment for Saturday, you could try:

  • Looking up whether your birth date appears in the decimal places of Pi – mine does, starting at 200,703, although if you want to know my age you’ll have to look it up.
  • Memorising the first digits of Pi. Piphilology, a system of mnemonics to help you remember the digits, may help. There are even piems (Pi poems) to help you remember. You may not beat the record, which currently stands at 67,000 places.
  • Examining the first million digits of Pi – you might see a pattern no one else has.
  • Looking for Pi in everyday life. For example, it has featured in Mythbusters.
  • Following #piday2015 on Twitter, and seeing how others have marked the day in the past.

… and finally

Albert Einstein, one of the greatest scientists the world has known, spent some time working on Pi as it related to rivers. Is it a coincidence that Albert Einstein was born on March 14, 1879? As he would have said himself, God doesn’t play dice.


For more slices of Pi, try a taste here or here.

This article was originally published on The Conversation.
Read the original article.

The Christmas Present Problem: It’s Hard – NP-Hard

Santa Claus (downloaded from Google 07 Dec 2014, labelled as free to reuse)

As Christmas approaches, many of us are faced with the annual problem of who to buy presents for, how much to spend, what presents to buy and how to be fair to everybody.

Is this a tough problem to solve? Perhaps some science will help?

Let’s try and state the problem a little more precisely, and make it simple by assuming that we are buying presents for just one person. That does not detract from the more complex problem, especially if we establish that buying presents for one person is hard.

  • You have to buy presents for just one person; your daughter
  • You have a certain budget; an amount of money that you cannot exceed
  • You have a list of gifts that your daughter would like (we are assuming that she has been good so Father Christmas is willing to leave these gifts)
  • To make life easier, you have given your daughter 100 points, and told her to assign a certain number of points to each item. The more points she assigns, the more likely she is to get that gift. For example, if she assigns all 100 points to just one gift (and zero points to the other gifts), then she is likely to get that gift. But if she assigns equal points to all gifts, then she is equally likely to get each item, but not all of them, as the total cost of the gifts bought cannot be more than the overall budget.

Your task is to buy the selection of presents that maximizes the total number of points, whilst staying within your budget.

How difficult do you think it is to find a solution to this problem so that your daughter gets the best possible gifts, such that there is not another selection of gifts that has a higher overall points value?

In fact, it is surprisingly difficult, especially as the number of available presents (and people you are buying for) increases.

Pile of wrapped presents: http://christmasstockimages.com/free/objects/slides/many_christmas_gifts.htm (CC 3.0)

The problem is the so-called Knapsack Problem, so named because you have a knapsack which can carry a certain weight (your budget). Each item has a weight (the cost of each item) but also a value (the number of points). You have to fill the knapsack, maximizing the value, while keeping the sum of all the weights no greater than the capacity of your knapsack.

Or, in terms of the Christmas Present Problem, you need to buy the best selection of presents to give the maximum number of points, while staying within your overall budget.

The Knapsack Problem is known to be an NP-Hard problem. These types of problems do not, as yet, have any efficient algorithm that can produce the optimal solution in reasonable time, at least for large problem instances. Hopefully, you can still give your daughter the best set of presents!
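For a small wish list you can still find the best selection exactly. The sketch below uses the textbook dynamic programming approach for the 0/1 knapsack, assuming costs are whole numbers of pounds; the gifts, prices and points are made up. It does not contradict the NP-Hard result, because its running time grows with the size of the budget itself, not just with the number of gifts.

```python
def best_presents(gifts, budget):
    """gifts is a list of (name, cost, points); costs are whole numbers.

    Returns (total points, list of chosen gift names) for the best selection.
    """
    # best[b] holds the best (points, names) achievable with a budget of b
    best = [(0, [])] * (budget + 1)
    for name, cost, points in gifts:
        # Work downwards so that each gift can be bought at most once
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + points
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[budget]


# A made-up wish list: (gift, cost, points assigned by your daughter)
wish_list = [("bike", 60, 50), ("book", 10, 20), ("game", 40, 30)]
points, chosen = best_presents(wish_list, budget=50)
```

Here the bike is worth the most points but does not fit the budget, so the book and the game together are the best that can be done.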

In fact, if you can find an efficient algorithm you could win one million dollars.

How to get ants to solve a chess problem

By Graham Kendall, University of Nottingham

Take a set of chess pieces and throw them all away except for one knight. Place the knight on any one of the 64 squares of a chess board.

Can you make 63 legal moves so that you visit every square on the chess board exactly once? As a reminder, a knight can move two squares in a straight line, followed by a ninety degree turn and a move of one further square. It might seem like a hard task, but this set of moves, called the knight’s tour, can be achieved in too many ways to count.

The knight's tour

If you are able to make the 63 moves and end up on a square from which you can move back to the original square with the 64th legal move, then this is known as a closed tour. Other tours are called open tours.

Mathematicians have pondered how many closed tours exist, and they have come up with an astonishing number: more than 26 trillion. There are so many more open tours that we do not know the exact number.

Both Philip Hingston and I were so captivated by the knight’s tour problem that we wanted to find a different way to solve it. We found that motivation in nature – specifically in ants.

Ants use a certain pattern, or algorithm, to forage for food. This algorithm can be used to tackle many types of problems including the Travelling Salesman Problem and Vehicle Routing Problems. Philip and I wondered if we could use the ant colony optimisation algorithm to solve the knight’s tour problem.

Here’s how that algorithm works: a computer program is used to simulate a population of ants. These ants are assigned the task to find a solution to a problem. As each ant goes about their task they lay a pheromone trail – a smelly substance that ants use to communicate with each other. In the simulated algorithm, the most successful ants (the ones that solve the problem better), lay more pheromone than those that perform poorly.


We repeat this procedure many times (perhaps millions of times). Through repetitions, the pheromone trails on good solutions increase and they decrease on the poorer solutions due to evaporation, which is also programmed in the simulation algorithm.

In the simulation to solve the knight’s tour problem, the ants could only make legal knight moves and were restricted to stay within the confines of the chess board. If an ant successfully completes a tour then we reinforce that tour by depositing more pheromone on that tour, when compared to a tour that was not a full tour.

Ants which attempt to find later tours are more likely to follow higher levels of pheromone. This means that they are more likely to make the same moves as previously successful ants.

There is a balance to be struck. If the ants follow the successful ants too rigidly, then the algorithm will quickly converge to a single tour. If we encourage the ants too strongly not to follow the pheromone of previous ants, then they will just act randomly. So it is a case of tuning the algorithm’s parameters to try and find a good balance.
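To make the idea concrete, here is a heavily simplified sketch of an ant colony search for knight’s tours. It is not the algorithm from our paper: the small 5x5 board, the fixed starting corner, the deposit rule (pheromone in proportion to how much of the board a walk covered) and every parameter value are assumptions chosen purely for illustration.

```python
import random

KNIGHT_MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def legal_moves(square, visited, size):
    """Squares a knight can reach from square that are on the board and unvisited."""
    row, col = square
    return [(row + dr, col + dc) for dr, dc in KNIGHT_MOVES
            if 0 <= row + dr < size and 0 <= col + dc < size
            and (row + dr, col + dc) not in visited]


def ant_walk(start, pheromone, size, rng):
    """One ant moves until it is stuck, favouring moves with more pheromone."""
    tour, visited = [start], {start}
    while True:
        options = legal_moves(tour[-1], visited, size)
        if not options:
            return tour
        weights = [pheromone[(tour[-1], option)] for option in options]
        chosen = rng.choices(options, weights=weights)[0]
        tour.append(chosen)
        visited.add(chosen)


def knight_aco(size=5, ants=20, cycles=50, evaporation=0.25, seed=0):
    """Return the longest walk found; a full tour has size * size squares."""
    rng = random.Random(seed)
    squares = [(r, c) for r in range(size) for c in range(size)]
    # Every legal move starts with the same amount of pheromone
    pheromone = {(s, t): 1.0 for s in squares for t in legal_moves(s, set(), size)}
    best = []
    for _ in range(cycles):
        tours = [ant_walk((0, 0), pheromone, size, rng) for _ in range(ants)]
        for edge in pheromone:
            pheromone[edge] *= 1.0 - evaporation  # evaporation on every move
        for tour in tours:
            deposit = len(tour) / (size * size)   # longer walks lay more pheromone
            for a, b in zip(tour, tour[1:]):
                pheromone[(a, b)] += deposit
            if len(tour) > len(best):
                best = tour
    return best


longest = knight_aco()
```

The evaporation rate and the size of the deposit are exactly the kind of parameters that need tuning: too little evaporation and the ants all pile onto one route, too much and they wander at random.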

Using this algorithm, we were able to find almost half a million tours. This was a significant improvement over previous work, which was based on a genetic algorithm. These algorithms emulate Charles Darwin’s principle of natural evolution – survival of the fittest. Fitter members (those that perform well on the problem at hand) of a simulated population survive and weaker members die off.

It is not easy to say why the ant algorithm performed so well, when compared to the genetic algorithm. Perhaps it was down to tuning the algorithmic parameters, or perhaps ants really do like to play chess!

The knight’s tour problem was being worked on as far back as 840 AD. Little did those problem-solvers know that ants, albeit simulated ones, would be tackling the same puzzle more than 1,000 years in the future.

Graham Kendall does not work for, consult to, own shares in or receive funding from any company or organisation that would benefit from this article, and has no relevant affiliations.

This article was originally published on The Conversation.
Read the original article.

My First Java Project

Java Programming. Downloaded from Google, labelled as free to reuse, under Wikimedia Commons. URL: http://commons.wikimedia.org/wiki/File:Java_Programming_Cover.jpg

It’s been a while since I decided to use Java as my new programming language of choice. Since my last post I have been honing my Java skills, with my first Java project.

The project I needed to do involves analyzing academic papers: comparing them in various ways, sorting them, writing out reports and so on.

My first instinct was to use PHP (which I learnt a few years ago – and I really like the language) but the only development environment I have is on a server. I could set up a development environment on my desktop, but it did not seem worth it for the times that I would use that environment in anger.

The other option was C++, but that went against what I was trying to do for this project, as you’ll understand if you read earlier posts in this project.

So, the obvious solution was to use Java.

The project seemed to have suitable complexity for a first-time Java user who already has a programming background. I suspect that it would not be suitable for a complete newbie to programming.

The details of what the project has to deliver are not that important, but the lessons learnt are, as these are things that I can use in later, larger projects. These included:

  • Reading in a CSV (Comma Separated Values) file. This is always a useful thing to be able to do. It’s one of those fallbacks that is useful to have in your armoury, as it makes accessing files such as spreadsheets and databases easy: they invariably have an option to save as a CSV file. Of course, it’s usually possible to access spreadsheets and databases directly (which is what I used to do under C++ using ODBC), but for this project I thought I would get to grips with CSV first. After some searching, I came across a package called CsvReader. It’s not the most basic package (which is a good thing), and it did have good reviews. I had some challenges installing it, but that was not down to the package; it was simply the first time I had installed a Java package. Once I had sorted that out, it worked perfectly.
  • Writing out a LaTeX/PDF file. One of the things I wanted to do was produce a half-decent-looking report. My initial thought was to write out a text file and then use Word to format it manually. This, for many reasons, is not a good idea, not least because it would be labour intensive every time I produced a new version of the report. I like to think that I had a flash of inspiration (but perhaps it is the obvious thing to do) and I decided to write out a LaTeX file that I would convert to PDF via a suitable editor (my editor of choice at the moment is TeXstudio, although I have used WinEdt in the past). This seems to work pretty well and I can now produce nice-looking reports without having to worry too much about the look of the final document, as LaTeX will handle this as long as I have some idea of the structure as I develop the program. Of course, the beauty is that the report is complete once the program has been run, without any need for other processing or formatting. In a future post I’ll provide a few more details as to how I did this, as a) I think it’s interesting and b) I am sure that there are better ways of doing things and I’d like to get some ideas for developing the system further.
  • Writing files. An obvious thing I had to do when writing out LaTeX files was to learn how to write out files. As any Java programmer will know, this is very easy using the PrintWriter class.
  • Sorting arrays. I had a need to sort an array on one of the fields. In fact (see below), this involved sorting an array of class instances. This was probably the most difficult thing I did when developing my first Java project, and it took a while before it came together. But after following a few examples from the web, I had this working pretty quickly. It certainly seems easier than C++, which always seemed complicated and involved having to have friends of classes. I’m sure that there are easier ways in C++ but I never really got my head around it.
  • Classes and data. This is not really a Java thing and maybe I am totally wrong, but I quickly found that my data and member functions were making the class quite large, so I decided to have a class called (say) ‘Papers’ and another class called ‘PapersData’. The PapersData class simply holds all the data and the Papers class provides access to it, as well as providing all the other functionality. This leads to (at least in the way I do it) too many getters and setters, but it does separate the data from the functions. The main reason I did this, though, is that I wanted to hold different data types for my various objects, and an ArrayList (or other array-type object) would not allow this. I am happy to be corrected, but I was trying to recreate the struct (i.e. a class) type concept of C++. Anyhow, it seemed to work for what I wanted, but whether it would scale to larger projects is another matter.
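The lessons above can be pulled together in a short sketch. This is not the project’s actual code: the class names PaperReport and Paper, the three fields and the sample rows are all made up for illustration, and the CSV parsing is a naive split on commas rather than the CsvReader package (so it will not cope with quoted fields). It shows a small data class holding fields of different types, a sort on one field using a Comparator, and a LaTeX table written with PrintWriter.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PaperReport {
    // Hypothetical data holder: one paper, with fields of different types (like a C++ struct).
    public static class Paper {
        public final String title;
        public final int year;
        public final int citations;

        public Paper(String title, int year, int citations) {
            this.title = title;
            this.year = year;
            this.citations = citations;
        }
    }

    // Naive CSV parsing (assumption: no quoted fields, first line is a header row).
    public static List<Paper> parse(List<String> lines) {
        List<Paper> papers = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            String[] f = lines.get(i).split(",");
            papers.add(new Paper(f[0].trim(),
                    Integer.parseInt(f[1].trim()),
                    Integer.parseInt(f[2].trim())));
        }
        return papers;
    }

    // Sorts the list in place on one field: most cited paper first.
    public static void sortByCitations(List<Paper> papers) {
        papers.sort(Comparator.comparingInt((Paper p) -> p.citations).reversed());
    }

    // Writes a small LaTeX document containing a table of the papers.
    // Titles are written as-is, so LaTeX special characters are not escaped here.
    public static String toLatex(List<Paper> papers) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            out.println("\\documentclass{article}");
            out.println("\\begin{document}");
            out.println("\\begin{tabular}{lrr}");
            out.println("Title & Year & Citations \\\\ \\hline");
            for (Paper p : papers) {
                out.println(p.title + " & " + p.year + " & " + p.citations + " \\\\");
            }
            out.println("\\end{tabular}");
            out.println("\\end{document}");
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        List<String> csv = List.of(
                "title,year,citations",
                "Paper A,2010,12",
                "Paper B,2012,40",
                "Paper C,2008,7");
        List<Paper> papers = parse(csv);
        sortByCitations(papers);
        System.out.print(toLatex(papers));
    }
}
```

The sketch writes into a StringWriter so that it runs without touching the disk. To produce a file instead, the PrintWriter could be constructed with a file name (for example new PrintWriter("report.tex"), where the name is only an example), which requires handling FileNotFoundException.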

The system I have ended up with seems to work well. Whether it is scalable to larger projects remains to be seen, but it has certainly been a very good learning experience. I doubt that, if I had been trying to learn C++, even with a good programming background, I would have progressed as fast as I have with Java.

No doubt, other people would pick up C++ faster than I would have done but, for me, Java is a lot easier to learn.

The other big bonus is the Eclipse IDE. No doubt I have only scratched the surface, but the autocomplete (Ctrl+Space) and the suggested error correction (Ctrl+1) are my new best friends!

So, my first Java project has, I think, been a worthwhile exercise and I have learnt a great deal.

Videos on the basics of Java

Whilst looking through YouTube (see my previous posts, here and here, for why I am doing this), I have come across some very nice videos on the basics of Java. A very nice series by Jose Vidal starts from the basics of Java:

  • Eclipse Java ‘Hello World’ Introduction Tutorial

… but moves quickly on to topics such as:

  • Javadoc
  • Constructors
  • Java Wrapper Classes
  • Java Immutable Classes
  • Java Packages
  • Java Arrays
  • Java Expanding Array
  • Java Selection Sort
  • Java Multidimensional Array
  • Java Inheritance Tutorial
  • Java Polymorphism
  • Java Reading a CSV file
  • Java Reading and Writing Binary Files
  • Java Serializable interface: Reading and writing Objects to a file
  • Java Arraylist
  • Java HashSet HashMap Demonstration
  • Java Generics Tutorial
  • Java Iterators
  • Java Building your own Iterator using Inner Classes
  • Eclipse Tips and Tricks for Beginners Tutorial
  • Java Set: HashSet TreeSet LinkedHashSet

Jose has many other videos in his collection, and if you find the above useful you should check out his YouTube channel.

Other useful videos

As I have said before, there are many (many, many) YouTube videos out there that can help teach the basics of Java. They are too numerous to list, but here are just a few that I have either found helpful or plan to watch later (and I might just keep adding to this list so that all the videos I found useful are kept in one place).

In no particular order.

  • Iterators Part 1 (Java)

I realise that many of the topics above are very basic for a seasoned programmer, but I think that there is some advanced material there as well (e.g. inheritance, polymorphism, iterators etc.), so hopefully it will be of interest to a wide variety of people.

I can certainly say (from my own perspective) that getting your head around this in MFC/VS/C++ would be a lot more work (for me anyway).

The football prediction project

You can read more about this project by looking at the posts for this football prediction project.