Thursday, February 4, 2016

Do people in South Florida and Southern California spend more money on cars than those elsewhere?


I wrote this article for my Typepad Blog but wanted to share it with my readers here as well.


This is the sort of question I have pondered for quite some time, as you see thousand-dollar millionaires driving around in luxury automobiles in South Florida and Southern California. A friend might ask you, when propped up at a bar on a Friday night, if this is true. It’s the sort of unanswerable question that had me intrigued when I first thought of it. It came to me as part of my daily reverie. My line of thinking was that if you spend more time out in the sun and generally have better weather, you’re going to want to spend more time outside, driving to places and going to the beach. A car is more like a home to someone who drives a lot, and so they are prepared to pay more for it.

Like a lot of theories, the difficulty lies in how you go about proving it. It was all very well for Newton to have an apple fall on his head. He could prove the idea of gravity from that (not that I am comparing my theories to Newton's, you understand).

First define the question.
Before attempting any analysis, I need to set the scope. So to be clear, the actual question I am going to attempt to answer is:
Do workers across a spectrum of occupations in South Florida and upscale communities on the West Coast of the United States spend more money on purchasing cars in their price range compared to the same workers in, say, the Midwest and Mid-Atlantic? We can assume the variety of cars purchased may differ depending on location. For example, there will be more SUVs and all-wheel-drive vehicles in mountain towns that receive snow compared to Seattle, where there is precipitation but very little snow.

Where to start on this journey then?
Well, first off, I have to choose some zip codes to compare. I am not going to be able to compare data from every one, although I would be forever grateful to, and willing to pay, somebody to help me expand this to encompass purchasing data from all available U.S. zip codes. Until then, I chose zip codes from the following states, cities, and regions, including America's most famous zip code, 90210.

California
90210 (Beverly Hills, CA) 
92651 (Laguna Beach, CA) 

Florida
33401 (Palm Beach, Florida) 
33134 (Coral Gables, Miami) 

Maryland
20847 (Rockville, MD)

Missouri
63005 (Chesterfield, MO)


Which cars should I use for my thesis?
Like the states, the choice of cars is bewildering. I decided to choose a small selection of popular models and group them by price.

Originally, I really wanted to find the actual prices for cars sold, but finding the data proved to be like finding a needle in a haystack. Whilst there are automotive economic indicators, there seem to be no historical car sale records for this sort of data. It is possible that the big credit agencies have some additional data that may be helpful for future case studies. I really thought the project was doomed without that data, though. As it's been a theory I’ve been talking about researching for the last four years, I was glad that I was finally moving forward and working with the data I had at this point.

It then occurred to me that I could use for-sale prices instead. There is a plethora of sites which offer car prices by location, so all I had to do was run queries by zip code and car on these databases and, hey presto, the data was available (and for free!).
Of course, I would have preferred to use actual sale prices, but seeing as I am applying the same technique to find prices for each zip code and car type, the approach holds up.

Not everyone will buy a new car.
Well, yes, there is that. Additionally, cars depreciate at differing rates. For the sake of argument, I could have assumed that everyone buys a new car, but this really doesn’t represent reality. I have decided, therefore, to include prices for cars from the years 2013-2015.

The following models were used to give us an understanding of different vehicle classes amongst different locations:

Entry Level (new price less than 25k)
Mazda Miata
Nissan Rogue 
Nissan Altima
Honda Accord

Mid Range  (new price less than 40k)
Toyota 4Runner
Ford F-150
Honda Odyssey
Kia Sedona 

Prestige (new price greater than 40k)
Mercedes S-600 
Audi R-8
Chevy Corvette

So who is coming with me on my journey?
Like all the best road trip stories (Goldilocks, the birth of Jesus, and the three blind mice), three was considered the appropriate number.
Similar to the car selection, I separated the occupations into partially skilled, skilled, and highly skilled professions and decided that one profession from each would be enough. However, I had to ensure that these professions represented a large proportion of the working population in each region. (For the sticklers amongst you, the word skilled refers to the amount of formal education required to perform these jobs. Even someone working in retail will require basic math skills taught at school.)

Partially Skilled
Retail Salesperson

Skilled
Nurse

Highly Skilled
Physician

You can be pretty certain that if you walk into any major town, you will find a good proportion of the population employed in these jobs. Given the aging of the baby boomers, the number employed in healthcare jobs in particular will only increase with time.

How will I find out about salaries for each of these professions in each zip code?
In the same way I was concerned about finding proper car prices, I didn’t think it would be possible to get average salary information by zip code for each profession.

Luckily for me, Google really is your friend, and the site Indeed appears to calculate the average salary by profession for each zip code. The results are shown below.

Job                 Location (zip code)            Salary ($)
Physician           33401 (Palm Beach, Florida)    86000
Physician           33134 (Coral Gables, Miami)    82000
Physician           90210 (Beverly Hills, CA)      97000
Physician           92651 (Laguna Beach, CA)       83000
Physician           63005 (Chesterfield, MO)       86000
Physician           20847 (Rockville, MD)          108000
Nurse               33401 (Palm Beach, Florida)    65000
Nurse               33134 (Coral Gables, Miami)    61000
Nurse               90210 (Beverly Hills, CA)      72000
Nurse               92651 (Laguna Beach, CA)       62000
Nurse               63005 (Chesterfield, MO)       64000
Nurse               20847 (Rockville, MD)          80000
Retail Salesperson  33401 (Palm Beach, Florida)    52000
Retail Salesperson  33134 (Coral Gables, Miami)    49000
Retail Salesperson  90210 (Beverly Hills, CA)      57000
Retail Salesperson  92651 (Laguna Beach, CA)       49000
Retail Salesperson  63005 (Chesterfield, MO)       51000
Retail Salesperson  20847 (Rockville, MD)          64000

I have to admit I was a little disappointed with the physician pay. I really thought that, given the high profile physicians have, their pay would be significantly higher than that of the other two. However, this is not the case. Had I chosen a specialist doctor such as an oncologist or plastic surgeon, the salary would have been significantly higher.
On reflection, however, I felt that it would be “cheating” to do this. Added to which, there are far fewer specialists than generalists such as physicians.

Assumptions about the data
I have ignored any other cost variables such as housing or food. There may well be regional differences in these costs that affect a person's disposable income, and therefore what they can spend on a car, but those considerations are not included in this analysis.

Now I have the data, how do I figure out the answer?

As with any analysis, it is the construction of the dataset which takes the longest. The steps I need to perform are:

1)     Filter all of the car data to only include cars from the years 2013-2015
2)     Find the average price for each car group by postcode
3)     Link the car group average price per postcode to the salary of the worker by postcode.
*Here I will assume that
a.     a physician will buy a premium car
b.     a nurse will buy a mid-range car
c.     a retail worker will buy an entry level car
4)     **Percentage cost =  Car price / (salary * 5)
5)     Produce a bar graph of percentage costs paid per worker and postcode

* Of course, some doctors will buy an entry level car and a retail salesperson may fork out for a premium vehicle, but this exercise is only meant to show the theoretical cost of a car to a worker.
** To simulate paying back the car over 5 years, which is the current norm.
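
For readers who would rather see the steps as code than as a Trifacta and Tableau workflow, here is a minimal Python sketch of steps 1 to 4. It is not the workflow I actually ran. The salaries are copied from the table in this post, but the car listings and their prices are made up purely for illustration, and the names (LISTINGS, average_prices, percentage_costs) are my own inventions. I also multiply by 100 so the result reads as a percentage.

```python
from collections import defaultdict
from statistics import mean

# Salaries from the table in this post: (zip code, job) -> salary in dollars.
SALARIES = {
    ("90210", "Physician"): 97000,
    ("20847", "Physician"): 108000,
    ("90210", "Nurse"): 72000,
    ("20847", "Nurse"): 80000,
}

# Step 3 assumption: which car group each worker is assumed to buy.
JOB_TO_GROUP = {
    "Physician": "Prestige",
    "Nurse": "Mid Range",
    "Retail Salesperson": "Entry Level",
}

# HYPOTHETICAL listings, not real data: (zip code, car group, year, asking price).
LISTINGS = [
    ("90210", "Prestige", 2014, 90000),
    ("90210", "Prestige", 2015, 110000),
    ("90210", "Prestige", 2010, 40000),  # outside 2013-2015, so filtered out
    ("20847", "Prestige", 2013, 80000),
    ("90210", "Mid Range", 2014, 30000),
    ("20847", "Mid Range", 2015, 28000),
    ("20847", "Mid Range", 2013, 26000),
]

LOAN_YEARS = 5  # the car is assumed to be paid back over 5 years


def average_prices(listings, first_year=2013, last_year=2015):
    """Steps 1 and 2: keep 2013-2015 cars, then average by zip code and group."""
    grouped = defaultdict(list)
    for zip_code, group, year, price in listings:
        if first_year <= year <= last_year:
            grouped[(zip_code, group)].append(price)
    return {key: mean(prices) for key, prices in grouped.items()}


def percentage_costs(listings, salaries):
    """Steps 3 and 4: link average prices to salaries and compute the cost."""
    averages = average_prices(listings)
    costs = {}
    for (zip_code, job), salary in salaries.items():
        average = averages.get((zip_code, JOB_TO_GROUP[job]))
        if average is None:
            continue  # no listings for this zip code and car group
        # Percentage cost = car price / (salary * 5), expressed as a percentage.
        costs[(zip_code, job)] = 100 * average / (salary * LOAN_YEARS)
    return costs


if __name__ == "__main__":
    for (zip_code, job), cost in sorted(percentage_costs(LISTINGS, SALARIES).items()):
        print(f"{zip_code} {job}: {cost:.1f}% of five years' salary")
```

Step 5, the bar graph, is left out here; the dictionary returned by percentage_costs is the data that would feed it.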

By merging the data in Trifacta and then putting all this data into Tableau, I produced the following results.
[Bar charts of percentage cost by zip code, one per worker: Retail Salesperson, Nurse, Physician]
Conclusive results
From all of the above comparisons, the Mid-Atlantic is the cheapest area to purchase a car, irrespective of your job. For the entry level and mid-range cars, it seems that the Midwest is the most expensive place to purchase your car if you work there.
However, the most interesting finding was that the prestige cars appear to be the most expensive in California and Florida, with the peculiar exception of the Mercedes in the Midwest. This runs counter to my original thinking: I believed that a much higher percentage of income was spent on luxury cars in these locations, and that the cars were most likely cheaper to purchase there because more of them were sold in those regions.

Inconclusive results
What I originally set out to prove, namely that people in South Florida and Southern California spend a higher percentage of their income on their automobiles, does not appear to be true when the data is analyzed.

Possible explanations about car pricing
Given the higher wealth that is generally associated with southern coastal states, perhaps the margins for premium vehicles are higher there, simply because buyers can more easily afford them.
However, for the mid-level vehicles, it is possible that people value space and comfort above status and that, knowing this, sellers price the mid-level cars higher.


