Sunday, January 28, 2018

When it comes to Amazon's HQ2, you should be careful what you wish for

By Murtaza Haider and Stephen Moranis
Note: This article originally appeared in the Financial Post on January 25, 2018.
Amazon.com Inc. has turned the search for a home for its second headquarters (HQ2) into an episode of The Bachelorette, with cities across North America trying to woo the online retailer.
The Seattle-based tech giant has narrowed down the choice to 20 cities, with Toronto being the only Canadian location in the running.
While many in Toronto, including its mayor, are hoping to be the ideal suitor for Amazon HQ2, one must be mindful of the challenges such a union may pose.
Amazon announced in September last year that its second headquarters will employ 50,000 high-earners with an average salary of US$100,000. It will also require 8 million square feet (SFT) of office and commercial space.
A capacity-constrained city with a perennial shortage of affordable housing and limited transport capacity, Toronto may be courting trouble by pursuing thousands of additional highly-paid workers. If you think housing prices and rents are unaffordable now, wait until the Amazon code warriors land to fight you for housing or a seat on the subway.
The tech giants do command a much more favourable view in North America than they do in Europe. Still, their reception varies, especially in the cities where these firms are domiciled. Consider San Francisco, which is home to not one but many tech giants and ever mushrooming startups. The city attracts high-earning tech talent from across the globe to staff innovative labs and R&D departments.
These highly paid workers routinely outbid locals and other workers in housing and other markets. No longer can one ask for a conditional sale offer that is subject to financing because a 20-something whiz kid will readily pay cash to push other bidders aside.
We wonder whether Toronto’s residents, or those of whichever city ultimately wins Amazon’s heart, will face the same competition from Amazon employees as the residents of Seattle do. The answer lies in the relative affordability gap.
Amazon employees with an average income of US$100,000 will compete against Toronto residents whose individual median income in 2015 was just $30,089. It is quite likely that the bidding wars that high-earning tech workers have won hands down in other cities will end in their favour in the city chosen for Amazon HQ2.
While we are mindful of the challenges that Amazon HQ2 may pose for a capacity-constrained Toronto, we are also alive to the opportunities it will present. For starters, Toronto can use 50,000 high-paying jobs.


The emergence of the gig economy has had an adverse impact in the City of Toronto, where the employment growth has largely concentrated in the part-time category. Between 2006 and 2016, full-time jobs grew by a mere 8.7 per cent in Toronto, while the number of part-time jobs grew at four times that rate.
Toronto is the largest employment hub in Canada, with an inventory of roughly 180 million square feet. An influx of 8 million square feet of first-rate office space will improve the overall quality of commercial real estate in the city. It could also be a boon for office construction and a significant source of new property tax revenue for the city.
But those hoping the city itself might make money should seriously consider the fate of cities lucky enough to host the Olympics, which more often than not end up costing cities billions more than they budgeted for.
Toronto may still pursue Amazon HQ2, but it should do so with the full knowledge of its strengths and vulnerabilities. At the very least, it should create contingency plans to address the resulting infrastructure deficit (not just public transit) and housing affordability issues before it throws open its doors for Amazon.
Murtaza Haider is an associate professor at Ryerson University. Stephen Moranis is a real estate industry veteran.

Friday, June 16, 2017

Did the cold weather put a chill on Toronto’s housing market?

Toronto’s housing market took a dive in May. After years of record highs in housing sales and prices, the hype seems to have evaporated. While some link the slowdown to the Ontario government’s legislation to tighten lending in housing markets, one should also factor in the unusually cold, dark, and wet weather in May that felt more like a ‘May-be.'
Housing sales in the Greater Toronto Area (GTA) were down 23% last month from a year earlier. However, the average sales price was 14.8% higher than the price in May 2016. On a month-over-month basis, housing prices in May were down 6% from April.

The declining numbers have alarmed homebuyers, sellers, brokerages, and governments. Many are questioning if the Ontario government’s intervention has a more adverse impact than was intended. Homebuyers, who have not yet closed on properties, are wondering whether they have paid too much, while sellers are rushing to list properties to benefit from high housing prices that appear to be past their peak.

While it is too early to determine the ‘causal impact’ of the legislative changes introduced in April, which included a 15% tax on foreign home buyers, one must also consider other mitigating factors that might have affected Toronto’s housing market. We must even consider the influence of the weather.       

The unusually cold weather in May might have had a chilling effect on housing sales. Typically, housing markets start to heat up in April, in sync with rising temperatures. May 2017 was unusually wet: Toronto received a total of 157 mm of precipitation last month compared to 25 mm a year ago. The unusually high rainfall caused flooding all across Ontario. In downtown Toronto, Lake Ontario water rushed into lakefront condos. At the same time, May 2017 was cooler than a year earlier. The average temperature last month was 12 degrees Celsius compared to 16 degrees in May 2016. May was also unusually dark, with much less sunshine. This trend, however, dates back to January 2017, when Toronto received a mere 50 hours of sunlight compared to the seasonal average of 85 hours.

So why should unusually cold, dark, and wet weather have any impact on housing markets? Research has shown that weather and atmosphere influence consumer behavior. Retail experts call this phenomenon ‘store atmospherics’, where a store’s environment is altered to encourage consumer behavior that may promote sales. It applies to housing markets as well. Researchers discovered that adverse weather has a significant, yet short-term, effect on economic activity. Writing in Real Estate Economics, John Goodman Jr. found a slight adverse impact of unseasonable weather on housing markets. In related work, researchers found that sale prices of homes with central air-conditioning and swimming pools are higher for sales recorded in summer months.

There are other factors to consider in assessing the market dip. The Ontario government’s regulations to tighten housing markets could have encouraged some homebuyers to advance their purchase to avoid uncertainty. The government’s plans to impose new restrictions on housing markets were known in advance of their announcement in April. Investors are risk and uncertainty averse. Hence some homebuyers could have advanced their purchase to March when the sales unexpectedly jumped by 50% over February 2017. As for those who could not advance their purchase to March, they may have decided to sit through this confusion and wait for calmer markets to prevail.

In earlier research, we documented a similar trend for housing sales in Toronto, when sales escalated in 2007 in advance of Toronto’s new land transfer tax, which was implemented in February 2008. The additional sales recorded in 2007 meant that fewer sales were realized in 2008. The sales activity returned to the long-term trends in a couple of years. 

And if this was not enough, financial troubles at the alternative mortgage lender, Home Capital, spooked borrowers who were not deemed mortgage worthy by the mainstream Canadian banks. Many real estate professionals believe the cumulative effect of unseasonal weather, tightening of mortgage regulations, and troubles at alternative lenders was likely the reason behind the declining housing sales and prices.

The roof is not collapsing on Toronto’s housing market. The decline in sales and prices is a rational response by homebuyers and sellers who are reacting to the Ontario government’s initiatives to tighten lending in housing markets. The cold, dark, and wet weather certainly did not help either.

Wednesday, September 14, 2016

Data Science 101, now online

We are delighted to note that IBM's Big Data University has launched the quintessential introductory course on data science, aptly named Data Science 101.

The target audience for the course is the uninitiated cohort that is curious about data science and would like to take the baby steps to a career in data and analytics. Needless to say, the course is for absolute beginners.

To get a taste of the course, watch the following video: "What is Data Science?"

Here is the curriculum:

  • Module 1 - Defining Data Science
    • What is data science?
    • There are many paths to data science
    • Any advice for a new data scientist?
    • What is the cloud?
    • "Data Science: The Sexiest Job in the 21st Century"
  • Module 2 - What do data science people do?
    • A day in the life of a data science person
    • R versus Python?
    • Data science tools and technology
    • "Regression"
  • Module 3 - Data Science in Business
    • How should companies get started in data science?
    • R versus Python
    • Tips for recruiting data science people
    • "The Final Deliverable"
  • Module 4 - Use Cases for Data Science
    • Applications for data science
    • "The Report Structure"
  • Module 5 - Data Science People
    • Things data science people say
    • "What Makes Someone a Data Scientist?"
Want to learn more about IBM's Big Data University? Click HERE.

Thursday, September 1, 2016

The X-Factors: Where 0 means 1

Hadley Wickham in a recent blog post mentioned that "Factors have a bad rap in R because they often turn up when you don’t want them." I believe Factors are an even bigger concern. They not only turn up where you don't want them, but they also turn things around when you don't want them to.

Consider the following example where I present a data set with two variables: x and y. I represent age in years as 'y' and gender as a binary (0/1) variable as 'x' where 1 represents males.

I compute the means for the two variables as follows:
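A minimal sketch of this step in R. The data frame `df` and its eleven rows are hypothetical, not the data used in this post; the values were chosen only so that the means come out close to the figures quoted in the next paragraph.

```r
# Hypothetical data (an assumption for illustration, not the author's data set):
# x is gender coded 0/1 with 1 = male, y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)

# Column means: the mean of a 0/1 variable is the share of ones
means <- colMeans(df)
print(round(means, 3))
```

With these made-up values, the mean of x is 5/11 and the mean of y is 480/11.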

The average age is 43.6 years, and 0.454 suggests that 45.4% of the sample comprises males. So far so good. 

Now let's see what happens when I convert x into a factor variable using the following syntax:
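One way such a conversion might look, again assuming a hypothetical data frame `df` with made-up values; only the variable names x, y, and male come from the text.

```r
# Hypothetical data (assumption): x is gender (1 = male), y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)

# factor() attaches the label "female" to the value 0 and "male" to the value 1
df$male <- factor(df$x, levels = c(0, 1), labels = c("female", "male"))

# Frequency count per category
print(table(df$male))
```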

The above code adds a new variable male to the data set, and assigns labels female and male to the categories 0 and 1 respectively.

I compute the average age for males and females as follows:
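A sketch of a group-wise mean using base R's `tapply()`, on the same hypothetical data as an assumption; the post's own listing may have used a different function.

```r
# Hypothetical data (assumption): x is gender (1 = male), y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)
df$male <- factor(df$x, levels = c(0, 1), labels = c("female", "male"))

# Mean of y (age) within each level of the factor
age_by_gender <- tapply(df$y, df$male, mean)
print(age_by_gender)
```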

See what happens when I try to compute the mean for the variable 'male'.
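The behaviour can be reproduced with a short sketch, assuming the same hypothetical data frame. Here the warning that `mean()` raises on a factor is captured with `tryCatch()` so that its message can be printed.

```r
# Hypothetical data (assumption): x is gender (1 = male), y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)
df$male <- factor(df$x, levels = c(0, 1), labels = c("female", "male"))

# mean() does not average a factor: it warns and returns NA.
# Capture the warning text instead of letting it pass silently.
msg <- tryCatch(mean(df$male), warning = function(w) conditionMessage(w))
print(msg)

# With the warning suppressed, the returned value is NA
result <- suppressWarnings(mean(df$male))
print(result)
```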

Once you factor a variable, you can't compute statistics such as mean or standard deviation. To do so, you need to declare the factor variable as numeric. I create a new variable gender that converts the male variable to a numeric one.

I recompute the means below. 
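A sketch of the conversion and the recomputed means, assuming the same hypothetical data frame; `as.numeric()` applied to a factor returns its internal level codes rather than the original values.

```r
# Hypothetical data (assumption): x is gender (1 = male), y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)
df$male <- factor(df$x, levels = c(0, 1), labels = c("female", "male"))

# Convert the factor to a numeric variable: this yields the level codes
df$gender <- as.numeric(df$male)

# Recompute the means for the numeric columns
means <- colMeans(df[, c("x", "y", "gender")])
print(round(means, 3))
```

With these made-up values, the mean of gender is exactly one higher than the mean of x.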

Note that the average for gender is 1.45 and not 0.45. Why? When we created the factor variable, R stored the categories internally as the codes 1 and 2, so converting it back to numeric turned zeros into ones and ones into twos. Let's look at the data set below:
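A sketch that places the three versions of the variable side by side, assuming the same hypothetical data frame:

```r
# Hypothetical data (assumption): x is gender (1 = male), y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)
df$male <- factor(df$x, levels = c(0, 1), labels = c("female", "male"))
df$gender <- as.numeric(df$male)

# One row per distinct combination: original value, factor label, numeric code
mapping <- unique(df[, c("x", "male", "gender")])
print(mapping)
```

Every row shows gender equal to x plus one, because the factor's first level is stored as code 1.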

Several algorithms in R expect a binary variable to be of 0/1 form. If this condition is not satisfied, the command returns an error. For instance, when I try to estimate the logit model with gender as the dependent variable and y as the explanatory variable, R generates the following error:
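A sketch of how the error can be triggered, assuming the same hypothetical data frame and assuming the model was fitted with `glm()` and the binomial family. The error is captured with `tryCatch()` so the script keeps running.

```r
# Hypothetical data (assumption): x is gender (1 = male), y is age in years
df <- data.frame(
  x = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0),
  y = c(34, 51, 42, 29, 60, 47, 38, 55, 41, 36, 47)
)
df$male <- factor(df$x, levels = c(0, 1), labels = c("female", "male"))
df$gender <- as.numeric(df$male)  # values are 1 and 2, not 0 and 1

# A binomial glm requires a numeric response between 0 and 1, so this fails
err <- tryCatch(
  glm(gender ~ y, data = df, family = binomial),
  error = function(e) conditionMessage(e)
)
print(err)

# Subtracting 1 restores the 0/1 coding, and the model can then be estimated
df$gender01 <- df$gender - 1
fit <- glm(gender01 ~ y, data = df, family = binomial)
print(coef(fit))
```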

Factor or no factor, I would prefer my zeros to stay as zeros!

Monday, August 22, 2016

Five Questions about Data Science


Recently, we were able to ask Murtaza Haider five questions about the new book from IBM Press called “Getting Started with Data Science: Making Sense of Data with Analytics.” Below, the author talks about the benefits of data science in today’s professional world.

Getting Started with Data Science

1. What are some examples of data science altering or impacting traditional professional roles already?

Only a few years ago, there was no job with the title Chief Data Scientist. But that was then. Small and large corporations, and increasingly government agencies, are putting together teams of data scientists and analysts under the leadership of chief data scientists. Even the White House has a Chief Data Scientist position, currently held by Dr. DJ Patil.

The traditional role for those who analyzed data was that of a computer programmer or a statistician. In the past, firms collected large amounts of data to archive rather than to subject it to analytics to assist with smart decision-making. Companies did not see value in turning data into insights and instead relied on the gut feeling of managers and anecdotal evidence to make decisions.

Big data and analytics have alerted businesses and governments to the latent potential of turning bits and bytes into profits. To enable this transformation, hundreds of thousands of data scientists and analysts are needed. Recent reports suggest that the shortage of such professionals will be in the millions. No wonder we see hundreds of postings for data scientists on LinkedIn.

As businesses increasingly depend upon analytics-driven decision making, data scientists and analysts are simultaneously becoming front-office superstars, which is quite a change from them being the back office workers in the past.

2. What steps can a professional take today to learn how and why to implement data science into their current role?

Sooner rather than later, workers will find their managers asking them to assume additional responsibilities that involve dealing with data, and either generating or consuming analytics. Smart professionals who are uninitiated in data science would therefore proactively address this shortcoming in their portfolio by acquiring skills in data science and analytics. Fortunately, in a world awash with data, the opportunities to acquire analytic skills are also ubiquitous.

For starters, professionals should consider enrolling in open online courses offered by the likes of Coursera. These platforms offer a wide variety of training opportunities for beginners and advanced users of data and analytics. At the same time, most of these offerings are free.

For those professionals who would like to pursue a more structured approach, I suggest that they consider continuing education programs offered by the local universities focusing on data and analytics. While working full-time, the professionals can take part-time courses in data science to fill the gap in their learning and be ready to embrace impending change in their roles.

3. Do you need programming experience to get started in data? What kind of methods and techniques can you utilize in a program more commonly used, such as Excel?

Computer programming skills are a definite plus for data scientists, but they are certainly not a limiting factor that would prevent those trained in other disciplines from joining the world of data scientists. In my book, Getting Started with Data Science, I mentioned examples of individuals who took short courses in data science and programming after graduating from non-empirical disciplines, and subsequently were hired in data scientist roles that paid lucrative salaries.

The choice of analytics tools depends largely on the discipline and the type of organization you are currently working for or intend to work for in the future. If you intend to work for corporations that generate real big data, such as telecom and Internet-based establishments, you need to be proficient in big data tools, such as Spark and Hadoop. If you would like to be employed in the industry that tracks social media, you would require skills in natural language processing and proficiency in languages such as Python. If you happen to be interested in a traditional market research firm, you need proficiency in analytics software, such as SPSS and R.

If your focus is on small and medium size enterprises, proficiency in Excel could be a great asset, which would allow you to deploy its analytics capabilities, such as Pivot Tables, to work with small sized data.

A successful data scientist is one who knows some programming, has a basic understanding of statistical principles, possesses a curious mind, and is capable of telling great stories. I argue that without the storytelling capabilities, a data scientist will be limited in his or her abilities to become a leader in the field.

4. How do you see data science affecting education and training moving forward? What benefits will it bring to learning at all levels?

Schools, colleges, universities and others involved in education and learning are putting big data and analytics to good use. Universities are crunching large amounts of data to determine what gaps in learning at the high school level act as impediments to success in the future. Schools are improving not just curriculum, but also other strategies to improve learning outcomes. For instance, research in India using large amounts of data showed that when children in low-income communities were offered free meals at school, their dropout rates declined and their academic achievements improved.

Big data and analytics provide instructors and administrators the opportunity to test their hypotheses about what works and what doesn’t in learning, and replace anecdotes with hard evidence to improve pedagogy and learning. Learning has taken a new shape and form with open online courses in all disciplines. These transformative changes in learning have been enabled by advances in information and communication technologies, and the ability to store massive amounts of data.

5. Do you think that modern governments and societies are prepared for the changes that big data and data science might bring to the world?

Change is inevitable. Whether or not modern governments and societies like it, they will have to embrace change. Fortunately, smart governments and societies have already embraced data-driven decision-making and evidence-based planning. Governments in developing countries are already using data and analytics to devise effective poverty-reducing strategies. Municipal governments in developed economies are using data and advanced analytics to find solutions to traffic congestion. Research in health and well-being is leveraging big data to discover new medicines and cures for illnesses that challenge us all.

As societies embrace data and analytics as tools to engineer prosperity and well-being, our collective abilities to achieve a better tomorrow will be further enhanced.

Wednesday, August 10, 2016

So you want to be a data scientist

The New York Times made it look so easy. Take a few courses in data science and a web-based startup will readily pay top dollar for your newly acquired skills.

Since the McKinsey Global Institute reported on the impending shortage of data crunchers, wannabe data scientists have been searching for learning opportunities in big data analytics. Newspaper coverage suggests that even with limited previous exposure to empirics, one may enroll in MOOCs or join programming boot camps to establish one's bona fides.

In a recent blog post, Meta S. Brown, the author of Data Mining for Dummies, gave four reasons not to get an advanced degree in data science. I, on the other hand, believe that a structured learning environment is exactly what many need to enable the career change they have contemplated for years but have not acted on.

It all depends upon what kind of a learner you are. If you are a disciplined, self-motivated, self-actuated individual, you can pick up the skills by attending MOOCs or participating in coding boot camps.

But if you are like the rest of us, who once enrolled in a free online course, but didn't complete it, you need some structure. A degree or a certificate in data science or business analytics is exactly what you need to upgrade your skills and be part of the network that will help you reorient your career.

In my book, Getting Started with Data Science, I mentioned Paul Minton, who was making $20,000 serving tables in New York. However, a three-month programming course at the Zipfian Academy turned his life around. He earned over $100,000 in 2014 as a data scientist for a web startup in San Francisco. "Six figures, right off the bat ... To me, it was astonishing," he told The New York Times.

When aspiring data scientists think of a career in the 'glamorous' world of big data and analytics, they think of Mr. Minton. His story, though a bit Cinderella-ish, is true, but rare. Not everyone should expect a similar outcome. In addition to good fortune, Mr. Minton had majored in math in his undergraduate training, and we all know that math helps. It would be unwise, however, to assume that with almost no empirical background, one can master the complex world of data and algorithms in a matter of a few weeks and be gainfully employed.

While speaking at meet-ups organized by IBM's BigDataUniversity, I encounter dozens of enthusiasts who are keen to start training in data science but do not know where to begin. I advise them to build on their core competencies and domain knowledge. For instance, if you studied journalism or creative writing as an undergraduate, you might want to learn how to analyze socioeconomic data instead of trying to set up Hadoop clusters, a big data task best left to computer scientists and engineers.

If you are a disciplined learner, you can explore data science training offered as MOOCs. Coursera, one of the largest MOOC platforms, listed several data science courses among the top 10 most popular courses in 2015. IBM's Big Data University (BDU) is another platform dedicated to promoting training in data science and analytics. Not only does BDU offer similar resources for online learning as other platforms, it also offers cloud-based resources for hands-on training through the Data Scientist Workbench.

The Workbench provides state-of-the-art computing solutions for regular-sized data. These include R, Python, and OpenRefine. To wrangle big data, the Workbench offers Hadoop and Spark-based solutions. Such coupling of computing infrastructure with online learning resources frees new learners from concerns about installing and maintaining software and clustering hardware.

Learners who would prefer a structured learning environment also have several options. They can register for courses or certificates offered by universities' continuing education faculties, enroll in an online graduate degree in data science, or take a more traditional approach of enrolling in a full- or part-time Master's program.

A good place to search for learning opportunities is the KDNuggets website that maintains detailed lists of post-graduate programs in data science including full-time, part-time, and online masters and other certifications.

Once you have earned some credentials, you still have to prove your worth to future employers. If you are making a switch from another career, your experience may not be of much use in your pursuits in the data-centric world. My advice to the novice data scientists lacking experience is to ask the potential employer not necessarily for a job, but instead for a data set and a puzzle. If you can solve a data-oriented problem for a firm as part of the vetting process, you can overcome the shortcomings in your résumé.

Those who are still on the fence about whether to take the plunge into the world of big data and analytics should know that the demand for data scientists far exceeds the capacity of universities and colleges to produce them. This is unlikely to change soon. Act now and embrace data.

Wednesday, July 27, 2016

Book Review: Getting Started With Data Science

I Programmer's Kay Ewbank reviews Getting Started with Data Science: Making Sense of Data with Analytics.

By Kay Ewbank

If you've enjoyed books such as Freakonomics or Outliers, you'll feel at home reading this book as it uses a similar approach; take an interesting question such as 'Does the higher price of cigarettes deter smoking?', and use that as the basis for some data analysis.

The aim is to teach you how to do your own analyses. Haider works through the examples in R, Stata, SPSS and SAS. Within the book the examples are worked mainly in R, and one of the other languages. The code for the other languages is available for download from the IBM Press website, along with details of how to use it. 

The book opens with a chapter called 'the bazaar of storytellers' that discusses what data science is and gives the author's definition of a data scientist. The next chapter, data in the 24/7 connected world, identifies sources of data that you can analyse, and also introduces the concept of big data. Chapter three looks at how data becomes meaningful when it is used as the basis for 'stories'. Haider's view is that the strength of data science lies in the power of the narrative, and that is what underpins most of the book.

From a practical perspective, the book begins to get useful in chapter four,  which looks at how you can generate summary tables, including multi-dimensional tables. Next is a chapter on graphics and how to generate them. If you're thinking that it seems a bit odd to concentrate on the 'end result' first, you have to remember that the author's view is that data analysis is only useful if your audience actually looks at the results and understands them.

The next chapter gets more into the workings of data analysis with an examination of hypothesis testing using techniques such as t-tests and correlation analysis. Regression analysis is looked at next, based on the notion of "why tall parents don't have even taller children". This is a fun chapter, with examples including consumer spending on food and alcohol, housing markets, and whether the appearance of teachers affects their evaluations by students.

A chapter on analysis of binary variables considers logit and probit models using data from New York transit use. Categorical data and multinomial variables are the topic of the next chapter, which expands on the ideas of logit models.

Spatial data analysis is covered next, taking us into the use of GIS systems and how these have expanded the options for data analysis. There's a good chapter on time series analysis looking at how regression models can be used with time series data, using the examples of forecasting housing markets.

The final chapter introduces the field of data mining. It's more of a taster discussing some of the techniques that can be used, but fun anyway.

Overall, this is a book that is accessible, interesting and still manages to introduce the statistical techniques you need to use for real data analytical work. A good way to get into data analysis. 
