What is an average score? Normally, when the scale is from 1 to 5, the average score would be 3.
I think there is quite a bit of evidence that that is not how humans use rating scales. Typically, scores below the midpoint are off limits, and so "average" is actually halfway between the midpoint and the top. The midpoint then actually means bad, and less than the midpoint is abjectly terrible. This ranges from most education assessments (sub-50% is a fail, and the average is usually in the 70s) to wine (Cuisine doesn't give out below 3 stars; Robert Parker doesn't go below 50).
But these conventions make it easy for people to break them to make a point, with the very negative consequences you've outlined. Perhaps Uber should discard all ratings under 3, or use them in a qualitatively different fashion*. Also, Uber is US-based, and there is plenty of evidence that people from the US skew much further to the top of the scale, which will shape their expectations.
*The geek in me thinks that one way to better deal with this data would be to have a random effect for passenger in the model (so that each rating a driver receives is adjusted for that passenger's mean rating and distribution). Actually, there is a whole lot of quite sensible stuff you could do with this, though you'd have to be careful you didn't model out real variation (e.g., discounting negative feedback from a passenger who'd found a driver racist/sexist, when the driver otherwise had excellent feedback).
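A minimal sketch of the idea, with invented data (a real version would fit a proper mixed-effects model, e.g. in lme4 or statsmodels): centre each rating on the passenger's own mean, so that a harsh rater's 3 and a generous rater's 5 can carry the same information about the driver.

```python
# Toy sketch of adjusting driver ratings for passenger leniency -- the
# invented data and simple mean-centring stand in for a real random-effects model.
from collections import defaultdict

ratings = [  # (passenger, driver, stars) -- invented
    ("p1", "d1", 5), ("p1", "d2", 5), ("p1", "d3", 4),
    ("p2", "d1", 3), ("p2", "d2", 4), ("p2", "d3", 2),
]

# Each passenger's mean rating: their personal anchor point on the scale
by_passenger = defaultdict(list)
for p, _, s in ratings:
    by_passenger[p].append(s)
passenger_mean = {p: sum(v) / len(v) for p, v in by_passenger.items()}

# A driver's adjusted score: mean deviation from each rater's own average
by_driver = defaultdict(list)
for p, d, s in ratings:
    by_driver[d].append(s - passenger_mean[p])
adjusted = {d: sum(v) / len(v) for d, v in by_driver.items()}
```

Under this adjustment, a 4 from a tough rater counts for more than a 4 from someone who hands out 5s by default, which is exactly the leniency effect a passenger random effect would soak up.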
While I do love the Uber experience, a lot of things I've read (thanks for that post from the Guardian) make me lukewarm about corporate Uber.
And then there's the question of whether Uber themselves may be disrupted. Perth is trialling small self-driving electric buses, and the company that builds them already has the app technology to have them come to you, rather than you going to a conventional bus stop, as long as you are more or less on the route.
I have always avoided taking taxis as much as possible, except when travelling, and even then as an emergency measure — I have even taken an intercity taxi ride, after missing a public transport connection which would have meant missing a plane flight. And Lisbon. Lisbon taxis were ubiquitous and cheap, and it really was just easy to catch a cab when you needed one. I guess the public transport there was so good, that to be even vaguely competitive, they couldn't be particularly expensive.
But Uber rather changed that. My first Uber experience was a bit of a fail. I had airport wifi at LAX, but failed to answer the driver's call, so I missed out on a ride in a Lincoln Town Car. But the van that I ordered next didn't call, and after I got my US sim card fired up, it was just so easy, and if you are travelling with others, it starts to become cheaper than public transport. It took a while for me to Uber in NZ, but I was in Auckland during a bus strike earlier in the year, so my bus-based plans were foiled. And I have to say that my two Uber rides were easily better and more pleasant than pretty much any NZ taxi ride I've ever had. And my two drivers were: a taxi driver, and a striking bus driver! I don't imagine I will be a super frequent Uber-er in NZ, but the transparent pricing, good chat, and painless payment/receipts mean there is so much to like as a rider.
A few thoughts: one of the things that seems clear here is that there are quite clear disciplinary expectations on how things should be done (including some I'm inclined to flip on myself, such as reverse scoring).
Reverse scoring — totally a classic move for the reasons previously outlined (agreement bias, checking people are awake, etc.), but it also disadvantages those with lower English proficiency, and reverse-scored items still tend to cluster together as their own factor.
Having a midpoint — I'm strongly in favour of having a midpoint (unless you're doing Rasch; see later). Forcing people to have an opinion where they are actually neutral can lead, in paper surveys, to people circling between the two middle points (creating their own midpoint), and, in electronic surveys, to people abandoning the survey.
And we haven't really talked about clustering, which is less relevant to the main argument under discussion here, but are there really genuine clusters, or is the data pretty evenly scattered? If the latter, plotting in 2D or 3D space would make more sense than applying inaccurate, arbitrary boundaries.
So back to Rasch. As the NZAVS statement alludes, using adaptive Rasch or Mokken scaling would have been a great way to test this domain of interest. In essence, you start out with a statement at the midpoint of the scale, which is relatively uncontentious (e.g., Cultural diversity enriches New Zealand society). If you agree with that statement, you start to be shown progressively more, well, progressive statements (e.g., New Zealanders need to do more to make up for past treatment of Māori). Whereas if you disagree with the statement, you start to see statements heading in the other direction (e.g., Claims about Māori discrimination are exaggerated), perhaps culminating in the "special treatment" question. This confers a couple of great benefits: (1) people don't have to answer as many questions, and (2) people are less likely to see statements they don't agree with. This also swings both ways. It's not surprising that the special treatment question rankles with many, but there is also real potential for non-completion with questions like Claims about Māori discrimination are exaggerated. I understand that the reason the census ethnicity question reverted to using "New Zealand European" is the backlash/write-in responses when "Pākehā" was used (c. 2001).
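A toy sketch of that branching logic, using the three statements quoted above with invented position values (not real NZAVS calibrations): show the unasked statement nearest the respondent's current position, then step towards the "progressive" end after agreement and away after disagreement. Real Rasch-based adaptive testing would re-estimate the respondent's position after each answer; the fixed step here is a deliberate simplification.

```python
# Heavily simplified adaptive item selection: positions are invented,
# and a fixed step stands in for a proper Rasch ability re-estimate.
items = {  # statement -> invented position on the attitude dimension
    "Cultural diversity enriches New Zealand society.": 0.0,
    "New Zealanders need to do more to make up for past treatment of Māori.": 1.0,
    "Claims about Māori discrimination are exaggerated.": -1.0,
}

def simulate(responses, pool, start=0.0, step=1.0):
    """Show the unasked statement closest to the current position;
    move up the scale on agreement, down on disagreement."""
    pos, shown, remaining = start, [], dict(pool)
    for agreed in responses:
        if not remaining:
            break
        stmt = min(remaining, key=lambda s: abs(remaining[s] - pos))
        shown.append(stmt)
        del remaining[stmt]
        pos += step if agreed else -step
    return shown, pos
```

An agree-agree respondent climbs towards the progressive statements and never sees the contentious one; a disagree-disagree respondent heads the other way, which is benefit (2) above in action.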
Finally, in my nosing about today, I found that the special treatment question could be a local adaptation of "Irish, Italian, Jewish, and many other minorities overcame prejudice and worked their way up. Blacks should do the same without any special favors." (Symbolic Racism Scale; Henry & Sears, 2002)
I’m still taking it in, but it looks like something very weird and wrong has gone on here.
On a preliminary look, I’d say they took the data from the bigger set of questions (used factor analysis, as Cliff from Vox suggested) and worked out the smallest number of questions they could ask that yielded the same information.
While this process is designed to decrease the boredom factor and make the scale shorter, removing the positively worded questions could have changed people’s reactions to the scale, e.g.:
- It should be compulsory to teach Māori language in school.
- The government should compensate Māori for past injustices.
- Māori have fewer opportunities in life than do other New Zealanders.
From a statistical “information” point of view, these questions are probably strongly (negatively) associated with the punch in the guts question, so they were removed to make it shorter. However, without those questions, the context for the questions changes.
On a related nerdy point, “Māori should not receive any special treatment.” is almost certainly reverse scored (and certainly the reverse of the other examples I’ve given above that were removed), and there is a trend away from using reverse scored items.
Aside: Coal tar is still on the WHO list of essential medicines, and is also the original source of the ubiquitous paracetamol.
Kudos to your mum.
It's just as well we're building cycleways, because those things look bloody difficult to ride in traffic.
Having spent a month cycling round Santa Monica and Venice on a less exaggerated cruiser, I can tell you that they're much better than they appear. The super upright laid-back position is great in traffic, and equally great on the boardwalk or the firm intertidal sand. The relative lack of gears was a bit annoying on the hills, but I rather loved it.
On a side-note, that whole area is surprisingly fabulous for cycling. A lot of on-street cycle lanes (typically one street across from the major thoroughfares like Santa Monica Boulevard), the boardwalk along the beach (I think the only separated facility), and a really high level of respect from motorists — apparently in part from fear of legal liability if they were to collect a cyclist. Even a friend who would never cycle in NZ was really happy to cycle there, sans helmet...
CODA - Someone usefully pointed out that I haven't accounted for couples in my analysis, so you could probably reduce those estimates, but I also haven't accounted for investor activity, which will push them back out a bit again.
I realise I forgot to spell the final point out. If your surname doesn't sound Chinese, then it would appear that there is 1 sale per home owner every 30 years*. Which is really freaking low.
* There's quite a bit of kludge in this, but it doesn't seem worse than the original analysis.
Don't ask me why I'm wading into this.
Firstly, on the ethnicity measure. Yes, it's not very robust, and there will be plenty of false positives, and more than a few false negatives. But it's not totally terrible. I've seen it in one published paper (on medical school admissions, where they had no ethnicity data; from memory, mostly to separate "South Asian" in the UK). And more broadly, the idea of using a proxy measure in the absence of good data is pretty normal.
Secondly, any way that you present data has strengths and weaknesses. Yes, the percentages portray it one way, and looking at the absolute numbers portrays it another. But Keith also indulges in some sleight of hand. Yes, it's entirely possible for 126000 people to buy 3500 houses. But that does also look very high when, at the same time, 1274000 people buy just 5318 houses. But see my final conclusion below.
Thirdly, fewer houses do sell in winter. I could only find 2013 data, but it looks like approx 26000 houses sold that year. Assuming the ethnicity breakdown were to approximately hold for a full year, and that the agency was reasonably representative, that would be 10270/126000 and 15370/1274000: 8.2 sales per 100 people versus 1.2 sales per 100 people, or 6.75 times more sales per hundred for people with Chinese-sounding surnames. That's quite a big difference, and likely to be a combination of multiple factors.
Fourth, in Auckland, the going rate of home ownership was 61% at the last census. If that rate were to apply to people with Chinese-sounding surnames, it would equate to, on average, 1 house sale per home owner every 8 years. That is not an outrageous rate of sales. If the home ownership rate for the census "Asian" category (35%) applies to Chinese-sounding surnames, then it's a sale every four and a bit years. Still not that outrageous.
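For anyone who wants to check the envelope, the arithmetic above, using only figures already quoted in the text, works out like this:

```python
# Back-of-envelope check of the sales-rate comparison above.
chinese_sales, chinese_pop = 10270, 126000   # annualised sales, population
other_sales, other_pop = 15370, 1274000

rate_chinese = 100 * chinese_sales / chinese_pop   # ~8.2 sales per 100 people
rate_other = 100 * other_sales / other_pop         # ~1.2 sales per 100 people
ratio = rate_chinese / rate_other                  # ~6.75x

# Years between sales per home owner, under two candidate ownership rates
years_at_61pct = 0.61 * chinese_pop / chinese_sales  # ~7.5, i.e. roughly every 8 years
years_at_35pct = 0.35 * chinese_pop / chinese_sales  # ~4.3, i.e. four and a bit years
```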
Surprising New Conclusion: House sales to Chinese sounding surnames only seem high against amazingly low rate of house sales among people that don't have Chinese sounding surnames.