Forecasting with Friends

By John Mauldin

“The best way to predict the future is to create it.” 

– Peter Drucker 

“Do not dwell in the past, do not dream of the future. Concentrate the mind on the present moment.” 

– Buddha

Forecasting the New Year is a curious tradition. Much evidence suggests that no one does it both accurately and consistently, yet everyone keeps trying. Why is this?

In some ways, I think it’s just entertainment. You might compare our prognosticating to what happens with NFL football. For weeks before the Super Bowl, we devoted fans will spend hours speculating on the game’s every detail.
We’ll dissect the rosters, talk about each team’s strengths and weaknesses, debate game plans, and so on. Is any of this ritual necessary or useful? No, but it extends the experience and we enjoy it.

Economic and market forecasting are similarly pointless fun if you don’t try to turn your forecast into next year’s trade list. Just as no war plan survives contact with the enemy, no investment plan survives contact with February. The real value of an annual forecast is strategic. It helps you set priorities, define important issues, and think about what you should anticipate and what you can safely ignore. That’s a good exercise to go through periodically, and January is as good a time as any.

I gave you my own thoughts last week (see “Skeptically Optimistic”). Today we’ll review several other forecasts from people who deserve your attention. Of necessity, I must leave out some good ones, but I think the ones I cover will give you plenty of useful information.

As I noted last week, this particular upcoming year is unusual due to unfolding political events. The next three months or so will tell us a lot more about how US fiscal and monetary policy will change. The last eight months of 2017 are highly dependent on what happens in the first four. We could easily be in a whole different environment by May, and it could be a better one, or it could be worse.

In that regard, my Strategic Investment Conference is perfectly timed for May 22–25 in Orlando. By then we will know much more about changes in US tax policy, stimulus spending, and Federal Reserve and Supreme Court appointments. We’ll have more visibility on Italy’s bank problems, and we’ll know who won the French election. SIC will be our chance to review and update what we’re thinking. I hope you’ll join me and my all-star guests. It’s going to be a fantastic event. Click here to learn more.

And now let’s plunge into the new year with Gavekal, David Rosenberg, Christopher Wood, and Bank Credit Analyst.
Gavekal: A Spectrum of Views
The Gavekal team, led by co-founders Charles Gave, Louis-Vincent Gave, and Anatole Kaletsky, have a knack for asking the right questions. Sometimes that’s half the battle. Having asked the right questions, they frequently disagree on the answers. But unlike many research firms, they aren’t afraid to reveal their differences. The resulting conversations are inevitably fascinating and informative.

This month Gavekal published “Our Top 12 Questions for 2017.” I’ll give you the full list first, then we’ll zero in on their most important answers.
Global Issues
  1. Will the US dollar continue its strong rally?
  2. Will US bond yields move permanently above 3%?
  3. Will the eurozone succumb to an existential crisis?
  4. Will capital outflows trigger financial panic in China?
  5. Will the oil price end 2017 above US $55?
Regional Markets
  6. US: Will tax reform push up the dollar or improve the US trade balance?
  7. US: Will bonds outperform equities?
  8. China: Are A-shares poised for a rally?
  9. Europe: Will Britain face recession and sterling fall more?
  10. Europe: Will EU equities finally outperform?
  11. Emerging markets: Will a dollar squeeze cause financial crisis?
  12. Emerging markets: Will Indian growth recover from demonetization?
These are all excellent questions. I think the USD was a good way to start the list, too. So many other things hinge on what happens to the dollar this year. Charles and Anatole both think the dollar rally will continue, though Anatole thinks it will not strengthen against the euro. Louis – who will speak at SIC, by the way – has a more nuanced opinion, which I’ll quote in full.

Short answer: It all depends on what happens to return on invested capital. Whether the US dollar rises or falls will be the primary driver of performance for almost any asset class in 2017. And behind that question of the dollar lies the broader outlook for the US economy; specifically, can President Trump manage to raise ROIC in the US? The answer to that question sets up the following decision tree:

If no, then the dollar falls back, US equities underperform and emerging market debt outperforms big-time.

If yes, then we have to ask why US ROIC is rising:

Is it at the expense of ROIC outside the US (i.e. protectionism)? If so, then the concern is that we are seeing a rearrangement of the post-World War II world order. In this case, the US will no longer be willing to provide excess liquidity when needed through a widening current account deficit. Investors should start worrying about a global depression, and consider buying long-dated treasuries pretty soon.

Is it through tax cuts and deregulation? If so, then real rates should rise, and the whole world will grow faster. In this environment, sell all assets that have done well from financial engineering, scarcity assets (gold, art collectibles), and private equity; and buy cyclicals and financials everywhere.

So far, the market is clearly pricing in this latter scenario.
I agree with this logic, and I think the full answer is still pending. In my view, people are focusing too much on President-elect Trump’s rhetoric and not enough on his actions. Yes, he intends to demand better trade terms from other countries, but I think his threats are mostly a negotiating tactic. He has placed fellow deal-maker Wilbur Ross in charge of trade negotiations because he wants to make deals. They will favor the US more than current arrangements do. However, Trump and Ross don’t expect to get everything they demand. They will make deals that promote US employment without rearranging the postwar order, as Louis calls it.

Tax cuts and deregulation are a bigger question mark, mainly because they must go through Congress. We are already seeing substantial division on the Republican side on other issues. The sausage making could easily end somewhere far short of what markets currently expect. We will get an update from Louis in May.
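
For readers who think in code, Louis's decision tree can be written down in a few lines of Python. This is only a sketch of the logic quoted above: the names `RoicDriver` and `gave_decision_tree` are illustrative labels, not Gavekal's, and the outcomes are paraphrased from Louis's text. It is a way to keep the branches straight, not a trading model.

```python
from enum import Enum
from typing import List


class RoicDriver(Enum):
    """Hypothetical labels for the three branches in Louis's answer."""
    NO_RISE = "US ROIC does not rise"
    PROTECTIONISM = "US ROIC rises at the expense of ROIC abroad"
    TAX_CUTS_AND_DEREGULATION = "US ROIC rises through tax cuts and deregulation"


def gave_decision_tree(driver: RoicDriver) -> List[str]:
    """Return the implications Louis attaches to each branch (paraphrased)."""
    if driver is RoicDriver.NO_RISE:
        return [
            "dollar falls back",
            "US equities underperform",
            "emerging market debt outperforms",
        ]
    if driver is RoicDriver.PROTECTIONISM:
        return [
            "US less willing to supply liquidity via its current account deficit",
            "worry about a global depression",
            "consider buying long-dated treasuries",
        ]
    # Remaining branch: tax cuts and deregulation.
    return [
        "real rates rise and the whole world grows faster",
        "sell financial-engineering winners, scarcity assets, private equity",
        "buy cyclicals and financials everywhere",
    ]


if __name__ == "__main__":
    for driver in RoicDriver:
        print(driver.value)
        for implication in gave_decision_tree(driver):
            print("  -", implication)
```

Louis's observation that the market is pricing in the latter scenario corresponds to the `TAX_CUTS_AND_DEREGULATION` branch.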
David Rosenberg: Return to Disinflation
The furious post-election stock rally/bond crash leveled out in late December but has not yet reversed. Conventional wisdom for much of the financial industry is that tax cuts, deregulation, and fiscal stimulus will change everything this year, particularly for banks and energy companies. Proponents of this view think the US economy has plenty of pent-up demand and is ready to grow – enough so that we may actually see a little inflation for a change.
David Rosenberg of Gluskin Sheff disagrees. Back in mid-December, right after the Fed gave us that tiny rate hike, he headlined his daily letter:
My out-of-the-box call for 2017: Trump accidentally engineers a return to the disinflation trade.
Rosie doesn’t think inflation is coming back or that the economy is ready to soar, notwithstanding the new management in Washington. He thinks the widespread optimism among investors and business owners is no reason to change your plans.
Here’s Rosie:

The markets are indeed forward-looking, but this latest leg of the risk rally has a certain speculative feel to it.

Now, some full disclosure. I actually find it senseless to provide a forecast for the entire year ahead at this time.

We are not in normal, more stable time periods.

We have been in a heightened state of volatility, and that will intensify in 2017 because of the political dynamics in the U.S. as well as in Europe. We have a president who tweets the first thing that comes to his head, has appointed a cabinet filled with billionaires even though it was rural blue-collar voters that pushed him over the top, and every pro-growth promise was met with an anti-growth measure.

We go into the New Year with investor optimism and equity market valuations running at extremely high levels, so initially the risk is that disappointment sets in, but that may not happen until we are well into 2017.

I will go on record to say that sentiment and market positioning are so radically negative on Treasuries that it wouldn’t take much to elicit a countertrend bond market rally. We are way oversold here.

The economy isn’t that strong, and anyone who thinks one man can reverse, on his own, the structural forces that led to the multi-year disinflation trend — and I’m talking about excessive debt, globalization, aging demographics, and technology — needs to go back to economics school right away.

I think it is very dangerous to be basing investment decisions on expectations of government policy. What is done and when it is done is far too uncertain, and uncertainty is inherently difficult to price.
I have to agree with some of this. Rosenberg is right that the structural forces favoring disinflation/deflation haven’t changed. A different trade policy is not going to restore all the jobs our Rust Belt states lost. Eighty percent of the jobs that have disappeared in the Rust Belt were lost to technological shifts, not to offshoring. Please note the difference between offshoring and completely new manufacturing businesses being set up in China. Those jobs were never here; they first appeared in China (or pick a country).
The irony is that Apple may indeed start making iPhone 8s or 9s in the US at some point in the not-too-distant future, but they will do so on robotic assembly lines and with nowhere near the number of jobs created at Foxconn for the first iPhones made in China.
In fact, Foxconn is now installing robots because they are cheaper than Chinese labor. Can you see a trend here?

Demographics are what they are, too. We Baby Boomers will keep getting older, some of us retiring but others, by choice or not, staying in the labor force and not giving younger workers a chance to take our places. Meanwhile China and Japan will face steadily worsening labor shortages. Chinese businesses are trying to move up the economic food chain, away from the dependence on cheap labor. That trend will further advance automation technology, which will then find its way to the US.

I think we will get tax cuts, but they may or may not stimulate growth. Details are critical. And plenty of other priorities could get in the way of deregulation efforts.

I wish I could prove Rosie wrong, but that’s a feat I have rarely accomplished in many years of trying. He’ll be at SIC, too, so we’ll see what he thinks in May.
Christopher Wood: Inflection Points and Border Taxes
Chris Wood of CLSA has a marvelous newsletter called, aptly, Greed & Fear. He makes the words even more chilling by artfully casing them as GREED & fear. They become his persona as he writes. He began his January 5 issue talking about bond yields possibly bottoming out.

For perspective, he starts with this long-term view of the 10-year US Treasury yield.
This multi-decade downtrend encompasses the careers of virtually everyone in the financial industry today. Only those of us now in our 60s can remember seeing double-digit long bond yields as adults. I promise you, they were not fun; but the subsequent decline to today’s sub-3% yields certainly has been. Are we at the end of the line? Here’s Chris:

An inflection point may have been reached in world financial markets, or at least such is what market participants began to think in the final weeks of 2016. The inflection point referred to is the postulated end to the 35-year-old bull market in Treasury bonds, and the related decline in interest rates that went with it. Remember that the 10-year Treasury bond yield peaked at 15.84% in 1981 and hit a low of 1.36% in July of last year (see Figure 1).

Such a view… also assumes the end of lower for longer, clearly assumes that deflationary pressures have also peaked and that inflationary pressures are set to return. It also assumes that there will be no return to unconventional monetary policies in America even as these policies continue for now in Japan and the Eurozone….

Now it is true that there have been countless wrong predictions made before of the end of the bull market in Treasury bonds. Indeed, in every year since the post-financial-crisis recovery began in 2009, the vast majority of economists and strategists have wrongly predicted higher bond yields in the year ahead, only to be forced to change that view later.

This possible inflection point in bond yields, if that’s what it is, would seem to be the result of another inflection point: the impending Trump presidency. Chris is not convinced:

It is of course this perceived inflection point in policy, and the related hopes for the return of animal spirits to the American economy, which has been the key trigger for the dramatic sell-off in the American bond market and almost equally dramatic rally in the US dollar since Donald Trump’s victory. But from a Federal Reserve perspective these market moves have already served to tighten financial conditions significantly.

Meanwhile, the current establishment consensus, most particularly in Washington, is that monetary policy is increasingly impotent and that the “heavy lifting” should now be done by fiscal policy. This explains why just about the only issue which Donald Trump and Hillary Clinton agreed on during last year’s presidential campaign was infrastructure stimulus. Still, if this is the context, it must also be admitted that Trump’s fiscal easing plans are much more aggressive than what most neo-Keynesian establishment policymakers would be comfortable with, most particularly his proposed aggressive tax cuts.

While GREED & fear will be the first to admit that tax cuts and deregulation are clearly positive triggers for animal spirits and growth, extreme skepticism is warranted on the hopes currently being invested in the powers of fiscal easing and related infrastructure spending even if it is assumed that a Republican-controlled Congress, containing many fiscal conservatives, is really willing to sign up to The Donald’s spending plans.
I agree, too, that the jury is still out on the infrastructure spending issue. Chris goes on to discuss potential trade policy changes:
Meanwhile, the much-discussed protectionist threat represented by the President-elect remains harder to call. GREED & fear’s view is that the most-substantive proposal out there is the draft legislation currently before the House of Representatives discussed here in some detail last month. While not proposing outright tariffs, this draft legislation, which some are calling a “border tax,” has in some respects the same practical effect, as it would remove the tax deductibility against corporate tax of imports, while profits would no longer be taxed at the US rate for exports. Thus, the proposal would tax US imports at the corporate income tax rate, while exempting income earned from exports from any US taxation.

There is apparently huge lobbying currently going on against this legislation by corporate America, which is not surprising given annual imports in America are US $2.7tn (see Figure 11) and given the money that has been invested by American companies in their supply chains in the era of globalisation, be it in Mexico, Asia or elsewhere. Still this week’s news of Ford abandoning plans for a US $1.6bn Mexican plant in favour of Michigan is a sign of the direction in which the political winds are blowing. Thus, Ford said on Tuesday that it is cancelling plans for a new US $1.6bn plant in Mexico and investing US $700m in its Flat Rock, Michigan, plant’s expansion.
It is also worth reiterating that this proposed Republican legislation is in part sponsored by Republican House Speaker Paul Ryan.
To extend Chris’s last point, I hear many critics say Trump can’t win by going after companies one at a time. But he doesn’t need to. Making examples of a few high-profile companies already sends a message to others. No CEO wants to be the subject of a Trump tweet. I expect such tweets to slow and eventually stop as corporate leaders refocus their expansion plans on domestic production.

This shift will create vulnerabilities in export-heavy economies. For that reason, Chris favors “domestic demand” stories, i.e., countries large enough and wealthy enough to sustain growth within their own borders. That’s a small list, of course.
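
To see why importers are lobbying so hard, it helps to run the arithmetic Chris describes. The sketch below compares a conventional corporate income tax with the border-adjusted version as he summarizes it (import costs no longer deductible, export revenue exempt). The 20% rate, the two sample firms, and the function names are hypothetical choices for illustration. The draft legislation's actual rate and details are not given here, and the sketch ignores loss carryforwards and any offsetting currency move.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    """Hypothetical firm; all amounts are in the same currency units."""
    domestic_revenue: float
    export_revenue: float
    domestic_costs: float
    import_costs: float


def conventional_tax(firm: Firm, rate: float) -> float:
    """Tax all revenue minus all costs, floored at zero."""
    profit = (firm.domestic_revenue + firm.export_revenue
              - firm.domestic_costs - firm.import_costs)
    return max(0.0, profit) * rate


def border_adjusted_tax(firm: Firm, rate: float) -> float:
    """Exempt export revenue and disallow the deduction for import costs."""
    base = firm.domestic_revenue - firm.domestic_costs
    return max(0.0, base) * rate


if __name__ == "__main__":
    RATE = 0.20  # assumed rate, for illustration only
    importer = Firm(domestic_revenue=100.0, export_revenue=0.0,
                    domestic_costs=30.0, import_costs=60.0)
    exporter = Firm(domestic_revenue=0.0, export_revenue=100.0,
                    domestic_costs=90.0, import_costs=0.0)
    for name, firm in (("importer", importer), ("exporter", exporter)):
        print(name, conventional_tax(firm, RATE), border_adjusted_tax(firm, RATE))
```

In this made-up example both firms earn the same pre-tax profit of 10. Under the border-adjusted rules the importer's tax bill rises sevenfold while the exporter's falls to zero, which is the tariff-like "practical effect" Chris has in mind.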
Bank Credit Analyst: Looming Shifts
I look forward every year-end to the annual “Mr. X” issue from Bank Credit Analyst.
They frame their forecast issue as a conversation with a longtime client. I have no idea whether Mr. X is a real person or mythical, but I feel like he’s an old friend.
BCA is also skeptical that the new Washington leadership will deliver much fiscal stimulus this year. They go further and ask the same question about the rest of the world.

Mr. X: What about fiscal developments in other countries?

BCA: The Japanese government has boosted government spending again, but the IMF estimates that fiscal changes added only 0.3% to GDP in 2016, with an even smaller impact expected for 2017. And a renewed tightening is assumed to occur in 2018 as postponed efforts to rein in the deficit take hold. Of course, a sales tax hike could be delayed yet again if the economy continues to disappoint. But, with an overall budget deficit of 5% of GDP and gross government debt of more than 250% of GDP, Japan’s room for additional stimulus is limited (Chart 4). Although the Bank of Japan owns around 40% of outstanding government debt, the authorities cannot openly admit that this will be written off. While more fiscal moves are possible in Japan, it is doubtful they would significantly alter the growth picture.
The euro-area peripheral countries have moved past the drastic fiscal austerity that was imposed on them a few years ago. Nevertheless, there is not much room for maneuver with regard to adopting an overtly reflationary stance.

It is one thing to turn a blind eye to the fiscal constraints of the EU’s Growth and Stability Pact and quite another to move aggressively in the opposite direction. Most of the region’s economies have government debt-to-GDP ratios far above the 60% required under the Maastricht Treaty. In sum, a move to fiscal stimulus is not in the cards for the euro area. The U.K. is set to adopt more reflationary policies following the Brexit vote, but this would at most offset private sector retrenchment.

In conclusion, looming shifts in fiscal policy will be positive for global growth in the next couple of years, but are unlikely to be game changers.
Their conclusion is classic BCA moderation. I think they may be the unconscious source of my “Muddle Through” philosophy. As much as I would like to forecast either gloom or euphoria, we rarely get too much of either.

On trade, BCA sees little risk of a trade war and notes that trade actually ceased to be a net contributor to world growth several years ago: global export volumes have been growing more slowly than GDP. That’s mainly a result of China’s growing domestic production.
They also have some thoughts on US inflation:

Inflation and bond yields in the U.S. have passed a cyclical turning point, but this does not mean that a sustained major uptrend is imminent. Let’s start with inflation. A good portion of the rise in the underlying U.S. inflation rate has been due to a rise in housing rental costs, and, more recently, a spike in medical care costs. Neither of these trends should last: changes to the ACA should arrest the rising cost of medical care while increased housing construction will cap the rise in rent inflation. The rental vacancy rate looks to be stabilizing while rent inflation is rolling over. Meanwhile, the inflation rate for core goods has held at a low level and likely will be pushed lower as a result of the dollar’s ascent. Of course, this all assumes that we do not end up with sharply higher import tariffs and a trade war.

The main reason to expect a further near-term rise in underlying U.S. inflation is the tightening labor market and resulting firming in wage growth.
With the economy likely to grow above a 2% pace in 2017, the labor market should continue to tighten, pushing wage inflation higher. So the core PCE inflation rate has a good chance of hitting the Federal Reserve’s 2% target before the year is out. And bond investors have responded accordingly, with one-year inflation expectations moving to their highest level since mid-2014, when oil prices were above $110 a barrel (Chart 10). Long-run inflation expectations also have spiked since the U.S. election, perhaps reflecting the risk of higher import tariffs and the risks of political interference with the Fed.
So again, it’s Muddle Through. Inflation will hit the Fed’s target but not get much higher.

As for the stock market, BCA sees reasons for caution but not an imminent crash.  
They make a good point about wages affecting corporate profit margins.

Investors are excited about the prospect that U.S. earnings will benefit from both faster economic growth and a drop in corporate tax rates. We don’t disagree that those trends would be positive, but there is another important issue to consider. One of the defining characteristics of the past several years has been the extraordinary performance of profit margins which have averaged record levels, despite the weak economic recovery (Chart 28). The roots of this rise lay in the fact that businesses rather than employees were able to capture most of the benefits of rising productivity. This showed up in the growing gap between real employee compensation and productivity. As a result, the owners of capital benefited, while the labor share of income – previously a very mean reverting series – dropped to extremely low levels.
The causes of this divergence are complex but include the impact of globalization, technology and a more competitive labor market.
With the U.S. unemployment back close to full-employment levels, the tide is now turning in favor of labor. The labor share of income is rising, and this trend likely will continue as the economy strengthens. And any moves by the incoming administration to erect barriers to trade and/or immigration would underpin the trend. The implication is that profit margins are more likely to compress than expand in the coming years, suggesting that analysts are far too optimistic about earnings. Long-term growth will be closer to 5% than 12%. The turnaround in the corporate income shares going to labor versus capital represents another important element of our theme of regime changes.
As they say, this is potentially a major shift for equity valuations. Companies have sustained profit margins largely due to automation, offshoring, immigration, and surplus labor supply in the US. All those conditions are now changing. That will affect corporate earnings, though the impact will take time to show.
Indeed, BCA is bullish on stocks for 2017.

The stock market is vulnerable to a near-term setback following recent strong gains, so this is not a great time to increase exposure. However, we do expect prices to be higher in a year’s time, so you could use setbacks as a buying opportunity. Of course, this is with the caveat that long-run returns are likely to be poor from current levels, and we have the worry about a bear market some time in 2018 if recession risks are building. Playing market overshoots can be very profitable, but it is critical to remember that the fundamental foundations are weak and you need to be highly sensitive to signs that conditions are deteriorating.

By BCA standards, this is a pretty bold call. “We do expect prices to be higher in a year’s time” may come back to haunt them when Mr. X comes around next December. Or maybe he’ll be grateful they had him hold on. 
Enough Forecasts
I could actually add another 15 or 20 pages from another dozen forecasts that are worth reading. (Forget the dozens that are either just boring, too dense to understand, or insane.) I selected these for the variety of views they give us (and views that are not necessarily my own).

One anecdotal note: I had lunch today with some colleagues, and one of the ladies is a good friend who is a broker/advisor at one of the big wire houses. She’s normally a fairly cautious, value-oriented investor; and she commented on how her clients seemed to have changed in the last two months (read: since the election).
“They became way more positive and ready to make decisions. They are moving money from the sidelines and taking positions.” That squares with all of the sentiment polls and especially with the latest NFIB (National Federation of Independent Businesses) optimism poll.

I am hearing that from all over. That optimism is good for the economy and markets, but it doubles the pressure on Congress and President-elect Trump to follow through with legislation, tax cuts, and the cutting of unnecessary bureaucratic regulations in order to actually provide economic stimulus. I am worried that no matter how much Congress thinks it will deliver – and believe me, if they give us anything they will think it’s a big deal – it won’t be seen as enough.

I guess the flip side of that trade is that half the country has extraordinarily low expectations of soon-to-be-president Trump and Congress, so it may not be hard to clear that bar. I’m trying to arrange a series of meetings in Washington DC next week, which I hope to videotape for you, to try to cut through some of the stuff that dreams are made of and try to figure out what might actually come out of the sausage mill. Right now, schedules are very up in the air, so it may be hit or miss as to whom I get to see, but I expect to come back with more information than I have now.

DC, Florida, and the Caymans
Shane and I will be in Washington for the inauguration. We will go straight from DC to the Inside ETFs Conference in Hollywood, Florida, January 22–25. If you are in the industry and coming to that conference, make a point to meet with me. Mauldin Solutions (my investment advisory firm) will have a booth where I will try to spend some time. If you are an independent broker or advisor in the area, come by and see me. I will be making some big announcements at the conference.

Then I’ll be speaking at a one-afternoon conference hosted by S&P Dow Jones here in Dallas on February 1. I will then be at the Orlando Money Show February 8–11 at the Omni in Orlando. Registration is free. I am also scheduled to speak at a large hedge fund conference in the Cayman Islands February 14 to 18. Details on the Cayman conference to follow.

I confess to being a political junkie. I was seriously involved with Republican Party politics in Texas and nationwide for about 20 years, ending in the early 2000s.
Business and personal issues dictated that I reduce my involvement; and, surprisingly, I have not missed it. But that didn’t end my fascination with the process. That started in high school and college. (I told George McGovern I would vote for him and I did. I also voted for Jimmy Carter.) But then things changed. Back in the late ’80s, you could get all the Republicans in Texas together in a small hotel and have room left over. We pretty much all knew each other. Not so much today.

But I don’t recall ever having the political process captivate my attention as much as it has this last cycle. I think that’s because soon-to-be-president Trump represents the potential for significant change, and not just in things like entitlement programs or taxes or the posture of the Defense Department.

There is the potential – and let me emphasize that it is potential, and not something we will know about for at least a few years – that the businessman side of Trump can reorganize how the federal government works. I have observed over the last 20 to 30 years that good – even great – ideas from well-meaning politicians, even if they are passed into law, get lost in bureaucratic political correctness and legalese and die. Whatever emerges from the bureaucracy is not what was intended by the original legislation. And this has become an increasing problem.

I am driven, and have been since elementary school, to try to figure out what the future will look like. In a way, I am much more interested in “future history” than actual history as the driver of my thinking and writing. I can imagine a future in which Trump “drains the swamp” of bureaucratic detritus and makes some real, positive change possible.

I’ve been reading and listening to Newt Gingrich’s essays on what he calls Trumpism. He is doing a seven-part video series at the Heritage Foundation and has delivered the first two sessions, which you can see here. (You will have to scroll through other videos.) Those interested can sign up for the entire series here: "Understanding Trump and Trumpism." There was also a great essay by John Steele Gordon in this morning’s Wall Street Journal. In fact, there are dozens of people trying to figure out what the heck is going on with the new administration. It is unlike anything we have seen.

My associate Patrick Watson, a voracious and omnivorous reader of a wide variety of media (I simply can’t keep up), noticed the following coincidences. Kissinger shows up to meet with Trump on the same day that Japanese Prime Minister Abe is also in Trump Tower. Two weeks later, at the very moment when Trump is calling the Taiwanese president (upsetting the apple cart in the press), a 93-year-old Kissinger is visiting China and happens to be meeting with Chinese President Xi Jinping. I don’t know that that has been picked up in the press. Neither has the fact that a short time later Mr. Kissinger comes back to Trump Tower, and now Secretary of State nominee Rex Tillerson is there on the same day. Other than the coincidences of timing, there is nothing that we know about these interesting events.

I had an interview yesterday with Christof Leisinger, business editor of the New Zürich Times (technically Neue Zürcher Zeitung, Wirtschaftsredaktion, for my German readers). He asked me a question that I’ve been asked several times: who do I think will be the next Fed chair? My standard glib answer is, “the usual suspects,” and I always add my favorite, Richard Fisher. But then it occurred to me that that is not the correct answer. The correct answer is that we simply don’t know.

Trump had 75 people in his pool of potential nominees for the head of Veterans Affairs. I am told he interviewed 25 of them. Who on God’s green earth would ever interview 25 people for one position? One that, frankly, is a little bit down the league tables. Not that it isn’t important, as there are literally scores of important appointments, but 25 interviews?

If we look at what Trump is doing so far, it seems to be what he does in business: He looks for the best person he can find, searching and interviewing, and then goes with what his instincts say is the right person to make the vision happen. Of course he interviews the usual suspects, but he also reaches outside the box. That is why we are getting so many new faces and people that, frankly, would have been excluded from a Bush administration as too controversial. I look at each appointment, and what I see is somebody carefully chosen to do a particular thing and to force the bureaucracy to move in a particular direction.

So when it comes to picking a Fed chairperson, I expect him to interview Warsh and Taylor and Fisher, but I’ll make a side bet that he interviews a few other people as well. I think it’s only 50-50 that he picks one of the usual suspects. We’ll get some indications of the direction he wants to see the Fed go with the first two nominations for the governor positions, nominations that I assume will happen within the next 60 to 90 days. That will give us a little clarity, but I don’t think his process of interviewing a number of people and listening to them as to what to do before he makes a decision (the opposite of what he does when tweeting) is going to change.

When was the last time we had a cabinet room with only one lawyer in it? And that is Attorney General nominee Jeff Sessions, who by the very nature of the office has to be an attorney. None of these new cabinet appointees are walking into the room already having ruled out choices because they conflict with current legalistic thinking. They are thinking “How do we accomplish the job?” After they have figured out what they want to do, they will deal with the lawyers.

How many of you know entrepreneurs who see lawyers as something they use in order to get done what they want to do rather than as somebody to tell them what they can’t do? They pay attention to good advice, and they stay within the rules, but doing so is not at the top of their mind when they are trying to figure out how to achieve the objective they have for their business.

Can Trump really do that? It certainly seems like he’s going to give it the old college try. If he can really relieve the bureaucratic sclerosis that we have created in government, that, more than anything else, will be a lasting and meaningful change.

Anyway, it is a fascinating thing to watch, especially for those of us who are trying to figure out how the markets and the economy will turn out. I am not so certain that world events, at least from an economic standpoint, will allow Trump to deal with pressing economic issues sooner rather than later. The problems of Italy and China are not running on his timetable. I keep hearing Trump say that China is a currency manipulator, but the euro, yen, and pound are all down anywhere from 30–40%, while the yuan is down barely 5% from its peak. The Chinese are spending huge amounts of their reserves to prop up the yuan, and I am not certain how long they can keep it up. The international desk at the Treasury Department is going to be very busy.

It’s time to hit the send button. You have a great week and remember that, in our own little part of the world, we are allowed to try to create our own future rather than simply allowing larger forces, the sturm und drang of the world, to dictate it. We have to pay attention to the world, but we are not bound by it. Have a great week!

Your trying to create his own future analyst,

John Mauldin

How Did We Get 2016 So Wrong?

Go through the late 2015/early 2016 articles published on this and similar sites and you’ll find a consensus that 2016 was going to be a really bad year. Corporate profits were falling, business inventories had spiked, and deflation was deepening in Japan and Europe. See More Ominous Charts For 2016 for a longer list of indicators that seemed, a year ago, to portend imminent recession if not full-blown financial crisis.

As David Stockman put it in a late-2015 prediction piece,

The Keynesian Recovery Meme Is About To Get Mugged, Part 1
Just consider the most recent data on wholesale sales and inventory. This sector of the domestic economy embodies the leading edge of business activity, meaning that trends in wholesale level sales and inventory stocking are advance indicators of the general macroeconomic outlook.
Needless to say, the soaring inventory-sales ratio is not a sign that “escape velocity” is just around the corner. Contrariwise, whenever the ratio has busted through 1.30X in the past, what came next was a recession. 

Recessions happen on the main street economy, of course, when sales weaken and inventories build to the point where liquidation of excess stocks becomes unavoidable.  
Accordingly, of far greater significance than the 19 labor market graphs supposedly on Yellen’s dashboard is the unassailable fact that wholesale sales have now rolled over. 
The natural market driven bounce back from the deep liquidation during the Great Recession is now over and done. Wholesale sales are down 4.5% from their June 2014 peak and have returned to September 2013 levels. 
Moreover, it is also well worth noting that at the most recent October 2015 level, wholesale sales are now up at only a 1.6% annual rate from the pre-crisis peak. 
Surely that does not measure an economy that is healed and heading toward the promised land of full-employment. 
So the false conclusion about the US economy’s strength derived from the Fed’s faulty labor market telemetry cannot be emphasized enough. 
There has been no Fed driven main street recovery. Instead, the tepid business expansion after the 2009 bottom embodied nothing more than the natural regenerative impulses of our badly impaired but still functioning capitalist system.  
As the inventories of goods and labor that were thrown overboard during the post-crisis plunge were rebuilt, incomes recovered and the cycle of expansion paddled forward on its own motion. 

But that’s now done, and the US economy stands fully exposed to the albatross of peak debt and the gale forces of global deflation.

Yet here we are a year later, with US stocks at record levels, growth apparently accelerating and deflation morphing into modest inflation. What happened? Two things.

First, 2016 was a US presidential election year, and the desire to see incumbents hold power trumped whatever qualms Washington might have had about adding to its debt. So the Feds borrowed another trillion+ dollars, presumably spending it on things designed to make voters want to stay the course.

Second, the threat of deflation terrified governments from Japan to Germany, leading them to push interest rates into negative territory for a wide range of sovereign (and some corporate) bonds. Corporations, as a result, felt compelled to borrow as much as possible even if they had no material use for the money.

All this government/corporate cash sloshing around the global financial system has pushed up equity prices and led to a bit more hiring – though apparently still mostly of bartenders and waiters – that has in turn generated some good headline numbers. See Debt Surge Producing Fake Recovery.

The success of this latest bit of can-kicking leaves critics of the current system with a bad case of prediction fatigue. We’ve been tossing around terms like “unsustainable” and “imminent crisis” for so long that they’ve begun to lose both meaning and credibility.

The only consolation is that this is familiar territory. Bubbles tend to go on until most of their critics have been silenced. Tech stocks, for instance, were clearly a bubble in 1998 but didn’t pop until 2000. US housing became an obvious bubble in 2005 but didn’t pop until 2007. In each case the people who initially pointed out the danger were exhausted and/or ignored by the time they were finally proven right.

Since the current bubble – encompassing fiat currencies, government bonds and related derivatives – is by far the biggest and broadest ever, it shouldn’t be a surprise that it’s lasted well beyond what rational analysis says is possible. But it too will pop. In fact, 2017 is looking pretty bad…

Finding a voice

Computers have got much better at translation, voice recognition and speech synthesis, says Lane Greene. But they still don’t understand the meaning of language

“I’M SORRY, Dave. I’m afraid I can’t do that.” With chilling calm, HAL 9000, the on-board computer in “2001: A Space Odyssey”, refuses to open the doors to Dave Bowman, an astronaut who had ventured outside the ship. HAL’s decision to turn on his human companion reflected a wave of fear about intelligent computers.

When the film came out in 1968, computers that could have proper conversations with humans seemed nearly as far away as manned flight to Jupiter. Since then, humankind has progressed quite a lot farther with building machines that it can talk to, and that can respond with something resembling natural speech. Even so, communication remains difficult. If “2001” had been made to reflect the state of today’s language technology, the conversation might have gone something like this: “Open the pod bay doors, Hal.” “I’m sorry, Dave. I didn’t understand the question.” “Open the pod bay doors, Hal.” “I have a list of eBay results about pod doors, Dave.”

Creative and truly conversational computers able to handle the unexpected are still far off. Artificial-intelligence (AI) researchers can only laugh when asked about the prospect of an intelligent HAL, Terminator or Rosie (the sassy robot housekeeper in “The Jetsons”). Yet although language technologies are nowhere near ready to replace human beings, except in a few highly routine tasks, they are at last about to become good enough to be taken seriously.

They can help people spend more time doing interesting things that only humans can do. After six decades of work, much of it with disappointing outcomes, the past few years have produced results much closer to what early pioneers had hoped for.

Speech recognition has made remarkable advances. Machine translation, too, has gone from terrible to usable for getting the gist of a text, and may soon be good enough to require only modest editing by humans. Computerised personal assistants, such as Apple’s Siri, Amazon’s Alexa, Google Now and Microsoft’s Cortana, can now take a wide variety of questions, structured in many different ways, and return accurate and useful answers in a natural-sounding voice. Alexa can even respond to a request to “tell me a joke”, but only by calling upon a database of corny quips. Computers lack a sense of humour.

When Apple introduced Siri in 2011 it was frustrating to use, so many people gave up. Only around a third of smartphone owners use their personal assistants regularly, even though 95% have tried them at some point, according to Creative Strategies, a consultancy. Many of those discouraged users may not realise how much the assistants have improved.

In 1966 John Pierce was working at Bell Labs, the research arm of America’s telephone monopoly. Having overseen the team that had built the first transistor and the first communications satellite, he enjoyed a sterling reputation, so he was asked to take charge of a report on the state of automatic language processing for the National Academy of Sciences. In the period leading up to this, scholars had been promising automatic translation between languages within a few years.

But the report was scathing. Reviewing almost a decade of work on machine translation and automatic speech recognition, it concluded that the time had come to spend money “hard-headedly toward important, realistic and relatively short-range goals”—another way of saying that language-technology research had overpromised and underdelivered. In 1969 Pierce wrote that both the funders and eager researchers had often fooled themselves, and that “no simple, clear, sure knowledge is gained.” After that, America’s government largely closed the money tap, and research on language technology went into hibernation for two decades.

The story of how it emerged from that hibernation is both salutary and surprisingly workaday, says Mark Liberman. As professor of linguistics at the University of Pennsylvania and head of the Linguistic Data Consortium, a huge trove of texts and recordings of human language, he knows a thing or two about the history of language technology. In the bad old days researchers kept their methods in the dark and described their results in ways that were hard to evaluate.

But beginning in the 1980s, Charles Wayne, then at America’s Defence Advanced Research Projects Agency, encouraged them to try another approach: the “common task”.

Step by step
Researchers would agree on a common set of practices, whether they were trying to teach computers speech recognition, speaker identification, sentiment analysis of texts, grammatical breakdown, language identification, handwriting recognition or anything else. They would set out the metrics they were aiming to improve on, share the data sets used to train their software and allow their results to be tested by neutral outsiders. That made the process far more transparent. Funding started up again and language technologies began to improve, though very slowly.

Many early approaches to language technology, and particularly translation, got stuck in a conceptual cul-de-sac: the rules-based approach. In translation, this meant trying to write rules to analyse the text of a sentence in the language of origin, breaking it down into a sort of abstract “interlanguage” and rebuilding it according to the rules of the target language. These approaches showed early promise. But language is riddled with ambiguities and exceptions, so such systems were hugely complicated and easily broke down when tested on sentences beyond the simple set they had been designed for.

Nearly all language technologies began to get a lot better with the application of statistical methods, often called a “brute force” approach. This relies on software scouring vast amounts of data, looking for patterns and learning from precedent. For example, in parsing language (breaking it down into its grammatical components), the software learns from large bodies of text that have already been parsed by humans. It uses what it has learned to make its best guess about a previously unseen text. In machine translation, the software scans millions of words already translated by humans, again looking for patterns. In speech recognition, the software learns from a body of recordings and the transcriptions made by humans.

Thanks to the growing power of processors, falling prices for data storage and, most crucially, the explosion in available data, this approach eventually bore fruit. Mathematical techniques that had been known for decades came into their own, and big companies with access to enormous amounts of data were poised to benefit.

People who had been put off by the hilariously inappropriate translations offered by online tools like BabelFish began to have more faith in Google Translate. Apple persuaded millions of iPhone users to talk not only on their phones but to them.

The final advance, which began only about five years ago, came with the advent of deep learning through digital neural networks (DNNs). These are often touted as having qualities similar to those of the human brain: “neurons” are connected in software, and connections can become stronger or weaker in the process of learning. But Nils Lenke, head of research for Nuance, a language-technology company, explains matter-of-factly that “DNNs are just another kind of mathematical model,” the basis of which had been well understood for decades.

What changed was the hardware being used. Almost by chance, DNN researchers discovered that the graphical processing units (GPUs) used to render graphics fluidly in applications like video games were also brilliant at handling neural networks. In computer graphics, basic small shapes move according to fairly simple rules, but there are lots of shapes and many rules, requiring vast numbers of simple calculations. The same GPUs are used to fine-tune the weights assigned to “neurons” in DNNs as they scour data to learn. The technique has already produced big leaps in quality for all kinds of deep learning, including deciphering handwriting, recognising faces and classifying images.

Now they are helping to improve all manner of language technologies, often bringing enhancements of up to 30%. That has shifted language technology from usable at a pinch to really rather good. But so far no one has quite worked out what will move it on from merely good to reliably great.

Computers have made huge strides in understanding human speech

WHEN a person speaks, air is forced out of the lungs, making the vocal cords vibrate, which sends out characteristic wave patterns through the air. The features of the sounds depend on the arrangement of the vocal organs, especially the tongue and the lips, and the characteristic nature of the sounds comes from peaks of energy in certain frequencies. The vowels have frequencies called “formants”, two of which are usually enough to differentiate one vowel from another. For example, the vowel in the English word “fleece” has its first two formants at around 300Hz and 3,000Hz.

Consonants have their own characteristic features.

In principle, it should be easy to turn this stream of sound into transcribed speech. As in other language technologies, machines that recognise speech are trained on data gathered earlier. In this instance, the training data are sound recordings transcribed to text by humans, so that the software has both a sound and a text input. All it has to do is match the two. It gets better and better at working out how to transcribe a given chunk of sound in the same way as humans did in the training data. The traditional matching approach was a statistical technique called a hidden Markov model (HMM), making guesses based on what was done before. More recently speech recognition has also gained from deep learning.
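
As a rough illustration of the hidden-Markov idea, the Python sketch below uses the Viterbi algorithm, a standard way of finding the most likely hidden sequence in an HMM, to label three sound frames as vowel or consonant. The two states, the "high" and "low" energy observations and all the probabilities are invented for illustration. Real recognisers learn such numbers from transcribed recordings and work with far richer acoustic features.

```python
# Toy hidden Markov model. Every number here is made up for illustration.
STATES = ["vowel", "consonant"]
START_P = {"vowel": 0.4, "consonant": 0.6}
TRANS_P = {
    "vowel": {"vowel": 0.3, "consonant": 0.7},
    "consonant": {"vowel": 0.6, "consonant": 0.4},
}
EMIT_P = {
    "vowel": {"high": 0.8, "low": 0.2},
    "consonant": {"high": 0.3, "low": 0.7},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable sequence of hidden states for the observations."""
    # best[t][s] = (probability of the best path ending in state s at step t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Start from the most probable final state and follow the back-pointers.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


DECODED = viterbi(["low", "high", "low"], STATES, START_P, TRANS_P, EMIT_P)
```

With these invented numbers, `DECODED` comes out as consonant, vowel, consonant, the shape of a word like "pat".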

English has about 44 “phonemes”, the units that make up the sound system of a language. P and b are different phonemes, because they distinguish words like pat and bat. But in English p with a puff of air, as in “party”, and p without a puff of air, as in “spin”, are not different phonemes, though they are in other languages. If a computer hears the phonemes s, p, i and n back to back, it should be able to recognise the word “spin”.

But the nature of live speech makes this difficult for machines. Sounds are not pronounced individually, one phoneme after the other; they mostly come in a constant stream, and finding the boundaries is not easy. Phonemes also differ according to the context. (Compare the l sound at the beginning of “light” with that at the end of “full”.) Speakers differ in timbre and pitch of voice, and in accent. Conversation is far less clear than careful dictation. People stop and restart much more often than they realise.

All the same, technology has gradually mitigated many of these problems, so error rates in speech-recognition software have fallen steadily over the years—and then sharply with the introduction of deep learning. Microphones have got better and cheaper. With ubiquitous wireless internet, speech recordings can easily be beamed to computers in the cloud for analysis, and even smartphones now often have computers powerful enough to carry out this task.

Bear arms or bare arms?
Perhaps the most important feature of a speech-recognition system is its set of expectations about what someone is likely to say, or its “language model”. Like other training data, the language models are based on large amounts of real human speech, transcribed into text. When a speech-recognition system “hears” a stream of sound, it makes a number of guesses about what has been said, then calculates the odds that it has found the right one, based on the kinds of words, phrases and clauses it has seen earlier in the training text.

At the level of phonemes, each language has strings that are permitted (in English, a word may begin with str-, for example) or banned (an English word cannot start with tsr-). The same goes for words. Some strings of words are more common than others. For example, “the” is far more likely to be followed by a noun or an adjective than by a verb or an adverb. In making guesses about homophones, the computer will have remembered that in its training data the phrase “the right to bear arms” came up much more often than “the right to bare arms”, and will thus have made the right guess.
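
The homophone guess can be sketched in a few lines of Python. The example below is a minimal sketch, assuming a simple bigram model with add-one smoothing, which is only one of many ways to build a language model. The training sentences are invented, and real systems learn from vastly more text.

```python
from collections import Counter

# Invented stand-in for a large body of transcribed speech.
TRAINING_TEXT = (
    "the right to bear arms is debated . "
    "citizens have the right to bear arms . "
    "she wore a dress with bare arms . "
    "the right to bear arms is old ."
)


def train_bigrams(text):
    """Count single words and adjacent word pairs in the training text."""
    words = text.split()
    return Counter(words), Counter(zip(words, words[1:]))


def sentence_probability(sentence, unigrams, bigrams):
    """Multiply add-one-smoothed probabilities of each word given the previous one."""
    words = sentence.split()
    vocab_size = len(unigrams)
    probability = 1.0
    for prev, cur in zip(words, words[1:]):
        probability *= (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
    return probability


def best_guess(candidates, unigrams, bigrams):
    """Pick the candidate transcription the language model finds most likely."""
    return max(candidates, key=lambda s: sentence_probability(s, unigrams, bigrams))


UNIGRAMS, BIGRAMS = train_bigrams(TRAINING_TEXT)
GUESS = best_guess(
    ["the right to bear arms", "the right to bare arms"], UNIGRAMS, BIGRAMS
)
```

Because "to bear" and "bear arms" appear more often in the toy training text than "to bare" and "bare arms", `GUESS` is "the right to bear arms".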

Training on a specific speaker greatly cuts down on the software’s guesswork. Just a few minutes of reading training text into software like Dragon Dictate, made by Nuance, produces a big jump in accuracy. For those willing to train the software for longer, the improvement continues to something close to 99% accuracy (meaning that of each hundred words of text, not more than one is wrongly added, omitted or changed). A good microphone and a quiet room help.

Advance knowledge of what kinds of things the speaker might be talking about also increases accuracy. Words like “phlebitis” and “gastrointestinal” are not common in general discourse, and uncommon words are ranked lower in the probability tables the software uses to guess what it has heard. But these words are common in medicine, so creating software trained to look out for such words considerably improves the result. This can be done by feeding the system a large number of documents written by the speaker whose voice is to be recognised; common words and phrases can be extracted to improve the system’s guesses.

As with all other areas of language technology, deep learning has sharply brought down error rates. In October Microsoft announced that its latest speech-recognition system had achieved parity with human transcribers in recognising the speech in the Switchboard Corpus, a collection of thousands of recorded conversations in which participants are talking with a stranger about a randomly chosen subject.

Error rates on the Switchboard Corpus are a widely used benchmark, so claims of quality improvements can be easily compared. Fifteen years ago quality had stalled, with word-error rates of 20-30%. Microsoft’s latest system, which has six neural networks running in parallel, has reached 5.9% (see chart), the same as a human transcriber’s. Xuedong Huang, Microsoft’s chief speech scientist, says that he expected it to take two or three years to reach parity with humans. It got there in less than one.
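
The word-error rate behind such benchmarks is conventionally computed as the smallest number of word substitutions, omissions and additions needed to turn the system's output into the human reference transcript, divided by the number of words in the reference. A minimal Python sketch follows, using invented example sentences rather than Switchboard data. The function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between two transcripts, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = fewest edits turning the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,                      # reference word omitted
                cur[j - 1] + 1,                   # extra word added
                prev[j - 1] + substitution_cost,  # word kept or changed
            ))
        prev = cur
    return prev[-1] / len(ref)


ONE_WRONG = word_error_rate("the right to bear arms", "the right to bare arms")
TWO_WRONG = word_error_rate("open the pod bay doors", "open pod bay doors please")
```

Here one changed word in five gives a rate of 0.2, and one omitted word plus one added word gives 0.4.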

The improvements in the lab are now being applied to products in the real world. More and more cars are being fitted with voice-activated controls of various kinds; the vocabulary involved is limited (there are only so many things you might want to say to your car), which ensures high accuracy. Microphones—or often arrays of microphones with narrow fields of pick-up—are getting better at identifying the relevant speaker among a group.

Some problems remain. Children and elderly speakers, as well as people moving around in a room, are harder to understand. Background noise remains a big concern; if it is different from that in the training data, the software finds it harder to generalise from what it has learned. So Microsoft, for example, offers businesses a product called CRIS that lets users customise speech-recognition systems for the background noise, special vocabulary and other idiosyncrasies they will encounter in that particular environment. That could be useful anywhere from a noisy factory floor to a care home for the elderly.

But for a computer to know what a human has said is only a beginning. Proper interaction between the two, of the kind that comes up in almost every science-fiction story, calls for machines that can speak back.

Hasta la vista, robot voice

Machines are starting to sound more like humans

“I’LL be back.” “Hasta la vista, baby.” Arnold Schwarzenegger’s Teutonic drone in the “Terminator” films is world-famous. But in this instance film-makers looking into the future were overly pessimistic. Some applications do still feature a monotonous “robot voice”, but that is changing fast.
Creating speech is roughly the inverse of understanding it. Again, it requires a basic model of the structure of speech. What are the sounds in a language, and how do they combine? What words does it have, and how do they combine in sentences? These are well-understood questions, and most systems can now generate sound waves that are a fair approximation of human speech, at least in short bursts.
Heteronyms require special care. How should a computer pronounce a word like “lead”, which can be a present-tense verb or a noun for a heavy metal, pronounced quite differently? Once again a language model can make accurate guesses: “Lead us not into temptation” can be parsed for its syntax, and once the software has worked out that the first word is almost certainly a verb, it can cause it to be pronounced to rhyme with “reed”, not “red”.
Traditionally, text-to-speech models have been “concatenative”, consisting of very short segments recorded by a human and then strung together as in the acoustic model described above. More recently, “parametric” models have been generating raw audio without the need to record a human voice, which makes these systems more flexible but less natural-sounding.
DeepMind, an artificial-intelligence company bought by Google in 2014, has announced a new way of synthesising speech, again using deep neural networks. The network is trained on recordings of people talking, and on the texts that match what they say. Given a text to reproduce as speech, it churns out a far more fluent and natural-sounding voice than the best concatenative and parametric approaches.
The last step in generating speech is giving it prosody—generally, the modulation of speed, pitch and volume to convey an extra (and critical) channel of meaning. In English, “a German teacher”, with the stress on “teacher”, can teach anything but must be German. But “a German teacher” with the emphasis on “German” is usually a teacher of German (and need not be German). Words like prepositions and conjunctions are not usually stressed. Getting machines to put the stresses in the correct places is about 50% solved, says Mark Liberman of the University of Pennsylvania.
Many applications do not require perfect prosody. A satellite-navigation system giving instructions on where to turn uses just a small number of sentence patterns, and prosody is not important. The same goes for most single-sentence responses given by a virtual assistant on a smartphone.
But prosody matters when someone is telling a story. Pitch, speed and volume can be used to pass quickly over things that are already known, or to build interest and tension for new information.
Myriad tiny clues communicate the speaker’s attitude to his subject. The phrase “a German teacher”, with stress on the word “German”, may, in the context of a story, not be a teacher of German, but a teacher being explicitly contrasted with a teacher who happens to be French or British.
Text-to-speech engines are not much good at using context to provide such accentuation, and where they do, it rarely extends beyond a single sentence. When Alexa, the assistant in Amazon’s Echo device, reads a news story, her prosody is jarringly un-humanlike. Talking computers have yet to learn how to make humans want to listen.

Computer translations have got strikingly better, but still need human input

IN “STAR TREK” it was a hand-held Universal Translator; in “The Hitchhiker’s Guide to the Galaxy” it was the Babel Fish popped conveniently into the ear. In science fiction, the meeting of distant civilisations generally requires some kind of device to allow them to talk. High-quality automated translation seems even more magical than other kinds of language technology because many humans struggle to speak more than one language, let alone translate from one to another.

The idea has been around since the 1950s, and computerised translation is still known by the quaint moniker “machine translation” (MT). It goes back to the early days of the cold war, when American scientists were trying to get computers to translate from Russian. They were inspired by the code-breaking successes of the second world war, which had led to the development of computers in the first place. To them, a scramble of Cyrillic letters on a page of Russian text was just a coded version of English, and turning it into English was just a question of breaking the code.

Scientists at IBM and Georgetown University were among those who thought that the problem would be cracked quickly. Having programmed just six rules and a vocabulary of 250 words into a computer, they gave a demonstration in New York on January 7th 1954 and proudly produced 60 automated translations, including that of “Mi pyeryedayem mislyi posryedstvom ryechyi,” which came out correctly as “We transmit thoughts by means of speech.” Leon Dostert of Georgetown, the lead scientist, breezily predicted that fully realised MT would be “an accomplished fact” in three to five years.

Instead, after more than a decade of work, the report in 1966 by a committee chaired by John Pierce, mentioned in the introduction to this report, recorded bitter disappointment with the results and urged researchers to focus on narrow, achievable goals such as automated dictionaries. Government-sponsored work on MT went into near-hibernation for two decades.

What little was done was carried out by private companies. The most notable of them was Systran, which provided rough translations, mostly to America’s armed forces.

La plume de mon ordinateur
The scientists got bogged down by their rules-based approach. Having done relatively well with their six-rule system, they came to believe that if they programmed in more rules, the system would become more sophisticated and subtle. Instead, it became more likely to produce nonsense. Adding extra rules, in the modern parlance of software developers, did not “scale”.

Besides the difficulty of programming grammar’s many rules and exceptions, some early observers noted a conceptual problem. The meaning of a word often depends not just on its dictionary definition and the grammatical context but the meaning of the rest of the sentence.

Yehoshua Bar-Hillel, an Israeli MT pioneer, realised that “the pen is in the box” and “the box is in the pen” would require different translations for “pen”: any pen big enough to hold a box would have to be an animal enclosure, not a writing instrument.

How could machines be taught enough rules to make this kind of distinction? They would have to be provided with some knowledge of the real world, a task far beyond the machines or their programmers at the time.

Two decades later, IBM stumbled on an approach that would revive optimism about MT. Its Candide system was the first serious attempt to use statistical probabilities rather than rules devised by humans for translation. Statistical, “phrase-based” machine translation, like speech recognition, needed training data to learn from. Candide used Canada’s Hansard, which publishes that country’s parliamentary debates in French and English, providing a huge amount of data for that time. The phrase-based approach would ensure that the translation of a word would take the surrounding words properly into account.

But quality did not take a leap until Google, which had set itself the goal of indexing the entire internet, decided to use those data to train its translation engines; in 2007 it switched from a rules-based engine (provided by Systran) to its own statistics-based system. To build it, Google trawled about a trillion web pages, looking for any text that seemed to be a translation of another—for example, pages designed identically but with different words, and perhaps a hint such as the address of one page ending in /en and the other ending in /fr. According to Macduff Hughes, chief engineer on Google Translate, a simple approach using vast amounts of data seemed more promising than a clever one with fewer data.

Training on parallel texts (which linguists call corpora, the plural of corpus) creates a “translation model” that generates not one but a series of possible translations in the target language. The next step is running these possibilities through a monolingual language model in the target language. This is, in effect, a set of expectations about what a well-formed and typical sentence in the target language is likely to be. Single-language models are not too hard to build. (Parallel human-translated corpora are hard to come by; large amounts of monolingual training data are not.) As with the translation model, the language model uses a brute-force statistical approach to learn from the training data, then ranks the outputs from the translation model in order of plausibility.
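
The two-stage process can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the candidate translations and their translation-model probabilities are invented, the monolingual text is a handful of made-up sentences, and combining the two scores by adding their logarithms is just one simple way to do the ranking, not a description of any particular engine.

```python
import math
from collections import Counter

# Invented monolingual English text standing in for a large target-language corpus.
MONOLINGUAL_TEXT = (
    "i am hungry . i am tired . i have a dog . she is hungry . i am happy ."
)

# Hypothetical translation-model output for the French "j'ai faim":
# candidate translations with made-up probabilities.
CANDIDATES = {
    "i have hunger": 0.5,
    "i am hungry": 0.4,
    "me has hunger": 0.1,
}


def train_language_model(text):
    """Count single words and adjacent word pairs in the monolingual text."""
    words = text.split()
    return Counter(words), Counter(zip(words, words[1:]))


def language_model_logprob(sentence, unigrams, bigrams):
    """Sum of log add-one-smoothed bigram probabilities for the sentence."""
    words = sentence.split()
    vocab_size = len(unigrams)
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def rank_translations(candidates, unigrams, bigrams):
    """Order candidates by translation-model score plus language-model score."""
    def combined(sentence):
        return math.log(candidates[sentence]) + language_model_logprob(
            sentence, unigrams, bigrams
        )
    return sorted(candidates, key=combined, reverse=True)


UNIGRAMS, BIGRAMS = train_language_model(MONOLINGUAL_TEXT)
RANKED = rank_translations(CANDIDATES, UNIGRAMS, BIGRAMS)
```

In this toy case the translation model slightly prefers the word-for-word "i have hunger", but the language model has seen "i am" and "am hungry" before, so "i am hungry" ends up first in `RANKED`.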

Statistical machine translation rekindled optimism in the field. Internet users quickly discovered that Google Translate was far better than the rules-based online engines they had used before, such as BabelFish. Such systems still make mistakes—sometimes minor, sometimes hilarious, sometimes so serious or so many as to make nonsense of the result. And language pairs like Chinese-English, which are unrelated and structurally quite different, make accurate translation harder than pairs of related languages like English and German. But more often than not, Google Translate and its free online competitors, such as Microsoft’s Bing Translator, offer a usable approximation.

Such systems are set to get better, again with the help of deep learning from digital neural networks.

The Association for Computational Linguistics has been holding workshops on MT every summer since 2006. One of the events is a competition between MT engines turned loose on a collection of news text. In August 2016, in Berlin, neural-net-based MT systems were the top performers (out of 102), a first.

Now Google has released its own neural-net-based engine for eight language pairs, closing much of the quality gap between its old system and a human translator. This is especially true for closely related languages (like the big European ones) with lots of available training data.

The results are still distinctly imperfect, but far smoother and more accurate than before.

Translations between English and (say) Chinese and Korean are not as good yet, but the neural system has brought a clear improvement here too.

The Coca-Cola factor

Neural-network-based translation actually uses two networks. One is an encoder. Each word of an input sentence is converted into a multidimensional vector (a series of numerical values), and the encoding of each new word takes into account what has happened earlier in the sentence. Marcello Federico of Italy’s Fondazione Bruno Kessler, a private research organisation, uses an intriguing analogy to compare neural-net translation with the phrase-based kind. The latter, he says, is like describing Coca-Cola in terms of sugar, water, caffeine and other ingredients. By contrast, the former encodes features such as liquidness, darkness, sweetness and fizziness.

Once the source sentence is encoded, a decoder network generates a word-for-word translation, once again taking account of the immediately preceding word. This can cause problems when the meaning of words such as pronouns depends on words mentioned much earlier in a long sentence. This problem is mitigated by an “attention model”, which helps maintain focus on other words in the sentence outside the immediate context.
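A rough sketch of the attention idea follows. The vectors are made up, and dot-product scoring is an assumption here: it is one common formulation, not necessarily the one used by any system named in this report.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(decoder_state, encoder_states):
    """Return attention weights and the weighted average of encoder states."""
    weights = softmax([dot(decoder_state, h) for h in encoder_states])
    dims = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dims)
    ]
    return weights, context

# Invented 3-dimensional encodings for a four-word source sentence.
source_words = ["the", "councilmen", "refused", "permits"]
encoder_states = [
    [0.1, 0.0, 0.2],
    [2.0, 0.5, 0.1],
    [0.0, 1.5, 0.3],
    [0.2, 0.1, 1.0],
]
# Invented decoder state at the moment it must translate a later pronoun.
decoder_state = [1.5, 0.2, 0.0]

weights, context = attend(decoder_state, encoder_states)
focus = source_words[weights.index(max(weights))]
print(focus, [round(w, 3) for w in weights])
```

The decoder's state is compared with every encoded source word, so a word far back in the sentence can still receive most of the weight.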

Neural-network translation requires heavy-duty computing power, both for the original training of the system and in use. The heart of such a system can be the GPUs that made the deep-learning revolution possible, or specialised hardware like Google’s Tensor Processing Units (TPUs). Smaller translation companies and researchers usually rent this kind of processing power in the cloud. But the data sets used in neural-network training do not need to be as extensive as those for phrase-based systems, which should give smaller outfits a chance to compete with giants like Google.

Fully automated, high-quality machine translation is still a long way off. For now, several problems remain. All current machine translations proceed sentence by sentence. If the translation of such a sentence depends on the meaning of earlier ones, automated systems will make mistakes. Long sentences, despite tricks like the attention model, can be hard to translate. And neural-net-based systems in particular struggle with rare words.

Training data, too, are scarce for many language pairs. They are plentiful between European languages, since the European Union’s institutions churn out vast amounts of material translated by humans between the EU’s 24 official languages. But for smaller languages such resources are thin on the ground. For example, there are few Greek-Urdu parallel texts available on which to train a translation engine. So a system that claims to offer such translation is in fact usually running it through a bridging language, nearly always English.

That involves two translations rather than one, multiplying the chance of errors.
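The arithmetic is simple enough to sketch. The 90% figure is purely illustrative, and the calculation assumes that errors in the two hops are independent, which real systems need not satisfy.

```python
def chained_accuracy(*stage_accuracies):
    """Probability that every stage is right, assuming independent errors."""
    result = 1.0
    for accuracy in stage_accuracies:
        result *= accuracy
    return result

# Illustrative figures only, not measurements of any real system.
direct = chained_accuracy(0.90)             # a single hop
via_english = chained_accuracy(0.90, 0.90)  # e.g. Greek -> English -> Urdu

error_direct = 1 - direct
error_via_english = 1 - via_english
print(f"one-hop error rate: {error_direct:.2f}")
print(f"two-hop error rate: {error_via_english:.2f}")
```

Under these assumptions a sentence that would go wrong one time in ten in a single hop goes wrong nearly one time in five when routed through a bridging language.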

Even if machine translation is not yet perfect, technology can already help humans translate much more quickly and accurately. “Translation memories”, software that stores already translated words and segments, first came into use as early as the 1980s. For someone who frequently translates the same kind of material (such as instruction manuals), they serve up the bits that have already been translated, saving lots of duplication and time.
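A translation memory can be sketched as a lookup table with fuzzy matching. The class below is hypothetical and uses Python's standard `difflib` for similarity; commercial tools use their own matching schemes, and the sample segments are invented.

```python
import difflib

class TranslationMemory:
    """Stores translated segments and retrieves exact or near matches."""

    def __init__(self):
        self.segments = {}

    def store(self, source, target):
        self.segments[source] = target

    def lookup(self, source, threshold=0.8):
        """Return (stored_source, translation, similarity) or None."""
        if source in self.segments:
            return source, self.segments[source], 1.0
        close = difflib.get_close_matches(
            source, list(self.segments), n=1, cutoff=threshold
        )
        if not close:
            return None
        best = close[0]
        score = difflib.SequenceMatcher(None, best, source).ratio()
        return best, self.segments[best], score

tm = TranslationMemory()
tm.store(
    "Press the power button to turn on the device.",
    "Appuyez sur le bouton d'alimentation pour allumer l'appareil.",
)
tm.store(
    "Do not immerse the device in water.",
    "Ne plongez pas l'appareil dans l'eau.",
)

exact = tm.lookup("Do not immerse the device in water.")
fuzzy = tm.lookup("Press the power button to turn off the device.")
miss = tm.lookup("Recycle the packaging responsibly.")
print(exact, fuzzy, miss, sep="\n")
```

The near match is offered to the translator as a starting point to correct rather than as a finished translation, which is where the time saving comes from.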

A similar trick is to train MT engines on text dealing with a narrow real-world domain, such as medicine or the law. As software techniques are refined and computers get faster, training becomes easier and quicker. Free software such as Moses, developed with the support of the EU and used by some of its in-house translators, can be trained by anyone with parallel corpora to hand. A specialist in medical translation, for instance, can train the system on medical translations only, which makes its output far more accurate.

At the other end of linguistic sophistication, an MT engine can be optimised for the shorter and simpler language people use in speech to spew out rough but near-instantaneous speech-to-speech translations. This is what Microsoft’s Skype Translator does. Its quality is improved by being trained on speech (things like film subtitles and common spoken phrases) rather than the kind of parallel text produced by the European Parliament.

Translation management has also benefited from innovation, with clever software allowing companies quickly to combine the best of MT, translation memory, customisation by the individual translator and so on. Translation-management software aims to cut out the agencies that have been acting as middlemen between clients and an army of freelance translators. Jack Welde, the founder of Smartling, an industry favourite, says that in future translation customers will choose how much human intervention is needed for a translation. A quick automated one will do for low-stakes content with a short life, but the most important content will still require a fully hand-crafted and edited version. Noting that MT has both determined boosters and committed detractors, Mr Welde says he is neither: “If you take a dogmatic stance, you’re not optimised for the needs of the customer.”

Translation software will go on getting better. Not only will engineers keep tweaking their statistical models and neural networks, but users themselves will make improvements to their own systems. For example, a small but much-admired startup, Lilt, uses phrase-based MT as the basis for a translation, but an easy-to-use interface allows the translator to correct and improve the MT system’s output.

Every time this is done, the corrections are fed back into the translation engine, which learns and improves in real time. Users can build several different memories—a medical one, a financial one and so on—which will help with future translations in that specialist field.

TAUS, an industry group, recently issued a report on the state of the translation industry saying that “in the past few years the translation industry has burst with new tools, platforms and solutions.” Last year Jaap van der Meer, TAUS’s founder and director, wrote a provocative blogpost entitled “The Future Does Not Need Translators”, arguing that the quality of MT will keep improving, and that for many applications less-than-perfect translation will be good enough.

The “translator” of the future is likely to be more like a quality-control expert, deciding which texts need the most attention to detail and editing the output of MT software. That may be necessary because computers, no matter how sophisticated they have become, cannot yet truly grasp what a text means.

Machines cannot conduct proper conversations with humans because they do not understand the world

IN “BLACK MIRROR”, a British science-fiction satire series set in a dystopian near future, a young woman loses her boyfriend in a car accident. A friend offers to help her deal with her grief. The dead man was a keen social-media user, and his archived accounts can be used to recreate his personality. Before long she is messaging with a facsimile, then speaking to one. As the system learns to mimic him ever better, he becomes increasingly real.

This is not quite as bizarre as it sounds. Computers today can already produce an eerie echo of human language if fed with the appropriate material. What they cannot yet do is have true conversations.

Truly robust interaction between man and machine would require a broad understanding of the world.

In the absence of that, computers are not able to talk about a wide range of topics, follow long conversations or handle surprises.

Machines trained to do a narrow range of tasks, though, can perform surprisingly well. The most obvious examples are the digital assistants created by the technology giants. Users can ask them questions in a variety of natural ways: “What’s the temperature in London?” “How’s the weather outside?” “Is it going to be cold today?” The assistants know a few things about users, such as where they live and who their family are, so they can be personal, too: “How’s my commute looking?” “Text my wife I’ll be home in 15 minutes.”

And they get better with time. Apple’s Siri receives 2bn requests per week, which (after being anonymised) are used for further teaching. For example, Apple says Siri knows every possible way that users ask about a sports score. She also has a delightful answer for children who ask about Father Christmas. Microsoft learned from some of its previous natural-language platforms that about 10% of human interactions were “chitchat”, from “tell me a joke” to “who’s your daddy?”, and used such chat to teach its digital assistant, Cortana.

The writing team for Cortana includes two playwrights, a poet, a screenwriter and a novelist. Google hired writers from Pixar, an animated-film studio, and The Onion, a satirical newspaper, to make its new Google Assistant funnier. No wonder people often thank their digital helpers for a job well done.

The assistants’ replies range from “My pleasure, as always” to “You don’t need to thank me.”

Good at grammar
How do natural-language platforms know what people want? They not only recognise the words a person uses, but break down speech for both grammar and meaning. Grammar parsing is relatively advanced; it is the domain of the well-established field of “natural-language processing”. But meaning comes under the heading of “natural-language understanding”, which is far harder.

First, parsing. Most people are not very good at analysing the syntax of sentences, but computers have become quite adept at it, even though most sentences are ambiguous in ways humans are rarely aware of. Take a sign on a public fountain that says, “This is not drinking water.” Humans understand it to mean that the water (“this”) is not a certain kind of water (“drinking water”). But a computer might just as easily parse it to say that “this” (the fountain) is not at present doing something (“drinking water”).

As sentences get longer, the number of grammatically possible but nonsensical options multiplies exponentially. How can a machine parser know which is the right one? It helps for it to know that some combinations of words are more common than others: the phrase “drinking water” is widely used, so parsers trained on large volumes of English will rate those two words as likely to be joined in a noun phrase. And some structures are more common than others: “noun verb noun noun” may be much more common than “noun noun verb noun”. A machine parser can compute the overall probability of all combinations and pick the likeliest.
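The fountain sign can serve as a toy example of this kind of scoring. The rule probabilities below are invented and are not normalised as a real probabilistic grammar would be; they only show how multiplying them favours one parse.

```python
import math

# Invented rule probabilities, standing in for statistics a parser would
# estimate from a large parsed corpus.
rule_prob = {
    "S -> NP VP": 0.9,
    "NP -> this": 0.2,
    "VP -> is not NP": 0.3,          # "is not [a thing]"
    "VP -> is not V-ing NP": 0.1,    # "is not [doing something]"
    "NP -> drinking water": 0.02,    # common compound noun
    "NP -> water": 0.05,
    "V-ing -> drinking": 0.01,
}

# Two competing analyses of "This is not drinking water".
parses = {
    "this is not [drinking water]": [
        "S -> NP VP", "NP -> this", "VP -> is not NP", "NP -> drinking water",
    ],
    "this is not drinking [water]": [
        "S -> NP VP", "NP -> this", "VP -> is not V-ing NP",
        "V-ing -> drinking", "NP -> water",
    ],
}

def parse_probability(rules):
    """Multiply the probabilities of every rule used in a parse."""
    return math.prod(rule_prob[r] for r in rules)

best = max(parses, key=lambda name: parse_probability(parses[name]))
for name, rules in parses.items():
    print(f"{parse_probability(rules):.6f}  {name}")
print("chosen:", best)
```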

A “lexicalised” parser might do even better. Take the Groucho Marx joke, “One morning I shot an elephant in my pyjamas. How he got in my pyjamas, I’ll never know.” The first sentence is ambiguous (which makes the joke)—grammatically both “I” and “an elephant” can attach to the prepositional phrase “in my pyjamas”. But a lexicalised parser would recognise that “I [verb phrase] in my pyjamas” is far more common than “elephant in my pyjamas”, and so assign that parse a higher probability.
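The lexicalised choice can be sketched with word-specific counts. The numbers are invented, and relative frequency is only one simple way such a preference might be estimated.

```python
from collections import Counter

# Invented counts: how often each head word occurs at all, and how often it
# occurs with an attached "in <clothing>" phrase.
head_total = Counter({"shot": 500, "elephant": 300})
head_with_phrase = Counter({"shot": 60, "elephant": 1})

def attachment_score(head):
    """Relative frequency with which this head word takes the phrase."""
    return head_with_phrase[head] / head_total[head]

# "in my pyjamas" may attach to the verb ("shot") or the noun ("elephant").
candidates = {"verb": "shot", "noun": "elephant"}
choice = max(candidates, key=lambda site: attachment_score(candidates[site]))
print(choice)
```

With these counts the phrase attaches to the verb, which is the reading that spoils the joke.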

But meaning is harder to pin down than syntax. “The boy kicked the ball” and “The ball was kicked by the boy” have the same meaning but a different structure. “Time flies like an arrow” can mean either that time flies in the way that an arrow flies, or that insects called “time flies” are fond of an arrow.

“Who plays Thor in ‘Thor’?” Your correspondent could not remember the beefy Australian who played the eponymous Norse god in the Marvel superhero film. But when he asked his iPhone, Siri came up with an unexpected reply: “I don’t see any movies matching ‘Thor’ playing in Thor, IA, US, today.” Thor, Iowa, with a population of 184, was thousands of miles away, and “Thor”, the film, has been out of cinemas for years. Siri parsed the question perfectly properly, but the reply was absurd, violating the rules of what linguists call pragmatics: the shared knowledge and understanding that people use to make sense of the often messy human language they hear. “Can you reach the salt?” is not a request for information but for salt. Natural-language systems have to be manually programmed to handle such requests as humans expect them, and not literally.

Multiple choice
Shared information is also built up over the course of a conversation, which is why digital assistants can struggle with twists and turns in conversations. Tell an assistant, “I’d like to go to an Italian restaurant with my wife,” and it might suggest a restaurant. But then ask, “is it close to her office?”, and the assistant must grasp the meanings of “it” (the restaurant) and “her” (the wife), which it will find surprisingly tricky. Nuance, the language-technology firm, which provides natural-language platforms to many other companies, is working on a “concierge” that can handle this type of challenge, but it is still a prototype.

Such a concierge must also offer only restaurants that are open. Linking requests to common sense (knowing that no one wants to be sent to a closed restaurant), as well as a knowledge of the real world (knowing which restaurants are closed), is one of the most difficult challenges for language technologies.

Common sense, an old observation goes, is uncommon enough in humans. Programming it into computers is harder still. Fernando Pereira of Google points out why. Automated speech recognition and machine translation have something in common: there are huge stores of data (recordings and transcripts for speech recognition, parallel corpora for translation) that can be used to train machines. But there are no training data for common sense.

Brain scan: Terry Winograd

The Winograd Schema tests computers’ “understanding” of the real world
THE Turing Test was conceived as a way to judge whether true artificial intelligence has been achieved. If a computer can fool humans into thinking it is human, there is no reason, say its fans, to say the machine is not truly intelligent.

Few giants in computing stand with Turing in fame, but one has given his name to a similar challenge: Terry Winograd, a computer scientist at Stanford. In his doctoral dissertation Mr Winograd posed a riddle for computers: “The city councilmen refused the demonstrators a permit because they feared violence. Who feared violence?”

It is a perfect illustration of a well-recognised point: many things that are easy for humans are crushingly difficult for computers. Mr Winograd went into AI research in the 1960s and 1970s and developed an early natural-language program called SHRDLU that could take commands and answer questions about a group of shapes it could manipulate: “Find a block which is taller than the one you are holding and put it into the box.” This work brought a jolt of optimism to the AI crowd, but Mr Winograd later fell out with them, devoting himself not to making machines intelligent but to making them better at helping human beings. (These camps are sharply divided by philosophy and academic pride.) He taught Larry Page at Stanford, and after Mr Page went on to co-found Google, Mr Winograd became a guest researcher at the company, helping to build Gmail.

In 2011 Hector Levesque of the University of Toronto became annoyed by systems that “passed” the Turing Test by joking and avoiding direct answers. He later asked to borrow Mr Winograd’s name and the format of his dissertation’s puzzle to pose a more genuine test of machine “understanding”: the Winograd Schema. The answers to its battery of questions were obvious to humans but would require computers to have some reasoning ability and some knowledge of the real world. The first official Winograd Schema Challenge was held this year, with a $25,000 prize offered by Nuance, the language-software company, for a program that could answer more than 90% of the questions correctly. The best of them got just 58% right.

Though officially retired, Mr Winograd continues writing and researching. One of his students is working on an application for Google Glass, a computer with a display mounted on eyeglasses. The app would help people with autism by reading the facial expressions of conversation partners and giving the wearer information about their emotional state. It would allow the wearer to integrate linguistic and non-linguistic information in a way that people with autism find difficult, as do computers.

Asked to trick some of the latest digital assistants, like Siri and Alexa, Mr Winograd asks them things like “Where can I find a nightclub my Methodist uncle would like?”, which requires knowledge about both nightclubs (which such systems have) and Methodist uncles (which they don’t).

When he tried “Where did I leave my glasses?”, one of them came up with a link to a book of that name. None offered the obvious answer: “How would I know?”

Knowledge of the real world is another matter. AI has helped data-rich companies such as America’s West-Coast tech giants organise much of the world’s information into interactive databases such as Google’s Knowledge Graph. Some of the content of that appears in a box to the right of a Google page of search results for a famous figure or thing. It knows that Jacob Bernoulli studied at the University of Basel (as did other people, linked to Bernoulli through this node in the Graph) and wrote “On the Law of Large Numbers” (which it knows is a book).

Organising information this way is not difficult for a company with lots of data and good AI capabilities, but linking information to language is hard. Google touts its assistant’s ability to answer questions like “Who was president when the Rangers won the World Series?” But Mr Pereira concedes that this was the result of explicit training. Another such complex query—“What was the population of London when Samuel Johnson wrote his dictionary?”—would flummox the assistant, even though the Graph knows about things like the historical population of London and the date of Johnson’s dictionary. IBM’s Watson system, which in 2011 beat two human champions at the quiz show “Jeopardy!”, succeeded mainly by calculating huge numbers of potential answers based on key words by probability, not by a human-like understanding of the question.

Making real-world information computable is challenging, but it has inspired some creative approaches. One Vienna-based startup took hundreds of Wikipedia articles, cut them into thousands of small snippets of information and ran an “unsupervised” machine-learning algorithm over them that required the computer not to look for anything in particular but to find patterns. These patterns were then represented as a visual “semantic fingerprint” on a grid of 128x128 pixels. Clumps of pixels in similar places represented semantic similarity. This method can be used to disambiguate words with multiple meanings: the fingerprint of “organ” shares features with both “liver” and “piano” (because the word occurs with both in different parts of the training data). This might allow a natural-language system to distinguish between pianos and church organs on one hand, and livers and other internal organs on the other.
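The fingerprint comparison can be sketched with sets of active grid cells. This is not the startup's algorithm: the clumps are placed by hand rather than learned, and the Jaccard index is an assumed choice of overlap measure.

```python
GRID = 128  # fingerprints live on a 128x128 grid, as described above

def block(top, left, size=4):
    """A hypothetical square clump of active cells on the grid."""
    cells = {(r, c) for r in range(top, top + size)
             for c in range(left, left + size)}
    assert all(0 <= r < GRID and 0 <= c < GRID for r, c in cells)
    return cells

def overlap(a, b):
    """Share of active cells two fingerprints have in common (Jaccard index)."""
    return len(a & b) / len(a | b)

# Hand-placed clumps standing in for learned regions of meaning.
anatomy = block(10, 10)
music = block(100, 90)

fingerprints = {
    "liver": anatomy | block(20, 30),
    "piano": music | block(60, 120),
    "organ": anatomy | music,  # the ambiguous word shares both clumps
}

organ_liver = overlap(fingerprints["organ"], fingerprints["liver"])
organ_piano = overlap(fingerprints["organ"], fingerprints["piano"])
liver_piano = overlap(fingerprints["liver"], fingerprints["piano"])
print(organ_liver, organ_piano, liver_piano)
```

In this toy setup "organ" overlaps with both of its neighbours while "liver" and "piano" share nothing, which is the pattern a system could use to tell the two senses apart from context.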

Proper conversation between humans and machines can be seen as a series of linked challenges: speech recognition, speech synthesis, syntactic analysis, semantic analysis, pragmatic understanding, dialogue, common sense and real-world knowledge. Because all the technologies have to work together, the chain as a whole is only as strong as its weakest link, and the first few of these are far better developed than the last few.

The hardest part is linking them together. Scientists do not know how the human brain draws on so many different kinds of knowledge at the same time. Programming a machine to replicate that feat is very much a work in progress.

Talking machines are the new must-haves

IN “WALL-E”, an animated children’s film set in the future, all humankind lives on a spaceship after the Earth’s environment has been trashed. The humans are whisked around in intelligent hovering chairs; machines take care of their every need, so they are all morbidly obese. Even the ship’s captain is not really in charge; the actual pilot is an intelligent and malevolent talking robot, Auto, and like so many talking machines in science fiction, he eventually makes a grab for power.

Speech is quintessentially human, so it is hard to imagine machines that can truly speak conversationally as humans do without also imagining them to be superintelligent. And if they are superintelligent, with none of humans’ flaws, it is hard to imagine them not wanting to take over, not only for their good but for that of humanity. Even in a fairly benevolent future like “WALL-E’s”, where the machines are doing all the work, it is easy to see that the lack of anything challenging to do would be harmful to people.

Fortunately, the tasks that talking machines can take off humans’ to-do lists are the sort that many would happily give up. Machines are increasingly able to handle difficult but well-defined jobs. Soon all that their users will have to do is pipe up and ask them, using a naturally phrased voice command.

Once upon a time, just one tinkerer in a given family knew how to work the computer or the video recorder. Then graphical interfaces (icons and a mouse) and touchscreens made such technology accessible to everyone. Frank Chen of Andreessen Horowitz, a venture-capital firm, sees natural-language interfaces between humans and machines as just another step in making information and services available to all. Silicon Valley, he says, is enjoying a golden age of AI technologies. Just as in the early 1990s companies were piling online and building websites without quite knowing why, now everyone is going for natural language. Yet, he adds, “we’re in 1994 for voice.”

1995 will soon come. This does not mean that people will communicate with their computers exclusively by talking to them. Websites did not make the telephone obsolete, and mobile devices did not make desktop computers obsolete. In the same way, people will continue to have a choice between voice and text when interacting with their machines.

Not all will choose voice. For example, in Japan yammering into a phone is not done in public, whether the interlocutor is a human or a digital assistant, so usage of Siri is low during business hours but high in the evening and at the weekend. For others, voice-enabled technology is an obvious boon.

It allows dyslexic people to write without typing, and the very elderly may find it easier to talk than to type on a tiny keyboard. The very young, some of whom today learn to type before they can write, may soon learn to talk to machines before they can type.

Those with injuries or disabilities that make it hard for them to write will also benefit. Microsoft is justifiably proud of a new device that will allow people with amyotrophic lateral sclerosis (ALS), which immobilises nearly all of the body but leaves the mind working, to speak by using their eyes to pick letters on a screen. The critical part is predictive text, which improves as it gets used to a particular individual. An experienced user will be able to “speak” at around 15 words per minute.

People may even turn to machines for company. Microsoft’s Xiaoice, a chatbot launched in China, learns to come up with the responses that will keep a conversation going longest. Nobody would think it was human, but it does make users open up in surprising ways. Jibo, a new “social robot”, is intended to tell children stories, help far-flung relatives stay in touch and the like.

Another group that may benefit from technology is smaller language communities. Networked computers can encourage a winner-take-all effect: if there is a lot of good software and content in English and Chinese, smaller languages become less valuable online. If they are really tiny, their very survival may be at stake. But Ross Perlin of the Endangered Languages Alliance notes that new software allows researchers to document small languages more quickly than ever. With enough data comes the possibility of developing resources—from speech recognition to interfaces with software—for smaller and smaller languages. The Silicon Valley giants already localise their services in dozens of languages; neural networks and other software allow new versions to be generated faster and more efficiently than ever.

There are two big downsides to the rise in natural-language technologies: the implications for privacy, and the disruption they will bring to many jobs.

Increasingly, devices are always listening. Digital assistants like Alexa, Cortana, Siri and Google Assistant are programmed to wait for a prompt, such as “Hey, Siri” or “OK, Google”, to activate them. But allowing always-on microphones into people’s pockets and homes amounts to a further erosion of traditional expectations of privacy. The same might be said for all the ways in which language software improves by training on a single user’s voice, vocabulary, written documents and habits.

All the big companies’ location-based services—even the accelerometers in phones that detect small movements—are making ever-improving guesses about users’ wants and needs. The moment when a digital assistant surprises a user with “The chemist is nearby—do you want to buy more haemorrhoid cream, Steve?” could be when many may choose to reassess the trade-off between amazing new services and old-fashioned privacy. The tech companies can help by giving users more choice; the latest iPhone will not be activated when it is laid face down on a table. But hackers will inevitably find ways to get at some of these data.

The other big concern is for jobs. To the extent that they are routine, they face being automated away. A good example is customer support. When people contact a company for help, the initial encounter is usually highly scripted. A company employee will verify a customer’s identity and follow a decision-tree. Language technology is now mature enough to take on many of these tasks.

For a long transition period humans will still be needed, but the work they do will become less routine. Nuance, which sells lots of automated online and phone-based help systems, is bullish on voice biometrics (customers identifying themselves by saying “my voice is my password”).

Using around 200 parameters for identifying a speaker, it is probably more secure than a fingerprint, says Brett Beranek, a senior manager at the company. It will also eliminate the tedium, for both customers and support workers, of going through multi-step identification procedures with PINs, passwords and security questions. When Barclays, a British bank, offered it to frequent users of customer-support services, 84% signed up within five months.

Digital assistants on personal smartphones can get away with mistakes, but for some business applications the tolerance for error is close to zero, notes Nikita Ivanov. His company, Datalingvo, a Silicon Valley startup, answers questions phrased in natural language about a company’s business data. If a user wants to know which online ads resulted in the most sales in California last month, the software automatically translates his typed question into a database query. But behind the scenes a human working for Datalingvo vets the query to make sure it is correct. This is because the stakes are high: the technology is bound to make mistakes in its early days, and users could make decisions based on bad data.

This process can work the other way round, too: rather than natural-language input producing data, data can produce language. Arria, a company based in London, makes software into which a spreadsheet full of data can be dragged and dropped, to be turned automatically into a written description of the contents, complete with trends. Matt Gould, the company’s chief strategy officer, likes to think that this will free chief financial officers from having to write up the same old routine analyses for the board, giving them time to develop more creative approaches.

Carl Benedikt Frey, an economist at Oxford University, has researched the likely effect of artificial intelligence on the labour market and concluded that the jobs most likely to remain immune include those requiring creativity and skill at complex social interactions. But not every human has those traits. Call centres may need fewer people as more routine work is handled by automated systems, but the trickier inquiries will still go to humans.

Much of this seems familiar. When Google search first became available, it turned up documents in seconds that would have taken a human operator hours, days or years to find.

This removed much of the drudgery from being a researcher, librarian or journalist. More recently, young lawyers and paralegals have taken to using e-discovery. These innovations have not destroyed the professions concerned but merely reshaped them.

Machines that relieve drudgery and allow people to do more interesting jobs are a fine thing. In net terms they may even create extra jobs. But any big adjustment is most painful for those least able to adapt. Upheavals brought about by social changes—like the emancipation of women or the globalisation of labour markets—are already hard for some people to bear. When those changes are wrought by machines, they become even harder, and all the more so when those machines seem to behave more and more like humans. People already treat inanimate objects as if they were alive: who has never shouted at a computer in frustration? The more that machines talk, and the more that they seem to understand people, the more their users will be tempted to attribute human traits to them.

That raises questions about what it means to be human. Language is widely seen as humankind’s most distinguishing trait. AI researchers insist that their machines do not think like people, but if they can listen and talk like humans, what does that make them? As humans teach ever more capable machines to use language, the once-obvious line between them will blur.