Hadoop-onomics: Banking on big data opportunities
Jessica Twentyman — May 2014

How the world’s largest financial services companies are joining the rush to transform big data into big value.

Given that Hadoop was born a decade ago within the Silicon Valley R&D labs of Yahoo, it is hardly surprising that the first users of the open source big data analysis framework were other Internet giants such as Facebook, eBay and Twitter. Its innate ability to handle complex mixes of structured and unstructured data — log files, pictures, audio files, text and more — also meant it was an instant hit with many media companies.

Such early adopters had one thing in common: they wanted to analyze vast volumes of mostly unstructured data by spreading the load across large pools of low-cost commodity hardware with the goal of attaining highly valuable insight into customer usage patterns and preferences.
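The divide-and-conquer pattern those early adopters relied on is Hadoop's MapReduce model: mappers emit key–value pairs from their slice of the data, the framework shuffles pairs by key across the cluster, and reducers aggregate each key's values. As a minimal sketch — plain single-machine Python standing in for a real Hadoop cluster — the canonical word-count job looks like this:

```python
from collections import defaultdict

# Toy stand-in for Hadoop's MapReduce flow: map, shuffle, reduce.
# On a real cluster, each stage runs in parallel on commodity machines.

def mapper(document):
    # Emit a (word, 1) pair for every word in this document split.
    for word in document.lower().split():
        yield word, 1

def shuffle(mapped_pairs):
    # Group all values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Aggregate one key's values into a single result.
    return key, sum(values)

def word_count(documents):
    pairs = [pair for doc in documents for pair in mapper(doc)]
    return dict(reducer(k, v) for k, v in shuffle(pairs).items())
```

Because each mapper and reducer works only on its own slice of data, the same job scales out simply by adding more low-cost machines — the economic shift the early adopters were exploiting.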

In the past few years, however, the power of Hadoop has started to be recognized in more conservative industries and among far more risk-averse organizations — not least those in the financial services sector.

According to Mike Olson, founder and chief strategy officer at Cloudera, one of the largest distributors of the open source platform, three out of the top five banks in the US are already customers of the company. In Europe, he adds, major banks were among the company’s earliest customer wins. “Banks are big users of Hadoop; they just tend not to talk about it publicly,” he says. Or at least they didn’t: now that they are seeing concrete results from the technology’s deployment, some can’t contain their enthusiasm.
Monetizing data

At the recent Hadoop Summit Europe in Amsterdam, for example, international banks HSBC and ING were only too keen to outline the transformational power of Hadoop, as were senior executives from BSkyB, BT and Deutsche Telekom.

That potential not only results from the fact that Hadoop is recognized as being “faster and cheaper than traditional technologies,” as one keynote speaker emphasized, but because it has “fundamentally altered the cost and risk profile of building data platforms at scale. It enables businesses to monetize data and develop product that, just a few years ago, would have been deemed high risk or too costly on traditional data platforms.”

Those kinds of glowing endorsements are taking Hadoop into the mainstream and encouraging others to move beyond early experimentation. At Komerční banka, the Czech financial services provider owned by France’s Société Générale, head of information management Petr Novak says he has plans for the technology.

“This year we want to get a proof of concept in place to trial the analysis of unstructured data,” he says.
Deep customer insight

Novak sees three significant use-cases for Hadoop at the bank. First, it might be valuable in exploring the written records taken by the bank’s relationship managers when meeting with customers. This would give Komerční banka a better understanding of which relationship managers are more effective at signing up customers for particular financial services products, as well as highlighting those customers who might be receptive to upselling or cross-selling campaigns.

Petr Novak, head of information management, Komerční banka

Second, Hadoop could be used to analyze the free-text fields that exist on some of the bank's application forms and transaction documents, where customers are able to record their own text: marking a payment as ‘personal funds’ or ‘loan payment,’ for example.

And third, it could be used to explore customer sentiment — both positive and negative — expressed in emails to the bank. “This kind of sentiment analysis would allow us to understand better what products and services they need from us and what aspects of our service they complain about the most,” says Novak.
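As a rough illustration of that third use case — the word lists, function names and scoring here are invented for this sketch, not Komerční banka's actual approach, and production systems would use trained language models rather than keyword lists — a simple sentiment score over customer emails might look like:

```python
# Naive keyword-based sentiment scoring -- purely illustrative.
POSITIVE = {"thanks", "great", "helpful", "excellent", "pleased"}
NEGATIVE = {"complaint", "slow", "error", "unhappy", "disappointed"}

def sentiment_score(email_text):
    """Return positive-word hits minus negative-word hits."""
    words = [w.strip(".,!?") for w in email_text.lower().split()]
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos - neg

def classify(email_text):
    # Map the raw score to a coarse label.
    score = sentiment_score(email_text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Run over millions of stored emails in a Hadoop cluster, even a coarse signal like this can surface which products attract praise and which attract complaints.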

“Historically, the biggest questions we’ve had to answer as a bank have been questions about our own operations, often put to us by regulators,” says Novak. This, he adds, is where the bank already makes extensive use of structured data analysis and data warehousing technologies from Teradata. “But new questions are emerging for banks to answer — and these are questions that we ask ourselves about our customers,” he says. “This is where I can really see Hadoop playing a part in our information management strategy.”

Beyond this customer focus, the most common uses of Hadoop by banks, according to Olson of Cloudera, are for fraud detection and anti-money laundering initiatives, portfolio valuations and market-risk modeling.


All these use-cases seek to draw value from mountains of unstructured data, but for banks itching to launch their own Hadoop pilots, that may not be the best place to begin, according to Mike Gualtieri, an analyst with IT market research company Forrester Research.

“Most companies have some use for Hadoop right now, but the need to analyze unstructured data is not a prerequisite. In fact, I’d go as far as to say that that is how to start the hard way,” he says.

A better approach, he says, is to test out Hadoop with structured data first. Most banks have plenty of that, much of which goes unanalyzed today. In fact, says Gualtieri, across all sectors, most companies only ever analyze around 12% of the data they hold.

“They’re constantly leaving potentially valuable data on the cutting-room floor and that’s often because of the costs associated with staging it in enterprise data warehouses,” he adds. Hadoop, by contrast, provides a low-cost way to gather together data from a range of different databases and systems to discover new insights.
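Gualtieri's point about pooling structured data can be sketched as a simple join across two hypothetical source feeds — the field names and helper below are invented for illustration; in practice this is the kind of consolidation that Hadoop-based tools such as Hive express in SQL-like form over files in HDFS:

```python
# Illustrative consolidation of structured records from two hypothetical
# source systems: an account master file and a transaction log.

def join_on_account(accounts, transactions):
    """Inner-join transaction records onto account records by account_id,
    dropping transactions with no matching account."""
    by_id = {a["account_id"]: a for a in accounts}
    joined = []
    for t in transactions:
        acct = by_id.get(t["account_id"])
        if acct is not None:
            # Merge the two records into one enriched row.
            joined.append({**acct, **t})
    return joined
```

The appeal of running this on Hadoop rather than in an enterprise data warehouse is economic: the raw feeds can sit on cheap commodity storage and be joined on demand, instead of being staged and modeled up front.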

“The biggest problem I see in current data analysis environments is fragmentation. The biggest challenge for companies is bringing together regular, structured data from multiple data sources,” he says. As a result, this is one of the most common opportunities for Hadoop deployment today, but it’s not one considered ‘sexy’ enough for vendors to shout much about, he says.

Regardless, getting to grips with structured data in Hadoop is the ‘low-hanging fruit’ that can help banks and others achieve their first quick wins with the technology. “Hadoop can handle both structured and unstructured data, which is what’s so interesting about the platform. Over time, most organizations will do both but for now, start with structured and then move onto unstructured,” Gualtieri counsels.

