Executives talk about the value of data in generalities, but Michele Koch, director of enterprise data intelligence at Navient Solutions, can calculate the actual worth of her company’s data.
In fact, Koch can figure, in real dollars, the increased revenue and decreased costs produced by the company’s various data elements. As a result, she is well aware that problems within Navient’s data can hurt its bottom line. A mistake in a key data field within a customer’s profile, for instance, could mean the company can’t process a loan at the lowest cost.
“There’s money involved here, so we have a data quality dashboard where we track all of this. We track actual and potential value,” she says.
An early data-related initiative within Navient, an asset management and business processing service company based in Wilmington, Del., illustrates what’s at stake, says Barbara Deemer, chief data steward and vice president of finance. The 2006 initiative focused on improving data quality for marketing and yielded a $7.2 million ROI, with returns coming from an increased loan volume and decreased operating expenses.
Since then, Navient executives have committed themselves to supporting a strong data governance program as a key part of a successful analytics effort, Koch says. Navient’s governance program includes long-recognized best practices, such as standardizing definitions for data fields and ensuring clean data.
It assigns ownership for each of its approximately 2,600 enterprise data elements; ownership goes to either the business area where the data field originated or the business area where the field is integral to its processes.
The company also runs a data quality program that actively monitors fields to ensure high standards are consistently met, and it launched a Data Governance Council (in 2006) and an Analytics Data Governance Council (in 2017) to address ongoing questions or concerns, make decisions across the enterprise, and continually improve data operations and how data feeds the company’s analytics work.
“Data is so important to our business initiatives and to new business opportunities that we want to focus on always improving the data that supports our analytics program,” Koch says.
Most executives agree that data governance is vital, citing compliance, customer satisfaction and better decision-making as key drivers, according to the 2018 State of Data Governance report from data governance solutions company Erwin and UBM. However, the report found that almost 40 percent of responding organizations don’t have a separate budget for data governance and some 46 percent don’t have a formal strategy for it.
The findings are based on 118 responses from CIOs, CTOs, data center managers, IT staff and consultants.
Given those figures, experts say it’s not surprising that there are weak spots in many enterprise data programs. Here’s a look at seven such problematic data practices.
Bringing data together, but not really integrating it
Integration tops the list of challenges in the world of data and analytics today, says Anne Buff, vice president of communications for the Data Governance Professionals Organization.
True, many organizations gather all their data in one place. But in reality they don’t integrate the various pieces from the multiple data sources, Buff explains. So the Bill Smith from one system doesn’t connect with the data on Bill Smith (and the variations of his name) generated by other systems. This gives the business multiple, incomplete pictures of who he is.
“Co-located data is not the same as integrated data,” Buff says. “You have to have a way to match records from disparate sources. You need to make it so, when this all comes together, it creates this larger view of who Bill Smith is. You have to have something to connect the dots.”
Various data integration technologies enable that, Buff says, and selecting, implementing and executing the right tools is critical to avoid both excessive manual work and endlessly redoing the same work.
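Buff’s “connect the dots” idea can be sketched as a toy fuzzy-matching pass. The records, field names and 0.7 threshold below are illustrative assumptions; production entity-resolution tools are far more sophisticated:

```python
from difflib import SequenceMatcher

# Hypothetical records for the same customer from two source systems
crm_record = {"name": "Bill Smith", "address": "12 Oak St"}
billing_record = {"name": "William Smith", "addr": "12 Oak Street"}

def normalize(text):
    """Lowercase and drop punctuation so formatting quirks don't block a match."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()

def similarity(a, b):
    """Return a 0-1 similarity score between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

# Blend name and address similarity into a single match score
score = (0.5 * similarity(crm_record["name"], billing_record["name"])
         + 0.5 * similarity(crm_record["address"], billing_record["addr"]))

if score > 0.7:  # threshold is an assumption for illustration
    print(f"Likely the same customer (score={score:.2f})")
```

Real master data management platforms add phonetic matching, nickname dictionaries (so “Bill” resolves to “William”) and survivorship rules, but the principle is the same: a repeatable way to decide that two records describe one entity.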
Moreover, integration is becoming increasingly critical because data scientists are searching for patterns within data to gain the kind of insights that can yield breakthroughs, competitive advantages and the like.
“But if you can’t bring together data that has never been brought together before, you can’t find those patterns,” says Buff, who is also an advisory business solutions manager at SAS in Cary, N.C.
Not realizing business units have unique needs
Yes, consolidated, integrated data is critical for a successful analytics program. But some business users may need a different version of that data, Buff says.
“Data in one form doesn’t meet the needs for everyone across the organization,” she adds.
Instead, IT needs to think about data provisioning, that is, providing data in the form that fits the business case defined by each business user or division.
She points to a financial institution’s varying needs as an example. While some departments might want integrated data, the fraud detection department might want its data scientists to work with unfettered data that isn’t cleaned so they can search for red flags, such as someone at the same address using slight variations of their personal identifying information to apply for multiple loans.
“You’ll see similar data elements but with some variables, so you don’t want to knock out too much of those variances and clean it up too much,” Buff explains.
On the other hand, she says, the marketing department at that financial institution would want to have the correct version of a customer’s name, address and the like to properly target communications.
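That split can be sketched as a simple provisioning function; the records and consumer names here are hypothetical:

```python
# Hypothetical loan applications: variant spellings at one address are exactly
# what the fraud team wants to see, and exactly what marketing wants cleaned up.
raw_applications = [
    {"name": "Bill Smith", "address": "12 Oak St", "loan_id": 101},
    {"name": "B. Smith",   "address": "12 Oak St", "loan_id": 102},
    {"name": "Bill Smyth", "address": "12 Oak St", "loan_id": 103},
]

def provision(consumer):
    """Serve the same underlying data in the shape each business unit needs."""
    if consumer == "fraud":
        # Untouched records: the variances are the signal
        return raw_applications
    if consumer == "marketing":
        # One canonical record per address for targeted communications
        by_address = {}
        for rec in raw_applications:
            by_address.setdefault(rec["address"],
                                  {"name": rec["name"], "address": rec["address"]})
        return list(by_address.values())
    raise ValueError(f"unknown consumer: {consumer}")

print(len(provision("fraud")), len(provision("marketing")))  # 3 raw rows, 1 cleaned
```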
Recruiting only data scientists, not data engineers, too
As companies seek to move beyond basic business intelligence to predictive and prescriptive analytics as well as machine learning and artificial intelligence, they need increasing levels of expertise on their data teams.
That in turn has shined a spotlight on the data scientist position. But equally important is the data engineer, who wrangles all the data sets that need to come together for data scientists to do their work but has (so far) gained less attention in many organizations.
That’s been changing, says Lori Sherer, a partner in Bain & Co.’s San Francisco office and leader of the firm’s Advanced Analytics and Digital practices.
“We’ve seen the growth in the demand for data engineer is about 2x the growth in the demand for data scientist,” Sherer says.
The federal Bureau of Labor Statistics predicts that demand for data engineers will continue to grow at a fast clip for the next decade, with the U.S. economy adding 44,200 positions between 2016 and 2026; average annual pay is already $135,800.
Yet, like many key positions in IT, experts say there aren’t enough data engineers to match demand, leaving IT departments that are only now beginning to hire or train for the position playing catch-up.
Keeping data past its prime, instead of managing its lifecycle
The cost of storage has dropped dramatically over the past decade, enabling IT to more easily afford to store reams of data for much longer than it ever could before. That might seem like good news, considering the volume and speed at which data is now created along with the increasing demand to have it for analysis.
But while many have hailed the value of having troves and troves of data, it’s often too much of a good thing, says Penny Garbus, co-founder of Soaring Eagle Consulting in Apollo Beach, Fla., and co-author of Mining New Gold: Managing Your Business Data.
Garbus says too many businesses hold onto data for way too long.
“Not only do you have to pay for it, but if it’s older than 10 years, chances are the information is far from current,” she says. “We encourage people to put some timelines on it.”
The expiration date for data varies not only from organization to organization but also from department to department, Garbus says. The inventory division within a retail company might only want relatively recent data, while marketing might want data that’s years old to track trends.
If that’s the case, IT needs to implement the architecture that delivers the right timeframe of data to the right spot, to ensure everyone’s needs are met and old data doesn’t corrupt timely analytics programs.
As Garbus notes: “Just because you have to keep [old data], doesn’t mean you have to keep it inside your core environment. You just have to have it.”
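Those department-level timelines can be sketched as a retention table; the windows, dates and departments below are invented for illustration:

```python
from datetime import date, timedelta

# Hypothetical per-department retention windows, in days
RETENTION = {
    "inventory": 365,       # relatively recent data only
    "marketing": 365 * 10,  # long history for trend analysis
}

records = [
    {"dept": "inventory", "created": date(2012, 3, 1)},
    {"dept": "inventory", "created": date(2018, 6, 1)},
    {"dept": "marketing", "created": date(2012, 3, 1)},
]

def in_window(rec, today=date(2018, 12, 1)):
    """A record stays in the core environment only while its department needs it."""
    return today - rec["created"] <= timedelta(days=RETENTION[rec["dept"]])

core = [r for r in records if in_window(r)]
archived = [r for r in records if not in_window(r)]
print(len(core), len(archived))  # the stale inventory row leaves the core environment
```

Note that, per Garbus, the archived rows aren’t necessarily deleted; they just move out of the core environment to cheaper storage.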
Focusing on volume, rather than targeting relevancy
“We’re still building models and running analytics with the data that is most available rather than with the data that is most relevant,” says Steve Escaravage, senior vice president of IT consulting company Booz Allen Hamilton.
He says organizations frequently hold the mistaken notion that they should capture and add more and more datasets. He says they think “that maybe there’s something in there that we haven’t found rather than asking: Do we have the right data?”
Consider, he says, that many institutions look for fraud by analyzing vast amounts of data for anomalies. While that’s an important activity, leading institutions also analyze more targeted datasets that can yield better results. In this case, they might look at individuals or institutions generating certain types of transactions that could indicate trouble. Or healthcare institutions, when analyzing patient outcomes, might consider data on how long doctors had been on their shifts when they delivered patient care.
Escaravage says organizations could start by creating a data wish list. Although that exercise starts with the business side, “the mechanisms to capture it and make it available, that’s the realm of the CIO, CTO or chief data officer.”
Providing data, but ignoring where it came from
One of the big topics today is bias in analytics, a scenario that can skew results or even produce faulty conclusions that lead to bad business decisions or outcomes. The problems that produce bias reside in many different arenas within an enterprise analytics program — including how IT handles the data itself, Escaravage says.
Too often, he says, IT doesn’t do a good enough job tracking the provenance of the data it holds.
“And if you don’t know that, it can impact the performance of your models,” Escaravage says, noting the lack of visibility into how and where data originated makes controlling for bias even more difficult.
“It’s IT’s responsibility to understand where the data came from and what happened to it. There’s so much investment in data management, but there should also be a metadata management solution,” he says.
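A minimal sketch of the kind of lineage tracking Escaravage describes, assuming a hypothetical feed name; real metadata management platforms cover far more (schemas, owners, quality scores):

```python
class TrackedDataset:
    """Toy lineage wrapper: every transformation is logged alongside the data."""

    def __init__(self, rows, source):
        self.rows = rows
        self.lineage = [f"loaded from {source}"]

    def transform(self, fn, description):
        """Apply a row-level transformation and record what was done."""
        self.rows = [fn(r) for r in self.rows]
        self.lineage.append(description)
        return self

ds = TrackedDataset([{"amount": "100"}, {"amount": "250"}], source="loans_feed_2018")
ds.transform(lambda r: {**r, "amount": int(r["amount"])}, "cast amount to int")

# A model builder can now ask where the data came from and what happened to it
for step in ds.lineage:
    print(step)
```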
Providing data, but failing to help users understand context
IT should not only have a strong metadata management program in place, tracking the origin of data and how it moves through its systems; it should also give users insight into some of that history and provide context for some of the results produced via analytics, Escaravage says.
“We get very excited about what we can create. We think we have pretty good data, particularly data that’s not been analyzed, and we can build a mental model around how this data will be helpful,” he says. “But while the analytics methods of the past half-decade have been amazing, the results of these techniques are less interpretable than in the past when you had business rules applied after doing the data mining and it was easy to interpret the data.”
The newer, deep learning models offer insights and actionable suggestions, Escaravage explains. But these systems don’t usually provide context that could be helpful or even critical to the best decision-making. They don’t provide, for instance, information on the probability vs. the certainty that something will occur based on the data.
Better user interfaces are needed to help provide that context, Escaravage says.
“The technical issue is how people will interface with these models. This is where a focus on the UI/UX from a transparency standpoint will be very important. So if someone sees a recommendation from an AI platform, to what degree can they drill down to see an underlying probability, the data source, etc.?” he says. “CIOs will have to ask how to build into their systems that level of transparency.”
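In toy form, the drill-down Escaravage has in mind might surface the probability and data source behind a recommendation rather than a bare answer; the prediction record and field names are assumptions:

```python
# Hypothetical model output, carrying its evidence along with the recommendation
prediction = {
    "label": "approve",
    "probability": 0.82,          # a probability, not a certainty
    "data_source": "loans_feed_2018",
}

def render(pred):
    """Drill-down view: the recommendation plus the context behind it."""
    return (f"Recommendation: {pred['label']} "
            f"({pred['probability']:.0%} probability, source: {pred['data_source']})")

print(render(prediction))
```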
AMAZON ERROR ALLOWED ALEXA USER TO EAVESDROP ON ANOTHER HOME
A user of Amazon’s Alexa voice assistant in Germany got access to more than a thousand recordings from another user because of “a human error” by the company.
The customer had asked to listen back to recordings of his own activities made by Alexa, but he was also able to access 1,700 audio files from a stranger when Amazon sent him a link, German trade publication c’t reported.
“This unfortunate case was the result of a human error and an isolated single case,” an Amazon spokesman said.
The first customer had initially got no reply when he told Amazon about the access to the other recordings, the report said. The files were then deleted from the link provided by Amazon but he had already downloaded them on to his computer, added the report from c’t, part of German tech publisher Heise.
CRYPTOCURRENCY INDUSTRY FACES INSURANCE HURDLE TO MAINSTREAM AMBITIONS
Cryptocurrency exchanges and traders in Asia are struggling to insure themselves against the risk of hacks and theft, a factor they claim is deterring large fund managers from investing in a nascent market yet to be embraced by regulators.
Getting the buy-in from insurers would mark an important step in crypto industry efforts to show that it has solved the problem of storing digital assets safely following the reputational damage of a series of thefts, and allow it to attract investment from mainstream asset managers.
“Most institutionally minded crypto firms want to buy proper insurance, and in many cases, getting adequate insurance coverage is a regulatory or legal requirement,” said Henri Arslanian, PwC fintech and crypto leader for Asia.
“However, getting such coverage is almost impossible despite their best efforts.”
Many asset managers are interested in digital assets. A Greenwich Associates survey, published in September, found that 72% of responding institutional investors believe crypto has a place in the future.
Last month, Mohamed El-Erian, Allianz’s chief economic adviser, said that cryptocurrencies would gain wider acceptance as institutions began to invest in the space.
Most have held off investing so far, however, citing regulatory uncertainty and a lack of faith in existing market infrastructure for storing and trading digital assets following a series of hacks, as well as the plunge in prices.
The total market capitalisation of cryptocurrencies is currently estimated at approximately US$120bil (RM502bil), compared to over US$800bil (RM3.3tril) at its peak in January.
“Institutional investors who are interested in investing in crypto will have various requirements, including reliable custody and risk management arrangements,” said Hoi Tak Leung, a senior lawyer in Ashurst’s digital economy practice.
“Insufficient insurance coverage, particularly in a volatile industry such as crypto, will be a significant impediment to greater ‘institutionalisation’ of crypto investments.”
Regulatory uncertainty is another problem for large asset managers. While cryptocurrencies raise a number of concerns for regulators, including money laundering risks, few have set out clear frameworks for how cryptocurrencies should be traded, and by whom.
Insurance might allay some of the regulators’ concerns around cyber security. Hong Kong’s Securities and Futures Commission recently said it was exploring regulating crypto exchanges, and signalled that the vast majority of the virtual assets held by a regulated exchange would need insurance cover.
Keeping crypto assets secure involves storing a 64-character alphanumeric private key. If the key is lost, the assets are effectively lost too.
Assets can be stored online, in so-called hot wallets, which are convenient to trade though vulnerable to being hacked, or in ‘cold’ offline storage solutions, safe from hacks, but often inconvenient to access frequently.
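The key format itself is simple to illustrate: 64 hexadecimal characters encode 32 random bytes. This sketch only generates such a key, not a real wallet; where the resulting string is kept is what distinguishes hot from cold storage:

```python
import secrets

# 32 random bytes rendered as 64 hex characters, the conventional private-key form
private_key = secrets.token_hex(32)
print(len(private_key))  # 64

# "Cold" storage keeps this string entirely offline (paper or a hardware device);
# a "hot" wallet keeps it on a networked machine for convenient trading, which is
# what exposes it to theft. Lose the string in either case and the assets are gone.
```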
Over US$800mil worth of cryptocurrencies was stolen in the first half of this year, according to data from Autonomous NEXT, a financial research firm.
Some institutions have started working to solve this problem, and may provide fierce competition to the incumbent players.
This year, Fidelity, and a group including Japanese investment bank Nomura have launched platforms that will offer custody services for digital assets.
Despite the industry’s complaints, insurers say that they do offer cover. Risk advisor Aon received some two dozen inquiries this year from exchanges and crypto vaults seeking insurance, according to Thomas Cain, regional director, commercial risk solutions, at Aon’s Asian financial services and professions group.
“It is not difficult to insure companies that hold large amounts of crypto assets, but given the newness of the asset class and the publicity some of the crypto breaches have received, applicants need to make an effort to distinguish themselves,” Cain said.
The industry also says it is getting closer to solving the custody problem.
“This year there have been a number of developments, and some providers have developed custody solutions suitable for institutional clients’ needs,” said Tony Gravanis, managing director of investments at blockchain investment firm Kenetic Capital.
“Players at the top end of the market have also been able to get insurance,” he said.
But this is not the case for all.
One cryptocurrency broker, declining to be named because of the subject’s sensitivity, said insurers struggled to understand the new technology and its implications, and that even those who were prepared to provide insurance would only offer limited cover. “We’ve not yet found an insurer who will offer coverage of a meaningful enough size to make it worthwhile,” he said. – Reuters
PICHAI PUTS KIBOSH ON GOOGLE SEARCH ENGINE FOR CHINA
Google is not working on a bespoke search engine that caters to China’s totalitarian tastes, and it has no plans to develop one, CEO Sundar Pichai told lawmakers on Capitol Hill Tuesday.
“Right now, we have no plans to launch in China,” he told members of the U.S. House Judiciary Committee at a public hearing on Google’s data collection, use and filtering practices.
“We don’t have a search product there,” he said. “Our core mission is to provide users access to information, and getting access to information is an important human right.”
Pichai acknowledged that the company had assigned some 100 workers to develop a search engine for totalitarian countries, however.
“We explored what search would look like if it were to be launched in a country like China,” he revealed.
A report about a Google search engine for China appeared in The Intercept this summer.
The project, code-named “Dragonfly,” had been under way since the spring of 2017, according to the report, but development picked up after Pichai met with Chinese government officials about a year ago.
Special Android apps also had been developed for the Chinese market, The Intercept stated, and had been demonstrated to the Chinese government for a possible rollout this year.
“We certainly hope they abandoned those plans,” said Chris Calabrese, vice president for policy for the Center for Democracy & Technology, an individual rights advocacy group in Washington, D.C.
“We didn’t think it was a good idea to build a search engine that would censor speech in order to go into the Chinese market,” he told the E-Commerce Times.
Google may have been testing the waters with its Chinese search project, maintained Russell Newman, assistant professor at the Institute for Liberal Arts & Interdisciplinary Studies at Emerson College in Boston.
“It’s an example of a firm seeing how far down the road it can go before it receives pushback,” he told the E-Commerce Times. “It discovers a limit, then pushes that limit a little more. I’d be surprised if they wholly gave up on the search engine for China.”
Mission: Protecting Privacy
In his opening remarks to the committee, Pichai declared that protecting the privacy and security of its users was an essential part of Google’s mission.
“We have invested an enormous amount of work over the years to bring choice, transparency and control to our users. These values are built into every product we make,” he said.
“We recognize the important role of governments, including this committee, in setting rules for the development and use of technology,” Pichai added. “To that end, we support federal privacy legislation and proposed a legislative framework for privacy earlier this year.”
Pichai also addressed a burning issue for Republican members of the panel.
“I lead this company without political bias and work to ensure that our products continue to operate that way,” he said. “To do otherwise would go against our core principles and our business interests.”
‘Bias Running Amok’
Among the Republicans on the committee who raised the issue of unfairness with respect to the way Google’s search algorithm treats conservative views was Mike Johnson, R-La.
“My conservative colleagues and I are fierce advocates of limited government, and we’re also committed guardians of free speech and the free marketplace of ideas,” he told Pichai.
“We do not want to impose burdensome government regulations on your industry,” Johnson continued. “However, we do believe we have an affirmative duty to ensure that the engine that processes as much as … 90 percent of all Internet searches, is never unfairly used to unfairly censor conservative viewpoints or suppress political views.”
Political bias is running amok at Google, charged committee member Louie Gohmert, R-Texas.
“You’re so surrounded by liberality that hates conservatism, hates people that really love our Constitution and the freedoms that it’s afforded people like you, that you don’t even recognize it,” he told Pichai, who was born in India.
“It’s like a blind man not even knowing what light looks like because you’re surrounded by darkness,” Gohmert added.
Despite Republican claims of liberal bias in Google’s algorithm, “there isn’t any evidence to back that up empirically,” Calabrese said.
Committee members also were concerned about Google’s market dominance.
“I’m deeply concerned by reports of Google’s discriminatory conduct in the market for Internet search,” said David Cicilline, D-R.I.
Google has harmed competition in Europe by favoring its own products and services over rivals, and by deprioritizing or delisting its competitors’ content, he noted, citing European Commission findings.
“It is important for the U.S. government to follow the lead of other countries and closely examine the market dominance of Google and Facebook, including their impact on industries such as news media,” observed David Chavern, CEO of the News Media Alliance in Arlington, Va., a trade association representing some 2,000 newspapers in the United States and Canada.
“We will continue to urge for more hearings to examine ways in which the duopoly impacts the business of journalism, which is essential to democracy and civic society,” he told the E-Commerce Times.
Prelude to Privacy Law
House and Senate hearings in recent months are just the prelude to data privacy legislation that could be introduced next year.
“We’re certainly going to see a wide variety of comprehensive privacy bills filed, and I think we’ll make some progress,” Calabrese said.
“Advocates have seen the need for privacy legislation for a long time,” he said, “and now that we have privacy legislation set to kick in in California in 2020, there’s a lot of companies who would rather be governed by a federal law than they would a bunch of different state laws.”
If a general privacy law is enacted, it shouldn’t use Europe’s General Data Protection Regulation as a model, maintained Alan McQuinn, senior policy analyst for the Information Technology and Innovation Foundation, a public policy and technology innovation organization in Washington, D.C.
“We don’t want to see the GDPR enacted here in the states,” he told the E-Commerce Times.
“It is highly likely to create a drag on the European economy and hurt innovation and businesses,” McQuinn explained.
Privacy rules should be styled to fit industries, such as healthcare, finance and commerce, he suggested.
“The sector-specific approach that the U.S. has taken toward privacy has allowed for more innovation,” McQuinn noted, “and created the powerhouse of the digital economy that we have here.”