
by Selene Arrazolo

Two groups of data professionals did a deep dive into the many issues surrounding mining, organizing, and presenting data on February 24th at the University of Texas at Austin’s School of Information (the UT iSchool). Iron Mountain sponsored and moderated the event.

The opening panel on Data Wrangling discussed the “readiness” of available data sources and how to mine, format, analyze, and manage data. Offering their views were Dr. Byron Wallace from the iSchool; Abha Dogra, VP of Global Enterprise Architecture for Iron Mountain; Tyron Stading, President and Founder of Innography; and Chris McKinzie, founder of Enlyton (an IEI partner software company). They talked about the value of relevant data, the importance of keeping data extraction algorithms up-to-date, and trends in data management. Ms. Dogra stressed the difficulty of designing and implementing effective data management workflows.

The second panel, Harnessing Big Data, focused on data visualizations that allow people to “connect” with data. Speakers included Dr. Unmil Karadkar from the iSchool; Fuyu Li, Earth Scientist from Chevron; Cynthia Mancha, Analyst from Tasktop; and Beth Hallmark, Travis County Deputy Chief of Communications, Public Outreach and Open Data. Dr. Karadkar opened the discussion by admitting that data comprehension can be difficult, so it helps to use visual surrogates to get data into a person’s head. He joked, “think of Oprah with the cars… everybody gets a visualization.” Dr. Li from Chevron, who works with meteorological, seismic, oceanographic, and other scientific data, disagreed with Dr. Karadkar’s stance on aesthetically pleasing graphics, preferring to focus on simplicity. “The purpose is to answer a specific question. Make the visualization simple enough for the audience to understand.”

Cynthia Mancha looked at data from the project manager’s point of view. To her, visualizations are a necessary management function. They provide insight into where valuable time is spent, but, for a visualization to be helpful, “familiarization with data and context is critical and the team member must be intimately involved.” Dr. Li echoed her perspective by pointing out that detail-oriented quality control is key with data used in visualizations. A good visualization will only serve to spotlight underlying bad data.

Last to present was Beth Hallmark, responsible for making data on Travis County’s 90 billion transactions a year both accessible and understandable to her audience of budget writers and the general public. Beth shared the difficulty she had finding someone to manage her county’s visualizations. The perfect candidate had to be both tech savvy and able to illustrate a story. In the end, she chose someone with a journalism background who fit the bill.

Key takeaways both groups emphasized included the importance of knowing your clients, and the need to define customer requirements early on, even if it takes outside guidance.


posted by Shyamali Ghosh on February 25, 2015

by Matt Manning

Every information service wants its content to be discovered on the Internet, and billions have been made serving that need. Lately there’s been a lot of talk about the “data discovery” technology that powers services like Outbrain, Taboola, nRelate (IAC), Gravity (AOL), Disqus, Scribol, and ShareThrough. Using a combination of content recommendation algorithms or content sharing tools and simple affiliate-type marketing business models, these services have taken the Internet’s advertising-supported information services by storm.

You want 1 million new unique visitors to your site? Just pay these folks $15/M for the traffic and the traffic will come… guaranteed. Also, if you want to trade some space at the end of your articles to display teasers for third-party articles, then you can earn $1 for every thousand folks who click through to the articles. What’s not to like?

The technology offering itself is usually a “more like this” algorithm that suggests “more articles like this,” “more articles that your profile indicates that you would like,” or just popular types of articles or popular articles by category. The “sharing” services are just widgets embedded in your site to make social sharing easier and less breakable, and they earn you a little incremental revenue on the side as well.
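To make the “more like this” idea concrete, here is a minimal sketch in Python. It is not how any of the services named above actually work; it simply ranks candidate headlines by word overlap (Jaccard similarity) with the current article. The function names and the sample headlines are made up for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase words in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Overlap between two word sets, from 0.0 (none) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def more_like_this(current, candidates, top_n=3):
    """Rank candidate headlines by word overlap with the current one.

    Hypothetical helper: a production recommender would also weigh
    reader profiles and popularity, as described above.
    """
    current_words = tokens(current)
    scored = [(jaccard(current_words, tokens(c)), c) for c in candidates]
    # Drop candidates that share no words at all with the current article.
    scored = [(score, c) for score, c in scored if score > 0]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for _, c in scored[:top_n]]


if __name__ == "__main__":
    print(more_like_this(
        "Open data visualization tips for county budgets",
        [
            "County budgets and open data",
            "Celebrity bikini shopping",
            "Data visualization tips",
        ],
    ))
```

Swapping the word-overlap score for a profile-based or popularity-based score would give the other two flavors of recommendation mentioned above.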

The models are simple, the technology is easy to implement, and it’s less expensive than complex SEO efforts or buying premium US traffic from Google AdWords. And, like anything that sounds a bit too simple, there are some issues.

  • The traffic you get has a high bounce rate (i.e., one impression and gone) of around 85%, which makes it more expensive than Facebook for acquiring an active, engaged user.
  • The recommended articles can be unabashed link bait from dubious sources that can tarnish your brand (“Kim Kardashian went bikini shopping and you’ll never guess what happened next!”)
  • The teasers are ads that compete for reader attention with other advertising that you sell directly at a much higher price.
  • If you want very high volumes you need to use several of these services. For more control over the sources, traffic, and the recommended articles, you have to spend the time and money to actively manage this.
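The bounce-rate point is easier to see with the arithmetic written out. The sketch below assumes that “$15/M” means $15 per thousand visitors (M as the Roman numeral) and takes the 85% bounce rate at face value; both function names are hypothetical.

```python
def traffic_cost(visitors, cost_per_thousand):
    """Total spend to buy a given number of visitors."""
    return visitors / 1000 * cost_per_thousand


def cost_per_engaged_visitor(cost_per_thousand, bounce_rate):
    """Spend per visitor who does NOT bounce.

    bounce_rate is a fraction between 0 and 1 (0.85 for 85%).
    """
    if not 0 <= bounce_rate < 1:
        raise ValueError("bounce_rate must be in the range [0, 1)")
    engaged_per_thousand = 1000 * (1 - bounce_rate)
    return cost_per_thousand / engaged_per_thousand


if __name__ == "__main__":
    # Assumption: $15 buys 1,000 visitors, and 85% of them leave at once.
    print(f"Campaign cost: ${traffic_cost(1_000_000, 15):,.0f}")
    print(f"Per engaged visitor: ${cost_per_engaged_visitor(15, 0.85):.2f}")
```

Under those assumptions, a million visitors cost $15,000, but only about 150,000 of them stick around, so the effective price is roughly ten cents per engaged visitor rather than a cent and a half per raw visit.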

For these reasons some publishers prefer to handle their own data discovery the “organic” way. At a recent Open Data Institute event in London, the folks behind the BBC News sites spoke about the tools they used to improve “discovery” of their content. They append subject taxonomies, geo tags, and normalized organization and people names as you’d imagine, but they also have authors associate thematic topics with the articles. Add links to underlying primary sources and you’ve got enough “hooks” on each article to make it much easier to find both on BBC sites and via search engines. Topical monitoring also gets much more granular this way.
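As a rough illustration of why those “hooks” help, the sketch below builds a simple lookup from each tag to the articles that carry it. This is not the BBC’s system; the field names (`subjects`, `places`, `organizations`, `people`, `themes`) and the sample records are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical metadata fields, loosely following the kinds of tags
# described above (taxonomy subjects, geo tags, names, author themes).
TAG_FIELDS = ("subjects", "places", "organizations", "people", "themes")


def build_tag_index(articles):
    """Map each (field, tag) pair to the set of article ids that carry it."""
    index = defaultdict(set)
    for article in articles:
        for field in TAG_FIELDS:
            for value in article.get(field, []):
                index[(field, value.lower())].add(article["id"])
    return index


def find(index, field, value):
    """Return sorted ids of articles tagged with the given value."""
    return sorted(index.get((field, value.lower()), set()))


if __name__ == "__main__":
    sample = [
        {"id": "a1", "subjects": ["Open data"], "places": ["London"]},
        {"id": "a2", "subjects": ["Open data"], "themes": ["Transparency"]},
    ]
    idx = build_tag_index(sample)
    print(find(idx, "subjects", "open data"))
    print(find(idx, "places", "London"))
```

Each extra tag is one more route by which a reader, an on-site search box, or a monitoring tool can reach the article.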

Of course, these organic approaches will never have the immediate gratification appeal of the content syndication platforms, but they incrementally increase engagement and pageviews while preserving the integrity of the brand. And sometimes a smaller, demographically more desirable, more engaged audience is exactly what product-oriented advertisers would like to discover.


posted by Shyamali Ghosh on February 17, 2015