Tuesday, 02 April 2013

Nieman Journalism Lab

Shaping technology to the story: The Brown Institute for Media Innovation is finding its niche

Posted: 01 Apr 2013 11:15 AM PDT

The Brown Institute for Media Innovation just began accepting applications from students or alumni of Columbia and Stanford for its second round of Magic Grants. Helen Gurley Brown made headlines last year when she donated $30 million jointly to Columbia and Stanford to found the Brown Institute for Media Innovation, a bicoastal effort toward helping students build “usable tools” for the proliferation of “great content.”

The idea was that combining the engineering prowess of Stanford students with the journalistic know-how of Columbia students would propel innovation in the news industry. To that end, Columbia would construct a $6 million state-of-the-art newsroom within its j-school building (now under construction), and the institute would offer serious grant money — up to $150,000 per team, or $300,000 if it features members from both schools — for projects. Its next batch of Magic Grantees — due to be announced at the end of May — will go a long way toward further defining what a direct collaboration between computer science and journalism can produce.

The quest for personalized TV

The first three Magic Grants were awarded last June. Connecting the Dots is a project by two Stanford students dedicated to drawing out large, complex, data-heavy news stories through logic mapping, similar to the way that metropolitan transit maps simplify networks of trains and buses. Dispatch, a joint startup that already has an app for sale through Apple, helps journalists in crisis scenarios conceal their identities while publishing via mobile device.

The largest team belongs to the third winner, EigenNews — 10 members from both campuses combined. The idea: personalized television, built around a playlist of national news clips based on the user’s selected preferences (by category and by show), viewing behavior, and user voting. (You can sign up and get a daily email update from EigenNews — it works pretty well.)

[Image: EigenNews screenshot]

The design is meant to provide the user with up-to-the-minute broadcast news while filtering out certain types of stories, but to maintain a sense of immediacy, some very popular current stories make the playlist no matter what. “The playlist strikes a balance between presenting the most important stories currently and those stories that might be of particular interest to you,” wrote Stanford-based team member David Chen in an email. “For the second factor to be more evident, the user’s view history has to contain a sufficient number of samples.” As the project’s description puts it:

We forecast that next-generation video news consumption will be more personalized, device agnostic, and pooled from many different information sources. The technology for our project represents a major step in this direction, providing each viewer with a personalized newscast with stories that matter most to them…

Our personalized news platform will analyze readily available user data, such as recent viewing history and social media profiles. Suppose the viewer has recently watched the Republican presidential candidates debate held in Arizona, an interview with a candidate's campaign manager, and another interview with the candidate himself. The debate and the candidate's interview are "liked" by the viewer and several friends on Facebook. This evidence points to a high likelihood that a future video story about the Republican presidential race will interest the viewer. The user's personalized news stream will feature high-quality, highly-relevant stories from multiple channels that cover the latest developments in the presidential race.
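The article doesn’t spell out EigenNews’ actual ranking method, but the blend Chen describes, global importance plus inferred personal interest, with viewing history only counting once there are enough samples, can be sketched in a few lines of Python. Every name, weight, and threshold below is invented for illustration; none of it comes from the project:

    # Hypothetical sketch of the blending described above: rank clips by a
    # mix of overall importance (so big stories always surface) and personal
    # interest inferred from stated preferences and viewing history.
    # All names, weights, and thresholds here are invented for illustration.
    from dataclasses import dataclass, field

    @dataclass
    class Clip:
        title: str
        category: str        # e.g. "politics"
        show: str            # e.g. "PBS NewsHour"
        popularity: float    # global importance, normalized to [0, 1]

    @dataclass
    class UserProfile:
        preferred_categories: set
        preferred_shows: set
        watch_counts: dict = field(default_factory=dict)  # category -> clips watched

        def interest(self, clip):
            """Rough personal-interest score in [0, 1]."""
            score = 0.0
            if clip.category in self.preferred_categories:
                score += 0.4
            if clip.show in self.preferred_shows:
                score += 0.3
            # Viewing history only contributes once there are enough samples,
            # echoing Chen's point about needing a sufficient view history.
            total = sum(self.watch_counts.values())
            if total >= 20:  # arbitrary threshold
                score += 0.3 * self.watch_counts.get(clip.category, 0) / total
            return min(score, 1.0)

    def build_playlist(clips, user, k=10, w_popular=0.5):
        """Blend popularity with personal interest; very popular stories
        still make the playlist because popularity always carries weight."""
        def score(c):
            return w_popular * c.popularity + (1 - w_popular) * user.interest(c)
        return sorted(clips, key=score, reverse=True)[:k]

A real system would also fold in the user voting and social signals the project description mentions; the sketch only illustrates the weighting between “most important” and “of particular interest.”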

Chen said the EigenNews team wants to incorporate more shareability in the future — currently, you can generate a link by clicking a button on the player, but they hope to add comments soon. He also said they’re looking toward a future model that would incorporate more local coverage and user-generated video content.

“Seeing situations where the journalism is leading”

Mark Hansen, who was appointed director of the Columbia side of the Brown Institute last fall, says he imagines some form of the EigenNews project will probably live on. “That work is work that Bernd [Girod, his Stanford counterpart] does as part of his research program, so my guess would be that some part of that work will be funded consistently.” Hansen will be overseeing the administration of the second round of funding. Coming from the Center for Embedded Networked Sensing at UCLA, where he gradually began to realize the implications of data journalism, he is a blend of journalist and statistician.

“Over the course of my ten years at UCLA, the Center shifted…to more participatory systems, where we were encouraging the public to get involved with data collection. As we started working with community groups, as we started reaching out to high schools, the character of the enterprise changed,” he says. While sensor networks are opening up the power of public data, coordinating the gathering, calibration, analysis, and dissemination of that information is no small task. Hansen says that realization has honed his understanding of the important role that journalists play. His students learn to code themselves, not just how to work with engineers who code, but what he’s most interested in are projects whose genesis is a journalistic question, not a technological advancement.

“I’m interested in seeing situations where the journalism is leading. Where there’s some story that needs to be told, or some aspect of a story that can’t be told with existing technology, but then drives the creation of a new technology,” he said. “As opposed to, ‘Look, we made tablets — okay, now you guys tell stories around tablets.’”

Since moving to Columbia, Hansen has had ample opportunity to observe the interplay of hard science and journalistic practice. He teaches a course on computational journalism, and he says the transition from teaching statisticians to teaching journalism students has been enlightening. “When you teach a statistician about means, for example, their comment on the data will end with ‘The mean is 5.’ The journalist will say: ‘The mean is 5, which means, compared to this other country, or five countries, or other neighborhood…’ The journalists will go from the output of the method to the world. They contextualize, they tell stories — Emily Bell calls this narrative imagination — and they are hungrier than any other students I have ever worked with.”

Hansen plans to use the resources of the Brown Institute to recreate the open dialogue and experimentation of the classroom, in hopes of uncovering ideas for projects and prototypes to receive Magic Grant funding. “I’m usually the one writing the grants, not the one giving them away,” he joked. To that end, he’s been in conversation with industry professionals from the likes of ProPublica, The New York Times and Thomson Reuters, trying to figure out “what the interesting questions are,” he says. Defining what Brown can do that is distinct from the other institutes, labs, and other entities in the space is a top priority.

Organizing hackathons and other collaborative events is another route Hansen wants to explore. He is interested in a hackathon model with more concrete pedagogical objectives than the typical open-ended approach. The Brown Institute has already hosted a data hackathon, as well as a conference Hansen calls a “research threeway,” after the three sectors he aims to bring together — journalism, technology, and “opportunity” (that is, funding). Mixing speakers with journalism, media, and engineering backgrounds resulted in a “beautiful collision of language,” he said, and some intriguing ideas.

“There was a nice conversation around serendipity, especially as it connects to large data sets. I think often times we fall back on a kind of search metaphor where we are constantly googling something. If we don’t know what it is we’re looking for, how do we activate an archive, how do we activate a data set? How do you engineer serendipity?”

Building a space

Meanwhile, Hansen has also been overseeing some engineering in a more concrete sense. He hopes to unveil the Brown Institute’s newsroom by summer 2014, a two-story facility which he says draws inspiration from both traditional newsrooms and the “large, open, reconfigurable workspace” that we associate with startups and tech incubators. The space will feature a mezzanine, transparent conference rooms, and shared workspaces called “garages.” It’ll be a wireless office space with flat panel displays and a number of projectors, shared by Brown grantees, fellows, and faculty. “Emily Bell will be teaching a class on the sensor newsroom, a kind of pop-up newsroom,” Hansen says, “and that space will be the perfect space to try out the ideas that are coming out of that class.”

Hansen says one of the most rewarding parts of his directorship so far was having the chance to share the plans for the newsroom with donor Helen Gurley Brown just before she passed away last August. Both the architects and the web designers for the Institute’s new website were told to use the creative output of Brown and her husband, film producer David Brown, as a design compass. As a result, the website will feature a rotating color palette, updated on a monthly basis to reflect covers from Cosmopolitan magazine throughout Brown’s career.

Running a bicoastal institute is not without its challenges; the hope is that the new space in New York and a newly unified website will help address them. Stanford grantees and fellows don’t have a centralized office space like their New York counterparts, but Magic Grants cover travel costs for bicoastal projects and regular project reviews.

Still, Hansen says figuring out how to operate as one entity has been challenging. “Not only is [Stanford] 3,000 miles away, and not only is it two different disciplines,” he says, “but it’s also the quarter system and the semester system, and three hours’ [time] difference — every little thing you could imagine is different.” In addition, engineering grad students study for four to five years, while Columbia’s main graduate journalism program is only one year long. To allow the journalism students equal opportunity to participate, they’ll be eligible to apply for Magic Grants as part of an additional, second year. Says Hansen: “We’re doing what we can to make it feel like a cohesive whole.”

The Brown Institute is also invested in ensuring that, when it funds successful projects, they have the opportunity to live on. While grant winners can apply for a second year of funding, Hansen is also focused on communicating with private investors, companies, and other foundations. He’s particularly excited about the potential addition of computational journalism to the National Science Foundation’s official subfields, which would open up significant additional funding for Brown Institute alums.

“It does really feel like a great moment to be thinking about technology and storytelling, technology and journalism,” Hansen says. But in addition to using technology to propel the journalism industry into the future, he takes cues from the memory of the Browns, and hopes to shape the Institute into something that reflects them both.

“Helen and David were showmen, if you will,” Hansen says. “They really understood audiences and how to tell a good story.”

Not an April Fool’s joke: The New York Times has built a haiku bot

Posted: 01 Apr 2013 07:18 AM PDT

[Image: a Times Haiku]

New York Times senior software architect Jacob Harris has a thing for robots and wordplay. You may recall he’s the guy behind @nytimes_ebooks, the Times’ answer to the elusive and inscrutable Twitter bot @Horse_ebooks.

So it’s only natural that Harris has now created an algorithm that extrudes haiku out of the text of Times stories. In other words:

Haiku harvester
built inside The New York Times —
does it have a soul?

(If my eighth-grade English teacher is reading this: Sorry.)

Here’s a better, more Times-y example:

[Image: Times Haiku example]

Times Haiku is a collection of what the Times is calling “serendipitous poetry,” derived from stories that have made the homepage of NYTimes.com. The haiku live on a Tumblr hosted by the Times. Harris built a script that mines stories for haiku-friendly words and then reassembles them into poetry. (For those of you who may have zoned out in class, a haiku is composed of three lines with, in order, five, seven, and five syllables.) The code checks words against an open source pronunciation dictionary, which handily also contains syllable counts.
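The article doesn’t name the dictionary Harris used, but the CMU Pronouncing Dictionary (available through NLTK) is a common open source choice, and its stress-marked phonemes double as a syllable count. The Python sketch below is illustrative only, not Harris’s Ruby code: it counts syllables from the dictionary and tests whether a sentence’s words fall into lines of exactly five, seven, and five syllables.

    # Illustrative sketch only, not the Times's implementation. It uses the
    # CMU Pronouncing Dictionary via NLTK to count syllables, then checks
    # whether a sentence splits cleanly into 5-7-5 lines.
    import nltk
    from nltk.corpus import cmudict

    nltk.download("cmudict", quiet=True)
    PRONUNCIATIONS = cmudict.dict()

    def syllables(word):
        """Count syllables from the dictionary's stress digits; return None
        for words the dictionary doesn't know (e.g. 'Rihanna')."""
        phones = PRONUNCIATIONS.get(word.lower())
        if not phones:
            return None
        # Each vowel phoneme ends in a stress digit (0/1/2) = one syllable.
        return sum(ph[-1].isdigit() for ph in phones[0])

    def as_haiku(words):
        """Greedily split words into 5-7-5 lines; return None if they don't fit."""
        targets = [5, 7, 5]
        lines, line, count = [], [], 0
        for w in words:
            if len(lines) == 3:
                return None        # leftover words after the third line
            s = syllables(w)
            if s is None:
                return None        # unknown word: skip this sentence
            line.append(w)
            count += s
            if count == targets[len(lines)]:
                lines.append(" ".join(line))
                line, count = [], 0
            elif count > targets[len(lines)]:
                return None        # a word would straddle a line break
        return lines if len(lines) == 3 else None

    print(as_haiku("The quiet morning settles over the harbor the boats drift away".split()))

Unknown words are the weak point of this approach; as noted at the end of this piece, Harris sometimes has to supply syllable counts by hand for words the dictionary doesn’t know.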

“Sometimes it can be an ordinary sentence in context, but pulled out of context it has a strange comedy or beauty to it,” Harris said.

Harris was inspired by Haikuleaks, a similar project that found poetry in the cache of diplomatic cables released by WikiLeaks in 2010. The backbone of that project was an open source program called Haiku Finder, which crawls through text to generate haiku. The program was built in Python; Harris made his own version in Ruby on Rails.

The result, much like @nytimes_ebooks, is bizarre, quirky, and kind of zen. The haiku have a strange way of getting at the heart of a story, or teasing out interesting fragments from an article. “There’s something appealing about finding these snippets of text, these turns of phrase and pulling them out,” Harris said. “You find it compelling and it drives you to read the article that it came from.” (Think of it as a more lexicographically strict version of Paul Ford’s SavePublishing.)

In its own poetic way, Times Haiku will be another access point for Times stories, said Marc Lavallee, assistant editor for interactive news at the Times. “If someone sees the site, or the image of an individual haiku and shares it on Tumblr, and it gets them to think about who we are and what we do, or gives them a moment of pause, I think we’ve succeeded in a way,” Lavallee said.

Lexi Mainland, social media editor for the Times, said they wanted the poems to be able to stand on their own and be readily sharable. That’s why the haiku are actually images, which fits well with the aesthetic of Tumblr, she said. Outside of Tumblr, the Times will promote the haiku through the paper’s flagship Twitter account.

That the Times has the ability to build a haiku bot isn’t surprising. But why build a haiku bot? “A lot of the projects we work on here are these incredibly big heaves, which are very, very gratifying,” said Mainland. “But you crave these smaller projects, which are just as valuable.” Similarly, projects like the haiku bot may seem silly on the surface, but the underlying code, the use of natural language processing, or other components could be valuable to future projects, Lavallee said.

It helps that the project came at little expense to the Times — Harris put it together on his own during a fit of post-election letdown. Harris had been working on projects connected to the presidential race for over a year, and after election day suddenly found himself with idle hands. He wrote the code in November and began monitoring what it was spitting out. After showing it to Mainland, Lavallee, and other editors, they gave the project a green light. Designer Heena Ko and software developer Anjali Bhojani gave the haiku their distinctive appearance for Tumblr. (Those lines you see running askew of the text of the haiku? The length is computer generated, based on the meter of the first line of text.)

As whimsical as a haiku bot or a spammy-sounding Twitter bot might be, both are efforts to find new uses for the Times’ vast collection of work. “It’s just this large corpus of text that gets very dizzying to look through,” Harris said.

The Times may also have a soft spot for artwork inspired by the written word. Anyone who has visited the lobby of The New York Times Building has likely seen Moveable Type, an algorithm-backed art installation that displays fragments of Times content across 560 display screens.

But why poetry? For starters, today is the first day of National Poetry Month, Mainland said. (Today is also April Fool’s — and if you were wondering, this is not a joke.) Still, for lovers of verse, it may sound like a cold and bloodless way to create poetry. Can you really create poetry without a soul? Do robots have feelings? Can they really see a sunset, or be moved by the sounds of a whale-song CD?

Harris admits the bot is imperfect; it’s required a little teaching along the way. One reason he limited the scope to the front page is that it provides an editor-picked selection, which tends toward richer features and important daily fare. (Running the bot on the Times Wire, Harris said he often got haiku made up of basketball scores, which may be too esoteric for any lit major or stat nerd.) The algorithm is designed to toss haiku with certain sentence constructions (sentences that start with a preposition, for instance) or from sensitive stories. Mainland, Lavallee, and Harris also keep an eye on the haiku being created to see if anything untoward sneaks through.
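The article gives only examples of those screening rules, not the Times’s actual list, but the shape of the filter is simple. In the sketch below, the preposition list and section labels are hypothetical:

    # Hypothetical screening pass, mirroring the examples above: drop candidate
    # haiku whose source sentence starts with a preposition, or that come from
    # stories flagged as sensitive. The Times's real rules are not published.
    PREPOSITIONS = {"of", "in", "on", "at", "by", "for", "with", "from", "about", "to"}
    SENSITIVE_SECTIONS = {"obituaries", "crime"}   # invented labels

    def keep_haiku(source_sentence, section):
        first_word = source_sentence.split()[0].strip(".,;:").lower()
        if first_word in PREPOSITIONS:
            return False
        if section.lower() in SENSITIVE_SECTIONS:
            return False
        return True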

But Harris also has to do some syllable counting himself, teaching the bot words that appear in the Times (“Rihanna,” for instance) that it doesn’t know. Henry Higgins would be proud.