Open Source Feeds

Report from HIMSS 2012: toward interoperability and openness

O'Reilly Radar - 1 hour 40 min ago

I was wondering how it would feel to be in the midst of 35,000 people whose livelihoods are driven by the decisions of a large institution at the moment when that institution releases a major set of rules. I didn't really find out, though. The 35,000 people I speak of are the attendees of the HIMSS conference and the institution is the Department of Health and Human Services. But HHS just sort of half-released the rules (called Stage 2 of meaningful use), telling us that they would appear online tomorrow and meanwhile rushing over a few of the key points in a presentation that drew overflow crowds in two rooms.

The reaction, I sensed, was a mix of relief and frustration. Relief because Farzad Mostashari, National Coordinator for Health Information Technology, promised us the rules would be familiar and hewed closely to what advisors had requested. Frustration, however, at not seeing the details. The few snippets put up on the screen contained enough ambiguities and poorly worded phrases that I'm glad there's a 60-day comment period before the final rules are adopted.

There isn't much one can say about the Stage 2 rules until they are posted and the experts have a chance to parse them closely, and I'm a bit reluctant to throw onto the Internet one of potentially 35,000 reactions to the announcement, but a few points struck me enough to be worth writing about. Mostashari used his pulpit for several pronouncements about the rules:

  • HHS would push ahead on goals for interoperability and health information exchange. "We can't wait five years," said Mostashari. He emphasized the phrase "standard-based" in referring to HIE.

  • Patient engagement was another priority. To attest to Stage 2, institutions will have to allow at least half their patients to download and transfer their records.

  • They would strive for continuous quality improvement and clinical decision support, key goals enabled by the building blocks of meaningful use.

Two key pillars of the Stage 2 announcement are requirements to use the Direct project for data exchange and HL7's consolidated CDA for the format (the only data exchange I heard mentioned was a summary of care, which is all that most institutions exchange when a patient is referred).

The announcement demonstrates the confidence that HHS has in the Direct project, which it launched just a couple of years ago and which exemplifies a successful joint government/private sector project. Direct will allow health care providers of any size and financial endowment to use email or the Web to share summaries of care. (I mentioned it in yesterday's article.) With Direct, we can hope to leave the cumbersome and costly days of health information exchange behind. The older and more complex CONNECT project will be an option as well.

The other half of that announcement, regarding adoption of the CDA (incarnated as a CCD for summaries of care), is a loss for the older CCR format, which was an option in Stage 1. The CCR was the Silicon Valley version of health data, a sleek and consistent XML format used by Google Health and Microsoft HealthVault. But health care experts criticized the CCR as not rich enough to convey the information institutions need, so it lost out to the more complex CCD.

The news on formats is good overall, though. The HL7 consortium, which has historically funded itself by requiring organizations to become members in order to use its standards, is opening some of them for free use. This is critical for the development of open source projects. And at an HL7 panel today, a spokesperson said they would like to head more in the direction of free licensing and have to determine whether they can survive financially while doing so.

So I'm feeling optimistic that U.S. health care is moving "toward interoperability and openness," the phrase I used in the title of this article and also used in a posting from HIMSS two years ago.

HHS allowed late-coming institutions (those who began the Stage 1 process in 2011) to continue at Stage 1 for another year. This is welcome because they have so much work to do, but means that providers who want to demonstrate Stage 2 information exchange may have trouble because they can't do it with other providers who are ready only for Stage 1.

HHS endorsed some other standards today as well, notably SNOMED for diseases and LRI for lab results. Another nice tidbit from the summit is the requirement to use electronic medication administration (for instance, bar codes to check for errors in giving medicine) to foster patient safety.

Categories: Open Source Feeds

Big data in the cloud

O'Reilly Radar - Wed, 02/22/2012 - 11:00

Big data and cloud technology go hand-in-hand. Big data needs clusters of servers for processing, which clouds can readily provide. So goes the marketing message, but what does that look like in reality? Both "cloud" and "big data" have broad definitions, obscured by considerable hype. This article breaks down the landscape as simply as possible, highlighting what's practical, and what's to come.

IaaS and private clouds



What is often called "cloud" amounts to virtualized servers: computing
resources that present themselves as regular servers, rentable by
consumption. This is generally called infrastructure as a service
(IaaS), and is offered by platforms such as Rackspace Cloud or Amazon
EC2. You buy time on these services, and install and configure your
own software, such as a Hadoop cluster or NoSQL database. Most of the
solutions I described in my Big Data Market Survey can be deployed on
IaaS services.



Using IaaS clouds doesn't mean you must handle all deployment
manually: good news for the clusters of machines big data
requires. You can use orchestration frameworks, which handle the
management of resources, and automated infrastructure tools, which
handle server installation and configuration. RightScale offers a
commercial multi-cloud management platform that mitigates some of the
problems of managing servers in the cloud.



Frameworks such as OpenStack and Eucalyptus aim to present a uniform
interface to both private data centers and the public
cloud. Attracting a strong flow of cross-industry support, OpenStack
currently addresses computing resources (akin to Amazon's EC2) and
storage (paralleling Amazon's S3).



The race is on to make private clouds and IaaS services more usable:
over the next two years using clouds should become much more
straightforward as vendors adopt the nascent standards. There'll be a
uniform interface, whether you're using public or private cloud
facilities, or a hybrid of the two.



Particular to big data, several configuration tools already target
Hadoop explicitly: among them Dell's Crowbar, which aims to make
deploying and configuring clusters simple, and Apache Whirr, which is
specialized for running Hadoop services and other clustered data processing systems.



Today, using IaaS gives you a broad choice of cloud supplier, the
option of using a private cloud, and complete control: but you'll be
responsible for deploying, managing and maintaining your clusters.

Platform solutions

IaaS only takes you so far with big data applications: it handles the creation of computing and storage resources, but doesn't address anything at a higher level. The setup of Hadoop and Hive or a similar solution is down to you.

Beyond IaaS, several cloud services provide application layer support for big data work. Sometimes referred to as managed solutions, or platform as a service (PaaS), these services remove the need to configure or scale things such as databases or MapReduce, reducing your workload and maintenance burden. Additionally, PaaS providers can realize great efficiencies by hosting at the application level, and pass those savings on to the customer.

The general PaaS market is burgeoning, with major players including VMware (Cloud Foundry) and Salesforce (Heroku, force.com). As big data and machine learning requirements percolate through the industry, these players are likely to add their own big-data-specific services. For the purposes of this article, though, I will be sticking to the vendors who already have implemented big data solutions.

Today's primary providers of such big data platform services are Amazon, Google and Microsoft. You can see their offerings summarized in the table toward the end of this article. Both Amazon Web Services and Microsoft's Azure blur the lines between infrastructure as a service and platform: you can mix and match. By contrast, Google's philosophy is to skip the notion of a server altogether, and focus only on the concept of the application. Among these, only Amazon can lay claim to extensive experience with their product.

Amazon Web Services

Amazon has significant experience in hosting big data processing. Use of Amazon EC2 for Hadoop was a popular and natural move for many early adopters of big data, thanks to Amazon's expandable supply of compute power. Building on this, Amazon launched Elastic Map Reduce in 2009, providing a hosted, scalable Hadoop service.

Applications on Amazon's platform can pick from the best of both the IaaS and PaaS worlds. General purpose EC2 servers host applications that can then access the appropriate special purpose managed solutions provided by Amazon.

As well as Elastic Map Reduce, Amazon offers several other services relevant to big data, such as the Simple Queue Service for coordinating distributed computing, and a hosted relational database service. At the specialist end of big data, Amazon's High Performance Computing solutions are tuned for low-latency cluster computing, of the sort required by scientific and engineering applications.


Elastic Map Reduce

Elastic Map Reduce (EMR) can be programmed in the usual Hadoop ways, through Pig, Hive or another programming language, and uses Amazon's S3 storage service to get data in and out.
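To make the "usual Hadoop ways" concrete, here is a minimal sketch of the map and reduce steps behind a word count, run locally in plain Python. It illustrates the programming model only and is not EMR-specific: the function names are hypothetical, and the in-memory sort stands in for the shuffle-and-sort phase that Hadoop performs between the two steps.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts emitted for each word.

    Hadoop delivers mapper output to reducers grouped by key; sorted()
    is a local stand-in for that shuffle-and-sort phase (an assumption
    made so the sketch runs on one machine).
    """
    ordered = sorted(pairs, key=itemgetter(0))
    for word, group in groupby(ordered, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Chain the two steps and return a dict of word -> count."""
    return dict(reducer(mapper(lines)))
```

Calling `word_count(["big data", "big cloud"])` counts "big" twice and the other two words once. On a real cluster the mapper and reducer would run as separate tasks over data read from and written to storage such as S3.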

Access to Elastic Map Reduce is through Amazon's SDKs and tools, or with GUI analytical and IDE products such as those offered by Karmasphere. In conjunction with these tools, EMR represents a strong option for experimental and analytical work. Amazon's pricing makes EMR a much more attractive option than configuring EC2 instances yourself to run Hadoop.

When integrating Hadoop with applications generating structured data, using S3 as the main data source can be unwieldy. This is because, similar to Hadoop's HDFS, S3 works at the level of storing blobs of opaque data. Hadoop's answer to this is HBase, a NoSQL database that integrates with the rest of the Hadoop stack. Unfortunately, Amazon does not currently offer HBase with Elastic Map Reduce.

DynamoDB

Instead of HBase, Amazon provides DynamoDB, its own managed, scalable NoSQL database. As this is a managed solution, it represents a better choice than running your own database on top of EC2, in terms of both performance and economy.

DynamoDB data can be exported to and imported from S3, providing interoperability with EMR.

Google

Google's cloud platform stands out as distinct from its competitors. Rather than offering virtualization, it provides an application container with defined APIs and services. Developers do not need to concern themselves with the concept of machines: applications execute in the cloud, getting access to as much processing power as they need, within defined resource usage limits.

To use Google's platform, you must work within the constraints of its APIs. However, if that fits, you can reap the benefits of the security, tuning and performance improvements inherent to the way Google develops all its services.

AppEngine, Google's cloud application hosting service, offers a MapReduce facility for parallel computation over data, but this is more of a feature for use as part of complex applications rather than for analytical purposes. Instead, BigQuery and the Prediction API form the core of Google's big data offering, respectively offering analysis and machine learning facilities. Both these services are available exclusively via REST APIs, consistent with Google's vision for web-based computing.

BigQuery

BigQuery is an analytical database, suitable for interactive analysis over datasets of the order of 1TB. It works best on a small number of tables with a large number of rows. BigQuery offers a familiar SQL interface to its data. In that, it is comparable to Apache Hive, but the typical performance is faster, making BigQuery a good choice for exploratory data analysis.
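As a rough illustration of the kind of exploratory query this describes (one wide table, many rows, an aggregate over a few columns), the sketch below runs a SQL statement against Python's built-in sqlite3 module. SQLite is only a local stand-in: this is not BigQuery's API or its SQL dialect, and the `page_views` table and its columns are invented for the example.

```python
import sqlite3

# Local stand-in for an analytical table; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, duration_ms INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("US", 120), ("US", 80), ("DE", 200), ("FR", 50), ("DE", 100)],
)


def top_countries(connection, limit):
    """Return (country, views, average duration) for the busiest countries."""
    query = """
        SELECT country, COUNT(*) AS views, AVG(duration_ms) AS avg_ms
        FROM page_views
        GROUP BY country
        ORDER BY views DESC, country ASC
        LIMIT ?
    """
    return connection.execute(query, (limit,)).fetchall()
```

The point is the shape of the question rather than the engine: a single GROUP BY over a large table is the sort of query an interactive analytical database is meant to answer quickly.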

Getting data into BigQuery is a matter of directly uploading it, or importing it from Google's Cloud Storage system. This is the aspect of BigQuery with the biggest room for improvement. Whereas Amazon's S3 lets you mail in disks for import, Google doesn't currently have this facility. Streaming data into BigQuery isn't viable either, so regular imports are required for constantly updating data. Finally, as BigQuery only accepts data formatted as comma-separated value (CSV) files, you will need to use external methods to clean up the data beforehand.
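The clean-up step mentioned above might look something like the following sketch, which uses Python's standard csv module to turn loosely structured records into well-formed CSV text. The function name, the choice to collapse embedded whitespace, and the decision to omit a header row are all assumptions made for illustration; the article does not say what BigQuery expects beyond CSV.

```python
import csv
import io


def to_clean_csv(records, columns):
    """Serialize dict records to CSV text, one row per record.

    Missing values become empty fields, and runs of whitespace
    (including embedded newlines) collapse to single spaces so that
    every record stays on one line. csv.writer takes care of quoting
    fields that contain commas or quote characters.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        row = []
        for column in columns:
            value = record.get(column)
            text = "" if value is None else str(value)
            row.append(" ".join(text.split()))
        writer.writerow(row)
    return buffer.getvalue()
```

For example, a record whose city field is " Oakland,\nCA " comes out as the quoted field "Oakland, CA", which a CSV importer can read back as a single value.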

Rather than provide end-user interfaces itself, Google wants an ecosystem to grow around BigQuery, with vendors incorporating it into their products, in the same way Elastic Map Reduce has acquired tool integration. Currently in beta test, to which anybody can apply, BigQuery is expected to be publicly available during 2012.

Prediction API

Many uses of machine learning are well defined, such as classification, sentiment analysis, or recommendation generation. To meet these needs, Google offers its Prediction API product.

Applications using the Prediction API work by creating and training a model hosted within Google's system. Once trained, this model can be used to make predictions, such as spam detection. Google is working on allowing these models to be shared, optionally with a fee. This will let you take advantage of previously trained models, which in many cases will save you time and expertise with training.
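The train-then-predict workflow can be sketched locally with a small naive Bayes text classifier. This is a stand-in written for illustration, not Google's Prediction API: the class name, its methods and the training data are all hypothetical, and a hosted service would keep the model on its own side of a REST call.

```python
import math
from collections import Counter, defaultdict


class TinyTextClassifier:
    """Naive Bayes over whitespace-separated words, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of examples
        self.vocabulary = set()

    def train(self, examples):
        """Update the model from (text, label) pairs."""
        for text, label in examples:
            words = text.lower().split()
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def predict(self, text):
        """Return the most probable label, or None if the model is untrained."""
        words = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.label_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Training happens once, and the resulting model can then be queried repeatedly, which is the split that makes sharing previously trained models attractive.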

Though promising, Google's offerings are in their early days. Further integration between its services is required, as well as time for ecosystem development to make their tools more approachable.

Microsoft

I have written in some detail about Microsoft's big data strategy in Microsoft's plan for Hadoop and big data. By offering its data platforms on Windows Azure in addition to Windows Server, Microsoft's aim is to make either on-premise or cloud-based deployments equally viable with its technology. Azure parallels Amazon's web service offerings in many ways, offering a mix of IaaS services with managed applications such as SQL Server.

Hadoop is the central pillar of Microsoft's big data approach, surrounded by the ecosystem of its own database and business intelligence tools. For organizations already invested in the Microsoft platform, Azure will represent the smoothest route for integrating big data into the operation. Azure itself is pragmatic about language choice, supporting technologies such as Java, PHP and Node.js in addition to Microsoft's own.

As with Google's BigQuery, Microsoft's Hadoop solution is currently in closed beta test, and is expected to be generally available sometime in the middle of 2012.

Big data cloud platforms compared

The following table summarizes the data storage and analysis capabilities of Amazon, Google and Microsoft's cloud platforms. Intentionally excluded are IaaS solutions without dedicated big data offerings.

| Capability | Amazon | Google | Microsoft |
|---|---|---|---|
| Product(s) | Amazon Web Services | Google Cloud Services | Windows Azure |
| Big data storage | S3 | Cloud Storage | HDFS on Azure |
| Working storage | Elastic Block Store | AppEngine (Datastore, Blobstore) | Blob, table, queues |
| NoSQL database | DynamoDB [1] | AppEngine Datastore | Table storage |
| Relational database | Relational Database Service (MySQL or Oracle) | Cloud SQL (MySQL compatible) | SQL Azure |
| Application hosting | EC2 | AppEngine | Azure Compute |
| Map/Reduce service | Elastic MapReduce (Hadoop) | AppEngine (limited capacity) | Hadoop on Azure [2] |
| Big data analytics | Elastic MapReduce (Hadoop interface [3]) | BigQuery [2] (TB-scale, SQL interface) | Hadoop on Azure (Hadoop interface [3]) |
| Machine learning | Via Hadoop + Mahout on EMR or EC2 | Prediction API | Mahout with Hadoop |
| Streaming processing | Nothing prepackaged: use custom solution on EC2 | Prospective Search API [4] | StreamInsight [2] ("Project Austin") |
| Data import | Network, physically ship drives | Network | Network |
| Data sources | Public Data Sets | A few sample datasets | Windows Azure Marketplace |
| Availability | Public production | Some services in private beta | Some services in private beta |

Conclusion

Cloud-based big data services offer considerable advantages in removing the overhead of configuring and tuning your own clusters, and in ensuring you pay only for what you use. The biggest issue is always going to be data locality, as it is slow and expensive to ship data. The most effective big data cloud solutions will be the ones where the data is also collected in the cloud. This is an incentive to investigate EC2, Azure or AppEngine as a primary application platform, and an indicator that PaaS competitors such as Cloud Foundry and Heroku will have to address big data as a priority.
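A back-of-the-envelope calculation shows why data locality dominates. The sketch below estimates how long it takes to move a dataset over a network link; the function name and the example figures (a 1 TB dataset, a 100 Mbit/s uplink) are assumptions chosen for illustration, not measurements.

```python
TERABYTE = 10**12  # bytes, decimal convention


def transfer_hours(data_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate hours needed to ship data over a link.

    efficiency is the fraction of nominal bandwidth actually achieved
    (1.0 is the ideal case; real transfers are usually slower).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600
```

Under these assumptions, `transfer_hours(TERABYTE, 100e6)` comes to a little over 22 hours even at full line rate, which is why collecting the data in the same cloud that will analyze it is so attractive.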

It is early days yet for big data in the cloud, with only Amazon offering battle-tested solutions at this point. Cloud services themselves are at an early stage, and we will see both increasing standardization and innovation over the next two years.

However, the twin advantages of not having to worry about infrastructure and economies of scale mean it is well worth investigating cloud services for your big data needs, especially for an experimental or green-field project. Looking to the future, there's no doubt that big data analytical capability will form an essential component of utility computing solutions.

Notes:

[1] In public beta.

[2] In controlled beta test.

[3] Hive and Pig compatible.

[4] Experimental status.


Data for the public good

O'Reilly Radar - Wed, 02/22/2012 - 10:00

Can data save the world? Not on its own. As an age of technology-fueled transparency, open innovation and big data dawns around the world, the success of new policy won't depend on any single chief information officer, chief executive or brilliant developer. Data for the public good will be driven by a distributed community of media, nonprofits, academics and civic advocates focused on better outcomes, more informed communities and the new news, in whatever form it is delivered.

Advocates, watchdogs and government officials now have new tools for data journalism and open government. Globally, there's a wave of transparency that will wash over every industry and government, from finance to healthcare to crime.

In that context, open government is about much more than open data — just look at the issues that flow around the #opengov hashtag on Twitter, including the nature of identity, privacy, security, procurement, culture, cloud computing, civic engagement, participatory democracy, corruption, civic entrepreneurship or transparency.

If we accept the premise that Gov 2.0 is a potent combination of open government, mobile, open data, social media, collective intelligence and connectivity, the lessons of the past year suggest that a tidal wave of technology-fueled change is still building worldwide.

The Economist's support for open government data remains salient today:

"Public access to government figures is certain to release economic value and encourage entrepreneurship. That has already happened with weather data and with America's GPS satellite-navigation system that was opened for full commercial use a decade ago. And many firms make a good living out of searching for or repackaging patent filings."

As Clive Thompson reported at Wired last year, public sector data can help fuel jobs, and "shoving more public data into the commons could kick-start billions in economic activity." In the transportation sector, for instance, transit data is open government fuel for economic growth.

There is a tremendous amount of work ahead in building upon the foundations that civil society has constructed over decades. If you want a deep look at what the work of digitizing data really looks like, read Carl Malamud's interview with Slashdot on opening government data.

Data for the public good, however, goes far beyond government's own actions. In many cases, it will happen despite government action — or, often, inaction — as civic developers, data scientists and clinicians pioneer better analysis, visualization and feedback loops.

For every civic startup or regulation, there's a backstory that often involves a broad number of stakeholders. Governments have to commit to open up themselves but will, in many cases, need external expertise or even funding to do so. Citizens, industry and developers have to show up to use the data, demonstrating that there's not only demand, but also skill outside of government to put open data to work in service of accountability, citizen utility and economic opportunity. Galvanizing the co-creation of civic services, policies or apps isn't easy, but tapping the potential of the civic surplus has attracted the attention of governments around the world.

Many challenges stand in the way of that vision coming to pass. For one, data quality and access remain poor. Socrata's open data study identified progress, but also pointed to a clear need for improvement: Only 30% of developers surveyed said that government data was available, and of that, 50% of the data was unusable.

Open data will not be a silver bullet to all of society's ills, but an increasing number of states are assembling platforms and stimulating an app economy.

Results-oriented mayors like Rahm Emanuel and Mike Bloomberg are committing to opening Chicago and opening government data in New York City, respectively.

Following are examples of where data for the public good is already having an impact upon the world we live in, along with some ideas about what lies ahead.

Financial good

Anyone looking for civic entrepreneurship will be hard pressed to find a better recent example than BrightScope. The efforts of Mike and Ryan Alfred are in line with traditional entrepreneurship: identifying an opportunity in a market that no one else has created value around, building a team to capitalize on it, and then investing years of hard work to execute on that vision. In the process, BrightScope has made government data about the financial industry more usable, searchable and open to the public.

Due to the efforts of these two entrepreneurs and their California-based startup, anyone who wants to learn more about financial advisers before tapping one to manage their assets can do so online.

Prior to BrightScope, the adviser data was locked up at the Securities and Exchange Commission (SEC) and the Financial Industry Regulatory Authority (FINRA).

"Ryan and I knew this data was there because we were advisers," said BrightScope co-founder Mike Alfred in a 2011 interview. "We knew data had been filed, but it wasn't clear what was being done with it. We'd never seen it liberated from the government databases."

While they knew the public data existed and had their idea years ago, Alfred said it didn't happen because they "weren't in the mindset of being data entrepreneurs" yet. "By going after 401(k) first, we could build the capacity to process large amounts of data," Alfred said. "We could take that data and present it on the web in a way that would be usable to the consumer."

Notably, the government data that BrightScope has gathered on financial advisers goes further than a given profile page. Over time, as search engines like Google and Bing index the information, the data has become searchable in places consumers are actually looking for it. That's aligned with one of the laws for open data that Tim O'Reilly has been sharing for years: Don't make people find data. Make data find the people.

As agencies adapt to new business relationships, consumers are starting to see increased access to government data. Now, more data that the nation's regulatory agencies collected on behalf of the public can be searched and understood by the public. Open data can improve lives, not least through adding more transparency into a financial sector that desperately needs more of it. This kind of data transparency will give the best financial advisers the advantage they deserve and make it much harder for your Aunt Betty to choose someone with a history of financial malpractice.

The next phase of financial data for good will use big data analysis and algorithmic consumer advice tools, or "choice engines," to make better decisions. The vast majority of consumers are unlikely to ever look directly at raw datasets themselves. Instead, they'll use mobile applications, search engines and social recommendations to make smarter choices.

There are already early examples of such services emerging. Billshrink, for example, lets consumers get personalized recommendations for a cheaper cell phone plan based on calling histories. Mint makes specific recommendations on how a citizen can save money based upon data analysis of the accounts added. Moreover, much of the innovation in this area is enabled by the ability of entrepreneurs and developers to go directly to data aggregation intermediaries like Yodlee or CashEdge to license the data.

Transit data as economic fuel

Transit data continues to be one of the richest and most dynamic areas for co-creation of services. Around the United States and beyond, there has been a blossoming of innovation in the city transit sector, driven by the passion of citizens and fueled by the release of real-time transit data by city governments.

Francisca Rojas, research director at the Harvard Kennedy School's Transparency Policy Project, has investigated the dynamics behind the disclosure of data by transit agencies in the United States, which she calls one of the most successful implementations of open government. "In just a few years, a rich community has developed around this data, with visionary champions for disclosure inside transit agencies collaborating with eager software developers to deliver multiple ways for riders to access real-time information about transit," wrote Rojas.

The Massachusetts Bay Transportation Authority (MBTA) learned from TriMet in Portland, Oregon, that open data is better. "This was the best thing the MBTA had done in its history," said Laurel Ruma, O'Reilly's director of talent and a long-time resident in greater Boston, in her 2010 Ignite talk on real-time transit data. The MBTA's move to make real-time data available and support it has spawned a new ecosystem of mobile applications, many of which are featured at MBTA.com.

There are now 44 different consumer-facing applications for the TriMet system. Chicago, Washington and New York City also have a growing ecosystem of applications.

As more sensors go online in smarter cities, tracking traffic patterns will enable public administrators to optimize routes, schedules and capacity, driving efficiency and a better allocation of resources.

Transparency and civic goods

As John Wonderlich, policy director at the Sunlight Foundation, observed last year, access to legislative data brings citizens closer to their representatives. "When developers and programmers have better access to the data of Congress, they can better build the databases and tools that let the rest of us connect with the legislature."

That's the promise of the Sunlight Foundation's work, in general: Technology-fueled transparency will help fight corruption and fraud and reveal the influence behind policies. That work is guided by data, generated, scraped and aggregated from government and regulatory bodies. The Sunlight Foundation has been focused on opening up Congress through technology since the organization was founded. Some of its efforts culminated recently with the publication of a live XML feed for the House floor and a transparency portal for House legislative documents.

There are other horizons for transparency through open government data, which broadly refers to public sector records that have been made available to citizens. For a canonical resource on what makes such releases truly "open," consult the "8 Principles of Open Government Data."

For instance, while gerrymandering has been part of American civic life since the birth of the republic, one of the best policy innovations of 2011 may offer hope for improving the redistricting process. DistrictBuilder, an open-source tool created by the Public Mapping Project, allows anyone to easily create legal districts.

"During the last year, thousands of members of the public have participated in online redistricting and have created hundreds of valid public plans," said Micah Altman, senior research scientist at Harvard University Institute for Quantitative Social Science, via an email last year.

"In substantial part, this is due to the project's effort and software. This year represents a huge increase in participation compared to previous rounds of redistricting; for example, the number of plans produced and shared by members of the public this year is roughly 100 times the number of plans submitted by the public in the last round of redistricting 10 years ago," Altman said. "Furthermore, the extensive news coverage has helped make a whole new set of people aware of the issue and has reframed it as a problem that citizens can actively participate in to solve, rather than simply complain about."

Principles for data in the public good

As a result of digital technology, our collective public memory can now be shared and expanded upon daily. In a recent lecture on public data for public good at Code for America, Michal Migurski of Stamen Design made the point that part of the global financial crisis came through a crisis in public knowledge, citing "The Destruction of Economic Facts," by Hernando de Soto.

To arrive at virtuous feedback loops that amplify the signals that citizens, regulators, executives and elected leaders inundated with information need to make better decisions, data providers and infomediaries will need to embrace key principles, as Migurski's lecture outlined.

First, "data drives demand," wrote Tim O'Reilly, who attended the lecture and distilled Migurski's insights. "When Stamen launched crimespotting.org, it made people aware that the data existed. It was there, but until they put visualization front and center, it might as well not have been."

Second, "public demand drives better data," wrote O'Reilly. "Crimespotting led Oakland to improve their data publishing practices. The stability of the data and publishing on the web made it possible to have this data addressable with public links. There's an 'official version,' and that version is public, rather than hidden."

Third, "version control adds dimension to data," wrote O'Reilly. "Part of what matters so much when open source, the web, and open data meet government is that practices that developers take for granted become part of the way the public gets access to data. Rather than static snapshots, there's a sense that you can expect to move through time with the data."

The case for open data

Accountability and transparency are important civic goods, but adopting open data requires grounded arguments for a city chief financial officer to support these initiatives. When it comes to making a business case for open data, John Tolva, the chief technology officer for Chicago, identified four areas that support the investment in open government:

  1. Trust — "Open data can build or rebuild trust in the people we serve," Tolva said. "That pays dividends over time."
  2. Accountability of the work force — "We've built a performance dashboard with KPIs [key performance indicators] that track where the city directly touches a resident."
  3. Business building — "Weather apps, transit apps ... that's the easy stuff," he said. "Companies built on reading vital signs of the human body could be reading the vital signs of the city."
  4. Urban analytics — "Brett [Goldstein] established probability curves for violent crime. Now we're trying to do that elsewhere, uncovering cost savings, intervention points, and efficiencies."

New York City is also using data internally. The city is doing things like applying predictive analytics to building code violations and housing data to try to understand where potential fire risks might exist.

"The thing that's really exciting to me, better than internal data, of course, is open data," said New York City chief digital officer Rachel Sterne during her talk at Strata New York 2011. "This, I think, is where we really start to reach the potential of New York City becoming a platform like some of the bigger commercial platforms and open data platforms. How can New York City, with the enormous amount of data and resources we have, think of itself the same way Facebook has an API ecosystem or Twitter does? This can enable us to produce a more user-centric experience of government. It democratizes the exchange of information and services. If someone wants to do a better job than we are in communicating something, it's all out there. It empowers citizens to collaboratively create solutions. It's not just the consumption but the co-production of government services and democracy."

The promise of data journalism

The ascendance of data journalism in media and government will continue to gather force in the years ahead.

Journalists and citizens are confronted by unprecedented amounts of data and an expanded number of news sources, including a social web populated by our friends, family and colleagues. Newsrooms, the traditional hosts for information gathering and dissemination, are now part of a flattened environment for news. Developments often break first on social networks, and that information is then curated by a combination of professionals and amateurs. News is then analyzed and synthesized into contextualized journalism.

Data is being scraped by journalists, generated from citizen reporting, or gleaned from massive information dumps — such as with the Guardian's formidable data journalism, as detailed in a recent ebook. ScraperWiki, a favorite tool of civic coders at Code for America and elsewhere, enables anyone to collect, store and publish public data. As we grapple with the consumption challenges presented by this deluge of data, new publishing platforms are also empowering us to gather, refine, analyze and share data ourselves, turning it into information.

There are a growing number of data journalism efforts around the world, from New York Times interactive features to the award-winning investigative work of ProPublica. Here are just a few promising examples:

  • Spending Stories, from the Open Knowledge Foundation, is designed to add context to news stories based upon government data by connecting stories to the data used.
  • Poderopedia is trying to bring more transparency to Chile, using data visualizations that draw upon a database of editorial and crowdsourced data.
  • The State Decoded is working to make the law more user-friendly.
  • Public Laboratory is a tool kit and online community for grassroots data gathering and research that builds upon the success of Grassroots Mapping.
  • Internews and its local partner Nai Mediawatch launched a new website that shows incidents of violence against journalists in Afghanistan.

Open aid and development

The World Bank has been taking unprecedented steps to make its data more open and usable to everyone. The data.worldbank.org website that launched in September 2010 was designed to make the bank's open data easier to use. In the months since, more than 100 applications have been built using the data.

"Up until very recently, there was almost no way to figure out where a development project was," said Aleem Walji, practice manager for innovation and technology at the World Bank Institute, in an interview last year. "That was true for all donors, including us. You could go into a data bank, find a project ID, download a 100-page document, and somewhere it might mention it. To look at it all on a country level was impossible. That's exactly the kind of organization-centric search that's possible now with extracted information on a map, mashed up with indicators. All of a sudden, donors and recipients can both look at relationships."

Open data efforts are not limited to development. More data-driven transparency in aid spending is also going online. Last year, the United States Agency for International Development (USAID) launched a public engagement effort to raise awareness about the devastating famine in the Horn of Africa. The FWD campaign includes a combination of open data, mapping and citizen engagement.

"Frankly, it's the first foray the agency is taking into open government, open data, and citizen engagement online," said Haley Van Dyck, director of digital strategy at USAID, in an interview last year.

"We recognize there is a lot more to do on this front, but are happy to start moving the ball forward. This campaign is different than anything USAID has done in the past. It is based on informing, engaging, and connecting with the American people to partner with us on these dire but solvable problems. We want to change not only the way USAID communicates with the American public, but also the way we share information."

USAID built and embedded interactive maps on the FWD site. The agency created the maps with open source mapping tools and published the datasets it used to make these maps on data.gov. All are available to the public and media to download and embed as well.

Publishing maps online at the same time as the open data that drives them is an advanced practice for any government agency, and it sets a worthy bar for future efforts to meet. USAID accomplished this by migrating its data to an open, machine-readable format.

"In the past, we released our data in inaccessible formats, mostly PDFs, that are often unable to be used effectively," said Van Dyck. "USAID is one of the premier data collectors in the international development space. We want to start making that data open, making that data sharable, and using that data to tell stories about the crisis and the work we are doing on the ground in an interactive way."

Crisis data and emergency response

Unprecedented levels of connectivity now exist around the world. According to a 2011 survey from the Pew Internet & American Life Project, more than 50% of American adults use social networks, 35% of American adults have smartphones, and 78% of American adults are connected to the Internet. When combined, those factors mean that we now see earthquake tweets spread faster than the seismic waves themselves. Networked publics can now share the effects of disasters in real time, providing officials with unprecedented insight into what's happening. Citizens act as sensors in the midst of the storm, creating an ad hoc system of networked accountability through data.

The growth of an Internet of Things is an important evolution. What we saw during Hurricane Irene in 2011 was the increasing importance of an Internet of people, where citizens act as sensors during an emergency. Emergency management practitioners and first responders have woken up to the potential of using social data for enhanced situational awareness and resource allocation.

An historic emergency social data summit in Washington in 2010 highlighted how relevant this area has become. And last year's hearing in the United States Senate on the role of social media in emergency management was "a turning point in Gov 2.0," said Brian Humphrey of the Los Angeles Fire Department.

The Red Cross has been at the forefront of using social data in a time of need. That's not entirely by choice, given that news of disasters has consistently broken first on Twitter. The challenge is for the men and women entrusted with coordinating response to identify signals in the noise.

First responders and crisis managers are using a growing suite of tools for gathering information and sharing crucial messages internally and with the public. Structured social data and geospatial mapping suggest one direction where these tools are evolving in the field.

A web application from ESRI deployed during historic floods in Australia demonstrated how crowdsourced social intelligence provided by Ushahidi can enable emergency social data to be integrated into crisis response in a meaningful way.

The Australian flooding web app includes the ability to toggle layers from OpenStreetMap, satellite imagery, and topography, and then filter by time or report type. By adding structured social data, the web app provides geospatial information system (GIS) operators with valuable situational awareness that goes beyond standard reporting, including the locations of property damage, roads affected, hazards, evacuations and power outages.

Long before the floods or the Red Cross joined Twitter, however, Brian Humphrey of the Los Angeles Fire Department (LAFD) was already online, listening. "The biggest gap directly involves response agencies and the Red Cross," said Humphrey, who currently serves as the LAFD's public affairs officer. "Through social media, we're trying to narrow that gap between response and recovery to offer real-time relief."

After the devastating 2010 earthquake in Haiti, the evolution of volunteers working collaboratively online also offered a glimpse into the potential of citizen-generated data. Crisis Commons has acted as a sort of "geeks without borders." Around the world, developers, GIS engineers, online media professionals and volunteers collaborated on information technology projects to support disaster relief for post-earthquake Haiti, mapping streets on OpenStreetMap and collecting crisis data on Ushahidi.

Healthcare

What happens when patients find out how good their doctors really are? That was the question that Harvard Medical School professor Dr. Atul Gawande asked in the New Yorker, nearly a decade ago.

The narrative he told in that essay makes the history of quality improvement in medicine compelling, connecting it to the creation of a data registry at the Cystic Fibrosis Foundation in the 1950s. As Gawande detailed, that data was privately held. After it became open, life expectancy for cystic fibrosis patients tripled.

In 2012, the new hope is in big data, where techniques for finding meaning in the huge amounts of unstructured data generated by healthcare diagnostics offer immense promise.

The trouble, say medical experts, is that data availability and quality remain significant pain points that are holding back existing programs.

There are, literally, bright spots that suggest what's possible. Dr. Gawande's 2011 essay, which considered whether "hotspotting" using health data could help lower medical costs by giving the neediest patients better care, offered another perspective on the issue. Early outcomes made the approach look compelling. As Dr. Gawande detailed, when a Medicare demonstration program offered medical institutions payments that financed the coordination of care for its most chronically expensive beneficiaries, hospital stays and trips to the emergency rooms dropped more than 15% over the course of three years. A test program adopting a similar approach in Atlantic City saw a 25% drop in costs.

Through sharing data and knowledge, and then creating a system to convert ideas into practice, clinicians in the ImproveCareNow network were able to improve the remission rate for Crohn's disease from 49% to 67% without the introduction of new drugs.

In Britain, researchers found that the outcomes for adult cardiac patients improved after the publication of information on death rates. With the release of meaningful new open government data about performance and outcomes from the British national healthcare system, similar improvements may be on the way.

"I do believe we are at the beginning of a revolutionary moment in health care, when patients and clinicians collect and share data, working together to create more effective health care systems," said Susannah Fox, associate director for digital strategy at the Pew Internet & American Life Project, in an interview in January. Fox's research has documented the social life of health information, the concept of peer-to-peer healthcare, and the role of the Internet among people living with chronic disease.

In the past few years, entrepreneurs, developers and government agencies have been collaboratively exploring the power of open data to improve health. In the United States, the open data story in healthcare is evolving quickly, from new mobile apps that lead to better health decisions to data spurring changes in care at the U.S. Department of Veterans Affairs.

Since he entered public service, Todd Park, the first chief technology officer of the U.S. Department of Health and Human Services (HHS), has focused on unleashing the power of open data to improve health. If you aren't familiar with this story, read the Atlantic's feature article that explores Park's efforts to revolutionize the healthcare industry through better use of data.

Park has focused on releasing data at Health.Data.Gov. In a speech to a Hacks and Hackers meetup in New York City in 2011, Park emphasized that HHS wasn't just releasing new data: "[We're] also making existing data truly accessible or usable," he said, taking "stuff that's in a book or on a website and turning it into machine-readable data or an API."

Park said it's still quite early in the project and that the work isn't just about data — it's about how and where it's used. "Data by itself isn't useful. You don't go and download data and slather data on yourself and get healed," he said. "Data is useful when it's integrated with other stuff that does useful jobs for doctors, patients and consumers."

What lies ahead

There are four trends that warrant special attention as we look to the future of data for public good: civic network effects, hybridized data models, personal data ownership and smart disclosure.

Civic network effects

Community is a key ingredient in successful open government data initiatives. It's not enough to simply release data and hope that venture capitalists and developers magically become aware of the opportunity to put it to work. Marketing open government data is what repeatedly brought federal Chief Technology Officer Aneesh Chopra and Park out to Silicon Valley, New York City and other business and tech hubs.

Despite the addition of topical communities to Data.gov, conferences and new media efforts, government's attempts to act as an "impatient convener" can only go so far. Civic developer and startup communities are creating a new distributed ecosystem that will help create that community, from BuzzData to Socrata to new efforts like Max Ogden's DataCouch.

Smart disclosure

There are enormous economic and civic good opportunities in the "smart disclosure" of personal data, whereby a private company or government institution provides a person with access to his or her own data in open formats. Smart disclosure is defined by Cass Sunstein, Administrator of the White House Office of Information and Regulatory Affairs, as a process that "refers to the timely release of complex information and data in standardized, machine-readable formats in ways that enable consumers to make informed decisions."

For instance, the quarterly financial statements of the top public companies in the world are now available online through the Securities and Exchange Commission.

Why does it matter? The interactions of citizens with companies or government entities generate a huge amount of economically valuable data. If consumers and regulators had access to that data, they could tap it to make better choices about everything from finance to healthcare to real estate, much in the same way that web applications like Hipmunk and Zillow let consumers make more informed decisions.

Personal data assets

When a trend makes it to the World Economic Forum (WEF) in Davos, it's generally evidence that the trend is gathering steam. A report titled "Personal Data Ownership: The Emergence of a New Asset Class" suggests that 2012 will be the year when citizens start thinking more about data ownership, whether that data is generated by private companies or the public sector.

"Increasing the control that individuals have over the manner in which their personal data is collected, managed and shared will spur a host of new services and applications," wrote the paper's authors. "As some put it, personal data will be the new 'oil' — a valuable resource of the 21st century. It will emerge as a new asset class touching all aspects of society."

The idea of data as a currency is still in its infancy, as Strata Conference chair Edd Dumbill has emphasized. The Locker Project, which provides people with the ability to move their own data around, is one of many approaches.

The growth of the Quantified Self movement and online communities like PatientsLikeMe and 23andMe validates the strength of the movement. In the U.S. federal government, the Blue Button initiative, which enables veterans to download personal health data, has now spread to all federal employees and earned adoption at Aetna and Kaiser Permanente.

In early 2012, a Green Button was launched to unleash energy data in the same way. Venture capitalist Fred Wilson called the Green Button an "OAuth for energy data."

Wilson wrote:

"It is a simple standard that the utilities can implement on one side and web/mobile developers can implement on the other side. And the result is a ton of information sharing about energy consumption and, in all likelihood, energy savings that result from more informed consumers."

Hybridized public-private data

Free or low-cost online tools are empowering citizens to do more than donate money or blood: Now they can donate time or expertise, or even act as sensors. In the United States, we saw a leading edge of this phenomenon in the Gulf of Mexico, where Oil Reporter, an open source oil spill reporting app, provided a prototype for data collection via smartphone. In Japan, an analogous effort called Safecast grew and matured in the wake of the nuclear disaster that resulted from a massive earthquake and subsequent tsunami in 2011.

Open source software and citizens acting as sensors have steadily been integrated into journalism over the past few years, most dramatically in the videos and pictures uploaded after the 2009 Iran election and during 2011's Arab Spring.

Citizen science looks like the next frontier. Safecast is combining open data collected by citizen science with academic, NGO and open government data (where available), and then making it widely available. It's similar to other projects, where public data and experimental data are percolating.

Public data is a public good

Despite the myriad challenges presented by legitimate concerns about privacy, security, intellectual property and liability, the promise of more informed citizens is significant. McKinsey's 2011 report dubbed big data as the next frontier for innovation, with billions of dollars of economic value yet to be created. When that innovation is applied on behalf of the public good, whether it's in city planning, transit, healthcare, government accountability or situational awareness, those effects will be extended.

We're entering the feedback economy, where dynamic feedback loops between customers and corporations, partners and providers, citizens and governments, or regulators and companies can drive both greater efficiency and leaner, smarter governments.

The exabyte age will bring with it the twin challenges of information overload and overconsumption, both of which will require organizations of all sizes to use the emerging toolboxes for filtering, analysis and action. To create public good from public goods — the public sector data that governments collect, the private sector data that is being collected and the social data that we generate ourselves — we will need to collectively forge new compacts that honor existing laws and visionary agreements that enable the new data science to put the data to work.

Photo: NYTimes: 365/360 - 1984 (in color) by blprnt_van, on Flickr

Categories: Open Source Feeds

Four short links: 22 February 2012

O'Reilly Radar - Wed, 02/22/2012 - 07:00

  1. Hashbangs (Dan Webb) -- why those terrible #! URLs are a bad idea. Looks like they're going away with pushState coming to browsers. As Dan says, "URLs are forever". Let's get them right. I'm fascinated by how URLs are changing meaning and use over time.
  2. DNA Sequencing on a USB Stick -- this has been going the rounds, but I think there's a time coming when scientific data generation can be crowdsourced. I care about a particular type of fish, but it hasn't been sequenced. Can I catch one, sequence it, upload the sequence, and get insight into the animal by automated detection of similar genes from other animals? Let those who care do the boring work, let scientists work on the analysis.
  3. The US Recording Industry is Stealing From Me (Bruce Simpson) -- automated content detection at YouTube has created an industry of parasites who claim copyright infringement and then receive royalties from the ads shown on the allegedly infringing videos.
  4. Ubuntu on Android -- carry a desktop in your pocket? Tempting. It's for manufacturers, not something you install on existing handsets, which I'm sure will create tension with the open source world at Ubuntu's heart. Then again, creating tension with the open source world at Ubuntu's heart does seem to be Canonical's core competency ....

Report from HIMSS: health care tries to leap the chasm from the average to the superb

O'Reilly Radar - Tue, 02/21/2012 - 21:04

I couldn't attend the session today on StealthVest--and small surprise. Who wouldn't want to come see an Arduino-based garment that can hold numerous health-monitoring devices in a way that is supposed to feel like a completely normal piece of clothing? As with many events at the HIMSS conference, which has registered over 35,000 people (at least four thousand more than last year), the StealthVest presentation drew an overflow crowd.

StealthVest sounds incredibly cool (and I may have another chance to report on it Thursday), but when I gave up on getting into the talk I walked downstairs to a session that sounds kind of boring but may actually be more significant: Practical Application of Control Theory to Improve Capacity in a Clinical Setting.

The speakers in this session, from Banner Gateway Medical Center in Gilbert, Arizona, laid out a fairly standard use of analytics to predict when the hospital units are likely to exceed their capacity, and then to adjust patient and provider schedules to smooth out the curve. The basic idea comes from chemical engineering, and requires them to monitor all the factors that lead patients to come in to the hospital and that determine how long they stay. Queuing theory can show when things are likely to get tight. Hospitals care a lot about these workflow issues, as Fred Trotter and David Uhlman discuss in the O'Reilly book Beyond Meaningful Use, and they have a real effect on patient care too.
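To make the queuing idea concrete, here is a minimal sketch of my own, not anything the Banner speakers showed. It assumes the simplest possible model, Little's law (average occupancy equals arrival rate times average length of stay), and applies it day by day, which is a rough approximation. The function names and all the numbers are invented for illustration.

```python
# Hypothetical sketch: estimate bed occupancy with Little's law
# (occupancy = arrival rate x average length of stay) and flag tight days.
# A real capacity model would account for many more factors.

def expected_occupancy(arrivals_per_day, avg_stay_days):
    """Little's law: average number of patients in the unit."""
    return arrivals_per_day * avg_stay_days


def days_over_capacity(forecast_arrivals, avg_stay_days, beds):
    """Return the indexes of days whose expected occupancy exceeds the bed count."""
    return [
        day
        for day, arrivals in enumerate(forecast_arrivals)
        if expected_occupancy(arrivals, avg_stay_days) > beds
    ]


# Made-up forecast of admissions per day, for illustration only.
forecast = [8, 10, 13, 9]
tight_days = days_over_capacity(forecast, avg_stay_days=3.0, beds=30)
print(tight_days)
```

Days flagged this way are the candidates for rescheduling. The hard part, as the speakers suggested, is the monitoring that produces a trustworthy forecast in the first place.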

The reason I find this topic interesting is that capacity planning leads fairly quickly to visible cost savings. So hospitals are likely to do it. Furthermore, once they go down the path of collecting long-term data and crunching it, they may extend the practice to clinical decision support, public health reporting, and other things that can make a big difference to patient care.

A few stats about data in U.S. health care

Do we need a big push to do such things? We sure do, and that's why meaningful use was introduced into HITECH sections of the American Recovery and Reinvestment Act. HHS released mounds of government health data on Health.data.gov hoping to serve a similar purpose. Let's just take a look at how far the United States is from using its health data effectively.

  • Last November, a CompTIA survey (reported by Health Care IT News) found that only 28% of providers have comprehensive EHRs in use, and another 17% have partial implementations. One has to remember that even a "comprehensive" EHR is unlikely to support the sophisticated data mining, information exchange, and process improvement that will eventually lead to lower costs and better care.

  • According to a recent Beacon Partners survey (PDF), half of the responding institutions have not yet set up an infrastructure for pursuing health information exchange, although 70% consider it a priority. The main problem, according to a HIMSS survey, is budget: HIEs are shockingly expensive. There's more to this story, which I reported on from a recent conference in Massachusetts.

Stats like these have to be kept in mind when listening to HIMSS board chair Charlene S. Underwood extol the organization's achievements in the morning keynote. HIMSS has promoted good causes, but only recently has it addressed cost, interoperability, and open source issues that can allow health IT to break out of the elite of institutions large or sophisticated enough to adopt the right practices.

As signs of change, I am particularly happy to hear of HIMSS's new collaboration with Open Health Tools and its acquisition of the mHealth summit. These should guide the health care field toward more patient engagement and adaptable computer systems. HIEs are another area crying out for change.

An HIE optimist

With the flaccid figures for HIE adoption in mind, I met Charles Parisot, chair of Interoperability Standards and Testing Manager for EHRA, which is HIMSS's Electronic Health Records Association. The biggest EHR vendors and HIEs come together in this association, and Parisot was just stoked with positive stories about their advances.

His take on the cost of HIEs is that most of them just do it in a brute force manner that doesn't work. They actually copy the data from each institution into a central database, which is hard to manage from many standpoints. The HIEs that have done it right (notably in New York state and parts of Tennessee) are sleek and low-cost. The solution involves:

  • Keeping the data at the health care providers, and storing in the HIE only some glue data that associates the patient and the type of data to the provider.

  • Keeping all metadata about formats out of the HIE, so that new formats, new codes, and new types of data can easily be introduced into the system without recoding the HIE.

  • Breaking information exchange down into constituent parts--the data itself, the exchange protocols, identification, standards for encryption and integrity, etc.--and finding standard solutions for each of these.
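The first two points can be sketched in a few lines of code. This is my own illustration under stated assumptions, not EHRA's design: the class and function names are hypothetical, the "HIE" is an in-memory index, and documents are treated as opaque values so the index never needs to understand their format.

```python
from collections import defaultdict


class RecordLocator:
    """Hypothetical HIE index: stores pointers to data, never the data itself."""

    def __init__(self):
        # (patient_id, data_type) -> set of provider names holding such data
        self._index = defaultdict(set)

    def register(self, patient_id, data_type, provider):
        self._index[(patient_id, data_type)].add(provider)

    def locate(self, patient_id, data_type):
        return sorted(self._index.get((patient_id, data_type), set()))


class Provider:
    """Stand-in for a provider system that keeps its own records."""

    def __init__(self, name):
        self.name = name
        self._records = {}

    def store(self, patient_id, data_type, document):
        self._records[(patient_id, data_type)] = document

    def fetch(self, patient_id, data_type):
        return self._records[(patient_id, data_type)]


def pull_records(locator, providers, patient_id, data_type):
    """Ask the HIE where the data lives, then pull it from each provider."""
    return [
        providers[name].fetch(patient_id, data_type)
        for name in locator.locate(patient_id, data_type)
    ]


# Illustrative usage with invented identifiers.
clinic = Provider("clinic-a")
clinic.store("p1", "lab", "<opaque lab document>")
locator = RecordLocator()
locator.register("p1", "lab", "clinic-a")
print(pull_records(locator, {"clinic-a": clinic}, "p1", "lab"))
```

Because the locator holds only the glue data, a provider can change its document format without any change to the index. Identification, encryption, and the exchange protocol are left out here entirely.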

So EHRA has developed profiles (also known by their ONC term, implementation specifications) that indicate which standard is used for each part of the data exchange. Metadata can be stored in the core HL7 document, the Clinical Document Architecture, and differences between implementations of HL7 documents by different vendors can also be documented.

A view of different architectures in their approach can be found in an EHRA white paper, Supporting a Robust Health Information Exchange Strategy with a Pragmatic Transport Framework. As testament to their success, Parisot claimed that the interoperability lab (a huge part of the exhibit hall floor space, and a popular destination for attendees) could set up the software connecting all the vendors' and HIEs' systems in one hour.

I asked him about the simple email solution promised by the government's Direct project, and whether that may be the path forward for small, cash-strapped providers. He accepted that Direct is part of the solution, but warned that it doesn't make things so simple. Unless two providers have a pre-existing relationship, they need to be part of a directory or even a set of federated directories, and assure their identities through digital signatures.

And what if a large hospital receives hundreds of email messages a day from various doctors who don't even know to whom their patients are being referred? Parisot says metadata must accompany any communications--and he's found that it's more effective for institutions to pull the data they want than for referring physicians to push it.

Intelligence for hospitals

Finally, Parisot told me EHRA has developed standards for submitting data to EHRs from 350 types of devices, and has 50 manufacturers working on devices with these standards. I visited a booth of iSirona as an example. They accept basic monitoring data such as pulses from different systems that use different formats, and translate over 50 items of information into a simple text format that they transmit to an EHR. They also add networking to devices that communicate only over cables. Outlying values can be rejected by a person monitoring the data. The vendor pointed out that format translation will be necessary for some time to come, because neither vendors nor hospitals will replace their devices simply to implement a new data transfer protocol.
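The translation step is easy to picture in code. The sketch below is purely illustrative and is not iSirona's product or format: the two vendor layouts, the pipe-delimited output, and the review thresholds are all assumptions I made up to show the shape of the problem.

```python
def normalize(reading):
    """Map a vendor-specific reading onto common field names.

    Both input layouts handled here are invented for illustration.
    """
    if reading["vendor"] == "vendor_a":
        return {"device": reading["id"], "metric": "pulse", "value": reading["hr_bpm"]}
    if reading["vendor"] == "vendor_b":
        return {"device": reading["serial"], "metric": "pulse", "value": reading["pulse"]}
    raise ValueError(f"unknown vendor: {reading['vendor']}")


def to_text(record):
    """Serialize a normalized record as one pipe-delimited line."""
    return f"{record['device']}|{record['metric']}|{record['value']}"


def needs_review(record, low=30, high=220):
    """Flag outlying values so a person can accept or reject them.

    The thresholds are arbitrary placeholders, not clinical guidance.
    """
    return not (low <= record["value"] <= high)


# Two made-up readings in two made-up formats.
readings = [
    {"vendor": "vendor_a", "id": "A1", "hr_bpm": 72},
    {"vendor": "vendor_b", "serial": "B7", "pulse": 400},
]
for raw in readings:
    record = normalize(raw)
    print(to_text(record), needs_review(record))
```

Each new device format costs one more branch in the translator, while the EHR side sees a single text format. That is the trade the vendor was describing.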

For more about devices, I dropped by one of the most entertaining parts of the conference, the Intelligent Hospital Pavilion. Here, after a badge scan, you are somberly led through a series of locked doors into simulated hospital rooms where you get to watch actors in nursing outfits work with lifesize dolls and check innumerable monitors. I think the information overload is barely ameliorated and may be worsened by the arrays of constantly updated screens.

But the background presentation is persuasive: by attaching RFIDs and all sorts of other devices to everything from people to equipment, and basically making the hospital more like a factory, providers can radically speed up responses in emergency situations and reduce errors. Some devices use the ISM "junk" band, whereas more critical ones use dedicated spectrum. Redundancy is built in throughout the background servers.

Waiting for the main event

The US health care field held its breath most of last week, waiting for Stage 2 meaningful use guidelines from HHS. The announcement never came, nor did it come this morning as many people had hoped. Because meaningful use is the major theme of HIMSS, and many sessions were planned on helping providers move to Stage 2, the delay in the announcement put the conference in an awkward position.

HIMSS is also nonplussed over a delay in another initiative, the adoption of a new standard in the classification of disease and procedures. ICD-10 is actually pretty old, having been standardized in the 1980s, and the U.S. lags decades behind other countries in adopting it. Advantages touted for ICD-10 are:

  • It incorporates newer discoveries in medicine than the dominant standard in the U.S., ICD-9, and therefore permits better disease tracking and treatment.

  • Additionally, it's much more detailed than ICD-9 (with an order of magnitude more classifications). This allows the recording of more information but complicates the job of classifying a patient correctly.

ICD-10 is rather controversial. Some people would prefer to base clinical decisions on SNOMED, a standard described in the Beyond Meaningful Use book mentioned earlier. Ultimately, doctors lobbied hard against the HHS timeline for adopting ICD-10 because providers are so busy with meaningful use. (But of course, the goals of adopting meaningful use are closely tied to the goals of adopting ICD-10.) It was the pushback from these institutions that led HHS to accede and announce a delay. HIMSS and many of its members were disappointed by the delay.

In addition, there is an upcoming standard, ICD-11, whose sandal some say ICD-10 is not even worthy to lace. A strong suggestion that the industry just move to ICD-11 was aired in Government Health IT, and the possibility was raised in Health Care IT News as well. In addition to reflecting the newest knowledge about disease, ICD-11 is praised for its interaction with SNOMED and its use of Semantic Web technology.

That last point makes me a bit worried. The Semantic Web has not been widely adopted, and if people in the health IT field think ICD-10 is complex, how are they going to deal with drawing up and following relationships through OWL? I plan to learn more about ICD-11 at the conference.
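
For readers who have not worked with ontologies, here is a rough Python sketch of what "following relationships" involves. It uses a plain dictionary rather than OWL or any Semantic Web library, and the class names are illustrative assumptions, not taken from ICD-11 or SNOMED.

```python
from typing import Dict, List, Set

# Hypothetical "is a kind of" links, in the spirit of OWL subclass axioms.
# These names are made up for illustration only.
SUBCLASS_OF: Dict[str, List[str]] = {
    "viral pneumonia": ["pneumonia"],
    "pneumonia": ["lung infection"],
    "lung infection": ["respiratory disease", "infectious disease"],
}


def ancestors(term: str) -> Set[str]:
    """Follow subclass links upward and collect every broader class."""
    found: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```

Even this toy version shows the shift in thinking: a code no longer sits in one slot of a flat list, but belongs to every broader class reachable through its links.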

Categories: Open Source Feeds

HIMSS asks: Who is Biz Stone and what is Twitter?

O'Reilly Radar - Tue, 02/21/2012 - 17:30


Today, one of the founders of Twitter, Biz Stone, gave the opening keynote at HIMSS.

This is probably going to be the best keynote at HIMSS, followed by a speech from Dr. Farzad Mostashari, which will also be excellent. It goes downhill after that: there will be a talk about politics and another talk from an "explorer." I am sure those will be great talks, but when I go to HIMSS, I want to hear about health information technology. Want to know what @biz actually said? As usual, Twitter itself provides an instant summary.

HIMSS stands for Healthcare Information and Management Systems Society. The annual HIMSS conference is the largest Health IT gathering on the planet. Almost 40,000 people will show up to discuss healthcare information systems. Many of them will be individuals sent by their hospitals to try and find out what solutions they will need to purchase in order to meet meaningful use requirements. But many of the attendees are old school health IT experts, many of whom have spent entire careers trying to bring technology into a healthcare system that has resisted computerization tooth and nail. This year will likely break all kinds of attendance records for HIMSS. Rightly so: The value of connecting thousands of health IT experts with tens of thousands who are seeking health IT experts has never been higher.

It is ironic that Biz Stone is keynoting this year's talk, because Twitter has changed the health IT game so substantially. I say Twitter specifically, and not "social media" generally. I do not think Facebook or Google+ or your social media of choice has had nearly the impact that Twitter has had on healthcare communications.

HIMSS, and in many cases traditional health IT along with it, is experiencing something of a whirlwind. One force adding wind has been the fact that President Obama has funded EHR systems with meaningful use, and made it clear that the future of healthcare funding will take place at Accountable Care Organizations (ACOs) that are paid to keep people healthy rather than to cover procedures when they are sick. It is hard to overstate the importance of this. Meaningful Use and ACOs will do more to computerize medicine in five years than the previous 50 years did without these incentive changes.

But in the same breath, we must admit that the healthcare system as a whole is strained and unable to meet the needs of millions of its patients. The new force in healthcare is peer to peer medicine. There are really only a few things that doctors provide to patients. They either provide treatment, or they provide facts, or perhaps, they provide context for those facts. More and more, patients are seeking facts and context for that information, from the Internet generally and other patients specifically. This can be dangerous, but when done correctly it can be revolutionary.

It's not rocket science really; our culture has changed. Baby boomers still wonder if it is OK to discuss sexual issues in polite company. Their kids blog about their vasectomies. It's not just that we blog about vasectomies. We read blogs about vasectomies and consider it normal.

Someday, I will decide whether or not I should get a vasectomy. (I would like to have kids first.) When I make that decision, I might just give @johnbiggs a shout and ask him how it's going. He might not have time to answer me. But some vasectomy patient somewhere will have the time to tell me what it is like. Some epatient will be willing to spend an hour talking to me about what it meant to them to have this procedure. I can talk with patients who had a good experience, I can talk to patients who had a bad experience. I will have access to insights that my urologist does not have, and most importantly does not have time to discuss with me in any case.

For whatever reason, the epatient community centers around Twitter. More than likely this is because of the fundamentally open nature of this network. Although it is possible to "protect" tweets, most account holders tend to tweet to the whole world. If you are interested in a particular health-related issue, you can use Twitter to find the group of people who are discussing that issue. Twitter is a natural way for people who are connected by a common thought or issue to organize. Facebook, on the other hand, is about connecting with people you already know. The famous quote applies: "Facebook is about people you used to know; Twitter is about people you'd like to know better." You could change that quote to read "Twitter is about people you'd like to know who have had vasectomies."

There are people on Twitter right now discussing very personal health issues. All you need to experience this is to do a little research to understand what hashtag a community is using to connect with each other. For instance:

I intentionally chose diseases that are not easy to discuss in person. Discussion on these delicate issues between people dealing with these problems happens all the time on Twitter. Very often Twitter is the place to find and meet people who are dealing with the same healthcare issues that you are, and then discover another place on the web where patients with similar conditions are gathering and helping each other. For better or worse, Twitter has become a kind of peer-to-peer healthcare marketplace. I think this is about a billion times more interesting than surgeons who update families via Twitter, although that is cool, too.

At Health 2.0 or the OSCON healthcare track, these kinds of insights are regarded as somewhat obvious. It is obvious that patients are seeking each other out using social media technologies and that this must somehow eventually be reconciled with the process that doctors are just undertaking to computerize medicine. But at HIMSS this is a revolutionary idea. HIMSS is full of old-school EHR vendors who are applying technology that was cutting edge in 1995 to 2012 problems. HIMSS is full of hospital administrators who recognize that their biggest barrier to meaningful use dollars is not an EHR, but the fact that 50% of their nurses do not know how to type.

I can promise you that the following conversation will be happening thousands of times in the main hall at HIMSS before Biz Stone speaks:

Attendee 1: Who is this speaking?

Attendee 2: Biz Stone.

Attendee 1: Who is that?

Attendee 2: One of the founders of Twitter.

Attendee 1: What is Twitter?

For this audience, Biz Stone talking about how Twitter revolutionizes healthcare will be electric. I wish I could be there.

Meaningful Use and Beyond: A Guide for IT Staff in Health Care — Meaningful Use underlies a major federal incentives program for medical offices and hospitals that pays doctors and clinicians to move to electronic health records (EHR). This book is a rosetta stone for the IT implementer who wants to help organizations harness EHR systems.

Categories: Open Source Feeds

Building the health information infrastructure for the modern epatient

O'Reilly Radar - Tue, 02/21/2012 - 10:00

To learn more about what levers the government is pulling to catalyze innovation in the healthcare system, I turned to Dr. Farzad Mostashari (@Farzad_ONC). As the National Coordinator for Health IT, Mostashari is one of the most important public officials entrusted with improving the nation's healthcare system through smarter use of technology.

Mostashari, a public-health informatics specialist, was named ONC chief in April 2011, replacing Dr. David Blumenthal. Mostashari's full biography, available at HHS.gov, notes that he "was one of the lead investigators in the outbreaks of West Nile Virus and anthrax in New York City, and was among the first developers of real-time electronic disease surveillance systems nationwide."

I talked to Mostashari on the same day that he published a look back over 2011, which he hailed as a year of momentous progress in health information technology. Our interview follows.

What excites you about your work? What trends matter here?

Farzad Mostashari: Well, it's a really fun job. It feels like this is the ideal time for this health IT revolution to tie into other massive megatrends that are happening around consumer and patient empowerment, payment and delivery reform, as I talked about in my TED Med Talk with Aneesh Chopra.

These three streams [how patients are cared for, how care is paid for, and how people take care of their own health] coming together feels great. And it really feels like we're making amazing progress.

How does what's happening today grow out of the passage of the Health Information Technology for Economic and Clinical Health (HITECH) Act in 2009?

Farzad Mostashari: HITECH was a key part of ARRA, the American Recovery and Reinvestment Act. This is the reinvestment part. People think of roadways and runways and railways. This is the information infrastructure for healthcare.

In the past two years, we made as much progress on adoption as we had made in the past 20 years before that. We doubled the adoption of electronic health records in physician offices between the time the stimulus passed and now. What that says is that a large number of barriers have been addressed, including the financial barriers that are addressed by the health IT incentive payments.

It also, I think, points to the innovation that's happening in the health IT marketplace, with more products that people want to buy and want to use, and an explosion in the number of options people have.

The programs we put in place, like the Regional Health IT Extension Centers modeled after the Agriculture Extension program, give a helping hand. There are local nonprofits throughout the country that are working with one-third of all primary care providers in this country to help them adopt electronic health records, particularly smaller practices and maybe health centers, critical access hospitals and so forth.

This is obviously a big lift and a big change for medicine. It moves at what Jay Walker called "med speed," not tech speed. The pace of transformation in medicine that's happening right now may be unparalleled. It's a good thing.

Healthcare providers have a number of options as they adopt electronic health records. How do you think about the choice between open source versus proprietary options?

Farzad Mostashari: We're pretty agnostic in terms of the technology and the business model. What matters are the outcomes. We've really left the decisions about what technology to use to the people who have to live with it, like the doctors and hospitals who make the purchases.

There are definitely some very successful models, not only on the EHR side, but also on the health information exchange side.

(Note: For more on this subject, read Brian Ahier's Radar post on the Health Internet.)

What role do open standards play in the future of healthcare?

Farzad Mostashari: We are passionate believers in open standards. We think that everybody should be using them. We've gotten really great participation by vendors of open source and proprietary software, in terms of participating in an open standards development process.

I think what we've enabled, through things like modular certification, is a lot more innovation. Different pieces of the entire ecosystem could be done through reducing the barrier to entry, enabling a variety of different innovative startups to come to the field. What we're seeing is, a lot of the time, this is migrating from installed software to web services.

If we're setting up a reference implementation of the standards, like the Connect software or popHealth, we do it through a process where the result is open source. I think the government as a platform approach at the Veterans Affairs department, DoD, and so forth is tremendously important.

How is the mobile revolution changing healthcare?

Farzad Mostashari: We had Jay Walker talking about big change [at a recent ONC Grantee Meeting]. I just have this indelible image of him waving in his left hand a clay cone with cuneiform on it that is from 2,000 B.C. (4,000 years ago), and in his right hand he held his iPhone.

He was saying both of them represented the cutting edge of technology that evolved to meet consumer need. His strong assertion was that this is absolutely going to revolutionize what happens in medicine at tech speed. Again, not "med speed."

I had the experience of being at my clinic, where I get care, and the pharmacist sitting in the starched, white coat behind the counter telling me that I should take this medicine at night.

And I said, "Well, it's easier for me to take it in the morning." And he said, "Well, it works better at night."

And I asked, acting as an empowered patient, "Well, what's the half life?" And he answered, "Okay. Let me look it up."

He started clacking away at his pharmacy information system; clickity clack, clickity clack. I can't see what he's doing. And then he says, "Ah hell," and he pulls out his smartphone and Googles it.

There's now a democratization of information and information tools, where we're pushing the analytics to the cloud. Being able to put that in the hand of not just every doctor or every healthcare provider but every patient is absolutely going to be that third strand of the DNA, putting us on the right path for getting healthcare that results in health.

We're making sure that people know they have a right to get their own data, making sure that the policies are aligned with that. We're making sure that we make it easy for doctors to give patients their own information through things like the Direct Project, the Blue Button, meaningful use requirements, or the Consumer E-Health Pledge.

We have more than 250 organizations that collectively hold data for 100 million Americans that pledge to make it easy for people to get electronic copies of their own data.

Do you think people will take ownership of their personal health data and engage in what Susannah Fox has described as "peer-to-peer healthcare"?

Farzad Mostashari: I think that it will be not just possible, not even just okay, but actually encouraged for patients to be engaged in their care as partners. Let the epatient help. I think we're going to see that emerging as there's more access and more tools for people to do stuff with their data once they get it through things like the health data initiative. We're also beginning to work with stakeholder groups, like Consumers Union, the American Nurses Association and some of the disease groups, to change attitudes around it being okay to ask for your own records.

This interview was edited and condensed. Photo from The Office of the National Coordinator for Health Information Technology.

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20

Categories: Open Source Feeds

Four short links: 21 February 2012

O'Reilly Radar - Tue, 02/21/2012 - 07:00

  1. Stop Paying Your jQuery Tax (Sam Saffron) -- performance advice for front-end developers. The faster your site responds, the more customers will use it.
  2. George Dyson Interviewed (Wired) -- a different perspective on computing, worth reading.
  3. VLC 2.0.0 -- VLC lets you bypass manufacturers' designed-in brokenness so your computer can play media. Glad to see it still being actively developed.
  4. Critical Truth About Power Laws (Science Magazine) -- Although power laws have been reported in areas ranging from finance and molecular biology to geophysics and the Internet, the data are typically insufficient and the mechanistic insights are almost always too limited for the identification of power-law behavior to be scientifically useful (see the figure). Indeed, even most statistically "successful" calculations of power laws offer little more than anecdotal value. (no PDF available unless you pay, because that's how great science works)

Categories: Open Source Feeds

Four short links: 20 February 2012

O'Reilly Radar - Mon, 02/20/2012 - 07:00

  1. University Copyright Fail -- This week, the University of Western Ontario and the University of Toronto signed a deal with the licensing group Access Copyright that includes: provisions defining e-mailing hyperlinks as equivalent to photocopying a document; a flat fee of $27.50 for each full-time equivalent student; and, surveillance of academic staff email. (via Fabiana Kubke)
  2. Peanutty -- I'm not sure it's perfect yet, but it does the best job I've seen of motivating people by connecting code with curiosity. Most of the other "learn to code" systems are big on bite-sized increments of knowledge but short on motivation unless you, for some reason, want to "learn to code".
  3. Why Facebook's Data Will Change Our World (Pete Warden) -- You just can't resist Facebook data can you? Like a dog returning to its own vomit. Great list of reasons why Facebook's data is scary interesting.
  4. Digital Exams on the iPad -- how to lock down an iPad for use in an exam. Love the explanation of how the security-paranoid mind works in action: both evil and methodical at the same time.

Categories: Open Source Feeds

Top stories: February 13-17, 2012

O'Reilly Radar - Fri, 02/17/2012 - 14:30

Here's a look at the top stories published across O'Reilly sites this week.

The stories behind a few O'Reilly "classics"
Tim O'Reilly: "It's amazing to me how books I first published more than 20 years ago are still creating value for readers."

How to create a visualization
Creating a visualization requires more than just data and imagery. Pete Warden outlines the process and actions that drove his new Facebook visualization project.

Let's remember why we got into the publishing business
At the 2012 Tools of Change for Publishing Conference this week in New York City, keynoter LeVar Burton reminded the audience why storytelling will always matter.

There's Plan A, and then there's the plan that will become your business
Drawing from the Lean Startup and other methods, "Running Lean" helps entrepreneurs transform flawed Plan A ideas into viable companies. "Running Lean" author Ash Maurya explains the basics in this interview.

The bond between data and journalism grows stronger
This interview with Liliana Bounegru, project coordinator of Data Driven Journalism at the European Journalism Centre, offers more insight into why the importance of data journalism continues to grow in the age of big data.

Strata 2012, Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work. Save 20% on Strata registration with the code RADAR20.

Categories: Open Source Feeds

Documentation strategy for a small software project: launching VoIP Drupal introductions

O'Reilly Radar - Fri, 02/17/2012 - 13:14

VoIP Drupal is a window onto the promises and challenges faced by a new open source project, including its documentation. At O'Reilly, we've been conscious for some time that we lack a business model for documenting new collaborative projects--near the beginning, at the stage where they could use the most help with good materials to promote their work, but don't have a community large enough to support a book--and I joined VoIP Drupal to explore how a professional editor can help such a team.

Small projects can reach a certain maturity with poor and sparse documentation. But the critical move from early adopters to mainstream requires a lot more hand-holding for prospective users. And these projects can spare hardly any developer time for documentation. Users and fans can be helpful here, but their documentation needs to be checked and updated over time; furthermore, reliance on spontaneous contributions from users leads to spotty and unpredictable coverage.

Large projects can hire technical writers, but what they do is very different from traditional documentation; they must be community managers as well as writers and editors (see Anne Gentle's book Conversation and Community: The Social Web for Documentation). So these projects can benefit from research into communities also.

I met at the MIT Media Lab this week with Leo Burd, the inventor of VoIP Drupal, and a couple other supporters, notably Micky Metts of DrupalConnection.com. We worked out some long-term plans for firming up VoIP Drupal's documentation and other training materials. But we also had to deal with an urgent need for materials to offer at DrupalCon, which begins in just over one month.

Challenges

One of the difficulties of explaining VoIP Drupal is that it's just so versatile. The foundations are simple:

  • A thin wrapper around PHP permits developers to write simple scripts that dial phone numbers, send SMS messages, etc. These scripts run on services that initiate connections and do translation between voice and text (Tropo, Twilio, and the free Plivo are currently supported).

  • Administrators on Drupal sites can use the Drupal interface to configure VoIP Drupal modules and add phone/SMS scripts to their sites.

  • Content providers can use the VoIP Drupal capabilities provided by their administrators to do such things as send text messages to site users, or to enable site users to record messages using their phone or computer.

Already you can see one challenge: VoIP Drupal has three different audiences that need very different documentation. In fact, we've thought of two more audiences: decision-makers who might build a business or service on top of VoIP Drupal, and potential team members who will maintain and build new features.

Some juicy modules built on top of VoIP Drupal's core extend its versatility to the point where it's hard to explain on an elevator ride what VoIP Drupal could do. Leo tosses out a few ideas such as:

  • Emergency awareness systems that use multiple channels to reach out to a population who live in a certain area. That would require a combination of user profiling, mapping and communication capabilities that tend to be extremely hard to put together under one single package.

  • Community polling/voting systems that are accessible via web, SMS, email, phone, etc.

  • CRM systems that keep track of (and even record) phone interactions, organize group conference calls with the click of a button, etc.

  • Voice-based bulletin boards.

  • Adding multiple authentication mechanisms to a site.

  • Sending SMS event notifications based on Google Calendars.

In theory you could create a complete voice and SMS based system out of VoIP Drupal and ignore the web site altogether, but that would be a rather cumbersome exercise. VoIP Drupal is well-suited to integrating voice and the Web--and it leaves lots of room for creativity.

Long-term development

A community project, we agreed, needs to be incremental and will result in widely distributed documents. Some people like big manuals, but most want a quickie getting-started guide and then lots of chances to explore different options at their own pace. Communities are good for developing small documents of different types. The challenge is finding someone to cover any particular feature, as well as to do the sometimes tedious work of updating the document over time.

We decided that videos would be valuable for the administrators and content providers, because they work through graphical interfaces. However, the material should also be documented in plain text. This expands access to the material in two ways. First, VoIP Drupal may be popular in parts of the world where bandwidth limitations make it hard to view videos. Second, the text pages are easier to translate into other languages.

Just as a video can be worth a thousand words, working scripts can replace a dozen explanations. Leo will set up a code contribution site on GitHub. This is more work than it may seem, because malicious or buggy scripts can wreak havoc for users (imagine someone getting a thousand identical SMS messages over the course of a single hour, for instance), so contributions have to be vetted.

Some projects assign a knowledgeable person or two to create an outline, then ask community members to fill it in. I find this approach too restrictive. Having a huge unfilled structure is just depressing. And one has to grab the excitement of volunteers wherever it happens to land. Just asking them to document what they love about a project will get you more material than presenting them with a mandate to cover certain topics.

But then how do you get crucial features documented? Wait and watch forums for people discussing those features. When someone seems particularly knowledgeable and eager to help, ask him or her for a longer document that covers the feature. You then have to reward this person for doing the work, and a couple ways that make sense in this situation include:

  • Get an editor to tighten up the document and work with the author to make a really professional article out of it.

  • Highlight it on your web site and make sure people can find it easily. For many volunteers, seeing their material widely used is the best reward.

We also agreed that we should divide documentation into practical, how-to documents and conceptual documents. Users like to grab a hello-world document and throw together their first program. As they start to shape their own projects, they realize they don't really understand how the system fits together and that they need some background concepts. Here is where most software projects fail. They assume that the reader understands the reasoning behind the design and knows how best to use it.

Good conceptual documentation is hard to produce, partly because the lead developers have the concepts so deeply ingrained that they don't realize what it is that other people don't know. Breaking the problems down into small chunks, though, can make it easier to produce useful guides.

Like many software projects, VoIP Drupal documentation currently starts the reader off with a list of modules. The team members liked an idea of mine to replace these with brief tutorials or use cases. Each would start with a goal or question (what the reader wants to accomplish) and then introduce the relevant module. In general, given the flexibility of VoIP Drupal, we agreed we need a lot more "why and when" documentation.

Immediate preparations

Before we take on a major restructuring and expansion of documentation, though, we have a tight deadline for producing some key videos and documents. Leo is going to lead a development workshop at DrupalCon, and he has to determine the minimum documentation needed to make it a productive experience. He also wants to do a webinar on February 28 or 29, and a series of videos on basic topics such as installing VoIP Drupal, a survey of successful sites using it, and a nifty graphical interface called Visual VoIP Drupal. Visual VoIP Drupal, which will be released in a few weeks, is one of the new features Leo would like to promote in order to excite users. It lets a programmer select blocks and blend them into a script through a GUI, instead of typing all the code.

The next few weeks will bring a flurry of work to realize our vision.

Categories: Open Source Feeds

The stories behind a few O'Reilly "classics"

O'Reilly Radar - Fri, 02/17/2012 - 13:00

This post originally appeared in Tim O'Reilly's Google+ feed.

It's amazing to me how books I first published more than 20 years ago are still creating value for readers. O'Reilly Media is running an ebook sale for some of our "classics."

"Vi and Vim" is an updated edition of a book we first published in 1986! Linda Lamb was the original author; I was the editor, and added quite a bit of material of my own. (In those days, being the "editor" for us really meant being ghostwriter and closet co-author.) I still use and love vi/vim.

"DNS and Bind" has an interesting back story too. In the late '80s or early '90s, I was looking for an author for a book on smail, a new competitor to sendmail that seemed to me to have some promise. I found Cricket Liu, and he said, "what I really want to write a book about is Bind and the Domain Name System. Trust me, it's more important than smail." The Internet was just exploding beyond its academic roots (we were still using UUCP!), but I did trust him. We published the first edition in 1992, and it's been a bestseller ever since.

"Unix in a Nutshell" was arguably our very first book. I created the first edition in 1984 for a long-defunct workstation company called Masscomp; we then licensed it to other companies, adapting it for their variants of Unix. In 1986, we published a public edition in two versions: System V and BSD. The original editions were inspired by the huge man page documentation sets that vendors were shipping at the time: I wanted to have something handy to look up command-line options, shell syntax, regular expression syntax, sed and awk command syntax, and even things like the ASCII character set.

The books were moderately successful until I tried a price drop from the original $19.50 to $9.95 as an experiment, with the marketing headline "Man bites dog." I told people we'd try the new price for six months, and if it doubled sales, we'd keep it. Instead, the enormous value proposition increased sales literally by an order of magnitude. At the book's peak, we were selling tens of thousands of copies a month.

Every other "in a nutshell" book we published derived from this one, a product line that collectively sold millions of copies, and helped put O'Reilly on the map.

"Essential System Administration" is another book that dates back to our early days as a documentation consulting company. I wrote the first edition of this book for Masscomp in 1984; it might well be the first Unix system administration book ever written. I had just written a graphics programming manual for Masscomp, and was looking for another project. I said, "When any of us have any problems with our machines, we go to Tom Texeira. Where are our customers going to go?" So I interviewed Tom, and wrote down what he knew. (That was the origin of so many of our early books — and the origin of the notion of "capturing the knowledge of innovators.")

I acquired the rights back from Masscomp, and licensed the book to a company called Multiflow, where Mike Loukides ran the documentation department. Mike updated the book. Æleen Frisch, who was working for Mike, did yet another edition for Multiflow, and when the company went belly up, I acquired back the improved version (and hired Mike as our first editor besides me and Dale). He signed Æleen to develop it as a much more comprehensive book, which has been in print ever since.

"Sed and Awk" has a funny backstory too. It was one of the titles that inspired the original animal designs. Edie Freedman thought Unix program names sounded like weird animals, and this was one of the titles she chose to make a cover for, even though the book didn't exist yet. We'd hear for years that people knew it existed — they'd seen it. Dale Dougherty eventually sat down and wrote it, mostly because he loved awk but also just to satisfy those customers who just knew it existed.

(Here's a brief history of how Edie came up with the idea for the animal book covers.)

And then there's "Unix Power Tools." In the late '80s, Dale had discovered hypertext via Hypercard, and when he discovered Viola and the World Wide Web, that became his focus. We had written a book called "Unix Text Processing" together, and I was hoping to lure him back to writing another book that exercised the hypertext style of the web, but in print. Dale was working on GNN by that time and couldn't be lured onto the project, but I was having so much fun that I kept going.

I recruited Jerry Peek and Mike Loukides to the project. It was a remarkable book both in being crowdsourced — we collected material from existing O'Reilly books, from saved Usenet posts, and from tips submitted by customers — and in being cross-linked like the web. Jerry built some great tools that allowed us to assign each article a unique ID, which we could cross-reference by ID in the text. As I rearranged the outline, the cross-references would automatically be updated. (It was all done with shell scripts, sed, and awk.)
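The ID-based cross-referencing described above can be sketched in a few lines. The original tooling was shell scripts, sed, and awk; the Python below is only an illustration of the idea, and the `{xref:ID}` marker syntax, the article IDs, and the function names are all hypothetical.

```python
import re

# Hypothetical marker syntax: {xref:ARTICLE_ID}. The real tools used their own format.
XREF = re.compile(r"\{xref:([A-Za-z0-9_-]+)\}")

def number_articles(outline):
    """Map each article ID to a 'chapter.article' number based on outline order."""
    numbers = {}
    for chapter_no, chapter in enumerate(outline, start=1):
        for article_no, article_id in enumerate(chapter, start=1):
            numbers[article_id] = f"{chapter_no}.{article_no}"
    return numbers

def resolve_xrefs(text, numbers):
    """Replace every {xref:ID} marker with the article's current number."""
    return XREF.sub(lambda m: numbers[m.group(1)], text)

# Made-up outline: a list of chapters, each a list of article IDs.
outline = [["grep-basics", "sed-intro"], ["awk-fields"]]
text = "For field splitting, see article {xref:awk-fields}."
print(resolve_xrefs(text, number_articles(outline)))

# Rearranging the outline changes the printed number without touching the text.
outline = [["awk-fields"], ["grep-basics", "sed-intro"]]
print(resolve_xrefs(text, number_articles(outline)))
```

Because the text only ever stores the stable ID, moving an article in the outline never requires editing the articles that point to it.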

Lots more in this trip down memory lane. But the fact is we've kept the books alive, kept updating them, and they are still selling, and still helping people do their jobs, decades later. It's something that makes me proud.

See comments and join the conversation about this topic at Google+.

Categories: Open Source Feeds

Visualization of the Week: Four ways to look at Obama's 2013 Budget

O'Reilly Radar - Fri, 02/17/2012 - 12:00

This week, President Barack Obama submitted to Congress his budget for the 2013 fiscal year. You can wade through the entire budget here, or you can get a different look at the budget data through the New York Times' interactive visualization. The Times visualization offers four different ways to examine the budget proposal: all spending, types of spending, changes, and department totals.


Screenshot from the New York Times' 2013 budget visualization. See the full interactive version.

The visualization opens on the "all spending" tab where you can see circles whose color and size represent the size and changes in spending. The size of the circle depends on the amount of spending, and the colors show change — green for more money proposed, red for less.
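The encoding described above (size for amount, color for change) can be sketched roughly as follows. This is not the Times' code: the function names are hypothetical, scaling by area rather than radius is an assumption, and so is the gray used for "no change."

```python
import math

def circle_radius(amount, max_amount, max_radius=50.0):
    """Scale the radius so that circle AREA is proportional to the amount.

    Assumption: area scaling, a common choice for bubble charts. The article
    only says that size depends on the amount of spending.
    """
    if amount <= 0 or max_amount <= 0:
        return 0.0
    return max_radius * math.sqrt(amount / max_amount)

def change_color(previous, proposed):
    """Green for more money proposed, red for less (gray is an assumption)."""
    if proposed > previous:
        return "green"
    if proposed < previous:
        return "red"
    return "gray"

# Made-up figures, for illustration only: (previous, proposed) spending.
items = {"Program A": (100.0, 120.0), "Program B": (40.0, 30.0)}
largest = max(proposed for _, proposed in items.values())
for name, (previous, proposed) in items.items():
    radius = circle_radius(proposed, largest)
    print(name, round(radius, 1), change_color(previous, proposed))
```

Scaling the radius by the square root keeps a program with four times the spending at four times the area, not sixteen.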

The transition between the tabs is animated. For example, when you click between the "all spending" and "types of spending" tabs, the circles reposition and regroup.

The full visualization can be seen here.

(Hat tip to Flowing Data.)

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20

Categories: Open Source Feeds

Publishing News: Let's remember why we got into this business

O'Reilly Radar - Fri, 02/17/2012 - 10:30

In this special edition of the Publishing Week in Review, I'm taking a look at highlights from the 2012 Tools of Change for Publishing Conference held in New York City earlier this week.

Publishing isn't about print vs digital or incompatible ereading formats — it's about storytelling

As far as inspiration goes, it doesn't get much better than LeVar Burton's TOC keynote address. Burton first talked about how he came to literature and publishing. Going back to his childhood, he reminisced that you were either reading a book or getting hit by one — his mother didn't care how, but "in her house, you were going to have an encounter with the written word."

His experiences with storytelling became more profound when he landed a major role on the miniseries "Roots," which taught him about the transformative nature of literature when combined with a visual medium. That experience was so profound for Burton that he left his priesthood studies, deciding storytelling was more effective at reaching people. This decision also later led to 25 years of "Reading Rainbow," the series that used TV to get kids interested in books.

Burton said that "stories are bridges to real-world experiences" and that he's a "firm believer between that which we imagine and that which we create."

"The stories that we tell each other and have told each other throughout the history of the development of civilization are integrally important, are inextricably linked, to how we continue to invent the world in which we live."

Burton said reading and storytelling go far beyond discussions of print versus digital or which digital format should prevail:

"We are going to be absolutely fine, so long as we do not fail ourselves in the one fundamental aspect of who it is we are and what we bring to the table. Remember, human beings are manifesting machines. We are just like that child watching the episodes of 'Star Trek,' seeing those images, using our imaginations, coming up with a piece of technology that actually serves humanity going forward.

"Our imaginations always have been, always will be, our continuing link into ourselves in order to make contact with ourselves so that then we might share the beauty of ourselves through culture with the rest of the world ... I encourage you to remember the nature of what it is you signed on for. You've come here to make a difference. You've come here to use your imaginations in the service of storytelling. Doing the same things we have done for years with a new opportunity, with new tools, a few more bells and whistles — it's still, and always will be, about storytelling."

Burton's full keynote is available in the following video:

The Publishing Panic of 2015 is coming. Can we stop it?

Joe Karaganis, vice president of The American Assembly at Columbia University, addressed issues of piracy and enforcement in a keynote address. Using his work with the Media Piracy in Emerging Economies project as a backdrop, Karaganis said the opposition to SOPA/PIPA and ACTA has moved the conversation beyond online piracy to the convergence of citizenship, democratic accountability and different rights.

The main ingredients of piracy, Karaganis said, are "high prices, low incomes and cheap digital technologies" and that "enforcement has been irrelevant — it's what happens around the edges of these underlying economic drivers." He argued that the current system doesn't scale well and that prosecution rarely occurs:

"When you look at how enforcement works in middle- and low-income countries, you find a pretty simple, consistent pattern: You find raid-based enforcement, characterized by the ramping up of police actions and little to no follow through. There's little likelihood that these cases will make it to trial, and in fact, little expectation that they will."

There's a simple explanation for the discrepancy: "It's cheaper to buy cops than lawyers: raids are cheap, but due process is expensive and slow." He argued that the new enforcement measures (SOPA/PIPA/ACTA) recognize this futility and so instead focus on abridging due process: "The only way to scale up enforcement is to take it out of the courts, to make it an administrative function, and whenever possible, an automated one."

Karaganis said his research showed there's a lot of casual infringement, but very little large-scale or hard-core infringement — 1-3% are hard-core pirates, according to his data.

Bringing the discussion around to publishing, specifically the education market, Karaganis asked, "What happens when the access problem is solved without any corresponding solution to the crisis of the library or the commercial markets — there will be access; the question is, who will make it convenient and affordable?" Using open-education research as an example, he said the problem is that they're not competing with the commercial market, they're competing with the pirate market:

"They're competing with a 'copy culture' that hasn't waited for approved institutional solutions to emerge. As digital readers get very, very cheap in the next few years, that copy culture is going to grow exponentially and produce a huge democratization in educational opportunity and access to knowledge. That will be a hugely disruptive challenge to all parties involved and produce its own cause for enforcement and control."

Karaganis referred to this impending phenomenon as "The Publishing Panic of 2015," and to address it we'll need more than just opposition to legislation like SOPA and PIPA:

"It's not enough to simply say SOPA is bad or enforcement doesn't work, even among people who agree. We need to develop a positive set of proposals for what we want, collectively, for what the public interest is in and around intellectual property. 'What's the positive agenda?' is a very fair question."

More background on Karaganis' research can be found at The American Assembly website. The "Media Piracy in Emerging Economies" report can be downloaded here.

Karaganis' full keynote can be viewed in the following video:

Bookstores: It's about monetizing relationships and experiences, not about selling books

The "Kepler's 2020: Building the Community Bookstore of the 21st Century" session created quite a buzz at the show. For a bit of background, The Kepler's 2020 Project release described it:

"The project aims to create an innovative hybrid business model that includes a for-profit, community-owned-and-operated bookstore, and a nonprofit organization that will feature on-stage author interviews, lectures by leading intellectuals, educational workshops and other literary and cultural events."

Thad McIlroy, owner of TheFutureofPublishing.com, opened the conference session with thoughts on reinventing "the notion of the bookstore in the midst of this crazy time of change." McIlroy said that the Kepler's 2020 project, being led by literary entrepreneur Praveen Madan, is blazing a trail.

Madan's subsequent presentation focused on debunking two industry myths: that printed books are not going to survive, and that we don't need bookstores in the age of instantly downloadable ebooks.

Madan shared a survey finding that revealed overwhelming support (95%) for using bookstores as "a place for browsing and discovering new ideas" and (72%) as "a place to buy books." He pointed out that more than half of the respondents had ereading devices.

Madan also offered two trends that explain why bookstores need to be reinvented and why they still have a future:

  1. Technology is having an isolating impact — "People are more and more disconnected from each other." We are working from home, shopping from home, and community gathering places (churches, schools, community centers) aren't as effective. So, what places are going to bring people together? "We think that can be bookstores," Madan said. "Bookstores need to be re-imagined as those places."
  2. Browsing — We still need showrooms for books. "The reality is that 18 years after Amazon started tweaking its algorithms for recommending books, a well-curated, physical, in-store experience is still better at helping readers discover books," Madan said.

"What we really need is for someone in the technology world to step up and say, 'I think there is an opportunity here,'" he said. Madan also insisted it needs to be open: "We'll pay for the services and we'll pay for the development, but the platform needs to be open source."

The buzz was heightened at the end of the Q&A session when Madan said he was looking to partner with Amazon to sell ebooks through his store:

"[Ebooks are] something we want to provide; we want to be part of the overall experience. But the solution and the technology has to come from somebody else. I'm very serious about looking at [partnering with] Amazon and just giving away Kindles and telling people it's okay — you have our permission. Walk into the bookstore, browse the books and download the books on your Kindle."

When people ask Madan how he'll make money, he answers that that isn't the point — he doesn't need to make money on every downloaded book; he'll make money on the relationships in other ways.

You can learn more about The Kepler's 2020 Project in the following short video:

If you couldn't make it to TOC, or you missed a session you wanted to see, sign up for the TOC 2012 Complete Video Compilation and check out our archive of free keynotes and interviews.

Categories: Open Source Feeds

Four short links: 17 February 2012

O'Reilly Radar - Fri, 02/17/2012 - 07:00

  1. How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did -- predictive analytics moves faster than family communications. (via Sara Winge)
  2. JSHint -- a tool to detect errors and potential problems in JavaScript code. (via Hacker News)
  3. Web Caching Tutorial -- explanation of the technical ins and outs of web caching.
  4. Gatekeeper -- Apple's new app security technology for Mac OS X. Identity, general purpose computing, security, and third-party kill switches all in the one technology. (via John Gruber)

Categories: Open Source Feeds

Developer Week in Review: NASA says goodbye to big iron

O'Reilly Radar - Thu, 02/16/2012 - 15:00

It looks like I'm going to have a life-changing decision to make in the next few weeks, one that will be shared by millions of people around the world. At risk, the balance in my bank account.

I refer, of course, to whether I'll pony up the cash to upgrade my iPad 2 to a 3, once Apple actually tells us what the iPad 3 will have in it. Unless it cooks gourmet dinners and transports you to other planets, my best guess is that I won't. For one thing, we're also facing the release of the iPhone 5 later in the year, and I make it a policy only to do one Apple fan-boy "upgrade the expensive toy you just bought last year" purchase a year. For another, it looks like the 3 is going to be a faster version of the 2 with a Retina display, and I just can't see it being enough of a delta in features to make it worth the cost.

If I'm going to upgrade either device, I need cash in the bank, so time to earn my keep with this week's news.

HAL is crestfallen ...

We arrive at a bit of a milestone this week, as NASA says goodbye to the last piece of big iron left in its data processing infrastructure. With the retirement of the last IBM Z9, NASA finishes its mission to boldly go where most of the rest of the high tech world had already gone years ago. I especially liked the shout-out to old-school programmers in JCL at the end of NASA's blog post marking the occasion.

NASA, like many organizations running life-critical applications, has to take a very conservative approach to hardware upgrades, because failure is not an option. The computers installed into NASA space vehicles and probes are notorious for being generations behind the current state of the art, because of the long lead times to get them spec'd out and installed. Obviously, no mainframe flies into space, for reasons of weight and space if nothing else. You can see the same kind of excruciatingly slow hardware progress at agencies like the FAA, which can take a human generation to upgrade to a new air traffic control system.

For now, let us bid farewell to the brave Z9, last of its kind at NASA. It would be nice to fantasize that it was responsible for some intricate detail of manned space flight, but the reality is that it evidently ran business applications. Even so, if you don't pay the engineers and vendors, they don't work, so it did play its own sort of role in the exploration of the universe.

Monty Redmond's Visual Python

Visual Studio, like Eclipse and Xcode, provides IDE support for a huge swath of the developer community. While it's still common to find old-schoolers who use Emacs or vi to grind out code, most programmers these days end up using an IDE to take advantage of the debugging and integrated documentation features they provide.

Eclipse is well-known for the wide variety of languages and platforms it supports, but it's easy to forget that Microsoft is making a concerted effort to open up Visual Studio to a wider developer audience as well. One sign of this is the version 1.1 release of Python Tools for Visual Studio, which has just come out. This toolkit is notable for another reason, too: it's one of the projects coming out of Microsoft's Codeplex open source initiative.

I know I'm not alone in having been skeptical of Microsoft's recent warming to open source. It's easy to see it as yet another "embrace, extend and extinguish" play. But at a certain point, you have to say that if it walks and talks like a mule, it may in fact be a mule after all. While I don't expect to see the Windows XP source code being donated to Apache anytime soon, it does appear that Microsoft is making an honest effort to leverage the power of the open source model where it makes sense. That's a huge change from the company's previous "open source is communism" stance. As with most things, time will tell if this is the real deal.

I guess we'll find out what happens when you cross the streams ...

Open source developers have a reputation for bringing a passion, sometimes at an obsessive level, to the projects they work on. But even they would find themselves challenged to keep up with the frenzied level of creative mania displayed by bronies, adult fans of the new My Little Pony reboot. So what happens when you combine the two forces of open source and the brony herd? Wonder Twin developer powers activate!

"PonyKart" is a "Mario Kart"-style game set in the "My Little Pony: Friendship is Magic" universe. It's being developed by a group of brony developers over on SourceForge. It's still in the early days, but the initial videos they've released are impressive.

There's a reason you don't see a lot of open source games with this level of complexity; it's a fairly massive undertaking and is usually only within the resources of major game houses. There is a very capable Linux "Mario Kart" clone out there, but consider that the "PonyKart" folks have only been in operation since July of last year, compared to the six years of development that have gone into "SuperTuxKart" so far, and you can get a feel for the awesome power that can be brought to bear when two committed movements overlap. To be fair, there are more tools available now, such as physics engines, than when "SuperTuxKart" started development, but the "PonyKart" effort is still striking. Imagine what could happen if we could get the Gleeks interested in video editing software ...

Tying in another theme often harped upon in these pages, the reason "PonyKart" can happen at all is that Hasbro has gone out of its way to apply a light hand as far as its intellectual property is concerned. Rather than wrapping a death-grip around the My Little Pony characters, Hasbro has let fans pretty much run wild with them (including the inevitable Rule 34 stuff). The company has wisely decided to let the fans churn up a meme-storm, while it sits back and counts the profits from toy sales. Are you listening, RIAA and MPAA? You could do much better by cooperating with your fan base, rather than persecuting them.

Of course, "PonyKart" could still lose momentum and die. There's a big difference between a long-term effort and horsing around for a few months (see what I did there?). But given the evidence to date, I wouldn't count this nag out of the race yet.

(Obligatory full disclosure: Your humble chronicler is a member of the herd, although not involved in the "PonyKart" project.)

Got news?

Please send tips and leads here.

Categories: Open Source Feeds

Strata Week: The data behind Yahoo's front page

O'Reilly Radar - Thu, 02/16/2012 - 13:00

Here are a few of the data stories that caught my attention this week.

Data and personalization drive Yahoo's front page

Yahoo offered a peek behind the scenes of its front page with the release of the Yahoo C.O.R.E. Data Visualization. The visualization provides a way to view some of the demographic details behind what Yahoo visitors are clicking on.

The C.O.R.E. (Content Optimization and Relevance Engine) technology was created by Yahoo Labs. The tech is used by Yahoo News and its Today module to personalize results for its visitors — resulting in some 13,000,000 unique story combinations per day. According to Yahoo:

"C.O.R.E. determines how stories should be ordered, dependent on each user. Similarly, C.O.R.E. figures out which story categories (i.e. technology, health, finance, or entertainment) should be displayed prominently on the page to help deepen engagement for each viewer."


Screenshot from Yahoo's CORE data visualization. See the full visualization here.

Scaling Tumblr

Over on the High Scalability blog, Todd Hoff examines how the blogging site Tumblr was able to scale its infrastructure, something that Hoff describes as more challenging than the scaling that was necessary at Twitter.

To give some idea of the scope of the problem, Hoff cites these figures:

"Growing at over 30% a month has not been without challenges. Some reliability problems among them. It helps to realize that Tumblr operates at surprisingly huge scales: 500 million page views a day, a peak rate of ~40k requests per second, ~3TB of new data to store a day, all running on 1000+ servers."
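To put those figures in perspective, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above; treating "over 30% a month" as exactly 30% and spreading page views evenly across the day are simplifying assumptions, so read the results as rough lower bounds and averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the article.
page_views_per_day = 500_000_000
monthly_growth = 0.30  # "over 30% a month", so this is a lower bound

# Average page views per second if traffic were spread evenly (it is not).
avg_page_views_per_second = page_views_per_day / SECONDS_PER_DAY

# Compounding 30% monthly growth over twelve months.
yearly_growth_factor = (1 + monthly_growth) ** 12

print(f"average page views per second: {avg_page_views_per_second:,.0f}")
print(f"growth over one year: about {yearly_growth_factor:.1f}x")
```

Sustained for a full year, that monthly rate would compound to more than a twentyfold increase, which helps explain why an architecture that worked at launch stopped working so quickly.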

Hoff interviews Blake Matheny, distributed systems engineer at Tumblr, for a look at the architecture of both "old" and "new" Tumblr. When the startup began, it was hosted on Rackspace where "it gave each custom domain blog an A record. When they outgrew Rackspace there were too many users to migrate."

The article also describes the Tumblr firehose, noting again its differences from Twitter's. "A challenge is to distribute so much data in real-time," Hoff writes. "[Tumblr] wanted something that would scale internally and that an application ecosystem could reliably grow around. A central point of distribution was needed." Although Tumblr initially used Scribe/Hadoop, "this model stopped scaling almost immediately, especially at peak where people are creating 1000s of posts a second."


Visualization creation

Data scientist Pete Warden offers his own lessons learned about building visualizations this week in a story here on Radar. His first tip: "Play with your data" -- that is, before you decide what problem you want to solve or visualization you want to create, take the time to know the data you're working with.

Warden writes:

"The more time you spend manipulating and examining the raw information, the more you understand it at a deep level. Knowing your data is the essential starting point for any visualization."

Warden explains how he was able to create a visualization for his new travel startup, Jetpac, that showed where American Facebook users go on vacation. Warden's tips aren't simply about the tools he used; he also walks through the conceptualization of the project as well as the crunching of the data.

Got data news?

Feel free to email me.

Categories: Open Source Feeds

Commerce Weekly: Google defends its Wallet

O'Reilly Radar - Thu, 02/16/2012 - 11:00

Here are some of the news stories that caught my eye this week.

Google says its Wallet is still safer than your leather one

Google's mobile commerce team spent the week doing damage control after the revelation of security flaws. Last week, it was widely reported that engineers at Zvelo, which provides web-categorization services, had found vulnerabilities in Google Wallet that allowed an app they had written to expose the PIN and tap prepaid funds in the wallet. Google's initial response was to advise users not to run Google Wallet on rooted phones, and be sure to have the screenlock on. But further work, as reported by Zvelo engineer Joshua Rubin, suggests that the hack requires root access, but not necessarily a pre-rooted phone: "While it is true that this PIN vulnerability requires root privileges to succeed, it does not require that the device be rooted previously." Rubin's post and a nice summary by Neil J. Rubenking at PCMag give a good picture of the vulnerability.

Security flaws like this feel inevitable to those accustomed to the ups and downs of web start-ups and the public bugs that accompany any release-early, release-often philosophy. They are, however, more alarming to those who work with banks, merchants, and anyone else who has experience moving money around. Bank Technology News captured the split between the two attitudes and cited Aaron McPherson, a practice director with IDC Financial Insights, saying the recent security problem demonstrates "an almost cavalier attitude by non-payments companies toward protecting consumer security."

Google wasn't cowed by the charges, responding with a calm coolness and an insistence that, despite any flaws in its payments system, it's still better than what everyone else is doing:

"Mobile payments are going to become more common in the coming years and we will learn much more as we continue to develop Google Wallet. In the meantime, you can be confident that the digital wallet you carry provides defenses that plastic and leather simply don't."

X.commerce harnesses the technologies of eBay, PayPal and Magento to create the first end-to-end multi-channel commerce technology platform. Our vision is to enable merchants of every size, service providers and developers to thrive in a marketplace where in-store, online, mobile and social selling are all mission critical to business success. Learn more at x.com.

Buck enters the one-click mobile payment fray

Buck (previously Billing Revolution) announced a one-click credit card checkout for goods this week. Entering your credit card information once in the app allows you to buy with a single click at participating online merchants — providing you want to buy from Glamour magazine, Papaya Mobile's social gaming network, or any of the other (relatively few) merchants now offering Buck.

If, on the other hand, you're at your local Starbucks, you'll want to pay with one click by unlocking your Starbucks mobile payment option, generating a 2D barcode, and holding it up for the cashier to scan. But suppose you were feeling too groovy for Starbucks this morning and you stopped at your local independent coffee house? Then you might want to pay with a single click with Square's Card Case, providing your indie coffee guy has signed up for that. At Home Depot, you'll want to use PayPal, at Macy's you can tap-and-pay with Google Wallet, and you might need to pay with American Express to get the Foursquare deal that your local eatery is offering.

Mobile payment is exhausting in its current, fragmented state, but it will be interesting to see which systems gain critical mass. Recent web history offers some clues. It was not too long ago that a half dozen search engines, including AltaVista, Yahoo and AskJeeves competed for your searches until one company offered a simpler way with more effective results. And five years ago there were a handful of social network sites competing for our profiles, including MySpace, Orkut, and Friendster, until Facebook rose on a platform of sharing photos, social games, and an easy interface. So which mobile-payments option will find the right combination of security, usability and adoption first?

Adele scorns freemium model

Freemium may be the up-and-coming dominant model in mobile apps — particularly in games — but not everyone is in love with the concept. Adele, who just took home six Grammy awards, declined Spotify's request to stream her award-winning album "21" on its service. According to Austin Carr on Fast Company, the reason is that Spotify offers two tiers of service: a free ad-supported service and a premium one without ads. Adele was willing to let "21" stream to Spotify's paying customers, but not to those riding for free. Spotify, which doesn't offer different libraries for its two tiers, couldn't accommodate the request. So while you could buy "21" on iTunes or hear it on Rhapsody (where everyone pays to stream), you can't hear it on Spotify. But, as Carr points out, with a 20% conversion rate of free subscribers to paying ones, who can second-guess Spotify?

Got news?

News tips and suggestions are always welcome, so please send them along.

If you're interested in learning more about the commerce space, check out DevZone on x.com, a collaboration between O'Reilly and X.commerce.

Categories: Open Source Feeds

The Falling Man and a center that cannot hold

O'Reilly Radar - Thu, 02/16/2012 - 10:00

This post originally appeared on The Question Concerning Technology ("Falling Man"). It's republished with permission.

AMC's "Mad Men" returns in March, but already the advertising for this show about advertising has successfully stirred a bit of controversy.

I refer to the video teasers and posters that exploit the Falling Man motif of the show's opening title sequence. The vertiginous imagery is controversial because it evokes, intentionally or not, one of the most harrowing news photographs ever taken: that of the "falling man" plunging to his death from the World Trade Center on 9/11.

I'm a fan of "Mad Men," but I'm also among those who find the title sequence disturbing. That's not because of any personal connection to 9/11, I don't think, although as a longtime resident of New York, it hits close enough. The source of my reaction is the power of the Falling Man photograph itself.

I'm not the first to observe that the Falling Man image is evocative on at least two visceral levels. It captures, in an excruciatingly personal way, the literal terror of 9/11. It also captures what it feels like, existentially, to be living in a world of radical uncertainty. The source of our anxiety isn't only terrorism, although that's part of it now. It's about a loss of psychic footing in a world of overwhelming change.

Critics have noted an infatuation in contemporary culture with nostalgia. This isn't surprising, given the degree of change that's subsumed us lately, and that's subsumed us ever since Watt introduced his steam engine. The past, unlike the present, offers something to hold onto. No accident that even as the Industrial Revolution raged around them, Victorians celebrated medieval chivalry and piety, lounging in drawing rooms that excluded, as Lewis Mumford put it, "every hint of the machine." World War I brutally ended any illusion that the machine could be kept at bay, an awakening depicted on the current season of PBS' "Downton Abbey."

"Mad Men" gets terrific mileage out of nostalgia, but we also enjoy knowing a secret the show's characters mostly don't: that their world is about to be turned upside down. Executive producer Matthew Weiner suggested in a recent interview that the dislocating effects of change may be "Mad Men's" most important underlying theme. Specifically, he noted the plaintive question asked by a character in the third season: "When is everything going to get back to normal?"

We know that change has been a constant of human affairs, of course, but we also know that technology has amplified the pace and scale of change exponentially. It's interesting that Alvin Toffler's concept of "future shock" doesn't get talked about much any more, despite the fact that the acceleration of technological change responsible for that state of psychological dislocation has, as he predicted, only increased in the decades since he coined the phrase. Would-be tech billionaires are fond of bragging that the application or device they're selling promises to be the most truly "disruptive" technology to come along since Google and Facebook, but even if they succeed they'll soon be looking over their shoulders for the next disruptive technology coming round the bend, as Google and Facebook already are.

In one form or another, the Falling Man has become the archetypical figure of the technological era, spinning his way into space from a center that cannot hold. A standard-bearer of Gilded Age displacement was Henry Adams, who in the opening pages of his autobiography described himself wondering:

"What could become of such a child of the seventeenth and eighteenth centuries, when he should wake up to find himself required to play the game of the twentieth? ... No such accident had ever happened before in human experience. For him, alone, the old universe was thrown into the ash-heap and a new one created."

Adams was far from alone, but it was no surprise he felt that way. Isolation is another symptom of the psychology of modernism, and another primary theme of "Mad Men," according to Matthew Weiner. The 19th-century versions of "future shock" were Marx's "alienation" and Durkheim's "anomie." In 1897 Durkheim published a study on the alarming rise in the number of suicides across Europe, a rise he attributed to the "morbid disturbance" caused by "the brilliant development of sciences, the arts and industry of which we are the witnesses." The work of centuries, he said, "cannot be remade in a few years."

We're often told that in order to maintain some semblance of balance in the world technology has made, we have to get used to the fact that everything is never going to get back to normal. So it is that the nostalgic appeal of "Mad Men" is precisely equivalent to that of "Downton Abbey": We get to watch complacently as complacency is overturned.

Categories: Open Source Feeds

Four short links: 16 February 2012

O'Reilly Radar - Thu, 02/16/2012 - 07:00

  1. The Undue Weight of Truth (Chronicle of Higher Education) -- Wikipedia has become fossilized fiction because the mechanism of self-improvement is broken.
  2. Playfic -- Andy Baio's new site that lets you write text adventures in the browser. Great introduction to programming for language-loving kids and adults.
  3. Review of Alone Together (Chris McDowall) -- I loved this review, its sentiments, and its presentation. Work on stuff that matters.
  4. Why ESRI As-Is Can't Be Part of the Open Government Movement -- data formats without broad support in open source tools are an unnecessary barrier to entry. You're effectively letting the vendor charge for your data, which is just stupid.
