This essay contains my thoughts, analysis, and supporting links about how the Internet has been evolving over the past 20 years. These ideas have been driving me over the past decade to develop a set of improved technologies, practices and standards.
Let’s come together and fulfill the dream that was at the core of the Internet: to unleash the potential of networked humanity. Please reach out to firstname.lastname@example.org.
The Internet originated from an effort to develop reliable communications in the case of a nuclear war. It’s been an incredible success even in peacetime, a science fiction tale come true: the Internet now connects well over 3 billion people and provides access to information and services using tiny devices we carry in our pockets. The change that the Internet will bring to society will be as large as or larger than that brought by Gutenberg’s press, the internal combustion engine, airplanes or electricity, and we have yet to see the full extent of its impact.
The Internet, unfortunately, isn’t fulfilling its potential for several reasons:
- Violation of information property rights through piracy and “scraping” has eliminated a source of income for developers, journalists, authors, and artists, and is reducing the quality of Internet content.
- Much valuable information is not yet online, or not easily found, even if there’s a market for it. The fundamental reason for this is the rigidity of licensing contracts and practices.
- It’s unnecessarily hard to create a sustainable content business online: advertising is inadequate, and charging for information on the global web is encumbered by legacy regulation.
- The Internet is putting the public in danger: there is an increasing amount of false and misleading information online, leading to political polarization, extremist movements, and terrorism. Current practices for countering it are inadequate, and there is no system for establishing which information can be trusted.
- The collection of personal data is threatening democracy with the emergence of a number of powerful private surveillance organizations. Again, the regulation is futile and inadequate.
Fortunately, there is a solution. Protection of personal and private data is a universal human right, but we need to enforce it. In order to develop and distribute protected information online, we need to implement better licensing technologies. To increase the quality and trustworthiness of information, we need to set up systems for reviewing, versioning, and reputation. Consequently, the Internet will develop its full potential, and we will create more than a billion jobs in the sustainable information economy.
2. Value of Data and Content
The Internet emerged by connecting communities of researchers, but as the Internet grew, antisocial behaviors were not adequately discouraged.
When I coauthored several internet standards (PNG, JPEG, MNG), I was guided by the vision of connecting humanity. Groups of volunteers like myself were developing open standards that would allow programmers to create internet software without restrictions or taxes. We felt this could be big if we succeeded, but we didn’t imagine that billions of people would now be using the open standards and the open software we created. The world is smaller than ever before. Friendships now span the globe. Internet technologies are reducing the need for travel for work, reducing fossil fuel consumption and pollution.
The Internet was originally designed to connect a few academic institutions, namely universities and research labs. Academia is a community of academics, which has always been based on the openness of information. Perhaps the most important to the history of the Internet is the hacker community composed of computer scientists, administrators, and programmers, most of whom are not affiliated with academia directly but are employed by companies and institutions. Whenever there is a community, its members are much more likely to volunteer time and resources to it. It was these communities that created websites, wrote the software, and started providing internet services.
The skills of the hacker community are highly sought after and compensated well, and hackers can afford to dedicate their spare time to the community. Society is funding universities and institutes who employ scholars. Within the academic community, the compensation is through citation, while plagiarism or falsification can destroy someone’s career. Institutions and communities have enforced these rules both formally and informally through members’ desire to maintain and grow their standing within the community.
The values of the academic community can be sustained within universities, but are not adequate outside of them. When businesses and the general public joined the Internet, many Internet technologies and services were overwhelmed by newcomers who didn’t share the community’s values and were not its members. In the beginning, there was very little unwanted email, or spam, on the Internet. But once America Online and other service providers started bringing in hordes of new Internet users around 1996, spam started growing. It was spam that brought down the USENET forums and made decentralized email clients almost unusable. Many companies are still being held hostage by denial-of-service attacks on their servers. False information is distracting people with untrue or irrelevant conspiracy theories and ineffective medical treatments, while facilitating recruiting and propaganda by terrorist organizations. The excessively idealistic assumptions have actually made reality worse for Internet users.
Fighting spam led to commercialization of the Internet, and excessive centralization of control and information
Large web media companies like Google, Amazon and Microsoft have been able to detect spam by creating highly centralized systems. Their services are highly popular, and the companies are liked by the general public. But as a result, a small number of companies have control of an unprecedented amount of personal information. These companies have access to what we search for, what we post about, what we email, who we message, where we go, who we go with, who we call, what websites we view.
A small group of conspiring individuals within these companies or an outside hacker can access all this data. Such break-ins have happened several times before (*,*,*). Even without a break-in, these companies are already accessing this data right now by themselves and potentially using it in ways we cannot even detect. Privacy laws do not protect us: it’s impossible to detect violations when private data is stored with these companies.
These web media companies generate profits using our data. Their business model is facilitating advertising. Advertisers working with web media companies can target us by bidding on our gender, our age, our location or even our personal identity. These web media companies control the operating systems of our phones and computers, and the web browsers we use to do banking and communication. They can activate a microphone or camera at any time by pushing an update to the software. We seem to be perfectly okay with the companies already using the data for profit, by performing data analysis about us, and choosing what version of an ad is most likely going to compel us to buy something we don’t need while interrupting the communication, research or entertainment we’re engaged in. They are beginning to use our data to train artificial intelligence, thus taking the value of our information and applying it elsewhere.
As long as the public has trust in these companies, the amount of information and data will grow. It’s like a balloon that’s being inflated with data. It’s precarious: it takes a single needle to pop a balloon. Of course, once the break-in happens, people won’t trust the companies anymore. But there’s so much information there that even a single event can be highly lucrative. It’s irresponsible of these companies not to protect private information in case of an attack on their systems. We shouldn’t expect much: the appropriation of user data is at the core of many of these companies. For example, the founder of Facebook hacked into protected areas of Harvard’s computer network and copied private dormitory students’ images. He then used them to create a website where users ranked two students based on their hotness (*).
The situation is becoming even more dangerous because we are trusting these companies to offer us search results without a bias and documents without tampering. If the power of the internet consumer companies continues to grow, nobody will even know the balloon popped. There’s already evidence that internet consumer companies are becoming involved in politics by tampering with search results (*), buying media companies (*) and sponsoring politicians (*,*). So, when the balloon pops, there will perhaps be no news posts and no search results about it.
Web media companies have earned hundreds of billions of dollars by extracting value from personal and protected data
As a result of the internet development over the past 20 years, the average level of content online has been lowered, many publishers have gone out of business, and we’ve got more advertising than ever. The magazine industry has shrunk by 20% just between 2005 and 2011. The number of newsroom employees has fallen by 40%. We do, however, have web media companies earning valuations measured in hundreds of billions of dollars. Web media companies earned this in large part by matching advertising with the content either taken from the media companies or created by unpaid volunteers, while only returning a tiny fraction of that money to those who created the content. How did this happen?
Above I’ve described how web media companies accumulate and extract value from our personal data. Many of these practices actually developed earlier, with the public Internet. It was volunteers, the webmasters, who created the first websites. Websites made information easily accessible. The website was property and a brand, vouching for the reputation of the content and data there. Users bookmarked those websites they liked so that they could revisit them later, or emailed the creators of websites with suggestions and comments. Some of the websites primarily collected links to other websites and kept the links current and curated.
In those days, I kept current about the developments in the field by following newsgroups and regularly visiting key websites that curated the information on a particular topic. Google entered the picture by downloading all of the Internet and indexing it. It was a Faustian bargain for the webmasters: if they prevented Google from crawling and using the data, their websites could languish in obscurity. But if they did allow Google to crawl, they’d also allow Google to make a copy of the pages and use the information in there for Google’s own profit. Something else happened too: the perceived credit for finding information went to Google and no longer to the creators of the websites.
After a few years of maintaining my website, I was no longer receiving much appreciation for this work, so I gave up maintaining the pages on my website and curating links. This must have happened around 2005. An increasing number of Wikipedia editors are giving up their unpaid efforts to maintain quality in the fight with vandalism or content spam (*,*). On the other hand, marketers continue to have an incentive to put information online that would lead to sales. As a result of depriving contributors to the open web of brand and credit, search results on Google tend to be of worse quality.
When Internet search was gradually taking over from websites, there was one area where a writer’s personal property and personal brand were still protected: blogging. While search yielded results on a particular topic, one could stay current by following blogs on topics of interest. RSS reader software provided a way to maintain subscriptions or bookmarks to blogs. The community connected through the comments on blog posts. The bloggers were known and personally subscribed to.
Alas, whenever there’s an unprotected resource online, some startup will move in and harvest it. Social media tools simplified link sharing. Thus, an “influencer” could easily post a link to an article written by someone else within their own social media feed. The conversation was removed from the blog post and instead developed in the influencer’s feed. As a result, carefully written articles have become a mere resource for influencers, and the number of new blogs has been falling.
Social media companies like Twitter and Facebook reduced barriers to entry by making it easy to refer to others’ content, so much so that the pool of influencers became a simple rich-get-richer phenomenon: famous personalities from mainstream media also became the most followed personalities on social media. Social media companies then used the social relationships and communities and started inserting their own advertising. This way, even social media has started withering. Part of the rise of podcasting is due to the inability of social media to interfere with podcast subscriptions, which are managed through special apps (*,*). But it’s just a question of time before podcasting gets aggregated too.
How advertising fails as a business model for journalism
To earn an income with free content, publishers sold advertising space for banner ads. Ad tech companies like DoubleClick (later acquired by Google) sold ad space on behalf of publishers in exchange for a cut of the revenue. Because of the lack of competition in ad tech, the revenue share continues to be unfavorable for publishers. Moreover, abundant advertising fraud led to over $7B in revenue going to fraudsters rather than to publishers (*).
As a result, web advertising is hardly lucrative: the revenue that can be generated with web page advertising is measured in mere cents per hour, whereas subscription revenue from newspapers and magazines was easily measured in dollars per hour. At the same time, online content is fundamentally unprotected through conventional copyright. The creation of print content and photographs and the collection of links to other relevant pages end up being a resource that gets harvested by search engines, social media, and content farms, which end up extracting most of the financial value.
For example, a search engine will extract the title and the summary, and reuse them in their page with search results — but the publisher will not partake in lucrative ad revenue displayed on the search result page. Social media will similarly repurpose the photos, headlines, and summaries to create an attractive news feed, and similarly, won’t share the lucrative targeted ad revenue with the creators of those. A content farm will re-use the hard work of journalistic reporting by creating a derivative article for a fraction of the cost — which can be published only minutes or even seconds after the original publication.
To increase revenues in such an environment, publishers have made the advertising increasingly obstructive, eroding the privacy with tracking, slowing down page loading, increasing the amount of data consumed as well as shortening the battery life. This led more and more users to employ tools such as ad blockers (*), ad-blocking browsers (*,*), and offline reading apps (*,*). These tools strip content of ads and thereby publishers of revenue. Google’s Chrome browser is planning to start blocking ads (presumably non-Google ones, further consolidating their already dominant market share) in 2018 under the pretense of “usability” (*). Google and Facebook engage in censorship under the pretense of fighting “fake news” (*,*), although better proposals exist (*).
The success of paid content business models and continued erosion of digital advertising
I’ve recently realized I read web content less and less, and read ebooks more. It’s true that web articles are often shorter and convenient, but I find I save a lot of time by reading well-researched and well-written ebooks from reputable publishers. It’s not even necessary to pay a lot to purchase ebooks: one can just borrow or rent them from public libraries and online stores that support lending. Public libraries have spent over 6% of their total materials spending on ebooks.
Why are ebooks better than web articles? Ebooks have a better business model than web pages: when an ebook is sold or lent, writers and publishers earn income. The income allows writers to do quality research and writing. Income also allows publishers to do quality selection, editing, design and distribution. Income is especially important at a time when publishing quality content on the web is increasingly about volunteering rather than about livelihood, and livelihood is about satisfying advertisers or, more recently, sponsors. The sponsored content or native advertising model is about presenting advertising as content, so that readers think they’re reading an article when in reality they’re reading an advertorial.
The web media firms that have successfully charged for content are worth considerably more. The Financial Times was sold to Nikkei for $1.3B, with a circulation of 1.3M. The Economist was valued at $1.5 billion through the sale of Pearson’s stake, with a similar figure of 1.3M subscribers, and reaching 11M digitally. These publications are thus worth roughly $1K per paid subscriber. On the other hand, newspapers that aren’t restricting access to their content are worth considerably less per customer: The Washington Post sold for $250M with a paid circulation of about 400K, even though its digital reach is 76M. The Boston Globe and its affiliated New England media assets sold for only $70 million with a reach of 571K.
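The per-subscriber comparison above is simple division, and a minimal sketch makes it easy to re-check. The prices and audience sizes are the ones quoted in the text; the names `PUBLICATIONS` and `value_per_reader` are only illustrative. Note that the Boston Globe figure is given as "reach" rather than paid circulation, so its ratio is not strictly comparable with the other three.

```python
# Sale price or valuation (USD) and audience size, as quoted in the text.
# The Boston Globe number is "reach", not paid circulation, so treat its
# ratio as a rough indication only.
PUBLICATIONS = {
    "Financial Times": (1_300_000_000, 1_300_000),
    "The Economist": (1_500_000_000, 1_300_000),
    "The Washington Post": (250_000_000, 400_000),
    "The Boston Globe": (70_000_000, 571_000),
}


def value_per_reader(price_usd: int, readers: int) -> float:
    """Return the sale price divided by the audience size."""
    return price_usd / readers


VALUE_PER_READER = {
    name: value_per_reader(price, readers)
    for name, (price, readers) in PUBLICATIONS.items()
}

if __name__ == "__main__":
    for name, value in VALUE_PER_READER.items():
        print(f"{name}: ${value:,.0f} per reader")
```

Under these figures the two publications that charge for content come out at roughly $1,000 to $1,150 per subscriber, against about $625 for The Washington Post and a little over $120 for The Boston Globe.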
Paid business models are technically expensive to implement for smaller newspapers. But even more importantly, the management fears that increasing the price will drive down the number of readers, who have been spoiled with free content. Thus, while Financial Times and The Economist picked the path of staying financially independent from advertisers, most American newspapers and newspapers like The Guardian chose larger audiences, while continuing to shrink and cut costs.
The give-away of content by national newspapers has led to the vanishing of many blogs, smaller newspapers, and magazines. Now even these national newspapers are threatened. They have hoped that growing digital audiences would lead to growing ad revenues, using the model of renting space alongside the articles to digital ad networks. However, consolidated digital advertising networks like Facebook and Google are a strong adversary. These networks are inviting newspapers to syndicate their content to Google search results and Facebook news feeds in exchange for a fraction of the ad revenue. In the meantime, Google and Facebook can play the game of favorites, and hold all the customer data.
How content gets devalued by using the Internet as a promotional tool
Perhaps the most dramatic downfall of a content industry has happened to music. Between 1996 and 2014, 75% of global music revenue evaporated, going from $60B to $15B (*). The annual revenue per capita in the US dropped 67% to $26 between 1999 and 2014 (*). The number of full-time music artists dropped by 42% between 2000 and 2015 in the US (*). An average American still spends more than 4 hours per day listening to music: this amounts to less than $0.02/hour of music, and only a fraction of that actually goes to its creator.
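The per-hour figure follows from the numbers in the paragraph above, and the arithmetic can be sketched as below. The constants come from the text; the function names are only illustrative. Because the text says "more than 4 hours", using exactly 4 hours gives an upper bound on revenue per listening hour.

```python
ANNUAL_REVENUE_PER_CAPITA_USD = 26  # US figure for 2014, from the text
LISTENING_HOURS_PER_DAY = 4         # text says "more than 4", so a lower bound
DAYS_PER_YEAR = 365


def revenue_per_hour(annual_usd: float, hours_per_day: float) -> float:
    """Annual per-capita revenue spread over a year of daily listening."""
    return annual_usd / (hours_per_day * DAYS_PER_YEAR)


def fractional_drop(before: float, after: float) -> float:
    """Share of the original value that was lost."""
    return (before - after) / before


REVENUE_PER_HOUR = revenue_per_hour(
    ANNUAL_REVENUE_PER_CAPITA_USD, LISTENING_HOURS_PER_DAY
)
GLOBAL_DROP = fractional_drop(60, 15)  # $60B to $15B, from the text

if __name__ == "__main__":
    print(f"Revenue per listening hour: ${REVENUE_PER_HOUR:.4f}")
    print(f"Global revenue drop: {GLOBAL_DROP:.0%}")
```

With these inputs the result is a little under 1.8 cents per hour, consistent with the "less than $0.02/hour" claim, and the $60B to $15B decline works out to the stated 75%.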
The fastest growing revenue source for music is digital streaming. The business model for digital streaming is still based on the radio, which paid a relatively insignificant amount in exchange for creating exposure to music. However, digital streaming offers millions of channels instead of maybe a dozen that existed when the business model was created. With old radio, one had no choice what to listen to, and had to buy the album to be able to listen to a song at will. But digital streaming services offer this ability while still only paying the artist the amount a radio station would. Several artists have opted out of streaming services (*), even though it’s difficult to do so (*). Independent musicians are furthermore at a disadvantage, receiving as much as 10 times less money per play than large music publishers, who often hold ownership stakes in digital music streaming services.
Yet in the big scheme of things, digital streaming music companies are not alone at fault for the downfall of the music industry. They are still trying to charge for access to music under an all-you-can-eat subscription model while giving away some content for free combined with some advertising. The real problem is the ad supported model: if a website or an app offers any piece of music for free with some optional advertising to everyone in unlimited quantities — it’s very hard to persuade customers to purchase it. With human nature as it is, customers are eager to look for the cheapest option. And the epitome of the cheapest option these days happens to be YouTube.
YouTube has over a billion users watching millions of hours of video per day (*), and generated over $4B in ad revenue just in 2015, but has paid only about $2B to rights holders in the decade from 2007 until July 2016. The most popular YouTube search term is “music”. The revenue is only a small part of the value generated for Google: there are also goodwill and generated data. Goodwill, the brand value generated for YouTube by giving content away, is another part of the profit for the company, and it doesn’t get measured in the revenue share calculations. The generated data allows Google to create detailed viewing profiles for users that allow it to optimize ads. YouTube is free, while at the same time, people are perfectly happy to pay for a drink, a meal, a cab ride or a vacation.
The primary way of monetizing content these days is the subscription model. How does that work? A subscriber pays a fixed fee every month and receives unlimited access to the content. Examples of this are Netflix for video and Spotify for music. It’s a bit like an all-you-can-eat buffet: a fixed payment for an unlimited quantity of food. It’s seemingly attractive, but to make it work, there is a small number of high-quality items and a large amount of inexpensive filler content. Netflix provides a trial account as high-quality video is still scarce online, but Spotify has to compete with YouTube by providing a free ad-supported tier. As long as YouTube can freely provide content, only a minority of the potential market will be willing to pay extra. Why would a student pay $10 for an album if they can get an unlimited Spotify subscription for less than $5 per month? Yet another consideration for the holder of content rights: they only have a limited amount of information about what’s happening with their content, the information is delayed, and it’s difficult to trust and audit what they do get.
The news websites have tried the “metered model” — where the ad-supported tier only carries a limited number of articles that can be read. Maybe such a model will also appear in music. But the fundamental problem is that articles are still fundamentally treated as “free”, and that the news media continue to value their editorial role even though the editorial role has largely been transferred to the social media “curators” who make use of inexpensive raw material of articles.
Subscriptions have been seen as the solution for content monetization online. Yet, at the same time, all-you-can-eat restaurants are a tiny minority of all restaurants. Subscriptions will not include top content without additional payment. The selection of what is free and what isn’t will be intrinsically subjective and expensive to negotiate. In the meantime, as long as a large variety of content is available free of charge, the wide accessibility of a large quantity of content on Spotify makes it difficult for bands and labels to offer digital downloads of their music. Already, Apple’s iTunes model of selling uniformly priced singles in addition to albums made it much cheaper to assemble an album composed of outstanding tracks. The subscription model is just the next step in this direction. The ad-supported model is a jump to zero. This price competition is leading the industry in a downward spiral which ultimately reduces creativity.
How some companies benefit from the rhetoric of “free” content, and how the hopes of content creators aren’t materializing
With the Internet, both barriers went down: the fragmentation of digital jurisdictions among hundreds of geographies, and the lack of technical ability in the judiciary and law enforcement, led to copyright laws being violated on a regular basis. The ease of copying or modification removes the investment and delay that previously existed for physical carriers. Finally, web media companies have successfully lobbied against revamping laws that would limit or prevent effective enforcement of copyright (*) even though that is technically feasible, while passive-aggressively making it difficult to use (*,*). So, at the moment, the DMCA framework from the dark ages of the Internet (1998) remains as it is, with all the limitations (*,*).
Whoever pretends to “liberate” content in this system gets to play the role of a seeming Robin Hood and gets the credit, attention, and resources. Those facilitating illegal piracy became extremely wealthy, like Kim Dotcom (*), or politically powerful, like Pirate Bay founder Peter Sunde (*). In some cases, companies that violate copyright particularly egregiously, such as the music piracy pioneer Napster (*) or Kim Dotcom’s MegaUpload (*), do get shut down. But there are few personal repercussions for the agents: Napster co-founder Sean Parker later helped start Facebook as its founding president, and is now a billionaire (*).
There’s a distinct disunity among creators on the role of copyright. Those creating are also consuming, and consuming more than most others, so having free access is quite appealing. To justify this, they are very happy to also freely share their work with others, engaging in a personal gift economy. Creators often base their work on creations of others, remixing and drawing inspiration from it, but rigid licensing practices make it difficult to obtain the formal permissions. Lack of transparency and one-sided contracts in the publishing industries create an alienation between creators and publishers. As a result, many creators are looking to revise the copyright laws, often by removing them altogether. However, rejection of creators’ rights is short-sighted. This change would primarily benefit intermediaries like Internet media and search companies. And these companies are the ones sponsoring think tank organizations and grassroots efforts that criticize copyright, and funding lobbyists that advocate limiting copyright. With these efforts and with the adoption of the free-content mindset, only advertising models are feasible.
Creators have also been quite willing to offer free access to everyone so as to get discovered and develop a following. It’s a strategy that worked quite well in the early days of the Internet when there was a strong community, a relative lack of content and still functioning avenues to sell content. But in the over 10 years that YouTube has existed, not a single best-selling album got launched on YouTube, and most artists still get discovered and launched through existing industry networks.
A hope has been that free music would increase concert attendance. However, while $13B (in inflation-corrected dollars) of total recorded music revenue in the US vanished between 1999 and 2014, the live concert revenue only increased by $4.1B in this same period: to fill the gap, even a tripling of current live concert revenue wouldn’t suffice (*,*). So, competition for attention by giving content away has only devalued music.
Another hope has been that fans would donate. Yet, the only result is continued commercial failure of the donation model (*). When the music is perceived to be free, and when “sharing is caring” and 18% of American youth think it’s acceptable to upload content to pirate websites (*) — there simply isn’t any value attributed to the content, regardless of the effort that was put into it. When physical property is assured protection by the government — why shouldn’t intellectual property have the same? What makes a landlord more deserving of government’s protection of their investment and property than a scientist, a journalist or an artist? And why not transition from the model of apartment and property leases to voluntary donations to landlords by squatters?
To summarize, with the adoption of the Internet, the protection of content has weakened. It’s not that people aren’t willing to pay for good content: the success of iTunes, Netflix, Amazon and many other examples has already demonstrated that beyond all doubt. It’s that creators have bought into the deceptive optimism that giving away content will grow their audience. Moreover, the proprietary content e-commerce solutions have limited the traditional rights of content buyers, so content creators have been trying to compensate by giving content away. As a result, content has been devalued, and it’s still hard to find an audience. It’s the content creators who are subsidizing this vicious cycle, which is leading to an unprecedented shrinking of content creation industries.
The importance of insisting on a price for content
It takes many years of study to be able to produce music. Then it takes a lot of work to create a good piece of work. Finally, it takes much effort and resources to establish recognition for a single or album, so that it can rise above the din of mediocrity and find its audience. After that, it costs virtually nothing to deliver the content through the Internet. The cost of content is not the cost of delivery, it’s the cost of creation. The creators hope to recoup the cost of creation by charging for delivery.
It’s not too different for a farmer, who has to acquire and clear the land, enrich it, select seeds, plant an apple tree, nurture it to maturity, and then protect it from pests. Once the apples are ripe, it’s very little work to pick them. But this ignores the vast amount of time and effort that had to be put into those apples beforehand. Societies that don’t protect the farmers’ investment end up in poverty, as farmers stop working the land. This is beginning to happen to the Internet.
The solution to this problem is the creation of a new type of rules for the protection of content creators. The Universal Declaration of Human Rights states (*): “Everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author.” The US Constitution states (*): “The Congress shall have power […] To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries;” Copyright was developed at the beginning of the 18th century and became more widespread at the end of the 19th century. These laws need a very significant update for the present-day Internet. Moreover, to correct for the anomalies of the past two decades, content creators and the general public deserve a part of the value that has been unfairly captured by the web media companies.
Publishers, authors, film production companies and many other owners of intellectual property are in a weak position compared with the concentration of Apple, Amazon, Google and a small number of other consumer web media companies who control the commerce and rights protection technologies. These companies also spend enormous amounts of money on lobbying: Google spent $450M between 2015 and 2016 just in the EU (*). The situation will improve only once standards truly protect the authors, creators and curators while also allowing more openness and competition among the services based on content and data outside of the public domain.
What’s needed is a model governed by standards that allows content rightsholders to control and enforce the cost of acquiring a license to the content. The cost of that license needs to be clear and should hold for everyone, regardless of the business model, be it advertising or acquisition. With this, media companies can develop new innovative offerings to the users, while the clearly priced licenses ensure that the competition is in the quality of offerings and not through content deals. The dream of the universal internet library will then be fulfilled, where any piece of work is accessible for a fair price, without illogical bundles and barriers.
3. Failing to Protect Content
The obsolescence of copyright with advancing technology
In the past, content was packaged into books and videotapes, later DVDs. It was physical objects, the “carriers”, that were bought and sold, even though the value was in the content itself. Carriers could be distributed, sold by a variety of competing vendors in different stores. The scarcity of carriers and the protection of copyright law ensured that the access to content was priced and valued. In addition to copyright, the delay and considerable investment needed for production and distribution of an illicit carrier protected the underlying content.
Physical media were very difficult to copy. Magnetic media such as audio cassettes simplified copying, but the quality of the copy was lower. With the transition to digital media, the copy is perfect. The content industry tried to develop digital copy protection and digital rights management (DRM) technologies. While they do prevent casual sharing to some extent, they also prevent behaviors that people were used to with physical media, such as the creation of permanent private libraries, the creation of backup copies, lending to friends, and the ability to use different devices to consume content. DRM is only applied to commercial content, and fails to protect our personal data and many other types of content and data. But the most important flaw of DRM is that it’s fundamentally inadequate: the protection can always be broken, and the bootleg copy uploaded to the internet.
The unintended effects of digital rights management technologies
Media companies understood that with digital technologies, digital content can be copied more easily than ever before. They sought protection for their digital content products through DRM technologies. Implementing a DRM system is difficult: it requires low-level integration with the operating system, the ability to offer a positive buying experience to end consumers, as well as the business ability to maintain partnerships with content rightsholders. As a result, few companies had the resources to develop DRM: Apple, Amazon, Google, Adobe and Microsoft. These companies were in a powerful position and they tried to leverage that.
For example, when ebooks were a novelty, I bought a Kindle to read them. Before the Kindle, to get an ebook I’d have to use a computer to buy it from a website and then transfer it to the reader with a USB cable: a complex feat for many people, requiring special software to run on the computer. The Kindle allowed books to be transferred wirelessly. That was the era of the Kindle and the Nook, products that only large companies could deliver: it required 1) building a hardware device, 2) creating custom software, 3) acquiring the content from a number of publishers, and 4) launching and supporting millions of customers. It took an Apple to release the iPhone. It took a Microsoft to release Windows. It took Barnes & Noble to release the Nook, and Amazon to develop the Kindle.
Adobe and Microsoft tried to develop general-purpose technologies that were to be used by other stores. This is a harder problem, and Adobe’s ebook DRM technology has proven not to be very user friendly. Consequently, 75% of all ebook sales now happen through just one company, Amazon. Amazon does not have to support open standards (*): a customer can’t read a purchased book on a device or with software that isn’t Amazon’s own (*). This limits innovation in e-book reading technology. Unlike with paper books, it’s very difficult to legally share one’s library with others. All the purchases are locked into an Amazon account, at the mercy of the company operating the service (*), and Amazon can even arbitrarily remove purchases from customers’ libraries (*).
It gets worse: Amazon controls the prices, the selection (*), and the reading experience, and is monitoring every individual page view of every reader. Amazon uses price as its main competitive advantage over other retailers and has a history of driving competitors out of business (*) through predatory pricing (*), which it can afford because of its size (*,*). Amazon has started reducing payments to authors based on monitoring of the e-book reading applications (*).
Amazon even started its own publishing unit (*). Other publishers can buy ad space on Amazon to promote their products. But Amazon can feature its own books on the landing page, or include them in automated product recommendations. With the search data, it can prioritize what books to publish. Amazon’s leadership now controls important national news media (*) and is now expanding into education (*). Ironically, the judicial system perceived Amazon as an underdog when publishers tried to develop an agency pricing model for ebooks (*).
As a result of media companies’ quest for content protection, a very small number of technology companies have developed a commanding share of the paid media market: Amazon, Apple, Netflix and Google. These companies are involved in politics and have considerable ability to affect pricing, presentation and access to content for hundreds of millions of people. There are very few controls on what these companies can or cannot do, and with their growth and the degradation of journalism, it is becoming increasingly difficult to monitor and regulate them. A better approach than DRM is needed.
These days, I prefer to read ebooks on my smartphone or a standard tablet: the pages turn faster, and I don’t need to carry another device. In fact, over 80% of American cell phones sold today are smartphones. I’ve been using several different apps to read my books. Some of them are web applications which don’t even require installation. Since an ebook is technically not much different from a saved web page, there’s no need for special ebook stores or store-brand reading devices. Ebooks have become an internet technology. It’s the time of Linux and Android, not Windows. There are far fewer barriers to entry for new publishers.
The role of libraries in making content affordable and accessible
For those who can’t afford to buy ebooks, it’s now increasingly possible to borrow ebooks from public libraries for free. Libraries are transforming themselves from warehouses of paper into curators of publishing. Libraries protect the rights of publishers and authors, buying ebooks using funds provided by the public, by members and by donors, to make verified and high quality information available to children, students and others.
A large part of humanity’s information, knowledge and entertainment isn’t yet easily available online. There are wonderful books only found in libraries, fascinating video materials and recordings only found in archives, children’s cartoons only available on DVD, performances only accessible at rare and expensive metropolitan venues, lectures only happening at certain universities. Why, then, are these materials not accessible to everyone online?
It’s not just that digitizing physical content costs money, but that tickets, tuition and physical books help pay for the creation of these materials. In fact, the creators of these materials have a justified fear that once the content is digitized, unlawful copying or piracy will deprive them of most of their income, just as has happened with music. When content does become available, it’s usually available for a limited period of time from a single source. This model of licensing is very similar to the broadcast model, where the buyer of the license pays a fixed amount in advance. Such licensing deals tend to be viable only for a limited selection of works that are extensively promoted. It would be unaffordable to create a comprehensive library of content using broadcast licensing.
In summary, more content will be available when protection and licensing practices are updated for the Internet era. Instead of trying to maintain practices from the publishing and broadcasting industries, a new paradigm of data licenses needs to be developed. Data doesn’t stand alone, but it depends on what is known about it, its reputation, quality, origin. In that respect, the next chapter will examine the role of the middleman.
4. Protecting the Middleman
How do products get developed and discovered?
Creators put their love and care into producing something of quality, be it a book, a song, a film, or a physical product. We celebrate creators for their individual achievements, but in reality, creators don’t get very far without a team. A prerequisite for a quality product is training the creator and helping them develop their skills. Then, the creator needs collaborators and funding to create a high quality product. And finally, the product needs to be checked, certified, introduced to the public, and then distributed. This last stage is often dismissed as marketing, but it’s exceedingly important. In fact, the distribution of the product is itself a part of creation.
While the creator is intimately familiar with the product and its qualities, a customer initially knows nothing about it. The task of the curator is to create a bridge between the customer and the products. Customers have their own motivations, problems and interests. A curator seeks to understand the customers’ state of mind, and to present the available products so that their value is clear and the customer can explore and select. Moreover, the curator seeks to protect the customer from low quality and high prices. Finally, the curator retains a long-term relationship with the customer, as it’s rare for a customer to maintain strong relationships with every individual product’s creator.
The job of a storekeeper or merchant includes the work of a curator. Running the business of a store also requires procuring, storing, displaying, and protecting the merchandise, handling accounting, taking payments, and managing complaints, returns and refunds. With all this complexity, the curatorial role can easily be neglected: most of us have met a store assistant who knew little about the products they were selling.
Amazon does not have to create the pleasant environment of a bookstore and doesn’t employ staff to provide guidance to customers. Amazon doesn’t have to display products and let customers try new ones. But Amazon benefits from brick-and-mortar stores doing that. A buyer can enjoy the curated selections in a physical store and then, having discovered a product, go to Amazon and search for the product they already know. This practice is referred to as showrooming. With Amazon’s scale, it’s hard for most curators or brick-and-mortar retailers to compete on cost.
Indeed, a store is a mix between a warehouse and a showroom or a gallery. Customers can explore products and ask the store employees for recommendations. If there’s a problem with the product, the store provides assurance and returns. It’s expensive to provide recommendations and allow a customer to explore the products. When stores do not provide such service, the publisher needs to market the products by broadcasting. While this isn’t a problem for consumer products like toothpaste, it’s a much more significant problem with small-volume products like books and fashion. That’s why stores play an important role in discovery. Their role is similar to that of museums, as they create curated collections of products. Tourists travel far to spend time visiting shopping districts filled with boutiques.
On the other hand, online stores like Amazon are the equivalent of a warehouse: it’s hard to discover anything new unless you know exactly what you’re looking for. Amazon and iTunes provide a user interface that has changed very little from the 1916 self-service store patent by Clarence Saunders. Before self-service, a customer would order an item from a shopkeeper on the other side of a counter. Saunders’ patent proposed that customers walk through the store, collecting products from the shelves, putting them into a cart, and paying at the checkout. Various devices, like one-way turnstiles, prevented thievery. The self-service concept was quickly imitated across the industry.
Online retailers are encouraging showrooming and attempting to steal the efforts of curators, giving them no recognition or compensation. For example, Amazon introduced the Amazon Shopping App in 2011. A customer can use this app to scan the barcode on a product, and buy it for a lower price from Amazon online (*). The store had to pay to attract the customer, stock the products, let customers browse in a pleasant environment, and work to match the products to the customer. The store is now deprived of income, and the local community is deprived of sales tax and later suffers as retail stores go out of business.
Over the past two decades the circle of curators has shrunk. The number of bookstores has fallen from over 38,500 in 2004 to fewer than 25,000 in 2016, a 36% decrease (*). The number of independent record stores has fallen from over 3,300 in 2003 to under 1,600 in 2013, a 52% decrease (*). The number of movie tickets sold is predicted to fall from 1.6B in 2003 to 1.0B in 2016, a 36% decrease (*).
As these curators are vanishing, the mass-marketed hits and best-seller lists are beginning to dominate sales. In the past, the curators used to enable the “middle class” of quality content. That content was not being mass marketed, but was still widely available through theaters, through reviews, and through stores. There, content was picked up by enthusiasts who then recommended it further and gave it enough distribution for the products to be financially viable. But now, much content is no longer as viable as it used to be (*,*). Big hits are becoming bigger than ever: in the summer of 2016, 20 of the top 100 tracks spinning on Pandora belonged to Drake, according to the Pandora Top Spins chart (*).
Many retailers are responding to the problem of showrooming. They don’t want to carry products where they won’t offer the best price. They look to be the exclusive retailer for products, sometimes through custom products or through store brands. We can already see deals where certain movies or TV shows are only accessible through one of the stores. It’s a frustrating situation for consumers, who are sometimes forced to acquire a whole monthly subscription to just watch one show or listen to one album. With digital products, content is fundamentally the same regardless of the source — and it would be most sensible to fix the price and allow different retailers to instead compete on the quality of curation and delivery.
In the meantime, top-tier manufacturers such as Apple are opening their own stores, partly in response to retailers’ falling ability to curate. Many publishers have tried this as well, but the results have been somewhat mixed. Digital marketing is expensive if the customer lifetime value is less than $100. While a publisher knows how to select and produce quality content, creating a bookstore environment is not their core competency. Moreover, retailers do not want to help customers discover products just so that they would later be bought directly from the supplier. Some publishers make an effort to assure all their retailers of their impartiality.
The demise of the reviewer
A large part of the curation of products is done by our friends. They purchase products and test them, and then recommend them to others. Early adopters, whom Malcolm Gladwell refers to as “mavens” in his book “The Tipping Point”, can spend a lot of time and money on staying ahead of others in knowing what’s best. But these are acts of volunteerism, not paid professional work. Few people can afford to develop true expertise on products. In their heyday, professional reviewers were employed by magazines and newspapers, which had the resources to systematically evaluate a wide selection of products and create high-quality reviews.
The Internet has made it a lot easier for anyone to publish a review. While such volunteer reviews are cheap and plentiful, their authors typically don’t have much expertise in the product or in evaluation techniques. Professional reviewers’ work is also used without compensation by “review aggregators” such as Metacritic or Rotten Tomatoes to create aggregations which are then offered online to end-customers. Some of the largest review aggregators have no qualms about uncredited data theft (*). E-commerce companies like Amazon solicit product reviews from their customers, offering no compensation and deprofessionalizing reviewing.
As a result, there is an increasing number of unreliable and deceptive reviews (*), and customers need to spend an increasing amount of time researching. The best-seller lists can be easily manipulated by large publishers who have the resources to simply purchase their own products. Social media “influencers” can leverage their following to solicit positive reviews, views and ratings. Because customers trust such lists and recommendations, an initial fake purchase will be followed by real purchases, so such manipulative actions can be highly profitable (*).
Reviewing communities bring together enthusiasts and experts who collaborate to develop their expertise on a particular topic, just like academic and hacker communities. The community provides a reward for sharing, but also peer review, which discourages fakery and deception in reviews. Because of this, and because the reviews are given away to the general public, the websites that bring the communities together are attractive acquisition targets. The book reviewing community Goodreads and the movie database IMDb have both been acquired by Amazon (*,*). Thus, the reviewers who thought they volunteered their work to the community have instead merely donated their work to the companies who hosted the communities. It was these companies who received the money from the acquirers without having to compensate the contributors. The work of the contributors and their social connections are now effectively owned by Amazon.
Automation of recommendations
The internet media companies think reviews and reviewers will soon be replaced by artificial intelligence and recommender systems. Such automated recommendations use the data we voluntarily submit about what we like or dislike. For example, one who liked several action movies might be recommended other action movies. The technology was pioneered by Netflix, which realized that marketing drives people to see heavily marketed recent blockbuster movies, but that it was expensive to purchase such a large number of DVDs. So Netflix sought to persuade its customers through recommendations to instead order an older, cheaper movie that they would still like. This proved very profitable for Netflix, and gave it a key advantage over other DVD rental companies.
When Amazon notices that we’re considering a particular product, it lists other products that were frequently bought alongside it. For example, if you look at the product page for Tolstoy’s War and Peace on Amazon, you will also be shown other Russian classics. Amazon’s goal is to increase the number of sales, while minimizing the human effort of curation.
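The “frequently bought together” idea can be illustrated with a minimal sketch. This is not Amazon’s actual system, whose details are not public; it is a toy co-purchase counter written under simple assumptions. The function names `build_co_purchase_counts` and `recommend`, and the sample orders, are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_purchase_counts(orders):
    """Count how often each pair of products appears in the same order."""
    counts = defaultdict(Counter)
    for order in orders:
        # Use a set so a product repeated within one order is counted once.
        for a, b in combinations(sorted(set(order)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, product, k=3):
    """Return up to k products most often bought together with `product`."""
    neighbours = counts.get(product, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical order history, invented for this example.
orders = [
    ["War and Peace", "Anna Karenina", "Crime and Punishment"],
    ["War and Peace", "Anna Karenina"],
    ["War and Peace", "The Brothers Karamazov"],
    ["Anna Karenina", "Crime and Punishment"],
]

counts = build_co_purchase_counts(orders)
print(recommend(counts, "War and Peace"))
```

In this sketch a product can only be recommended if it has already been bought together with something else, and the suggestions always resemble what the customer is already looking at.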
This automation seems convenient, but there are important problems. Quality curation is often about exposing a visitor to new experiences, so that they develop a broader perspective. On the other hand, recommender systems seem to draw an unsuspecting customer deeper and deeper into the rabbit hole they might have crawled into. For example, someone who bought a right-wing political book was just recommended more right-wing books, and someone who has bought a left-wing book was recommended even more left-wing books, based on the research into Amazon’s product recommendations (*). This can lead to dangerous political polarization, creating a deep rift in society.
Another problem with such product recommendations is that new products have not been bought by anyone yet. So, a manufacturer with a large marketing budget can pay people to buy the product and thus populate the recommendation systems. Somewhat ironically, Amazon also offers its suppliers an option to buy advertising on amazon.com (*). The digital advertising market is highly centralized, with Google and Facebook controlling almost 50% of digital advertising dollars in the US in 2016 (*). But an early adopter or discoverer of a quality product has no economic incentive to spread the word about it; the only reward comes through the non-monetary reputation economy.