Digital Publishing 101

SFU Publishing Workshops – August 7, 2012

A one-day primer with John Maxwell of SFU Publishing, for writers and publishers who want to understand the range of digital options open to them. We’ll introduce and discuss a range of topics, provide examples and demos where possible, and then focus on your own contexts: how to take these technologies home and implement them effectively.


Part 1: Introductions and agendas (9:00am)

I like to run this class as a conversation as much as possible, so we’ll take a little time to learn who we are and where we’re headed, collectively and individually.

Part 2: Foundations (~9:30 am)

We begin with some storytelling to set the stage, introducing some of the major concepts, forces, and systems that shape digital publishing in 2012.

The Internet:

Origins with Sputnik in 1957 and the “space race” between the USA and the USSR, which spurred the United States to pour millions of dollars into science and technology research and development. Much of this went through a Pentagon-based program called the Advanced Research Projects Agency (ARPA).

ARPA’s Information Processing Techniques Office funded much of the computer-science research in the USA in the 1960s. Director J.C.R. Licklider pursued a vision of “man-computer symbiosis” as opposed to the (then-dominant) artificial intelligence research. This came to mean interactive computing, and the idea of the computer as communications device.

By 1969, they had developed a network that connected disparate computing resources across the United States. By 1974 they were talking about an “Internet,” with a set of standardized protocols.

By about 1983 the thing was established in its modern form: an end-to-end architecture.

  • a distributed network
  • a heterogeneous network
  • a packet-switching network
  • with a peer governance model
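
To make “packet-switching” concrete, here is a minimal Python sketch. It is an illustration only, not how any real Internet protocol is implemented; the function names and the 8-character packet size are assumptions chosen for the example. A message is cut into numbered packets, the packets arrive in a different order, and the receiver puts them back together by sequence number.

```python
import random

def to_packets(message, size=8):
    """Cut a message into (sequence_number, chunk) packets."""
    return [(i, message[start:start + size])
            for i, start in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Rebuild the message by sorting packets on their sequence number."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "Packets may take different routes across the network."
packets = to_packets(message)
random.shuffle(packets)  # simulate out-of-order arrival
print(reassemble(packets) == message)
```

No single route or central switch is needed: each packet carries enough information to be put back in place at the far end.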

That is, a very different thing from the existing telephone network—yet flexible enough to ultimately eat the phone network whole.

In 1990 Tim Berners-Lee invented the World-Wide Web, which put a ‘universal’ interface on Internet resources.

The Internet has been successful largely because of the use of free software in its infrastructural components. Conversely, free software has largely been successful because of the Internet.

Today, over 2 billion people have access to the net. Google indexes something like 50 trillion pages.

KPCB research reports that mobile internet is growing vastly quicker than the desktop did.

This is a massive communications revolution. It is not comparable to the advent of television; nor to the printing press. It is much bigger. We have only just begun.

The state of computing today: a networked society

Interlude: “A Writing Revolution”

YouTube: 72 hrs of video uploaded EACH MINUTE (2012 statistic; earlier figures: 24, then 35)

James Bridle suggests that we have replaced the book as the primary place where we embed our collective memories; we now use the network.

The ebook: a narrative

Technological determinism is the perspective or frame of mind in which technology appears to move forward according to its own logic. For example, we tend to think of electronics and gadgets in fairly deterministic terms: we expect things to get faster, better, cheaper, smaller.

We’ve watched Star Trek for several generations and have seen electronic devices as information sources. Also: sliding doors instead of hinges.

In the late 1990s, Shawn Fanning’s Napster showed what could happen to an industry with a static business model based on manufacturing and distribution efficiencies. Music industry execs woke up one morning to find that millions of people were sharing digital files.

This would happen to books next, wouldn’t it? It seemed/seems inevitable. We will read on screens, as soon as some magical threshold of clarity/cheapness/lightness/ease of use/screen quality/technical standards is crossed. The seven signs of the apocalypse. A tipping point. Perhaps we’ve passed it already. Amazon keeps telling us we have.

In a sense, it’s very true. More people are reading (and writing) more text today than at any other time in history. Almost entirely on screens. But is that about ebooks?

The “e-book” is a concept in almost continuous circulation since the 1990s, in which a roughly book-sized reading device becomes the standard interface for reading text.

It satisfies our desire and expectation to one day live in Star Trek; it gives the publishing industry–and writers–a fairly comfortable, predictable future to look forward to (in which they still clearly have a role).

The first wave of e-reader devices hit the market in the late 1990s; they were an utter flop. Nobody actually wanted one.

The second wave began in 2007, when the first e-Ink devices hit the streets. e-Ink, an alternative screen technology, was expected to drive the market to the tipping point. Especially when Amazon released their device, the Kindle. Here was a reader permanently connected to a serious bookstore (this had never happened before).

Around the same time, almost the entire industry (except Amazon) slowly standardized on a common ebook file format: ePub. This was long considered to be one of the seven signs.

Lots of bookstore-branded e-readers appeared, mostly using the e-Ink technology: Kobo (Chapters), Barnes & Noble, Borders. And, at the same time, more folks read ebooks on iPhones than on dedicated readers; and many more on PCs (as PDFs).

Apple’s iPad was supposedly the seventh seal, popularly in the form of a unicorn. It came, it went.

Now, tablets and readers are emerging with great rapidity. A Pew study suggests that 29% of American adults have either an e-reader or a tablet device. And yet…

The digital marketplace

Books are, in a sense, a pawn in a much larger game, played at a very high level, by some of the largest companies in the world: Amazon, Apple, Google, Microsoft.

The bigger game (people are saying, “big data”) is about access to individuals, about who gets to act as the platform for everything you do online: communicate… shop… learn… remember. Books are a convenient entry point to all that, for unsurprising reasons.

But make no mistake: it may have begun with books, but it certainly does not end there.

The collateral damage done by this massive ‘scaling up’ of an industry is terrifying for publishers, writers, booksellers, who are tantalized by the possibilities… while being blown like dry leaves in a storm.

For a little insight into Amazon, see the writing of James Bridle and Eric Hellman.

Part 3: The WWW and the Markup-based Paradigm (~11:00 am)

The Web as publishing platform for the 21st century.

Ebooks or not, digital publishing in the early 21st century is based on the web and web technologies. This is a good thing.

Berners-Lee’s 3 simple inventions: HTTP, URL, HTML
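
As a rough illustration of how the three pieces fit together, here is a small Python sketch. It does not touch the network; the address is a hypothetical one, the function names are made up for this example, and the request shown is the bare minimum (real browsers send many more headers). The URL names a resource, HTTP is the plain-text request for it, and HTML is where the URL gets embedded as a link.

```python
from urllib.parse import urlsplit

def describe(url):
    """Split a URL into the parts needed to fetch a page."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path or "/"}

def http_request(url):
    """Build a minimal plain-text HTTP/1.1 GET request for this URL."""
    info = describe(url)
    return f"GET {info['path']} HTTP/1.1\r\nHost: {info['host']}\r\n\r\n"

def html_link(url, label):
    """Wrap a URL in an HTML anchor element."""
    return f'<a href="{url}">{label}</a>'

url = "http://example.org/workshops/digital-101.html"  # hypothetical address
print(describe(url))
print(http_request(url))
print(html_link(url, "Digital Publishing 101"))
```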

Web technologies are open, shared, non-proprietary. Also robust and mature enough–and dynamic enough–that it no longer makes sense to conceive of a fundamentally different platform.

The result is that ebook standards are based on web technologies; app development is based on web technologies…

The digital publishing platform of 2025 won’t be a replacement of the web, but an evolution of it.

From a Page- to a Markup-based paradigm: a brief history

Back in the early 1970s, people began to work with computer-driven phototypesetting machines.

Each typesetter had its own set of control codes (markup). Preparing typeset copy was a major job; if you changed machines, you had to do it all again.

At IBM, Charles Goldfarb came up with a solution called Generalized Markup Language: create a standard way of articulating the structural components of text, in a machine-readable, vendor-neutral way. Make vendors serve the standard, instead of users serving the vendors.

The basic idea was “Separation of Concerns” – keep the semantics separate from the formatting & mechanics
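
A minimal Python sketch of the idea, under stated assumptions: the element names (“title,” “heading,” “paragraph”) and both renderers are invented for this example and are not Goldfarb’s GML. The content is marked up only by what each piece is; how it looks is decided separately, and can be decided more than once.

```python
# Content is described by structure only: what each piece *is*, not how it looks.
document = [
    ("title", "Digital Publishing 101"),
    ("heading", "Foundations"),
    ("paragraph", "We begin with some storytelling."),
]

def render_html(doc):
    """One formatting decision: map each structural element to an HTML tag."""
    tags = {"title": "h1", "heading": "h2", "paragraph": "p"}
    return "\n".join(f"<{tags[kind]}>{text}</{tags[kind]}>" for kind, text in doc)

def render_plain(doc):
    """A second, independent formatting decision for plain-text output."""
    lines = []
    for kind, text in doc:
        if kind == "title":
            lines.append(text.upper())
        elif kind == "heading":
            lines.append(text + "\n" + "-" * len(text))
        else:
            lines.append(text)
    return "\n\n".join(lines)

print(render_html(document))
print(render_plain(document))
```

Changing output device means writing a new renderer; the document itself is untouched.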

This was a powerful idea. It was taken up in the 1970s by people with BIG documentation systems — US DoD contractors (and also, at Coach House Press)

By the 1980s, generalized markup had become a big deal: it was standardized, with big industrial applications, vendors, big contracts. US DoD mandated CALS (Continuous Acquisition and Life-cycle Support) documentation from all contractors.

In 1986, SGML became an ISO standard for information processing. Aimed mostly at Fortune-500 companies: aerospace, pharmaceuticals, auto-parts, etc.


In the early 1980s, Adobe invented PostScript, which quickly became the de facto standard for controlling printers and output devices. The PostScript language defined how ink went on a page, incorporating both text (typography) and graphics.

An entire industry sprang up around Desktop Publishing. Aldus PageMaker + Apple LaserWriter. Later QuarkXPress, Illustrator, and then InDesign.

In the early 1990s, Adobe streamlined PostScript for the Web (and the future of print, too) in PDF: a page-perfect digital copy.

PDF has proven enormously successful, largely because it makes a convenient digital version of paper. PDFs behave and circulate somewhat like paper.

Neither PostScript nor PDF is based on generalized markup.


In 1990, Berners-Lee invented the WWW. He created an incredibly simple SGML document type: Hypertext Markup Language, which enabled Internet cross-references to be embedded in documents.
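
Those embedded cross-references are machine-readable, which is much of their power. The following Python sketch, using the standard library’s html.parser, pulls the link targets out of a fragment of HTML. The sample markup, the example.org addresses, and the names LinkCollector and extract_links are all assumptions made for illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href target of every <a> element encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def extract_links(html):
    """Return all link targets in the given HTML, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

sample = ('<p>See <a href="http://example.org/one">one</a> and '
          '<a href="http://example.org/two">two</a>.</p>')
print(extract_links(sample))
```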

The Web exploded in the mid 1990s, and HTML was pulled in a bunch of different directions and lost any conceptual integrity it might have had.

In 1996/97 came an effort to rehabilitate the web: XML, a streamlined, web-ready version of SGML. XHTML was a re-rendering of HTML markup in a canonical form.
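
One practical consequence of XML’s strictness is that any generic XML tool can read a well-formed document. The Python sketch below uses the standard library’s xml.etree.ElementTree on a small, invented XHTML-style fragment. The fragment omits the namespace declaration that real XHTML carries, to keep the tag names readable, and the function names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

xhtml = """<html>
  <body>
    <h1>Digital Publishing 101</h1>
    <p>A one-day <em>primer</em>.</p>
  </body>
</html>"""

def outline(source):
    """Return the element names of a well-formed document, in document order."""
    root = ET.fromstring(source)
    return [element.tag for element in root.iter()]

def is_well_formed(source):
    """True if a generic XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False

print(outline(xhtml))
print(is_well_formed("<p>An unclosed <br> breaks XML rules</p>"))
```

The second call shows the trade-off: markup that browsers tolerated in the mid 1990s is rejected outright by an XML parser.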

Widespread adoption of CSS stylesheets (circa 2000) moved us back to separation of concerns: use HTML to define the structures of the document; use CSS to define formatting.

A tale of two paradigms:

Postscript/DTP vs XML/markup.

Personal computing has cemented the DTP (and WYSIWYG) way of thinking about document production. It has almost made the markup paradigm obsolete.


DTP doesn’t scale. It ties document structure and formatting together. It perpetuates a page-based way of looking at communications.

Industrial-strength document publishing is still done with XML, because the “Separation of Concerns” offers big benefits: long-term re-use, vendor independence, device-independence, ability to distribute work over multiple participants, etc.

But at a smaller-than-industrial scale, XML has been a bit of a flop; it hasn’t caught on as many (myself included) predicted a decade ago. The reasons are several:

  1. XML is conceptually very different from the DTP/PDF paradigm
  2. investment mentality: long-term vs. short-term
  3. ROI largely based on big capital
  4. It’s still labour-intensive
  5. print tools are pretty rare
  6. XML has been pitched as “industrial strength,” meaning complex and hard


Where XML has not flopped is the Web, because HTML is (mostly) XML.

Modern content management systems (blogs, wikis, and beyond) are based on Separation of Concerns.

Web authoring and editing toolkits are free and ubiquitous today. Which means that XML authoring and editing tools are too.

The rise of social media and the link-based economy

Baldur Bjarnason: “the link is the 21st century’s most under-appreciated punctuation mark”

Social media, broadly defined, began early in this century with the rise of blogging. It was, in a sense, “the web done right,” where rhetoric is shaped by currency, frequency, and, most importantly, interconnectedness. Sense inhered not so much in a single text as in a network of texts.

More recent “social media” like Facebook and Twitter are all about linkages: between people, between people and the things they like, and simply to other things on the network.

The importance and centrality of links is conspicuously absent from the world of ebooks, a sign that the ebook is more a part of the old world than the new.

Part 4: Audiences, Readers, Rights (~1:00 pm)

We workshop in the afternoon…

  1. Publications and genres: what are we publishing, and for whom?
  2. Readers and audiences online today
  3. The role(s) of SEO and social media in gathering & maintaining audiences
  4. The many problem(s) with digital rights

Part 5: Practical Publication Strategies (~2:30 pm)

A look at some contemporary practice, strategies, tactics…

  1. Adobe InDesign as the default dinosaur
  2. Whither XML?
  3. Conversion strategies
  4. Web-first workflows —and esp.
  5. ePub Channels