It’s been too long. Back in 2013, I wrote about moving Publishing@SFU’s web infrastructure off of physical servers in my office and onto nice virtual servers on SFU’s shared hosting service. It was so nice to not worry about power outages and other physical-world hassles. It was as though those servers didn’t even exist anymore. Out of sight, out of mind.
The problem with “out of mind” is that it really is that. So over the intervening years, while our website chugged along and gathered a lot of content, we probably didn’t spend as much time keeping it all tuned up and upgraded as we should have. The inevitable would happen… and it did.
We got hacked last Thursday. Or probably at some point well before that, but the site went down on Thursday. Juan and I had a good look at the back end of the site, wondering if we could recover from it. But that server install was fully six years old, well past its upgrade lifespan. So we pulled the plug. Well, virtually we did: we requested a service ticket for someone to pull the plug. Metaphorically, I mean; really, what happened is someone at SFU IT typed some keystrokes and publishing.sfu.ca ceased to exist.
We are back, a week later, with a properly managed and backed-up host on Reclaim Hosting. Interestingly, if you read that same post from 2013, in which I talked about moving to new server infrastructure, I also remarked that we were excited about working with Reclaim Hosting for our (then) new PUB101 course. So after six years of absolutely stellar service from Reclaim on behalf of our students, we are finally moving our own stuff onto their planet as well.
As always, I can’t say enough good things about Reclaim. They are completely on the ball, their priorities are right, and they just keep getting better. Jim & Tim & crew, you’re the best!
Pandoc, the amazingly versatile document production and conversion toolkit, has now been released in version 2.0. Lead developer John MacFarlane describes the move to v2 as “a major architectural change,” and notes that “with each release, pandoc becomes more a team effort.”
A quick browse through the release notes shows a lot of practical improvements and new features. In addition to its already robust handling of plaintext, markdown, HTML, Word .docx, LibreOffice .odt, InDesign .icml, and EPUB2/EPUB3 formats, the new release has a number of additions that publishers and developers should check out. I’ll quote just a tiny bit of the release notes document:
I’ve never ‘required’ a textbook for my classes; given that I’m usually on about digital media, my classes are usually based on online resources. However, this past year, Michael Bhaskar published an excellent book on his theoretical model for understanding publishing, The Content Machine, and I thought this would make an excellent required reading for our grad students.
So I ordered a class set through the campus bookstore, and of course they were late arriving, but by the second week of class, everybody had a shiny red copy of The Content Machine, except two students, who came to me, puzzled, saying that something wasn’t right. Inside the red cover of their books was something else: Broken, by Traci L. Slatton.
When I was younger, and keen as hell about XML as the solution to everything, and working on my PhD, I wrote a bibliographic reference management system. This was circa 2002 or so, and I badly needed to procrastinate from working on my dissertation. There’s nothing like being productive on another project to make you feel good about putting something off. At the time, I was juggling a couple of hundred references, plus notes. I looked at the available options (EndNote, RefWorks) and was not impressed with them, or with any off-the-shelf reference manager. So I wrote my own. I looked at how some of the other systems worked, and made one that was ‘better.’
Over at Digital Pathways: Creating Digital Fiction with Kate Pullinger, I wrote a long-ish blog post on the experience of digital reading, and how we (publishing people) tend to underplay the experiential aspects of reading while we pursue the shorter-term advances of “digital” publishing. I end by appealing to publishers to look to writers and creative people to carve out new genres and new reading experiences, rather than just putting the old ones in digital containers. The post is here:
On May 7th, John MacFarlane released Pandoc v1.12.4 – a significant update that includes many enhancements across the wide range of its reader and writer modules. For publishers, the key enhancement is the integration of a writer module for Adobe’s ICML. This allows Pandoc to effectively export to Adobe InDesign.
Pandoc is a free, multi-purpose document conversion toolkit with an extensible design and some very sophisticated features. It presents itself most straightforwardly as a markdown engine: it reads text files prepared in markdown format and converts them to HTML. But Pandoc can do much, much more than that. It reads and parses no fewer than 10 different structured formats, and can then output to about 35 formats. It does so by parsing each input into a neat internal format, then generating outputs from that as needed.
Its useful outputs include HTML and HTML5, EPUB and EPUB3, ODT and DOCX, LaTeX, DocBook XML, and several HTML-based slideshow formats. As of v1.12.4, it can also output ICML, the open file format for Adobe’s InCopy software, which is directly usable in Adobe InDesign. If you look at that list, you’ll see that Pandoc can form the basis of a single-source publishing workflow: a single editorial file can instantly go to print/PDF, ebook, and web outputs.
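To make the single-source idea concrete, here is a minimal sketch in Python that builds one pandoc command per output format from the same source file. The file names and the particular set of outputs are my own illustrative assumptions, not a recommended setup; the writer names (html5, epub3, docx, odt, icml) and the -s, -t, and -o options are standard pandoc options, but check them against the user guide for your installed version.

```python
import subprocess

# Pandoc writer name -> output file extension.
# An illustrative subset chosen for this sketch, not a complete list.
OUTPUTS = {
    "html5": "html",
    "epub3": "epub",
    "docx": "docx",
    "odt": "odt",
    "icml": "icml",
}


def pandoc_commands(source, basename):
    """Build one pandoc command line per output format, all from one source."""
    return [
        ["pandoc", source, "-s", "-t", writer, "-o", f"{basename}.{ext}"]
        for writer, ext in OUTPUTS.items()
    ]


def build_all(source, basename):
    """Run every command; assumes pandoc is installed and on the PATH."""
    for command in pandoc_commands(source, basename):
        subprocess.run(command, check=True)
```

With a (hypothetical) manuscript.md in the current directory, build_all("manuscript.md", "manuscript") would write manuscript.html, manuscript.epub, and so on. Calling pandoc_commands on its own just shows what would be run, without needing pandoc installed.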
Beyond file conversion, Pandoc has numerous well-thought-out features for managing document metadata, citations and bibliographies, footnotes (possibly the nicest footnoting system ever), math and equation support, images, and page templates. See the Pandoc user guide for details.
If you’re producing books, stories, journals, or articles that are primarily text-driven, and you’re managing multiple tools and processes to produce digital and print editions, you really need to take a good look at Pandoc. It makes most document preparation, conversion, and production tasks trivially easy, so you can spend your time on writing, design, and reach instead.
At this year’s Books in Browsers ‘opening act’ event, Creating Minds, held at UC Berkeley on Oct 23, we heard a number of speakers talking about the coming of the machine voices into our lives. That day, and at the BiB conference proper that followed, there were numerous references to machine cognition, algorithmic poetry, spambots, twitterbots, and the myriad non-humans that co-habit our social and literary spaces these days.
It does not surprise me that my phone—an Android, natch—wants to get in on the act. I’ve installed a swiping (as opposed to pecking) keyboard called SwiftKey that works pretty well; it’s much more efficient to slide one’s finger across the touchscreen than to tap away at little targets. And, of course in 2013, it learns as we go; it picks up my frequently used words—and phrases—and uses those patterns to do a supposedly better job of interpreting my smudgy finger movements as I try to achieve 40wpm. It also does the usual trick of predicting the next word, displaying the three best guesses just above the Qwerty, so I can pick the right one, if I’m so inclined.
And that allows it to generate its own prose poems, sorta. All I do is keep hitting the next guess, over and over again. On the newest beta version I installed (only a few days into learning my stuff), it goes like this:
I am a beautiful person who is the best of luck to you by the way to get the best of luck to you by the way to get the best of…
Cheery little beast, isn’t it? The ending goes like one of those elegant rational numbers with the repeating decimal-place patterns. Anyway, that’s pretty much the out-of-the-box functionality. Now, if I go back to the older version, where I’ve already given it six months of my typing history, things get a little more interesting:
I am a beautiful person who is the Internet and Us to the durability virtue of publishing as a manufacturing issue to publication as the author of the book of the book of the book of the book…
If you know me, you’ll recognize this fragmentary discourse as being pretty close to the blather that escapes from my mouth most days. I do like that endlessly repeating “of the book of the book of the book” mantra. It changes over time, too: when Haig Armen and I were working on our paper on index cards last spring, I would often get a big, allcaps “HINGE” in the first few words… but you’ll have to read further to get the point of that.
But they frustrate me, these autopredicting algorithms. Make no mistake: I for one welcome our spambot overlords, and I think James Bridle’s piece on how the robots are reaching out for love is one of the most poignant pieces of contemporary cultural criticism I’ve seen. But I do wish the makers of these things would take a longer view. Let me explain.
If I’m trying to type “It’s pretty interesting” (as I was just now in an email to my wife) and the thing autocorrects/autopredicts to “It’s pretty interview” or “It’s pretty girl” (those are actual suggestions from my keyboard), well, that’s not really all that helpful. I wish it could make some broader predictions, maybe drawn from some large corpus of fine literature. Maybe then it would predict “It’s pretty intriguing” or “It’s pretty intense”… or how about “It’s pretty irresponsible”? Perhaps I would be delighted (that word that keeps coming up in UX discussions these days) by it, and it would begin to influence the way I write.
So could I please request that some DH project be set up that would glue, say, a corpus of romantic or modernist literature, with lots of good word proximity metrics worked out, into my autocorrect? Perhaps we could all choose, in the Settings dialog, which corpus we’d like to be corrected by? I’m not sure having the machine learn my typing patterns is the best way to improve my writing. But if I could have, say, Keats, or Virginia Woolf’s patterns correcting mine… then we’d have something.
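For the curious, here is a toy sketch in Python of what “predict the next word from a corpus of my choosing” might look like. It is a plain bigram counter, which is my own simplifying assumption; I have no idea how SwiftKey actually does it, and all the function names here are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus_text):
    """Count, for each word in the corpus, which words follow it and how often."""
    words = corpus_text.lower().split()
    followers = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers


def top_guesses(followers, word, n=3):
    """Return up to n of the most frequent next words after `word`."""
    counts = followers.get(word.lower(), Counter())
    return [guess for guess, _ in counts.most_common(n)]


def keep_hitting_first_guess(followers, start, presses=12):
    """Simulate tapping the top suggestion over and over, as described above."""
    out = [start]
    for _ in range(presses):
        guesses = top_guesses(followers, out[-1], n=1)
        if not guesses:  # the corpus never continues from this word
            break
        out.append(guesses[0])
    return " ".join(out)
```

Feed train_bigrams the text of whichever author you’d like to be corrected by, and top_guesses gives you the three suggestions for the bar above the keyboard. Because the top guess depends only on the previous word, hitting it repeatedly falls into a loop sooner or later, which is where mantras like “of the book of the book” come from.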
This fall, I subjected some MPub students to working out a book publishing workflow, using Pandoc, the amazing document processor tool created by Berkeley philosopher John MacFarlane.
Pandoc is a remarkably flexible document conversion tool. It takes text input in a variety of open input formats (most usefully markdown and HTML) and can convert to more than a dozen outputs, including a variety of web-based formats (HTML, EPUB, markdown, and other blogging markup), word processor formats (RTF and OpenOffice’s ODT), and a couple of TeX-based typeset outputs (that is, to PDF). That’s useful, but what makes Pandoc really great is that it works bloody well. It’s solid as a rock, and thoroughly well organized and documented. In short, the attention to detail in it is really superior.
I say that I “subjected” the students to it, because you run Pandoc almost entirely from the Unix command line. That’s a bit of a stretch zone for people raised on the Adobe Creative Suite. But if you’re comfy working with the shell (and even more so if you’re happy with shell scripts) it is stunningly efficient.
For years, the Publishing @ SFU web presence ran off a pair of Mac Minis hidden in my office, both running Linux. One was the main ‘www’ server, the other was ‘thinkubator,’ where we ran experimental stuff. The two machines talked to each other at night, swapping backups, in case either machine were to fail. That arrangement was stable for a very long time. The machines ran Linux, which is so stable that I could ignore them for months and months and years; in truth, until they got really quite stale and out of date.
What prompted the end of this arrangement was the renovation of the bottom floors of SFU Harbour Centre. Over the past year, construction in the building has made the power go out enough times to drive me a little spare. I realized I should join the 21st century and get our main website (which is just a WordPress site) onto a proper hosted service in a reliable location. This spring I moved that site onto SFU IT Services’ own virtual hosting service, and stopped worrying about weekend power outages.
Then, in anticipation of this fall’s incoming MPub cohort we replaced the iMacs in our grad student offices with a suite of brand new machines—nice for them, and it also freed up a squadron of older machines for other uses. Two of these I commandeered as replacements for my old Minis, spruced up with Ubuntu 13.04 Linux, and now occupying the space left by my small stack of Minis and the horrendous old beige 15″ CRT that was hooked up to them (I swear the last CRT monitor in active use at Harbour Centre). One of them is the new tkbr/thinkubator machine, hosting a veritable warren of WordPress sites, two or three Gitit wikis, an experimental Booktype install, various file services (including ownCloud, as soon as I figure out how to serve certificates properly), and whatever else we need. The second one is set up pretty much identically (er, redundantly) but serves primarily as my desktop machine, as I’ve come to the conclusion that every bit of software I actually use anymore can run as well or better on Linux than on MacOS.
I tweeted the other day that it seemed like 80% of what I’d done as a teacher this fall was system administration. It occurs to me that perhaps that’s an accurate reflection of what publishing is actually about in 2013.