JakeSavin.com
http://www.jakesavin.com
“Have no fear of perfection—you’ll never reach it.” – Salvador Dalí

Multiple External Monitors on Mid-2011 MacBook Air
http://www.jakesavin.com/2014/10/31/multiple-external-monitors-on-mid-2011-macbook-air/
Fri, 31 Oct 2014 00:44:36 +0000
I’ve been working with multiple displays since at least 2004. Back in those days I had a 17″ PowerBook G4 with a PCMCIA card that provided a second external DVI output. It was slow, but it worked, and for the programming I was doing the limited performance was not an issue.

For the last few months, I’ve been using a Sabrent USB 2.0 adapter to provide a second video output for my Mid-2011 MacBook Air, so I had two external displays, and the laptop itself for three total. The problem is that this MacBook Air doesn’t have USB 3.0, and USB 2.0 just doesn’t have enough throughput to drive a large display, so it had a lot of lag—too much to really be acceptable.

I’d been looking around for solutions. The problem is that all of the Thunderbolt docks can only really drive a single external display unless one of your monitors is itself a Thunderbolt display, which can work via Thunderbolt pass-through. But recently there have been more and more USB 3.0-based docks that support Mac OS X.

So I picked up a Kanex KTU10 Thunderbolt to eSATA Plus USB 3.0 adapter, and an Etekcity USB 3.0 dual monitor dock for my mid-2011 MacBook Air. This dock has two USB 3.0 ports, four USB 2.0 ports, and two display outputs (HDMI and DVI), and the Kanex adapter should in theory provide the USB 3.0 port which the dock needs.

After a little dance with installing the latest DisplayLink driver (2.3 beta), they totally work.

So for about $200 (only a tad more than the cost of one of the single-display Thunderbolt docks), I’m running with three screens again, and the performance is perfectly acceptable for most things I will ever need to do.

Plus I’ve got gigabit Ethernet, more USB 3.0 ports, and an eSATA port which will be great for backing up my machine to an external drive.

Overall I’m very pleased.

Ported my Radio UserLand Site
http://www.jakesavin.com/2014/10/29/jake-userland-com/
Wed, 29 Oct 2014 19:53:30 +0000

It may take some time for the DNS change to propagate, and there are certainly going to be some broken incoming links, but I just finished the bulk of the work to port my Radio UserLand site here.

All of the posts from jake.userland.com are in a new Jake’s Radio ‘Blog category, in addition to preserving their original categories (some of which overlap with ones that were already here).

That leaves just one more site to port before my entire blogging life lives here at jakesavin.com.

(Of course I have a bunch of other sites too, and I have yet to decide what to do with each of them.)

Big Data: It’s The Analysis, Stupid!
http://www.jakesavin.com/2014/10/24/its-the-analysis-stupid/
Fri, 24 Oct 2014 09:18:28 +0000

We took the verticals that FounderDating Network Cofounder members (those members who have indicated that they are interested in finding cofounders) selected as markets they are interested in starting a company in, and compared the last six months with the same six months one year ago…

- via Founder Dating

It comes as little surprise to me that the verticals that seem to remain pretty stable include commerce, small business, advertising, cloud services, and enterprise. To my mind, this is reflective of how our economy intersects with technology in a fairly general sense. Of course mobile is still big, and I believe most investment in mobile is driven by commerce (including advertising) and business needs, with cloud services serving a supporting role. It is interesting though that mobile startup investment seems to be reaching a plateau rather than growing or declining.

It’s also no surprise to me that the wearable and smart home verticals are on the rise, given the buzz around “Internet of Things”, health-data scenarios, and clean energy over the last few years. Interest in these verticals has existed for a long time, but investment is happening now for two reasons: maturing new technologies are finally enabling them, and our social norms are changing. It of course remains to be seen whether there will be a bubble in either wearables or smart home startups, but for the moment there’s a scramble to deliver new products and services in both spaces, and there’s a lot of room for growth over the next few years.

The consumer electronics rise is probably related in part to wearables and smart home, though it’s interesting to contemplate what might be happening if some portion of that rise is independent. (I’m not going to do that here though.)

To me, the most interesting stand-out in the Founder Dating verticals report is an apparent decline in interest in doing startups in the data & analytics space.

Is Big Data investment waning?

I see more and more job listings these days, in all sorts of technology disciplines, that call for “a passion for big data” or “proven ability to analyze data for customer insights”. In part at least, [big] data analytics seems to be getting absorbed into the broader technology toolbox—more and more, “Big Data” is seen as a core competency, or from another point of view, just another part of the “cost of doing business.”

Simultaneously, the idea of Big Data driving markets in-and-of itself seems to be dwindling. And I think this is a good thing.

Data by itself is just data even if it’s Big

I’ve felt for a few years now that there’s been an over-emphasis on data for its own sake, at least the way it’s been marketed so far: More data, more types of data, more sources of data, more users contributing data, etc.

There’s certainly been a huge rise in data warehousing and reporting capability across the many industries touched by high-tech. And many companies have made at times extravagant claims about how Big Data will revolutionize all aspects of your business (technology or otherwise).

It’s true that we can now store, search, and retrieve information with a capacity and speed that was unimaginable even two or three years ago. But for the most part, availability and cost-effectiveness of data collection and reporting by itself has not (so far) revolutionized our lives or our businesses, except in a few niches—web search and social networks being two of the most visible.

It’s the analysis, stupid!

Take Facebook and Twitter in the social space, Google in search, or 23andMe in the consumer DNA analysis space. For at least these verticals there’s also been a correspondingly large investment in data analysis—probably in nearly all cases a much larger investment.

We need to understand that good data analysis requires a lot of creativity, long-term investment in tools and algorithms, and an iterative development process—all of which is far from free. The data by itself is just bits on a disk somewhere.

Access to vast amounts of data has indeed been a fantastic aid that has driven broad, albeit often incremental improvements in decision making, product design, and operational efficiency. More rarely it’s enabled completely new product spaces, though without a real data analysis component, most of the new markets that have opened up have been related to data warehousing. The mere availability of lots of data has not so far been a panacea. And it may never be.

It’s certainly true that we take for granted today that we have comprehensive map data at our fingertips.

Ultimately though, the most interesting Big Data scenarios require that we aggregate and correlate vast data-sets in ways that ask specifically designed questions, and which report results that can be interpreted as effective, meaningful, actionable answers to those questions. (Remember Douglas Adams’ 42?)

And so far asking the right questions is still nearly completely in the domain of human beings.

My EditThisPage.com site is now here
http://www.jakesavin.com/2014/10/17/my-editthispage-com-site-is-now-here/
Fri, 17 Oct 2014 18:04:07 +0000

Hi all—here’s an update following my previous post asking for some WordPress advice: I pulled the trigger, and now all of the content from Jake.EditThisPage.com is ported over to JakeSavin.com. Amazingly it worked right the first time! When does that ever happen?

Most, if not all of the links into the old site now redirect to the right place here. For the moment they’re temporary redirects, but after a bit more testing I’ll make them permanent so they’ll get picked up by search engines and the like. And contrary to my initial fears, the problem with pages living at multiple URLs was easily resolved by redirecting via mod_rewrite rules in my .htaccess file, as sketched below.
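
Here’s a sketch of the general shape of those rules (the paths are illustrative, not the actual ones):

# .htaccess sketch: several legacy Manila URLs for one page all redirect
# to its single new WordPress home. R=302 keeps them temporary while
# testing continues.
RewriteEngine On
RewriteRule ^stories/storyReader\$1015$ /2000/05/12/some-story/ [R=302,L]
RewriteRule ^awesome/firstPost$ /2000/05/12/some-story/ [R=302,L]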

404: Just say no!

The content of that site spans the period from December 22, 1999 to March 11, 2003, and all of the posts from that site are in their own category to make them easy to find.

Next I’m going to write some code to export my Radio UserLand site to WXR (WordPress eXtended RSS) format, so I can merge that content in too. I know a lot more now than I did when I started this work with Manila, so it should be quite a bit easier. After that, a one-off exporter for my custom WebsiteFramework site, Jspace.org. That one goes way back to 1997!
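
For reference, a WXR file is essentially an RSS 2.0 document plus WordPress’s wp: extension elements. A minimal skeleton looks roughly like this (values illustrative, many optional elements omitted):

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <title>Jake's Radio 'Blog</title>
    <wp:wxr_version>1.2</wp:wxr_version>
    <item>
      <title>A post</title>
      <dc:creator><![CDATA[jake]]></dc:creator>
      <content:encoded><![CDATA[ ... post body ... ]]></content:encoded>
      <wp:post_id>10015</wp:post_id>
      <wp:post_date>2000-05-12 10:00:00</wp:post_date>
      <wp:post_type>post</wp:post_type>
      <wp:status>publish</wp:status>
    </item>
  </channel>
</rss>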

I Need Some WordPress Advice
http://www.jakesavin.com/2014/10/16/i-need-some-wordpress-advice/
Thu, 16 Oct 2014 02:55:23 +0000

I have a plan for something I want to do with my site, and could use some advice from experienced WordPress people.

I have two legacy sites that I want to merge into my current WordPress site. Content in this site already consists of the imported content from one of these sites, plus posts I’ve made since switching over.

The other site I want to merge in has conflicting post IDs. In order to redirect old URLs to their new homes in WordPress, I need a way to resolve this conflict in a predictable fashion that can be addressed with mod_rewrite (or something comparably simple).

So I decided to apply an offset of 10,000 as I export the content from that site, so:

  • ID 15 becomes ID 10015.
  • ID 1243 becomes ID 11243.

This guarantees that there will be no conflict with any IDs in the current site.

And since the old IDs can be transformed relatively easily with regex into the new ones, I can create some mod_rewrite rules that are conditional on requests coming to the old host name, which redirect from the old URLs to the new ones. (I’ve already tested this, and it appears to work.)
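
For illustration, rules of roughly the following shape would do it. mod_rewrite can’t do arithmetic, but the offset falls out of the regex anyway: match the old ID by its length, then prefix the right digits. (The host name and URL shapes here are placeholders.)

# .htaccess sketch of host-conditional redirects that apply the +10,000
# ID offset. Each RewriteCond applies only to the rule that follows it.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteRule ^discuss/msgReader\$([0-9]{4})$ http://www.example.com/?p=1$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteRule ^discuss/msgReader\$([0-9]{3})$ http://www.example.com/?p=10$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteRule ^discuss/msgReader\$([0-9]{2})$ http://www.example.com/?p=100$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteRule ^discuss/msgReader\$([0-9])$ http://www.example.com/?p=1000$1 [R=301,L]

With rules like these, ID 15 redirects to ?p=10015, and ID 1243 to ?p=11243.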

So basically what I want to know is this:

Is there some reason I should not do this?

Am I painting myself into a corner?

Will the jump from ID ~2000 to ID 10001 cause any issues?

Any gotchas (SEO or otherwise) with my next post after the import starting at roughly ID 12000?

Any comments in favor or against are much appreciated! :-)

Update: @octothorpe replies on Twitter, “@jsavin That should work, although having a lot of mod_rewrite can add serious latency. Also make the redirects 301s.” — I’m doin’ this thang…

Dave Winer: Someone had to go first
http://www.jakesavin.com/2014/10/15/dave-winer-someone-had-to-go-first/
Wed, 15 Oct 2014 20:41:00 +0000

Dave Winer:

As Walter Isaacson points out, innovators need to be both humanitarians and scientists; we have to touch the human spirit, and be masters of the scientific method. In the bootstrap of blogging it was enormously important that I was both a writer and a programmer. We had to learn to write for this new medium, and we had to figure out how the software worked.

I was lucky in 1994 that I was completely free to explore, and that the world was ready to make this leap. So I began a trip, that led to something wonderful, every bit as big as I thought it might be back then.

Read the whole thing.

WordPress vs. Roll-Your-Own Blog Engine
http://www.jakesavin.com/2014/10/14/wordpress-vs-roll-your-own-blog-engine/
Tue, 14 Oct 2014 18:04:19 +0000

Alex King posted an interesting rebuttal of Santiago Valdarrama’s missive explaining why he’s building his own blog engine.

Taken together, these posts pretty much sum up the reasons why I went with self-hosted WordPress, rather than try to roll my own solution, or continue to lope along indefinitely with Manila.

A couple of Alex’s points in particular stuck out for me:

Santiago: There’s always a learning curve. Every platform is different, specially when you want to fine tune your layout and deviate from the provided templates.

Alex: This one strikes me as a bit silly. There is a learning curve when building your own system too – especially if you haven’t written your layout/templating system yet.

Then:

Santiago: You’ll never get to experience the satisfaction of engaging in a conversation about how you developed your own platform from scratch.

Alex: … if what you want is engagement then joining a bountiful and vibrant community of developers is a much bigger opportunity than the potential for a conversation with another NIH hacker.

Santiago finished his post with:

It takes a few evenings of work to get it done. It’s that simple.

Honestly I doubt it. Although I’m an experienced web developer, if I were to attempt to roll my own solution from scratch, it would be a huge undertaking, fraught with many potentially fatal problems:

  • First I’d have to choose a programming language and platform, with very little in the way of criteria with which to make the right decision—at least not without doing a lot of research first.
  • I’d need to decide what features I really need and what I could do without.
  • I’d have to write (and debug) the code—probably a lot of code.
  • If I wanted to be able to use a native app to post to my blog, I’d have to implement a well-known API, with a dialect that the app understands. (Mo code, mo problems.)
  • I wouldn’t be able to take advantage of the vast universe of WordPress plugins: If I wanted a feature a plugin implemented, I’d have to write it myself. (Mo code, mo problems.)
  • And so on…

And after all that, I’d still have to find a way to export the content from my current site, and import it into the new one, which was something I was going to have to do anyway. :-(

Plus, as Alex hints at by pointing out the vibrancy of the WordPress community, I wouldn’t be able to leverage the experience to actually learn WordPress (and some PHP, and some optimization, and some Apache config, and…).

Update: Santiago has a follow-up post:

“I’d never ask someone to do this. Rolling your own engine means a lot of work, and unless you are really on the nerd side (like I am and Brent Simmons is), it will be a waste of your time.”

Update: More dialog on Twitter

Ps. In the end Brent decided to stick with the self-built engine he’s been using for years, and write an iOS app for himself to post to it remotely. Moral of this story: Stick with what you know?

CocoaConf Seattle 2014 iCalendar Feed
http://www.jakesavin.com/2014/10/14/cocoaconf-seattle-2014-ical/
Tue, 14 Oct 2014 10:20:19 +0000

This June, I was one of the lucky ones who’d won the lottery, and was able to attend WWDC in San Francisco. While I was at the conference, it was awesome to have the whole schedule at my fingertips via the WWDC app on my iPhone. With CocoaConf Seattle just around the corner, I found myself wishing there were a CocoaConf app. No such luck.

Then I remembered iCal feeds are a thing, so I went to check on the CocoaConf website for a subscribable calendar feed for the Seattle event, but that also didn’t seem to exist.

So as a public service to my fellow nerds who are attending CocoaConf 2014 in Seattle, I created a public iCal feed using Google Calendar that you can subscribe to for the schedule of all the sessions, including the Thursday workshops. It should work on iOS devices, Google Calendar, Calendar.app on the Mac, BusyCal, and others. Here’s the link:

https://www.google.com/calendar/ical/f7oecob86640vd63qtmu57r92c%40group.calendar.google.com/public/basic.ics

If you’re looking at this post on your iPhone, iPad, or Mac you should be able to just click the link to subscribe to the calendar.

Hope to see you at the conference!

Thanks, Dave Winer!
http://www.jakesavin.com/2014/10/08/thanks-dave-winer/
Wed, 08 Oct 2014 01:08:08 +0000

Today is Dave Winer’s 20th anniversary of blogging, starting with this DaveNet piece in 1994. Dave wrote a great piece about the occasion here: 20 years of blogging. As I read it, a few things came to mind. Oddly, the number 19 seems to be a theme…

Another life on another continent

Twenty years ago today, I had just turned 25. I was living in Amsterdam, and touring as the bassist with an indie rock band called Painting Over Picasso. We had just released our first album. My life has changed a lot since then—enough so that today it feels like the “me” in Amsterdam might have been a different person altogether.

I started reading Dave’s writing online and getting into programming with Frontier some time in 1995, while I was still living abroad. Dave’s writing and programming in Frontier are among the few threads in my life that cross the K-T boundary between my music and tech careers.

The band stayed together for a little over four years altogether. I had been an aspiring musician for about five years before that, and continued performing in public with various bands until 2006, though not professionally.

Reflecting on 20 years of… anything

In total my music career lasted about 19 years, only seven of which were really serious.

19 years is the same amount of time from when I first started programming in Frontier to now.

19 years is also the time between writing my very first programs on the Apple II at my middle school, and joining UserLand as a developer in 2000.

I’d be pretty hard pressed to think of any single thing I’ve done more or less continuously for 20 years.

Starting a second career

I returned to the United States in 1996, with no job, no real prospects, a music composition degree, and a strong sense that I belonged in the software industry. Though I had been an amateur programmer off-and-on since 1980 or so and had created large-ish projects of my own, I had zero real work experience in technology.

Through a college friend, I managed to bully my way into an entry-level software test engineer job at Sonic Solutions in Marin County, which brought me to the Bay Area for my first tech job. My first day of work was my birthday, October 1, 1996, just over 18 years ago. (Huzzah for testers!)

I met my friend Vance who introduced me to Sonic in the fall of 1987 at Reed College in Portland, OR—the same college that Steve Jobs famously dropped out of. (Not that this fact has anything at all to do with me.) So I’ve known Vance for 27 years.

The number of people outside of my family that I would call close friends for 20 years or more is… 4.

Doing anything or knowing anyone for more than 20 years is rare in my experience, but as it turns out my true friends have lasted longer than my career tracks to date.

And this comes as no surprise. ;-)

A few quotes from Dave’s piece

These passages in Dave’s piece today resonated with me:

… You should create stuff because you enjoy being creative, because you have the creative impulse. Not because you expect to be loved for it. #

That’s how I’ve always felt about my own creative work, whether sculpture and painting I did in high school, making music, creating software, or writing (online or off).

It really is great to be admired for one’s accomplishments. But that’s never been the most important reason I’ve worked to create anything. People may admire someone or their work, but admiration doesn’t equal love. It’s an easy mistake to make, especially when you’re young.

This part about Aaron Swartz also struck a chord for me:

… I did him the honor he asked for, and treated him as a responsible person. One of the great things about the Internet is that our bodies are the same size here, and if you want to play with the adults, there’s nothing stopping a young person from doing so… #

I remember how surprised, and then delighted I was when I first learned about Aaron’s youth, as he began to engage with the RSS community. It was refreshing to see (much of) the community accept him as a person with ideas, and worthy of being listened to. Especially since I often didn’t fit in easily when I was younger, and I’d wished that being bright and engaged were enough to gain acceptance in the so-called “real world”.

Dave did me the same honor when I approached him and UserLand in 2000, and asked if I could work with them. At that point I had few accomplishments to prove my worth as a developer, other than some hacked together Frontier scripts that ran my own blog, a bit of incomplete online writing, and the willingness to ask for their trust.

The move to UserLand, and having worked with Dave and others there have had a very positive long-term effect for me, which is difficult to quantify:

  • I grew from a tinkerer into a real software developer working in Frontier, with Dave, Brent, and André as mentors.
  • Relationships I made at UserLand, and work I did both with and for Dave, continue to open professional doors for me even today.
  • My UserLand experience proved to me in a personal way that a few smart, dedicated people can have a big impact, if they stick to it over time.

Thanks, Dave!

So on your 20th blog-versary, I’d like to say “Thanks, Dave!” for sharing your thoughts and writing with us all, and for narrating your work on so many things, some of which now seem obvious—even taken for granted.

And on a personal note, thank you for the great opportunity and experience of working with you and the folks at UserLand. It was a great experience for me, and in no small part it was reading your writing, starting about 19 years ago, that made me want to work with you. :-)

Porting to WordPress Part 3: Code
http://www.jakesavin.com/2014/10/07/porting-to-wordpress-worknotes-part-3/
Tue, 07 Oct 2014 19:34:42 +0000

In the last post on this topic, I discussed some of the differences between Manila and WordPress, and how understanding those differences teased out some of the requirements for this project.

In this post I’m going to talk about the design and implementation of a ManilaToWXR Tool, some more requirements that were revealed through the process of building it, and a few of the tricky edge cases I had to deal with.

A little history first…

Among the more interesting things I did while I was a developer at UserLand was to build a framework we called the Tools Framework, which brought together many different points of extensibility, and made it easy for developers to customize the environment.

In Frontier, Radio UserLand, and the OPML Editor, a Tool is a collection of code and data in a database, which extends or overrides some platform- or application-level functionality. It’s sort of analogous to a Plugin in the WordPress universe, but Tools can also do things like run code periodically (or continuously) in the background, or implement entirely new web applications, or even customize Frontier’s native UI.

For example, you could implement a Tool that hooks into the windowTypes framework and File menu callbacks to implement a new document type corresponding to a WordPress post. Commands in the File menu call the WordPress API, and present a native interface for editing your blog—probably in an outline. Radio UserLand did exactly this for Manila sites, and it was fantastic. (More on that later.)

Another example of a Tool is one that implements some new XML-RPC endpoints (RPC handlers in Frontier) to provide a programmatic API for accessing some content in a database on your server.

For my purposes, I’m not doing anything nearly so complicated. The main thing I wanted comes from the Tools > New Tool… menu command. This creates a new database and pre-populates it with a bunch of placeholders for things like its menu, a table for data and preferences, and of course a table where my code will live.

It gives me an easy, standard way to create a database with the right structure, and the hooks into the menu bar that I wanted to make my exporter easy to use.

Code Components

Now some of this may sound pedantic to the developer-types who are reading this, but please bear with me on behalf of our non-nerd cohorts.

Any time you need to write a lot of code, it makes sense to break the work down into small, bite-sized problems. By solving each of those problems one at a time, sometimes in layers, you eventually work your way towards a complete solution.

Each little piece should be simple enough that you can compartmentalize it and separate it from the other pieces. This is called factoring, and it’s good for lots of reasons including readability, maintainability, debuggability, and reuse. And if you miss something, make a mistake in your design, or discover that some part of your system doesn’t perform well, it’s far easier to rewrite just one or a couple of parts than it is to de-spaghettify a big, monolithic mess.

Components and sub-components should have simple and consistent interfaces so that other code that talks to them can in turn be made simple and consistent. Components should also have minimal or no side-effects, meaning that they don’t change data that some other code depends on. And components should usually perform one or a very small number of tasks in a predictable way, to keep them small, and make them easy to test and debug. If you find yourself writing hundreds of lines of code in one place, you probably need to break the problem down into smaller components.

So with these concepts in mind, I set about coming up with a component-level design for my Tool. I initially came up with four types of components that I would need, and each type of component may have a specific version depending on the type of object it knows about.

Iterators

First, I’m going to need an easy way to iterate across posts, stories, pictures, and other objects. As my code iterates objects in my site, the tool will create a fragment of XML that will go into a WXR file on disk.

By separating the iteration from everything else, I can easily change the order in which objects are exported, apply filters for specific object types, or only export objects in a given date or ID range. (It turned out that ranges and filters were useful for debugging later on.)

Manila stores most content in its #discussionGroup in a sub-table named messages. User information is in #membershipGroup, and there’s some other data scattered around too. But the most important content—posts, pages, pictures, and comments—is all in the #discussionGroup.

Initially I’d planned to make multiple passes over the data, with one pass for each type of data I wanted to export. So first export all the posts, next the pages, next pictures, etc. As it turned out however, in both Manila and WordPress, a post, a page, and a picture have more in common than not in terms of how they’re stored and the data that comes along with them. Therefore it actually made more sense to do just one pass, and export all the data at one time.

There was one exception, however: In WordPress, unlike Manila, comments are stored in a separate table from other first-class site content, and they appear in a WXR file as children of an <item>, rather than as their own <item> under the <channel> element:

<item>
  <content:encoded><![CDATA[ ... Post contents here ... ]]></content:encoded>
...
  <wp:comment>
    <wp:comment_author><![CDATA[commenter]]></wp:comment_author>
    <wp:comment_author_email>someone@example.com</wp:comment_author_email>
    <wp:comment_author_IP>IP_address</wp:comment_author_IP>
    <wp:comment_author_url>http://blog.example.com/</wp:comment_author_url>
    <wp:comment_content><![CDATA[Hi, I found your blog via a google search. I was interested in your comments about setting this up. Can you help? Thanks!]]></wp:comment_content>
    <wp:comment_date>2004-08-01 14:17:03</wp:comment_date>
    <wp:comment_date_gmt>2004-08-01 21:17:03</wp:comment_date_gmt>
    <wp:comment_id>15</wp:comment_id>
    <wp:comment_parent>0</wp:comment_parent>
    <wp:comment_type></wp:comment_type>
    <wp:comment_user_id>3</wp:comment_user_id>
    <wp:comment_approved>1</wp:comment_approved>
  </wp:comment>
...
</item>

In the end I decided to write two iterators. Each of them would take the address of the site (so they can find other required metadata about a person for instance), and the address of a function to call for each object as it goes along:

wxr.visit.messages – iterates over all of the messages in my site’s #discussionGroup, skipping over deleted items and comments, since they won’t be exported as an <item> in my WXR file.

// UserTalk Source for wxr.visit.messages
on messages (adrsite, visitproc) {
  local (adrmsgs = wxr.site.messages (adrsite), adr); // addresses of all messages in the site
  for adr in adrmsgs {
    local (id = wxr.post.id (adr));
    if not visitproc^ (id) { // visitproc returns false to stop the traversal early
      return (false)}};
  return (true)}

wxr.visit.comments – recurses over responses to a message to generate threaded comment information.

// UserTalk Source for wxr.visit.comments
on comments (adrsite, adr, visitproc) {
  local (commentId);
  for commentId in adr^.responses {
    local (adrComment = wxr.comment.address (adrsite, commentId));
    if adrComment != adr { // don't visit the message itself
      if not visitproc^ (adrComment) {
        return (false)}}}; // a false return unwinds the recursion
  return (true)}

It turned out later on that I needed two more iterators—one for categories, and one for “Gems” (non-picture files), but the two above were a great starting point that would give my code easy access to the bulk of the content.

Data Extractors

Next I needed some data extractors. These are type-specific components that will pull some data for a post, picture, comment, etc. out of the database, and normalize it to a native data structure that can then easily be output to XML for my WXR file.

The most important data extractor is wxr.post.data, which takes the address of a message containing a blog post that’s in my site’s #discussionGroup—and returns a table (struct) that has all of the data elements that will go into an <item> in the exported WXR file.

Because the WordPress importer expects the comments as <wp:comment> sub-elements of <item>, the post data extractor will also call into another data extractor that generates normalized data representing a comment.

For other types of objects I’ll need code that extracts data for that type as well. So I’ll need code to extract data for a picture, code to extract data for a page (story), and code to extract data for a gem (file).

Here’s part of the code that grabs the data for a comment:

// UserTalk Source for wxr.comment.data
on data (adrsite, id) { // return a table of data for a comment
  local (t); new (tableType, @t); // <wp:comment>
  local (adr = wxr.comment.address (adrsite, id));

  on add (n, s) { // all comment data is in the wp: namespace
    t.["wp:" + n] = s};

  add ("comment_id", id);
  add ("comment_author", wxr.string.cdata (wxr.member.name (adrsite, adr^.member)));
  add ("comment_author_email", adr^.member);
  add ("comment_content", wxr.string.cdata (wxr.string.processMacros (adrsite, adr^.body)));
...
  bundle { // <wp:comment_approved>
    local (flApproved = 1);
    if defined (adr^.flDeleted) and adr^.flDeleted {
      flApproved = 0};
    add ("comment_approved", flApproved)};
  add ("comment_parent", wxr.comment.parent (adrsite, id));
...

  return (t)} // </wp:comment>

There are a few interesting things to point out here:

  1. I chose to capture comment content even if it’s not approved. Better to keep the content than lose it, just in case I decide to approve it later.
  2. The call to wxr.comment.parent gets the ID of the comment’s parent. This preserves the threaded nature of the conversation, even if I decide not to have threaded comments in my WordPress site later on. It turns out that supporting both threaded and unthreaded comments was the source of some pain that I’ll explain in a future post.
  3. The call to wxr.string.processMacros is especially important. This call emulates what Manila, mainResponder, and the Frontier website framework do when a page is rendered to HTML. Without this capability, Frontier macro source code would leak through into my WordPress site, and possibly many internal links from #glossary items would be broken. Getting this working was another source of pain that took a while to work through—again, more in a future post.
  4. All sub-items in the table that gets returned have names that start with “wp:”, which I’ll explain below…

Encoders

Once I had some structured data, I was going to need to use it to encode some XML. It turns out that this component could be done in a very generic way that would work with any of my data extractors.

Frontier actually does have somewhat comprehensive XML capabilities. But the way it’s implemented requires very verbose code that I really didn’t want to write. I had done quite enough of that in a past life. ;-)

So I decided to write a much simpler one-way XML-izer that I could easily integrate with my data extractors.

The solution I came up with was to recurse over the data structure that an extractor passed to it, and generate an XML tree whose element names match sub-items’ names, and whose element content were the contents of each sub-item.

There were three features I needed to add in order to make this work well:

Namespaces: Many elements in a WXR file are in a non-default namespace—either wp: for the WordPress-specific data, or dc: for the Dublin Core extension. This feature was easy to deal with by just naming sub-items with the namespace prefix, i.e. an element named parent in the wp: namespace would simply be called wp:parent when returned by the data extractor.

Multiple elements: Often I needed to create multiple elements at a given level in the XML file that all have the same name; <wp:comment> is a good example. The solution I came up with here is similar to the one Frontier implements in its native XML verbs.

A compiled XML table in Frontier has sub-items representing elements, which have a number, a tab character, and the element’s name. The Frontier GUI hides the number and the tab character when you view the table, so you can see multiple same-named elements in the table editor. When you click an item’s name, the number and tab character are revealed, and you can edit them if you want. That said, you’re supposed to use the XML verbs, xml.addTable or xml.addValue to add elements.

Most of this is not particularly well documented, and personally I don’t think it was the most elegant solution, but it was effective at working around Frontier’s limitation that items in tables had to have unique names, whereas in XML they don’t.

I wanted something simpler, so I decided instead to simply strip anything after a comma character from the sub-item’s name. This way whenever my data extractor is adding an item, it can just use table.uniqueName with a prefix ending in a comma character, and then add the item at that address. Two lines of code, or one if we get just a little bit fancy:

table.uniqueName (element + ",", @t)^ = value;

XML attributes: The last problem to solve was generating attributes on XML elements, for example <guid isPermaLink="false">. It turns out that if there were an xml.addAttributeValue in Frontier, it could have handled this pretty easily, but that was never implemented. Instead I’d have to add an /atts sub-table, and add the attribute manually—which takes multiple lines of code just to set a single attribute. Of course I could implement xml.addAttributeValue, but I don’t have a way to distribute it, so nobody else could use it! :-(

In addition, I really didn’t want big, deeply-nested data structures flying around my call-stack, since I’m going to be creating thousands of tables at run-time, and I was concerned about memory and performance.

In the end I decided to do a hack: By using the | character to delimit attribute/value pairs in the name of table sub-elements, I could include the attributes and their values in the element name itself. So the <guid isPermaLink="false"> element would come from a sub-item named guid|isPermaLink=false.

Normally I would avoid doing something like this since hacks have a tendency to be fragile, but in this case I know in advance what all of the output needs to look like, so I don’t need a robust widely-applicable solution, and the time I save with the hacky version is worth it.
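
To make these conventions concrete, here is a minimal sketch of the encoder idea in Python rather than UserTalk (names and data are illustrative, and entity encoding is left out): the namespace prefix passes through as part of the element name, the comma-delimited uniqueness suffix is stripped, and |-delimited pairs become attributes.

# Python sketch of the one-way XML-izer's naming conventions
def encode(name, value, indent=0):
    # "name" may carry a "," uniqueness suffix and "|"-delimited attribute
    # pairs; "value" is either text or a dict of sub-elements.
    name = name.split(",")[0]            # strip the uniqueness suffix
    tag, *atts = name.split("|")         # split out attribute pairs
    attStr = "".join(' {}="{}"'.format(*a.split("=", 1)) for a in atts)
    pad = "  " * indent
    if isinstance(value, dict):          # a sub-table: recurse
        inner = "".join(encode(k, v, indent + 1) for k, v in value.items())
        return "{}<{}{}>\n{}{}</{}>\n".format(pad, tag, attStr, inner, pad, tag)
    return "{}<{}{}>{}</{}>\n".format(pad, tag, attStr, value, tag)

item = {
    "title": "A post",
    "guid|isPermaLink=false": "1234@example.com",
    "wp:comment,1": {"wp:comment_id": "15"},
    "wp:comment,2": {"wp:comment_id": "16"}}
print(encode("item", item))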

Utility Functions

Then there’s a bunch of miscellany:

  • A way to easily wrap the body of a post with <![CDATA[...]]> tokens, and properly handle the edge case where the body actually contains those tokens. (See the sketch after this list.)
  • A non-buggy way to encode entities in text destined for XML. (xml.entityEncode has had some bugs forever, which weren’t fixed because of Rule 1.)
  • Code to deal with encoding various date formats, and converting to GMT.
  • Code to convert non-printable characters into the appropriate HTML entities (which in turn get encoded in XML).
  • Other utility functions dealing with URLs, calculating permalinks, getting people’s names from their usernames, etc.
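
About that CDATA edge case in the first bullet: a CDATA section can’t contain the "]]>" token, so one standard trick is to close the section just before the token and reopen it right after. A Python sketch of that approach (the Tool itself is written in UserTalk):

# Python sketch of CDATA wrapping that survives an embedded "]]>"
def cdata(s):
    return "<![CDATA[" + s.replace("]]>", "]]]]><![CDATA[>") + "]]>"

print(cdata("a ]]> b"))  # -> <![CDATA[a ]]]]><![CDATA[> b]]>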

The Elephants in the Room

At this point there were a few more things I knew I would need to address. I’ll talk about these along with handling media objects in my next post. In the meantime, here’s a teaser:

  1. Lots of stuff in Manila just doesn’t work at all unless you actually install the site, with Manila’s source code available.
  2. The macro and glossary processors aren’t easy to get working unless the code is running in the context of a real web request.
  3. What should I do about all the incoming links to my site? Are they all going to simply break?

I’ll talk about how I dealt with these and other issues in the next post.

More soon…

Porting to WordPress Part 2: Requirements
http://www.jakesavin.com/2014/10/03/porting-to-wordpress-worknotes-part-2/
Fri, 03 Oct 2014 21:44:00 +0000

In my earlier post about porting from Manila to WordPress, I covered some basics around how and why I decided on the approach I took, and some of the requirements for the new site.

I’ve made a ton of progress—what you’re reading right now is coming from WordPress 4.0, hosted on my own server—but I’ve been remiss on follow-up posts. Fortunately I took lots of notes during this process, since I knew I wanted to write more about it. Probably too many notes in fact. ;-)

I also found myself diving into the rabbit hole: I’ve been debugging the WordPress importer plug-in, while slowly and osmotically learning PHP, and discovering the wonders … um … fun that is XDebug, Eclipse, PhpStorm, and MAMP. (PhpStorm is great so far, but still very unfamiliar.) Why? Two reasons, one of which I touched on before:

  • I get to learn about WordPress internals, PHP, and debugging PHP sites—and learning is always a Good Thing™
  • It turns out that Manila, a product developed over the better part of a decade, is quite complicated (duh), and I get to figure out how to re-simplify my legacy websites

I know Manila better than almost(?) anyone, so even years after developing in that environment full-time, its nooks and crannies are mostly familiar to me. Manila is an old friend, and we have a relationship complicated by the history of our mutual growth. Because Manila and I learned the Web organically over the last decade or so, we share, shall we say, breadth. ;-)

It’s a valuable trait that makes us both very flexible, but it also means that we’re sometimes hard to understand. And in doing this project, it’s likely I would need to make some difficult trade-offs, or else suffer endless debugging and long-term maintenance complexity, both with rapidly diminishing returns.

I’ll give you a few examples, and will tease out requirements as I talk through them:

Home Pages vs News Items

In its original incarnation, Manila only understood one “Home Page” per day. You could write as much as you want, add as many links as you want, and format however you want. But the content for a given day had no set structure or order.

√ Requirement: Ability to link to a day-archive in my WordPress site, not just to a post

Relatively early in Manila’s product lifetime, Brent Simmons implemented News Items in Manila, which enabled Manila sites to have the same kind of structure we think of today as a Blog—a series of reverse-chronological posts, usually with a title, and sometimes with a link. As I recall, Manila’s News Items were inspired by Slashdot’s format which was essentially blogs+categories—but the reverse-chronological collection of posts was key.

As the platform grew, News Items eventually had other data associated with them too: They supported per-item comments and trackbacks (like WordPress posts), and they separated the concept of “last update” from “published” though differently than WordPress does.

For the purpose of this project, it’s important to understand that a “news-day” post and “news item post” are different things, and needed to be dealt with accordingly.

√ Requirement: Handle both day-post-style sites and item-post-style sites

√ Requirement: Translate News Item departments into WordPress categories

To make matters more complicated, the Managing Editor (admin) of a Manila site could switch between News Items and Home Pages at will. So some days might have a single, monolithic post while other days may have many separate posts.

√ Requirement: Support both per-day and per-post styles within a single site

For content on JakeSavin.com this won’t be much of an issue since it’s always been a News Items (per-post) style site, and I’ve rarely made more than one post per day. But in the long run I also want to bring in content from Jake.EditThisPage.Com—years worth of content that I don’t want to lose—and it’s one of these mixed sites with some day-page style content, and some blog post style content, and a mix that sometimes included many posts each day.

⇒ Insight: I don’t need to deal with day-type sites right now, but I shouldn’t design myself into a corner that precludes them.

Permalinks, GUIDs, and IDs

So what the heck are these things? I mean I’ve heard of a permalink but a GUID? I get what an ID is, but why do I need to understand it?

Permalink

The permalink to a post is a URL which doesn’t change over time, which goes straight to the post. It’s important to preserve these links, since every time someone links to a post on your site, the place they’re linking to (ideally) is your post’s permalink. If that URL changes then all of those incoming links will break, and The Web will be just a tiny bit more lonely: On The Web, broken links == sadness.

It turns out that by default, WordPress and Manila format blog post URLs quite differently. Moreover, WordPress pages typically live at only one URL (really two—one by its link [path], and one by its ID), whereas in Manila, “Stories” (Pages in WordPress) and sometimes even individual posts can live at any number of URLs, some of which are generated, and some of which may be added by the user.

For example a blog post (news item) in Manila is most often accessed via a calendar-style URL off of the root of the site, like http://example.com/2014/10/01#1234, but it may also appear at any of the following (or more) URLs:

  • http://example.com/discuss/msgReader$1234 — note the $ delimiter
  • http://example.com/stories/storyReader$1234 — if promoted to a story [page]
  • http://example.com/my-super-awesome-post — user-entered path
  • http://example.com/awesome/firstPost — another user-entered path
  • http://example.com/newsItems/departments/superAwesome/2014/10/01#1234 — from department (category)

If I want to preserve my site’s existing web presence, then I should do whatever I can to make sure that incoming links continue to work. And while I control all the domains involved, I also don’t want to have to maintain a giant list of redirects…

√ Requirement: Support at least one of Manila’s canonical URLs for transferred content

√ Stretch-goal: Support all URLs for a given bit of content, including user-generated ones

GUID

A post’s GUID is its canonical and unchanging identifier, which signals to feed readers (RSS, Atom, etc.) that if they see this post again, they don’t need to show it to users, since they’ve already seen it.

But if the post’s URL ever changes, a well-behaving content management system should remember the original GUID and not change it, so that folks who subscribe to the site in a feed reader don’t get blasted with a whole lot of repeat posts.

There are other potential uses for a post’s GUID. Some systems might use it to identify a post when accessing it via an API. Some (like Manila) use a combination of the site’s URL and a post’s ID instead for API access.

Sometimes it’s easy to generate a GUID just by reusing the value of the post’s permalink. In this case you could add an attribute called isPermaLink and make its value true, to signal to consuming apps that the GUID actually points at a real web resource. (WordPress doesn’t do this, even when the permalink and GUID are the same.) This could be especially useful if the post has a link which is not a link to the post itself.
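
For example, a linkblog-style item could reuse its permalink as its GUID while its link points somewhere else entirely (URLs illustrative):

<guid isPermaLink="true">http://example.com/2014/10/01/my-post/</guid>
<link>http://elsewhere.example.com/interesting-page/</link>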

Then there’s the ID. Manila and WordPress both have sequential IDs for the super-set of posts and pages. Unlike WordPress though, Manila also keeps comments in the same “table” as posts and pages, whereas WordPress treats comments completely separately. Going from Manila to WordPress then shouldn’t create any issues, since there are no inherent ID conflicts.

Data Hierarchy: What’s the same, what’s different?

Among the reasons I picked WordPress instead of some other platform, is that WordPress and Manila actually have a great deal in common:

  • They both separate content from layout by flowing content through a Theme
  • They both use a database to store the content
  • They both have posts, media, and pages (in Manila, News Items, Pictures & Gems, and Stories)
  • The table used for posts, stories, and media is the same (in Manila it’s the site’s Discussion Group)
  • Both systems use the filesystem for blob storage for media files

But there are some differences:

  • In a Manila site, you can have threaded discussions that aren’t attached to a post or page. Not so in WordPress.
    • This could be faked up with private posts/pages in WordPress, but depending on the site this may not be worth the extra development effort.
  • In Manila, comments are stored in the same table as posts, pages, and media objects, but in WordPress, comments are stored separately.
    • In theory this shouldn’t be an issue, since as long as I build the WXR file such that WordPress understands it, comment content will import just fine.

I was thrilled to discover that WordPress supports threaded discussions. Though it’s not an issue for JakeSavin.com since it’s always had flat comment threads, when I get around to porting over my other sites, I will want to preserve threaded discussions.


That’s it for this post. In the next post, I’ll talk about the code that I wrote, how I tested and debugged it, and what kind of crazy edge cases I found, and continue to find.

Hello World!
http://www.jakesavin.com/2014/09/21/hello-world/
Sun, 21 Sep 2014 07:54:41 +0000

If you see this, you’re looking at all of my old site’s content running on my self-hosted WordPress server… Phew!

I think that’s enough for tonight, but there will be more soon. I’ve got a bunch of details I want to write up about this project. Plus now that I have the tools I wrote to migrate from Manila to WordPress, I’ve got a bunch of other old content I want to migrate.

I suppose I should first figure out some redirects though, at least so my RSS subscribers don’t all break.

Stay tuned… ;-)


Ps. All of the <guid>’s in my feed are now changed to a new format. I apologize that your RSS aggregator is probably about to freak out now. Fixing this was sadly not worth the effort at this point. :-(

Pps. I realize it’s now 4:30 am, but I couldn’t let my links rot. I modified my WordPress permalink format to make legacy incoming links continue to work. I still need to do some testing, but for the moment things are much better than having basically every incoming permalink go 404.
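
For reference, WordPress builds permalinks from a structure set under Settings → Permalinks, and a date-based structure of this shape matches Manila’s calendar-style URLs:

/%year%/%monthnum%/%day%/%postname%/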

Porting to WordPress Part 1: Scoping
http://www.jakesavin.com/2014/09/11/porting-to-wordpress-worknotes-part-1/
Thu, 11 Sep 2014 23:51:00 +0000

About a week ago I started a project to port this site from Manila to WordPress. While there are probably very few Manila users still out there who might want to do this, I thought it would still be a useful exercise to document the process here, in case anything I’ve learned might be useful.

This is something I have been wanting to do in my spare time for many months now — probably two years or more. But with family and work obligations, a couple of job changes, and a move from Woodinville to Seattle last fall, carving out the time to do this well was beyond my capacity.

Now that I’m recently between jobs, the only big obligation I have outside of spending time with my wife and son is to find another job. Some learning can’t hurt that effort. Plus I seem to have a surplus of spare time on my hands right at the moment.

Managed or Self-hosted?

The first question I needed to answer before even starting this process is whether I want to host on a managed service (most likely WordPress.com), or if I should self-host. There are trade-offs either way.

The biggest advantages of the managed option come from the very fact that the servers are run by someone else. I wouldn’t have to worry about network outages, hardware failures, software installation and updates, and applying an endless stream of security patches.

But some of the same features that make a hosted solution attractive are also limiting. I would have limited control over customization. I wouldn’t be able to install additional software alongside WordPress. I would be limited to the number of sub-sites I was willing to pay for. I wouldn’t necessarily have direct access to the guts of the system (database, source code, etc).

Most importantly, I wouldn’t be in control of my web presence end-to-end — something which has been important to me ever since I first started publishing my own content on the Web in 1997.

There’s one more advantage of self-hosting which is important to me: I want to learn how WordPress itself actually works. I want to understand what’s actually required to administer a server, and also start learning about the WordPress source code. The fringe benefit of this is also learning some PHP. While some web developers prefer alternate languages like Ruby, Python, or Node.js, the install-base of WordPress itself is so enormous that, from a professional development perspective, learning some PHP is a pretty sensible thing to do.

I decided to go self-hosted, on my relatively new Synology DS-412+ NAS. It’s more than capable of running the site along with the other services I use it for. It’s always on, always connected to the Internet, and has RAID redundancy which will limit at least somewhat, the risks associated with hardware failure.

Develop a Strategy

The next thing I needed to work out was an overarching plan for how to do this.

Aside from getting WordPress installed and running on my NAS, how the heck am I going to get all the data ported over?

First, I made a list of what’s actually on the site:

  1. A bunch of blog posts (News Items) sitting in a Frontier object database
  2. Comments on those posts
  3. A small amount of discussion in the threaded discussion group
  4. User accounts for everyone who commented or posted in the discussion group
  5. A bunch of pictures and other media files
  6. A few “stories”
  7. Some “departments” that blog posts live in
  8. A site structure that put some pages on friendlier URLs
  9. Logs and stats that I don’t care much about
  10. A sub-site I never posted much to, and abandoned years ago

For the most part, there aren’t any types of information that don’t have an analog in WordPress. News items are blog posts, comments are comments, stories are pages, pictures are image attachments, departments are categories. The stats and logs I’m happy to throw away. Not sure what to do with the site structure, but if push comes to shove, I can just use .htaccess files to redirect the old URLs to their new homes.

Next I needed a development environment — someplace where I can write and refine code that would extract the data and get it into WordPress.

On the Manila side, I did some work a little over a year ago to get Manila nominally working in Dave Winer’s OPML Editor, which is based on the same kernel and foundation as UserLand Frontier, the environment in which Manila was originally developed. The nice thing about this is that I have a viable development environment that I can use separately from the Manila server that’s currently running the production site.

On the WordPress side it makes sense to just host my development test sites on my MacBook Air, and then once I have the end-to-end porting process working well, actually port to my production server — the Synology NAS.

Data Transmogrification

Leaving the media files and comments aside for a moment, I needed to make a big decision about how to get the blog post data out of my site, and into my WordPress site. This was going to involve writing code somewhere to pull the data out, massage it in an as-yet unknown way, and then put it somewhere that WordPress could use it to (re-)build the site.

It seemed like there were about six ways to go, and maybe only one or two good ones. Which method I picked would determine how hard this would actually be, how long it might take, and whether it was even feasible at all.

Method 1: Track down Erin Clerico

A bunch of years ago, Erin Clerico (a long-time builder and host of Manila sites in the 2000s) had developed some tools to port Manila sites to WordPress.

As it turned out, a couple years back I’d discussed with Erin the possibility of porting my site using his tools. Sadly he was no longer maintaining them at that time.

If I remembered correctly, his tools used the WordPress API to transmit the content into WordPress from a live Manila server — I have one of those. It might be possible, I thought, to see if Erin would share his code with me, and I could update and adapt it as necessary for my site, and the newer versions of WordPress itself.

But this was unknown territory: I’ve never looked at Erin’s code, know very little about what may have changed over the years in the WordPress API, and don’t even know if Erin still has that code anywhere.

Method 2: Use the WordPress API

I could of course write my own code from scratch that sends content to WordPress via its API.

This would be a good learning exercise, since I would get to know the API well. And the likelihood that WordPress will do the right thing with the data I send it is obviously pretty high. Since that component is widely used, it’s probably quite well tested and robust.

This approach would also work equally well, no matter where I decided to host the site — on my own server or whatever hosted service I chose.
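
A minimal sketch of the client end, in Python rather than the UserTalk I’d actually write on the Manila side, might look something like this. The endpoint, credentials, and content here are placeholders; wp.newPost is part of the WordPress XML-RPC API:

    import xmlrpc.client

    # Hypothetical WordPress endpoint and credentials.
    wp = xmlrpc.client.ServerProxy("http://example.com/xmlrpc.php")

    post = {
        "post_title": "An exported news item",
        "post_content": "<p>Body of the post, as HTML.</p>",
        "post_date": xmlrpc.client.DateTime("20020314T09:00:00"),
        "post_status": "publish",
        "terms_names": {"category": ["Essays"]},
    }

    # wp.newPost(blog_id, username, password, content) returns the new post's ID.
    post_id = wp.wp.newPost(0, "admin", "secret", post)
    print("Created post", post_id)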

But there are potential problems:

  • Manila/Frontier may speak a different dialect on the wire than WordPress — I haven’t tested it myself.
  • Client/server debugging can be a pain, unless you have good debugging tools on both sides of the connection. I’ve got great tools on the Manila side, but basically no experience debugging web services in PHP on the WordPress side.
  • It’s likely to be slow because of all the extra work the machines will have to do in order to get the data to go between the “on-the-wire” format and their native format. (This will also make debugging more tedious.)

Method 3: Use Manila’s RSS generator

Of course Manila speaks RSS (duh). And WordPress has an RSS import tool — Cool!

In theory I should be able to set Manila’s RSS feed to include a very large number of items (say 5,000), and then have WordPress read and import from the feed.

The main problem here is that I would lose all the comments. Also I’m not sure what happens to images and the like. Would they be imported too? Or would I have to go through every post that has a picture, upload the picture, and edit the post to link to the new URL?

I’m less worried about the images, since I can just maintain them at their current URLs. It’s a shame not to have real attachment objects in my WordPress site, but not the end of the world.

Loss of the comments however would be a let-down to my users, and would also limit the export tool’s potential usefulness for other people (or my other sites).

Method 4: Make Manila impersonate another service

In theory it should be possible to make Manila expose RPC interfaces that work just like Blogger, LiveJournal, or Tumblr. WordPress has importers that work with all of these APIs against the original services.

Assuming there aren’t limitations of Frontier (for example no HTTPS, or complications around authentication) that would prevent this from working, this should get most or all of the information I want into WordPress.

But there are limitations with some of the importers:

  • The Tumblr importer imports posts and media, but I’d lose comments and users (commenters’ identities)
  • The LiveJournal importer seems to only understand posts
  • The Movable Type and TypePad importer imports from an export/backup file, and understands posts and comments, but not media

The only importer that appears to work directly from an API, and supports posts, comments, and users is the Blogger importer. (It doesn’t say it’ll pick up media however.)

In the Movable Type / TypePad case, I’d have to write code to export to their file format, and it’s not clear what might get lost in that process. It’s probably also roughly the same amount of work that would be needed to export to WordPress’ own WXR format (see below), so that’s not a clear win.

When it comes to emulating the APIs of other services (Blogger, Tumblr, LiveJournal), there’s potentially a large amount of work involved, and except for Blogger, there would be missing data. There’s also the non-trivial matter of learning those APIs. (If I’m going to learn a new API, I’d rather learn the WordPress API first.)

Method 5: Make Manila pretend to be WordPress

While researching the problem, I discovered quickly that WordPress itself exports to a format they call WXR, which stands for WordPress eXtended RSS. Basically it’s an XML file containing an RSS 2.0 feed, with additional elements in an extension namespace (wp:). The extension elements provide additional information for posts, and also add comments and attachment information.

At first glance, this seemed like the best approach, since I wouldn’t be pretending to understand the intricacies of another service, and instead would be speaking RSS with the eXtended WordPress elements — a format that WordPress itself natively understands.

Also since I’m doing a static file export, my code-test-debug cycle should be tighter: More fun to do the work, and less time overall.
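
To make the format concrete, here’s a sketch of the kind of file I’ll be generating: a minimal WXR 1.2 document with one post and one comment, built in Python with placeholder values. (The real exporter will loop over everything in the Manila database, and details like the exact set of namespaces should be checked against a real WordPress export.)

    # Sketch: build a minimal WXR 1.2 file containing one post and one
    # comment. All titles, dates, and content here are placeholders.
    WXR_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
    <rss version="2.0"
         xmlns:content="http://purl.org/rss/1.0/modules/content/"
         xmlns:wp="http://wordpress.org/export/1.2/">
      <channel>
        <title>{site_title}</title>
        <link>{site_url}</link>
        <wp:wxr_version>1.2</wp:wxr_version>
        <item>
          <title>{post_title}</title>
          <content:encoded><![CDATA[{post_html}]]></content:encoded>
          <wp:post_date>{post_date}</wp:post_date>
          <wp:post_type>post</wp:post_type>
          <wp:status>publish</wp:status>
          <wp:comment>
            <wp:comment_author><![CDATA[{comment_author}]]></wp:comment_author>
            <wp:comment_date>{comment_date}</wp:comment_date>
            <wp:comment_content><![CDATA[{comment_html}]]></wp:comment_content>
          </wp:comment>
        </item>
      </channel>
    </rss>
    """

    with open("export.xml", "w") as f:
        f.write(WXR_TEMPLATE.format(
            site_title="My Manila Site",
            site_url="http://example.com/",
            post_title="An exported news item",
            post_html="<p>Body of the post, as HTML.</p>",
            post_date="2002-03-14 09:00:00",
            comment_author="A. Reader",
            comment_date="2002-03-15 12:34:56",
            comment_html="<p>A comment on the post.</p>",
        ))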

Method 6: Reverse-engineer the WordPress database schema

I did briefly consider diving into MySQL and trying to understand how WordPress stores data in the database itself. It’s theoretically possible to have Manila inject database records into MySQL directly, and then WordPress would be none the wiser that the data didn’t come from WordPress itself.

This idea is pretty much a non-starter for this project though, for the primary reason that reverse-engineering anything is inherently difficult, and the likelihood that I would miss something important and not realize it until much later is pretty high.
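
Just to illustrate why: even a heavily abbreviated direct insert has to get a pile of interdependent fields right, and that’s before touching the separate tables for comments, categories, and attachments. A sketch, assuming the PyMySQL package and with the wp_posts column list abbreviated from memory:

    import pymysql  # assumes the PyMySQL package is installed

    conn = pymysql.connect(host="localhost", user="wp",
                           password="secret", database="wordpress")
    with conn.cursor() as cur:
        # Heavily abbreviated: the real wp_posts table has many more columns
        # (GUID, ping status, modified dates, etc.), all of which WordPress
        # expects to be mutually consistent.
        cur.execute(
            "INSERT INTO wp_posts (post_author, post_date, post_title, "
            "post_content, post_status, post_name, post_type) "
            "VALUES (%s, %s, %s, %s, %s, %s, %s)",
            (1, "2002-03-14 09:00:00", "An exported news item",
             "<p>Body of the post.</p>", "publish",
             "an-exported-news-item", "post"),
        )
    conn.commit()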

Time to get started!

I decided on Method 5: Make Manila pretend to be WordPress. It’s the easiest overall from a coding perspective, the least different from things I already know (RSS 2.0 + extensions), and should support all of the data that I want to get into WordPress from my site. It also has the advantage of being likely to work regardless of whether I stick with the decision to self-host, or decide to host at WordPress.com or wherever else.

Implementing the Blogger API was a close second, and indeed if Manila still had a large user-base I almost certainly would have done this. (There are many apps and tools that know how to talk to Blogger, so there would have been multiple benefits from this approach for Manila’s users.)


In the next post I talk about some differences between the way Manila and WordPress store data, and some requirements that surfaced while investigating how to export data from Manila to WXR.

Join the Battle for Net Neutrality
http://www.jakesavin.com/2014/09/10/join-the-battle-for-net-neutrality/
Wed, 10 Sep 2014 16:24:48 +0000

[Image: Spinner-DarkRed.gif]
Today, sites all over the Web are making a statement by

“[covering] the web with symbolic ‘loading’ icons, to remind everyone what an Internet without net neutrality would look like, and drive record numbers of emails and calls to lawmakers.”

Obviously if you’re reading this, you see that I’m participating. You can too.

Go here: https://www.battleforthenet.com/sept10th/

There are super-simple instructions there (scroll down) for adding a modal or banner to your site to show your support. A modal like the one you saw here is best, because visitors to your site can very easily add their names to the letter to Congress, and also get connected to their representative by phone–without even having to dial. Either way, all it takes is a few lines of HTML code in your site’s <head> element.

The Web and indeed the Internet as we know it today wouldn’t exist if it weren’t for equal access to bandwidth without the throttling and corporate favoritism that the big ISPs and carriers are lobbying for. Without Net Neutrality, we will be forced to pay more for services we love, and miss out on continued incredible innovation that’s only possible if new and small players have the same access to Internet bandwidth as the BigCo’s.

Please help!

https://www.battleforthenet.com/sept10th/

Ps. If you need a spinner (gif or png), check out SpiffyGif.

CloudKit: Square Pegs and Round Holes
http://www.jakesavin.com/2014/07/29/cloudkit-square-pegs-and-round-holes/
Wed, 30 Jul 2014 06:34:59 +0000

My long-time friend Brent Simmons has been pretty prolific on his blog recently — sadly me, not so much. (I’m working on it.) Monday, he wrote a response to Marco Tabini’s Macworld article, Why you should care about CloudKit:

“While it’s technically possible to use the public data for group collaboration, it’s only the code in the client app that enforces this. CloudKit has no notion of groups.

“It would be irresponsible to use the public data for private group collaboration.

“Neither of the two apps mentioned as example — Glassboard and Wunderlist — should use CloudKit.”

I completely agree, and actually the question of whether Glassboard (or Twitter) would be possible to build with CloudKit, was the source of some discussion among some of the folks with whom I attended WWDC this year.

CloudKit doesn’t actually provide any mechanism at all for natively declaring that person X and person Y (and no one else) have access to resource Q. It provides the ability to securely and privately store some data for a single person as associated with an app, and/or to store some data that’s available to everyone who is associated with that app. That’s it (mostly).

It’s possible separately, via a web portal (not programmatically, as far as I know), to configure a subset of data to be editable only by specific people. But the idea there is more about providing a way for the maintainers of some data resources to update that data than it is about providing a mechanism for users to create ad-hoc groups among themselves. (I.e. dynamic configuration data that’s loaded by the app at launch.)

While this is a super useful feature, the value of which hasn’t really been called out much by the iOS dev community, it is not what Marco Tabini described. (I can see how the misunderstanding arose though.)

But can’t I do groups on top of CloudKit?

Seems like a reasonable question, right? Why not leverage the lower-level infrastructure that Apple is providing, and implement the security over the top of it? Bad idea.

While it’s probably theoretically possible to integrate an encryption library and set up a mechanism for building and maintaining groups that is actually private and secure on top of Apple’s CloudKit service, this would be a terrible idea from a security, testability, and code maintainability perspective.

First there’s the issue of bugs and vulnerabilities in the encryption library you choose to include. I’m not saying anything specific about any particular open-source or licensable encryption code or algorithm, but this is a notoriously difficult thing to get right, and encryption is under constant attack from every angle you can imagine. The world’s government intelligence services and organized crime syndicates are almost certain to do a better job hacking these things than you (or the maintainers of the open source code) are going to do at protecting your users.

Then there’s the problem of an external dependency keeping up with changes to iOS itself. Let’s say for example that two years from now you want to move your code to Swift, but you’re dependent on an open source project that hasn’t been updated to work either with Swift or with ObjC in the latest version of iOS. Guess what: You’re now in a holding pattern until either (gasp!) you port or patch the open source code, or someone else does. That’s a dependency you don’t want to take.

Then there’s Apple. It seems likely (and I speak without any insider knowledge at all) that at some point Apple will start to add group collaboration features to CloudKit itself, to its successor, or to some higher-level service.

Now you have another horrible choice to make: Do I continue to bear the burden of technical debt that comes from having rolled my own solution, or do I hunker down for six months and port to the new thing? And how do I migrate my users’ data? What’s going to break when I have a data migration bug? How am I going to recover from that? Where’s the backup?

(Brent also made the excellent point that if you want your users to be able to get to their data from anywhere else besides their iOS devices, CloudKit isn’t going to get you there right now.)

Architectural decisions should not be taken lightly

I’ll say it again: Architectural decisions should not be taken lightly.

You have to think deeply about this stuff right at the beginning if you want your app, your product(s), and your company to succeed over time. The big design decisions you make early on will have a lasting and possibly profound impact on what happens in the long run…

… And, when it comes to privacy and security, we almost never get second chances. You should fully expect that a breach of trust, whether intentional or not, will be met with revolt.

Looking at the situation from 30,000 feet: Would you rather go with a somewhat more difficult solution up-front, one that came perhaps with some of its own problems, but which solved the privacy, security, and platform-footprint issues right now?

Or would you rather build something you don’t fully understand yourself, on top of a service which isn’t really intended to do what you’re forcing it to do?

Just sayin’…

CloudKit is very promising

For simpler scenarios, CloudKit is going to provide a ton of value. More than likely, the service will meet the needs of a huge number of developers… with some caveats:

  • It’s Apple-only. You’re not going to get to the web or Android right now, and no promises at all about the future.
  • Access is public or individual. There’s no good way to deal with groups right now.
  • You can’t write any server-side business logic. It’s purely a data store, and that’s it. This might change in the future, but don’t bet your business or livelihood on it.

Those are the big ones. There are almost certainly others, including pricing, resiliency, backups, roll-backs, etc.

Cloud-based data storage is a huge and complex field. I for one am very happy to see Apple taking a methodical and measured approach to it this time around. But that inherently means we have to live within its limitations.

I’m confident that CloudKit is the right approach for a lot of developers, and mostly confident that it will work for those developers and not fall on its face. It’s not the end-all and be-all that some folks would want it to be. And frankly I’m glad it’s not trying to be.

Is it possible to leave Facebook?
http://www.jakesavin.com/2014/07/06/is-it-possible-to-leave-facebook/
Sun, 06 Jul 2014 21:01:17 +0000

(Posted originally to my Facebook feed.)

I keep trying to reclaim online time from Facebook, and then someone tags me or posts to my timeline, and my inner moderator kicks in.

And then there I am right back on Facebook again, with their web bugs and their GPS tracking. To me recently Facebook feels like the web version of talking to your friends on the phone while the NSA records your call, only plus baby pictures and pithy memes. (Oh-hi, NSA agent, how’s your day? Did I mention NSA? [Attempts Jedi hand-wave.])

I wonder sometimes if Facebook has made it so difficult to feel as if you have any privacy, that for some of us the only way to feel we’re not being spied on by the big-data-big-brother is to delete our accounts entirely–to commit Facebook-Seppuku.

… And now that I’ve said all that, I’m pretty sure that since I’m mentioning, on Facebook, that I’m considering leaving, well, Facebook, I’m going to start getting a flood of “compelling” push notifications and emails saying how much my friends miss me and that I need to come back to Facebook and approve all those timeline posts and wish distant acquaintances “Happy Birthday” and the “like”…

I love it that my online life has brought me closer to those that I love, work with, and care about. I love that information now flows (mostly) so easily. I owe my livelihood to the Internet and the web.

But Facebook is *not* the Internet, people. It’s totally possible to interact engagingly online without it. Send some email. Start a website. Spread your footprint out to other services. Sure you won’t have quite so many “friends” “liking” your pictures or leaving pithy comments on your posts, but you might just get some intimacy back in return, and you’re going to have a hard time finding that inside the walls of Zuck’s castle.

Google Reader, RIP
http://www.jakesavin.com/2013/07/02/google-reader-rip/
Tue, 02 Jul 2013 07:03:00 +0000

Screenshot… And commentary:

[Image: HesDeadJim.jpeg]

Don’t Forget To Download Your Feeds From Google!
http://www.jakesavin.com/2013/06/29/dont-forget-to-download-your-feeds-from-google/
Sat, 29 Jun 2013 19:39:46 +0000

[Image: rss_glass_128.png]
Monday will be the very last day that you will be able to access your RSS feeds using Google Reader, so if you haven’t already migrated to one of the other services, I strongly recommend that you…

Export your Google Reader data before Monday!

Here’s a quick how-to:

  1. Go to Google Takeout at https://www.google.com/takeout/
  2. Assuming you only want Reader data right now, click on Choose services at the top of the page
  3. Click on the Reader button
  4. Click the Create Archive button

Unless you have an enormous number of feeds, your archive should be created relatively quickly. If it’s taking a long time, you can check the box to have Google send you an email when the archive is ready for download. (There’s no need to keep your browser open in this case.)

Once it’s all packaged up, click the Download button next to your new archive. (If you did the email option, there will be a link in the email to take you to the download page.)

What the heck do I do with it?

The download will be a zip archive containing a bunch of JSON files and one XML file. The JSON files have your notes, likes, starred, shared and follower data.

The XML file – subscriptions.xml – is the important one. It has a list of all of your feeds, and what folders they are in. (It’s actually in OPML format, which is based on XML.) Most feed reading services and apps will know how to import this file, and recreate your subscriptions. Some will be able to understand your folder structure, but not all.
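
If you’re curious what’s inside, a few lines of Python will list every feed in the file (assuming subscriptions.xml is in the current directory):

    import xml.etree.ElementTree as ET

    tree = ET.parse("subscriptions.xml")

    # In OPML, folders and feeds are both <outline> elements: feeds carry an
    # xmlUrl attribute, while folders just contain nested outlines.
    for outline in tree.iter("outline"):
        feed_url = outline.get("xmlUrl")
        if feed_url:
            print(outline.get("title") or outline.get("text", ""), "->", feed_url)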

Preserving my read/unread state?

Sadly, importing just subscriptions.xml doesn’t keep your read/unread state, and most services also don’t know how to import the JSON files at all.

There are only two web-based services that I’ve tried so far that actually do keep your read/unread state: Feedly and Newsblur. Of the two, I prefer Newsblur’s UI over Feedly’s, since it’s more like what I’m used to, but lots of people seem to like Feedly’s slicker, less cluttered UI better.

Both Feedly and Newsblur were able to import from Google directly, as can many others, but these are the only two I know of that keep your read/unread state. To do this, you connect the app to your Google Account, and they go out to Google to get your data.

Both services can also import your subscriptions.xml, but connecting to your Google account is the better option if you’re doing your import before Reader is shut off. This will capture read/unread state (and in Newsblur’s case, your shared stories) instead of just your subscriptions.

Edit: I just tried the new AOL Reader, and while it has a decent mobile web UI (gah), and did import my feeds from Google, it did not preserve my read/unread state.

Other web-based services

There are a slew of other services out there too, spanning a wide range of feature-completeness, API support, iPhone or other mobile apps, and social/sharing functionality.

The ones I’ve looked at most closely are:

  • David Smith’s Feedwrangler, which lacks folder support but has a very interesting Smart Streams feature, and has its own iOS apps and API
  • The Old Reader, which started as a web-only Reader replacement that imitated an older version of Google’s UI, and does include folder support

Both services can import your feed list either directly from Google or using your subscriptions.xml from Google Takeout, but neither will preserve your read/unread articles or shared/starred stories.

Disclosure:

Last month, I started working at Black Pixel, which just released NetNewsWire 4 in open beta.

NetNewsWire is a desktop RSS reader for Mac, which was originally created by my friend and former colleague, Brent Simmons. Previous versions of the app supported Google Reader syncing, but Reader sync was removed from this version since Reader itself is shutting down.

To be clear: I’m not working on NetNewsWire at Black Pixel, so I don’t have intimate knowledge of the roadmap, but syncing will come.

There is some public information on the beta release announcement here.

I’m Joining Black Pixel
http://www.jakesavin.com/2013/05/20/im-joining-black-pixel/
Mon, 20 May 2013 20:05:29 +0000

Last Friday was my last day at Hitachi Data Systems, and I mentioned that I’m leaving the BigCo’s for something new.

[Image: blackpixel-logo.png]
Wednesday will be my first day at Black Pixel. I’m SuperExcited™ to be joining the team! I’ve known a few folks there for some time, and regard all of them as super bright, thoughtful, respectful, high-integrity people—exactly the type of people that I want to surround myself with. When the opportunity arose to work with them, I immediately realized that it was too good to pass up.

In my own experience, the chance to make a move like this only comes along once every few years (if that). I’ve learned, sometimes the hard way, that when they happen you shouldn’t let them pass you by.

I love new challenges, and I can’t wait to start this new chapter in my professional life!

Ps. For anyone at HDS who might be reading this, it’s been a real pleasure working with you all. Thanks to everyone for all your help and support over the last year. The team really is awesome, and I wish you all great success moving forward!

Hello, and welcome to ‘The Middle of the Film’…
http://www.jakesavin.com/2013/05/17/hello-and-welcome-to-the-middle-of-the-film/
Fri, 17 May 2013 20:05:37 +0000

Today is my last full day at Hitachi Data Systems. I’m leaving the world of the BigCo’s. Next week I start something new, and I’m super excited about it! More on that very soon…

In the meantime, here’s this:

Hello, and welcome to ‘The Middle of the Film’, the moment where we take a break to invite you, the audience, to join us, the film-makers, in ‘Find the Fish’. We’re going to show you a scene from another film and ask you to guess where the fish is, but, if you think you know, don’t keep it to yourselves. Yell out so that all the cinema can hear you. So, here we are with… ‘Find the Fish’.

(For a hint, check out the categories on this post.)
