Mixworthy

Ape Delay Remixes

Last year I did a couple of remixes for my friends ollo, taken from their lovely retro album Ape Delay. I picked two tracks and ended up with the following…

Some other people did some awesome remixes too, so listen to them all, and maybe download and donate some money. And check out the original album.

And after all that, there’s more of my stuff on Soundcloud.

On future web

Out of the tragedy of Aaron Swartz’s untimely death earlier this year, it’s worth remembering the outputs of his that endure. One is RSS: through his contribution to the RSS-DEV Working Group, it became a cornerstone of syndicated web content distribution, although with the demise of Google Reader, one wonders how long RSS or its successor, the Atom Syndication Format, will survive in a new model of published content. That’s perhaps a topic for another post. Markdown, co-written with John Gruber, is the format I’m using to write this text. And there are others.

A more recent output was the unfinished e-book The Programmable Web, available for free download from the publisher Morgan & Claypool’s site. In it, he lays out a brief but amusing history of the web, and, although it’s a first draft and was never updated, he also paints an attractive—if utopian—vision of what the web could be, and why the Semantic Web would be an important part of that.

Layer Cake

It reminds me of why I sometimes fiddle with semantic web technologies in my spare time. While it’s a sometimes impenetrable mass of specifications dealing with knowledge models and representations, it is a critical body of knowledge for implementing and traversing a machine-readable web. Implementing this vision would bootstrap the web to a level of utility as unimaginable to us now as current Web 2.0 functionality would be to our pre-1990 selves. (I’m also cynical enough to believe it would have unimaginable negative consequences too, but then we’ve so far survived the nuclear age, despite the odds.)

I’m not a futurist or misty-eyed sci-fi fanatic, but the combination of knowledge models, open linked data, discoverable APIs, and agents leads to some predictable outcomes. The products can be as pedestrian as the Internet refrigerator that arranges its own restocking and will inevitably display an ad each time you open it. Or as provocative as an undetectable wearable computing fabric that “augments” its owner’s reality, and could change the very nature of “self” and “other”. I think it is the future. Laugh or cry at Google Glass, it’s a pointer to where we’re going. That’s both exciting and challenging, but inevitable. Assuming we survive our more pressing current and future geo-political challenges, of course.


Pace layering

Another one of those observations that just rings true on so many levels is that a single system has components that change at different rates. It was identified by Stewart Brand in his book How Buildings Learn.

The concept of pace layering (Brand 1994, Morville 2005) sees a building as a series of layers that have differing life spans. The site itself has an eternal life, whereas the building structure might last 50 to 100 years. Other layers such as the external cladding of the building or the interior walls might have a life of 20 years with internal design, decoration and furniture lasting for 5 to 10 years. In a rapidly moving world it makes sense to locate the capacity for change in those items with the potential shortest life span and avoid, if possible, creating some layers, such as internal dividing walls, that have a medium term life span and are a potential barrier to accommodating changing activities.

Taken from JISC, via Tom Graves’s post.

In architectures of any kind—software, enterprise, or physical—failure to accommodate this will ultimately tear your system apart. In any case, accommodating different rates of change, as with gears in a machine operating at different speeds, requires careful design of the interfacing points. Another reason that designing good and enduring interfaces is hard.

Small steps

In a classic example of what goes wrong in bigger environments, I wanted to knock up a quick tool to solve a problem. I decided to use Clojure because the solution involves data transformation from XML to JSON, so a functional approach makes sense. I also want to improve my Clojure skills, which are on the amateur side.

Leiningen is the natural lifecycle tool. I created my project, added the latest version of the midje testing tool to the dependencies in project.clj, and in fine TDD style wrote a quick sanity test, ran lein midje, and got a green response. Good work.

After some reading up on XML zippers in Clojure, I then made the timeless error of overconfidence by taking a big leap forward and writing a simple functional test that required a number of implementation steps, including coding and updating components and tools. Pretty soon, I was in Leiningen hell, getting meaningless exception stack traces, thrashing around trying different versions of tools and libraries, and commenting out increasingly large pieces of code until I found my issue—in about the second line I’d written. Lots of time wasted, no value delivered.

Moral to the story: when you’re in unfamiliar territory, you move faster if you take small steps.

Muzak update

In my Latest Muzak post I neglected another recent album…

John Talabot: Fin

Cover

This is a slightly older one, dating from early 2012, but I only picked it up recently. It’s intelligent slow-ish house with a smooth euro groove, a bit of tasteful female vocal, and lush production. The opening track is an excellent slow burn groover, but the standout for me is the second track Destiny with Pional on vocals.

Audio appendix

If you care about such things, I’ve moved my blog over to a self-hosted WordPress site. All went smoothly, including the import of all the content from Posterous. Lovely. Very impressed. Should have done it years ago. Then I needed an on-page audio player. After a few attempts, I ended up with MP3-jPlayer, which not only starts with HTML5 and falls back to native players or even (ugh) Flash, but also provides some decent styling options.

Latest muzak

Layo & Bushwacka!: Rising and Falling

Cover

More stripped-back and focused on deeper tech-house than previous releases. Moody, with some darker bits, and a couple of vocal tracks. Superb production, often danceable, all excellent. My favourite from the last few months, and I’m not done listening to it yet.

Deadbeat: Eight

Cover

This seems more of a collection of tracks than a coherent album, comprising (unsurprisingly) eight atmospheric dubstep tracks, with more emphasis on the dub than the techno. It’s not as accessible as Drawn & Quartered from 2011, so it’s taking me some time to get into it. My vote is still out.

Chymera: Death by Misadventure

Cover

Melodic tech-house from May 2012. Not much to say. Very listenable, with some jazz, blues, and soul influences. A bit of vocals here and there too. Nice.

Photek: Ku:Palm

Cover

I like Photek but this is a patchy release. There are some good tracks (Pyramid) but also a few that just don’t fit with the post-d&b bass-heavy vibe where I think he does his best stuff. Personally, I’d like to see more of his delicately restrained anger. We have plenty to be angry about.

Brian Eno: Lux

Cover

The master returns with more impressionist ambience, again on Warp Records. Read the interview with Laurie Anderson on the background to it, and creativity in general. A suitable backdrop for a rainy afternoon’s contemplation of one’s infinitesimally small place in the universe.

Max Cooper

Max

I seem to need regular fixes of Max Cooper’s organic deep tech-house right now. Fortunately he posts free mixes and live sets regularly to SoundCloud so new material is never far away. I even have an IFTTT workflow to notify me of new tracks. Yay, cloud.

Solitude

Image

Ambient techno is not dead. Solitude (UK) does regular “mixtapes” to Soundcloud with plenty of lie-on-the-floor-with-headphones mixes called “5am mix” and so on. It’s mostly ambient dubstep; Burial is a favourite inclusion.

Semantic appendix

As an appendix to the Semantic Scratchings post, a couple of additional diagrams to illustrate the process.

 

The first shows a simple graph of a mythical contact. There are additional properties that can be added, but it shows the core split into organisation, vCard, and FOAF relationships.

 

Image

 

 

The second shows the workflow from entering contact details, to displaying query results on a web page. The icons are standard UML analysis boundary, control, and entity classes, and it also shows the domains in which each entity sits.

 

Image

 

I hope it gives some idea of the steps I’m using. Not optimised, certainly, but it’s reasonably decoupled across standard protocols and APIs.

 

Semantic scratchings

Over the last few weeks, while I’ve had some time to myself, I’ve been scratching an itch by going deeper into semantic web technologies with an exploratory project of sorts. I guess it’s paying off, in that it’s raising as many questions as it answers, and it’s also giving me some in-my-own-head street cred by both getting down and dirty with writing, building, and deploying code, and thinking about things like ontologies and formal knowledge capture. If that floats your boat, read on. It’s a long entry.

Background

I won’t go into the nature of the itch; the scratching is always more interesting. I’ve been using sem-web tools to get a handle on my address book. It’s always annoyed me how siloed that information is, and how much the data could be augmented and improved by linking it with other data, both public and private. The same problem on a much larger scale has provoked the rise of the Linked Data movement. I credit a large part of the inspiration for my little project to Norm Walsh’s paper from 2002 and his related blog posts.

It’s a journey, possibly without end, but I’ll describe a little about where I’ve been, with some technical detail where useful. I’ll throw in links for the less familiar. Ask me if you want more detail.

It goes without saying that the organisations that hold your contact information, like Google, LinkedIn, and Facebook, are building this sort of stuff anyway, whether they use semantic web technologies or not. Even if I wanted to, there’s unlikely to be a web startup in this.

So Far

To date I’ve built three main building blocks that do the following:

  1. Perform an extract-transform-load from my OSX Contacts app into RDF triples.
  2. Load the triples into a cloud-based RDF triple store.
  3. Run queries on the triple store using a web front end.

None of this is in itself even remotely revolutionary or exciting in the regular IT world, and the sem-web people have been doing it for years too. A triple store is just a specialised graph database, one of the many NoSQL entrants. It’s just that I’ve chosen to hand-build some of the pieces, learning as I go.

Implementation

First, a word on implementation language. I really wanted to do this in Clojure, because its elegance and scalability should complement RDF. Sadly, the tools weren’t as mature as I would have liked, and my Clojure skills weren’t up to developing them myself, although others have shown the way. I considered Node.js but, again, I didn’t feel there was enough there to work with. Not so with my old friend Ruby, where there is a wealth of useful libraries, it’s a language I’m competent (but not expert) in, and it encourages fast development. I also get to use RSpec and Rake. For a worked example, see Jeni Tennison’s post on using 4store and RDF.rb.
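
To give a flavour of how the three building blocks hang together, they end up as a small chain of Rake tasks, roughly like the sketch below. The script and file names (export_contacts.scpt, transform.rb, load.rb, contacts.vcf, contacts.ttl) are placeholders of my own, not anything the tools dictate.

# Rakefile — rough sketch of the ETL pipeline (names are placeholders)
VCARD_FILE  = 'data/contacts.vcf'
TURTLE_FILE = 'data/contacts.ttl'

desc 'Export the Contacts app database to a single vCard file'
task :export do
  # AppleScript does the actual extraction (see the ETL section below)
  sh "osascript scripts/export_contacts.scpt > #{VCARD_FILE}"
end

desc 'Transform the vCard file into RDF triples in Turtle'
task :transform => :export do
  sh "ruby lib/transform.rb #{VCARD_FILE} > #{TURTLE_FILE}"
end

desc 'Clear the remote store and load the new triples'
task :load => :transform do
  sh "ruby lib/load.rb #{TURTLE_FILE}"
end

task :default => :load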

ETL

The ETL step uses a simple AppleScript to extract my Contacts app database into a single vCard file. I could use a Ruby Cocoa library to access it directly, but this was quick and easy. Moreover, vCard is a standard, so any source data I can get in vCard format should be consumable from this step on.

The question was then how to transform this information about people into RDF. FOAF as a vocabulary doesn’t cut it on its own. Luckily, the W3C has already addressed this in its draft paper, and has a reference model (see image below). It borrows heavily from some well-known vocabularies: FOAF, ORG, SKOS, and vCard in RDF.

People Example

There’s also a big version.

I wrote a Ruby class to iterate through the vCard entries in the exported file and assemble the graph shown above using the excellent RDF.rb library, generating nodes and edges on the fly. These were pulled into an RDF graph in memory and then dumped out to a file of triples in the Turtle format. This can take some time: well over a minute on my i5 MacBook Pro grinding through nearly 300 contacts. It currently generates around 7000 triples (an average of 23 per contact).
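
For the curious, the RDF.rb side of that class looks roughly like the following. It’s a trimmed-down sketch: vcard_entries, card.uid, and card.full_name stand in for whatever parses the exported vCard file, only a couple of properties are shown, and the FOAF and vCard terms are assumed to come from the rdf-vocab gem.

require 'rdf'
require 'rdf/turtle'   # Turtle serialisation
require 'rdf/vocab'    # assumption: supplies RDF::Vocab::FOAF and RDF::Vocab::VCARD

AJC = RDF::Vocabulary.new('http://alphajuliet.com/ns/contact#')

graph = RDF::Graph.new

# vcard_entries is a placeholder for the parsed .vcf file
vcard_entries.each do |card|
  person = AJC["person-#{card.uid}"]   # UUID carried over from the Contacts app
  graph << [person, RDF.type, RDF::Vocab::FOAF.Person]
  graph << [person, RDF::Vocab::FOAF.name, card.full_name]

  # One blank node per vCard, hung off the person via gldp:card
  vcard_node = RDF::Node.new
  graph << [person, RDF::URI('http://www.w3.org/ns/people#card'), vcard_node]
  graph << [vcard_node, RDF.type, RDF::Vocab::VCARD.VCard]
  graph << [vcard_node, RDF::Vocab::VCARD.fn, card.full_name]
end

# Dump the in-memory graph out to Turtle
File.write('contacts.ttl', graph.dump(:ttl, prefixes: { ajc: AJC.to_uri }))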

An entry for an example contact currently looks like this in Turtle:

@prefix : <http://alphajuliet.com/ns/contact#> .
@prefix db: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix gldp: <http://www.w3.org/ns/people#> .
@prefix net: <http://alphajuliet.com/ns/ont/network#> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix v: <http://www.w3.org/2006/vcard/ns#> .

:m254028 a org:Membership;
     org:member :person-6D9E0CBF-C599-4BEC-8C01-B1B699914D04;
     org:organization :org-example-corporation;
     org:role [ a org:Role;
         skos:prefLabel "CTO"] .

:org-example-corporation a org:Organization;
     skos:prefLabel "Example Corporation" .

:person-6D9E0CBF-C599-4BEC-8C01-B1B699914D04 a foaf:Person;
     net:workedAt [ a org:Organization;
         skos:prefLabel "Oracle Australia"];
     dbo:team db:Geelong_football_club;
     gldp:card [ a v:VCard;
         v:adr [ a v:work;
             v:country "Australia";
             v:locality "Sydney"];
         v:email [ a v:work;
             rdf:value "jane.smith@example.org"],
             [ a v:home;
             rdf:value "jane.smith12345@gmail.com"];
         v:fn "Jane Smith";
         v:note "Met at Oracle";
         v:tel [ a v:cell;
             rdf:value "+61 412 345 678"],
             [ a v:work;
             rdf:value "+61 2 9876 5432"]];
     foaf:account [ a foaf:OnlineAccount;
         foaf:accountName <http://www.linkedin.com/in/janesmith12345>],
         [ a foaf:OnlineAccount;
         foaf:accountName <http://twitter.com/janesmith12345>];
     foaf:homepage <http://www.example.org/>,
         <http://jane.smith.name/>,
         <http://www.example.org/janesmith/profile>;
     foaf:knows [ a foaf:Person;
         foaf:name "John Smith"],
         [ a foaf:Person;
         foaf:name "Marcus Smith"],
         [ a foaf:Person;
         foaf:name "Alice Jones"];
     foaf:name "Jane Smith" .

The UUID in the foaf:Person node is generated and retained by the Contacts app, so I have a guaranteed ID over the lifetime of the contact.

Because of the stream processing of the vCard entries, there is no way of setting up inferred relationships between existing items of data on the first pass, such as identifying explicitly that I know all my contacts. Fortunately, that’s what SPARQL is good at, so I use the following slightly awkward query:

CONSTRUCT {
    ?a foaf:knows ?b .
} 
WHERE {
    ?a a foaf:Person .
    ?a foaf:name "Andrew Joyner" ; 
        gldp:card ?c1 .
    ?b a foaf:Person ; 
        gldp:card ?c2 .
}

This generates a set of inferred triples that I can add into the main graph.
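
Mechanically, running that CONSTRUCT over the in-memory graph and folding the results back in is only a few lines with the sparql gem. A minimal sketch, assuming the query above is saved in knows.rq with its PREFIX declarations added:

require 'rdf'
require 'rdf/turtle'
require 'sparql'   # Ruby SPARQL engine; runs queries over in-memory graphs

graph = RDF::Graph.load('contacts.ttl')

# knows.rq holds the CONSTRUCT shown above, plus the prefixes it needs
inferred = SPARQL.execute(File.read('knows.rq'), graph)

# Fold the inferred foaf:knows triples back into the main graph
inferred.each_statement { |statement| graph << statement }

File.write('contacts.ttl', graph.dump(:ttl))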

Import

I was using a local version of 4store up until very recently. It’s a no-fuss, solid, and open-source triple store with REST and SPARQL endpoints. However, I am doing development across different computers via Dropbox, and I wanted to centralise the data. One option would have been to set up an instance of 4store or maybe stardog on Amazon’s magic cloud. Fortunately, there is a new cloud triple store called Dydra that keeps it very simple, and I was kindly given a private beta account with some free storage.

Currently I’m manually clearing and adding the entire graph each time it gets updated, but that’s ok while it’s in development. Eventually, this will be scripted through the Dydra Ruby API gem.
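
In the meantime, the clear-and-reload can be scripted against anything that speaks the standard SPARQL 1.1 Graph Store Protocol, without touching the Dydra-specific gem at all. A rough sketch, where the endpoint URL is a placeholder: a PUT to the graph store URL with ?default replaces the default graph wholesale, so clearing and loading happen in one request.

require 'net/http'
require 'uri'

# Placeholder endpoint; a real store exposes its own graph-store URL
ENDPOINT = URI('https://example.org/store/data?default')

turtle = File.read('contacts.ttl')

Net::HTTP.start(ENDPOINT.host, ENDPOINT.port, use_ssl: ENDPOINT.scheme == 'https') do |http|
  # PUT replaces the target graph: clear and load in one step
  request = Net::HTTP::Put.new(ENDPOINT)
  request['Content-Type'] = 'text/turtle'
  request.body = turtle
  response = http.request(request)
  puts "#{response.code} #{response.message}"
end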

Web Query

The core of querying an RDF store is SPARQL, which is to triple stores what SQL is to relational databases. It even looks similar. I’ve set up a localhost front end using Sinatra, Markaby, and a Ruby SPARQL client to apply queries, some with user inputs, and return “interesting facts”, like:

  • List all the organisations and how many contacts work for them, shown as a line chart using the D3 Javascript visualisation library.
  • List who knows who across the graph.
  • Display a D3 “force” graph of my network based on foaf:knows relationships (that’s me at the centre of the universe)

foaf:knows graph

  • Who are all the people in a person’s circle, i.e. the subject and object of a foaf:knows predicate
  • Who works at a given company and their mobile number
  • Who don’t I have an email address for?

As I said, fascinating. It’s very basic but it’s a start. I want to start visualising more information with D3, such as the broader social networks.

As a matter of style, I’m writing actual SPARQL queries rather than using the pure Ruby approach that sparql_client encourages. The latter just seems to replace one format with another without adding any useful abstraction.
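
As an example of that style, the organisation-count query is a raw SPARQL string handed to the client inside a Sinatra route, roughly as below. The endpoint URL is a placeholder, and I’ve skipped the Markaby and D3 layers, returning a bare HTML list instead.

require 'sinatra'
require 'sparql/client'   # the sparql-client gem

# Placeholder endpoint URL for the remote triple store
STORE = SPARQL::Client.new('https://example.org/sparql')

# Organisations ranked by how many contacts work there
ORG_COUNTS = <<-QUERY
  PREFIX org:  <http://www.w3.org/ns/org#>
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  SELECT ?orgName (COUNT(?member) AS ?people)
  WHERE {
    ?m a org:Membership ;
       org:member ?member ;
       org:organization ?o .
    ?o skos:prefLabel ?orgName .
  }
  GROUP BY ?orgName
  ORDER BY DESC(?people)
QUERY

get '/organisations' do
  solutions = STORE.query(ORG_COUNTS)
  rows = solutions.map { |s| "<li>#{s[:orgName]}: #{s[:people]}</li>" }
  "<ul>#{rows.join}</ul>"
end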

Deployment

I’m managing the code under Git, and I’ve deployed the code base onto Heroku, both as a cloud education exercise, and so I can access it everywhere. However, because it contains personal contact data, I can’t make it public.

Data model

Being an organised person, I’ve filled in a number of the relationships in Contacts to reflect spouses, friends, children, pets, and so on. These all get mapped using a triple such as ajc:person-xxxx foaf:knows [ a foaf:Person; foaf:name "John Smith" ] . The square brackets result in a blank node that can be linked up later to the actual person based on inference rules. I don’t want to assume at this stage that there is only one “John Smith” in my address book. I know three Steve Wilsons, for example.

Along the lines of Norm Walsh’s approach, I’ve also added “custom” relationships such as foaf:knows and net:workedAt, which get mapped into a set of triples during the transform process.

I’ve also played with adding my own RDF triples explicitly as notes in my contact entries, to give maximum flexibility. I use the format rdf: { ... } to enclose a series of Turtle triples, using my standard RDF prefixes, and transform them into real triples.
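
Extracting those embedded triples is a small job with a regular expression and the Turtle reader. A sketch, where triples_from_note and the trimmed prefix list are placeholder names of my own:

require 'rdf'
require 'rdf/turtle'

# Standard prefixes, prepended so the embedded fragment parses on its own
PREFIXES = <<-TTL
@prefix :     <http://alphajuliet.com/ns/contact#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix net:  <http://alphajuliet.com/ns/ont/network#> .
TTL

# Pull an rdf: { ... } block out of a note field and parse it as Turtle
def triples_from_note(note)
  graph = RDF::Graph.new
  fragment = note[/rdf:\s*\{(.*)\}/m, 1]
  return graph unless fragment
  RDF::Turtle::Reader.new(PREFIXES + fragment) do |reader|
    reader.each_statement { |statement| graph << statement }
  end
  graph
end

# e.g. triples_from_note('rdf: { :person-1234 net:workedAt :org-acme . }')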

I’ve started an ontology to capture entities and relationships that don’t seem to exist elsewhere or are too complex for my needs. One example is to capture where someone worked before, using a net:workedAt property to map from foaf:Person to org:Organization. It highlights a major question in my mind around provenance and versioning of information (see next section).

Provenance and versioning

Clearly, one of the potential shortcomings of the system so far is that the quality of the data is determined by my efforts to keep my address book accurate and up to date. I did take some steps to pull in data from my LinkedIn connections, but quickly hit the API transaction limit in trying to pull down all the info on my nearly 600 connections, so it’s on hold for now. I do fantasise about LinkedIn (and Facebook) having a SPARQL endpoint on their data, but I suspect they would rather be at the centre of my social network, and not my little triple store.

Assuming I did import contact data from LinkedIn and Facebook, where people manage their own contact details, I’d want to capture the source or provenance of that information, so I could decide how much trust to place in it, and resolve conflicts. Of course, there’s a W3C Provenance vocabulary for expressing that. The bigger question is how to capture the dynamic nature of the data over time. A person works for this company this month, and that company next month; how do I best capture both bits of information? The Provenance ontology provides a basis for capturing that in terms of the validity duration of a triple, but not necessarily at a single point in time, like a snapshot. I’d like to say, for example, “this triple is valid right now”, then change it in the future and say the same thing again at a later time. It’s not as precise as a duration, but I may not have start and end dates, and may not care.
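
One way to make that concrete would be to load each import into its own named graph and hang the PROV assertions off the graph URI, so a snapshot is effectively “the batch generated at time T”. A rough sketch using RDF.rb (assuming a recent version with named-graph support) and the rdf-vocab gem’s PROV terms; the batch URI and the source are invented for illustration.

require 'rdf'
require 'rdf/vocab'   # assumption: provides RDF::Vocab::PROV and RDF::Vocab::FOAF
require 'date'

# Invented name for one import batch
batch = RDF::URI('http://alphajuliet.com/graph/import-2013-05-01')

repo = RDF::Repository.new

# Imported triples go into the named graph for this batch
repo << RDF::Statement.new(
  RDF::URI('http://alphajuliet.com/ns/contact#person-1234'),
  RDF::Vocab::FOAF.name,
  'Jane Smith',
  graph_name: batch)

# Provenance about the batch itself: when it was generated and where it came from
repo << [batch, RDF.type, RDF::Vocab::PROV.Entity]
repo << [batch, RDF::Vocab::PROV.generatedAtTime, RDF::Literal.new(DateTime.now)]
repo << [batch, RDF::Vocab::PROV.wasDerivedFrom, RDF::URI('http://www.linkedin.com/')]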

Updating triples

Another question is around the mechanics of updating triples. At the moment I clear the store and do a full ETL from Contacts each time, but that’s clearly not workable longer-term. If something changes, I want to be able to insert a new triple and appropriately handle the old one by deleting it, changing it, or adding additional triples to resolve any conflict. That requires me to know what triples are already there. I can see an involved solution requiring me to query the existing store for relevant triples, determine the steps to do the update, and then apply the updates to the store. The SPARQL Update spec provides for conditional deletes but I need to work it through to see how to do it. There’s a parallel here to the problem of maintaining referential integrity in a relational database.
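
The likely shape of it is a single DELETE/INSERT WHERE operation: match the old triple, delete it, and write the new one in the same request. A sketch of moving the earlier example contact to a new (invented) employer; the prefixes match the Turtle shown above, and the whole string would be POSTed to the store’s SPARQL Update endpoint with Content-Type application/sparql-update.

# A conditional update: the DELETE and INSERT only fire where the
# WHERE pattern matches, i.e. where the old membership triple exists.
# :org-another-company is an invented new employer.
UPDATE_EMPLOYER = <<-QUERY
  PREFIX :    <http://alphajuliet.com/ns/contact#>
  PREFIX org: <http://www.w3.org/ns/org#>

  DELETE { ?m org:organization :org-example-corporation . }
  INSERT { ?m org:organization :org-another-company . }
  WHERE {
    ?m a org:Membership ;
       org:member :person-6D9E0CBF-C599-4BEC-8C01-B1B699914D04 ;
       org:organization :org-example-corporation .
  }
QUERY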

Of course, these are all very answerable questions; I just haven’t got there yet or seen existing solutions. Updates in later posts.

Future work

It’s still developing, and a long way off being useful. There’s also a bunch of related technologies I want to play with over time. Amongst the backlog, in no particular order…

  • Add some more useful queries and visualisations
  • Include hyperlinks in the returned data so I can start browsing it
  • Link to information in public stores such as DBpedia, i.e., real Linked Data
  • Set up a continuous deployment chain using VMs and Puppet, maybe on AWS for fun
  • Import LinkedIn connection data
  • Add provenance metadata with PROV
  • Add more specific contact relationship information with the REL vocabulary
  • Leverage other ontologies such as schema.org and NEPOMUK
  • Look at using reasoners or inference engines, such as Pellet or EulerSharp

References

Apart from all the links, a few good pointers for learning more.

On the cloud

cloud

The IT world is in a period of dramatic transition. That may be trite to say when it’s been in transition all its life, but things are shifting more rapidly. The evolution and adoption of new technologies is also raising businesses’ expectations of being able to execute on new business models faster, reduce investment costs, and attract and keep customers.

The usual suspects are at play here, the ones you keep hearing about: cloud, data analytics, mobile, social networking, etc.

The globalisation of IT through XaaS cloud services (X being whatever you want these days) is forcing commoditisation and a necessary maturing of the market to deliver the required features and service levels at acceptable cost. That’s not a bad thing for either cloud buyer or seller. If you consider the vertical technology stack of (from the bottom) data centre, hardware, networking, infrastructure, platform, middleware, applications, business process, and business service, it’s clear there are huge opportunities for all sorts of vertical and horizontal integration plays, niche players, value-added services, and consolidation. It’s easy to believe that most enterprise services for all but the largest organisations will be outsourced within five years. How that plays out for large shrink-wrapped software vendors is a major question, but I can’t see the advantages of buying large suites of software as capex over a pay-as-you-go opex model surviving the market shift to increasingly trusted (and secure) clouds.

It’s a classic disruptive play. Clouds may have been immature once, but they will increase share as they grow upwards, pushing the big players into alternative models or premium niches. Companies don’t want servers; they want a business enabler.

Cloud will also therefore change the way business is done. The pace of business will increase. Markets will evolve faster. Three-year or even financial-year strategic plans won’t survive sustained contact with the market. The pace of change will only be constrained by the rate that we ourselves can digest change, and perhaps technology will have a solution for that too.

But I’m jumping ahead. “Cloud” right now is increasingly about service delivery, ITIL-style. SLAs will become the mantra of those who pay for services, cloud or otherwise. The challenge has already been articulated elsewhere: how do IT departments maintain their relevance as their company’s legacy systems migrate into the ether, and the business starts buying their own IT per user per month?

 

 


On the casual workplace

Image

Before I left my last job, I moved interstate and the company set up a local workplace in a serviced office. We had a room with four desks and a separate meeting room, hosted by one of the dominant providers. Simple and effective, but also spartan and completely soulless. While that certainly wasn’t why I left, the thought of working in that office for the foreseeable future wasn’t uplifting. At best, a spartan office encourages focus by lacking distractions, but, as humans, we like environments that reflect our personalities and perhaps even our imperfections.

Now take cafés. I love a good one. Not coincidentally, I find that 80% of my work meetings, both internal and external, take place over a coffee in a café. It removes some formality, which encourages more open discussion, and can build better relationships. Sometimes there are almost back-to-back coffees with people through a day. Over-caffeination is a real risk. I also find a cafe an excellent place to work solo, away from office interruptions, and I find my best creative or strategic thinking seems to come from such an environment.

So, the idea… a serviced office in a café. You turn up, maybe to a reserved table, and pay a cover charge to the café, maybe $20 an hour. For that you get decent wi-fi, access to a wireless printer, a power outlet for your laptop or phone charger, and a clean bathroom. The music is discreet, and you order drinks and food separately, maybe with a minimum spend. Conduct your business, meet associates and customers, do your work, all in a relaxed environment and without the need to constantly pay your way with coffee and sandwiches. It’s not a substitute for a corporate office, but I suspect it’s good enough for many, particularly solo workers or mobile sales reps.

Maybe it exists already, it’s just not here yet. Either way, I see opportunity.
