RESTful Web Services for your Oven
There has been much discussion surrounding RESTful web services as a means of defining architectural best practices. Rails (http://rubyonrails.org) was a strong catalyst for the convergence of a specific naming standard, choosing convention over configuration. By creating classes with specific names, we can match instances of these classes to records in similarly named tables. The Device model can be automatically mapped to rows in the devices table, and as long as the names follow the convention exactly, everything just magically works. This decision to adopt what is known as the ActiveRecord pattern has influenced the best practice for object modeling and serialization. However, it may not always be the best strategy. As we build web services, it’s important to consider the architectural implications of one design over another. Let’s explore some options.
Case Study: Web Service for Oven
If we’re to consider a web service, let’s push the envelope a bit to include a real-world example that might arise in the next few years. As of this writing, there are light bulbs on the market that can be controlled by web service requests over Wi-Fi. Philips has published the Hue API, which allows developers to build apps that interact with the light fixtures to change things like color and brightness. It won’t be long before there is a product on the market that allows a web service client to adjust the temperature of the oven in your home via Wi-Fi. You’ll probably open an app on your mobile device to monitor data from the oven, adjust cooking temperature and time, and maybe even define a temperature profile to cook at 250°F for 15 minutes and then 450°F for 1 minute. For ovens that support it, we could turn the convection system on or off at specific times during the baking cycle.
So, what does that look like in web service parlance?
Strategies
There are several distinct approaches to achieving the desired result. The one we choose in any given situation must be dictated by the relative merits of applying the technique to its unique scenario.
1. Action-Based GET Command
GET /bake?temp=350&time=30&preheat=true
What we’re saying here is that bake is a special action, that the target temperature is 350 (presumably Fahrenheit), cook time is 30 (presumably minutes), and the oven should preheat to target temperature before starting the cook timer. That’s a lot of assumptions, but these parameters are typically well documented in the provider’s API specification. For standards that find their way into consumer electronics, there is a document somewhere on the web that explains the options and their possible values, if any.
This sort of strategy works well for tasks simple enough to be described by a small number of input parameters. The task is expected to begin right away, and no response value is expected. When the task completes, the oven returns to its idle state. Note that this strategy works only when the oven behaves like a sequential queue. There is one oven and it can cook one thing at a time. A separate task opens the door, removes the cake, and loads the pan for the next cake into the oven. In all likelihood, that would be a human task.
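As a rough sketch of how a service might interpret the action-based request, the snippet below parses the URL into a command. The function name parse_bake_request and the keys temp_f and minutes are hypothetical, and the units are the same assumptions made in the text (Fahrenheit and minutes), not part of any published oven API.

```python
from urllib.parse import urlsplit, parse_qs

def parse_bake_request(url):
    """Turn an action-style URL into a plain command dictionary."""
    parts = urlsplit(url)
    if parts.path != "/bake":
        raise ValueError("unsupported action: " + parts.path)
    query = parse_qs(parts.query)
    return {
        "temp_f": int(query["temp"][0]),      # assumed to be Fahrenheit
        "minutes": int(query["time"][0]),     # assumed to be minutes
        # preheat is optional in this sketch and defaults to false
        "preheat": query.get("preheat", ["false"])[0] == "true",
    }

command = parse_bake_request("/bake?temp=350&time=30&preheat=true")
print(command)
```

Everything the oven needs arrives in one request, which is why this style suits simple, fire-and-forget tasks.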
2. Timer-Based Request Sequence
After 0sec PUT /ovens/1.json {"oven":{"target_temperature":350}}
After 1800sec PUT /ovens/1.json {"oven":{"target_temperature":null}}
This more closely follows the Rails style. We define a specific instance of an Oven with /ovens/1. We specify the communication language with the format extension, in this case json. We send a namespaced attribute hash, so the service can easily map the content body to an object in the service’s object model. With the class name matching the root key in the hash, the deserializer can magically find the actual oven for the given id (defined in the path, in this case 1) and update its state with the given attributes.
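To make the mapping less magical, here is a minimal sketch of what such a deserializer might do, written in Python rather than Rails. The Oven class, the OVENS table, and the REGISTRY lookup are all hypothetical stand-ins for the service's object model.

```python
import json
import re

class Oven:
    def __init__(self, id):
        self.id = id
        self.target_temperature = None

# Hypothetical in-memory stand-in for the ovens table.
OVENS = {1: Oven(1)}

# Maps a collection name in the path to (root key, table).
REGISTRY = {"ovens": ("oven", OVENS)}

def handle_put(path, body):
    """Find the record named by the path and apply the namespaced attributes."""
    match = re.fullmatch(r"/(\w+)/(\d+)\.json", path)
    if match is None:
        raise ValueError("unrecognized path: " + path)
    collection, record_id = match.group(1), int(match.group(2))
    root_key, table = REGISTRY[collection]
    attributes = json.loads(body)[root_key]
    record = table[record_id]
    for name, value in attributes.items():
        if not hasattr(record, name):
            raise AttributeError("unknown attribute: " + name)
        setattr(record, name, value)
    return record

oven = handle_put("/ovens/1.json", '{"oven":{"target_temperature":350}}')
print(oven.target_temperature)
```

The path selects the record, the root key selects the class, and the remaining hash is applied attribute by attribute.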
Observant readers will note that this strategy eliminates the possibility of a preheat period. This illustrates a deficiency in the architecture of this strategy. It lacks the ability to react to changes in the state of the systems being controlled. More specifically, this system requires frequent polling to achieve the preheat feature. Attempts to closely approximate this feature would likely result in a log like this:
PUT /ovens/1.json {"oven":{"target_temperature":250}}
GET /ovens/1.json -> {"oven":{"actual_temperature":75}}
GET /ovens/1.json -> {"oven":{"actual_temperature":125}}
GET /ovens/1.json -> {"oven":{"actual_temperature":175}}
GET /ovens/1.json -> {"oven":{"actual_temperature":225}}
GET /ovens/1.json -> {"oven":{"actual_temperature":250}}
PUT /ovens/1.json {"oven":{...}}
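The polling pattern in the log above can be sketched as a small loop. This is only an illustration: the readings are simulated from the log, the 30-second interval is an assumption, and a real client would issue the GET requests where read_temperature is called.

```python
def wait_for_preheat(read_temperature, target, sleep, interval=30, max_polls=100):
    """Poll until the reading reaches target; return the number of polls used."""
    for polls in range(1, max_polls + 1):
        if read_temperature() >= target:
            return polls
        sleep(interval)
    raise TimeoutError("oven never reached target temperature")

# Simulated readings taken from the log; no network calls are made here.
readings = iter([75, 125, 175, 225, 250])
waits = []  # records each sleep instead of actually sleeping
polls = wait_for_preheat(lambda: next(readings), 250, waits.append)
print(polls, sum(waits))
```

Every poll is a round trip whose only purpose is to learn that nothing has changed yet, which is the cost of a design that cannot react to state changes on its own.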
3. Task-Based Approach
POST /profiles.json {"profile":{"oven_id":1}} -> {"profile":{"id":123,"oven_id":1,"started_at":null,"ended_at":null}}
POST /profiles/123/items.json {"item":{"temperature":250,"delay":0}}
POST /profiles/123/items.json {"item":{"temperature":null,"delay":1800}, "condition":{"min_temperature":250}}
GET /profiles/123/start.json -> {"profile":{"id":123,"oven_id":1,"started_at":"201303131234.567Z","ended_at":null}}
This is a nice approach when you need to schedule things in advance or if you have complex sequences involving many set points over time. The inclusion of the condition parameter enables the preheat feature. It’s worth mentioning here that this approach implies a sophisticated web service interface to the oven.
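One way to picture how the oven side might execute such a profile is the simulation below. It assumes that an item's delay only starts counting once its condition holds, that the oven heats at a fixed rate per step, and that the condition is stored alongside each item. All of these are illustrative assumptions, not behavior defined by the requests above.

```python
def run_profile(items, ambient=75, heat_rate=25, step=60):
    """Simulate a profile and return (seconds, set_point) events.

    heat_rate is degrees gained per step while a set point is active.
    """
    clock, actual, target, events = 0, ambient, None, []
    for item in items:
        min_temp = item.get("condition", {}).get("min_temperature")
        waited = 0
        while waited < item["delay"]:
            # The delay only counts down while the condition is satisfied.
            ready = min_temp is None or actual >= min_temp
            clock += step
            if clock > 24 * 3600:
                raise TimeoutError("condition was never satisfied")
            if target is not None:
                actual = min(target, actual + heat_rate)
            if ready:
                waited += step
        target = item["temperature"]
        events.append((clock, target))
    return events

items = [
    {"temperature": 250, "delay": 0},
    {"temperature": None, "delay": 1800, "condition": {"min_temperature": 250}},
]
events = run_profile(items)
print(events)
```

In this simulation the oven spends 420 seconds preheating before the 1800-second timer begins, and the client never has to poll.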
In Stores Soon!
In order to support the kind of command capabilities described here, the appliance manufacturers will rely on industry standards. That means the commercial sector will largely dictate the direction and evolution of these standards. We need to build some rather sophisticated command and control systems, following RESTful best practices, so the appliance manufacturers can have something to integrate into their products. If there is no clear standard, they’ll lose their way. In the worst case, we end up with a widely adopted system no one can actually use, or we replay the Blu-ray vs. HD DVD battle.
Most likely, a handful of home automation control system manufacturers will emerge with controls that integrate with “smart appliances” (those that support early drafts of a standard). These home automation companies will drive the development of a standard architecture, in much the same way that web browser companies drive the evolution of the HTML and CSS standards. In any case, it’s an interesting way to think about configuring the settings on your oven.
Endnote: a quick shout out to Justin Davis (@jwd2a), founder of Madera Labs (http://maderalabs.com), for the topic. His tweet about RESTful web services to control household appliances is what inspired this post.
Threshold Dynamics: Supply-Driven vs Demand-Driven
In a perfect world, we design for an isolated set of conditions, call it the “equilibrium state,” and use a linear first-order system response as an approximation to model the actual system behavior. That works for many systems, where a single effect dominates the force balance. However, many systems cannot be adequately described with a linear first-order model. bus+ is one such system, as it requires many vehicles to provide infrastructure support for many passengers, additionally requiring location-based matching.
Consider three cases from the passenger’s perspective:
- Multiple candidate vehicles available
- One candidate vehicle available
- Zero candidate vehicles available
In the first scenario (multiple candidate vehicles), the primary challenge is filtering candidates and matching them to passengers efficiently. This system is driven by supply dynamics. For any one passenger (consumer), several candidate vehicles (producers) must compete for the fare (fee for service). bus+ defines a mechanism to govern the matching process fairly and effectively compensates for scale. This solution works well for this scenario, but it is overkill for the second scenario.
The second scenario (one candidate vehicle) is the simplest case. The matcher found a single candidate vehicle, so the standard bidding process is skipped. The passenger is assigned to the vehicle immediately.
In the third scenario (zero candidate vehicles), the system is inverted, instead driven by demand. As passenger demand surpasses vehicle availability, some process must govern the assignment of passengers that are waiting for vehicles to become available. This means introducing a location-sensitive queueing system to manage passengers. It also means adding a real-time human resources paging system to enable automated demand-driven notification of “on call” drivers – those who are not “on duty” but also not in a “do not disturb” state.
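The branching between the three scenarios can be sketched as follows. This is not the bus+ implementation: the dispatch function, the run_bidding callback, and the plain first-in-first-out queue are assumptions, and the location-sensitive queueing and driver paging described above are left out.

```python
from collections import deque

waiting = deque()  # passengers waiting for a vehicle (plain FIFO in this sketch)

def dispatch(passenger, candidates, run_bidding):
    """Return the assigned vehicle, or None if the passenger was queued."""
    if len(candidates) > 1:
        # Supply-driven: several vehicles compete for the fare.
        return run_bidding(passenger, candidates)
    if len(candidates) == 1:
        # Simplest case: skip bidding and assign immediately.
        return candidates[0]
    # Demand-driven: no vehicle available, so the passenger waits.
    waiting.append(passenger)
    return None

def pick_first(passenger, candidates):
    # Placeholder bidding rule: choose the alphabetically first vehicle.
    return min(candidates)

print(dispatch("p1", ["v2", "v1"], pick_first))
print(dispatch("p2", ["v3"], pick_first))
print(dispatch("p3", [], pick_first))
```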
All three scenarios serve similar goals – matching passengers to vehicles. Each scenario represents a unique puzzle, governed by very different rules. The system as a whole must be able to satisfy all of its operating parameters, spanning all three scenarios. As the design and architecture of bus+ matures, we’re delighted to find elegant and compelling solutions to these evolving challenges. I’m pleased to share those solutions with others, in hopes that we might all benefit and contribute positively to this universal problem.
Device Testing for Complex Geo-Sensitive Systems
I was demonstrating bus+ at a party this weekend, and I encountered a problem that only really comes up in special cases. Since I specialize in solving hard problems, special cases are the norm in my daily work. In the case of bus+, there is a passenger app and a driver app. Passengers request point-to-point transport. Automated dispatch matches passengers to drivers. Drivers pick up and drop off passengers. When there are multiple apps for different kinds of users, all coordinating together within a system, it can be tricky to do device testing. If, for example, one app must be in the foreground at the time a message is received in order to function correctly, it can be a challenge to test two or more of those apps on the same device. One app is foregrounded to initiate the producer action. Then, the other app must be quickly brought to the foreground to handle the message from the producer. The consumer app must then be backgrounded in favor of the producer app, where the response message from the consumer is handled. This makes for an ultimately ineffective demo.
There is also a bigger problem of geo-sensitivity. It is physically impossible to test all the bus+ features on a single device. Once a passenger is assigned to a vehicle, we use geo-fencing to detect pickup and drop-off and to notify the passenger of the arrival time of their ride. As the vehicle moves, it eventually crosses a virtual boundary in geo space, which triggers an alert to the passenger, saying “I’m 5 minutes away.” If the driver app is running on the same device as the passenger app, the vehicle will be detected as “arrived” immediately upon assignment, resulting in a flurry of messages that are not representative of real world app behavior. The only way to test the full behavior is to use separate devices. Then, the problem changes from a technical one to a social one.
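A minimal sketch of a circular geo-fence check shows why a single device cannot exercise this behavior. The haversine distance is a standard formula, while the inside_fence helper and the 1000-meter radius are assumptions for illustration rather than bus+ code.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two lat/lng points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lng2 - lng1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def inside_fence(position, center, radius_m):
    """True when position lies within radius_m of the fence center."""
    return haversine_m(position[0], position[1], center[0], center[1]) <= radius_m

passenger = (0.0, 0.0)
print(inside_fence((0.0, 0.02), passenger, 1000))  # vehicle a couple of km away
print(inside_fence(passenger, passenger, 1000))    # driver app on the same device
```

When both apps share a device the distance is zero, so every fence is satisfied the moment the assignment is made.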
So far, the best solution I’ve found to multiple device testing is to reach out to developer communities or friends and family to find willing participants. Realistically, if you can’t find ten people willing to test your system, you’re probably not addressing an urgent market need. If you can find people who are your target market who are also willing to participate in a private ad hoc beta, you’ll be that much closer to proving your concept and taking it to market. Regardless of your audience during private testing, you’ll need to coordinate with your test users to make sure they know what you expect of them. If, as in the case of bus+, your system involves user actions outside the app (driving somewhere to pick up someone they may not know), you need to be very clear up front that you need more from them than just a few minutes messing around on their mobile device. We’re nearly to this point with bus+, so I’ll need to design a coordination plan for my test users, so they’re all working together. If someone is testing passenger workflow when no one is actively on duty with the driver app, the passenger is doomed to disappointment. By communicating with the test team, we can minimize the total wall clock time we need to find the broken bits and fix them.
Ultimately, there’s a leap of faith the developer must take when publishing this kind of app. The first day we sell our driver app, we expect to see a lot of email traffic from drivers wondering why the app doesn’t do anything after they set it up. For this app, the driver and passenger apps will hit the App Store at the same time. However, there will definitely be a period of time when passengers are disappointed because there are no vehicles in their area yet. We’ll use email campaigns to mitigate as much of the disappointment as we can, but we expect to see low ratings and bad reviews until the early adopters populate the ghost town. We can also counter the ghost town effect with targeted marketing and strategic partnerships. My hope is to approach Super Shuttle, Yellow Cab, and county transit officials to join as pre-launch partners, so we have sufficient user base on the driver side to meet the sustained demand we’ll see from the passenger side.
Information Architecture Diagrams
I’m a huge fan of UML. While I haven’t really written much about it, I have used UML for many years to visualize complex interactions between systems in time and space. UML provides a set of diagrams that encapsulate key perspectives of a system. Use case diagrams illustrate the ways a user might interact with the various components that make up a system. Object diagrams show a detailed definition of the information being manipulated and how it relates to other information. It’s nice to see a visual representation of time and sequence in activity diagrams. These graphical reminders give people an anchor to stabilize their understanding of a system described in words elsewhere. However, none of the diagrams in the UML toolkit shows how everything fits together. For that, we need a new format, which I call an information architecture (IA) diagram.
This is not a new term, but I believe it is new in this context. There are three key aspects of an IA diagram – object model, user stories, sequence. When systems have multiple actors, all working together to achieve some task, it is essential that the architect maintain a keen awareness of choreography. In dance, someone must coordinate all the dancers to work in harmony, or the lead ballerina may take an embarrassing fall, kicked in the face by her partner’s badly timed arabesque. Similarly in a producer-consumer model, we normally pay for goods before we transport them home. I doubt many retailers would be fond of a business model where the purchase was made on delivery, especially since most shoppers do not have point-of-sale card swipe equipment in their homes. Order matters. It’s important that each actor take appropriate action at appropriate time along a sequence. The goal of an IA diagram is to distill the critical interactivity idioms into one representation, illustrating the point in time of each action, the actor, and the information being manipulated.
Actions (CRUD operations on RESTful endpoints) originate from the actor timeline and point to their appropriate object. The single and double arrows in the object model represent belongs-to and has-many relationships, respectively. An envelope symbol pointing to an actor timeline is used to denote asynchronous notification, optionally with a label. Notifications stemming from an object represent on-create, on-update, etc., event triggers. As with the systems represented in an IA diagram, a text description of the diagram itself cannot do justice to the glory that is a well-executed example. Without further ado, I present the following sketch of the passenger workflow from bus+ as self-evident.
My most significant inspiration for developing this diagram technique comes from Edward Tufte, a master craftsman in the art of visualization. His book, The Visual Display of Quantitative Information, is available on Amazon. It is well respected as one of the most influential texts in the field.
Geospatial Visualization & Interactivity Techniques
Lots of apps use search and browse features to expose large catalogs of useful information. In recent evolution of the market, this information has begun to include geospatial content, joining the location and time stamps along with other data. It could be a location-tagged picture or status update. It could be some traffic data, alerting you of specific roads to avoid at specific times. In the retail space, we’ve encountered a challenging intersection of technology, interactivity, and human behavior.
Background
I’ve been developing the infrastructure and mobile and web apps for Rakiteer off and on for about a year and full-time for the last three months. In a nutshell, we tag sale items with a location and time stamp, along with details about the product and make them available to consumers for purchase through their mobile device. You can also think of this as a real-time feed, ordered by time stamp, giving consumers a stream of deals as they’re discovered. Remember, these are deep discount deals on scarce items, usually no more than a few units in stock, not a product line available everywhere in abundance. The challenge: organizing data by relevance in time and space.
Solution 1
At first, we believed there were only two modes people might want to use – nearby and worldwide. We know that the majority of users will want to see things near their current location. Shopping is a largely impulse-driven emotional decision process. People are far more interested in deals in local shops than they are in deals across the country. A sure-fire way to guarantee user disappointment is to show them awesome deals on things they love in stores thousands of miles away. Fortunately, the nearby filter is pretty easy. We use a bounding quadrangle (four lat/lng points or a center point and grid size) to constrain in space, and the results are ordered in time and paginated in groups of a hundred or fewer. Small result sets can even be ordered by point-to-point distance, as the square root required to compute the distance is manageable if you only need to calculate fifty values, but that sort action happens on the device, not on the server. For the worldwide mode, though, we found ourselves questioning the relevance and value to the consumer. We didn’t see any meaningful use of data from all over the world, ordered in time. That seems like watching the global Twitter feed – useless information overload.
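The nearby filter described above might look something like this sketch. The Deal fields, the newest-first ordering, and the flat-plane distance approximation are assumptions, and the box test ignores regions that straddle the antimeridian.

```python
import math
from dataclasses import dataclass

@dataclass
class Deal:
    name: str
    lat: float
    lng: float
    posted_at: int  # Unix seconds

def nearby(deals, south, west, north, east, page=0, per_page=100):
    """Server side: constrain by bounding box, order by time, paginate."""
    inside = [d for d in deals if south <= d.lat <= north and west <= d.lng <= east]
    inside.sort(key=lambda d: d.posted_at, reverse=True)  # newest first
    start = page * per_page
    return inside[start:start + per_page]

def by_distance(deals, lat, lng):
    """Device side: order a small result set by approximate distance."""
    def distance(d):
        # Scale longitude by cos(latitude) so east-west and north-south
        # degrees are roughly comparable over short ranges.
        dx = (d.lng - lng) * math.cos(math.radians(lat))
        dy = d.lat - lat
        return math.sqrt(dx * dx + dy * dy)
    return sorted(deals, key=distance)

deals = [
    Deal("a", 39.0, -76.0, 100),
    Deal("b", 39.5, -75.5, 200),
    Deal("c", 45.0, -90.0, 300),
]
page = nearby(deals, south=38, west=-77, north=40, east=-75)
print([d.name for d in page])
print([d.name for d in by_distance(page, 39.0, -76.0)])
```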
Let’s take a step back for a moment. What does nearby mean? Well, we normally think it means near our current location, but it can also mean near a different location. As we fold that into our understanding of the user’s needs, we begin to appreciate something else. Maybe the user will want to look for items near a place they plan to be at some later point in time. Let’s say I’m planning to visit my family in Maryland for the holidays. I know that Delaware has no sales tax, so I might want to discover deals within an hour’s drive of my grandparents’ house, so I can take advantage of the tax haven. With the nearby/worldwide toggle, I can’t find this information until I’m already there, and by the time I get there, I’ve probably forgotten I was going to go shopping.
Solution 2
After realizing the nearby/worldwide mode switch doesn’t really work as well as we thought, I set out to find a new interactivity idiom to meet our needs. Ironically, it was a user complaint that crystallized the direction we took. Someone complained that the app was not showing them any results. As we walked through the guide questions to help us understand what the user’s expectations were, it became clear that the user had declined to grant our app access to read their current location. It had simply never occurred to us that someone might decline that service. By design, the app is pretty much useless if you don’t allow location services. After this experience, I knew we had to find a solution that worked even when the user declines location services. In the end, I discovered a far superior solution that does so much more to help the user find value than we ever could have achieved with the previous solution.
Now, the user selects the search region by navigating the visible bounds of a map. This provides a simple, intuitive control that clearly defines the geospatial bounds of the search, while also providing immediate feedback about whether the current region contains any results. At all zoom levels, we display a translucent solid color overlay indicating rectangular areas of the map where there are deals. When the map is zoomed in, we display the locations of deals as pins, telling you exactly where those items are relative to your current location. With this design, the user never needs to guess an area where there might be some valuable data. They simply zoom the map out a bit and look for red regions. At the moment, we’re only using a single color, but the goal of a heat map is to indicate a continuum of content, where bright spots are highly active data-rich areas and dark spots are areas with stale data. We’ll eventually use shades of color to represent relative densities between grid cells, but thanks to our architecture, we’ll have a lot more options for visualizing the information in a compelling and useful overlay.
Infrastructure & Performance
This innovation has dramatically improved usability and user confidence in our brand. It also largely eliminates the “ghost town” effect of location-sensitive apps, where the app shows no data for your area because you’re the first to use it there. But, observant readers will probably be saying “uh, you can’t index the internet on your user’s device, so how are you distributing that data?” Let’s dig into how we’ve achieved our scalable heat map solution.
My first foray into heat maps was in building one for Tour Wrist. We encountered much the same problem with Tour Wrist as we do with Rakiteer. How do we help users find results they want, especially when the data itself is not easily searchable? Sure, we used tagging and schema-defined meta data to describe each tour, but there is a lot of unsearchable image data there. More importantly, the location problem persists. If you’re looking at a nationwide map view, you can’t show all the pins for all the locations in the country, and paginating the list means cutting out a lot of data. A heat map is the answer.
Heat maps visualize density information. By slicing the world up into a discrete grid, we can easily represent aggregated data density information with colored overlays. When I built the heat map for Tour Wrist, I naively chopped up the world into a 1° grid by latitude and longitude. That makes a 180×360 grid, or 64,800 cells. The heat map aggregation system is simple. Any time a record is created, we check to see which grid cell contains the location stamped on the record. We increment a counter on that grid cell. If a record is destroyed, we decrement the counter. Tour Wrist hosts 3D panoramic photos, called tours, from photographers all over the world, so the grid needed to span the entire globe. For Rakiteer, we’re focusing on America for now, so we didn’t need the global grid. Also, for Tour Wrist, we used an eager loading approach, where we create all the cells for the whole world up front and then update the records as data flows through the system. For Rakiteer, we use a revised lazy loading approach, only creating a grid cell when there’s data to represent there. This is an important distinction, as it eliminates the need to filter out records with zero values when rendering the overlay.
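The lazy aggregation scheme can be sketched in a few lines. The HeatMap class and its method names are hypothetical, and removing a cell when its count reaches zero is one assumed way to keep zero-valued cells out of the data; the text does not say how emptied cells are handled.

```python
import math

class HeatMap:
    """Sparse grid of counters keyed by (lat_index, lng_index)."""

    def __init__(self, size_deg=1.0):
        self.size_deg = size_deg
        self.cells = {}  # only cells that contain data are stored

    def cell_for(self, lat, lng):
        return (math.floor(lat / self.size_deg), math.floor(lng / self.size_deg))

    def record_created(self, lat, lng):
        key = self.cell_for(lat, lng)
        self.cells[key] = self.cells.get(key, 0) + 1  # lazily creates the cell

    def record_destroyed(self, lat, lng):
        key = self.cell_for(lat, lng)
        remaining = self.cells.get(key, 0) - 1
        if remaining > 0:
            self.cells[key] = remaining
        else:
            self.cells.pop(key, None)  # assumption: drop cells that reach zero

heat = HeatMap()
heat.record_created(28.5, -81.4)
heat.record_created(28.9, -81.1)   # same 1-degree cell as the first record
heat.record_created(40.7, -74.0)
heat.record_destroyed(40.7, -74.0)
print(heat.cells)
```

With a 1-degree cell size the eager approach would allocate all 180 × 360 = 64,800 cells up front, while this sketch stores a single cell for the data above.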
So, now we have an elegant data structure for storing the information efficiently on the web service side. We still need to convey that data to every wireless client in our app’s ecosystem. We can have the map overlay render based on a web service call, allowing us to construct the overlay showing only data for the selected region. That would seem to make sense, since you’d want to minimize the information you need to send from server to client in order to have sufficient data to render the overlay. However, it really makes more sense to sync the entire heat map data set to the device, so you can render the map without needing to make a slow web service call in the process. If you already have all the data locally, you can render quickly.
We sync the heat map using a progressive sync technique. The first time a user launches the app, we sync the entire collection of grid cells and store the time stamp of the request. Then, every subsequent time the heat map syncs, it only requests grid cells that have changed since the last sync. We can do this because each grid cell maintains a record of its creation date and last update date. This allows the web service to maintain current copies of a large sparse data set on every user’s device efficiently. Since the heat map data is updated in real-time with the creation/destruction of the data being aggregated, the user’s heat map data is always kept current.
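Here is a minimal sketch of the progressive sync idea, using in-memory objects in place of the web service and the device. The class names, the integer timestamps, and the changed_since method are assumptions for illustration.

```python
class HeatMapServer:
    def __init__(self):
        self.cells = {}  # key -> {"count": int, "updated_at": int}

    def touch(self, key, count, now):
        """Record a new count for a cell along with its update time."""
        self.cells[key] = {"count": count, "updated_at": now}

    def changed_since(self, since):
        """Return every cell on first sync, otherwise only cells updated at or after since."""
        return {
            key: cell for key, cell in self.cells.items()
            if since is None or cell["updated_at"] >= since
        }

class HeatMapClient:
    def __init__(self):
        self.cells = {}
        self.last_sync = None  # None means the full collection is requested

    def sync(self, server, now):
        changes = server.changed_since(self.last_sync)
        for key, cell in changes.items():
            self.cells[key] = cell["count"]  # applying a change twice is harmless
        self.last_sync = now
        return len(changes)

server = HeatMapServer()
server.touch((28, -82), 2, now=10)
server.touch((40, -75), 5, now=20)

client = HeatMapClient()
first = client.sync(server, now=100)    # full sync
server.touch((28, -82), 3, now=150)
second = client.sync(server, now=200)   # only the changed cell
third = client.sync(server, now=300)    # nothing new
print(first, second, third)
```

After the first launch, the payload is proportional to what changed rather than to the size of the grid.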
By considering the big picture of how the data is consumed, how the user interacts with it, and the performance impact of distribution, we were able to find an elegant, scalable, high performance heat map solution that integrates very easily with any web service and/or MapKit app. We are planning to package the heat map code into a gem and publish it sometime this year, so other Rails developers can leverage our work to bring value to their users.
What Does Mitt Mean? A Social Experiment
I’d like to preface all of this with a simple assertion. My political beliefs have no influence over the content of this post. I merely provided a tool for others to share their feelings. The entire experiment was done anonymously, despite substantial additional data mining opportunities with authenticated sessions.
At around 10:30am this morning, I had an idea. There’s been so much conversation in news and talk shows in America about the vagueness and lack of specificity in Mitt Romney’s campaign rhetoric. I thought to myself, what does he mean? Then, because my brain is highly susceptible to tangential effects, I thought about what the word “mitt” means, how many possible interpretations there are, just of that word alone. As I sat there, expanding on all the various words that came to mind when thinking of this one word, I began to realize that lots of other people must be having similar thoughts. I especially enjoyed http://romneytaxplan.com, so I thought I’d try my hand at achieving a similar goal, but collecting interesting data about people’s raw emotional reaction to the word “mitt.”
And thus, http://whatdoesmittmean.com was born. I checked on the domain. It was available. I bought it immediately, and set about to build an app that would capture the gut reactions people have. It is designed simply, with the intention to minimize bias in the user’s response. I considered showing pictures of Mitt Romney or oven mitts or baseball gloves or kittens or cute little girls playing in the snow. In the end, though, I concluded that any visual cue would result in a strong perceptual bias. Given the point of the experiment – a measurement of unfiltered reaction to a word – it felt best to bound the perceptual model with a short and highly vague prompt.
What Does Mitt Mean?
With just that one prompt, we invite users to enter some things that come to mind when they think of that sentence. We even give a placeholder in the text field, saying “first thing that comes to mind,” hoping to inspire a more visceral experience. No time to think, just say the first thing that comes to you.
Right now, we still have only a small data set from the few dozen folks who have visited the app and contributed. As time passes and I continue to promote the experiment in social media over the next few weeks, I expect to improve on the presentation of data. There will be a word cloud and maybe a top ten words list. I’m open to suggestion, too, so if anyone has an idea for different interesting ways to represent the information we’re collecting, please comment on this post. Also, please tell your friends. I’ll be adding social media sharing widgets to the app shortly to make it easier to share it. If you care about the answer to that question, please spread the word. As more people see the question, we have more opportunity to collect answers. Those answers help us better understand each other and our general impression of Mitt Romney.
Endnote: I love the fact that I can go from absolutely nothing, through concept, to an app running on a new domain in under 2hrs. That’s the power of Rails and Heroku.
Cost-Efficient Web Service Hosting with Heroku, A Case Study
We use Heroku for all web service hosting for our projects. It’s free to get started, and the free service level offers substantial performance. One of the most common concerns I hear from clients and venture partners is the fear that hosting costs will eventually be prohibitively expensive, as the system scales. I’d like to offer the following case study as evidence of the actual cost per performance.
Case Study: Uptimetry
We’re very proud of our URL monitoring service, Uptimetry, which we’ve been operating for about 18 months. It provides a simple service of polling web resources regularly to verify availability and validate content. Last month, the Uptimetry system processed a little over two million requests. Using the New Relic performance monitoring tool (a free add-on), I can gain visibility into all sorts of interesting metrics. For the purposes of this study, we’ll focus on the web transaction reporting features. Using this tool, I was able to determine the average request response time and the average CPU usage required to process all the requests. Uptimetry has an average response time of about 100ms and uses about 10% CPU. This is with a single server instance – what Heroku calls a “dyno.” The first dyno is free, and you can scale up to a max of 100.
It follows logically that a single dyno could handle twenty million requests per month, since two million requests use only about 10% of its CPU, assuming the average response time remains constant at 100ms. If we stretch our assumption to maintain the response time as the system scales, the upper bound of capacity would be 100 dynos running 100% CPU. That works out to two billion requests per month for a price of about $3600/mo, or about $1.80 per million requests. Of course, this number is based largely on the 100ms response time, which is fairly aggressive, even by 2012 standards. Our conservative target is typically 500ms. I can’t overstate how happy I am to be able to claim actual performance that is 5x better than our target. I’m even happier to say it costs us nothing to deliver that.
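The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted in the text and assumes capacity scales linearly with CPU usage and dyno count.

```python
requests_last_month = 2_000_000   # observed on a single dyno
cpu_percent = 10                  # observed average CPU usage
max_dynos = 100
monthly_cost_usd = 3600           # approximate price quoted for 100 dynos

# Linear extrapolation from 10% CPU to 100% CPU on one dyno.
per_dyno_capacity = requests_last_month * 100 // cpu_percent
total_capacity = per_dyno_capacity * max_dynos
cost_per_million = monthly_cost_usd / (total_capacity / 1_000_000)

print(per_dyno_capacity)   # requests per month on one dyno
print(total_capacity)      # requests per month on 100 dynos
print(cost_per_million)    # dollars per million requests
```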