Saturday, June 22, 2013

SDN: which approach, network or IT?

Introduction


After many presentations about SDN by different types of vendors, I came to the conclusion that there are two distinct approaches to SDN: the network approach and the IT approach. Interestingly, the two approaches lead to the same results from the network perspective, but differ completely in their implementation roadmap and in their impact on the position of network operators in the ecosystem of digital service providers.

Approaches


Network

The network approach generally explains that by using SDN, costs (CAPEX and OPEX) will be reduced for the following reasons:
  • commoditization of the hardware through virtualization = CAPEX reduction
  • automation of tasks by having widespread control of the forwarding plane by the software = OPEX reduction
When asked what equipment is included in SDN, the vendors taking the network approach give the classic list of existing network elements: routers, switches, firewalls, load balancers, protocol converters, base stations, etc. However, they consider the data center resources and the devices to be merely attached to the network, and therefore not part of the SDN story.

These vendors are also very quick to indicate the flaws of the approaches taken by the non-network people or vendors who have an SDN story:
  • the controller will not scale,
  • the forwarding plane to controller round-trip is too slow if applied to each packet.
and therefore they are very convincing when introducing a new layer and new equipment (hopefully on commoditized hardware) in order to cope with these flaws.

IT

The IT approach looks at SDN to solve a data center problem: if new IT resources (computing, storage...) must be made available quickly, the network has to be agile and quick to configure. The network must also be very cheap to build and maintain since, for IT, the network is a cost, meaning that OPEX and CAPEX have to be as low as possible. And since the dynamic allocation of the IT resources is done by software, it makes complete sense to let the software control the network as well.

The equipment included in the SDN story is of course the network elements, but it also includes the IT resources themselves.

The IT approach generally approaches SDN by implementing an overlay network with a distributed controller, controller-to-controller connections, and controllers linked with the already existing distributed brokers (acting as controllers) of IT resources in cloud or hybrid cloud solutions. Implemented in this overlay network is the ability to cache forwarding logic, a familiar pattern already used in many other places (the web would never work if caching didn't exist...).
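
To make the caching argument concrete, here is a minimal sketch of a forwarding element that only asks the controller on a cache miss. The names `CachingForwarder` and `toy_controller`, the (source, destination) flow key and the port names are all assumptions made for illustration; no real controller API is implied.

```python
from typing import Callable, Dict, Tuple

# A flow is identified here by (source address, destination address).
# This key is an assumption for the sketch; real flow keys are richer.
FlowKey = Tuple[str, str]


class CachingForwarder:
    """Forwarding element that consults the controller only on a cache miss."""

    def __init__(self, ask_controller: Callable[[FlowKey], str]) -> None:
        self._ask_controller = ask_controller
        self._cache: Dict[FlowKey, str] = {}
        self.controller_calls = 0

    def next_hop(self, flow: FlowKey) -> str:
        # Only the first packet of a flow pays the controller round-trip.
        if flow not in self._cache:
            self.controller_calls += 1
            self._cache[flow] = self._ask_controller(flow)
        return self._cache[flow]


def toy_controller(flow: FlowKey) -> str:
    # Hypothetical policy: internal destinations leave through port-1.
    _src, dst = flow
    return "port-1" if dst.startswith("10.") else "port-2"
```

With this shape, a thousand packets of the same flow cost a single controller round-trip, which is the point made against the "too slow if applied to each packet" objection. Cache invalidation when the controller changes its policy is deliberately left out of the sketch.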

The vendors taking the IT approach generally don't have much of an opinion on the network approach, or they see it as an opportunity: once SDN is implemented in the network, it will be possible to expand the overlay network.

Implementation results 


What, then, are the different results of implementing SDN using the two approaches?
  • From a network perspective, there is not much difference since it is clear that the network will be cheaper to implement and maintain and will be more agile and quick to configure no matter which approach is taken.
  • From a roadmap perspective, the IT approach is clearly pushing an outside-in roadmap (creation of an overlay -> to the core network), while the network approach generally pushes to address the expensive components first, hence an inside-out roadmap. However, the network approach also implies the implementation of a new layer of distributed controllers. The two approaches lead to two different roadmaps, but it may be possible to create a hybrid roadmap to mitigate the risks.
  • From a provider perspective, the nature of the provider and the existence of the two approaches lead to a dramatic difference of positioning:
    • An OTT (Over The Top) provider will always consider the IT approach with a top-down view including the IT resources, the network and the devices (home/enterprise gateways, terminals), since it addresses key IT problems and opportunities (cost, agility, distribution of the load, distributed software...) and because the network group is generally either nonexistent or a minority within the IT organization. Since an OTT will generally create an overlay network, it is always possible to opportunistically expand it on top of the SDN that the network operator will implement,
    • A network operator will generally consider both approaches from a bottom-up view, unfortunately in separate groups: two groups (network and IT) or even three groups (core network, access and IT) and will generally not include the attached terminals anyway. Most likely the network approach will be picked since it addresses the current problems of doing more with less and because the network vendors are delivering a comfortable message with acceptable disruption.

    This situation is actually very similar to the one the same network vendors delivered a few years ago with IMS, but instead of cost reduction (the new problem "du jour"), it was about innovative services (the old problem "du jour"). However, once implemented, these technologies have left and will leave the network operators in the same bad spot: witnessing a continuously expanding ecosystem of high-value OTT services (including the innovative services IMS was promising) while only seeing an ever-increasing traffic of low-level packets.

    In the SDN case, network operators may be able to save costs (although the constant increase of traffic may offset the savings) but will not be able to be part of the IT supply chain, and will therefore miss the massively distributed cloud movement and its related solutions.

Call to action


As described in my previous blog post, implementing SDN will give network operators an opportunity to change the game and be part of the IT supply chain. However, this means:
  1. eliminating silos (IT, network, access, terminal) within the network operator,
  2. while keeping a bottom-up view, understanding and embracing the top-down view, which includes the distributed software part. Today software solutions are more distributed and dynamic/elastic than the network itself (in the 7-layer OSI model, layer 7 has sub-layers that define a logical network where the nodes are software elements implementing APIs, the links are the relationships between these elements and the packets are the messages/events passed between these elements),
  3. making sure that the vendors (including the software-only vendors) who have an IT approach to SDN are considered and evaluated in a network environment,
  4. pushing the vendors who have a network approach of SDN to embrace the IT approach by:
    • treating any network element as an IT resource (maybe we should have more memory and computing pre-installed in network elements), which:
      • must be capable of running multiple workloads (network workloads and IT/software solution workloads),
      • must be linked to the IT resource broker (cloud or hybrid cloud),
    • treating any IT resource (within data centers or on premise (home, enterprise)) as an active SDN network element, which therefore must be linked to the distributed controllers either to provide or receive control information,
  5. being more prescriptive on the solutions vendors/SI need to provide,
  6. considering any gateway or box installed on premise (home, enterprise) by the network operator as an active SDN network element and an IT resource,
  7. working with the device manufacturers to treat any subsidized device as an active SDN network element and an IT resource with specific constraints,
  8. partnering with IT resource providers to expand the type of IT resources available for the software solution developers and providers.

Monday, June 10, 2013

M2M Platform

Creating a platform is a complex and delicate task, and the platform needs to serve multiple purposes, one of the most important being to support and simplify the work of the developers who will be using/consuming it. This is particularly true of an M2M platform, which is a very specialized platform.

The communication and integration between devices and the back-end infrastructure has always been an issue, especially when dealing with "small" devices. Most of the time, dedicated and domain-specific solutions were developed, since both the front end (device related) and the back end (digital service related) had to know about each other's specifics in order to create a proper solution.

One of the main aspects of an M2M platform will be to remove these dependencies and create a proper set of boundaries which will allow a looser coupling between the devices and the digital services, provide a scalable solution (technically and economically) and create an appealing environment for developers that does not overlap with existing platforms. In modern integration, a platform implements and exposes APIs that allow a consumer of these APIs (other digital services, apps, etc.) to use functionalities without having to worry about how these functionalities are implemented.

Core Description of the Platform

Taking a top-down approach to an M2M solution, what are the core problems a digital service should not need to deal with and will therefore delegate to an M2M platform?

Device connectivity

It is key that the device connectivity is abstracted: only the properties of the connection need to be exposed, not a description of the connection itself. Whether the device is using a wireless or wired connection, or whether a wireless connection goes over Wi-Fi or a cell network, is actually irrelevant. What the associated digital services may be interested in are properties such as: the connection has variable bandwidth, the device may or may not lose connectivity (being disconnected is not an error but is treated as an exception), the connection is chargeable instead of free (notion of rating and of an account associated with the connection) and the connection is secure.

It is also key to handle how the logical connection is established and maintained: client centric (client polling/keep-alive), server centric (long-running connection, going through a third party (e.g. pusher), etc.) or maintained by the network. Each of these solutions has its own merits and limitations and results in different behaviors that need to be exposed, not explicitly but as a set of attributes of the connectivity (e.g. real time…), to the consuming digital services, which do not need to understand how the logical connection is established and maintained.
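
As an illustration of exposing properties rather than the connection itself, here is a minimal sketch of what a connectivity API could return. The type `ConnectivityProperties`, its field names and the `can_stream_telemetry` rule are assumptions invented for this example; the post does not define them.

```python
from dataclasses import dataclass
from enum import Enum


class Delivery(Enum):
    """How promptly data reaches the service, whatever the mechanism behind it."""
    REAL_TIME = "real_time"
    DEFERRED = "deferred"


@dataclass(frozen=True)
class ConnectivityProperties:
    # Note what is absent: no Wi-Fi/cellular/wired field, no polling/push field.
    variable_bandwidth: bool
    may_disconnect: bool
    chargeable: bool
    secure: bool
    delivery: Delivery


def can_stream_telemetry(props: ConnectivityProperties) -> bool:
    """Hypothetical service-side rule that relies on properties only."""
    return (props.secure
            and not props.chargeable
            and props.delivery is Delivery.REAL_TIME)
```

A digital service can take such a decision without knowing whether the link is kept alive by client polling, by a long-running server connection or by the network.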

Due to the network versus application separation, it is interesting to see that there are two approaches to device connectivity: an OTT approach, which basically considers the network as an IP fabric, and the network approach, which allows some optimizations based on knowledge of the network. A less polarized view should be taken, and opportunistic optimizations should be applied to benefit from both worlds.

Implementing a proper device connection abstraction is not a trivial task and may lead to the implementation of sophisticated software elements in the device, at the network level or in server components, which are actually implementing the device connectivity API themselves. The distribution of such elements may also vary with the intelligence of the device (its ability to store and execute a pre-installed payload or a payload coming from a server), and may therefore be device heavy, server heavy or a combination of both. The network and the different systems involved will need to be flexible enough to handle (physically, economically...) the scale of the solution. On the server side, the major difficulty of device connectivity is handling a large number of long-running connections which may not carry much load; because of the number of connections, each server handling them will have to spawn many threads, and efficient thread management will therefore be an issue. It is also possible to have an elastic cloud solution using a large number of small computing instances, which can be created and released quickly.
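
One common way to sidestep the thread-per-connection problem is an event loop, where each mostly idle connection is a cheap coroutine rather than a thread. This is an alternative technique, not something prescribed above. The sketch below simulates connections with `asyncio.sleep` instead of real sockets; `idle_connection`, `serve` and `run_demo` are hypothetical names.

```python
import asyncio
import threading
from typing import Dict, Set, Tuple


async def idle_connection(device_id: int, heartbeats: int,
                          received: Dict[int, int],
                          thread_ids: Set[int]) -> None:
    """Simulates one long-running, low-traffic device connection."""
    for _ in range(heartbeats):
        # Stands in for waiting on a quiet socket; no thread is blocked here.
        await asyncio.sleep(0.001)
        received[device_id] = received.get(device_id, 0) + 1
        thread_ids.add(threading.get_ident())


async def serve(n_devices: int,
                heartbeats: int) -> Tuple[Dict[int, int], Set[int]]:
    received: Dict[int, int] = {}
    thread_ids: Set[int] = set()
    # All connections are multiplexed on the single event-loop thread.
    await asyncio.gather(*(
        idle_connection(i, heartbeats, received, thread_ids)
        for i in range(n_devices)
    ))
    return received, thread_ids


def run_demo(n_devices: int = 2000,
             heartbeats: int = 3) -> Tuple[Dict[int, int], Set[int]]:
    return asyncio.run(serve(n_devices, heartbeats))
```

Calling `run_demo()` handles every simulated connection while `thread_ids` contains a single thread identifier. A real server would still need several such processes or instances to scale out, which fits the pool of small elastic instances mentioned above.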

Device management

Digital services don't want to directly deal with the device management aspects but may want to invoke the device management capabilities via APIs either to make sure that the device has the proper configurations/profiles and/or proper binaries/payload running in the device  or to assess the static/dynamic properties of the device (screen size/battery status).

The device management relies on the fact that a device is a connected device and therefore will consume the exposed device connectivity API.

Requests to download a specific payload to a single device or a group of devices, which may have to follow a specific schedule based on the number of devices or on dynamic characteristics of the network at the time of the actual download, are an important aspect of proper device management. They need to be fully abstracted so that the complexity of the task is not a burden on the consumer.

As with device connectivity, how the management capabilities are implemented must be hidden from the consumer of the capabilities. The implementation may involve software elements in the device, on a server, or a combination of both, and the distribution of these software elements will vary based on how smart the device is, how good the device connectivity is and what the consumer of the information actually needs. For example, some of the device's dynamic properties may be pre-fetched (a sort of reverse DNS) and cached on the server side, and this cached copy is what the consumer of the device management will see instead of accessing the device itself.
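
The pre-fetch and cache idea can be sketched as a small server-side cache with a time-to-live. Everything here is an assumption for illustration: the class `DevicePropertyCache`, the `read_from_device` callback and the choice of a fixed TTL as the freshness rule.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


class DevicePropertyCache:
    """Server-side copy of dynamic device properties, refreshed after a TTL."""

    def __init__(self,
                 read_from_device: Callable[[str, str], object],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._read = read_from_device   # the expensive call to the real device
        self._ttl = ttl_seconds
        self._clock = clock             # injectable so the sketch is testable
        self._entries: Dict[Tuple[str, str], Tuple[float, object]] = {}

    def prefetch(self, device_ids: Iterable[str], name: str) -> None:
        """Reads a property ahead of time for a batch of devices."""
        for device_id in device_ids:
            value = self._read(device_id, name)
            self._entries[(device_id, name)] = (self._clock(), value)

    def get(self, device_id: str, name: str) -> object:
        """Returns the cached value while fresh, else goes to the device."""
        key = (device_id, name)
        entry = self._entries.get(key)
        if entry is not None and self._clock() - entry[0] < self._ttl:
            return entry[1]
        value = self._read(device_id, name)
        self._entries[key] = (self._clock(), value)
        return value
```

The consumer only ever calls `get`; whether the answer came from the cache or from the device is hidden, as the paragraph above requires.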

In order to improve performance and scalability, device management needs to use existing utilities like CDNs, caching and a repository of static device properties and, given the variability of the load, must use distributed elasticity (up and down) of IT resources (computing, storage) in order to be scalable (economically and technically).

Device abstraction

This is the final but trickiest part of the M2M platform, since it sits on top of device connectivity and device management (it is one of the consumers of these components) and handles most of the truly functional aspects of the M2M platform, which is to provide a proper description of the device or group of devices independently of the consumer of the platform. It is key to protect that independence; otherwise we are not dealing with a platform but with a framework, which generally has scalability (technical, operational and economic) issues. The device abstraction also has to be two-way, from the device to the digital service and vice versa, in order to provide the full scope of the platform:
  • In the device to digital service direction, abstracting the data/events generated/handled by the device is the main task, since it is key to provide the proper meaning of the data to the digital service that will consume it. An electric meter may send a number for the consumption, which will be meaningless if the unit (watt) is not added. A more extreme case of abstraction arises for some very low-level devices which only give information by doing a memory dump; the abstraction in this case will have to filter the memory dump to extract the right information described in the API. Since the digital service may have requested the download of a specific payload into the device in order to establish a high-level relationship between the device and the digital service, the abstraction may become a pass-through or a stage in the payload execution; however, the specific functionalities embedded in the payload are generally completely opaque to the platform (again, the focus is on the independence between the M2M platform and its consumers).
  • In the digital service to device direction, the actions may be handled in stages in order to cope with the limited capabilities of the device to handle the request or the consequences of the request. A digital service may need a specific payload to be rendered by the device (for example a user experience), and the device abstraction may perform some pre-processing before actually invoking the physical device itself (in the style of Opera Mini). A payload needed by the digital service may define its own API, which may or may not be accessible by other digital services.
Depending on how smart the device is, the device abstraction may be pushed to the device itself, in which case the server side will only be a way to discover where the device abstraction endpoint is.
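
The device to digital service direction can be sketched as follows: the abstraction turns either a bare number or a raw memory dump into a value that carries its meaning. The `Measurement` type and, above all, the dump layout (a big-endian unsigned 32-bit reading in milliwatts at byte offset 4) are purely hypothetical; a real device would document its own layout.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    quantity: str
    value: float
    unit: str


def from_raw_number(raw: float) -> Measurement:
    """Attaches the meaning and the unit that the bare number lacks."""
    return Measurement(quantity="consumption", value=float(raw), unit="W")


# Assumed layout for this sketch: 4 status bytes, then the reading.
READING_OFFSET = 4


def from_memory_dump(dump: bytes) -> Measurement:
    """Filters a raw dump down to the one field described in the API."""
    (milliwatts,) = struct.unpack_from(">I", dump, READING_OFFSET)
    return Measurement(quantity="consumption",
                       value=milliwatts / 1000.0,
                       unit="W")
```

Both functions return the same `Measurement` shape, so the consuming digital service cannot tell, and does not need to know, how primitive the device is.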

The grouping of devices is important but needs to be approached with caution, since it may not be very useful (and is potentially confusing/limiting) to perform grouping within the M2M platform when this grouping could be done by a non-specific computing platform, which is most likely what the developer is used to, instead of depending on the M2M platform to do it. Since the M2M platform is represented by a series of APIs, it is very important to understand that many existing platforms (either as off-the-shelf technologies or as “as a service” components) are fully capable of aggregating the APIs exposed by the M2M platform and by other platforms, and will most likely be used by developers instead. The definition of the grouping performed by the M2M platform is therefore important and will cover the following:
  • expression of a complex device model: a device may be described as a unique entity called the “simple” device (because it is simple enough, or because the device itself, whatever its level of complexity, does not allow going deeper than one level); however, other models of devices may need to be described as a set:
    • a physically bounded set of devices that share the same connectivity or are bound together because a specific physical condition (mobile phone (sim, phone), vcr (scheduler, recorder), a car, a plane etc…), 
    • a gateway device representing a collection of devices that are or not directly addressable, (a home gateway, a mesh network hub… ) 
    • (it would be interesting to have an exhaustive list of device models or, better, a canonical set from which each model can be identified by composition).
These sets can be cascading, since a complex device is a device and can therefore be added to another complex device. This means that, to access the complex device, the device connectivity, device management and device abstraction APIs will have to be defined, either on their own or as delegated to the API of one of the devices of the set. Each of these forms of grouping is important, since the digital service that accesses the set of devices will be able to understand the level of complexity (constraints, static/dynamic properties of the set…) and act on it. How far we can go in the modelling provided by the M2M platform will have to be assessed, since one key reason for a platform is to simplify, and we also have to take into account that this modelling, or part of it, can be defined outside of the M2M platform (aggregator business).
  • grouping with a specific context (geographic, political, administrative, topical, etc.), mostly for the digital service to device direction: a digital service may want to update all the devices of a specific region, so instead of sending as many requests as there are devices in the group, only one request is sent. This form of grouping is not a device but has a specific API to describe the properties of the group. When dealing with a REST architecture (which should be the way we handle this type of platform), a device is a REST resource, and it is convenient to have the notion of a group of devices as a REST resource too, implemented by the platform. In this case, however, the grouping does not aggregate information but provides a mechanism to easily discover the resources contained in the group. This is similar to the notion of group in other domains: for files the associated group notion is the folder, for photos the album, and for contacts the address book.
No matter the form of grouping, it is very important to define either the device model that the grouping implies or the context which explains why the grouping exists.
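
The two forms of grouping can be sketched with a composite structure: a complex device is itself a device and can be nested, while a context group is not a device and only helps discover and address its members. The class names `Device`, `ComplexDevice` and `ContextGroup`, and the string-based `fan_out`, are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Device:
    """A "simple" device: a single addressable entity."""
    device_id: str

    def leaves(self) -> Iterator["Device"]:
        yield self


@dataclass
class ComplexDevice(Device):
    """A set of devices that is itself a device, so sets can cascade."""
    members: List[Device] = field(default_factory=list)

    def leaves(self) -> Iterator[Device]:
        for member in self.members:
            yield from member.leaves()


@dataclass
class ContextGroup:
    """Not a device: a context (for example a region) used for discovery."""
    context: str
    members: List[Device] = field(default_factory=list)

    def fan_out(self, request: str) -> List[str]:
        # One request from the digital service becomes one per device.
        return [f"{request} -> {leaf.device_id}"
                for member in self.members
                for leaf in member.leaves()]
```

For example, a phone made of a SIM and a handset can be placed inside a car, and a regional group containing that car turns a single update request into one request per underlying device.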

Extended View of the Platform

In order to complete the M2M platform, it is necessary to handle other domains that the M2M platform will use for implementation or rely on for defining solutions:

API exposure

Once the APIs for the devices (“simple” or complex) and groups of devices are implemented, it is important to expose these APIs in a way that is compelling to developers and to perform the tasks that are specific to exposing APIs, like:
  • developer registration, 
  • authc/authz on API, 
  • application registration (not download: API exposure is not synonymous to Application store), 
  • API business model (even freemium is a business model)
  • API throttling(per developer/per application)
  • API metering
  • API sandbox
  • API documentation presentation
  • Code samples, SDKs, etc.
All of these tasks are very generic and should not be tied to the semantics of the APIs themselves; off-the-shelf platforms (generally part of the API management domain) exist to handle them. It will be important for the M2M platform not to replicate such functionalities but rather to rely on a logically centralized (physically very distributed, for scaling reasons) facility which will be shared with other platforms, providing a complete set of APIs that can be used independently or as a composite.
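
To show how generic these tasks are, here is a minimal sketch of throttling per developer and per application using a token bucket. It knows nothing about devices or M2M semantics, which is exactly why it belongs in a shared API management facility. The class `Throttle` and its parameters are hypothetical, and an off-the-shelf product may well use a different algorithm.

```python
import time
from typing import Callable, Dict, Tuple


class Throttle:
    """Token bucket kept per (developer, application) pair."""

    def __init__(self, rate_per_second: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_second
        self._burst = burst
        self._clock = clock  # injectable so the sketch is testable
        # key -> (tokens left, time of last update)
        self._buckets: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def allow(self, developer: str, application: str) -> bool:
        key = (developer, application)
        now = self._clock()
        tokens, last = self._buckets.get(key, (float(self._burst), now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(float(self._burst), tokens + (now - last) * self._rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)
        return allowed
```

The same object could sit in front of any API, from the M2M platform or from another platform, without modification.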

API Consumption

The final step, which is out of scope from a platform perspective but in scope for defining an end-to-end solution, is API consumption. This step defines what solutions are developed on top of the set of exposed APIs, and how. There are many ways to consume APIs in order to create a solution:
  • Client mashups (all the logic of the solution runs in a client and direct calls are made from the client to the APIs)
  • Front-end aggregation (all the logic runs on a front-end server on behalf of the client and calls to the APIs are made from the front-end server)
  • Cloud mashups (the logic of the solution runs on a server and the calls to the APIs are made from the cloud)
  • API adaptation (APIs from the platform are adapted to serve a specific purpose within another platform on top of the M2M platform)
  • API aggregator/broker (aggregates different APIs to create a new API, or aggregates many APIs that expose the same operation behind just one endpoint)
  • ….
And of course many solutions are a combination of these ways. There are also a large number of platforms helping the development of solutions, either as software components (products from software vendors or off-the-shelf managed open source components) or as a service (force.com, app engine, azure…). It will therefore be important to let solution developers use the platform that best fits what is needed and what they are familiar with, instead of imposing a specific model that forces a change of behavior. It is also very important to define a clear scope for the M2M platform as a set of exposed APIs around the device (connectivity, management, abstraction; “simple” or complex; single or within a group).

Defining vertical solutions is key for the success of the platform; however, the solutions should really use the platform instead of taking a silo approach. As many start-ups describe it: “develop horizontal, sell vertical”, which shows the tension between the platform and the solutions but also indicates that solutions are Trojan horses for the platform. More and more, the APIs/platform are as important as the solutions themselves and should be delivered at the same time.

Cloud environment

The different components of the system (device connectivity, device management and device abstraction) must each have their own level of scalability and may also have to be elastic (up and down) in order to cope with the variability of the load.

  • Device connectivity has to handle a large number of active connections without necessarily handling a heavy computing load; for that reason, a massively distributed elastic pool of small instances may be needed at the server level.
  • Device management and abstraction look more like a classic service solution, with a variable and potentially high load (payload download), and could therefore be handled by an elastic pool of medium/large instances, but with a reduced level of I/O.

This platform should therefore not be seen as a monolithic system but as a set of cloud-based sub-systems whose cardinality decreases from the edge (device connectivity) to the center (device abstraction), converging on an API exposure system that presents what the platform is about. Each of these cloud-based sub-systems has specific scalability requirements that are handled via elasticity and instantiation of specific IT resources.

With the emergence of edge cloud IT resources, it is clear that the M2M platform could be a prime consumer of such IT resources, since it is important to offload centralized data center IT resources as much as possible and to improve latency, which is a key aspect of M2M solutions.

Analytics


On top of handling specific aspects via APIs, the nature of a platform is to generate relevant data about the activities within the platform. Many levels of information can be produced, at the infrastructure (including network and virtualized OS), application and service levels, and each subsystem must be treated as a source of information. While it is not necessarily possible or practical to specify a single format for all the data generated, some common tags need to exist to make correlation (horizontal or vertical) possible. The data is dumped into an analytics environment, either in real time or in batch mode, and knowledge extraction (value) is performed. The results of this knowledge extraction must be used by the platform itself (self-improvement, feedback mechanism, analytics-based elasticity…) and by other systems (monetization...).
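
The role of common tags can be sketched in a few lines: records from different subsystems keep their own shape, and a shared tag is enough to correlate them. The tag name `device_id`, the function `correlate` and the record fields are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def correlate(records: Iterable[dict],
              tag: str = "device_id") -> Dict[str, List[dict]]:
    """Groups heterogeneous records by the value of one common tag."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        # Records without the tag cannot be correlated and are skipped.
        if tag in record:
            grouped[record[tag]].append(record)
    return dict(grouped)


# Records from three subsystems, each with its own format.
SAMPLE_RECORDS = [
    {"source": "connectivity", "device_id": "meter-1", "reconnects": 3},
    {"source": "management", "device_id": "meter-1", "firmware": "1.2"},
    {"source": "api", "device_id": "meter-2", "calls": 120},
    {"source": "infrastructure", "cpu_load": 0.7},
]
```

Calling `correlate(SAMPLE_RECORDS)` gathers the connectivity and management records of `meter-1` together even though they share no other field, which is the vertical correlation described above.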

Conclusion

In summary, the actual focus of an M2M platform is to handle devices as a service (synonymous with device as a resource in REST terminology), and it will therefore implement, in a cloud environment, three types of API per device: connectivity, management and abstraction. The device can be a “simple” or complex device, or a grouping of devices based on a specific semantic, which will itself have a specific API.
The platform should then use off-the-shelf software/platforms to expose the APIs.

Once the APIs are exposed, consumption can take many forms, and it should be clear that while we need to implement end-to-end solutions, these solutions will use many other components/platforms besides the M2M platform itself.