Air Transport IT Review - Issue 2, September 2009
Is zero downtime achievable?

Aiming for zero IT downtime is a tough challenge, but it is one that must be faced. IT systems are already the backbone of the industry's business activity. The future business requirements of the air transport industry will make it even more dependent on networked technology. These requirements include:
- Heavier reliance on e-commerce/e-business and the use of the Internet
- Paper-based documents replaced by electronic files
- Greater use of kiosks / mobile phone / Web check-in
- Biometric security screening
- Automated boarding gates
- Airport-based employees armed with digital handheld devices
- RFID and sensor technologies for baggage / cargo / asset tracking
- E-enabled aircraft downloading and uploading operational information
The result will be dramatic increases in both network traffic volumes and the complexity of IT hardware and software. It will put increasing strain on IT systems within the industry. As a consequence, the cost and the probability of experiencing failures or unavailability will also increase.
Failure of IT infrastructure and systems can quickly result in lost selling opportunities and operational problems.
Lost selling opportunities
Airlines, in particular, are increasingly positioning themselves as e-commerce companies. Currently, around 27% of ticket sales globally come from online sources, and around 90% of those online sales are generated on the airlines' own websites.
The trend is increasing and by 2012 online sales will be the primary sales channel for many airlines. Some carriers have set themselves targets in excess of 90% of sales. Ryanair has set a goal of 100%. This puts airlines in the same position as e-commerce companies such as eBay and Amazon, which rely almost exclusively on their websites being continuously available.
The lost opportunity is not restricted to ticket sales. Ancillary revenues initiated from the website, such as car hire and travel insurance, can also be hit, with website downtime having a ripple effect on the revenues of online partners.
A more intangible cost is damage to the brand. Underperforming websites, for example, frustrate customers and damage both the company's reputation and customers' perception of it, driving them to the competition.
Operational disruption
Operational costs when IT fails are harder to quantify, but no less serious. Getting increasing numbers of passengers through the airport involves executing a tight set of parallel and connected activities. A failure at any point can have consequences that cause costs to multiply.
Staff overtime and productivity can also be affected. What is clear is that IT is becoming more embedded in the business of airlines and airports. Employees are becoming more IT-dependent, which means they will be less able to function normally during outages.
Moving towards zero downtime
Advances in technology have already provided improvements in the robustness and resilience of the industry's IT infrastructure. Redundancy, load balancing, link aggregation, and mirroring systems are all commonly integrated into IT architectures to reduce downtime.
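As a rough illustration of why redundancy reduces downtime, the sketch below computes the combined availability of identical components running in parallel. It assumes that failures are independent, which is a simplification because real systems often share failure modes. The function name and the 99% figure are illustrative assumptions, not industry data.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures.

    The group is unavailable only when every copy is down at the same time.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return 1.0 - (1.0 - component_availability) ** copies


HOURS_PER_YEAR = 24 * 365

# Illustrative only: a component that is up 99% of the time.
for copies in (1, 2, 3):
    availability = parallel_availability(0.99, copies)
    downtime_hours = (1.0 - availability) * HOURS_PER_YEAR
    print(f"{copies} copies: {availability:.6f} available, "
          f"about {downtime_hours:.2f} hours down per year")
```

Under these assumptions each extra copy multiplies the expected downtime by the failure probability of a single component, which is why mirroring tends to pay off quickly.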
Self-healing systems
Autonomic systems that monitor, diagnose and repair their own internal problems are a step forward. Self-healing architectures can trigger automated reactions such as engaging backup systems, ordering replacement parts, or downloading fixes from online collaboration software. This can happen in real time, ensuring that customers do not experience degraded performance.
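The monitor, diagnose and repair cycle can be sketched in a few lines. The example below is a minimal illustration rather than a description of any real product: the `Service` class, the fault code and the repair table are all hypothetical names, and the only repair shown is engaging a backup instance.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Service:
    """Hypothetical monitored service with a primary and a backup instance."""
    name: str
    primary_healthy: bool = True
    active: str = "primary"
    log: List[str] = field(default_factory=list)


def diagnose(service: Service) -> str:
    """Return a fault code; a real system would inspect metrics and logs."""
    if service.active == "primary" and not service.primary_healthy:
        return "primary_down"
    return "ok"


def engage_backup(service: Service) -> None:
    """Repair action: switch traffic to the backup instance."""
    service.active = "backup"
    service.log.append(f"{service.name}: switched to backup")


# Map each known fault to an automated repair action.
REPAIRS: Dict[str, Callable[[Service], None]] = {"primary_down": engage_backup}


def heal(service: Service) -> str:
    """Run one monitor-diagnose-repair cycle and return the fault that was found."""
    fault = diagnose(service)
    repair = REPAIRS.get(fault)
    if repair is not None:
        repair(service)
    return fault


check_in = Service("check-in")
check_in.primary_healthy = False         # simulate a hardware fault
print(heal(check_in), check_in.active)   # first cycle finds and repairs the fault
print(heal(check_in), check_in.active)   # second cycle finds nothing to do
```

Keeping the fault-to-repair mapping in a table means new automated reactions, such as ordering a part or applying a fix, could be added without changing the cycle itself.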
Virtualization
Virtualization will provide another step toward zero downtime. This technology frees software from the underlying hardware, which means it can be easily moved from one server to another in the event of a hardware crash. The system can be programmed to do this automatically, even when the server is residing in a completely different location. This makes it particularly useful for rapid disaster recovery.
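The automatic relocation described here can be pictured with a small sketch. It assumes a deliberately simple policy, moving each affected virtual machine to the surviving host that currently runs the fewest machines. The function, host names and workload names are hypothetical, and a real virtualization platform would also weigh capacity, location and data replication.

```python
from typing import Dict, List


def fail_over(placement: Dict[str, str], hosts: Dict[str, int],
              failed_host: str) -> List[str]:
    """Move every virtual machine off `failed_host` to the least-loaded survivor.

    placement maps VM name -> host name; hosts maps host name -> VM count.
    Both dictionaries are updated in place. Returns the names of the moved VMs.
    """
    survivors = {h: n for h, n in hosts.items() if h != failed_host}
    if not survivors:
        raise RuntimeError("no surviving host to receive the workloads")
    moved = []
    for vm, host in placement.items():
        if host == failed_host:
            target = min(survivors, key=survivors.get)
            placement[vm] = target      # existing key, so iteration stays valid
            survivors[target] += 1
            moved.append(vm)
    hosts.clear()
    hosts.update(survivors)
    return moved


# Hypothetical workloads and hosts; host-c could sit in a different location.
placement = {"reservations": "host-a", "check-in": "host-a", "baggage": "host-b"}
hosts = {"host-a": 2, "host-b": 1, "host-c": 0}
print(fail_over(placement, hosts, "host-a"))
print(placement)
```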
According to the 2009 Airline IT Trends Survey, nearly 90% of airlines will make some investment in virtualization technology within the next three years.
Best practice
Technology advances will only take the industry so far. Sustained reliability improvements require a close coupling between technology, processes and design. This is achieved by using best-practice frameworks such as the IT Infrastructure Library (ITIL), which reinforces the importance of the service lifecycle and operational design for improving IT services to the business.
Re-thinking service provision
The principal challenge is to go beyond making individual components of a service available 24x7 and to cover the entire service. This requires end-to-end monitoring, which is complex. Network connectivity, end-user devices, software and applications can all be sourced from different service providers, each with its own service level agreement (SLA).
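A short calculation shows why component-level availability is not the same as service-level availability. When a service needs every component in the chain to be working, and failures are assumed to be independent, the end-to-end availability is the product of the individual figures. The component names and percentages below are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import prod
from typing import Dict


def end_to_end_availability(components: Dict[str, float]) -> float:
    """Availability of a service that needs every listed component to be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    for name, availability in components.items():
        if not 0.0 <= availability <= 1.0:
            raise ValueError(f"availability for {name} must be between 0 and 1")
    return prod(components.values())


# Hypothetical SLA targets from three different providers.
chain = {"network": 0.999, "end-user device": 0.995, "application": 0.999}
service = end_to_end_availability(chain)
print(f"end-to-end availability: {service:.4%}")
print(f"weakest single component: {min(chain.values()):.4%}")
```

The combined figure is always lower than the weakest individual SLA, which is one reason an integrated end-to-end view matters more than any single provider's target.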
In response, SITA is rethinking its service provision with the primary goal of zero downtime. That means moving from a state of taking corrective action when the need arises, to one of proactively taking preventive steps.
To get there we are investing in capabilities to provide multi-vendor customer services and operations. We are also building command centres with predictive capabilities to help identify potential problems before they cause major outages. Features include a centralized view to monitor and manage linkages between IT systems and business processes. We are also ensuring that staff are fully trained and ITIL certified.
This will enable SITA eventually to provide integrated end-to-end SLAs, with the ability to monitor and report on service performance using (near) real-time data. As a result we hope to change the mindset within the industry to focus on 'continuous availability' rather than merely the 'high availability' that has previously been the norm.
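Reporting against an end-to-end SLA comes down to measuring how much of a reporting window the service was actually usable. The sketch below is one simple way to do that from a list of outage records. The record format and function name are assumptions for illustration, and overlapping outages are counted only once.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Outage = Tuple[datetime, datetime]   # (start of outage, end of outage)


def measured_availability(outages: List[Outage],
                          start: datetime, end: datetime) -> float:
    """Fraction of the window [start, end) during which the service was up."""
    if end <= start:
        raise ValueError("window must have positive length")
    down = timedelta(0)
    cursor = start   # downtime before this point has already been counted
    for begin, finish in sorted(outages):
        begin = max(begin, cursor)
        finish = min(finish, end)
        if finish > begin:
            down += finish - begin
            cursor = finish
    return 1.0 - down / (end - start)


# Hypothetical 100-hour reporting window with two overlapping outages.
window_start = datetime(2009, 9, 1)
window_end = window_start + timedelta(hours=100)
outages = [
    (window_start + timedelta(hours=10), window_start + timedelta(hours=11)),
    (window_start + timedelta(hours=10, minutes=30), window_start + timedelta(hours=12)),
]
print(f"{measured_availability(outages, window_start, window_end):.2%}")
```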
While this represents a significant move by SITA, it is a goal that will only be achieved if it is shared by airlines and airports. That means providing the funding to support the vision in their own IT environment. Failure to do so could have serious implications for business continuity and ultimately competitiveness, as the industry's dependency on IT increases.
One step at a time
Within five years, air travel and information technology will be so inescapably fused that manual back-up processes will only be able to address minor IT failures. This makes the ability to support an 'always-on' environment critical.
However, there are no 'silver bullets' to take us to this new level. Steps towards eliminating downtime need to be accomplished in the context of a continuous improvement cycle. Some of those steps will be achieved through technological change. But these advances will need to be augmented with process changes that are tightly bound to standards and best practices end to end, spanning both applications and networks. Trained staff who provide constant monitoring will also be a critical element.
The aim is not to design flawless systems, but ones that are defect-tolerant and so ensure continuous uptime of the industry's IT infrastructure.
Service Desk Integration Programme
One step towards the ultimate goal of zero downtime has been the Service Desk Integration Programme. This provides SITA with a unified global customer service capability by using common tools, a common support model, and shared processes and best practices. The service offers customers improved operational efficiency through increased support, resiliency and business continuity, delivered by highly skilled support staff through a single point of contact.
The SITA Service Desk gives customers the choice to contact SITA through the SITAView web-based portal or through SITA's contact centre.
Online customers can log and track incidents; self-help, self-diagnosis, troubleshooting and resolution options are also available. The customer contact centre gives direct access to skilled agents who help with incident management and service requests.
Data Centre Consolidation Programme
Another step towards the zero downtime goal has been SITA's 'Next Generation Data Centre Programme', which started in December 2007. This four-year initiative is transforming data centre infrastructure and activities from a multi-site operation into one logical global operation, executed through three geographical centres that will provide regional disaster recovery for SITA's mission-critical systems.
The key objectives are to:
- Consolidate SITA resources and operations, using a centralized shared infrastructure
- Migrate to an 'on demand' operations model, using IT infrastructure and processes that allow solutions to be deployed with agility and flexibility
- Provide increased protection for critical air transport industry customer systems and applications
- Enhance levels of recovery for mission-critical systems and continue to ensure compliance with industry certifications and government regulation

