CASE STUDY - Who Are You Going To Call?
by Tim O’Neill

When a million-dollar-a-day drilling rig cannot communicate efficiently, many serious problems can, and do, occur. The rig operator usually does not have time to fly in an expert to fix the communication issues — so, who will you call?

Industrial Ethernet lives in the “Must Be Up” network world. Production environments, especially in the petrochemical industry, demand available communication to address issues ranging from explosions to oil spills. Such occurrences result in physical and financial losses as well as loss of life. Unacceptable! When a problem occurs in those areas, operators must have a proven and focused strategy to get their systems back online, at 100 percent capability, in a hurry. This is an example of one such problem, which manifested itself as slow network response.

The rig is a major offshore platform that costs approximately $500K or more per day to operate. If the drilling operations or production are halted for any reason, the losses can easily push upwards of one million dollars a day. Rig operators need a strategy that calls in a team that can locate and solve the problem before it can affect the safety or the production of the site. The rig needs the experts on site today; in fact, “now” would be best.

How To Solve The Problem—QUICKLY
The problem can exist anywhere: in the end-to-end transmission, the satellite WAN configuration, the client computer, the offshore LAN, the WAN itself, the onshore LAN, the servers, the applications, perhaps even a single transducer. Or, there could be a combination of these and other concerns.

There are three distinct needs for big rig networks…
  • Human communications (phone, e-mail, video, and so on)
  • Business communications (reports, access to data bases, and so on)
  • Access and control of the technology on the rig, which includes monitoring and management access to flow data (total and instantaneous), temperature, pressure, RPM or speed, and more.

If a network, such as the one at this particular site, starts showing response issues, those issues need to be isolated and fixed before a poor situation becomes worse.

Locating The Problem
Here is how YR20 located the root cause of this rig’s communication issue. Probes run all the time, which is exactly what is needed for immediate problem recognition and quick solutions. The probe was connected at the offshore edge router and data collection was started. The first analysis of the collected data indicated there were three active VLANs on the trunk.
  • VLAN #1 showed light traffic and the WAN QoS was very good. It was OK.
  • VLAN #2 was carrying “process instrumentation” information traffic and was responsible for about 7 percent of the packet traffic. The WAN and LAN QoS were very good. It was OK.
  • VLAN #3 was carrying the “General Services” traffic and was responsible for about 91 percent of the packet traffic. The WAN QoS was very unstable, which resulted in significant packet loss in the Offshore to Onshore traffic and, in turn, numerous retransmissions from both ends. The average Round Trip Time was OK, with times in the 560 ms to 800 ms range.
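
The per-VLAN breakdown above can be approximated from any packet summary. What follows is a minimal Python sketch, not the YR20 probe's method: the record layout (vlan, flow, seq, length) and the sample values are hypothetical, and a data segment is counted as a retransmission simply when the same flow repeats a sequence number it has already sent.

```python
from collections import Counter


def vlan_summary(packets):
    """Return {vlan: (share_of_packets, retransmission_count)}.

    `packets` is an iterable of dicts with hypothetical keys:
    vlan (int), flow (hashable id), seq (int), length (payload bytes).
    """
    totals = Counter()   # packets seen per VLAN
    retrans = Counter()  # repeated data segments per VLAN
    seen = set()         # (flow, seq) pairs already observed
    for p in packets:
        totals[p["vlan"]] += 1
        if p["length"] > 0:  # pure ACKs carry no data, so skip them
            key = (p["flow"], p["seq"])
            if key in seen:
                retrans[p["vlan"]] += 1
            seen.add(key)
    grand_total = sum(totals.values())
    return {v: (n / grand_total, retrans[v]) for v, n in totals.items()}


# Hypothetical sample: three packets on VLAN 3 (one a resend), one on VLAN 2.
sample = [
    {"vlan": 3, "flow": "A", "seq": 1000, "length": 500},
    {"vlan": 3, "flow": "A", "seq": 1500, "length": 500},
    {"vlan": 3, "flow": "A", "seq": 1000, "length": 500},  # resend
    {"vlan": 2, "flow": "B", "seq": 1, "length": 100},
]
summary = vlan_summary(sample)
```

A real analyzer would also account for sequence number wraparound and partial overlaps; this sketch ignores both.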

The important traffic on this VLAN was the business traffic: Citrix on port 1494, email on port 25, and Microsoft distributed services for file, print, and other MS Remote Procedure Calls, on a variety of common ports including 139 and 445.
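
Sorting captured traffic into those business categories is a simple port lookup. The sketch below is illustrative only; the table name, the labels, and the classify helper are assumptions, while the port numbers are the ones named above.

```python
# Well-known TCP ports named in the article; the labels are illustrative.
BUSINESS_PORTS = {
    1494: "Citrix ICA",
    25: "SMTP email",
    139: "NetBIOS session (file/print)",
    445: "SMB over TCP (file/print, RPC)",
}


def classify(src_port, dst_port):
    """Label a TCP packet by whichever port matches a known service.

    The destination port is checked first, so a client request and the
    server's reply both map to the same service label.
    """
    for port in (dst_port, src_port):
        if port in BUSINESS_PORTS:
            return BUSINESS_PORTS[port]
    return "other"
```

For example, classify(51000, 1494) labels a client-to-server Citrix packet, and classify(445, 50000) labels a file-server reply.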

The first generated report was a TCP Time/Event Graph of Offshore to Onshore Database Traffic. The graph was extremely busy, and that’s not a good sign. The X axis represented time, and the Y axis represented sequence offset. A normal TCP/IP conversation would result in a near-perfect 45 degree straight line on the graph, with few, if any, deviations. However, when a flat line is observed, it indicates there were no responses during that time period, and that’s definitely not a good sign!
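
The “flat line” test can be expressed in a few lines. This is a hedged Python sketch rather than the report's actual algorithm: it assumes the graph is available as a time-sorted list of (time in seconds, highest sequence offset) samples, and the function name and the min_gap threshold are hypothetical.

```python
def find_stalls(samples, min_gap=1.0):
    """Return (start, end) times where the sequence offset stayed flat.

    samples: list of (time_seconds, highest_sequence_offset), sorted by time.
    min_gap: shortest flat period, in seconds, worth reporting (assumed).
    """
    stalls = []
    stall_start = None
    for (t0, s0), (t1, s1) in zip(samples, samples[1:]):
        if s1 <= s0:
            # No forward progress between these two samples.
            if stall_start is None:
                stall_start = t0
        else:
            # Progress resumed; close any open flat period at t0.
            if stall_start is not None and t0 - stall_start >= min_gap:
                stalls.append((stall_start, t0))
            stall_start = None
    # A flat period may still be open when the capture ends.
    if stall_start is not None and samples[-1][0] - stall_start >= min_gap:
        stalls.append((stall_start, samples[-1][0]))
    return stalls


# Hypothetical trace: steady progress, a 10-second flat line, then recovery.
trace = [(0, 0), (1, 1000), (2, 2000), (12, 2000), (13, 3000)]
stalls = find_stalls(trace)
```

On the sample trace, the only flat period runs from second 2 to second 12.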

The report showed a small associated packet loss on the Onshore to Offshore network traffic, but at a much lower rate than the Offshore to Onshore loss. Within a 60-second view of the report, the user experienced three major “blackouts,” each lasting approximately 10 seconds. In each case, there was a packet loss event on the Offshore to Onshore packet stream. The Onshore system sends a TCP SACK and the Offshore client system attempts to recover by re-sending packets. Eventually, after 10 seconds, the client system re-sends all the outstanding data.

Although the Onshore system has received most of the intermediate packets, they cannot be passed to the end-user application, as there is data missing, revealed by green “flat lines” on the chart. The user would perceive each of these three events as a 10-second, near-total hang-up of the application, followed by a burst of activity at the end of each hang-up, then another hang-up. This renders most interactive applications almost unusable.
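
Why one lost packet freezes the application, and why the freeze ends in a burst, follows from TCP's in-order delivery. The Python sketch below is a simplified model under stated assumptions (byte offsets start at 0, and the function name and segment sizes are hypothetical); it is not a trace from the rig.

```python
def deliverable_bytes(next_expected, segments):
    """Return how many bytes can be handed to the application.

    next_expected: first byte offset the application has not yet received.
    segments: iterable of (start_offset, length) held by the receiver.
    """
    delivered = 0
    for start, length in sorted(segments):
        end = start + length
        if start > next_expected:
            break  # a gap: everything after it must stay buffered
        if end > next_expected:
            delivered += end - next_expected
            next_expected = end
    return delivered


# Three segments arrived, but the first 1000 bytes were lost in transit.
buffered = [(1000, 1000), (2000, 1000), (3000, 1000)]
stalled = deliverable_bytes(0, buffered)                  # nothing deliverable
recovered = deliverable_bytes(0, buffered + [(0, 1000)])  # after the resend
```

While the first segment is missing, nothing is deliverable even though 3000 bytes sit in the buffer. Once the resend arrives, all 4000 bytes are released at once, which is the burst the user sees at the end of each hang-up.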

Within a period of about one day, the YR20 remote team had pinpointed the problem area causing the data blackouts, as well as the cause of the poor and unreliable communications of the critical data. The problem was resolved when the team changed the WAN queuing and corrected a duplex mismatch in the setup of an onshore switch.

The good news was that the other probe data revealed the other VLANs were working well within specifications. The accumulated data on all the VLANs and the WAN is useful because it provides benchmarks for future comparative analysis. If you do not know what is good, then how can you know what is bad? Benchmark analysis data offers the best basis to define network issues.
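
A benchmark comparison of this kind can be as simple as flagging any metric that has drifted too far from its recorded baseline. The following Python sketch is one possible approach, not YR20's; the metric names, the sample values, and the 25 percent tolerance are all hypothetical, and it assumes that a larger value is worse for every metric.

```python
def compare_to_baseline(baseline, current, tolerance=0.25):
    """Return {metric: (baseline_value, current_value)} for degraded metrics.

    A metric is flagged when its current value exceeds the baseline by more
    than `tolerance` (a fraction). Assumes larger is worse for every metric,
    as with loss, round trip time, or retransmission counts.
    """
    flagged = {}
    for name, base in baseline.items():
        now = current.get(name)
        if now is None:
            continue  # metric not measured this time; nothing to compare
        if now > base * (1 + tolerance):
            flagged[name] = (base, now)
    return flagged


# Hypothetical numbers, for illustration only.
baseline = {"rtt_ms": 600, "loss_pct": 0.1}
current = {"rtt_ms": 650, "loss_pct": 2.0}
flagged = compare_to_baseline(baseline, current)
```

Here the round trip time is within tolerance of its baseline, so only the packet loss is flagged.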

Today’s industrial production networks support many critical operations. Not having an onsite device backed by a technically competent 24x7 support team could have many costly consequences. For rigs, this is not an acceptable option. The YR20 first responders are your network rescue team, ready to use their experience and sophisticated equipment and resources to find the problems and repair the situation.

Tim Everitt, one of the three founders of YR20, points out, “Another dimension to offshore situations is that if there are problems, there are usually no IT people on board. This situation enhances the need for remote diagnostics across the satellite links and access to a probe that offers historical information about the network, not just on the WAN traffic, but also using the satellite WAN in order for the support team to have access to the LAN traffic for analysis.”

Even with all of the new satellite link offerings, with multiple classes of traffic, complex QoS/SLA arrangements, and signaling/marking (e.g. DiffServ), problems still occur, so having instant response to issues is extremely important.
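
For readers unfamiliar with DiffServ marking: the DSCP value occupies the upper six bits of the byte that older documentation calls the IPv4 TOS field, so the byte seen in a capture is the DSCP shifted left by two. The Python sketch below only shows that arithmetic. The helper names are hypothetical, and which classes a given satellite link actually honors depends on the provider's SLA.

```python
DSCP_EF = 46    # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34  # Assured Forwarding class 4, low drop precedence


def dscp_to_tos(dscp):
    """Return the TOS / Traffic Class byte that carries a 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in six bits (0 to 63)")
    return dscp << 2  # the low two bits are left clear (used for ECN)


def tos_to_dscp(tos):
    """Recover the DSCP value from a captured TOS / Traffic Class byte."""
    return (tos & 0xFF) >> 2
```

For example, dscp_to_tos(DSCP_EF) gives 0xB8, the byte value a capture would show on voice packets marked Expedited Forwarding.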

The worst-case situation for rigs or production facilities is export pipeline control. If this fails, production may need to be drastically reduced or even halted, and that is quite expensive!

On dive-supported vessels or rigs, network failure that affects voice communications usually requires divers be pulled out of the water, due to safety concerns. The divers and their vessel find their work is suspended, meaning a loss of income, as well as the resulting production loss.

Other important communication needs include access to weather web sites and to marine, iceberg, and aeronautical movement information.

Another serious situation is a lack of email. Usually, offshore email has no special support, but it is becoming more and more critical to the business processes and production co-ordination. If email fails, the complaining starts immediately. If real time systems are failing, more workers will have to be put offshore, which is another huge expense. Once again, production is reduced, and that means an even larger cost.

Pumping Up The Solution
In the aforementioned case, the rig was equipped with the YR20 PCAP-Probe. Once the problem was identified, the rig personnel called YR20’s team and, within a short period of time, the experts isolated the problem and assured the rig operator the issue would not affect production. However, the problem still required fixing to ensure business operations would continue as required.

One of the unique aspects of YR20 is that the company rents its PCAP-Probe technology to rigs and other production sites, such as refineries. In addition, YR20 stands ready to respond to any call for help; they are 24x7 First Responders! A seasoned technologist is ready to assist at any time, so rigs no longer have to fly in experts to solve their communication emergencies.

The name of the company, YR20, actually refers to the fact that every technical employee has more than 20 years of experience in the networking industry. This experience includes Ethernet and TCP/IP, satellite, RF, and industrial control, as well as actual experience in the oil and chemical arenas, from rigs to offshore facilities and more.

Quick remote analysis is essential for any production facility, and a quick response from an experienced technologist is essential in preventing major losses. Oh, by the way, the actual customer was not named because, as we all know, company networks never have any problems!

About the author
Tim O’Neill “Oldcommguy®” is an independent technology consultant currently working with YR20. He has over 35 years of experience working in the WAN, Analog, ISDN, ATM and LAN markets. Tim also has several years of experience in the oil, gas and petrochemical instrumentation arena. Tim has been responsible for technology and designed many products for companies like GeoSource, Navtel, Network General, Ganymede and ClearSight Networks and is now helping companies obtain lab and market recognition with technology verification. Tim is also the Chief Contributing Editor for LoveMyTool.com, a website designed to help network managers gain access to valuable information and real solution stories from other customers. Tim is a patent-holding, published and degreed engineer. He has been a consultant on several movies and has been involved with law enforcement and industry at all levels from engineer to senior executive. He helped design and bring to market the first WAN DataScope in 1976. Contact Tim at tim@oldcommguy.com