DCN911.COM Off-Network Support


[Complete] svensbluff - Apache, Kernel upgrades scheduled for Thursday 11/20/08

Posted in Servers by Agile on the November 19th, 2008

UPDATE: Upgrade and reboot complete 11/23/08 - Apache and PHP were upgraded Thursday night, 11/20/08, as rescheduled. Everything has compiled properly and is functioning as it should. Please be aware, however, that Apache 2 (which we have been running for a while) handles some operations differently than Apache 1 did. Please double-check your sites and code to ensure they are Apache 2 compliant.

We have also upgraded to the current Linux kernel and rebooted the server so the new kernel takes effect. The server was down from 2:22 to 2:25 PM Central time today (11/23/08) while it rebooted. We waited until today to do this work, knowing that weekend internet traffic is usually very low, and Sunday afternoons are often spent watching football rather than playing on the computer. :) We hope that this was accurate in your case as well, and resulted in no inconvenience to you!! The server is now back up and functioning properly.

If you encounter any issues, please do let us know. Thank you very much!! Have a wonderful weekend. ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

UPDATE: Rescheduled for 9:00 PM Central 11/20/08 - Due to unforeseen scheduling conflicts early this morning, we are rescheduling these upgrades to run and complete between 9-10 PM Central time. We will post here once the work has been completed. Thank you! ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We will be performing routine maintenance on the server ‘svensbluff’ early Thursday morning, including a kernel upgrade and an Apache/PHP recompile. The purpose of these upgrades is to close potential security holes and to implement several standards required for PCI DSS compliance.

A short period (10-15 minutes) of downtime is possible but not expected during the Apache rebuild. We will need to reboot the server to complete the kernel upgrade. Rebooting usually takes only a few minutes, so you may see 5-7 minutes of downtime as the server restarts all of its services.

Although our shared servers are not yet PCI compliant, we are working to bring them into compliance within the timelines required for Level 4 Merchants. We will post confirmation once PCI DSS compliance has been achieved on our shared servers. ##

[Completed] Server ‘svensbluff’ undergoing service upgrades

Posted in Servers by Agile on the November 2nd, 2008

UPDATE - 11/03/08, 3:42 AM Central: Project completed!

All software has been upgraded to current versions as outlined below. cPanel has been re-upgraded to version 11.24.0, and this time it did not kill Apache. (This was the ultimate goal behind the software updates - that this new major version of cPanel would play more nicely with current versions of server software rather than the old versions we were running.)

We’ve verified everything is running correctly, except for a couple of sites with old scripts or modules that are not PHP 5 compliant. (These are few and far between.) We will be working with those clients directly to get those issues resolved.

All systems are “Go” and the upgrade project has been completed! :) If you have any questions or need assistance with anything, please contact us anytime. We are always happy to hear from you! ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[Solved] svensbluff web server down due to cPanel update problem

Posted in Servers by Agile on the November 1st, 2008

9:24 AM RESOLVED - We’ve rolled back cPanel to 11.23.6 and Apache is magically running fine!! :) We’re opening a ticket with cPanel to have them repair whatever in 11.24.0 is breaking legacy Apache.

In the meantime we have disabled this cPanel upgrade so it will not run again before we have this bug sorted out.

We are considering this case “closed” for the sake of this posting; however, we will post further updates here as we continue to work on it later today.

Again we are very, very sorry for the trouble this morning!! If we had known cPanel’s handiwork was going to kill the server, we would not have allowed it to update.

Thank you for your patience!!

DCSN Support Team

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

9:14 AM - We’re still working on this with no ETA as of yet. We will post further updates as we know more. ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Good morning,

This morning our server ‘svensbluff’ auto-updated to cPanel 11.24.0, which appears to have killed Apache (the web server). Despite a thorough investigation, we’ve been unable to determine exactly why Apache will not run, other than that it appears to be related to the cPanel update.

We are recompiling Apache as we speak in an attempt to restore service.

We will post further updates here as they become available. Thank you for your patience!!

Resolved - 10/03/08: GNAX network outage

Posted in Servers by Agile on the October 3rd, 2008

7:56 PM Central - UPDATE - RESOLVED

All servers and sites are back up as of this time. External monitoring showed routing returned after 7:36 PM. The issue is resolved and service has been restored to normal. Thank you for your patience! ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

7:25 PM Central

At 6:58 PM Central, our servers at the GNAX data center became unreachable. This is a network outage affecting all of our servers hosted at the GNAX data center; it is not limited to one or two servers.

It does appear to be an issue at the data center, as traceroutes are reaching the border routers.
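
To illustrate the reasoning above, here is a minimal Python sketch of how a list of traceroute hops can be read. The addresses are hypothetical placeholders from documentation ranges, not our real routers, and the function name is made up for this example.

```python
def locate_failure(hops, border_routers, destination):
    """Guess where a path breaks from a list of traceroute hops.

    hops: addresses in path order, with None for hops that timed out.
    """
    answered = [hop for hop in hops if hop is not None]
    if destination in answered:
        return "destination reachable"
    if any(hop in border_routers for hop in answered):
        # The path got as far as the data center edge, then stopped.
        return "inside data center"
    return "before data center"


# Hypothetical example data (documentation address ranges).
BORDER = {"198.51.100.1"}
SERVER = "203.0.113.10"
trace = ["192.0.2.1", "192.0.2.254", "198.51.100.1", None, None]
print(locate_failure(trace, BORDER, SERVER))
```

A trace that answers at the border router but never at the server points to a problem inside the facility, which is what we saw.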

We have notified the data center and will update you as soon as we know more.

Impact: This affects only those dedicated server customers who are hosted at GNAX. It does not affect other dedicated servers or our shared servers, which are hosted in Denver.

##

FIXED: DNS issue with server ‘svensbluff’

Posted in Servers by Agile on the September 11th, 2008

UPDATE: 7:35 AM: Great news! :) The DNS issue on svensbluff has been resolved. No accounts had to be moved. Rebooting the server appeared to resolve the problem, which would suggest there was a tmp file corruption that needed to be flushed out of the system via a power cycle.

If you notice any irregularities with your account, please let us know. We were not made aware of the issue until 3:30 AM Central time. We feel terrible that we weren’t aware of it sooner.

Unfortunately our monitoring system did not pick up on it because it monitors by IP address, and asks each service for a response at the IP. The services were all up, even DNS, and responding … but DNS wasn’t processing requests correctly. Monitoring just asks “are you responding?” — it doesn’t ask and analyze custom requests. That’s why our monitoring system didn’t pick up on this problem. :( I am very sorry for the trouble this caused you. We are going to look into our options for monitoring, in order to be able to better detect situations like this.
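
As a rough illustration of the difference between “are you responding?” and a check that analyzes the answer, here is a minimal Python sketch using only the standard library. It is not our monitoring system. The function names and the pass/fail criteria (matching ID, no error code, authoritative flag set, at least one answer) are assumptions made for this example, and we do not know that these exact checks would have caught the fault described above.

```python
import socket
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet for `name` (qtype 1 = A record)."""
    # Header: ID, flags (plain query, no recursion), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Name terminator, then QTYPE and QCLASS (1 = IN).
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def check_response(packet, query_id=0x1234):
    """Inspect a raw DNS reply header and return (ok, reason)."""
    if len(packet) < 12:
        return False, "response too short"
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    if rid != query_id:
        return False, "ID mismatch"
    if not flags & 0x8000:
        return False, "not a response"
    rcode = flags & 0x000F
    if rcode != 0:
        return False, f"rcode {rcode}"
    if not flags & 0x0400:
        return False, "not authoritative"
    if ancount == 0:
        return False, "no answers"
    return True, "ok"


def probe(server_ip, name, timeout=3.0):
    """Send one query over UDP port 53 and analyze the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            data, _ = sock.recvfrom(512)
        except OSError as exc:
            return False, f"no reply: {exc}"
    return check_response(data)
```

Calling `probe()` with a nameserver IP and a domain that server should be authoritative for returns a pass/fail flag plus a reason, so a server that replies with an error or a non-authoritative answer is reported as failing even though the port is open.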

We always love to hear from you. We’d rather receive a false report than no report at all. :) So if you see anything weird, please let us know by opening a support ticket right away. We want your service to be top notch, A-#1 !!!

Thank you for all the feedback as we worked on this. It was very helpful and helped us iron out what was and was not going on. ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Good morning,

Around 3:30 AM Central time, an as-yet-unidentified DNS issue caused websites on the server ‘svensbluff’ to become inaccessible.

We have been trouble-shooting this issue since that time.

At this point we have been unable to find the exact cause.

We will restore DNS service to down websites as quickly as possible. We have a list of nameservers that we know are not working, and we will make sure these are answering authoritatively, as they are supposed to, as soon as possible.

Thank you for your patience! We will post updates here as soon as they are available.

DCSN Team

Server Status

Posted in Servers by Agile on the August 5th, 2008

Uptime

  • evensonfarm
  • svensbluff

Server ‘pilotisland’ decommissioned, ‘evensonfarm’ live

Posted in by Agile on the April 19th, 2008

Due to repeated problems with the server ‘pilotisland’ in the past several weeks, we have decided to replace it with a new server, ‘evensonfarm’.

There were two ‘pilotisland’ servers. The first was a little AMD x2 3600 with 2 GB RAM, intended as a little workhorse administrative server. It was replaced by an AMD x2 4400 with 4 GB RAM, which we commissioned as a new shared server. Both were fraught with random load-spike issues and mystery crashes. These boxes were located at AtlantaNAP, and we want to give the staff at WireSix a hearty “thank you” for their tireless and gracious help trouble-shooting the issues we were experiencing. It wasn’t their fault. We have many AMD x2’s (of various sizes) and the only server that ever gave us trouble was ‘pilotisland’ … maybe it’s the server name that’s jinxed. *wink*

‘evensonfarm’ is a new Intel e2140 Dual-Core Core2Duo 1.6 GHz with 2 GB RAM, located in Denver, CO at Handy Networks. It will be a shared server.

All sites on ‘pilotisland’ were migrated to ‘evensonfarm’ on Thursday evening.

‘evensonfarm’ has the same software configuration as ‘pilotisland’ had, including libraries, components, package versions, firewalls and settings, etc. The server IPs are different, of course, and clients can find their new IP by logging in to their account’s cPanel and looking under “Site IP”. The change in IP address has absolutely no bearing on site operations. Everything is running as before, and there was no downtime with the move to the new server.

If you have any questions about the move, or the new server, please do let us know. :)

Thank you!

The DCSN Team

Shared server ‘pilotisland’ outage, recovery

Posted in by Agile on the March 27th, 2008

This morning we encountered a combination of issues with our shared server ‘pilotisland’ which resulted in a lengthy (approx. 6 hours) outage. Now that the server is back up, we are able to determine what happened.

A site on the server experienced a large spike in distributed traffic (similar to the “slashdot” effect, though it wasn’t slashdot). The traffic hit a dynamic PHP/MySQL script, which put an unsustainable load on the server, and it went down.

We attempted to reboot the server, but kept running into issues both with access (due to network traffic) and also with the hard drives requiring an FSCK before they would come back up. The automatic FSCK would not complete, requiring a manual FSCK, which was very time-intensive due to the size of the drives.

The FSCK has now been completed, the server has been rebooted, traffic for the busy site has been remediated (through both server/script settings and some traffic sharing) and all server operations came back to normal at approximately 1:15 PM Central Time.

We apologize for the frustration that this has caused. We realize that our clients do not want their sites to be down. We don’t want your sites to be down either! This particular server has been up 100% for six months, and showed no signs of trouble prior to this incident. This was an external traffic problem which then became coupled with normal drive maintenance operations for recovery. We are very sorry for the inconvenience and frustration that this has caused, and will continue to do everything in our power to ensure uninterrupted service on this server moving forward. ##

Server svensbluff down, being rebooted; Update: Recovered

Posted in by Agile on the March 26th, 2008

UPDATE, 1:37 PM: ‘svensbluff’ is back online. Thank you for your patience! :)

———————————————————————————————-

1:29 PM: Our shared server ‘svensbluff’ is currently down after emergency maintenance was performed. Our on-site technicians are working on the server as we speak and will have it back online as soon as possible.

We apologize for the inconvenience!

Uptime stats: December 2007

Posted in by Agile on the December 31st, 2007

Shared servers

    svensbluff - 100%
    pilotisland - 99.997%

Networks

    Handy Networks - Denver - 100%
    AtlantaNAP - Atlanta - 99.997%
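
For a sense of scale, an uptime percentage can be converted to minutes of downtime. The short Python sketch below does that arithmetic. The function name is our own for this example, and because the published figures are rounded, the result is only approximate.

```python
def downtime_minutes(uptime_percent, days):
    """Convert an uptime percentage over `days` days into minutes of downtime."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# December has 31 days, i.e. 44,640 minutes.
december = downtime_minutes(99.997, 31)
print(f"{december:.2f} minutes")
```

By this estimate, 99.997% over December works out to roughly 1.3 minutes (about 80 seconds) of downtime.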