
FIXED: DNS issue with server ‘svensbluff’

UPDATE: 7:35 AM: Great news! :) The DNS issue on svensbluff has been resolved. No accounts had to be moved. Rebooting the server appears to have resolved the problem, which suggests that a corrupted temporary file needed to be flushed out of the system via a power cycle.

If you notice any irregularities with your account, please let us know. We were not made aware of the issue until 3:30 AM Central time. We feel terrible that we weren’t aware of it sooner.

Unfortunately our monitoring system did not pick up on it because it monitors by IP address, and asks each service for a response at the IP. The services were all up, even DNS, and responding … but DNS wasn’t processing requests correctly. Monitoring just asks “are you responding?” — it doesn’t ask and analyze custom requests. That’s why our monitoring system didn’t pick up on this problem. :( I am very sorry for the trouble this caused you. We are going to look into our options for monitoring, in order to be able to better detect situations like this.
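To illustrate the difference between “are you responding?” and “are you answering correctly?”, here is a minimal sketch of a functional DNS check. It is not the monitoring system described above; the function names, the query ID, and the timeout are hypothetical choices made for illustration. It builds a real A-record query and treats the server as healthy only if the reply matches the query, reports no error, and contains at least one answer.

```python
import socket
import struct


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query asking for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def response_is_healthy(response: bytes, query_id: int = 0x1234) -> bool:
    """Return True only if the reply is a real, error-free answer to our query."""
    if len(response) < 12:  # shorter than a DNS header
        return False
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)  # QR bit set means "this is a response"
    rcode = flags & 0x000F              # 0 means "no error"
    return rid == query_id and is_response and rcode == 0 and ancount > 0


def check_dns(server_ip: str, hostname: str, timeout: float = 3.0) -> bool:
    """Send one query over UDP port 53 and judge the answer, not just the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server_ip, 53))
            response, _addr = sock.recvfrom(512)
        except OSError:  # covers timeouts and network errors
            return False
    return response_is_healthy(response)
```

A server that replies with an error code or an empty answer would pass a simple “is the port answering?” test but fail `response_is_healthy`, which is the kind of situation described above.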

We always love to hear from you. We’d rather receive a false report than no report at all. :) So if you see anything weird, please let us know by opening a support ticket right away. We want your service to be top notch, A-#1 !!!

Thank you for all the feedback as we worked on this. It was very helpful and helped us iron out what was and was not going on. ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Good morning,

Around 3:30 AM Central time, an as-yet unidentified DNS issue caused websites on the server ‘svensbluff’ to become inaccessible.

We have been trouble-shooting this issue since that time.

At this point we have been unable to find the exact cause.

We will restore DNS service to the affected websites as quickly as possible. We have a list of nameservers that we know are not working, and we will make sure that these nameservers are answering authoritatively, as they are supposed to, as soon as possible.
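For readers curious what “answering authoritatively” means on the wire: a DNS reply carries an “Authoritative Answer” (AA) bit in its header flags. The sketch below is only an illustration of checking that bit on raw replies that have already been collected; the function names and the nameserver names in the usage note are hypothetical, and this is not the tooling used during the incident.

```python
import struct
from typing import Dict, List

QR_FLAG = 0x8000  # set when the packet is a response
AA_FLAG = 0x0400  # "Authoritative Answer" bit in the DNS header flags


def is_authoritative(response: bytes) -> bool:
    """Return True if the raw DNS reply is a response with the AA bit set."""
    if len(response) < 12:  # shorter than a DNS header
        return False
    _id, flags = struct.unpack("!HH", response[:4])
    return bool(flags & QR_FLAG) and bool(flags & AA_FLAG)


def failing_nameservers(responses: Dict[str, bytes]) -> List[str]:
    """Given {nameserver: raw reply}, list those not answering authoritatively."""
    return sorted(
        name for name, raw in responses.items() if not is_authoritative(raw)
    )
```

Given a mapping such as `{"ns1.example.test": reply1, "ns2.example.test": reply2}`, `failing_nameservers` returns the names whose replies lack the AA bit.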

Thank you for your patience! We will post updates here as soon as they are available.

DCSN Team


Due to repeated problems with the server ‘pilotisland’ in the past several weeks, we have decided to replace it with a new server, ‘evensonfarm’.

There were two ‘pilotisland’ servers. The first was a little AMD x2 3600 with 2 GB RAM (intended to be a little workhorse administrative server); it was replaced by an AMD x2 4400 with 4 GB RAM (which we commissioned as a new shared server). Both were fraught with random load-spike issues and mystery crashes. These boxes were located at AtlantaNAP, and we want to give the staff at WireSix a hearty “thank you” for their tireless and gracious help trouble-shooting the issues we were experiencing. It wasn’t their fault. We have many AMD x2’s (of various sizes) and the only server that ever gave us trouble was ‘pilotisland’ … maybe it’s the server name that’s jinxed. *wink*

‘evensonfarm’ is a new Intel e2140 Dual-Core Core2Duo 1.6 GHz with 2 GB RAM, located in Denver, CO at Handy Networks. It will be a shared server.

All sites on ‘pilotisland’ were migrated to ‘evensonfarm’ on Thursday evening.

‘evensonfarm’ has the same software configuration as ‘pilotisland’ had, including libraries, components, package versions, firewalls and settings, etc. The server IPs are different, of course, and clients can find their new IP by logging in to their account’s cPanel and looking under “Site IP”. The change in IP address has absolutely no bearing on site operations. Everything is running as before, and there was no downtime with the move to the new server.

If you have any questions about the move, or the new server, please do let us know. :)

Thank you!

The DCSN Team

This morning we encountered a combination of issues with our shared server ‘pilotisland’ which resulted in a lengthy (approx. 6 hours) outage. Now that the server is back up, we are able to determine what happened.

A site on the server experienced a large spike in distributed traffic (similar to the “slashdot” effect, but it wasn’t slashdot). The traffic hit a dynamic PHP/MySQL script, which put an unsustainable load on the server, and the server went down.

We attempted to reboot the server, but kept running into issues both with access (due to network traffic) and also with the hard drives requiring an FSCK before they would come back up. The automatic FSCK would not complete, requiring a manual FSCK, which was very time-intensive due to the size of the drives.

The FSCK has now been completed, the server has been rebooted, traffic for the busy site has been remediated (through both server/script settings and some traffic sharing) and all server operations came back to normal at approximately 1:15 PM Central Time.

We apologize for the frustration that this has caused. We realize that our clients do not want their sites to be down. We don’t want your sites to be down either! This particular server has been up 100% for six months, and showed no signs of trouble prior to this incident. This was an external traffic problem which then became coupled with normal drive maintenance operations for recovery. We are very sorry for the inconvenience and frustration that this has caused, and will continue to do everything in our power to ensure uninterrupted service on this server moving forward. ##

UPDATE, 1:37 PM: ‘svensbluff’ is back online. Thank you for your patience! :)

———————————————————————————————-

1:29 PM: Our shared server ‘svensbluff’ is currently down following emergency maintenance. Our on-site technicians are working on the server as we speak and will have it back online as soon as possible.

We apologize for the inconvenience!

Uptime stats: December 2007

Shared servers

    svensbluff – 100%
    pilotisland – 99.997%

Networks

    Handy Networks – Denver – 100%
    AtlantaNAP – Atlanta – 99.997%
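As a rough illustration of what an uptime percentage means in minutes, here is a small arithmetic sketch. The function name is hypothetical; the only inputs are the percentage and the number of days in the period (31 for December).

```python
def downtime_minutes(uptime_percent: float, days_in_period: int) -> float:
    """Convert an uptime percentage over a period into minutes of downtime."""
    total_minutes = days_in_period * 24 * 60  # 31 days = 44,640 minutes
    return total_minutes * (100.0 - uptime_percent) / 100.0


# 99.997% over the 31 days of December:
print(round(downtime_minutes(99.997, 31), 2))  # prints 1.34
```

In other words, 99.997% over a 31-day month works out to a little over one minute of downtime.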

Our shared servers ‘svensbluff’ and ‘pilotisland’ were both rebooted around 2:00 AM Central time Monday, 12/31/07 to complete kernel upgrades. This was routine security work and caused no change in functionality on either server. Both servers were offline for about 2 minutes during the course of the reboots, and have been online and running at 100% since then.

If you have any questions please open a support ticket so we may assist you. Thank you very much!

DCSN Support Team ##

Reboot on svensbluff tonight

Hello,

We will be rebooting the shared server svensbluff tonight at approximately 10:30 PM to upgrade to a new kernel. The expected downtime is under 10 minutes. If you have any questions, please let us know.

Thank you!

cPanel update breaks SSL certs

UPDATE, RESOLVED – 11:28 AM: Apache has successfully recompiled, and we have verified that the SSL certificates are working again without error. We recompiled the same libraries into Apache as before. However, if your site is experiencing any problems due to missing libraries or modules, please contact us right away and we will be happy to assist you with it. Thank you and have a great weekend! ##

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We’ve learned that last night’s cPanel update has broken the SSL certificates on the svensbluff shared server.

Our technicians are currently working on repairing the problem. The repair includes recompiling Apache (the web server), so if your site goes offline for a few minutes, that’s why. :) Please do not worry, we are aware of the issue and are working on repairing it as quickly as possible.

Please accept our apologies for the trouble! We are also working with cPanel to identify what caused this in the first place, and to try to get them to not break things this way again. :)

DCSN Team

This weekend, cPanel auto-updated all servers on the “Release” build to cPanel 11. The previous version of cPanel was 10.x, so that makes cPanel 11 a BIG update.

cPanel has been preparing your server for this update for some time, bringing many packages up-to-date over the past weeks and months, in the hopes of making the transition as smooth as possible. It is imperative for security reasons, not just for cPanel 11, that your server be running the latest packages of OS software. (Please understand we do not allow unpatched/out-of-date servers to run on our network. Running out-of-date software is prohibited by our TOS as it is a security risk to everyone, not just you and your server.)

The most common problem we have seen is servers running an old version of Perl. Perl needs to be version 5.8.8 for cPanel 11 and the related Perl modules to work properly. Once your server is upgraded to v. 5.8.8, the dependent Perl modules will be able to update. Here’s a tutorial on how to upgrade your Perl to version 5.8.8. :)
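If you want to check quickly whether a server’s Perl is new enough, a sketch like the following may help. It assumes you have already captured the output of `perl -v` as text, and it assumes that “needs to be version 5.8.8” means 5.8.8 or newer; the function names are hypothetical.

```python
import re

REQUIRED = (5, 8, 8)  # assumption: 5.8.8 or newer is acceptable


def parse_perl_version(perl_v_output: str):
    """Extract (major, minor, patch) from text like 'This is perl, v5.8.8 built for ...'."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", perl_v_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def meets_requirement(perl_v_output: str, required=REQUIRED) -> bool:
    """Return True if the reported Perl version is at least the required one."""
    version = parse_perl_version(perl_v_output)
    return version is not None and version >= required
```

Comparing tuples of integers rather than strings matters here: as text, “5.10.0” sorts before “5.8.8”, but as numbers it is correctly treated as newer.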

The second-most common problem we have been seeing is difficulty with the mbox-to-maildir conversion of the mail system. cPanel has been running maildir as the default mail system for over 18 months now, but many folks have been dragging their feet about converting (out of fear, and we do understand!!!). Recently, the mail delivery on many servers stopped working completely because clients were still using the out-of-date mbox system. mbox is deprecated, not supported by cPanel, and is not able to handle the extraordinary volume of mail that most ordinary e-mail accounts have these days. mbox also has functionality limitations that you will be able to break free from once your server has been converted to maildir. We will be posting a FAQ/How-to about mbox-to-maildir as well as dealing with some of the common issues after the conversion.
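To show what the conversion means structurally (mbox keeps a whole mailbox in one file, while maildir stores one file per message), here is a minimal sketch using Python’s standard `mailbox` module. This is not cPanel’s own conversion tool and should not be run against a live server’s mail; the function name and paths are hypothetical, and the target maildir path is assumed not to exist yet.

```python
import mailbox


def convert_mbox_to_maildir(mbox_path: str, maildir_path: str) -> int:
    """Copy every message from a single-file mbox into a new maildir.

    Returns the number of messages copied. The original mbox is left untouched.
    """
    source = mailbox.mbox(mbox_path)
    # create=True makes the maildir with its tmp/, new/ and cur/ subdirectories.
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in source:
            # Converting to MaildirMessage carries over flags such as "read".
            dest.add(mailbox.MaildirMessage(message))
            copied += 1
        dest.flush()
    finally:
        source.close()
        dest.close()
    return copied
```

For example, `convert_mbox_to_maildir("inbox.mbox", "Maildir")` would return the number of messages written into the new `Maildir` directory.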

cPanel 11 is here to stay. It is NOT something to be afraid of. :) cPanel 11 brings a lot of fantastic feature enhancements and additional functionality that your clients are going to LOVE:

    • Getting Started Wizard walks them through the initial set-up of their hosting account!
    • New and better File Manager
    • New and better HTML editor
    • MySQL Database set-up wizard
    • Install Perl Modules in client’s home directory
    • Install PHP PEAR modules in client’s home directory
    • Custom PHP Configuration on per-site basis
    • User-level (per e-mail account) e-mail filtering
    • WebDAV WebDisk functionality
    • Dozens of new Video Tutorials embedded right inside the appropriate cPanel pages
    • Much better organization of cPanel’s account features
    • Clients can drag-and-drop cPanel sections to reorganize their cPanel so it best makes sense for them

When you log in to WHM, you will see an improved GUI (much easier on the eyes) and WHM has also been somewhat reorganized to make things easier to find. Everything is still basically in the same place; you’ll just find more sections with shorter lists of operations in each.

If you are experiencing anything odd with your server, please open a support ticket with us right away. Our technicians will be happy to trouble-shoot and repair your cPanel 11-related issue promptly.

DCSN Team
